Launching Experiments Remotely
As a machine learning practitioner, you operate in a highly competitive landscape. Your success depends to a large extent on whether you can decrease the time to the next decision. In this section, we discuss one important approach, namely how to increase the number of experiments run in parallel.
Note

Imports in our scripts are absolute against the root package `transformer_wikitext2`, so that only the code in `benchmarking.nursery.odsc_tutorial` has to be present. In order to run them, you need to append `<abspath>/odsc_tutorial/` to the `PYTHONPATH` environment variable. This is required even if you have installed Syne Tune from source.
Launching our Study
Here is how we specified and ran experiments of our study. First, we specify a script for launching experiments locally:
```python
from transformer_wikitext2.baselines import methods
from transformer_wikitext2.benchmark_definitions import benchmark_definitions
from syne_tune.experiments.launchers.hpo_main_local import main

if __name__ == "__main__":
    main(methods, benchmark_definitions)
```
This is very simple, as most work is done by the generic `syne_tune.experiments.launchers.hpo_main_local.main()`. Note that `hpo_main_local` needs to be chosen, since we use the local backend. This local launcher script can be used to configure your experiment, given additional command line arguments, as is explained in detail here.

You can use `hpo_main.py` to launch experiments locally, but they will run sequentially, one after the other, and you need to have all dependencies installed locally. A second script is needed in order to launch many experiments in parallel:
```python
from pathlib import Path

from transformer_wikitext2.baselines import methods
from transformer_wikitext2.benchmark_definitions import benchmark_definitions
from syne_tune.experiments.launchers.launch_remote_local import launch_remote

if __name__ == "__main__":
    entry_point = Path(__file__).parent / "hpo_main.py"
    source_dependencies = [str(Path(__file__).parent.parent)]
    launch_remote(
        entry_point=entry_point,
        methods=methods,
        benchmark_definitions=benchmark_definitions,
        source_dependencies=source_dependencies,
    )
```
Once more, all the hard work is done in `syne_tune.experiments.launchers.launch_remote_local.launch_remote()`, where `launch_remote_local` needs to be chosen for the local backend. Most important is that our previous `hpo_main.py` is specified as `entry_point` here. Here is the command to run all experiments of our study in parallel (replace `...` by the absolute path to `odsc_tutorial`):
```bash
export PYTHONPATH="${PYTHONPATH}:/.../odsc_tutorial/"
python transformer_wikitext2/local/launch_remote.py \
  --experiment_tag odsc-1 --benchmark transformer_wikitext2 --num_seeds 10
```
This command launches 40 SageMaker training jobs, running 10 random repetitions (seeds) for each of the 4 methods specified in `baselines.py`. Each SageMaker training job uses one `ml.g4dn.12xlarge` AWS instance. You can only run all 40 jobs in parallel if your resource limit for this instance type is 40 or larger. Each training job will run a little longer than 5 hours, as specified by `max_wallclock_time`.

You can use the `--instance_type` and `--max_wallclock_time` command line arguments to change these defaults. However, if you choose an instance type with fewer than 4 GPUs, the local backend will not be able to run 4 trials in parallel. If `benchmark_definitions.py` defines a single benchmark only, the `--benchmark` argument can also be dropped.
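For example, the defaults could be overridden as follows (a hypothetical invocation: the instance type and the wall-clock time of 18000 seconds are placeholder values for illustration, not recommendations; replace `...` by the absolute path to `odsc_tutorial` as above):

```bash
export PYTHONPATH="${PYTHONPATH}:/.../odsc_tutorial/"
python transformer_wikitext2/local/launch_remote.py \
  --experiment_tag odsc-1 --benchmark transformer_wikitext2 --num_seeds 10 \
  --instance_type ml.g4dn.12xlarge --max_wallclock_time 18000
```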
When using remote launching, results of your experiments are written to S3, to the default bucket for your AWS account. Once all jobs have finished (which takes a little more than 5 hours if you have sufficient limits, and otherwise longer), you can create the comparative plot shown above, using this script:
```python
from typing import Any, Dict, Optional
import logging

from transformer_wikitext2.baselines import methods
from transformer_wikitext2.benchmark_definitions import benchmark_definitions
from syne_tune.experiments import ComparativeResults, PlotParameters

SETUPS = list(methods.keys())


def metadata_to_setup(metadata: Dict[str, Any]) -> Optional[str]:
    return metadata["algorithm"]


if __name__ == "__main__":
    logging.getLogger().setLevel(logging.INFO)
    experiment_names = ("odsc-1",)
    num_runs = 10
    download_from_s3 = False  # Set ``True`` in order to download files from S3
    # Plot parameters across all benchmarks
    plot_params = PlotParameters(
        xlabel="wall-clock time",
        aggregate_mode="iqm_bootstrap",
        grid=True,
    )
    # The creation of ``results`` downloads files from S3 (only if
    # ``download_from_s3 == True``), reads the metadata and creates an inverse
    # index. If any result files are missing, or there are too many of them,
    # warning messages are printed
    results = ComparativeResults(
        experiment_names=experiment_names,
        setups=SETUPS,
        num_runs=num_runs,
        metadata_to_setup=metadata_to_setup,
        plot_params=plot_params,
        download_from_s3=download_from_s3,
    )
    # Create comparative plot (single panel)
    benchmark_name = "transformer_wikitext2"
    benchmark = benchmark_definitions(sagemaker_backend=False)[benchmark_name]
    # These parameters overwrite those given at construction
    plot_params = PlotParameters(
        metric=benchmark.metric,
        mode=benchmark.mode,
        ylim=(5, 8),
    )
    results.plot(
        benchmark_name=benchmark_name,
        plot_params=plot_params,
        file_name=f"./odsc-comparison-local-{benchmark_name}.png",
    )
```
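Note that `metadata_to_setup` returns `Optional[str]`: returning `None` for a metadata record should exclude that experiment from the plot, which can be used for filtering. A minimal hypothetical sketch (the method names `RS` and `ASHA` are assumptions for illustration, not necessarily those in `baselines.py`):

```python
from typing import Any, Dict, Optional

# Hypothetical variant of ``metadata_to_setup``: keep only two methods and
# filter out all other experiments by returning ``None``
def metadata_to_setup_filtered(metadata: Dict[str, Any]) -> Optional[str]:
    algorithm = metadata["algorithm"]
    return algorithm if algorithm in ("RS", "ASHA") else None
```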
For details about visualization of results in Syne Tune, please consult this tutorial. In a nutshell, this is what happens:

- Collect and filter results from all experiments of a study
- Group them according to setup (HPO method here), aggregate over seeds
- Create a plot in which each setup is represented by a curve with confidence bars
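The aggregation over seeds is controlled by `aggregate_mode="iqm_bootstrap"` in the script above, i.e. interquartile means with bootstrap confidence intervals. As a rough illustration of the interquartile mean itself, here is a simplified sketch (no bootstrap intervals; Syne Tune's internal implementation may differ in detail):

```python
import numpy as np

# Simplified interquartile mean (IQM): discard the lowest and highest
# quarter of the values, then average the remaining middle half. This makes
# the aggregate robust against outlier seeds.
def iqm(values) -> float:
    values = np.sort(np.asarray(values, dtype=float))
    n = len(values)
    lower, upper = n // 4, n - n // 4
    return float(values[lower:upper].mean())
```

For example, `iqm([0, 1, 2, 3, 100])` averages only the middle values `1, 2, 3`, so a single outlier seed does not dominate the curve.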