syne_tune.backend package
- class syne_tune.backend.LocalBackend(entry_point, delete_checkpoints=False, pass_args_as_json=False, rotate_gpus=True, num_gpus_per_trial=1, gpus_to_use=None)[source]
Bases:
TrialBackend
A backend that runs trials locally by spawning sub-processes concurrently. Note that no resource management is done, so the number of concurrent trials should be adjusted to the machine capacity.
Additional arguments on top of parent class
TrialBackend
- Parameters:
  - entry_point (str) – Path to the Python main file to be tuned
  - rotate_gpus (bool) – If several GPUs are present, each trial is scheduled on a different GPU. A new trial is preferentially scheduled on a free GPU; otherwise, the GPU with the fewest prior assignments is chosen. If False, all GPUs are used at the same time for all trials. Defaults to True
  - num_gpus_per_trial (int) – Number of GPUs to be allocated to each trial. Must not be larger than the total number of GPUs available. Defaults to 1
  - gpus_to_use (Optional[List[int]]) – If given, the backend only uses GPUs in this list (non-negative ints). Entries must be in range(get_num_gpus()). Defaults to using all GPUs
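The GPU rotation policy described above (prefer a free GPU, otherwise fall back to the least-assigned one) can be sketched in plain Python. The function and argument names below are illustrative, not the backend's internal API:

```python
def pick_gpu(num_gpus, busy_gpus, assignment_counts):
    """Choose a GPU for a new trial: prefer a free GPU; if none is
    free, take the GPU with the fewest prior trial assignments."""
    free = [g for g in range(num_gpus) if g not in busy_gpus]
    candidates = free if free else list(range(num_gpus))
    # Among the candidates, break ties by the number of prior assignments
    return min(candidates, key=lambda g: assignment_counts.get(g, 0))
```

For example, with two GPUs where GPU 0 is busy, a new trial goes to the free GPU 1 even if GPU 0 has fewer prior assignments; only when all GPUs are busy does the assignment count decide.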
- trial_path(trial_id)[source]
- Parameters:
  trial_id (int) – ID of trial
- Return type:
  Path
- Returns:
  Directory where files related to trial are written to
- checkpoint_trial_path(trial_id)[source]
- Parameters:
  trial_id (int) – ID of trial
- Return type:
  Path
- Returns:
  Directory where checkpoints for trial are written to and read from
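As a sketch of how the two path helpers above can relate, assuming a per-trial directory layout with a checkpoint sub-folder (the exact folder names are an assumption for illustration, not taken from the library):

```python
from pathlib import Path

def trial_path(local_path, trial_id):
    # One sub-directory per trial under the backend's root folder
    return Path(local_path) / str(trial_id)

def checkpoint_trial_path(local_path, trial_id):
    # Checkpoints live in a dedicated sub-folder of the trial directory,
    # so delete_checkpoint() can remove it without touching logs
    return trial_path(local_path, trial_id) / "checkpoints"
```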
- copy_checkpoint(src_trial_id, tgt_trial_id)[source]
Copy the checkpoint folder from one trial to the other.
- Parameters:
  - src_trial_id (int) – Source trial ID (copy from)
  - tgt_trial_id (int) – Target trial ID (copy to)
- delete_checkpoint(trial_id)[source]
Removes checkpoint folder for a trial. It is OK for the folder not to exist.
- Parameters:
  trial_id (int) – ID of trial for which checkpoint files are deleted
- busy_trial_ids()[source]
Returns a list of ids for currently busy trials.
A trial is busy if its status is in_progress or stopping. If the execution setup is able to run n_workers jobs in parallel, then if this method returns a list of size n, the tuner may start n_workers - n new jobs.
- Return type:
  List[Tuple[int, str]]
- Returns:
  List of (trial_id, status)
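The worker-budget arithmetic described above can be sketched as follows; num_startable is an illustrative helper, not part of the library:

```python
def num_startable(busy_trials, n_workers):
    """Given the (trial_id, status) pairs returned by busy_trial_ids()
    and the worker budget n_workers, return how many new trials the
    tuner may start (never negative)."""
    return max(n_workers - len(busy_trials), 0)
```

For example, with n_workers = 4 and two busy trials (one in_progress, one stopping), two new jobs may be started.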
- stdout(trial_id)[source]
Fetch stdout log for trial.
- Parameters:
  trial_id (int) – ID of trial
- Return type:
  List[str]
- Returns:
  Lines of the log of the trial (stdout)
- stderr(trial_id)[source]
Fetch stderr log for trial.
- Parameters:
  trial_id (int) – ID of trial
- Return type:
  List[str]
- Returns:
  Lines of the log of the trial (stderr)
- set_path(results_root=None, tuner_name=None)[source]
- Parameters:
  - results_root (Optional[str]) – The local folder that should contain the results of the tuning experiment. Used by Tuner to indicate a desired path where the results should be written to. This is used to unify the location of backend files and Tuner results when possible (in the local backend). By default, the backend does nothing, since not all backends may be able to unify their file locations.
  - tuner_name (Optional[str]) – Name of the tuner; can be used, for instance, to save checkpoints on remote storage
- class syne_tune.backend.PythonBackend(tune_function, config_space, rotate_gpus=True, delete_checkpoints=False)[source]
Bases:
LocalBackend
A backend that supports the tuning of Python functions (if you would rather tune an entry point script such as "train.py", you should use LocalBackend). The function tune_function should be serializable, should not reference any global variable or module, and should have as arguments a subset of the keys of config_space. When deserializing, an md5 digest is checked to ensure consistency.
For instance, the following function is a valid way of defining a backend on top of a simple function:
```python
from syne_tune.backend import PythonBackend
from syne_tune.config_space import uniform

def f(x, epochs):
    import logging
    import time
    from syne_tune import Reporter

    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    reporter = Reporter()
    for i in range(epochs):
        reporter(epoch=i + 1, y=x + i)

config_space = {
    "x": uniform(-10, 10),
    "epochs": 5,
}

backend = PythonBackend(tune_function=f, config_space=config_space)
```
See examples/launch_height_python_backend.py for a complete example.
Additional arguments on top of parent class LocalBackend:
- Parameters:
  - tune_function (Callable) – Python function to be tuned. The function must call the Syne Tune reporter to report metrics and be serializable; imports should be performed inside the function body.
  - config_space (Dict[str, object]) – Configuration space corresponding to arguments of tune_function
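The md5 consistency check mentioned above can be sketched in plain Python with hashlib and pickle. This is an illustrative reconstruction of the idea (store a digest alongside the serialized bytes, verify it before loading), not the library's internal code:

```python
import hashlib
import pickle

def serialize_with_digest(payload):
    # Store the pickled bytes together with their md5 digest
    blob = pickle.dumps(payload)
    return blob, hashlib.md5(blob).hexdigest()

def deserialize_checked(blob, expected_digest):
    # Refuse to load if the bytes no longer match the recorded digest
    if hashlib.md5(blob).hexdigest() != expected_digest:
        raise ValueError("serialized payload does not match its digest")
    return pickle.loads(blob)
```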
- property tune_function_path: Path
- set_path(results_root=None, tuner_name=None)[source]
- Parameters:
  - results_root (Optional[str]) – The local folder that should contain the results of the tuning experiment. Used by Tuner to indicate a desired path where the results should be written to. This is used to unify the location of backend files and Tuner results when possible (in the local backend). By default, the backend does nothing, since not all backends may be able to unify their file locations.
  - tuner_name (Optional[str]) – Name of the tuner; can be used, for instance, to save checkpoints on remote storage
- class syne_tune.backend.SageMakerBackend(sm_estimator, metrics_names=None, s3_path=None, delete_checkpoints=False, pass_args_as_json=False, **sagemaker_fit_kwargs)[source]
Bases:
TrialBackend
This backend executes each trial evaluation as a separate SageMaker training job, using sm_estimator as the estimator.
Checkpoints are written to and loaded from S3, using checkpoint_s3_uri of the estimator.
Compared to LocalBackend, this backend can run any number of jobs in parallel (given sufficient resources), and any instance type can be used.
This backend allows selecting the instance type and count for a trial evaluation by passing values in the configuration, using the names ST_INSTANCE_TYPE and ST_INSTANCE_COUNT. If these are given in the configuration, they overwrite the defaults in sm_estimator. This allows instance type and count to be tuned along with the hyperparameter configuration.
Additional arguments on top of parent class TrialBackend:
- Parameters:
  - sm_estimator (Framework) – SageMaker estimator for trial evaluations
  - metrics_names (Optional[List[str]]) – Names of metrics passed to report, used to plot live curves in SageMaker (optional, only used for visualization)
  - s3_path (Optional[str]) – S3 base path used for checkpointing. The full path also involves the tuner name and the trial_id. The default base path is the S3 bucket associated with the SageMaker account
  - sagemaker_fit_kwargs – Extra arguments passed to sagemaker.estimator.Framework when fitting the job, for instance {'train': 's3://my-data-bucket/path/to/my/training/data'}
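The configuration-level instance override described above amounts to letting per-trial values take precedence over the estimator defaults. A minimal sketch, where the key strings and the resolve_instance helper are assumptions for illustration (ST_INSTANCE_TYPE and ST_INSTANCE_COUNT are the documented constant names, but their values here are made up):

```python
ST_INSTANCE_TYPE = "st_instance_type"    # assumed key value, for illustration
ST_INSTANCE_COUNT = "st_instance_count"  # assumed key value, for illustration

def resolve_instance(config, default_type, default_count):
    """Values in the trial configuration overwrite the estimator
    defaults, so instance type and count can be tuned like
    hyperparameters; remaining entries stay as hyperparameters."""
    config = dict(config)  # do not mutate the caller's configuration
    instance_type = config.pop(ST_INSTANCE_TYPE, default_type)
    instance_count = config.pop(ST_INSTANCE_COUNT, default_count)
    return instance_type, instance_count, config
```

A trial whose configuration contains an instance-type entry thus runs on that instance type, while trials without one fall back to the estimator's default.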
- property sm_client
- busy_trial_ids()[source]
Returns a list of ids for currently busy trials.
A trial is busy if its status is in_progress or stopping. If the execution setup is able to run n_workers jobs in parallel, then if this method returns a list of size n, the tuner may start n_workers - n new jobs.
- Return type:
  List[Tuple[int, str]]
- Returns:
  List of (trial_id, status)
- stdout(trial_id)[source]
Fetch stdout log for trial.
- Parameters:
  trial_id (int) – ID of trial
- Return type:
  List[str]
- Returns:
  Lines of the log of the trial (stdout)
- stderr(trial_id)[source]
Fetch stderr log for trial.
- Parameters:
  trial_id (int) – ID of trial
- Return type:
  List[str]
- Returns:
  Lines of the log of the trial (stderr)
- property source_dir: str | None
- set_entrypoint(entry_point)[source]
Update the entrypoint.
- Parameters:
  entry_point (str) – New path of the entrypoint
- copy_checkpoint(src_trial_id, tgt_trial_id)[source]
Copy the checkpoint folder from one trial to the other.
- Parameters:
  - src_trial_id (int) – Source trial ID (copy from)
  - tgt_trial_id (int) – Target trial ID (copy to)
- delete_checkpoint(trial_id)[source]
Removes checkpoint folder for a trial. It is OK for the folder not to exist.
- Parameters:
  trial_id (int) – ID of trial for which checkpoint files are deleted
Subpackages
- syne_tune.backend.python_backend package
- syne_tune.backend.sagemaker_backend package
- Submodules
- syne_tune.backend.sagemaker_backend.custom_framework module
- syne_tune.backend.sagemaker_backend.instance_info module
- syne_tune.backend.sagemaker_backend.sagemaker_backend module
- syne_tune.backend.sagemaker_backend.sagemaker_utils module
default_config()
default_sagemaker_session()
get_log()
decode_sagemaker_hyperparameter()
sagemaker_search()
metric_definitions_from_names()
add_metric_definitions_to_sagemaker_estimator()
add_syne_tune_dependency()
sagemaker_fit()
get_execution_role()
untar()
download_sagemaker_results()
map_identifier_limited_length()
s3_copy_objects_recursively()
s3_delete_objects_recursively()
s3_download_files_recursively()
backend_path_not_synced_to_s3()
- Submodules
- syne_tune.backend.simulator_backend package
Submodules
- syne_tune.backend.local_backend module
- syne_tune.backend.time_keeper module
- syne_tune.backend.trial_backend module
TrialBackend
TrialBackend.start_trial()
TrialBackend.copy_checkpoint()
TrialBackend.delete_checkpoint()
TrialBackend.resume_trial()
TrialBackend.pause_trial()
TrialBackend.stop_trial()
TrialBackend.new_trial_id()
TrialBackend.fetch_status_results()
TrialBackend.busy_trial_ids()
TrialBackend.stdout()
TrialBackend.stderr()
TrialBackend.stop_all()
TrialBackend.set_path()
TrialBackend.entrypoint_path()
TrialBackend.set_entrypoint()
TrialBackend.on_tuner_save()
- syne_tune.backend.trial_status module