syne_tune.backend package

class syne_tune.backend.LocalBackend(entry_point, delete_checkpoints=False, pass_args_as_json=False, rotate_gpus=True, num_gpus_per_trial=1, gpus_to_use=None)[source]

Bases: TrialBackend

A backend that runs trials locally by spawning subprocesses concurrently. Note that no resource management is done, so the number of concurrent trials should be adjusted to the machine's capacity.

Additional arguments on top of parent class TrialBackend:

Parameters:
  • entry_point (str) – Path to Python main file to be tuned

  • rotate_gpus (bool) – In case several GPUs are present, each trial is scheduled on a different GPU. A new trial is preferentially scheduled on a free GPU; otherwise, the GPU with the fewest prior assignments is chosen. If False, all GPUs are used at the same time for all trials. Defaults to True.

  • num_gpus_per_trial (int) – Number of GPUs to be allocated to each trial. Must not be larger than the total number of GPUs available. Defaults to 1.

  • gpus_to_use (Optional[List[int]]) – If this is given, the backend only uses GPUs in this list (non-negative ints). Entries must be in range(get_num_gpus()). Defaults to using all GPUs.
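
A minimal usage sketch (train_height.py is a hypothetical training script that reports a metric y via the Syne Tune Reporter; the scheduler and stopping criterion are illustrative):

from syne_tune import StoppingCriterion, Tuner
from syne_tune.backend import LocalBackend
from syne_tune.config_space import randint
from syne_tune.optimizer.baselines import RandomSearch

# The script must report metrics via syne_tune.Reporter
backend = LocalBackend(entry_point="train_height.py")

tuner = Tuner(
    trial_backend=backend,
    scheduler=RandomSearch({"width": randint(1, 20)}, metric="y"),
    stop_criterion=StoppingCriterion(max_wallclock_time=600),
    n_workers=4,  # trials run concurrently as subprocesses
)
tuner.run()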

trial_path(trial_id)[source]
Parameters:

trial_id (int) – ID of trial

Return type:

Path

Returns:

Directory where files related to trial are written to

checkpoint_trial_path(trial_id)[source]
Parameters:

trial_id (int) – ID of trial

Return type:

Path

Returns:

Directory where checkpoints for trial are written to and read from

copy_checkpoint(src_trial_id, tgt_trial_id)[source]

Copy the checkpoint folder from one trial to the other.

Parameters:
  • src_trial_id (int) – Source trial ID (copy from)

  • tgt_trial_id (int) – Target trial ID (copy to)

delete_checkpoint(trial_id)[source]

Removes checkpoint folder for a trial. It is OK for the folder not to exist.

Parameters:

trial_id (int) – ID of trial for which checkpoint files are deleted

busy_trial_ids()[source]

Returns a list of ids for currently busy trials

A trial is busy if its status is in_progress or stopping. If the execution setup is able to run n_workers jobs in parallel, and this method returns a list of size n, then the tuner may start n_workers - n new jobs.

Return type:

List[Tuple[int, str]]

Returns:

List of (trial_id, status)
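
For instance, a tuner loop might use this method to compute how many free worker slots remain (a sketch; n_workers is assumed to be the number of parallel jobs):

n_busy = len(backend.busy_trial_ids())
num_free_slots = n_workers - n_busy  # new trials that may be started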

stdout(trial_id)[source]

Fetch stdout log for trial

Parameters:

trial_id (int) – ID of trial

Return type:

List[str]

Returns:

Lines of the log of the trial (stdout)

stderr(trial_id)[source]

Fetch stderr log for trial

Parameters:

trial_id (int) – ID of trial

Return type:

List[str]

Returns:

Lines of the log of the trial (stderr)
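
For example, the captured logs of a trial can be inspected as follows (trial ID 0 is illustrative):

for line in backend.stdout(trial_id=0):
    print(line)
for line in backend.stderr(trial_id=0):
    print(line)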

set_path(results_root=None, tuner_name=None)[source]
Parameters:
  • results_root (Optional[str]) – The local folder that should contain the results of the tuning experiment. Used by Tuner to indicate a desired path where the results should be written. This is used to unify the location of backend files and Tuner results when possible (in the local backend). By default, the backend does nothing here, since not all backends may be able to unify their file locations.

  • tuner_name (Optional[str]) – Name of the tuner; can be used, for instance, to save checkpoints on remote storage.

entrypoint_path()[source]
Return type:

Path

Returns:

Entrypoint path of script to be executed

set_entrypoint(entry_point)[source]

Update the entrypoint.

Parameters:

entry_point (str) – New path of the entrypoint.

class syne_tune.backend.PythonBackend(tune_function, config_space, rotate_gpus=True, delete_checkpoints=False)[source]

Bases: LocalBackend

A backend that supports the tuning of Python functions (if you would rather tune an entry point script such as “train.py”, use LocalBackend instead). The function tune_function must be serializable, must not reference any global variable or module, and must take as arguments a subset of the keys of config_space. When deserializing, an MD5 hash is checked to ensure consistency.

For instance, the following is a valid way of defining a backend on top of a simple function:

from syne_tune.backend import PythonBackend
from syne_tune.config_space import uniform

def f(x, epochs):
    # Imports must live inside the function body, since the function
    # is serialized and executed in a separate process
    import logging
    from syne_tune import Reporter

    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    reporter = Reporter()
    for i in range(epochs):
        # Report one metric value per epoch
        reporter(epoch=i + 1, y=x + i)

config_space = {
    "x": uniform(-10, 10),
    "epochs": 5,  # constant, passed through to f
}
backend = PythonBackend(tune_function=f, config_space=config_space)

See examples/launch_height_python_backend.py for a complete example.
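
The backend defined above can then be passed to a Tuner like any other backend (a sketch; the scheduler and stopping criterion are illustrative):

from syne_tune import StoppingCriterion, Tuner
from syne_tune.optimizer.baselines import RandomSearch

tuner = Tuner(
    trial_backend=backend,
    scheduler=RandomSearch(config_space, metric="y"),
    stop_criterion=StoppingCriterion(max_wallclock_time=30),
    n_workers=2,
)
tuner.run()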

Additional arguments on top of parent class LocalBackend:

Parameters:
  • tune_function (Callable) – Python function to be tuned. The function must call the Syne Tune Reporter to report metrics and must be serializable; imports should be performed inside the function body.

  • config_space (Dict[str, object]) – Configuration space corresponding to arguments of tune_function

property tune_function_path: Path
set_path(results_root=None, tuner_name=None)[source]
Parameters:
  • results_root (Optional[str]) – The local folder that should contain the results of the tuning experiment. Used by Tuner to indicate a desired path where the results should be written. This is used to unify the location of backend files and Tuner results when possible (in the local backend). By default, the backend does nothing here, since not all backends may be able to unify their file locations.

  • tuner_name (Optional[str]) – Name of the tuner; can be used, for instance, to save checkpoints on remote storage.

save_tune_function(tune_function)[source]
class syne_tune.backend.SageMakerBackend(sm_estimator, metrics_names=None, s3_path=None, delete_checkpoints=False, pass_args_as_json=False, **sagemaker_fit_kwargs)[source]

Bases: TrialBackend

This backend executes each trial evaluation as a separate SageMaker training job, using sm_estimator as estimator.

Checkpoints are written to and loaded from S3, using checkpoint_s3_uri of the estimator.

Compared to LocalBackend, this backend can run any number of jobs in parallel (given sufficient resources), and any instance type can be used.

This backend allows selecting the instance type and count for a trial evaluation by passing values in the configuration, using the names ST_INSTANCE_TYPE and ST_INSTANCE_COUNT. If these are given in the configuration, they overwrite the defaults in sm_estimator. This allows instance type and count to be tuned along with the hyperparameter configuration, as in the sketch below.
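
For instance, the following sketch adds instance type and count to the search space (assuming the constants ST_INSTANCE_TYPE and ST_INSTANCE_COUNT can be imported from syne_tune.constants):

from syne_tune.config_space import choice, loguniform
from syne_tune.constants import ST_INSTANCE_COUNT, ST_INSTANCE_TYPE

config_space = {
    "lr": loguniform(1e-4, 1e-1),
    # Tuned along with the hyperparameters; overrides the estimator defaults
    ST_INSTANCE_TYPE: choice(["ml.c5.xlarge", "ml.m5.xlarge"]),
    ST_INSTANCE_COUNT: choice([1, 2]),
}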

Additional arguments on top of parent class TrialBackend:

Parameters:
  • sm_estimator (Framework) – SageMaker estimator for trial evaluations.

  • metrics_names (Optional[List[str]]) – Names of metrics passed to report, used to plot live curves in SageMaker (optional, only used for visualization)

  • s3_path (Optional[str]) – S3 base path used for checkpointing. The full path also involves the tuner name and the trial_id. The default base path is the S3 bucket associated with the SageMaker account

  • sagemaker_fit_kwargs – Extra arguments passed to sagemaker.estimator.Framework when fitting the job, for instance {'train': 's3://my-data-bucket/path/to/my/training/data'}
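
A construction sketch using a SageMaker PyTorch framework estimator (entry point, versions, and role are illustrative):

from sagemaker.pytorch import PyTorch
from syne_tune.backend import SageMakerBackend

estimator = PyTorch(
    entry_point="train.py",        # must report metrics via syne_tune.Reporter
    instance_type="ml.m5.xlarge",  # default; can be overridden per trial
    instance_count=1,
    framework_version="1.13",
    py_version="py39",
    role="arn:aws:iam::123456789012:role/my-sagemaker-role",  # hypothetical
)
backend = SageMakerBackend(
    sm_estimator=estimator,
    metrics_names=["y"],  # used to plot live metric curves in SageMaker
)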

property sm_client
add_metric_definitions_to_sagemaker_estimator(metrics_names)[source]
busy_trial_ids()[source]

Returns a list of ids for currently busy trials

A trial is busy if its status is in_progress or stopping. If the execution setup is able to run n_workers jobs in parallel, and this method returns a list of size n, then the tuner may start n_workers - n new jobs.

Return type:

List[Tuple[int, str]]

Returns:

List of (trial_id, status)

stdout(trial_id)[source]

Fetch stdout log for trial

Parameters:

trial_id (int) – ID of trial

Return type:

List[str]

Returns:

Lines of the log of the trial (stdout)

stderr(trial_id)[source]

Fetch stderr log for trial

Parameters:

trial_id (int) – ID of trial

Return type:

List[str]

Returns:

Lines of the log of the trial (stderr)

property source_dir: str | None
set_entrypoint(entry_point)[source]

Update the entrypoint.

Parameters:

entry_point (str) – New path of the entrypoint.

entrypoint_path()[source]
Return type:

Path

Returns:

Entrypoint path of script to be executed

initialize_sagemaker_session()[source]
copy_checkpoint(src_trial_id, tgt_trial_id)[source]

Copy the checkpoint folder from one trial to the other.

Parameters:
  • src_trial_id (int) – Source trial ID (copy from)

  • tgt_trial_id (int) – Target trial ID (copy to)

delete_checkpoint(trial_id)[source]

Removes checkpoint folder for a trial. It is OK for the folder not to exist.

Parameters:

trial_id (int) – ID of trial for which checkpoint files are deleted

set_path(results_root=None, tuner_name=None)[source]

For this backend, it is mandatory to call this method with tuner_name before the backend is used; results_root is ignored here.

on_tuner_save()[source]

Called at the end of save().
