syne_tune package

class syne_tune.StoppingCriterion(max_wallclock_time=None, max_num_evaluations=None, max_num_trials_started=None, max_num_trials_completed=None, max_cost=None, max_num_trials_finished=None, min_metric_value=None, max_metric_value=None)[source]

Bases: object

Stopping criterion that can be used in a Tuner, for instance Tuner(stop_criterion=StoppingCriterion(max_wallclock_time=3600), ...).

If several arguments are used, the combined criterion is true whenever one of the atomic criteria is true.

In principle, stop_criterion for Tuner can be any lambda function, but this class should be used with remote launching in order to ensure proper serialization.

  • max_wallclock_time (Optional[float]) – Stop once this wallclock time is reached

  • max_num_evaluations (Optional[int]) – Stop once more than this number of metric records have been reported

  • max_num_trials_started (Optional[int]) – Stop once more than this number of trials have been started

  • max_num_trials_completed (Optional[int]) – Stop once more than this number of trials have been completed. This does not include trials which were stopped or failed

  • max_cost (Optional[float]) – Stop once the total cost of evaluations is larger than this value

  • max_num_trials_finished (Optional[int]) – Stop once more than this number of trials have finished (i.e., completed, stopped, failed, or stopping)

  • min_metric_value (Optional[Dict[str, float]]) – Dictionary with thresholds for selected metrics. Stop once an evaluation reports a metric value below a threshold

  • max_metric_value (Optional[Dict[str, float]]) – Dictionary with thresholds for selected metrics. Stop once an evaluation reports a metric value above a threshold

max_wallclock_time: float = None
max_num_evaluations: int = None
max_num_trials_started: int = None
max_num_trials_completed: int = None
max_cost: float = None
max_num_trials_finished: int = None
min_metric_value: Optional[Dict[str, float]] = None
max_metric_value: Optional[Dict[str, float]] = None
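The OR-combination of the atomic criteria can be illustrated with a small stand-in predicate. This is a hedged sketch: it uses a plain dict with made-up keys in place of Syne Tune's TuningStatus, and make_stop_criterion is a hypothetical helper, not part of the library.

```python
def make_stop_criterion(max_wallclock_time=None, max_num_evaluations=None):
    """OR-combine atomic criteria: return True as soon as any one holds."""

    def stop(status):
        # status is a plain dict standing in for TuningStatus;
        # the keys below are illustrative, not real attribute names.
        if (
            max_wallclock_time is not None
            and status["wallclock_time"] >= max_wallclock_time
        ):
            return True
        if (
            max_num_evaluations is not None
            and status["num_evaluations"] > max_num_evaluations
        ):
            return True
        return False

    return stop


criterion = make_stop_criterion(max_wallclock_time=3600, max_num_evaluations=100)
print(criterion({"wallclock_time": 100.0, "num_evaluations": 50}))   # False
print(criterion({"wallclock_time": 3700.0, "num_evaluations": 50}))  # True
```

Since the criteria are OR-combined, tuning stops as soon as either the wallclock budget or the evaluation budget is exhausted, whichever comes first.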
class syne_tune.Tuner(trial_backend, scheduler, stop_criterion, n_workers, sleep_time=5.0, results_update_interval=10.0, print_update_interval=30.0, max_failures=1, tuner_name=None, asynchronous_scheduling=True, wait_trial_completion_when_stopping=False, callbacks=None, metadata=None, suffix_tuner_name=True, save_tuner=True, start_jobs_without_delay=True, trial_backend_path=None)[source]

Bases: object

Controller of the tuning loop; manages the interplay between scheduler and trial backend. The stopping criterion and the number of workers are also maintained here.

  • trial_backend (TrialBackend) – Backend for trial evaluations

  • scheduler (TrialScheduler) – Tuning algorithm for making decisions about which trials to start, stop, pause, or resume

  • stop_criterion (Callable[[TuningStatus], bool]) – Tuning stops when this predicate returns True. It is called in each iteration with the current tuning status. It is recommended to use StoppingCriterion.

  • n_workers (int) – Number of workers used here. Note that the backend needs to support (at least) this number of workers to be run in parallel

  • sleep_time (float) – Time to sleep when all workers are busy. Defaults to DEFAULT_SLEEP_TIME

  • results_update_interval (float) – Frequency at which results are updated and stored (in seconds). Defaults to 10.

  • print_update_interval (float) – Frequency at which result table is printed. Defaults to 30.

  • max_failures (int) – This many trial execution failures are allowed before the tuning loop is aborted. Defaults to 1

  • tuner_name (Optional[str]) – Name associated with the tuning experiment; defaults to the name of the entry point. Must consist of alphanumeric characters, possibly separated by ‘-’. A postfix with a date timestamp is added to ensure uniqueness.

  • asynchronous_scheduling (bool) – Whether to use asynchronous scheduling when scheduling new trials. If True, trials are scheduled as soon as a worker is available. If False, the tuner waits until all trials of the current batch are finished before scheduling a new batch of size n_workers. Defaults to True.

  • wait_trial_completion_when_stopping (bool) – How to deal with running trials when the stopping criterion is met. If True, the tuner waits until all trials are finished. If False, all running trials are terminated. Defaults to False.

  • callbacks (Optional[List[TunerCallback]]) – Called at certain times in the tuning loop, for example when a result is seen. The default callback stores results every results_update_interval.

  • metadata (Optional[dict]) – Dictionary of user-metadata that will be persisted in {tuner_path}/{ST_METADATA_FILENAME}, in addition to metadata provided by Syne Tune. ST_TUNER_CREATION_TIMESTAMP is always included; it records the timestamp at which the tuner started to run.

  • suffix_tuner_name (bool) – If True, a timestamp is appended to the provided tuner_name that ensures uniqueness, otherwise the name is left unchanged and is expected to be unique. Defaults to True.

  • save_tuner (bool) – If True, the Tuner object is serialized at the end of tuning, including its dependencies (e.g., scheduler). This allows all details of the experiment to be recovered. Defaults to True.

  • start_jobs_without_delay (bool) –

    Defaults to True. If this is True, the tuner starts new jobs depending on scheduler decisions communicated to the backend. For example, if a trial has just been stopped (by calling backend.stop_trial), the tuner may start a new one immediately, even if the SageMaker training job is still busy due to stopping delays. This can lead to faster experiment runtime, because the backend is temporarily going over its budget.

    If set to False, the tuner always asks the backend for the number of busy workers, which guarantees that we never go over the n_workers budget. This makes a difference for backends where stopping or pausing trials is not immediate (e.g., SageMakerBackend). Not going over budget means that n_workers can be set up to the available quota without risking an exception due to the quota being exceeded. If you get such exceptions, we recommend using start_jobs_without_delay=False. Also, if the SageMaker warm pool feature is used, we recommend setting start_jobs_without_delay=False, since otherwise more than n_workers warm pools are started, because existing ones are still busy stopping when they should be reassigned.

  • trial_backend_path (Optional[str]) –

    If this is given, the path of trial_backend (where logs and checkpoints of trials are stored) is set to this. Otherwise, it is set to self.tuner_path, so that per-trial information is written to the same path as tuning results.

    If the backend is LocalBackend and the experiment is run remotely, we recommend setting this, since otherwise checkpoints and logs are synced to S3 along with tuning results, which is costly and error-prone.

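The effect of asynchronous_scheduling can be illustrated with a toy simulation of worker occupancy. This is a hedged sketch: simulated_runtime and the fixed trial durations are hypothetical and not part of Syne Tune; it only models why asynchronous scheduling tends to finish sooner.

```python
import heapq


def simulated_runtime(trial_durations, n_workers, asynchronous):
    """Toy model of total wallclock time for a fixed list of trial
    durations, under asynchronous vs. batch-synchronous scheduling."""
    if asynchronous:
        # A new trial starts as soon as any of the n_workers is free.
        free_at = [0.0] * n_workers  # min-heap of worker free times
        for duration in trial_durations:
            start = heapq.heappop(free_at)
            heapq.heappush(free_at, start + duration)
        return max(free_at)
    # Synchronous: the tuner waits for the whole batch of n_workers trials
    # to finish before starting the next batch, so each batch costs as much
    # as its longest trial.
    total = 0.0
    for i in range(0, len(trial_durations), n_workers):
        total += max(trial_durations[i : i + n_workers])
    return total


durations = [3.0, 1.0, 1.0, 1.0]
print(simulated_runtime(durations, n_workers=2, asynchronous=True))   # 3.0
print(simulated_runtime(durations, n_workers=2, asynchronous=False))  # 4.0
```

In the toy example, synchronous scheduling leaves one worker idle while the 3-second trial finishes, so the whole experiment takes longer than with asynchronous scheduling.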

run()[source]

Launches the tuning loop.

static load(tuner_path)[source]

Loads a previously saved Tuner from tuner_path.

best_config(metric=0)[source]

  • metric (Union[str, int, None]) – Indicates which metric to use; can be the index or the name of the metric. Defaults to 0 (the first metric defined in the scheduler)

Return type:

Tuple[int, Dict[str, Any]]

Returns:

The best configuration found while tuning for the given metric, together with the associated trial-id
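The selection behind this return value can be sketched over a list of result records. select_best and the record layout below are hypothetical illustrations, not Syne Tune code.

```python
def select_best(results, metric, mode="min"):
    """Pick (trial_id, config) with the best reported value for `metric`.

    `results` is a list of (trial_id, config, metrics) tuples; `mode`
    says whether lower or higher metric values are better.
    """
    key = lambda record: record[2][metric]
    best = min(results, key=key) if mode == "min" else max(results, key=key)
    return best[0], best[1]


results = [
    (0, {"lr": 0.1}, {"loss": 0.9}),
    (1, {"lr": 0.01}, {"loss": 0.4}),
    (2, {"lr": 0.001}, {"loss": 0.6}),
]
print(select_best(results, "loss", mode="min"))  # (1, {'lr': 0.01})
```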

class syne_tune.Reporter(add_time=True, add_cost=True)[source]

Bases: object

Callback for reporting metric values from a training script back to Syne Tune. Example:

from syne_tune import Reporter

report = Reporter()
for epoch in range(1, epochs + 1):
    # ...
    report(epoch=epoch, accuracy=accuracy)
  • add_time (bool) – If True (default), the time (in secs) since creation of the Reporter object is reported automatically as ST_WORKER_TIME

  • add_cost (bool) – If True (default), estimated dollar cost since creation of Reporter object is reported automatically as ST_WORKER_COST. This is available for SageMaker backend only. Requires add_time=True.

add_time: bool = True
add_cost: bool = True
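What the callback effectively does with add_time=True can be mimicked by a minimal stand-in. This is a sketch, not the actual implementation: MiniReporter and the lowercase key below are illustrative; the real Reporter serializes each result for the backend to pick up and uses the ST_WORKER_TIME constant.

```python
import json
import time


class MiniReporter:
    """Stand-in for Reporter: emits one JSON line per call and, when
    add_time=True, adds the seconds elapsed since creation under a
    worker-time key."""

    def __init__(self, add_time=True):
        self.add_time = add_time
        self._start = time.time()

    def __call__(self, **metrics):
        if self.add_time:
            metrics["st_worker_time"] = time.time() - self._start
        print(json.dumps(metrics))
        return metrics


report = MiniReporter()
out = report(epoch=1, accuracy=0.93)
```

Each call emits one self-contained record, which is why a training script simply calls the reporter once per epoch inside its loop.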