syne_tune.tuner module
- class syne_tune.tuner.Tuner(trial_backend, scheduler, stop_criterion, n_workers, sleep_time=5.0, results_update_interval=10.0, print_update_interval=30.0, max_failures=1, tuner_name=None, asynchronous_scheduling=True, wait_trial_completion_when_stopping=False, callbacks=None, metadata=None, suffix_tuner_name=True, save_tuner=True, start_jobs_without_delay=True, trial_backend_path=None)
Bases: object
Controller of the tuning loop; manages the interplay between the scheduler and the trial backend. The stopping criterion and the number of workers are also maintained here.
- Parameters:
trial_backend (TrialBackend) – Backend for trial evaluations.
scheduler (TrialScheduler) – Tuning algorithm for making decisions about which trials to start, stop, pause, or resume.
stop_criterion (Callable[[TuningStatus], bool]) – Tuning stops when this predicate returns True. Called in each iteration with the current tuning status. It is recommended to use StoppingCriterion.
n_workers (int) – Number of workers used here. Note that the backend needs to support running (at least) this number of workers in parallel.
sleep_time (float) – Time to sleep when all workers are busy. Defaults to DEFAULT_SLEEP_TIME.
results_update_interval (float) – Interval (in seconds) at which results are updated and stored. Defaults to 10.
print_update_interval (float) – Interval at which the result table is printed. Defaults to 30.
max_failures (int) – This many trial execution failures are allowed before the tuning loop is aborted. Defaults to 1.
tuner_name (Optional[str]) – Name associated with the tuning experiment. Defaults to the name of the entry point. Must consist of alphanumeric characters, possibly separated by ‘-’. A date-time suffix is added to ensure uniqueness.
asynchronous_scheduling (bool) – Whether to use asynchronous scheduling when scheduling new trials. If True, trials are scheduled as soon as a worker is available. If False, the tuner waits until all trials have finished before scheduling a new batch of size n_workers. Defaults to True.
wait_trial_completion_when_stopping (bool) – How to deal with running trials when the stopping criterion is met. If True, the tuner waits until all trials have finished. If False, all trials are terminated. Defaults to False.
callbacks (Optional[List[TunerCallback]]) – Called at certain times in the tuning loop, for example when a result is seen. The default callback stores results every results_update_interval.
metadata (Optional[dict]) – Dictionary of user metadata that is persisted in {tuner_path}/{ST_METADATA_FILENAME}. In addition to the metadata provided by the user, SMT_TUNER_CREATION_TIMESTAMP is always included; it records the timestamp at which the tuner started to run.
suffix_tuner_name (bool) – If True, a timestamp is appended to the provided tuner_name to ensure uniqueness. Otherwise, the name is left unchanged and is expected to be unique. Defaults to True.
save_tuner (bool) – If True, the Tuner object is serialized at the end of tuning, including its dependencies (e.g., scheduler). This allows all details of the experiment to be recovered. Defaults to True.
start_jobs_without_delay (bool) – Defaults to True. If this is True, the tuner starts new jobs depending on scheduler decisions communicated to the backend. For example, if a trial has just been stopped (by calling backend.stop_trial), the tuner may start a new one immediately, even if the SageMaker training job is still busy due to stopping delays. This can shorten experiment runtime, because the backend is temporarily allowed to go over its budget.
If set to False, the tuner always asks the backend for the number of busy workers, which guarantees that we never go over the n_workers budget. This makes a difference for backends where stopping or pausing trials is not immediate (e.g., SageMakerBackend). Not going over budget means that n_workers can be set up to the available quota, without running the risk of an exception due to the quota being exceeded. If you get such exceptions, we recommend using start_jobs_without_delay=False. Also, if the SageMaker warm pool feature is used, it is recommended to set start_jobs_without_delay=False, since otherwise more than n_workers warm pools will be started, because existing ones are busy with stopping when they should be reassigned.
trial_backend_path (Optional[str]) – If this is given, the path of trial_backend (where logs and checkpoints of trials are stored) is set to this. Otherwise, it is set to self.tuner_path, so that per-trial information is written to the same path as tuning results.
If the backend is LocalBackend and the experiment is run remotely, we recommend setting this, since otherwise checkpoints and logs are synced to S3, along with tuning results, which is costly and error-prone.
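As documented for stop_criterion, any callable that takes the current tuning status and returns a bool can serve as the stopping criterion. The following is a minimal sketch of such a predicate, not the library's StoppingCriterion. StatusStandIn is a hypothetical stand-in for TuningStatus, and its field names (wallclock_time, num_trials_started) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StatusStandIn:
    """Hypothetical stand-in for TuningStatus; field names are assumptions."""

    wallclock_time: float  # seconds elapsed since tuning started
    num_trials_started: int  # number of trials launched so far


def make_stop_criterion(
    max_wallclock_time: float, max_num_trials: int
) -> Callable[[StatusStandIn], bool]:
    """Build a predicate that returns True once either budget is exhausted."""

    def stop_criterion(status: StatusStandIn) -> bool:
        return (
            status.wallclock_time >= max_wallclock_time
            or status.num_trials_started >= max_num_trials
        )

    return stop_criterion


stop = make_stop_criterion(max_wallclock_time=600.0, max_num_trials=50)

# The tuning loop would call the predicate once per iteration.
print(stop(StatusStandIn(wallclock_time=120.0, num_trials_started=10)))
print(stop(StatusStandIn(wallclock_time=610.0, num_trials_started=10)))
```

Building the predicate with a closure keeps the budgets out of the call signature, so the result matches the single-argument shape Callable[[TuningStatus], bool] that the parameter expects.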
- best_config(metric=0)
- Parameters:
metric (Union[str, int, None]) – Indicates which metric to use; can be the index or the name of the metric. Defaults to 0, the first metric defined in the scheduler.
- Return type:
Tuple[int, Dict[str, Any]]
- Returns:
The best configuration found during tuning for the given metric, together with the associated trial ID, as a (trial_id, config) tuple.
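To make the return shape of best_config concrete, here is a small sketch of the selection it describes, under stated assumptions. It is not the library's implementation: best_config_sketch, the results rows, and the mode argument are all hypothetical names introduced for illustration.

```python
from typing import Any, Dict, List, Tuple


def best_config_sketch(
    results: List[Dict[str, Any]], metric: str, mode: str = "min"
) -> Tuple[int, Dict[str, Any]]:
    """Pick the trial with the best reported value for `metric`.

    `results` is a hypothetical list of rows, each holding a trial ID,
    the trial's configuration, and the metric value it reported.
    """
    pick = min if mode == "min" else max
    best = pick(results, key=lambda row: row[metric])
    # Same ordering as the documented return type: (trial ID, configuration).
    return best["trial_id"], best["config"]


results = [
    {"trial_id": 0, "config": {"lr": 0.1}, "val_loss": 0.42},
    {"trial_id": 1, "config": {"lr": 0.01}, "val_loss": 0.31},
    {"trial_id": 2, "config": {"lr": 0.001}, "val_loss": 0.37},
]

trial_id, config = best_config_sketch(results, metric="val_loss", mode="min")
print(trial_id, config)
```

With a real Tuner, the call would take the form trial_id, config = tuner.best_config(), unpacking the tuple in the same order.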