syne_tune.backend.trial_backend module
- class syne_tune.backend.trial_backend.TrialBackend(delete_checkpoints=False, pass_args_as_json=False)[source]
Bases: object
Interface for backend to execute evaluations of trials.
- Parameters:
delete_checkpoints (bool) – If True, the checkpoints written by a trial are deleted once the trial is stopped or is registered as completed. Checkpoints of paused trials may also be removed, if the scheduler supports early checkpoint removal. Also, as part of stop_all() called at the end of the tuning loop, all remaining checkpoints are deleted. Defaults to False (no checkpoints are removed).
pass_args_as_json (bool) – Normally, the hyperparameter configuration is passed as command line arguments to the trial evaluation script. This works if all hyperparameters have elementary types. If pass_args_as_json == True, the configuration is instead written into a JSON file, whose name is passed as command line argument ST_CONFIG_JSON_FNAME_ARG. The trial evaluation script then loads the configuration from this file. This allows the configuration to contain entries with complex types (e.g., lists or dictionaries), as long as they are JSON-serializable. Defaults to False.
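As a rough sketch of the script side of pass_args_as_json, the snippet below parses a file-name argument and loads the configuration from that JSON file. The argument name config_json_fname is a placeholder chosen for illustration (the real name is the value of ST_CONFIG_JSON_FNAME_ARG), and load_config is a hypothetical helper, not part of the library.

```python
import argparse
import json
from typing import Any, Dict, List

# Placeholder for illustration only; the actual argument name is the value
# of ST_CONFIG_JSON_FNAME_ARG.
CONFIG_JSON_ARG = "config_json_fname"


def load_config(argv: List[str]) -> Dict[str, Any]:
    """Read the hyperparameter configuration from the JSON file named in argv."""
    parser = argparse.ArgumentParser()
    parser.add_argument(f"--{CONFIG_JSON_ARG}", type=str, required=True)
    # parse_known_args tolerates any further arguments the script receives
    args, _ = parser.parse_known_args(argv)
    with open(getattr(args, CONFIG_JSON_ARG), "r") as f:
        return json.load(f)
```

In a real evaluation script, one would call load_config(sys.argv[1:]) and then read entries such as lists or dictionaries directly from the returned dict.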
- start_trial(config, checkpoint_trial_id=None)[source]
Start new trial with new trial ID
- Parameters:
config (Dict[str, Any]) – Configuration for new trial
checkpoint_trial_id (Optional[int]) – If given, the new trial starts from the checkpoint written by this previous trial
- Return type:
- Returns:
New trial, which includes new trial ID
- copy_checkpoint(src_trial_id, tgt_trial_id)[source]
Copy the checkpoint folder from one trial to the other.
- Parameters:
src_trial_id (int) – Source trial ID (copy from)
tgt_trial_id (int) – Target trial ID (copy to)
- delete_checkpoint(trial_id)[source]
Removes checkpoint folder for a trial. It is OK for the folder not to exist.
- Parameters:
trial_id (int) – ID of trial for which checkpoint files are deleted
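The semantics of copy_checkpoint and delete_checkpoint can be illustrated with a small file-system sketch. The folder layout (root / trial ID / "checkpoints") and the helper checkpoint_dir are assumptions made for this example only; a concrete backend may store checkpoints differently.

```python
import shutil
from pathlib import Path


def checkpoint_dir(root: Path, trial_id: int) -> Path:
    # Hypothetical layout: one checkpoint folder per trial ID
    return root / str(trial_id) / "checkpoints"


def copy_checkpoint(root: Path, src_trial_id: int, tgt_trial_id: int) -> None:
    """Copy the checkpoint folder of the source trial to the target trial."""
    src = checkpoint_dir(root, src_trial_id)
    tgt = checkpoint_dir(root, tgt_trial_id)
    # dirs_exist_ok (Python 3.8+) allows the target folder to exist already
    shutil.copytree(src, tgt, dirs_exist_ok=True)


def delete_checkpoint(root: Path, trial_id: int) -> None:
    """Remove the checkpoint folder of a trial; a missing folder is not an error."""
    shutil.rmtree(checkpoint_dir(root, trial_id), ignore_errors=True)
```

Passing ignore_errors=True is one simple way to make deletion tolerate a folder that does not exist.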
- resume_trial(trial_id, new_config=None)[source]
Resume paused trial
- Parameters:
trial_id (int) – ID of (paused) trial to be resumed
new_config (Optional[dict]) – If given, the config maintained in trial.config is replaced by new_config
- Return type:
- Returns:
Information for resumed trial
- pause_trial(trial_id, result=None)[source]
Pauses a running trial
Checks that the operation is valid and calls the backend's internal implementation to actually pause the trial. If the status is queried after this function, it should be "paused".
- Parameters:
trial_id (int) – ID of trial to pause
result (Optional[dict]) – Result dict based on which scheduler decided to pause the trial
- stop_trial(trial_id, result=None)[source]
Stops (and terminates) a running trial
Checks that the operation is valid and calls the backend's internal implementation to actually stop the trial. If the status is queried after this function, it should be "stopped".
- Parameters:
trial_id (int) – ID of trial to stop
result (Optional[dict]) – Result dict based on which scheduler decided to stop the trial
- fetch_status_results(trial_ids)[source]
- Parameters:
trial_ids (List[int]) – Trials whose information should be fetched.
- Return type:
(Dict[int, Tuple[Trial, str]], List[Tuple[int, dict]])
- Returns:
A tuple containing 1) a dictionary from trial-id to Trial and status information; 2) a list of (trial-id, results) pairs for each new result emitted since the last call. The list of results is sorted by the worker time-stamp.
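To make the shape of the second return value concrete, the sketch below reduces a list of (trial-id, results) pairs to the most recent result per trial. The data is a hand-written stand-in for what a backend might return, the metric names epoch and loss are invented for illustration, and latest_results is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def latest_results(new_results: List[Tuple[int, dict]]) -> Dict[int, dict]:
    """Keep the most recent result per trial.

    Relies on the list being sorted by time-stamp, so that later entries
    for the same trial overwrite earlier ones.
    """
    latest: Dict[int, dict] = {}
    for trial_id, result in new_results:
        latest[trial_id] = result
    return latest


# Stand-in for the list of (trial-id, results) pairs
new_results = [
    (0, {"epoch": 1, "loss": 0.9}),
    (1, {"epoch": 1, "loss": 0.8}),
    (0, {"epoch": 2, "loss": 0.7}),
]
print(latest_results(new_results))
```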
- busy_trial_ids()[source]
Returns list of ids for currently busy trials
A trial is busy if its status is in_progress or stopping. If the execution setup is able to run n_workers jobs in parallel, then if this method returns a list of size n, the tuner may start n_workers - n new jobs.
- Return type:
List[Tuple[int, str]]
- Returns:
List of (trial_id, status)
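The arithmetic described above can be written down as a small helper. This is only a sketch: num_free_workers is a hypothetical function, and the busy list is hand-written in the (trial_id, status) format rather than obtained from a real backend.

```python
from typing import List, Tuple


def num_free_workers(busy: List[Tuple[int, str]], n_workers: int) -> int:
    """Number of new jobs that may be started, given the currently busy trials."""
    # Clamp at zero in case more trials are busy than workers are configured
    return max(n_workers - len(busy), 0)


# Stand-in for the list of (trial_id, status) pairs
busy = [(3, "in_progress"), (5, "stopping")]
print(num_free_workers(busy, n_workers=4))
```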
- stdout(trial_id)[source]
Fetch stdout log for trial
- Parameters:
trial_id (int) – ID of trial
- Return type:
List[str]
- Returns:
Lines of the log of the trial (stdout)
- stderr(trial_id)[source]
Fetch stderr log for trial
- Parameters:
trial_id (int) – ID of trial
- Return type:
List[str]
- Returns:
Lines of the log of the trial (stderr)
- set_path(results_root=None, tuner_name=None)[source]
- Parameters:
results_root (Optional[str]) – The local folder that should contain the results of the tuning experiment. Used by Tuner to indicate a desired path where the results should be written to. This is used to unify the location of backend files and Tuner results when possible (in the local backend). By default, the backend does not do anything since not all backends may be able to unify their file locations.
tuner_name (Optional[str]) – Name of the tuner, can be used for instance to save checkpoints on remote storage.