syne_tune.backend.trial_backend module

class syne_tune.backend.trial_backend.TrialBackend(delete_checkpoints=False, pass_args_as_json=False)[source]

Bases: object

Interface for backend to execute evaluations of trials.

Parameters:
  • delete_checkpoints (bool) – If True, the checkpoints written by a trial are deleted once the trial is stopped or is registered as completed. Checkpoints of paused trials may also be removed, if the scheduler supports early checkpoint removal. Also, as part of stop_all() called at the end of the tuning loop, all remaining checkpoints are deleted. Defaults to False (no checkpoints are removed).

  • pass_args_as_json (bool) – Normally, the hyperparameter configuration is passed as command line arguments to the trial evaluation script. This works if all hyperparameters have elementary types. If pass_args_as_json == True, the configuration is instead written into a JSON file, whose name is passed as command line argument ST_CONFIG_JSON_FNAME_ARG. The trial evaluation script then loads the configuration from this file. This allows the configuration to contain entries with complex types (e.g., lists or dictionaries), as long as they are JSON-serializable. Defaults to False.
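As a rough illustration of the pass_args_as_json mechanism, a trial evaluation script could read its configuration along the following lines. This is a minimal sketch, not the library's own loader: the helper load_config and the argument name stored in CONFIG_JSON_ARG are assumptions made for this example, and real code should use the ST_CONFIG_JSON_FNAME_ARG constant instead of a hard-coded string.

```python
import argparse
import json
from pathlib import Path
from typing import Any, Dict, List, Optional

# Assumed argument name, for illustration only. Real code should take the
# name from the ST_CONFIG_JSON_FNAME_ARG constant.
CONFIG_JSON_ARG = "st_config_json_filename"


def load_config(argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Read the configuration from the JSON file named on the command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument(f"--{CONFIG_JSON_ARG}", type=str, required=True)
    # parse_known_args ignores any additional arguments it does not recognize
    args, _ = parser.parse_known_args(argv)
    with Path(getattr(args, CONFIG_JSON_ARG)).open() as handle:
        return json.load(handle)
```

Because the configuration travels through a file, entries such as lists or nested dictionaries arrive in the script unchanged, provided they are JSON-serializable.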

start_trial(config, checkpoint_trial_id=None)[source]

Start new trial with new trial ID

Parameters:
  • config (Dict[str, Any]) – Configuration for new trial

  • checkpoint_trial_id (Optional[int]) – If given, the new trial starts from the checkpoint written by this previous trial

Return type:

TrialResult

Returns:

New trial, which includes new trial ID

copy_checkpoint(src_trial_id, tgt_trial_id)[source]

Copy the checkpoint folder from one trial to the other.

Parameters:
  • src_trial_id (int) – Source trial ID (copy from)

  • tgt_trial_id (int) – Target trial ID (copy to)
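A folder-level copy of this kind could be sketched as follows. The directory layout (one checkpoints folder per trial ID under a common root) and the helper names checkpoint_dir and copy_checkpoint_dir are hypothetical; an actual backend decides where checkpoints live and how they are copied.

```python
import shutil
from pathlib import Path


def checkpoint_dir(root: Path, trial_id: int) -> Path:
    # Hypothetical layout: <root>/<trial_id>/checkpoints
    return root / str(trial_id) / "checkpoints"


def copy_checkpoint_dir(root: Path, src_trial_id: int, tgt_trial_id: int) -> None:
    """Copy the checkpoint folder of the source trial to the target trial."""
    src = checkpoint_dir(root, src_trial_id)
    tgt = checkpoint_dir(root, tgt_trial_id)
    # dirs_exist_ok=True (Python 3.8+) allows copying into an existing folder
    shutil.copytree(src, tgt, dirs_exist_ok=True)
```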

delete_checkpoint(trial_id)[source]

Removes checkpoint folder for a trial. It is OK for the folder not to exist.

Parameters:

trial_id (int) – ID of trial for which checkpoint files are deleted
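The "OK for the folder not to exist" behaviour can be sketched as an idempotent delete. The helper remove_checkpoint_dir below is an assumption for illustration and takes the folder path directly; it is not the backend's implementation.

```python
import shutil
from pathlib import Path


def remove_checkpoint_dir(checkpoint_dir: Path) -> None:
    """Delete a checkpoint folder; do nothing if it does not exist."""
    if checkpoint_dir.is_dir():
        shutil.rmtree(checkpoint_dir)
```

Calling the helper twice in a row is safe, since the second call finds no folder and returns without raising.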

resume_trial(trial_id, new_config=None)[source]

Resume paused trial

Parameters:
  • trial_id (int) – ID of (paused) trial to be resumed

  • new_config (Optional[dict]) – If given, the config maintained in trial.config is replaced by new_config

Return type:

TrialResult

Returns:

Information for resumed trial

pause_trial(trial_id, result=None)[source]

Pauses a running trial

Checks that the operation is valid and calls backend internal implementation to actually pause the trial. If the status is queried after this function, it should be "paused".

Parameters:
  • trial_id (int) – ID of trial to pause

  • result (Optional[dict]) – Result dict based on which scheduler decided to pause the trial

stop_trial(trial_id, result=None)[source]

Stops (and terminates) a running trial

Checks that the operation is valid and calls backend internal implementation to actually stop the trial. If the status is queried after this function, it should be "stopped".

Parameters:
  • trial_id (int) – ID of trial to stop

  • result (Optional[dict]) – Result dict based on which scheduler decided to stop the trial

new_trial_id()[source]

Return type:

int

fetch_status_results(trial_ids)[source]

Parameters:

trial_ids (List[int]) – Trials whose information should be fetched.

Return type:

(Dict[int, Tuple[Trial, str]], List[Tuple[int, dict]])

Returns:

A tuple containing 1) a dictionary from trial-id to Trial and status information; 2) a list of (trial-id, results) pairs for each new result emitted since the last call. The list of results is sorted by the worker time-stamp.
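To make the shape of the second tuple element concrete, the sketch below groups a list of (trial-id, results) pairs by trial. The sample data, the metric names epoch and loss, and the helper group_results_by_trial are hypothetical; only the list-of-pairs structure is taken from the description above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_results_by_trial(
    new_results: List[Tuple[int, dict]]
) -> Dict[int, List[dict]]:
    """Collect the results of each trial, preserving the incoming order."""
    grouped = defaultdict(list)
    for trial_id, result in new_results:
        grouped[trial_id].append(result)
    return dict(grouped)


# Hypothetical results in the documented (trial-id, results) shape
new_results = [
    (0, {"epoch": 1, "loss": 0.9}),
    (1, {"epoch": 1, "loss": 0.8}),
    (0, {"epoch": 2, "loss": 0.7}),
]
grouped = group_results_by_trial(new_results)
```

Since the incoming list is sorted by worker time-stamp, each per-trial list keeps that order as well.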

busy_trial_ids()[source]

Returns the IDs of currently busy trials, each paired with its status

A trial is busy if its status is in_progress or stopping. If the execution setup can run n_workers jobs in parallel and this method returns a list of size n, the tuner may start n_workers - n new jobs.

Return type:

List[Tuple[int, str]]

Returns:

List of (trial_id, status)
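The capacity rule described for this method amounts to a single subtraction. The helper num_free_workers below is a hypothetical name used for illustration, and the clamp at zero is an extra guard added in this sketch, not something the description requires.

```python
from typing import List, Tuple


def num_free_workers(busy: List[Tuple[int, str]], n_workers: int) -> int:
    """Number of new jobs that may be started, given the busy trials."""
    # n_workers - n, where n is the number of busy trials; never negative
    return max(n_workers - len(busy), 0)
```

For example, with n_workers = 4 and two busy trials (one in_progress, one stopping), two new jobs may be started.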

stdout(trial_id)[source]

Fetch stdout log for trial

Parameters:

trial_id (int) – ID of trial

Return type:

List[str]

Returns:

Lines of the log of the trial (stdout)

stderr(trial_id)[source]

Fetch stderr log for trial

Parameters:

trial_id (int) – ID of trial

Return type:

List[str]

Returns:

Lines of the log of the trial (stderr)

stop_all()[source]

Stop all trials which are in progress.

set_path(results_root=None, tuner_name=None)[source]

Parameters:
  • results_root (Optional[str]) – The local folder that should contain the results of the tuning experiment. Used by Tuner to indicate a desired path where the results should be written to. This is used to unify the location of backend files and Tuner results when possible (in the local backend). By default, the backend does not do anything since not all backends may be able to unify their file locations.

  • tuner_name (Optional[str]) – Name of the tuner, can be used for instance to save checkpoints on remote storage.

entrypoint_path()[source]

Return type:

Path

Returns:

Entrypoint path of script to be executed

set_entrypoint(entry_point)[source]

Update the entrypoint.

Parameters:

entry_point (str) – New path of the entrypoint.

on_tuner_save()[source]

Called at the end of save().