syne_tune.callbacks.hyperband_remove_checkpoints_callback module

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.TrialStatus[source]

Bases: object

RUNNING = 'RUNNING'
PAUSED_WITH_CHECKPOINT = 'PAUSED-WITH-CP'
PAUSED_NO_CHECKPOINT = 'PAUSED-NO-CP'
STOPPED_OR_COMPLETED = 'STOPPED-COMPLETED'
class syne_tune.callbacks.hyperband_remove_checkpoints_callback.BetaBinomialEstimator(beta_mean, beta_size)[source]

Bases: object

Estimator of the probability \(p = P(X = 1)\) for a variable \(X\) with Bernoulli distribution. This is using a Beta prior, which is conjugate to the binomial likelihood. The prior is parameterized by effective sample size beta_size (\(a + b\)) and mean beta_mean (\(a / (a + b)\)).

update(data)[source]
property num_one: int
property num_total
posterior_mean()[source]
Return type:

float
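The conjugate update behind this estimator can be sketched as follows. This is a minimal re-implementation for illustration only, not the actual BetaBinomialEstimator class; it mirrors the parameterization above, where a Beta(a, b) prior has mean \(a / (a + b)\) and effective sample size \(a + b\):

```python
class BetaBinomialSketch:
    """Illustrative sketch of a Beta-Binomial estimator (not the real class)."""

    def __init__(self, beta_mean: float, beta_size: float):
        # Recover (a, b) from mean = a / (a + b) and size = a + b
        self._a = beta_mean * beta_size
        self._b = (1.0 - beta_mean) * beta_size
        self._num_one = 0
        self._num_total = 0

    def update(self, data):
        # data: iterable of bools, each an observation of the Bernoulli X
        data = list(data)
        self._num_one += sum(data)
        self._num_total += len(data)

    def posterior_mean(self) -> float:
        # Posterior is Beta(a + num_one, b + (num_total - num_one)),
        # whose mean is (a + num_one) / (a + b + num_total)
        return (self._a + self._num_one) / (self._a + self._b + self._num_total)
```

With `beta_mean=0.5, beta_size=2` and observations `[True, False, False, False]`, the posterior mean is \((1 + 1) / (2 + 4) = 1/3\): the prior counts as two pseudo-observations, one of them a success.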

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.TrialInformation(trial_id, level, rank, rung_len, score_val=None)[source]

Bases: object

trial_id: str
level: int
rank: int
rung_len: int
score_val: Optional[float] = None
class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsCommon(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode)[source]

Bases: TunerCallback

Common base class for HyperbandRemoveCheckpointsCallback and HyperbandRemoveCheckpointsBaselineCallback.

on_tuning_start(tuner)[source]

Called at start of tuning loop

Parameters:

tuner – Tuner object

property num_checkpoints_removed: int
on_loop_end()[source]

Called at end of each tuning loop iteration

This is done before the loop stopping condition is checked and acted upon.

on_trial_complete(trial, result)[source]

Called when a trial completes (Status.completed)

The same arguments have been passed to scheduler.on_trial_complete just before this call.

Parameters:
  • trial (Trial) – Trial that just completed.

  • result (Dict[str, Any]) – Last result obtained.

on_trial_result(trial, status, result, decision)[source]

Called when a new result (reported by a trial) is observed

The arguments here are inputs or outputs of scheduler.on_trial_result (called just before).

Parameters:
  • trial (Trial) – Trial whose report has been received

  • status (str) – Status of trial before scheduler.on_trial_result has been called

  • result (Dict[str, Any]) – Result dict received

  • decision (str) – Decision returned by scheduler.on_trial_result

on_start_trial(trial)[source]

Called just after a new trial is started

Parameters:

trial (Trial) – Trial which has just been started

on_resume_trial(trial)[source]

Called just after a trial is resumed

Parameters:

trial (Trial) – Trial which has just been resumed

trials_resumed_without_checkpoint()[source]
Return type:

List[Tuple[str, int]]

Returns:

List of (trial_id, level) for trials which were resumed, even though their checkpoint had been removed

extra_results()[source]
Return type:

Dict[str, Any]

Returns:

Dictionary containing information which can be appended to results written out

static extra_results_keys()[source]
Return type:

List[str]

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsCallback(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode, approx_steps=25, prior_beta_mean=0.33, prior_beta_size=2, min_data_at_rung=5)[source]

Bases: HyperbandRemoveCheckpointsCommon

Implements speculative early removal of checkpoints of paused trials for HyperbandScheduler (only for types which pause trials at rung levels).

In this scheduler, any paused trial can in principle be resumed in the future, which is why we remove checkpoints speculatively. The idea is to keep the total number of checkpoints no larger than max_num_checkpoints. If this limit is reached, we rank all currently paused trials which still have a checkpoint and remove checkpoints for those with lowest scores. If a trial is resumed whose checkpoint has been removed, we have to train from scratch, at a cost proportional to the rung level the trial is paused at. The score is an approximation to this expected cost, the product of rung level and probability of getting resumed. This probability depends on the current rung size, the rank of the trial in the rung, and both the time spent and remaining for the experiment, so we need max_wallclock_time. Details are given in a technical report.
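The removal rule described above can be sketched roughly as follows. This is a simplified illustration with hypothetical names; the actual score computation, in particular the resume-probability estimate, is considerably more involved:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PausedTrial:
    trial_id: str
    level: int  # rung level the trial is paused at


def checkpoints_to_remove(
    paused_with_checkpoint: List[PausedTrial],
    num_checkpoints: int,
    max_num_checkpoints: int,
    prob_resumed: Callable[[PausedTrial], float],
) -> List[str]:
    # Score approximates the expected cost of removing a checkpoint:
    # retraining up to the rung level, weighted by the probability
    # that the trial gets resumed
    scored = sorted(
        paused_with_checkpoint,
        key=lambda trial: trial.level * prob_resumed(trial),
    )
    excess = max(0, num_checkpoints - max_num_checkpoints)
    # Remove checkpoints of the trials with the lowest scores first
    return [trial.trial_id for trial in scored[:excess]]
```

If the limit is not yet reached (`excess == 0`), nothing is removed; otherwise the lowest-scoring paused trials lose their checkpoints.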

The probability of getting resumed also depends on the probability \(p_r\) that a new trial arriving at rung \(r\) ranks better than an existing paused one with a checkpoint. These probabilities are estimated here. For each new arrival at a rung, we obtain one datapoint for every paused trial with checkpoint there. We use Bayesian estimators with Beta prior given by mean prior_beta_mean and sample size prior_beta_size. The mean should be \(< 1/2\). We also run an estimator for an overall probability \(p\), which is fed by all datapoints. This estimator is used as long as there are fewer than min_data_at_rung datapoints at rung \(r\).

Parameters:
  • max_num_checkpoints (int) – Once the total number of checkpoints surpasses this number, we remove some.

  • max_wallclock_time (int) – Maximum time of the experiment

  • metric (str) – Name of metric in result of on_trial_result()

  • resource_attr (str) – Name of resource attribute in result of on_trial_result()

  • mode (str) – “min” or “max”

  • approx_steps (int) – Number of approximation steps in score computation. Computations scale cubically in this number. Defaults to 25

  • prior_beta_mean (float) – Parameter of Beta prior for estimators. Defaults to 0.33

  • prior_beta_size (float) – Parameter of Beta prior for estimators. Defaults to 2

  • min_data_at_rung (int) – See above. Defaults to 5
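The fallback between the per-rung estimator of \(p_r\) and the overall estimator of \(p\) could look like this. This is a hypothetical sketch, assuming estimator objects expose `num_total` and `posterior_mean()` as documented for BetaBinomialEstimator above:

```python
def prob_new_trial_ranks_better(rung_estimator, overall_estimator,
                                min_data_at_rung: int = 5) -> float:
    # Use the per-rung estimator only once it has seen enough data;
    # otherwise fall back to the overall estimator fed by all rungs
    if (
        rung_estimator is not None
        and rung_estimator.num_total >= min_data_at_rung
    ):
        return rung_estimator.posterior_mean()
    return overall_estimator.posterior_mean()
```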

on_tuning_start(tuner)[source]

Called at start of tuning loop

Parameters:

tuner – Tuner object

estimator_for_rung(level)[source]
Return type:

BetaBinomialEstimator

on_trial_result(trial, status, result, decision)[source]

Called when a new result (reported by a trial) is observed

The arguments here are inputs or outputs of scheduler.on_trial_result (called just before).

Parameters:
  • trial (Trial) – Trial whose report has been received

  • status (str) – Status of trial before scheduler.on_trial_result has been called

  • result (Dict[str, Any]) – Result dict received

  • decision (str) – Decision returned by scheduler.on_trial_result

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsBaselineCallback(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode, baseline=None)[source]

Bases: HyperbandRemoveCheckpointsCommon

Implements some simple baselines to compare with HyperbandRemoveCheckpointsCallback.

Parameters:
  • max_num_checkpoints (int) – Once the total number of checkpoints surpasses this number, we remove some.

  • max_wallclock_time (int) – Maximum time of the experiment

  • metric (str) – Name of metric in result of on_trial_result()

  • resource_attr (str) – Name of resource attribute in result of on_trial_result()

  • mode (str) – “min” or “max”

  • baseline (Optional[str]) –

    Type of baseline. Defaults to “by_level”

    • ”random”: Select random paused trial with checkpoint

    • ”by_level”: Select paused trial (with checkpoint) on lowest rung level, and then of worst rank
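The ”by_level” baseline can be sketched as follows. This is illustrative only, assuming a rank convention where a larger rank number means a worse position within the rung:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RankedTrial:
    trial_id: str
    level: int  # rung level the trial is paused at
    rank: int   # rank within the rung (assumption: larger means worse)


def select_by_level(paused_with_checkpoint: List[RankedTrial]) -> Optional[RankedTrial]:
    # "by_level": pick the trial on the lowest rung level, and among
    # those, the one of worst rank
    if not paused_with_checkpoint:
        return None
    return max(paused_with_checkpoint, key=lambda t: (-t.level, t.rank))
```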