syne_tune.callbacks.hyperband_remove_checkpoints_callback module

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.TrialStatus[source]

Bases: object

RUNNING = 'RUNNING'
PAUSED_WITH_CHECKPOINT = 'PAUSED-WITH-CP'
PAUSED_NO_CHECKPOINT = 'PAUSED-NO-CP'
STOPPED_OR_COMPLETED = 'STOPPED-COMPLETED'
class syne_tune.callbacks.hyperband_remove_checkpoints_callback.BetaBinomialEstimator(beta_mean, beta_size)[source]

Bases: object

Estimator of the probability \(p = P(X = 1)\) for a variable \(X\) with Bernoulli distribution. This is using a Beta prior, which is conjugate to the binomial likelihood. The prior is parameterized by effective sample size beta_size (\(a + b\)) and mean beta_mean (\(a / (a + b)\)).

update(data)[source]
property num_one: int
property num_total
posterior_mean()[source]
Return type:

float
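The conjugate update behind this estimator can be sketched as follows. This is a minimal re-implementation for illustration only, not the actual BetaBinomialEstimator class; it mirrors the parameterization above, where a Beta(a, b) prior has mean \(a / (a + b)\) and effective sample size \(a + b\):

```python
class BetaBinomialSketch:
    """Illustrative sketch of a Beta-Binomial estimator (not the real class)."""

    def __init__(self, beta_mean: float, beta_size: float):
        # Recover (a, b) from mean = a / (a + b) and size = a + b
        self._a = beta_mean * beta_size
        self._b = (1.0 - beta_mean) * beta_size
        self._num_one = 0
        self._num_total = 0

    def update(self, data):
        # data: iterable of bools, each an observation of the Bernoulli X
        data = list(data)
        self._num_one += sum(data)
        self._num_total += len(data)

    def posterior_mean(self) -> float:
        # Posterior is Beta(a + num_one, b + (num_total - num_one)),
        # whose mean is (a + num_one) / (a + b + num_total)
        return (self._a + self._num_one) / (self._a + self._b + self._num_total)
```

With `beta_mean=0.5, beta_size=2` and observations `[True, False, False, False]`, the posterior mean is \((1 + 1) / (2 + 4) = 1/3\): the prior counts as two pseudo-observations, one of them a success.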

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.TrialInformation(trial_id, level, rank, rung_len, score_val=None)[source]

Bases: object

trial_id: str
level: int
rank: int
rung_len: int
score_val: Optional[float] = None
class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsCommon(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode)[source]

Bases: TunerCallback

Common base class for HyperbandRemoveCheckpointsCallback and HyperbandRemoveCheckpointsBaselineCallback.

on_tuning_start(tuner)[source]

Called at start of tuning loop

Parameters:

tuner – Tuner object

property num_checkpoints_removed: int
on_loop_end()[source]

Called at end of each tuning loop iteration

This is done before the loop stopping condition is checked and acted upon.

on_trial_complete(trial, result)[source]

Called when a trial completes (Status.completed)

The same arguments have been passed to scheduler.on_trial_complete just before this call.

Parameters:
  • trial (Trial) – Trial that just completed.

  • result (Dict[str, Any]) – Last result obtained.

on_trial_result(trial, status, result, decision)[source]

Called when a new result (reported by a trial) is observed

The arguments here are inputs or outputs of scheduler.on_trial_result (called just before).

Parameters:
  • trial (Trial) – Trial whose report has been received

  • status (str) – Status of trial before scheduler.on_trial_result has been called

  • result (Dict[str, Any]) – Result dict received

  • decision (str) – Decision returned by scheduler.on_trial_result

on_start_trial(trial)[source]

Called just after a new trial is started

Parameters:

trial (Trial) – Trial which has just been started

on_resume_trial(trial)[source]

Called just after a trial is resumed

Parameters:

trial (Trial) – Trial which has just been resumed

trials_resumed_without_checkpoint()[source]
Return type:

List[Tuple[str, int]]

Returns:

List of (trial_id, level) for trials which were resumed, even though their checkpoint had been removed

extra_results()[source]
Return type:

Dict[str, Any]

Returns:

Dictionary containing information which can be appended to results written out

static extra_results_keys()[source]
Return type:

List[str]

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsCallback(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode, approx_steps=25, prior_beta_mean=0.33, prior_beta_size=2, min_data_at_rung=5)[source]

Bases: HyperbandRemoveCheckpointsCommon

Implements speculative early removal of checkpoints of paused trials for HyperbandScheduler (only for types which pause trials at rung levels).

In this scheduler, any paused trial can in principle be resumed in the future, which is why we remove checkpoints speculatively. The idea is to keep the total number of checkpoints no larger than max_num_checkpoints. If this limit is reached, we rank all currently paused trials which still have a checkpoint and remove checkpoints for those with lowest scores. If a trial is resumed whose checkpoint has been removed, we have to train from scratch, at a cost proportional to the rung level the trial is paused at. The score is an approximation to this expected cost, the product of rung level and probability of getting resumed. This probability depends on the current rung size, the rank of the trial in the rung, and both the time spent and remaining for the experiment, so we need max_wallclock_time. Details are given in a technical report.
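The removal rule described above can be sketched roughly as follows. This is a simplified illustration with hypothetical names; the actual score computation, in particular the resume-probability estimate, is considerably more involved:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PausedTrial:
    trial_id: str
    level: int  # rung level the trial is paused at


def checkpoints_to_remove(
    paused_with_checkpoint: List[PausedTrial],
    num_checkpoints: int,
    max_num_checkpoints: int,
    prob_resumed: Callable[[PausedTrial], float],
) -> List[str]:
    # Score approximates the expected cost of removing a checkpoint:
    # retraining up to the rung level, weighted by the probability
    # that the trial gets resumed
    scored = sorted(
        paused_with_checkpoint,
        key=lambda trial: trial.level * prob_resumed(trial),
    )
    excess = max(0, num_checkpoints - max_num_checkpoints)
    # Remove checkpoints of the trials with the lowest scores first
    return [trial.trial_id for trial in scored[:excess]]
```

If the limit is not yet reached (`excess == 0`), nothing is removed; otherwise the lowest-scoring paused trials lose their checkpoints.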

The probability of getting resumed also depends on the probability \(p_r\) that a new trial arriving at rung \(r\) ranks better than an existing paused one with a checkpoint. These probabilities are estimated here. For each new arrival at a rung, we obtain one datapoint for every paused trial with checkpoint there. We use Bayesian estimators with Beta prior given by mean prior_beta_mean and sample size prior_beta_size. The mean should be \(< 1/2\). We also run an estimator for an overall probability \(p\), which is fed by all datapoints. This estimator is used as long as there are fewer than min_data_at_rung datapoints at rung \(r\).

Parameters:
  • max_num_checkpoints (int) – Once the total number of checkpoints surpasses this number, we remove some.

  • max_wallclock_time (int) – Maximum time of the experiment

  • metric (str) – Name of metric in result of on_trial_result()

  • resource_attr (str) – Name of resource attribute in result of on_trial_result()

  • mode (str) – “min” or “max”

  • approx_steps (int) – Number of approximation steps in score computation. Computations scale cubically in this number. Defaults to 25

  • prior_beta_mean (float) – Parameter of Beta prior for estimators. Defaults to 0.33

  • prior_beta_size (float) – Parameter of Beta prior for estimators. Defaults to 2

  • min_data_at_rung (int) – See above. Defaults to 5
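The fallback between the per-rung estimator of \(p_r\) and the overall estimator of \(p\) could look like this. This is a hypothetical sketch, assuming estimator objects expose `num_total` and `posterior_mean()` as documented for BetaBinomialEstimator above:

```python
def prob_new_trial_ranks_better(rung_estimator, overall_estimator,
                                min_data_at_rung: int = 5) -> float:
    # Use the per-rung estimator only once it has seen enough data;
    # otherwise fall back to the overall estimator fed by all rungs
    if (
        rung_estimator is not None
        and rung_estimator.num_total >= min_data_at_rung
    ):
        return rung_estimator.posterior_mean()
    return overall_estimator.posterior_mean()
```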

on_tuning_start(tuner)[source]

Called at start of tuning loop

Parameters:

tuner – Tuner object

estimator_for_rung(level)[source]
Return type:

BetaBinomialEstimator

on_trial_result(trial, status, result, decision)[source]

Called when a new result (reported by a trial) is observed

The arguments here are inputs or outputs of scheduler.on_trial_result (called just before).

Parameters:
  • trial (Trial) – Trial whose report has been received

  • status (str) – Status of trial before scheduler.on_trial_result has been called

  • result (Dict[str, Any]) – Result dict received

  • decision (str) – Decision returned by scheduler.on_trial_result

class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsBaselineCallback(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode, baseline=None)[source]

Bases: HyperbandRemoveCheckpointsCommon

Implements some simple baselines to compare with HyperbandRemoveCheckpointsCallback.

Parameters:
  • max_num_checkpoints (int) – Once the total number of checkpoints surpasses this number, we remove some.

  • max_wallclock_time (int) – Maximum time of the experiment

  • metric (str) – Name of metric in result of on_trial_result()

  • resource_attr (str) – Name of resource attribute in result of on_trial_result()

  • mode (str) – “min” or “max”

  • baseline (Optional[str]) –

    Type of baseline. Defaults to “by_level”

    • ”random”: Select random paused trial with checkpoint

    • ”by_level”: Select paused trial (with checkpoint) on lowest rung level, and then of worst rank
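The ”by_level” baseline can be sketched as follows. This is illustrative only, assuming a rank convention where a larger rank number means a worse position within the rung:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RankedTrial:
    trial_id: str
    level: int  # rung level the trial is paused at
    rank: int   # rank within the rung (assumption: larger means worse)


def select_by_level(paused_with_checkpoint: List[RankedTrial]) -> Optional[RankedTrial]:
    # "by_level": pick the trial on the lowest rung level, and among
    # those, the one of worst rank
    if not paused_with_checkpoint:
        return None
    return max(paused_with_checkpoint, key=lambda t: (-t.level, t.rank))
```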