syne_tune.callbacks.hyperband_remove_checkpoints_callback module
- class syne_tune.callbacks.hyperband_remove_checkpoints_callback.TrialStatus
Bases:
object
- RUNNING = 'RUNNING'
- PAUSED_WITH_CHECKPOINT = 'PAUSED-WITH-CP'
- PAUSED_NO_CHECKPOINT = 'PAUSED-NO-CP'
- STOPPED_OR_COMPLETED = 'STOPPED-COMPLETED'
- class syne_tune.callbacks.hyperband_remove_checkpoints_callback.BetaBinomialEstimator(beta_mean, beta_size)
Bases:
object
Estimator of the probability \(p = P(X = 1)\) for a variable \(X\) with Bernoulli distribution. This uses a Beta prior, which is conjugate to the binomial likelihood. The prior is parameterized by effective sample size beta_size (\(a + b\)) and mean beta_mean (\(a / (a + b)\)).
- property num_one: int
- property num_total
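The parameterization above can be illustrated with a minimal sketch. It is not the Syne Tune implementation: the class name `BetaBinomialSketch` and the methods `update` and `posterior_mean` are hypothetical, and only `beta_mean`, `beta_size`, `num_one` and `num_total` come from the documented interface. The sketch assumes the point estimate of \(p\) is the posterior mean \((a + \text{num\_one}) / (a + b + \text{num\_total})\).

```python
class BetaBinomialSketch:
    """Illustrative Beta-Binomial estimator (hypothetical, not the library class)."""

    def __init__(self, beta_mean: float, beta_size: float):
        if not 0.0 < beta_mean < 1.0:
            raise ValueError("beta_mean must lie in (0, 1)")
        if beta_size <= 0.0:
            raise ValueError("beta_size must be positive")
        # Recover Beta(a, b) from mean a / (a + b) and effective size a + b.
        self.a = beta_mean * beta_size
        self.b = beta_size - self.a
        self.num_one = 0    # number of observations with X = 1
        self.num_total = 0  # number of all observations

    def update(self, x: int) -> None:
        """Record one Bernoulli observation (hypothetical method name)."""
        self.num_one += int(x == 1)
        self.num_total += 1

    def posterior_mean(self) -> float:
        """Mean of the Beta posterior, used here as point estimate of p."""
        return (self.a + self.num_one) / (self.a + self.b + self.num_total)


est = BetaBinomialSketch(beta_mean=0.33, beta_size=2)
for x in [1, 0, 0, 1]:
    est.update(x)
print(est.num_one, est.num_total, est.posterior_mean())
```

With few observations the estimate stays close to `beta_mean`; as `num_total` grows, the observed frequency `num_one / num_total` dominates.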
- class syne_tune.callbacks.hyperband_remove_checkpoints_callback.TrialInformation(trial_id, level, rank, rung_len, score_val=None)
Bases:
object
- trial_id: str
- level: int
- rank: int
- rung_len: int
- score_val: Optional[float] = None
- class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsCommon(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode)
Bases:
TunerCallback
Common base class for HyperbandRemoveCheckpointsCallback and HyperbandRemoveCheckpointsBaselineCallback.
- property num_checkpoints_removed: int
- on_loop_end()
Called at the end of each tuning loop iteration.
This is done before the loop stopping condition is checked and acted upon.
- on_trial_complete(trial, result)
Called when a trial completes (Status.completed).
The arguments here have also been passed to scheduler.on_trial_complete before this call.
- Parameters:
trial (Trial) – Trial that just completed.
result (Dict[str, Any]) – Last result obtained.
- on_trial_result(trial, status, result, decision)
Called when a new result (reported by a trial) is observed
The arguments here are inputs or outputs of scheduler.on_trial_result (called just before).
- Parameters:
trial (Trial) – Trial whose report has been received
status (str) – Status of trial before scheduler.on_trial_result has been called
result (Dict[str, Any]) – Result dict received
decision (str) – Decision returned by scheduler.on_trial_result
- on_start_trial(trial)
Called just after a new trial is started
- Parameters:
trial (Trial) – Trial which has just been started
- on_resume_trial(trial)
Called just after a trial is resumed
- Parameters:
trial (Trial) – Trial which has just been resumed
- trials_resumed_without_checkpoint()
- Return type:
List[Tuple[str, int]]
- Returns:
List of (trial_id, level) for trials which were resumed, even though their checkpoint was removed
- class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsCallback(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode, approx_steps=25, prior_beta_mean=0.33, prior_beta_size=2, min_data_at_rung=5)
Bases:
HyperbandRemoveCheckpointsCommon
Implements speculative early removal of checkpoints of paused trials for HyperbandScheduler (only for types which pause trials at rung levels).
In this scheduler, any paused trial can in principle be resumed in the future, which is why we remove checkpoints speculatively. The idea is to keep the total number of checkpoints no larger than max_num_checkpoints. If this limit is reached, we rank all currently paused trials which still have a checkpoint and remove checkpoints for those with the lowest scores. If a trial is resumed whose checkpoint has been removed, we have to train from scratch, at a cost proportional to the rung level the trial is paused at. The score is an approximation to this expected cost, the product of rung level and probability of getting resumed. This probability depends on the current rung size, the rank of the trial in the rung, and both the time spent and remaining for the experiment, so we need max_wallclock_time. Details are given in a technical report.
The probability of getting resumed also depends on the probability \(p_r\) that a new trial arriving at rung \(r\) ranks better than an existing paused one with a checkpoint. These probabilities are estimated here. For each new arrival at a rung, we obtain one datapoint for every paused trial with checkpoint there. We use Bayesian estimators with Beta prior given by mean prior_beta_mean and sample size prior_beta_size. The mean should be \(< 1/2\). We also run an estimator for an overall probability \(p\), which is fed by all datapoints. This estimator is used as long as there are fewer than min_data_at_rung datapoints at rung \(r\).
- Parameters:
max_num_checkpoints (int) – Once the total number of checkpoints surpasses this number, we remove some.
max_wallclock_time (int) – Maximum time of the experiment
metric (str) – Name of metric in result of on_trial_result()
resource_attr (str) – Name of resource attribute in result of on_trial_result()
mode (str) – “min” or “max”
approx_steps (int) – Number of approximation steps in score computation. Computations scale cubically in this number. Defaults to 25
prior_beta_mean (float) – Parameter of Beta prior for estimators. Defaults to 0.33
prior_beta_size (float) – Parameter of Beta prior for estimators. Defaults to 2
min_data_at_rung (int) – See above. Defaults to 5
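The interplay of the per-rung estimators and the overall estimator, governed by min_data_at_rung, might look roughly as follows. This is a sketch under stated assumptions, not the library code: `RungProbabilitySketch`, `update`, `estimate` and `posterior_mean` are hypothetical names, and the point estimate is assumed to be the Beta posterior mean.

```python
from collections import defaultdict


def posterior_mean(num_one, num_total, beta_mean, beta_size):
    """Posterior mean under a Beta prior with given mean and effective size."""
    a = beta_mean * beta_size
    return (a + num_one) / (beta_size + num_total)


class RungProbabilitySketch:
    """Hypothetical helper: one estimator per rung plus an overall one."""

    def __init__(self, prior_beta_mean=0.33, prior_beta_size=2, min_data_at_rung=5):
        self.prior = (prior_beta_mean, prior_beta_size)
        self.min_data_at_rung = min_data_at_rung
        # rung level -> [num_one, num_total]
        self.per_rung = defaultdict(lambda: [0, 0])
        self.overall = [0, 0]

    def update(self, rung, new_trial_is_better):
        """One datapoint: did the new arrival rank better than a paused trial?"""
        for counts in (self.per_rung[rung], self.overall):
            counts[0] += int(new_trial_is_better)
            counts[1] += 1

    def estimate(self, rung):
        """Estimate of p_r, falling back to the overall p if data is scarce."""
        counts = self.per_rung[rung]
        if counts[1] < self.min_data_at_rung:
            counts = self.overall
        return posterior_mean(counts[0], counts[1], *self.prior)


sketch = RungProbabilitySketch()
for better in [True, False, False]:
    sketch.update(1, better)   # only 3 datapoints at rung 1
for better in [False] * 6:
    sketch.update(3, better)   # 6 datapoints at rung 3
p1 = sketch.estimate(1)        # uses the overall estimator
p3 = sketch.estimate(3)        # uses the rung-specific estimator
print(p1, p3)
```

Falling back to the overall estimator keeps early estimates from being driven by the prior alone at rungs that have seen few arrivals.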
- on_trial_result(trial, status, result, decision)
Called when a new result (reported by a trial) is observed
The arguments here are inputs or outputs of scheduler.on_trial_result (called just before).
- Parameters:
trial (Trial) – Trial whose report has been received
status (str) – Status of trial before scheduler.on_trial_result has been called
result (Dict[str, Any]) – Result dict received
decision (str) – Decision returned by scheduler.on_trial_result
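The ranking step described for this class (score as the product of rung level and probability of getting resumed, with the lowest scores removed first) can be sketched as below. The probability of getting resumed is taken as a given input here, since its actual computation is deferred to the technical report. `PausedTrial`, `prob_resume` and `select_checkpoints_to_remove` are hypothetical names, not part of Syne Tune.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PausedTrial:
    """Paused trial which still has a checkpoint (hypothetical container)."""
    trial_id: str
    level: int          # rung level the trial is paused at
    prob_resume: float  # assumed given; the real value comes from the estimators


def select_checkpoints_to_remove(
    paused: List[PausedTrial], num_checkpoints: int, max_num_checkpoints: int
) -> List[str]:
    """Return trial IDs whose checkpoints to remove, lowest score first."""
    num_to_remove = max(0, num_checkpoints - max_num_checkpoints)
    # Score approximates the expected cost of retraining from scratch.
    ranked = sorted(paused, key=lambda t: t.level * t.prob_resume)
    return [t.trial_id for t in ranked[:num_to_remove]]


paused = [
    PausedTrial("a", level=1, prob_resume=0.9),    # score 0.9
    PausedTrial("b", level=9, prob_resume=0.05),   # score 0.45
    PausedTrial("c", level=3, prob_resume=0.5),    # score 1.5
    PausedTrial("d", level=27, prob_resume=0.1),   # score 2.7
]
removed = select_checkpoints_to_remove(paused, num_checkpoints=6, max_num_checkpoints=4)
print(removed)
```

In this example trial "b" sits at a higher rung than "a", but its low chance of being resumed gives it the smaller expected retraining cost, so its checkpoint is removed first.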
- class syne_tune.callbacks.hyperband_remove_checkpoints_callback.HyperbandRemoveCheckpointsBaselineCallback(max_num_checkpoints, max_wallclock_time, metric, resource_attr, mode, baseline=None)
Bases:
HyperbandRemoveCheckpointsCommon
Implements some simple baselines to compare with HyperbandRemoveCheckpointsCallback.
- Parameters:
max_num_checkpoints (int) – Once the total number of checkpoints surpasses this number, we remove some.
max_wallclock_time (int) – Maximum time of the experiment
metric (str) – Name of metric in result of on_trial_result()
resource_attr (str) – Name of resource attribute in result of on_trial_result()
mode (str) – “min” or “max”
baseline (Optional[str]) – Type of baseline. Defaults to “by_level”
- “random”: Select random paused trial with checkpoint
- “by_level”: Select paused trial (with checkpoint) on lowest rung level, and then of worst rank
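The two baseline rules can be illustrated with a short sketch. It is not the library implementation: `select_baseline` is a hypothetical function, trials are represented as plain `(trial_id, level, rank)` tuples, and the sketch assumes that a larger rank means a worse position in the rung.

```python
import random
from typing import List, Optional, Tuple

# (trial_id, level, rank); assumption: rank 0 is best, larger rank is worse
PausedEntry = Tuple[str, int, int]


def select_baseline(
    paused: List[PausedEntry],
    baseline: str = "by_level",
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick the paused trial (with checkpoint) whose checkpoint to remove."""
    if not paused:
        return None
    if baseline == "random":
        rng = rng or random.Random()
        return rng.choice(paused)[0]
    if baseline == "by_level":
        # Lowest rung level first; among those, the worst (largest) rank.
        return min(paused, key=lambda t: (t[1], -t[2]))[0]
    raise ValueError(f"Unknown baseline: {baseline}")


paused = [("a", 3, 0), ("b", 1, 0), ("c", 1, 2), ("d", 9, 1)]
by_level_choice = select_baseline(paused, "by_level")
random_choice = select_baseline(paused, "random", rng=random.Random(0))
print(by_level_choice, random_choice)
```

Here "by_level" picks "c", the worst-ranked trial on the lowest rung, since a checkpoint at a low rung is the cheapest to recreate.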