syne_tune.optimizer.schedulers.hyperband_promotion module

class syne_tune.optimizer.schedulers.hyperband_promotion.PromotionRungEntry(trial_id, metric_val, was_promoted=False)[source]

Bases: RungEntry

Appends a flag was_promoted to the superclass. This is True iff the trial has been promoted from this rung; otherwise, the trial is paused at this rung.

class syne_tune.optimizer.schedulers.hyperband_promotion.PromotionRungSystem(rung_levels, promote_quantiles, metric, mode, resource_attr, max_t)[source]

Bases: RungSystem

Implements the promotion logic for an asynchronous variant of Hyperband, known as ASHA:

Li et al.
A System for Massively Parallel Hyperparameter Tuning
https://arxiv.org/abs/1810.05934

In ASHA, configs sit paused at milestones (rung levels) until they get promoted, which means that a free task picks up their evaluation and runs it until the next milestone.

The rule for deciding whether a paused trial is promoted (or remains paused) is the same as in StoppingRungSystem, except that "continues" becomes "gets promoted". If several paused trials in a rung can be promoted, the one with the best metric value is chosen.
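
A minimal sketch of this promotion rule, assuming a rung is a list of (trial_id, metric_val, was_promoted) tuples as in PromotionRungEntry; the helper name promotable_trial and the use of a single promote_quantile per rung are illustrative, not part of the Syne Tune API:

    import numpy as np

    def promotable_trial(rung, promote_quantile, mode="min"):
        # `rung` is a list of (trial_id, metric_val, was_promoted) tuples.
        # A paused trial can be promoted if its metric lies in the top
        # `promote_quantile` fraction of all values recorded at this rung.
        metrics = np.array([m for _, m, _ in rung])
        if metrics.size == 0:
            return None
        q = promote_quantile if mode == "min" else 1.0 - promote_quantile
        cutoff = np.quantile(metrics, q)
        candidates = [
            (tid, m) for tid, m, promoted in rung
            if not promoted and (m <= cutoff if mode == "min" else m >= cutoff)
        ]
        if not candidates:
            return None
        # Among promotable trials, pick the one with the best metric value
        sign = 1.0 if mode == "min" else -1.0
        return min(candidates, key=lambda tm: sign * tm[1])[0]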

Note: Say that an evaluation is resumed from level resume_from. If the trial evaluation function does not implement pause & resume, it needs to start training from scratch, in which case metrics are reported for every epoch, including those at levels <= resume_from. At least for some modes of fitting the searcher model to data, this would lead to duplicate target values for the same extended config \((x, r)\), which we want to avoid. The solution is to maintain resume_from in the data for the terminator (see _running). Given this, on_task_report() can signal that the current metric data should not be used for the searcher model (ignore_data = True), namely as long as the evaluation has not yet gone beyond level resume_from.
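
A one-line sketch of the ignore_data rule described in this note (the helper name is hypothetical):

    def ignore_data_for_searcher(resource: int, resume_from: int) -> bool:
        # Reports at levels <= resume_from repeat data the searcher model
        # has already seen, because the trial restarted from scratch
        return resource <= resume_from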

on_task_schedule(new_trial_id)[source]

Used to implement _promote_trial(). Searches through rungs to find a trial which can be promoted. If one is found, we return the trial_id and other information (current milestone, milestone to be promoted to). We also mark the trial as promoted at the rung level where it currently sits.

Return type:

Dict[str, Any]
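
Illustrative shape of the returned dict when a promotable trial is found; the key names follow the description above and are an assumption, not guaranteed by the API:

    info = {
        "trial_id": "3",   # trial chosen for promotion
        "resume_from": 3,  # rung level the trial is currently paused at
        "milestone": 9,    # next rung level the trial is promoted to
    }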

on_task_add(trial_id, skip_rungs, **kwargs)[source]

Called when a new task is started. Depending on kwargs["new_config"], this either starts a new evaluation (True) or promotes an existing config to its next milestone (False). In the latter case, kwargs contains additional information about the promotion (in “milestone”, “resume_from”).

Parameters:
  • trial_id (str) – ID of trial to be started

  • skip_rungs (int) – The smallest skip_rungs rung levels are not considered milestones for this task

  • kwargs – Additional arguments
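
A hedged usage sketch, with a PromotionRungSystem constructed from the class signature shown above (all argument values are made up for illustration):

    from syne_tune.optimizer.schedulers.hyperband_promotion import (
        PromotionRungSystem,
    )

    rung_system = PromotionRungSystem(
        rung_levels=[1, 3, 9],
        promote_quantiles=[1 / 3, 1 / 3, 1 / 3],
        metric="validation_loss",
        mode="min",
        resource_attr="epoch",
        max_t=27,
    )
    # New configuration: evaluation starts from scratch
    rung_system.on_task_add("0", skip_rungs=0, new_config=True)
    # Promotion: resume trial "0" at level 3, run it up to level 9
    rung_system.on_task_add(
        "0", skip_rungs=0, new_config=False, milestone=9, resume_from=3
    )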

on_task_report(trial_id, result, skip_rungs)[source]

Decides whether the task may continue (task_continues=True) or should be paused (task_continues=False). The flag milestone_reached indicates whether resource coincides with a milestone. For this scheduler, we have that

task_continues == not milestone_reached,

since a trial is always paused at a milestone.

ignore_data is True if a result is received from a resumed trial at a level <= resume_from. This happens if checkpointing is not implemented (or not used), since resumed trials then start training from scratch. These metric values should in general be ignored.

Parameters:
  • trial_id (str) – ID of trial which reported results

  • result (Dict[str, Any]) – Reported metrics

  • skip_rungs (int) – The smallest skip_rungs rung levels are not considered milestones for this task

Return type:

Dict[str, Any]

Returns:

dict(task_continues, milestone_reached, next_milestone, ignore_data)
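
For example, a report arriving exactly at rung level 3, with checkpointing in use, might yield (values illustrative):

    decision = {
        "task_continues": False,    # trials always pause at a milestone
        "milestone_reached": True,  # resource coincided with rung level 3
        "next_milestone": 9,        # smallest rung level above the current one
        "ignore_data": False,       # checkpointing in use, so data is fresh
    }
    assert decision["task_continues"] == (not decision["milestone_reached"])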

on_task_remove(trial_id)[source]

Called when a task is removed.

Parameters:

trial_id (str) – ID of trial which is to be removed

static does_pause_resume()[source]
Return type:

bool

Returns:

Is this variant doing pause and resume scheduling, in the sense that trials can be paused and resumed later?

support_early_checkpoint_removal()[source]
Return type:

bool

Returns:

Do we support early checkpoint removal via paused_trials()?

paused_trials(resource=None)[source]

Only for pause and resume schedulers (does_pause_resume() returns True), where trials can be paused at certain rung levels only. If resource is not given, returns list of all paused trials (trial_id, rank, metric_val, level), where level is the rung level, and rank is the rank of the trial in the rung (0 for the best metric value). If resource is given, only the paused trials in the rung of this level are returned. If resource is not a rung level, the returned list is empty.

Parameters:

resource (Optional[int]) – If given, paused trials of only this rung level are returned. Otherwise, all paused trials are returned

Return type:

List[Tuple[str, int, float, int]]

Returns:

See above
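
A hedged usage sketch, assuming rung_system is a PromotionRungSystem instance as constructed in the earlier sketch (the returned tuples are made up):

    # All paused trials across all rungs, as (trial_id, rank, metric_val, level)
    for trial_id, rank, metric_val, level in rung_system.paused_trials():
        print(f"trial {trial_id}: rank {rank} in rung {level}, metric {metric_val}")
    # Paused trials at rung level 3 only; empty if 3 is not a rung level
    paused_at_3 = rung_system.paused_trials(resource=3)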