syne_tune.optimizer.schedulers.hyperband_promotion module
- class syne_tune.optimizer.schedulers.hyperband_promotion.PromotionRungEntry(trial_id, metric_val, was_promoted=False)[source]
Bases: RungEntry
Appends was_promoted to the superclass. This is True iff the trial has been promoted from this rung. Otherwise, the trial is paused at this rung.
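A minimal usage sketch, assuming only the constructor signature shown above (the trial ID and metric value are made up):

```python
from syne_tune.optimizer.schedulers.hyperband_promotion import PromotionRungEntry

entry = PromotionRungEntry(trial_id="17", metric_val=0.21)
assert not entry.was_promoted  # trial is currently paused at this rung
entry.was_promoted = True      # flipped once the trial is promoted from the rung
```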
- class syne_tune.optimizer.schedulers.hyperband_promotion.PromotionRungSystem(rung_levels, promote_quantiles, metric, mode, resource_attr, max_t)[source]
Bases: RungSystem
Implements the promotion logic for an asynchronous variant of Hyperband, known as ASHA. In ASHA, configs sit paused at milestones (rung levels) until they get promoted, which means that a free task picks up their evaluation until the next milestone.
The rule to decide whether a paused trial is promoted (or remains paused) is the same as in StoppingRungSystem, except that “continues” becomes “gets promoted”. If several paused trials in a rung can be promoted, the one with the best metric value is chosen.
Note: Say that an evaluation is resumed from level resume_from. If the trial evaluation function does not implement pause & resume, it needs to start training from scratch, in which case metrics are reported for every epoch, also those < resume_from. At least for some modes of fitting the searcher model to data, this would lead to duplicate target values for the same extended config \((x, r)\), which we want to avoid. The solution is to maintain resume_from in the data for the terminator (see _running). Given this, we can report in on_task_report() that the current metric data should not be used for the searcher model (ignore_data = True), as long as the evaluation has not yet gone beyond level resume_from.
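A construction sketch (the argument values, including the per-rung promote_quantiles, are assumptions based on the signature above; with quantile 1/3 at each rung this mimics an eta=3 ASHA bracket). The later method examples reuse this rung_system instance:

```python
from syne_tune.optimizer.schedulers.hyperband_promotion import PromotionRungSystem

# Rung levels 1, 3, 9 with promotion quantile 1/3 at each rung: a paused
# trial is a promotion candidate if its metric ranks in the best third
# of its rung (values below are illustrative, not prescriptive).
rung_system = PromotionRungSystem(
    rung_levels=[1, 3, 9],
    promote_quantiles=[1 / 3, 1 / 3, 1 / 3],
    metric="validation_error",
    mode="min",
    resource_attr="epoch",
    max_t=27,
)
```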
- on_task_schedule(new_trial_id)[source]
Used to implement _promote_trial(). Searches through rungs to find a trial which can be promoted. If one is found, we return the trial_id and other info (current milestone, milestone to be promoted to), as in the sketch below. We also mark the trial as being promoted at the rung level it currently sits at.
- Return type: Dict[str, Any]
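Reusing the rung_system from the earlier sketch; the keys of the returned dictionary are inferred from the description above and are an assumption, not verbatim library output:

```python
decision = rung_system.on_task_schedule(new_trial_id="23")
# A possible promotion decision (hypothetical values):
# {
#     "trial_id": "17",   # paused trial chosen for promotion
#     "resume_from": 3,   # rung level it is currently paused at
#     "milestone": 9,     # rung level it is promoted to
# }
```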
- on_task_add(trial_id, skip_rungs, **kwargs)[source]
Called when a new task is started. Depending on kwargs["new_config"], this either starts a new evaluation (True) or promotes an existing config to the next milestone (False). In the latter case, kwargs contains additional information about the promotion (in “milestone”, “resume_from”); see the sketch below.
- Parameters:
trial_id (str) – ID of trial to be started
skip_rungs (int) – This number of smallest rung levels are not considered milestones for this task
kwargs – Additional arguments
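A sketch of the two call patterns, reusing the rung_system from above (these calls are normally issued by the scheduler, not by user code; IDs and levels are made up):

```python
# Case 1: a brand-new configuration starts evaluating from scratch
rung_system.on_task_add("23", skip_rungs=0, new_config=True)

# Case 2: promotion of a paused trial; it resumes from rung level 3
# and runs until the next milestone 9
rung_system.on_task_add("17", skip_rungs=0, new_config=False,
                        milestone=9, resume_from=3)
```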
- on_task_report(trial_id, result, skip_rungs)[source]
Decision on whether the task may continue (task_continues=True) or should be paused (task_continues=False). milestone_reached flags whether the resource coincides with a milestone. For this scheduler, task_continues == not milestone_reached, since a trial is always paused at a milestone.
ignore_data is True if a result is received from a resumed trial at a level <= resume_from. This happens if checkpointing is not implemented (or not used), because resumed trials are then started from scratch. These metric values should in general be ignored. See the example below for the shape of the returned dictionary.
- Parameters:
trial_id (str) – ID of trial which reported results
result (Dict[str, Any]) – Reported metrics
skip_rungs (int) – This number of smallest rung levels are not considered milestones for this task
- Return type: Dict[str, Any]
- Returns: dict(task_continues, milestone_reached, next_milestone, ignore_data)
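For example, reusing the rung_system from above with rung levels [1, 3, 9] and a report that lands exactly on milestone 3 (metric values are made up; the outcome shown is a plausible instance of the return fields listed above):

```python
result = {"epoch": 3, "validation_error": 0.21}
decision = rung_system.on_task_report("17", result, skip_rungs=0)
# A plausible outcome when epoch 3 coincides with milestone 3:
# {
#     "task_continues": False,   # trials are always paused at a milestone
#     "milestone_reached": True,
#     "next_milestone": 9,
#     "ignore_data": False,      # epoch 3 is beyond any resume_from level
# }
```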
- on_task_remove(trial_id)[source]
Called when a task is removed.
- Parameters:
trial_id (str) – ID of trial which is to be removed
- static does_pause_resume()[source]
- Return type: bool
- Returns: Is this variant doing pause and resume scheduling, in the sense that trials can be paused and resumed later?
- support_early_checkpoint_removal()[source]
- Return type: bool
- Returns: Do we support early checkpoint removal via paused_trials()?
- paused_trials(resource=None)[source]
Only for pause and resume schedulers (does_pause_resume() returns True), where trials can be paused at certain rung levels only. If resource is not given, returns the list of all paused trials (trial_id, rank, metric_val, level), where level is the rung level and rank is the rank of the trial in the rung (0 for the best metric value). If resource is given, only the paused trials in the rung of this level are returned. If resource is not a rung level, the returned list is empty. See the usage sketch below.
- Parameters:
resource (Optional[int]) – If given, paused trials of only this rung level are returned. Otherwise, all paused trials are returned
- Return type: List[Tuple[str, int, float, int]]
- Returns: See above
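A usage sketch, reusing the rung_system from above with rung levels [1, 3, 9] (trial IDs and values are made up):

```python
# All paused trials across all rungs; each tuple is
# (trial_id, rank, metric_val, level)
for trial_id, rank, metric_val, level in rung_system.paused_trials():
    print(f"trial {trial_id}: rank {rank} at rung {level}, metric {metric_val}")

# Paused trials at rung level 3 only
paused_at_3 = rung_system.paused_trials(resource=3)

# 4 is not a rung level, so the result is empty
assert rung_system.paused_trials(resource=4) == []
```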