syne_tune.optimizer.schedulers.synchronous package

class syne_tune.optimizer.schedulers.synchronous.SynchronousHyperbandScheduler(config_space, bracket_rungs, **kwargs)[source]

Bases: SynchronousHyperbandCommon, DefaultRemoveCheckpointsSchedulerMixin

Synchronous Hyperband. Compared to HyperbandScheduler, jobs are still scheduled asynchronously, but decision-making is synchronized: trials are only promoted to the next milestone once the rung they are currently paused at is completely occupied.

Our implementation never delays scheduling of a job. If the currently active bracket does not accept jobs, the job is assigned to a later bracket. This means that at any point in time, several brackets can be active, but jobs are preferentially assigned to the first one (the “primary” active bracket). A usage sketch is given after the parameter list below.

Parameters:
  • config_space (Dict[str, Any]) – Configuration space for trial evaluation function

  • bracket_rungs (List[List[Tuple[int, int]]]) – Determines rung level systems for each bracket, see SynchronousHyperbandBracketManager

  • metric (str) – Name of metric to optimize, key in results obtained via on_trial_result()

  • searcher (str, optional) – Searcher for get_config decisions. Passed to searcher_factory() along with search_options and extra information. Supported values: SUPPORTED_SEARCHERS_HYPERBAND. Defaults to “random” (i.e., random search)

  • search_options (Dict[str, Any], optional) – Passed to searcher_factory().

  • mode (str, optional) – Mode to use for the metric given, can be “min” (default) or “max”

  • points_to_evaluate (List[dict], optional) – List of configurations to be evaluated initially (in that order). Each config in the list can be partially specified, or even be an empty dict. For each hyperparameter not specified, the default value is determined using a midpoint heuristic. If None (default), this is mapped to [dict()], a single default config determined by the midpoint heuristic. If [] (empty list), no initial configurations are specified.

  • random_seed (int, optional) – Master random seed. Generators used in the scheduler or searcher are seeded using RandomSeedGenerator. If not given, the master random seed is drawn at random here.

  • max_resource_attr (str, optional) – Key name in config for fixed attribute containing the maximum resource. If given, trials need not be stopped, which can be more efficient.

  • max_resource_level (int, optional) – Largest rung level, corresponds to max_t in FIFOScheduler. Must be positive int larger than grace_period. If this is not given, it is inferred like in FIFOScheduler. In particular, it is not needed if max_resource_attr is given.

  • resource_attr (str, optional) – Name of resource attribute in results obtained via on_trial_result(). The type of resource must be int. Defaults to “epoch”

  • searcher_data (str, optional) –

    Relevant only if a model-based searcher is used. Example: For NN tuning and resource_attr == "epoch", we receive a result for each epoch, but not all epoch values are also rung levels. searcher_data determines which of these results are passed to the searcher. As a rule, the more data the searcher receives, the better its fit, but also the more expensive get_config may become. Choices:

    • ”rungs” (default): Only results at rung levels. Cheapest

    • ”all”: All results. Most expensive

    Note: For a Gaussian additive learning curve surrogate model, this has to be set to “all”.
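
A minimal usage sketch (not part of the API above; metric and attribute names are placeholders). We assume each rung is a (rung_size, resource_level) tuple, as produced by SynchronousHyperbandRungSystem.geometric():

    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers.synchronous import (
        SynchronousHyperbandScheduler,
    )

    config_space = {
        "learning_rate": loguniform(1e-5, 1e-1),
        "batch_size": randint(16, 256),
        "epochs": 27,  # fixed attribute, exposed via max_resource_attr
    }
    # Two brackets; rung sizes shrink as resource levels grow.
    # We assume rungs are (rung_size, resource_level) pairs.
    bracket_rungs = [
        [(9, 1), (3, 3), (1, 9)],  # bracket 0: 9 slots at level 1, ...
        [(3, 3), (1, 9)],          # bracket 1 starts at level 3
    ]
    scheduler = SynchronousHyperbandScheduler(
        config_space,
        bracket_rungs=bracket_rungs,
        metric="validation_loss",  # placeholder metric name
        mode="min",
        resource_attr="epoch",
        max_resource_attr="epochs",
    )

Since max_resource_attr is given, the training script can read the maximum number of epochs from its config (the “epochs” entry here) and terminate on its own, so trials need not be stopped by the scheduler.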

property rung_levels: List[int]
Returns:

Rung levels (positive int; increasing), may or may not include max_resource_level

property num_brackets: int
Returns:

Number of brackets (i.e., rung level systems). If the scheduler does not use brackets, it has to return 1

on_trial_result(trial, result)[source]

Called on each intermediate result reported by a trial.

At this point, the trial scheduler can make a decision by returning one of SchedulerDecision.CONTINUE, SchedulerDecision.PAUSE, or SchedulerDecision.STOP. This will only be called when the trial is currently running.

Parameters:
  • trial (Trial) – Trial for which results are reported

  • result (Dict[str, Any]) – Result dictionary

Return type:

str

Returns:

Decision what to do with the trial

on_trial_error(trial)[source]

If the trial is currently pending, we send a result at its milestone with metric value NaN. Such trials are ranked after all others and will most likely not be promoted.

metric_names()[source]
Return type:

List[str]

Returns:

List of metric names. The first one is the target metric optimized over, unless the scheduler is a genuine multi-objective scheduler (for example, for sampling the Pareto front)

metric_mode()[source]
Return type:

str

Returns:

“min” if target metric is minimized, otherwise “max”. Here, “min” should be the default. For a genuine multi-objective scheduler, a list of modes is returned

trials_checkpoints_can_be_removed()[source]

Supports the general case (see header comment). This method returns IDs of paused trials for which checkpoints can safely be removed. These trials either cannot be resumed anymore, or it is very unlikely that they will be resumed. Each trial ID needs to be returned only once. If a trial gets stopped (by returning SchedulerDecision.STOP in on_trial_result()), its checkpoint is removed anyway, so its ID does not have to be returned here.

Return type:

List[int]

Returns:

IDs of paused trials for which checkpoints can be removed

class syne_tune.optimizer.schedulers.synchronous.SynchronousGeometricHyperbandScheduler(config_space, **kwargs)[source]

Bases: SynchronousHyperbandScheduler

Special case of SynchronousHyperbandScheduler with rung system defined by geometric sequences (see SynchronousHyperbandRungSystem.geometric()). This is the most frequently used case.
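
For illustration (assumed behavior of the geometric rung system): with grace_period=1, reduction_factor=3, and max_resource_level=27, the rung levels form the geometric sequence 1, 3, 9, 27:

    # Sketch of how geometric rung levels are derived (illustration only)
    grace_period, reduction_factor, max_resource_level = 1, 3, 27
    levels = []
    r = grace_period
    while r < max_resource_level:
        levels.append(r)
        r = round(r * reduction_factor)
    levels.append(max_resource_level)
    print(levels)  # [1, 3, 9, 27]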

Parameters:
  • config_space (Dict[str, Any]) – Configuration space for trial evaluation function

  • metric (str) – Name of metric to optimize, key in results obtained via on_trial_result()

  • grace_period (int, optional) – Smallest (resource) rung level. Must be positive int. Defaults to 1

  • reduction_factor (float, optional) – Approximate ratio of successive rung levels. Must be >= 2. Defaults to 3

  • brackets (int, optional) – Number of brackets to be used. The default is to use the maximum number of brackets per iteration. Pass 1 for successive halving. See the illustration after this parameter list.

  • searcher (str, optional) – Selects searcher. Passed to searcher_factory(). Defaults to “random”

  • search_options (Dict[str, Any], optional) – Passed to searcher_factory().

  • mode (str, optional) – Mode to use for the metric given, can be “min” (default) or “max”

  • points_to_evaluate (List[dict], optional) – List of configurations to be evaluated initially (in that order). Each config in the list can be partially specified, or even be an empty dict. For each hyperparameter not specified, the default value is determined using a midpoint heuristic. If None (default), this is mapped to [dict()], a single default config determined by the midpoint heuristic. If [] (empty list), no initial configurations are specified.

  • random_seed (int, optional) – Master random seed. Generators used in the scheduler or searcher are seeded using RandomSeedGenerator. If not given, the master random seed is drawn at random here.

  • max_resource_level (int, optional) – Largest rung level, corresponds to max_t in FIFOScheduler. Must be positive int larger than grace_period. If this is not given, it is inferred like in FIFOScheduler. In particular, it is not needed if max_resource_attr is given.

  • max_resource_attr (str, optional) – Key name in config for fixed attribute containing the maximum resource. If given, trials need not be stopped, which can be more efficient.

  • resource_attr (str, optional) – Name of resource attribute in results obtained via on_trial_result(). The type of resource must be int. Defaults to “epoch”

  • searcher_data (str, optional) –

    Relevant only if a model-based searcher is used. Example: For NN tuning and resource_attr == "epoch", we receive a result for each epoch, but not all epoch values are also rung levels. searcher_data determines which of these results are passed to the searcher. As a rule, the more data the searcher receives, the better its fit, but also the more expensive get_config may become. Choices:

    • ”rungs” (default): Only results at rung levels. Cheapest

    • ”all”: All results. Most expensive

    Note: For a Gaussian additive learning curve surrogate model, this has to be set to “all”.
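
For illustration (assumed behavior, consistent with standard Hyperband): with the rung levels 1, 3, 9, 27 from above and brackets=3, bracket i skips the first i rung levels, so its trials start at a higher resource:

    levels = [1, 3, 9, 27]
    for i in range(3):
        print(f"bracket {i}: rung levels {levels[i:]}")
    # bracket 0: rung levels [1, 3, 9, 27]
    # bracket 1: rung levels [3, 9, 27]
    # bracket 2: rung levels [9, 27]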

class syne_tune.optimizer.schedulers.synchronous.DifferentialEvolutionHyperbandScheduler(config_space, rungs_first_bracket, num_brackets_per_iteration=None, **kwargs)[source]

Bases: SynchronousHyperbandCommon

Differential Evolution Hyperband, as proposed in

DEHB: Evolutionary Hyperband for Scalable, Robust and Efficient Hyperparameter Optimization
Noor Awad, Neeratyoy Mallik, Frank Hutter
IJCAI 30 (2021), pages 2147-2153

We implement DEHB as a variant of synchronous Hyperband, which may differ slightly from the implementation of the authors. Main differences to synchronous Hyperband:

  • In DEHB, trials are not paused and potentially promoted (except in the very first bracket). Therefore, checkpointing is not used (except in the very first bracket, if support_pause_resume is True)

  • Only the initial configurations are drawn at random (or drawn from the searcher). Whenever possible, new configurations (in their internal encoding) are derived from earlier ones by way of differential evolution
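
A sketch (not Syne Tune’s internal code) of the differential evolution operations referenced above, acting on configurations in their internal encoding in [0, 1]^d; mutation_factor and crossover_probability correspond to the parameters below:

    import numpy as np

    def de_child(population, parent, mutation_factor=0.5,
                 crossover_probability=0.5, rng=None):
        rng = rng or np.random.RandomState()
        # rand/1 mutation: combine three distinct population members
        i1, i2, i3 = rng.choice(len(population), size=3, replace=False)
        x1, x2, x3 = population[i1], population[i2], population[i3]
        mutant = np.clip(x1 + mutation_factor * (x2 - x3), 0.0, 1.0)
        # binomial crossover: take mutant entries with probability p,
        # otherwise keep the parent's entries
        mask = rng.rand(len(parent)) < crossover_probability
        return np.where(mask, mutant, parent)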

Parameters:
  • config_space (Dict[str, Any]) – Configuration space for trial evaluation function

  • rungs_first_bracket (List[Tuple[int, int]]) – Determines rung level systems for each bracket, see DifferentialEvolutionHyperbandBracketManager

  • num_brackets_per_iteration (Optional[int]) – Number of brackets per iteration. The algorithm cycles through these brackets in one iteration. If not given, the maximum number is used (i.e., len(rungs_first_bracket))

  • metric (str) – Name of metric to optimize, key in results obtained via on_trial_result()

  • searcher (str, optional) – Searcher for get_config decisions. Passed to searcher_factory() along with search_options and extra information. Supported values: SUPPORTED_SEARCHERS_HYPERBAND. If searcher == "random_encoded" (default), the encoded configs are sampled directly, each entry independently from U([0, 1]). This distribution has higher entropy than for “random” if there are discrete hyperparameters in config_space. Note that points_to_evaluate is still used in this case.

  • search_options (Dict[str, Any], optional) – Passed to searcher_factory(). Note: If search_options["allow_duplicates"] == True, then suggest() may return a configuration more than once

  • mode (str, optional) – Mode to use for the metric given, can be “min” (default) or “max”

  • points_to_evaluate (List[dict], optional) – List of configurations to be evaluated initially (in that order). Each config in the list can be partially specified, or even be an empty dict. For each hyperparameter not specified, the default value is determined using a midpoint heuristic. If None (default), this is mapped to [dict()], a single default config determined by the midpoint heuristic. If [] (empty list), no initial configurations are specified.

  • random_seed (int, optional) – Master random seed. Generators used in the scheduler or searcher are seeded using RandomSeedGenerator. If not given, the master random seed is drawn at random here.

  • max_resource_attr (str, optional) – Key name in config for fixed attribute containing the maximum resource. If given, trials need not be stopped, which can be more efficient.

  • max_resource_level (int, optional) – Largest rung level, corresponds to max_t in FIFOScheduler. Must be positive int larger than grace_period. If this is not given, it is inferred like in FIFOScheduler. In particular, it is not needed if max_resource_attr is given.

  • resource_attr (str, optional) – Name of resource attribute in results obtained via on_trial_result(). The type of resource must be int. Defaults to “epoch”

  • mutation_factor (float, optional) – In \((0, 1]\). Factor \(F\) used in the rand/1 mutation operation of DE. Defaults to 0.5

  • crossover_probability (float, optional) – In \((0, 1)\). Probability \(p\) used in crossover operation (child entries are chosen with probability \(p\)). Defaults to 0.5

  • support_pause_resume (bool, optional) – If True, _suggest() supports pause and resume in the first bracket (this is the default). If the objective supports checkpointing, this is made use of. Defaults to True. Note: The resumed trial still gets assigned a new trial_id, but it starts from the earlier checkpoint.

  • searcher_data (str, optional) –

    Relevant only if a model-based searcher is used. Example: For NN tuning and resource_attr == "epoch", we receive a result for each epoch, but not all epoch values are also rung levels. searcher_data determines which of these results are passed to the searcher. As a rule, the more data the searcher receives, the better its fit, but also the more expensive get_config may become. Choices:

    • ”rungs” (default): Only results at rung levels. Cheapest

    • ”all”: All results. Most expensive

    Note: For a Gaussian additive learning curve surrogate model, this has to be set to “all”.

MAX_RETRIES = 50
property rung_levels: List[int]
Returns:

Rung levels (positive int; increasing), may or may not include max_resource_level

property num_brackets: int
Returns:

Number of brackets (i.e., rung level systems). If the scheduler does not use brackets, it has to return 1

on_trial_result(trial, result)[source]

Called on each intermediate result reported by a trial.

At this point, the trial scheduler can make a decision by returning one of SchedulerDecision.CONTINUE, SchedulerDecision.PAUSE, or SchedulerDecision.STOP. This will only be called when the trial is currently running.

Parameters:
  • trial (Trial) – Trial for which results are reported

  • result (Dict[str, Any]) – Result dictionary

Return type:

str

Returns:

Decision what to do with the trial

on_trial_error(trial)[source]

If the trial is currently pending, we send a result at its milestone with metric value NaN. Such trials are ranked after all others and will most likely not be promoted.

metric_names()[source]
Return type:

List[str]

Returns:

List of metric names. The first one is the target metric optimized over, unless the scheduler is a genuine multi-objective scheduler (for example, for sampling the Pareto front)

metric_mode()[source]
Return type:

str

Returns:

“min” if target metric is minimized, otherwise “max”. Here, “min” should be the default. For a genuine multi-objective scheduler, a list of modes is returned

class syne_tune.optimizer.schedulers.synchronous.GeometricDifferentialEvolutionHyperbandScheduler(config_space, **kwargs)[source]

Bases: DifferentialEvolutionHyperbandScheduler

Special case of DifferentialEvolutionHyperbandScheduler with rung system defined by geometric sequences. This is the most frequently used case. A usage sketch is given after the parameter list below.

Parameters:
  • config_space (Dict[str, Any]) – Configuration space for trial evaluation function

  • grace_period (int, optional) – Smallest (resource) rung level. Must be positive int. Defaults to 1

  • reduction_factor (float, optional) – Approximate ratio of successive rung levels. Must be >= 2. Defaults to 3

  • brackets (int, optional) – Number of brackets to be used. The default is to use the maximum number of brackets per iteration. Pass 1 for successive halving.

  • metric (str) – Name of metric to optimize, key in results obtained via on_trial_result()

  • searcher (str, optional) – Selects searcher. Passed to searcher_factory(). If searcher == "random_encoded" (default), the encoded configs are sampled directly, each entry independently from U([0, 1]). This distribution has higher entropy than for “random” if there are discrete hyperparameters in config_space. Note that points_to_evaluate is still used in this case.

  • search_options (Dict[str, Any], optional) – Passed to searcher_factory().

  • mode (str, optional) – Mode to use for the metric given, can be “min” (default) or “max”

  • points_to_evaluate (List[dict], optional) – List of configurations to be evaluated initially (in that order). Each config in the list can be partially specified, or even be an empty dict. For each hyperparameter not specified, the default value is determined using a midpoint heuristic. If None (default), this is mapped to [dict()], a single default config determined by the midpoint heuristic. If [] (empty list), no initial configurations are specified.

  • random_seed (int, optional) – Master random seed. Generators used in the scheduler or searcher are seeded using RandomSeedGenerator. If not given, the master random seed is drawn at random here.

  • max_resource_level (int, optional) – Largest rung level, corresponds to max_t in FIFOScheduler. Must be positive int larger than grace_period. If this is not given, it is inferred like in FIFOScheduler. In particular, it is not needed if max_resource_attr is given.

  • max_resource_attr (str, optional) – Key name in config for fixed attribute containing the maximum resource. If given, trials need not be stopped, which can be more efficient.

  • resource_attr (str, optional) – Name of resource attribute in results obtained via on_trial_result(). The type of resource must be int. Defaults to “epoch”

  • mutation_factor (float, optional) – In \((0, 1]\). Factor \(F\) used in the rand/1 mutation operation of DE. Defaults to 0.5

  • crossover_probability (float, optional) – In \((0, 1)\). Probability \(p\) used in crossover operation (child entries are chosen with probability \(p\)). Defaults to 0.5

  • support_pause_resume (bool, optional) – If True, _suggest() supports pause and resume in the first bracket (this is the default). If the objective supports checkpointing, this is made use of. Defaults to True. Note: The resumed trial still gets assigned a new trial_id, but it starts from the earlier checkpoint.
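
A minimal end-to-end sketch (entry point, metric, and attribute names are placeholders): running this scheduler with the local backend. The training script is expected to report “validation_loss” together with the “epoch” resource via syne_tune.Reporter:

    from syne_tune import StoppingCriterion, Tuner
    from syne_tune.backend import LocalBackend
    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers.synchronous import (
        GeometricDifferentialEvolutionHyperbandScheduler,
    )

    config_space = {
        "learning_rate": loguniform(1e-5, 1e-1),
        "batch_size": randint(16, 256),
        "epochs": 27,
    }
    scheduler = GeometricDifferentialEvolutionHyperbandScheduler(
        config_space,
        metric="validation_loss",
        mode="min",
        resource_attr="epoch",
        max_resource_attr="epochs",
        grace_period=1,
        reduction_factor=3,
    )
    tuner = Tuner(
        trial_backend=LocalBackend(entry_point="train_script.py"),
        scheduler=scheduler,
        stop_criterion=StoppingCriterion(max_wallclock_time=3600),
        n_workers=4,
    )
    tuner.run()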

Submodules