syne_tune.optimizer.schedulers.searchers.gp_multifidelity_searcher module

class syne_tune.optimizer.schedulers.searchers.gp_multifidelity_searcher.GPMultiFidelitySearcher(config_space, metric, points_to_evaluate=None, **kwargs)[source]

Bases: GPFIFOSearcher

Gaussian process Bayesian optimization for asynchronous Hyperband scheduler.

This searcher must be used with a scheduler of type MultiFidelitySchedulerMixin. It provides a novel combination of Bayesian optimization, based on a Gaussian process surrogate model, with Hyperband scheduling. In particular, observations across resource levels are modelled jointly.

It is not recommended to create GPMultiFidelitySearcher objects directly. Rather, create HyperbandScheduler objects with searcher="bayesopt" and pass the arguments listed here via search_options. This ensures that the appropriate functions from syne_tune.optimizer.schedulers.searchers.gp_searcher_factory are used to create components in a consistent way.
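
A minimal sketch of this recommended usage (the configuration space, metric name, and epoch budget below are placeholders, not prescribed by this module):

    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers import HyperbandScheduler

    # Placeholder configuration space for a hypothetical training script
    config_space = {
        "learning_rate": loguniform(1e-5, 1e-1),
        "batch_size": randint(16, 256),
        "epochs": 27,  # maximum number of epochs, read by the training script
    }

    scheduler = HyperbandScheduler(
        config_space,
        searcher="bayesopt",  # creates a GPMultiFidelitySearcher internally
        search_options={"model": "gp_multitask"},  # arguments documented below
        metric="validation_loss",
        mode="min",
        resource_attr="epoch",
        max_t=27,
    )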

Most of the comments for GPFIFOSearcher apply here as well. In multi-fidelity HPO, we optimize a function \(f(\mathbf{x}, r)\), where \(\mathbf{x}\) is the configuration and \(r\) the resource (or time) attribute. The latter must be a positive integer. In most applications, resource_attr == "epoch", and the resource is the number of epochs already trained.
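
For illustration, a sketch of the reporting side under these assumptions: the training script reports the metric once per epoch, so that observations at resource levels \(r = 1, 2, \dots\) become available to the searcher (train_one_epoch, max_epochs, and the metric name are placeholders):

    from syne_tune import Reporter

    max_epochs = 27  # placeholder epoch budget
    report = Reporter()
    for epoch in range(1, max_epochs + 1):
        loss = train_one_epoch()  # placeholder for the actual training step
        # resource_attr == "epoch": the resource level is the number of epochs trained so far
        report(epoch=epoch, validation_loss=loss)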

If model == "gp_multitask" (default), we model the function \(f(\mathbf{x}, r)\) jointly over all resource levels \(r\) at which it is observed (but see searcher_data in HyperbandScheduler). The kernel and mean function of our surrogate model are over \((\mathbf{x}, r)\). The surrogate model is selected by gp_resource_kernel. More details about the supported kernels are given in:

Tiao, Klein, Lienart, Archambeau, Seeger (2020)
Model-based Asynchronous Hyperparameter and Neural Architecture Search

The acquisition function (EI), which is optimized in get_config(), is obtained by fixing the resource level \(r\) to a value that is determined depending on the current state. If resource_acq == "bohb", \(r\) is the largest value <= max_t at which we have observed at least \(\mathrm{dimension}(\mathbf{x})\) metric values. If resource_acq == "first", \(r\) is the first milestone which config \(\mathbf{x}\) would reach when started.

Additional arguments on top of parent class GPFIFOSearcher.

Parameters:
  • model (str, optional) –

    Selects the surrogate model (learning curve model) to be used; see the sketch after this parameter list. Choices are:

    • "gp_multitask" (default): GP multi-task surrogate model

    • "gp_independent": Independent GPs for each rung level, sharing an ARD kernel

    • "gp_issm": Gaussian-additive model of ISSM type

    • "gp_expdecay": Gaussian-additive model of exponential decay type (as in Freeze-Thaw Bayesian Optimization)

  • gp_resource_kernel (str, optional) – Only relevant for model == "gp_multitask". Surrogate model over criterion function \(f(\mathbf{x}, r)\), \(\mathbf{x}\) the config, \(r\) the resource. Note that \(\mathbf{x}\) is encoded to be a vector with entries in [0, 1], and \(r\) is linearly mapped to [0, 1], while the criterion data is normalized to mean 0, variance 1. The reference above provides details on the models supported here. For the exponential decay kernel, the base kernel over \(\mathbf{x}\) is Matern 5/2 ARD. See SUPPORTED_RESOURCE_MODELS for supported choices. Defaults to “exp-decay-sum”

  • resource_acq (str, optional) – Only relevant for model in {"gp_multitask", "gp_independent"}. Determines how the EI acquisition function is used. Values: “bohb”, “first”. Defaults to “bohb”

  • max_size_data_for_model (int, optional) –

    If this is set, we limit the number of observations the surrogate model is fitted on to this value. If there are more observations, they are downsampled; see SubsampleMultiFidelityStateConverter for details. This downsampling is repeated every time the model is fit, which ensures that the most recent data is taken into account. The opt_skip_* predicates are evaluated before the state is downsampled.

    Pass None in order not to apply such a threshold. The default is DEFAULT_MAX_SIZE_DATA_FOR_MODEL.

  • opt_skip_num_max_resource (bool, optional) – Parameter for surrogate model fitting, skip predicate. If True, and the number of observations is above opt_skip_init_length, fitting is done only when there is a new datapoint at r = max_t, and is skipped otherwise. Defaults to False

  • issm_gamma_one (bool, optional) – Only relevant for model == "gp_issm". If True, the gamma parameter of the ISSM is fixed to 1, otherwise it is optimized over. Defaults to False

  • expdecay_normalize_inputs (bool, optional) – Only relevant for model == "gp_expdecay". If True, resource values r are normalized to [0, 1] as input to the exponential decay surrogate model. Defaults to False
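
As referenced above, here is a sketch of how these multi-fidelity options might be passed via search_options. The values are illustrative choices, not defaults beyond those stated in the parameter list, and config_space and the imports are assumed to be as in the earlier sketch:

    search_options = {
        "model": "gp_multitask",                # or "gp_independent", "gp_issm", "gp_expdecay"
        "gp_resource_kernel": "exp-decay-sum",  # only used with model == "gp_multitask"
        "resource_acq": "bohb",                 # or "first"
        "max_size_data_for_model": 500,         # illustrative; pass None to disable subsampling
        "opt_skip_num_max_resource": True,
    }

    scheduler = HyperbandScheduler(
        config_space,
        searcher="bayesopt",
        search_options=search_options,
        metric="validation_loss",
        mode="min",
        resource_attr="epoch",
        max_t=27,
    )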

configure_scheduler(scheduler)[source]

Some searchers need to obtain information from the scheduler they are used with, in order to configure themselves. This method has to be called before the searcher can be used.

Parameters:

scheduler (TrialScheduler) – Scheduler the searcher is used with.

register_pending(trial_id, config=None, milestone=None)[source]

Registers trial as pending. This means the corresponding evaluation task is running. Once it finishes, update is called for this trial.

evaluation_failed(trial_id)[source]

Called by scheduler if an evaluation job for a trial failed.

The searcher should react appropriately (e.g., remove pending evaluations for this trial, not suggest the configuration again).

Parameters:

trial_id (str) – ID of trial whose evaluation failed

cleanup_pending(trial_id)[source]

Removes all pending evaluations for trial trial_id.

This should be called after an evaluation terminates. For various reasons (e.g., termination due to convergence), pending candidates for this evaluation may still be present.

Parameters:

trial_id (str) – ID of trial whose pending evaluations should be cleared
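
For illustration, a sketch of how a scheduler might drive register_pending(), evaluation_failed(), and cleanup_pending() during the lifetime of a trial; searcher is assumed to be a configured GPMultiFidelitySearcher, and the trial ID and milestone are hypothetical values (in practice HyperbandScheduler issues these calls internally):

    # The scheduler asks for a new configuration, starts the evaluation job,
    # and registers the trial as pending up to its first milestone.
    config = searcher.get_config(trial_id="0")
    searcher.register_pending(trial_id="0", config=config, milestone=1)

    # If the evaluation job fails, the scheduler reports this to the searcher,
    # which then avoids suggesting this configuration again.
    searcher.evaluation_failed(trial_id="0")

    # When an evaluation terminates (e.g., it is stopped at a rung level),
    # remaining pending candidates for the trial are cleared.
    searcher.cleanup_pending(trial_id="0")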

remove_case(trial_id, **kwargs)[source]

Remove data case previously appended by _update()

For searchers which maintain the dataset of all cases (reports) passed to update, this method allows one case to be removed from the dataset.

Parameters:
  • trial_id (str) – ID of trial whose data is to be removed

  • kwargs – Extra arguments, optional

clone_from_state(state)[source]

Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.

Parameters:

state – See above

Returns:

New searcher object
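
A sketch of how get_state() and clone_from_state() might be used together to checkpoint and restore a searcher; pickling the state is an assumption about how it could be stored, not something prescribed by this class:

    import pickle

    # `searcher` is an existing GPMultiFidelitySearcher that has already been
    # configured with its scheduler and updated with some observations.
    state = searcher.get_state()    # mutable, pickle-able part of the searcher state
    blob = pickle.dumps(state)      # e.g., write to disk as part of a checkpoint

    # Later, after re-creating `searcher` with the same constructor arguments:
    restored = searcher.clone_from_state(pickle.loads(blob))
    # Use `restored` from here on; the old `searcher` object should not be used anymore.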