syne_tune.optimizer.schedulers.searchers.gp_multifidelity_searcher module
- class syne_tune.optimizer.schedulers.searchers.gp_multifidelity_searcher.GPMultiFidelitySearcher(config_space, metric, points_to_evaluate=None, **kwargs)[source]
Bases: GPFIFOSearcher
Gaussian process Bayesian optimization for asynchronous Hyperband scheduler.
This searcher must be used with a scheduler of type MultiFidelitySchedulerMixin. It provides a novel combination of Bayesian optimization, based on a Gaussian process surrogate model, with Hyperband scheduling. In particular, observations across resource levels are modelled jointly.

It is not recommended to create GPMultiFidelitySearcher objects directly. Rather, create HyperbandScheduler objects with searcher="bayesopt", passing the arguments documented here in search_options. This uses the appropriate functions from :mod:`syne_tune.optimizer.schedulers.searchers.gp_searcher_factory` to create components in a consistent way, as in the sketch below.
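The following is a minimal sketch of this recommended setup. The config space, metric name, and resource attribute are illustrative assumptions, not prescribed by this class:

```python
from syne_tune.config_space import loguniform, randint
from syne_tune.optimizer.schedulers import HyperbandScheduler

# Illustrative config space; adapt to your training script
config_space = {
    "lr": loguniform(1e-4, 1e-1),
    "batch_size": randint(16, 128),
    "epochs": 100,
}

scheduler = HyperbandScheduler(
    config_space,
    searcher="bayesopt",  # creates a GPMultiFidelitySearcher internally
    search_options={"model": "gp_multitask"},  # arguments documented below
    metric="validation_error",
    mode="min",
    resource_attr="epoch",  # resource attribute reported by the training script
    max_t=100,
)
```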
Most of the GPFIFOSearcher comments apply here as well. In multi-fidelity HPO, we optimize a function \(f(\mathbf{x}, r)\), where \(\mathbf{x}\) is the configuration and \(r\) the resource (or time) attribute. The latter must be a positive integer. In most applications, resource_attr == "epoch", and the resource is the number of epochs already trained.

If model == "gp_multitask" (the default), we model the function \(f(\mathbf{x}, r)\) jointly over all resource levels \(r\) at which it is observed (but see searcher_data in HyperbandScheduler). The kernel and mean function of our surrogate model are over \((\mathbf{x}, r)\). The surrogate model is selected by gp_resource_kernel. More details about the supported kernels are given in:

Tiao, Klein, Lienart, Archambeau, Seeger (2020)
Model-based Asynchronous Hyperparameter and Neural Architecture Search

The acquisition function (EI), which is optimized in get_config(), is obtained by fixing the resource level \(r\) to a value determined by the current state. If resource_acq == "bohb", \(r\) is the largest value <= max_t at which we have seen \(\ge \mathrm{dimension}(\mathbf{x})\) metric values. If resource_acq == "first", \(r\) is the first milestone which config \(\mathbf{x}\) would reach when started.

Additional arguments on top of parent class GPFIFOSearcher:
- Parameters:
model (str, optional) – Selects the surrogate model (learning curve model) to be used. Choices are:
  - "gp_multitask" (default): GP multi-task surrogate model
  - "gp_independent": Independent GPs for each rung level, sharing an ARD kernel
  - "gp_issm": Gaussian-additive model of ISSM type
  - "gp_expdecay": Gaussian-additive model of exponential decay type (as in Freeze-Thaw Bayesian Optimization)
gp_resource_kernel (str, optional) – Only relevant for model == "gp_multitask". Surrogate model over the criterion function \(f(\mathbf{x}, r)\), \(\mathbf{x}\) the config, \(r\) the resource. Note that \(\mathbf{x}\) is encoded as a vector with entries in \([0, 1]\), and \(r\) is linearly mapped to \([0, 1]\), while the criterion data is normalized to mean 0, variance 1. The reference above provides details on the models supported here. For the exponential decay kernel, the base kernel over \(\mathbf{x}\) is Matern 5/2 ARD. See SUPPORTED_RESOURCE_MODELS for supported choices. Defaults to "exp-decay-sum"
resource_acq (str, optional) – Only relevant for model in {"gp_multitask", "gp_independent"}. Determines how the EI acquisition function is used. Values: "bohb", "first". Defaults to "bohb"
max_size_data_for_model (int, optional) – If this is set, we limit the number of observations the surrogate model is fitted on to this value. If there are more observations, they are downsampled; see SubsampleMultiFidelityStateConverter for details. This downsampling is repeated every time the model is fit, which ensures that the most recent data is taken into account. The opt_skip_* predicates are evaluated before the state is downsampled. Pass None to not apply such a threshold. Defaults to DEFAULT_MAX_SIZE_DATA_FOR_MODEL
opt_skip_num_max_resource (bool, optional) – Parameter for surrogate model fitting, skip predicate. If True, and the number of observations is above opt_skip_init_length, fitting is done only when there is a new datapoint at r = max_t, and skipped otherwise. Defaults to False
issm_gamma_one (bool, optional) – Only relevant for model == "gp_issm". If True, the gamma parameter of the ISSM is fixed to 1; otherwise it is optimized over. Defaults to False
expdecay_normalize_inputs (bool, optional) – Only relevant for model == "gp_expdecay". If True, resource values \(r\) are normalized to \([0, 1]\) as input to the exponential decay surrogate model. Defaults to False
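As a hedged illustration, these arguments are passed via search_options when creating the scheduler (reusing the config_space from the sketch above; the specific values are assumptions for demonstration, not recommendations):

```python
search_options = {
    "model": "gp_independent",       # one GP per rung level, shared ARD kernel
    "resource_acq": "first",         # fix r to the config's first milestone
    "max_size_data_for_model": 500,  # subsample state beyond 500 observations
}
scheduler = HyperbandScheduler(
    config_space,
    searcher="bayesopt",
    search_options=search_options,
    metric="validation_error",
    mode="min",
    resource_attr="epoch",
    max_t=100,
)
```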
- configure_scheduler(scheduler)[source]
Some searchers need to obtain information from the scheduler they are used with, in order to configure themselves. This method has to be called before the searcher can be used.
- Parameters:
scheduler (TrialScheduler) – Scheduler the searcher is used with.
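For illustration, a minimal sketch of this contract, assuming searcher and scheduler objects already exist (in the recommended setup above, the scheduler performs this call internally):

```python
# Must happen before the searcher is queried for configurations:
searcher.configure_scheduler(scheduler)
config = searcher.get_config(trial_id="0")  # searcher is now usable
```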
- register_pending(trial_id, config=None, milestone=None)[source]
Registers trial as pending. This means the corresponding evaluation task is running. Once it finishes, update() is called for this trial.
- evaluation_failed(trial_id)[source]
Called by scheduler if an evaluation job for a trial failed.
The searcher should react appropriately (e.g., remove pending evaluations for this trial, not suggest the configuration again).
- Parameters:
trial_id (str) – ID of trial whose evaluation failed
- cleanup_pending(trial_id)[source]
Removes all pending evaluations for trial trial_id.

This should be called after an evaluation terminates. For various reasons (e.g., termination due to convergence), pending candidates for this evaluation may still be present.
- Parameters:
trial_id (str) – ID of trial whose pending evaluations should be cleared
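Taken together, a hedged sketch of the pending-evaluation lifecycle these methods support; trial ID, config, and milestone values are illustrative, and in practice HyperbandScheduler drives these calls:

```python
trial_id, config = "0", {"lr": 0.01}

# Trial starts running towards its first milestone:
searcher.register_pending(trial_id, config=config, milestone=1)

# If the evaluation job crashes, let the searcher react (e.g., drop the
# pending evaluation and avoid suggesting this config again):
searcher.evaluation_failed(trial_id)

# After a trial terminates for any reason, clear leftover pending
# evaluations for it:
searcher.cleanup_pending(trial_id)
```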
- remove_case(trial_id, **kwargs)[source]
Removes a data case previously appended by _update().

For searchers which maintain a dataset of all cases (reports) passed to update, this method allows removing one case from the dataset.
- Parameters:
trial_id (str) – ID of trial whose data is to be removed
kwargs – Extra arguments, optional
- clone_from_state(state)[source]
Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.
- Parameters:
state – See above
- Returns:
New searcher object
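As a hedged illustration of this save/restore contract (assuming an existing searcher object, e.g. obtained from a scheduler):

```python
# Snapshot the mutable searcher state (the pickle-able part):
state = searcher.get_state()
# ... e.g., persist `state` as part of a tuner checkpoint ...

# Re-create a searcher from the snapshot; the original `searcher`
# should not be used afterwards:
restored_searcher = searcher.clone_from_state(state)
```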