syne_tune.optimizer.schedulers.searchers package

class syne_tune.optimizer.schedulers.searchers.BaseSearcher(config_space, metric, points_to_evaluate=None, mode='min')[source]

Bases: object

Base class of searchers, which are components of schedulers responsible for implementing get_config().

Note

This is an abstract base class. In order to implement a new searcher, try to start from StochasticAndFilterDuplicatesSearcher or StochasticSearcher, which implement generally useful properties.

Parameters:
  • config_space (Dict[str, Any]) – Configuration space

  • metric (Union[List[str], str]) –

    Name of metric passed to update(). Can be obtained from scheduler in configure_scheduler(). In the case of multi-objective optimization, metric is a list of strings specifying all objectives to be optimized.

  • points_to_evaluate (Optional[List[Dict[str, Any]]]) – List of configurations to be evaluated initially (in that order). Each config in the list can be partially specified, or even be an empty dict. For each hyperparameter not specified, the default value is determined using a midpoint heuristic. If None (default), this is mapped to [dict()], a single default config determined by the midpoint heuristic. If [] (empty list), no initial configurations are specified.

  • mode (Union[List[str], str]) – Should metric be minimized (“min”, default) or maximized (“max”). In the case of multi-objective optimization, mode can be a list defining for each metric if it is minimized or maximized
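
For illustration, here is a minimal sketch of how points_to_evaluate interacts with the midpoint heuristic, using the RandomSearcher subclass documented below (hyperparameter names, domains, and metric name are made up):

    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers.searchers import RandomSearcher

    config_space = {
        "lr": loguniform(1e-4, 1e-1),
        "batch_size": randint(16, 128),
    }
    # The first config fixes "lr" only; "batch_size" is filled in by the
    # midpoint heuristic. The second (empty) dict becomes the all-midpoint
    # default config.
    searcher = RandomSearcher(
        config_space,
        metric="validation_loss",
        points_to_evaluate=[{"lr": 0.01}, {}],
    )
    print(searcher.get_config(trial_id="0"))  # {"lr": 0.01, "batch_size": ...}
    print(searcher.get_config(trial_id="1"))  # all-midpoint default config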

configure_scheduler(scheduler)[source]

Some searchers need to obtain information from the scheduler they are used with, in order to configure themselves. This method has to be called before the searcher can be used.

Parameters:

scheduler (TrialScheduler) – Scheduler the searcher is used with.

get_config(**kwargs)[source]

Suggest a new configuration.

Note: Implementations should query _next_initial_config() first and return initial configs (from points_to_evaluate) before suggesting new ones.

Parameters:

kwargs – Extra information may be passed from scheduler to searcher

Return type:

Optional[Dict[str, Any]]

Returns:

New configuration. The searcher may return None if a new configuration cannot be suggested. In this case, the tuning will stop. This happens if searchers never suggest the same config more than once, and all configs in the (finite) search space are exhausted.

on_trial_result(trial_id, config, result, update)[source]

Inform searcher about result

The scheduler passes every result. If update == True, the searcher should update its surrogate model (if any), otherwise result is an intermediate result not modelled.

The default implementation calls _update() if update == True. It can be overwritten by searchers which also react to intermediate results.

Parameters:
  • trial_id (str) – See on_trial_result()

  • config (Dict[str, Any]) – See on_trial_result()

  • result (Dict[str, Any]) – See on_trial_result()

  • update (bool) – Should surrogate model be updated?

register_pending(trial_id, config=None, milestone=None)[source]

Signals to searcher that evaluation for trial has started, but not yet finished, which allows model-based searchers to register this evaluation as pending.

Parameters:
  • trial_id (str) – ID of trial to be registered as pending evaluation

  • config (Optional[Dict[str, Any]]) – If trial_id has not been registered with the searcher, its configuration must be passed here. Ignored otherwise.

  • milestone (Optional[int]) – For multi-fidelity schedulers, this is the next rung level the evaluation will attend, so that model registers (config, milestone) as pending.

remove_case(trial_id, **kwargs)[source]

Remove data case previously appended by _update()

For searchers which maintain the dataset of all cases (reports) passed to update, this method allows one case to be removed from the dataset.

Parameters:
  • trial_id (str) – ID of trial whose data is to be removed

  • kwargs – Extra arguments, optional

evaluation_failed(trial_id)[source]

Called by scheduler if an evaluation job for a trial failed.

The searcher should react appropriately (e.g., remove pending evaluations for this trial, not suggest the configuration again).

Parameters:

trial_id (str) – ID of trial whose evaluation failed

cleanup_pending(trial_id)[source]

Removes all pending evaluations for trial trial_id.

This should be called after an evaluation terminates. For various reasons (e.g., termination due to convergence), pending candidates for this evaluation may still be present.

Parameters:

trial_id (str) – ID of trial whose pending evaluations should be cleared

dataset_size()[source]
Returns:

Size of dataset a model is fitted to, or 0 if no model is fitted to data

model_parameters()[source]
Returns:

Dictionary with current model (hyper)parameter values if this is supported; otherwise empty

get_state()[source]

Together with clone_from_state(), this is needed in order to store and re-create the mutable state of the searcher. The state returned here must be pickle-able.

Return type:

Dict[str, Any]

Returns:

Pickle-able mutable state of searcher

clone_from_state(state)[source]

Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.

Parameters:

state (Dict[str, Any]) – See above

Returns:

New searcher object
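
A minimal sketch of how these two methods work together to checkpoint and restore a searcher (searcher is assumed to be an existing searcher object, e.g. a RandomSearcher):

    import pickle

    # get_state() returns the pickle-able mutable state only; the immutable
    # (possibly non-pickle-able) part stays with the searcher object itself
    blob = pickle.dumps(searcher.get_state())

    # clone_from_state() combines the immutable part of `searcher` with the
    # stored mutable state and returns a new searcher object
    restored = searcher.clone_from_state(pickle.loads(blob))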

property debug_log: DebugLogPrinter | None

Some subclasses support writing a debug log, using DebugLogPrinter. See RandomSearcher for an example.

Returns:

debug_log object, or None (not supported)

syne_tune.optimizer.schedulers.searchers.impute_points_to_evaluate(points_to_evaluate, config_space)[source]

Transforms points_to_evaluate argument to BaseSearcher. Each config in the list can be partially specified, or even be an empty dict. For each hyperparameter not specified, the default value is determined using a midpoint heuristic. Also, duplicate entries are filtered out. If None (default), this is mapped to [dict()], a single default config determined by the midpoint heuristic. If [] (empty list), no initial configurations are specified.

Parameters:
  • points_to_evaluate (Optional[List[Dict[str, Any]]]) – Argument to BaseSearcher

  • config_space (Dict[str, Any]) – Configuration space

Return type:

List[Dict[str, Any]]

Returns:

List of fully specified initial configs
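
For example (a minimal sketch; the exact default values chosen by the midpoint heuristic depend on the domains):

    from syne_tune.config_space import choice, randint, uniform
    from syne_tune.optimizer.schedulers.searchers import impute_points_to_evaluate

    config_space = {
        "lr": uniform(0.0, 0.1),
        "batch_size": randint(16, 128),
        "optimizer": choice(["adam", "sgd"]),
    }
    # Missing hyperparameters are filled in by the midpoint heuristic,
    # duplicates are removed, and None would map to a single default config
    configs = impute_points_to_evaluate([{"lr": 0.05}, dict()], config_space)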

class syne_tune.optimizer.schedulers.searchers.StochasticSearcher(config_space, metric, points_to_evaluate=None, **kwargs)[source]

Bases: BaseSearcher

Base class of searchers which use random decisions. Creates the random_state member, which must be used for all random draws.

Making proper use of this interface allows us to run experiments with control of random seeds, e.g. for paired comparisons or integration testing.

Additional arguments on top of parent class BaseSearcher:

Parameters:
  • random_seed_generator (RandomSeedGenerator, optional) – If given, random seed is drawn from there

  • random_seed (int, optional) – Used if random_seed_generator is not given.
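
A minimal sketch of what this seed control enables, using the RandomSearcher subclass documented below: two searchers created with the same random_seed suggest identical configurations, which is useful for paired comparisons (config_space and metric name are made up):

    from syne_tune.config_space import loguniform
    from syne_tune.optimizer.schedulers.searchers import RandomSearcher

    config_space = {"lr": loguniform(1e-4, 1e-1)}
    s1 = RandomSearcher(config_space, metric="loss", random_seed=31415)
    s2 = RandomSearcher(config_space, metric="loss", random_seed=31415)
    # Both searchers share the same sequence of random draws
    for trial_id in range(5):
        assert s1.get_config(trial_id=str(trial_id)) == s2.get_config(
            trial_id=str(trial_id)
        )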

get_state()[source]

Together with clone_from_state(), this is needed in order to store and re-create the mutable state of the searcher. The state returned here must be pickle-able.

Return type:

Dict[str, Any]

Returns:

Pickle-able mutable state of searcher

set_random_state(random_state)[source]

class syne_tune.optimizer.schedulers.searchers.StochasticAndFilterDuplicatesSearcher(config_space, metric, points_to_evaluate=None, allow_duplicates=None, restrict_configurations=None, **kwargs)[source]

Bases: StochasticSearcher

Base class for searchers with the following properties:

  • Random decisions use common random_state

  • Maintains an exclusion list to filter out duplicates in get_config() if allow_duplicates == False. If this is True, duplicates are not filtered, and the exclusion list is used only to avoid configurations of failed trials.

  • If restrict_configurations is given, this is a list of configurations, and the searcher only suggests configurations from there. If allow_duplicates == False, entries are popped off this list once suggested. points_to_evaluate is filtered to only contain entries in this set.

In order to make use of these features:

  • Reject configurations in get_config() if should_not_suggest() returns True. If the configuration is drawn at random, use _get_random_config(), which incorporates this filtering

  • Implement _get_config() instead of get_config(). The latter adds the new config to the exclusion list if allow_duplicates == False

Note: Not all searchers which filter duplicates make use of this class.

Additional arguments on top of parent class StochasticSearcher:

Parameters:
  • allow_duplicates (Optional[bool]) – See above. Defaults to False

  • restrict_configurations (Optional[List[Dict[str, Any]]]) – See above, optional
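
A minimal sketch of a subclass making use of these features (the searcher itself is hypothetical; _get_config() and _get_random_config() are the hooks described above):

    from syne_tune.optimizer.schedulers.searchers import (
        StochasticAndFilterDuplicatesSearcher,
    )

    class MyRandomishSearcher(StochasticAndFilterDuplicatesSearcher):
        # Implement _get_config() rather than get_config(); the base class
        # then maintains the exclusion list when allow_duplicates == False
        def _get_config(self, **kwargs):
            # _get_random_config() already filters out configurations for
            # which should_not_suggest() returns True
            return self._get_random_config()

        def _update(self, trial_id, config, result):
            pass  # no surrogate model to update in this sketch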

property allow_duplicates: bool

should_not_suggest(config)[source]
Parameters:

config (Dict[str, Any]) – Configuration

Return type:

bool

Returns:

True if get_config() should not suggest this configuration

get_config(**kwargs)[source]

Suggest a new configuration.

Note: Implementations should query _next_initial_config() first and return initial configs (from points_to_evaluate) before suggesting new ones.

Parameters:

kwargs – Extra information may be passed from scheduler to searcher

Return type:

Optional[Dict[str, Any]]

Returns:

New configuration. The searcher may return None if a new configuration cannot be suggested. In this case, the tuning will stop. This happens if searchers never suggest the same config more than once, and all configs in the (finite) search space are exhausted.

register_pending(trial_id, config=None, milestone=None)[source]

Signals to searcher that evaluation for trial has started, but not yet finished, which allows model-based searchers to register this evaluation as pending.

Parameters:
  • trial_id (str) – ID of trial to be registered as pending evaluation

  • config (Optional[Dict[str, Any]]) – If trial_id has not been registered with the searcher, its configuration must be passed here. Ignored otherwise.

  • milestone (Optional[int]) – For multi-fidelity schedulers, this is the next rung level the evaluation will attend, so that model registers (config, milestone) as pending.

evaluation_failed(trial_id)[source]

Called by scheduler if an evaluation job for a trial failed.

The searcher should react appropriately (e.g., remove pending evaluations for this trial, not suggest the configuration again).

Parameters:

trial_id (str) – ID of trial whose evaluation failed

get_state()[source]

Together with clone_from_state(), this is needed in order to store and re-create the mutable state of the searcher. The state returned here must be pickle-able.

Return type:

Dict[str, Any]

Returns:

Pickle-able mutable state of searcher

syne_tune.optimizer.schedulers.searchers.extract_random_seed(**kwargs)[source]
Return type:

(int, Dict[str, Any])

class syne_tune.optimizer.schedulers.searchers.RandomSearcher(config_space, metric, points_to_evaluate=None, debug_log=False, resource_attr=None, allow_duplicates=None, restrict_configurations=None, **kwargs)[source]

Bases: StochasticAndFilterDuplicatesSearcher

Searcher which randomly samples configurations to try next.

Additional arguments on top of parent class StochasticAndFilterDuplicatesSearcher:

Parameters:
  • debug_log (Union[bool, DebugLogPrinter]) – If True, debug log printing is activated. Logs which configs are chosen when, and which metric values are obtained. Defaults to False

  • resource_attr (Optional[str]) – Optional. Key in result passed to _update() for resource value (for multi-fidelity schedulers)
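
In practice, a RandomSearcher is usually created through a scheduler. A minimal sketch (hyperparameter names and metric are made up):

    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers import FIFOScheduler

    config_space = {
        "lr": loguniform(1e-4, 1e-1),
        "batch_size": randint(16, 128),
        "epochs": 27,  # constant, passed through to the training script
    }
    scheduler = FIFOScheduler(
        config_space,
        searcher="random",
        metric="validation_loss",
        mode="min",
        random_seed=31415,
        # Arguments documented here can be passed via search_options
        search_options={"debug_log": True, "allow_duplicates": False},
    )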

configure_scheduler(scheduler)[source]

Some searchers need to obtain information from the scheduler they are used with, in order to configure themselves. This method has to be called before the searcher can be used.

Parameters:

scheduler (TrialScheduler) – Scheduler the searcher is used with.

clone_from_state(state)[source]

Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.

Parameters:

state (Dict[str, Any]) – See above

Returns:

New searcher object

property debug_log

Some subclasses support writing a debug log, using DebugLogPrinter. See RandomSearcher for an example.

Returns:

debug_log object, or None (not supported)

class syne_tune.optimizer.schedulers.searchers.GridSearcher(config_space, metric, points_to_evaluate=None, num_samples=None, shuffle_config=True, allow_duplicates=False, **kwargs)[source]

Bases: StochasticSearcher

Searcher that samples configurations from an equally spaced grid over config_space.

It first evaluates configurations defined in points_to_evaluate and then continues with the remaining points from the grid.

Additional arguments on top of parent class StochasticSearcher.

Parameters:
  • num_samples (Optional[Dict[str, int]]) – Dictionary, optional. Number of samples per hyperparameter. This is required for hyperparameters of type float, optional for integer hyperparameters, and will be ignored for other types (categorical, scalar). If left unspecified, a default value of DEFAULT_NSAMPLE will be used for float parameters, and the smallest of DEFAULT_NSAMPLE and integer range will be used for integer parameters.

  • shuffle_config (bool) – If True (default), the order of configurations suggested after those specified in points_to_evaluate is shuffled. Otherwise, the order will follow the Cartesian product of the configurations.

  • allow_duplicates (bool) – If True, get_config() may return the same configuration more than once. Defaults to False
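
A minimal sketch of creating a GridSearcher directly (names are made up; typically the searcher is created via a scheduler with searcher="grid"):

    from syne_tune.config_space import choice, uniform
    from syne_tune.optimizer.schedulers.searchers import GridSearcher

    config_space = {
        "lr": uniform(1e-4, 1e-1),
        "optimizer": choice(["adam", "sgd"]),
    }
    searcher = GridSearcher(
        config_space,
        metric="validation_loss",
        num_samples={"lr": 10},  # grid points for the float hyperparameter
        shuffle_config=False,    # keep Cartesian product order
    )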

get_config(**kwargs)[source]

Select the next configuration from the grid.

This is done without replacement, so previously returned configs are not suggested again.

Return type:

Optional[dict]

Returns:

A new configuration that is valid, or None if no new config can be suggested. The returned configuration is a dictionary that maps hyperparameters to its values.

get_state()[source]

Together with clone_from_state(), this is needed in order to store and re-create the mutable state of the searcher. The state returned here must be pickle-able.

Return type:

Dict[str, Any]

Returns:

Pickle-able mutable state of searcher

clone_from_state(state)[source]

Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.

Parameters:

state (Dict[str, Any]) – See above

Returns:

New searcher object

syne_tune.optimizer.schedulers.searchers.searcher_factory(searcher_name, **kwargs)[source]

Factory for searcher objects

This function creates searcher objects from string argument name and additional kwargs. It is typically called in the constructor of a scheduler (see FIFOScheduler), which provides most of the required kwargs.

Parameters:
  • searcher_name (str) – Value of searcher argument to scheduler (see FIFOScheduler)

  • kwargs – Argument to BaseSearcher constructor

Return type:

BaseSearcher

Returns:

New searcher object
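
A hedged sketch of a direct call (normally the scheduler provides most of these arguments itself; depending on the searcher, more kwargs may be required):

    from syne_tune.config_space import loguniform
    from syne_tune.optimizer.schedulers.searchers import searcher_factory

    searcher = searcher_factory(
        "random",
        config_space={"lr": loguniform(1e-4, 1e-1)},
        metric="validation_loss",
        mode="min",
    )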

class syne_tune.optimizer.schedulers.searchers.ModelBasedSearcher(config_space, metric, points_to_evaluate=None, **kwargs)[source]

Bases: StochasticSearcher

Common code for surrogate model based searchers

If num_initial_random_choices > 0, initial configurations are drawn using an internal RandomSearcher object, which is created in _assign_random_searcher(). This internal random searcher shares random_state with the searcher here. This ensures that if ModelBasedSearcher and RandomSearcher objects are created with the same random_seed and points_to_evaluate argument, initial configurations are identical until _get_config_modelbased() kicks in.

Note that this works because random_state is only used in the internal random searcher until _get_config_modelbased() is first called.

on_trial_result(trial_id, config, result, update)[source]

Inform searcher about result

The scheduler passes every result. If update == True, the searcher should update its surrogate model (if any), otherwise result is an intermediate result not modelled.

The default implementation calls _update() if update == True. It can be overwritten by searchers which also react to intermediate results.

Parameters:
  • trial_id (str) – See on_trial_result()

  • config (Dict[str, Any]) – See on_trial_result()

  • result (Dict[str, Any]) – See on_trial_result()

  • update (bool) – Should surrogate model be updated?

get_config(**kwargs)[source]

Runs Bayesian optimization in order to suggest the next config to evaluate.

Return type:

Optional[Dict[str, Any]]

Returns:

Next config to evaluate at

dataset_size()[source]
Returns:

Size of dataset a model is fitted to, or 0 if no model is fitted to data

model_parameters()[source]
Returns:

Dictionary with current model (hyper)parameter values if this is supported; otherwise empty

set_params(param_dict)[source]

get_state()[source]

The mutable state consists of the GP model parameters, the TuningJobState, and the skip_optimization predicate (which can have a mutable state). We assume that skip_optimization can be pickled.

Note that we do not have to store the state of _random_searcher, since this internal searcher shares its random_state with the searcher here.

Return type:

Dict[str, Any]

property debug_log

Some subclasses support writing a debug log, using DebugLogPrinter. See RandomSearcher for an example.

Returns:

debug_log object, or None (not supported)

class syne_tune.optimizer.schedulers.searchers.BayesianOptimizationSearcher(config_space, metric, points_to_evaluate=None, **kwargs)[source]

Bases: ModelBasedSearcher

Common Code for searchers using Bayesian optimization

We implement Bayesian optimization, based on a model factory which parameterizes the state transformer. This implementation works with any type of surrogate model and acquisition function, which are compatible with each other.

The following happens in get_config():

  • For the first num_init_random calls, a config is drawn at random (after points_to_evaluate, which are included in the num_init_random initial ones). Afterwards, Bayesian optimization is used, unless there are no finished evaluations yet (a surrogate model cannot be used with no data at all)

  • For BO, model hyperparameters are refit first. This step can be skipped (see opt_skip_* parameters).

  • Next, the BO decision is made based on BayesianOptimizationAlgorithm. This involves sampling num_init_candidates configs at random, ranking them with a scoring function (initial_scoring), and finally running local optimization starting from the top-scoring config.

configure_scheduler(scheduler)[source]

Some searchers need to obtain information from the scheduler they are used with, in order to configure themselves. This method has to be called before the searcher can be used.

Parameters:

scheduler (TrialScheduler) – Scheduler the searcher is used with.

register_pending(trial_id, config=None, milestone=None)[source]

Registers trial as pending. This means the corresponding evaluation task is running. Once it finishes, update is called for this trial.

get_batch_configs(batch_size, num_init_candidates_for_batch=None, **kwargs)[source]

Asks for a batch of batch_size configurations to be suggested. This is roughly equivalent to calling get_config batch_size times, marking the suggested configs as pending in the state (but the state is not modified here). This means the batch is chosen sequentially, at about the cost of calling get_config batch_size times.

If num_init_candidates_for_batch is given, it is used instead of num_init_candidates for the selection of all but the first config in the batch. In order to speed up batch selection, choose num_init_candidates_for_batch smaller than num_init_candidates.

If fewer than batch_size configs are returned, the search space has been exhausted.

Note: Batch selection does not support debug_log right now: make sure to switch this off when creating scheduler and searcher.

Return type:

List[Dict[str, Union[int, float, str]]]
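
A hedged sketch of batch suggestion (searcher is assumed to be an existing BayesianOptimizationSearcher, e.g. obtained from a scheduler; debug_log must be switched off):

    # Suggest 4 configs to be evaluated in parallel. A smaller
    # num_init_candidates_for_batch speeds up selection of configs 2..4.
    configs = searcher.get_batch_configs(
        batch_size=4,
        num_init_candidates_for_batch=50,
    )
    if len(configs) < 4:
        print("search space exhausted")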

evaluation_failed(trial_id)[source]

Called by scheduler if an evaluation job for a trial failed.

The searcher should react appropriately (e.g., remove pending evaluations for this trial, not suggest the configuration again).

Parameters:

trial_id (str) – ID of trial whose evaluation failed

class syne_tune.optimizer.schedulers.searchers.GPFIFOSearcher(config_space, metric, points_to_evaluate=None, clone_from_state=False, **kwargs)[source]

Bases: BayesianOptimizationSearcher

Gaussian process Bayesian optimization for FIFO scheduler

This searcher must be used with FIFOScheduler. It provides Bayesian optimization, based on a Gaussian process surrogate model.

It is not recommended to create GPFIFOSearcher objects directly, but rather to create FIFOScheduler objects with searcher="bayesopt", passing arguments here via search_options. This will use the appropriate functions from syne_tune.optimizer.schedulers.searchers.gp_searcher_factory to create components in a consistent way.

Most of the implementation is generic in BayesianOptimizationSearcher.

Note: If metric values are to be maximized (mode="max" in scheduler), the searcher uses map_reward to map metric values to internal criterion values, and minimizes the latter. The default choice is to multiply values by -1.

Pending configurations (for which evaluation tasks are currently running) are dealt with by fantasizing (i.e., target values are drawn from the current posterior, and acquisition functions are averaged over this sample, see num_fantasy_samples).

The GP surrogate model uses a Matern 5/2 covariance function with automatic relevance determination (ARD) of input attributes, and a constant mean function. The acquisition function is expected improvement (EI). All hyperparameters of the surrogate model are estimated by empirical Bayes (maximizing the marginal likelihood). In general, this hyperparameter fitting is the most expensive part of a get_config() call.

Note that the full logic of construction based on arguments is given in syne_tune.optimizer.schedulers.searchers.gp_searcher_factory. In particular, see gp_fifo_searcher_defaults() for default values.

Additional arguments on top of parent class StochasticSearcher:

Parameters:
  • clone_from_state (bool) – Internal argument, do not use

  • resource_attr (str, optional) – Name of resource attribute in reports. This is optional here, but required for multi-fidelity searchers. If resource_attr and cost_attr are given, cost values are read from each report and stored in the state. This allows cost models to be fit on more data.

  • cost_attr (str, optional) – Name of cost attribute in data obtained from reporter (e.g., elapsed training time). Needed only by cost-aware searchers. Depending on whether resource_attr is given, cost values are read from each report or only at the end.

  • num_init_random (int, optional) – Number of initial get_config() calls for which randomly sampled configs are returned. Afterwards, the model-based searcher is used. Defaults to DEFAULT_NUM_INITIAL_RANDOM_EVALUATIONS

  • num_init_candidates (int, optional) – Number of initial candidates sampled at random in order to seed the model-based search in get_config. Defaults to DEFAULT_NUM_INITIAL_CANDIDATES

  • num_fantasy_samples (int, optional) – Number of samples drawn for fantasizing (latent target values for pending evaluations), defaults to 20

  • no_fantasizing (bool, optional) – If True, fantasizing is not done and pending evaluations are ignored. This may lead to loss of diversity in decisions. Defaults to False

  • input_warping (bool, optional) – If True, we use a warping transform, so the kernel function becomes \(k(w(x), w(x'))\), where \(w(x)\) is a warping transform parameterized by two non-negative numbers per component, which are learned as hyperparameters. See also Warping. Coordinates which belong to categorical hyperparameters are not warped. Defaults to False.

  • boxcox_transform (bool, optional) – If True, target values are transformed before being fitted with a Gaussian marginal likelihood. This is using the Box-Cox transform with a parameter \(\lambda\), which is learned alongside other parameters of the surrogate model. The transform is \(\log y\) for \(\lambda = 0\), and \(y - 1\) for \(\lambda = 1\). This option requires the targets to be positive. Defaults to False.

  • gp_base_kernel (str, optional) – Selects the covariance (or kernel) function to be used. Supported choices are SUPPORTED_BASE_MODELS. Defaults to “matern52-ard” (Matern 5/2 with automatic relevance determination).

  • acq_function (str, optional) – Selects the acquisition function to be used. Supported choices are SUPPORTED_ACQUISITION_FUNCTIONS. Defaults to “ei” (expected improvement acquisition function).

  • acq_function_kwargs (dict, optional) – Some acquisition functions have additional parameters, they can be passed here. If none are given, default values are used.

  • initial_scoring (str, optional) –

    Scoring function to rank initial candidates (local optimization of EI is started from top scorer):

    • ”thompson_indep”: Independent Thompson sampling; randomized score, which can increase exploration

    • ”acq_func”: score is the same (EI) acquisition function which is used for local optimization afterwards

    Defaults to DEFAULT_INITIAL_SCORING

  • skip_local_optimization (bool, optional) – If True, the local gradient-based optimization of the acquisition function is skipped, and the top-ranked initial candidate (after initial scoring) is returned instead. In this case, initial_scoring="acq_func" makes most sense, otherwise the acquisition function will not be used. Defaults to False

  • opt_nstarts (int, optional) – Parameter for surrogate model fitting. Number of random restarts. Defaults to 2

  • opt_maxiter (int, optional) – Parameter for surrogate model fitting. Maximum number of iterations per restart. Defaults to 50

  • opt_warmstart (bool, optional) – Parameter for surrogate model fitting. If True, each fitting is started from the previous optimum. Not recommended in general. Defaults to False

  • opt_verbose (bool, optional) – Parameter for surrogate model fitting. If True, lots of output. Defaults to False

  • max_size_data_for_model (int, optional) – If this is set, we limit the number of observations the surrogate model is fitted on to this value. If there are more observations, they are down sampled, see SubsampleSingleFidelityStateConverter for details. This down sampling is repeated every time the model is fit. The opt_skip_* predicates are evaluated before the state is downsampled. Pass None in order not to apply such a threshold. The default is DEFAULT_MAX_SIZE_DATA_FOR_MODEL.

  • max_size_top_fraction (float, optional) – Only used if max_size_data_for_model is set. This fraction of the down sampled set is filled with the top entries in the full set, the remaining ones are sampled at random from the full set, see SubsampleSingleFidelityStateConverter for details. Defaults to 0.25.

  • opt_skip_init_length (int, optional) – Parameter for surrogate model fitting, skip predicate. Fitting is never skipped as long as the number of observations is below this threshold. Defaults to 150

  • opt_skip_period (int, optional) – Parameter for surrogate model fitting, skip predicate. If >1, and the number of observations is above opt_skip_init_length, fitting is done only on every K-th call (K = opt_skip_period), and skipped otherwise. Defaults to 1 (no skipping)

  • allow_duplicates (bool, optional) – If True, get_config() may return the same configuration more than once. Defaults to False

  • restrict_configurations (List[dict], optional) – If given, the searcher only suggests configurations from this list. This needs skip_local_optimization == True. If allow_duplicates == False, entries are popped off this list once suggested.

  • map_reward (str or MapReward, optional) –

    In the scheduler, the metric may be minimized or maximized, but internally, Bayesian optimization is minimizing the criterion. map_reward converts from metric to internal criterion:

    • ”minus_x”: criterion = -metric

    • ”<a>_minus_x”: criterion = <a> - metric. For example “1_minus_x” maps accuracy to zero-one error

    From a technical standpoint, it does not matter what is chosen here, because the criterion is only used internally. Also note that criterion data is always normalized to mean 0, variance 1 before being fitted with a Gaussian process. Defaults to “1_minus_x”

  • transfer_learning_task_attr (str, optional) – Used to support transfer HPO, where the state contains observed data from several tasks, one of which is the active one. To this end, config_space must contain a categorical parameter of name transfer_learning_task_attr, whose range consists of all task IDs. Also, transfer_learning_active_task must denote the active task, and transfer_learning_active_config_space is used as active_config_space argument in HyperparameterRanges. This allows us to use a narrower search space for the active task than for the union of all tasks (config_space must be that), which is needed if some configurations of non-active tasks lie outside of the ranges in active_config_space. One of the implications is that filter_observed_data() selects configs of the active task, so that incumbents or exclusion lists are restricted to data from the active task.

  • transfer_learning_active_task (str, optional) – See transfer_learning_task_attr.

  • transfer_learning_active_config_space (Dict[str, Any], optional) – See transfer_learning_task_attr. If not given, config_space is the search space for the active task as well. This active config space need not contain the transfer_learning_task_attr parameter. In fact, this parameter is set to a categorical with transfer_learning_active_task as single value, so that new configs are chosen for the active task only.

  • transfer_learning_model (str, optional) –

    See transfer_learning_task_attr. Specifies the surrogate model to be used for transfer learning:

    • ”matern52_product”: Kernel is product of Matern 5/2 (not ARD) on transfer_learning_task_attr and Matern 5/2 (ARD) on the rest. Assumes that data from same task are more closely related than data from different tasks

    • ”matern52_same”: Kernel is Matern 5/2 (ARD) on the rest of the variables, transfer_learning_task_attr is ignored. Assumes that data from all tasks can be merged together

    Defaults to “matern52_product”
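
A minimal sketch of the recommended construction path (hyperparameter and metric names are made up):

    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers import FIFOScheduler

    config_space = {
        "lr": loguniform(1e-4, 1e-1),
        "batch_size": randint(16, 128),
    }
    scheduler = FIFOScheduler(
        config_space,
        searcher="bayesopt",   # creates a GPFIFOSearcher internally
        metric="accuracy",
        mode="max",            # mapped to an internal minimization criterion
        random_seed=31415,
        search_options={
            "num_init_random": 5,
            "acq_function": "ei",
            "input_warping": True,
        },
    )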

clone_from_state(state)[source]

Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.

Parameters:

state – See above

Returns:

New searcher object

class syne_tune.optimizer.schedulers.searchers.GPMultiFidelitySearcher(config_space, metric, points_to_evaluate=None, **kwargs)[source]

Bases: GPFIFOSearcher

Gaussian process Bayesian optimization for asynchronous Hyperband scheduler.

This searcher must be used with a scheduler of type MultiFidelitySchedulerMixin. It provides a novel combination of Bayesian optimization, based on a Gaussian process surrogate model, with Hyperband scheduling. In particular, observations across resource levels are modelled jointly.

It is not recommended to create GPMultiFidelitySearcher objects directly, but rather to create HyperbandScheduler objects with searcher="bayesopt", passing arguments here via search_options. This will use the appropriate functions from syne_tune.optimizer.schedulers.searchers.gp_searcher_factory to create components in a consistent way.

Most of the comments for GPFIFOSearcher apply here as well. In multi-fidelity HPO, we optimize a function \(f(\mathbf{x}, r)\), with \(\mathbf{x}\) the configuration and \(r\) the resource (or time) attribute. The latter must be a positive integer. In most applications, resource_attr == "epoch", and the resource is the number of epochs already trained.

If model == "gp_multitask" (default), we model the function \(f(\mathbf{x}, r)\) jointly over all resource levels \(r\) at which it is observed (but see searcher_data in HyperbandScheduler). The kernel and mean function of our surrogate model are over \((\mathbf{x}, r)\). The surrogate model is selected by gp_resource_kernel. More details about the supported kernels are given in:

Tiao, Klein, Lienart, Archambeau, Seeger (2020)
Model-based Asynchronous Hyperparameter and Neural Architecture Search

The acquisition function (EI), which is optimized in get_config(), is obtained by fixing the resource level \(r\) to a value which is determined depending on the current state. If resource_acq == "bohb", \(r\) is the largest value <= max_t where we have seen \(\ge \mathrm{dimension}(\mathbf{x})\) metric values. If resource_acq == "first", \(r\) is the first milestone which config \(\mathbf{x}\) would reach when started.

Additional arguments on top of parent class GPFIFOSearcher.

Parameters:
  • model (str, optional) –

    Selects surrogate model (learning curve model) to be used. Choices are:

    • ”gp_multitask” (default): GP multi-task surrogate model

    • ”gp_independent”: Independent GPs for each rung level, sharing an ARD kernel

    • ”gp_issm”: Gaussian-additive model of ISSM type

    • ”gp_expdecay”: Gaussian-additive model of exponential decay type (as in Freeze Thaw Bayesian Optimization)

  • gp_resource_kernel (str, optional) – Only relevant for model == "gp_multitask". Surrogate model over criterion function \(f(\mathbf{x}, r)\), \(\mathbf{x}\) the config, \(r\) the resource. Note that \(\mathbf{x}\) is encoded to be a vector with entries in [0, 1], and \(r\) is linearly mapped to [0, 1], while the criterion data is normalized to mean 0, variance 1. The reference above provides details on the models supported here. For the exponential decay kernel, the base kernel over \(\mathbf{x}\) is Matern 5/2 ARD. See SUPPORTED_RESOURCE_MODELS for supported choices. Defaults to “exp-decay-sum”

  • resource_acq (str, optional) – Only relevant for model in {"gp_multitask", "gp_independent"}. Determines how the EI acquisition function is used. Values: “bohb”, “first”. Defaults to “bohb”

  • max_size_data_for_model (int, optional) –

    If this is set, we limit the number of observations the surrogate model is fitted on to this value. If there are more observations, they are down sampled, see SubsampleMultiFidelityStateConverter for details. This down sampling is repeated every time the model is fit, which ensures that the most recent data is taken into account. The opt_skip_* predicates are evaluated before the state is downsampled.

    Pass None not to apply such a threshold. The default is DEFAULT_MAX_SIZE_DATA_FOR_MODEL.

  • opt_skip_num_max_resource (bool, optional) – Parameter for surrogate model fitting, skip predicate. If True, and the number of observations is above opt_skip_init_length, fitting is done only when there is a new datapoint at r = max_t, and skipped otherwise. Defaults to False

  • issm_gamma_one (bool, optional) – Only relevant for model == "gp_issm". If True, the gamma parameter of the ISSM is fixed to 1, otherwise it is optimized over. Defaults to False

  • expdecay_normalize_inputs (bool, optional) – Only relevant for model == "gp_expdecay". If True, resource values r are normalized to [0, 1] as input to the exponential decay surrogate model. Defaults to False
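
A minimal sketch of the recommended construction path with asynchronous Hyperband (attribute names are made up; the training script is assumed to report validation_loss once per epoch):

    from syne_tune.config_space import loguniform, randint
    from syne_tune.optimizer.schedulers import HyperbandScheduler

    config_space = {
        "lr": loguniform(1e-4, 1e-1),
        "batch_size": randint(16, 128),
        "epochs": 27,
    }
    scheduler = HyperbandScheduler(
        config_space,
        searcher="bayesopt",   # creates a GPMultiFidelitySearcher internally
        metric="validation_loss",
        mode="min",
        resource_attr="epoch",
        max_resource_attr="epochs",
        search_options={
            "model": "gp_independent",
            "resource_acq": "bohb",
        },
    )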

configure_scheduler(scheduler)[source]

Some searchers need to obtain information from the scheduler they are used with, in order to configure themselves. This method has to be called before the searcher can be used.

Parameters:

scheduler (TrialScheduler) – Scheduler the searcher is used with.

register_pending(trial_id, config=None, milestone=None)[source]

Registers trial as pending. This means the corresponding evaluation task is running. Once it finishes, update is called for this trial.

evaluation_failed(trial_id)[source]

Called by scheduler if an evaluation job for a trial failed.

The searcher should react appropriately (e.g., remove pending evaluations for this trial, not suggest the configuration again).

Parameters:

trial_id (str) – ID of trial whose evaluation failed

cleanup_pending(trial_id)[source]

Removes all pending evaluations for trial trial_id.

This should be called after an evaluation terminates. For various reasons (e.g., termination due to convergence), pending candidates for this evaluation may still be present.

Parameters:

trial_id (str) – ID of trial whose pending evaluations should be cleared

remove_case(trial_id, **kwargs)[source]

Remove data case previously appended by _update()

For searchers which maintain the dataset of all cases (reports) passed to update, this method allows one case to be removed from the dataset.

Parameters:
  • trial_id (str) – ID of trial whose data is to be removed

  • kwargs – Extra arguments, optional

clone_from_state(state)[source]

Together with get_state(), this is needed in order to store and re-create the mutable state of the searcher.

Given state as returned by get_state(), this method combines the non-pickle-able part of the immutable state from self with state and returns the corresponding searcher clone. Afterwards, self is not used anymore.

Parameters:

state – See above

Returns:

New searcher object

Subpackages

Submodules