syne_tune.optimizer.schedulers.searchers.bayesopt.models.subsample_state_multi_fidelity module

syne_tune.optimizer.schedulers.searchers.bayesopt.models.subsample_state_multi_fidelity.cap_size_tuning_job_state(state, max_size, random_state=None)[source]

Returns a state which is identical to state, except that trials_evaluations are replaced by a subset, so that the total number of metric values is at most max_size. Filtering is done by preserving data from trials which have observations at higher resource levels. For some trials, values at low resource levels may be removed while values at higher ones are kept, in order to meet the max_size constraint.

Parameters:
  • state (TuningJobState) – Original state to filter down

  • max_size (int) – Maximum number of observed metric values in new state

  • random_state (Optional[RandomState]) – Used for random sampling. Defaults to numpy.random.

Return type:

TuningJobState

Returns:

New state meeting the max_size constraint. This is a copy of state even if state already meets the constraint.
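The following is a minimal, self-contained sketch of the filtering strategy described above, not the library's implementation; the data layout (a dict mapping trial IDs to {resource_level: metric_value} dicts) and the helper name cap_observations are hypothetical:

    import numpy as np

    def cap_observations(observations, max_size, random_state=None):
        # ``observations`` maps trial_id -> {resource_level: metric_value}
        # (hypothetical layout). At most ``max_size`` values are kept,
        # filling from the highest resource levels downwards, so values
        # at low resource levels are dropped first.
        if random_state is None:
            random_state = np.random
        result = {trial_id: dict() for trial_id in observations}
        budget = max_size
        levels = sorted(
            {r for values in observations.values() for r in values}, reverse=True
        )
        for level in levels:
            entries = [
                (trial_id, values[level])
                for trial_id, values in observations.items()
                if level in values
            ]
            if len(entries) > budget:
                # Not all values at this level fit: keep a random subset
                subset = random_state.choice(len(entries), size=budget, replace=False)
                entries = [entries[i] for i in subset]
            for trial_id, value in entries:
                result[trial_id][level] = value
            budget -= len(entries)
            if budget == 0:
                break
        return {trial_id: values for trial_id, values in result.items() if values}

    # Example: with max_size=4, all values at levels 9 and 3 are kept, and one
    # of the three level-1 values is chosen at random
    observations = {0: {1: 0.9, 3: 0.7, 9: 0.6}, 1: {1: 0.8}, 2: {1: 0.85, 3: 0.75}}
    print(cap_observations(observations, max_size=4))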

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.subsample_state_multi_fidelity.SubsampleMultiFidelityStateConverter(max_size, random_state=None)[source]

Bases: StateForModelConverter

Converts the state by (possibly) downsampling the observations so that their total number is at most max_size. This is done in a way that trials with observations at higher rung levels are retained (with all their data), so observations are preferentially removed at low levels, and from trials which do not have observations higher up.

This state converter makes sense if observed data is only used at geometrically spaced rung levels, so the number of observations per trial remains small. If a trial accumulates on the order of max_resource_level observations, this converter does not work well, because it ends up retaining densely sampled observations from very few trials. Use SubsampleMFDenseDataStateConverter in such a case.

set_random_state(random_state)[source]

Some state converters use random sampling. For these, the random state has to be set before first usage.

Parameters:

random_state (RandomState) – Random state to be used internally
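As a hedged usage sketch (assuming the converter is applied to a TuningJobState by calling it, per the StateForModelConverter interface; the variable state is a placeholder):

    import numpy as np

    from syne_tune.optimizer.schedulers.searchers.bayesopt.models.subsample_state_multi_fidelity import (
        SubsampleMultiFidelityStateConverter,
    )

    # Cap the number of observed metric values passed to the surrogate model
    converter = SubsampleMultiFidelityStateConverter(max_size=500)
    converter.set_random_state(np.random.RandomState(31415927))

    # ``state`` would be a TuningJobState collected during tuning; calling the
    # converter (assumed interface) returns a capped copy:
    # capped_state = converter(state)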

syne_tune.optimizer.schedulers.searchers.bayesopt.models.subsample_state_multi_fidelity.sparsify_tuning_job_state(state, max_size, grace_period, reduction_factor)[source]

Performs the first step of the state conversion done in SubsampleMFDenseDataStateConverter: dense observations are sparsified w.r.t. a geometrically spaced rung level system.

Parameters:
  • state (TuningJobState) – Original state to filter down

  • max_size (int) – Maximum number of observed metric values in new state

  • grace_period (int) – Minimum resource level \(r_{min}\)

  • reduction_factor (float) – Reduction factor \(\eta\)

Return type:

TuningJobState

Returns:

New state which either meets the max_size constraint, or is maximally sparsified
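A minimal, self-contained sketch of this sparsification step follows; it is conceptual rather than the library's code, and the data layout (trial_id -> {resource_level: metric_value}) and the helper name sparsify_observations are hypothetical:

    import numpy as np

    def sparsify_observations(observations, max_size, grace_period, reduction_factor):
        # ``observations`` maps trial_id -> {resource_level: metric_value}
        # (hypothetical layout). Rung levels r_k = grace_period * reduction_factor**k
        # define buckets B_k = {r_{k-1} + 1, ..., r_k}. Within each bucket, at
        # most one observation per trial is kept (the one at the largest
        # resource). Buckets are processed top down, and the process stops as
        # soon as the total size drops to ``max_size``.
        def bucket_index(resource):
            if resource <= grace_period:
                return 0
            # Small tolerance guards against floating-point roundoff
            ratio = np.log(resource / grace_period) / np.log(reduction_factor)
            return int(np.ceil(ratio - 1e-10))

        result = {trial_id: dict(values) for trial_id, values in observations.items()}
        total = sum(len(values) for values in result.values())
        top = max(bucket_index(r) for values in result.values() for r in values)
        for k in range(top, -1, -1):  # top down over buckets
            for values in result.values():
                in_bucket = sorted(r for r in values if bucket_index(r) == k)
                for r in in_bucket[:-1]:  # drop all but the largest resource
                    if total <= max_size:
                        return result
                    del values[r]
                    total -= 1
        return result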

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.subsample_state_multi_fidelity.SubsampleMFDenseDataStateConverter(max_size, grace_period=None, reduction_factor=None, random_state=None)[source]

Bases: SubsampleMultiFidelityStateConverter

Variant of SubsampleMultiFidelityStateConverter, which has the same goal, but does subsampling in a different way. The current default for most GP-based multi-fidelity algorithms (e.g., MOBSTER, Hyper-Tune) is to use observations only at geometrically spaced rung levels (such as 1, 3, 9, …), in which case SubsampleMultiFidelityStateConverter makes sense.

But for some (e.g., DyHPO), observations are recorded at all (or linearly spaced) resource levels, so there is much more data for trials which progressed further. Here, we do the state conversion in two steps, always stopping the process once the target size max_size is reached. We assume a geometric rung level spacing, given by grace_period and reduction_factor, only for the purpose of state conversion. In the first step, we sparsify the observations. If each rung level \(r_k\) defines a bucket \(B_k = \{r_{k-1} + 1, \dots, r_k\}\), each trial should have at most one observation in each bucket. Sparsification is done top down. If the result of this first step is still larger than max_size, we continue with subsampling as in SubsampleMultiFidelityStateConverter.
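To make the assumed rung level and bucket structure concrete, here is a small illustration (values chosen for the example, not taken from the library):

    # grace_period = 1, reduction_factor = 3 gives rung levels 1, 3, 9, 27, ...
    grace_period, reduction_factor, max_resource = 1, 3, 27
    rung_levels = [grace_period]
    while rung_levels[-1] < max_resource:
        rung_levels.append(min(round(rung_levels[-1] * reduction_factor), max_resource))
    print(rung_levels)  # [1, 3, 9, 27]

    # Buckets B_k = {r_{k-1} + 1, ..., r_k}
    buckets = [
        set(range(rung_levels[k - 1] + 1 if k > 0 else 1, rung_levels[k] + 1))
        for k in range(len(rung_levels))
    ]
    print(buckets)  # [{1}, {2, 3}, {4, ..., 9}, {10, ..., 27}]

    # A trial observed at every resource 1, ..., 12 keeps at most one value per
    # bucket after the first (sparsification) step, e.g. those at r = 1, 3, 9, 12.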