syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw module
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.ZeroKernel(dimension, **kwargs)[source]
Bases:
KernelFunction
Constant zero kernel. This works only in the context used here: we do not return matrices or vectors, but zero scalars.
- forward(X1, X2, **kwargs)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.ZeroMean(**kwargs)[source]
Bases:
MeanFunction
- forward(X)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.ExponentialDecayBaseKernelFunction(r_max, r_min=1, normalize_inputs=False, **kwargs)[source]
Bases:
KernelFunction
Implements the exponential decay kernel k_r(r, r') from the Freeze-Thaw paper, corresponding to ExponentialDecayResourcesKernelFunction with delta=0 and no x attributes.
Note: Inputs r lie in [r_min, r_max]. Optionally, they are normalized to [0, 1].
- forward(X1, X2)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
- param_encoding_pairs()[source]
Returns list of tuples (param_internal, encoding) over all Gluon parameters maintained here.
- Returns:
List [(param_internal, encoding)]
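For reference, the exponential decay kernel in the Freeze-Thaw paper (Swersky et al.) has the form k(r, r') = beta^alpha / (r + r' + beta)^alpha with alpha, beta > 0. A minimal NumPy sketch of this functional form (the function name, the fixed alpha/beta arguments, and the normalization step are illustrative; in the class above, alpha and beta are learned Gluon parameters):

```python
import numpy as np

def expdecay_kernel(r1, r2, alpha=1.0, beta=1.0,
                    r_min=1, r_max=100, normalize=False):
    """Exponential decay kernel k(r, r') = beta^alpha / (r + r' + beta)^alpha.

    r1, r2: 1-d arrays of resource values in [r_min, r_max].
    If normalize is True, inputs are first mapped to [0, 1].
    """
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    if normalize:
        r1 = (r1 - r_min) / (r_max - r_min)
        r2 = (r2 - r_min) / (r_max - r_min)
    # Pairwise sums r + r', shape (len(r1), len(r2))
    rsum = r1[:, None] + r2[None, :]
    return beta ** alpha / (rsum + beta) ** alpha
```

Note the characteristic behavior: k(r, r) decays as r grows, modeling learning curves that flatten out with increasing resource.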
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.logdet_cholfact_cov_resource(likelihood)[source]
Computes the additional log(det(Lbar)) term. This is sum_i log(det(Lbar_i)), where Lbar_i is the upper left submatrix of likelihood['lfact_all'], with size likelihood['ydims'][i].
- Parameters:
likelihood (Dict) – Result of resource_kernel_likelihood_computations
- Return type:
float
- Returns:
log(det(Lbar))
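Since each Lbar_i is triangular, log det(Lbar_i) is the sum of the logs of its diagonal entries, and the leading submatrices of a Cholesky factor share the same diagonal, so all terms can be read off one cumulative sum. A NumPy sketch of this computation (logdet_cholfact_sketch is a hypothetical helper, not the function above, which reads its inputs from the likelihood dict):

```python
import numpy as np

def logdet_cholfact_sketch(lfact_all, ydims):
    """Sum over datapoints i of log det(Lbar_i), where Lbar_i is the
    upper-left ydims[i] x ydims[i] submatrix of the Cholesky factor
    lfact_all. For a triangular factor, log det is the sum of the
    logs of the diagonal entries."""
    diag = np.diag(lfact_all)
    # Cumulative sums of the log-diagonal: entry d-1 is the log-det
    # of the leading d x d submatrix
    cumlog = np.cumsum(np.log(diag))
    return float(sum(cumlog[d - 1] for d in ydims))
```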
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.resource_kernel_likelihood_precomputations(targets)[source]
Precomputations required by resource_kernel_likelihood_computations.
Importantly, prepare_data orders datapoints by nonincreasing number of targets ydims[i]. For 0 <= j < ydim_max, where ydim_max = ydims[0] = max(ydims), num_configs[j] is the number of datapoints i for which ydims[i] > j. yflat is a flat matrix (rows corresponding to fantasy samples; a column vector if no fantasizing) consisting of ydim_max parts, where part j has size num_configs[j] and contains y[j] for the targets of those i counted in num_configs[j].
- Parameters:
targets (List[ndarray]) – Targets from data representation returned by prepare_data
- Return type:
Dict
- Returns:
See above
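The construction of num_configs and yflat can be sketched as follows (precompute_sketch is a hypothetical illustration and handles only the single-sample, column-vector case, not fantasizing):

```python
import numpy as np

def precompute_sketch(targets):
    """targets: list of 1-d arrays y_i, ordered by nonincreasing
    length ydims[i]. Returns num_configs, where num_configs[j] is the
    number of i with ydims[i] > j, and yflat, whose part j stacks
    y_i[j] over exactly those configs."""
    ydims = [len(y) for y in targets]
    assert ydims == sorted(ydims, reverse=True), \
        "datapoints must be ordered by nonincreasing ydims"
    ydim_max = ydims[0]
    num_configs = [sum(1 for d in ydims if d > j) for j in range(ydim_max)]
    # Part j: entries y_i[j] for the first num_configs[j] datapoints
    yflat = np.concatenate(
        [np.array([targets[i][j] for i in range(num_configs[j])])
         for j in range(ydim_max)]
    )
    return num_configs, yflat
```

With targets of sizes 3, 2, 1, this gives num_configs = [3, 2, 1] and yflat grouped by resource level rather than by config, which is what enables the bulk computations below.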
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.resource_kernel_likelihood_computations(precomputed, res_kernel, noise_variance, skip_c_d=False)[source]
Given precomputed from resource_kernel_likelihood_precomputations and resource kernel function res_kernel, compute quantities required for inference and marginal likelihood computation, pertaining to the likelihood of an additive model, as in the Freeze-Thaw paper.
Note that res_kernel takes raw (unnormalized) r as inputs. The code here works for any resource kernel and mean function, not just for ExponentialDecayBaseKernelFunction.
Results returned are:
- c: n vector [c_i]
- d: n vector [d_i], positive
- vtv: n vector [|v_i|^2]
- wtv: (n, F) matrix [(W_i)^T v_i], F the number of fantasy samples
- wtw: n vector [|w_i|^2] (only if no fantasizing)
- lfact_all: Cholesky factor for kernel matrix
- ydims: Target vector sizes (copy from precomputed)
- Parameters:
precomputed (Dict) – Output of resource_kernel_likelihood_precomputations
res_kernel (ExponentialDecayBaseKernelFunction) – Kernel k(r, r') over resources
noise_variance – Noise variance sigma^2
skip_c_d (bool) – If True, c and d are not computed
- Return type:
Dict
- Returns:
Quantities required for inference and learning criterion
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.resource_kernel_likelihood_slow_computations(targets, res_kernel, noise_variance, skip_c_d=False)[source]
Naive implementation of resource_kernel_likelihood_computations, which does not require precomputations but is somewhat slower. Here, results are computed one datapoint at a time, rather than in bulk.
This code is used in unit testing only.
- Return type:
Dict
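The per-datapoint scheme amounts to a loop of independent Gaussian log-likelihood terms, one per configuration. A generic sketch of that idea (naive_loglik_sketch is hypothetical; the actual function instead returns the quantities c, d, vtv, etc. from which the criterion is assembled):

```python
import numpy as np

def naive_loglik_sketch(targets, resources, kernel_fn, noise_variance):
    """Sum of zero-mean Gaussian log-likelihood terms, one datapoint
    (config) at a time: config i observes targets[i] at the first
    len(targets[i]) resource values."""
    total = 0.0
    for y in targets:
        d = len(y)
        r = np.asarray(resources[:d], dtype=float)
        # Kernel matrix over this config's observed resources, plus noise
        K = kernel_fn(r, r) + noise_variance * np.eye(d)
        L = np.linalg.cholesky(K)
        v = np.linalg.solve(L, y)
        # log N(y | 0, K) = -0.5 v'v - log det(L) - (d/2) log(2 pi)
        total += (-0.5 * (v @ v) - np.log(np.diag(L)).sum()
                  - 0.5 * d * np.log(2 * np.pi))
    return total
```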
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.predict_posterior_marginals_extended(poster_state, mean, kernel, test_features, resources, res_kernel)[source]
These are posterior marginals on the f_r = h + g_r variables, where (x, r) are zipped from test_features and resources. posterior_means is a (n, F) matrix, where F is the number of fantasy samples, or F == 1 without fantasizing.
- Parameters:
poster_state (Dict) – Posterior state
mean – Mean function
kernel – Kernel function
test_features – Feature matrix for test points (not extended)
resources (List[int]) – Resource values corresponding to rows of test_features
res_kernel (ExponentialDecayBaseKernelFunction) – Kernel k(r, r') over resources
- Returns:
posterior_means, posterior_variances
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.learncurve.freeze_thaw.sample_posterior_joint(poster_state, mean, kernel, feature, targets, res_kernel, noise_variance, lfact_all, means_all, random_state, num_samples=1)[source]
Given poster_state for some data plus one additional configuration with data (feature, targets), draw joint samples of the unobserved targets for this configuration. targets may be empty, but must not be complete (there must be some unobserved targets). The additional configuration must not be in the dataset used to compute poster_state.
If targets corresponds to resource values range(r_min, r_obs), we sample latent target values y_r corresponding to range(r_obs, r_max + 1), returning a dict with [y_r] under y (a matrix with num_samples columns).
- Parameters:
poster_state (Dict) – Posterior state for data
mean – Mean function
kernel – Kernel function
feature – Features for additional config
targets (ndarray) – Target values for additional config
res_kernel (ExponentialDecayBaseKernelFunction) – Kernel k(r, r') over resources
noise_variance – Noise variance sigma^2
lfact_all – Cholesky factor of complete resource kernel matrix
means_all – See lfact_all
random_state (RandomState) – numpy.random.RandomState
num_samples (int) – Number of joint samples to draw (default: 1)
- Return type:
Dict
- Returns:
See above
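At its core, this is conditional Gaussian sampling of the unobserved tail of a learning curve under the resource kernel. A generic NumPy sketch of that step (sample_unobserved is hypothetical and ignores the coupling to poster_state over h that the actual function handles):

```python
import numpy as np

def sample_unobserved(K_all, mean_all, y_obs, noise_variance,
                      num_samples, rng):
    """K_all: kernel matrix over all resources r_min..r_max, observed
    rows first; mean_all: mean vector over the same grid; y_obs:
    observed targets (first len(y_obs) rows). Draws joint samples of
    the unobserved tail, conditioned on y_obs."""
    n_obs = len(y_obs)
    K_oo = K_all[:n_obs, :n_obs] + noise_variance * np.eye(n_obs)
    K_uo = K_all[n_obs:, :n_obs]
    K_uu = K_all[n_obs:, n_obs:]
    if n_obs > 0:
        # Standard Gaussian conditioning formulas
        sol = np.linalg.solve(K_oo, y_obs - mean_all[:n_obs])
        cond_mean = mean_all[n_obs:] + K_uo @ sol
        cond_cov = K_uu - K_uo @ np.linalg.solve(K_oo, K_uo.T)
    else:
        # No observed targets: sample from the prior
        cond_mean = mean_all
        cond_cov = K_uu
    # Cholesky with a small jitter for numerical stability
    L = np.linalg.cholesky(cond_cov + 1e-9 * np.eye(cond_cov.shape[0]))
    eps = rng.standard_normal((cond_cov.shape[0], num_samples))
    return cond_mean[:, None] + L @ eps
```

Each column of the result is one joint sample of the unobserved targets, matching the [y_r] matrix with num_samples columns described above.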