syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel package
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.KernelFunction(dimension, **kwargs)[source]
Bases: MeanFunction
Base class of kernel (or covariance) function \(k(x, x')\) (a sketch of this interface follows below).
- Parameters:
dimension (int) – Dimensionality of input points after encoding into ndarray
- property dimension
- Returns:
Dimension d of input points
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
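To make the interface contract concrete, here is a minimal sketch of a stationary kernel implementing these members in plain numpy. It is illustrative only: the real base class is a Gluon-style block with parameter encodings, and the class name RBFSketch is a hypothetical stand-in.
```python
import numpy as np

class RBFSketch:
    """Hypothetical stand-in for a KernelFunction subclass (plain numpy,
    not the package's Gluon-style block machinery)."""

    def __init__(self, dimension, bandwidth=1.0):
        self.dimension = dimension  # dimension d of encoded input points
        self.bandwidth = bandwidth

    def forward(self, X1, X2):
        # Kernel matrix k(X1, X2) of shape (n1, n2)
        sq_dist = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * sq_dist / self.bandwidth ** 2)

    def diagonal(self, X):
        # Diagonal of k(X, X), shape (n,); constant for a stationary kernel
        return np.ones(X.shape[0])

    def diagonal_depends_on_X(self):
        # Stationary kernel: the diagonal does not depend on X
        return False
```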
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.Matern52(dimension, ARD=False, encoding_type='logarithm', has_covariance_scale=True, **kwargs)[source]
Bases: KernelFunction
Block responsible for computing the Matern 5/2 kernel.
If ARD == False, inverse_bandwidths is a scalar broadcast to the d components (with d = dimension, i.e., the number of features in X).
Arguments on top of base class SquaredDistance:
- Parameters:
has_covariance_scale (bool) – Kernel has covariance scale parameter? Defaults to True
- property ARD: bool
- forward(X1, X2)[source]
Computes the Matern 5/2 kernel matrix (a plain numpy sketch of the formula follows this entry).
- Parameters:
X1 – input matrix, shape (n1, d)
X2 – input matrix, shape (n2, d)
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
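For reference, the Matern 5/2 covariance computed here has a standard closed form. A minimal numpy sketch of that formula, assuming a scalar inverse bandwidth (the ARD == False case) and leaving out the class's parameter encoding:
```python
import numpy as np

def matern52(X1, X2, inverse_bandwidth=1.0, covariance_scale=1.0):
    """Matern 5/2 kernel matrix of shape (n1, n2).

    With ARD == True, inverse_bandwidth would instead be a vector of d
    per-feature values; a scalar matches the ARD == False case.
    """
    diff = (X1[:, None, :] - X2[None, :, :]) * inverse_bandwidth
    d2 = np.sum(diff ** 2, axis=-1)        # squared scaled distances
    d = np.sqrt(5.0 * d2)
    return covariance_scale * (1.0 + d + 5.0 * d2 / 3.0) * np.exp(-d)

X = np.random.rand(4, 3)
K = matern52(X, X)
assert np.allclose(np.diag(K), 1.0)        # stationary: constant diagonal
```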
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.ExponentialDecayResourcesKernelFunction(kernel_x, mean_x, encoding_type='logarithm', alpha_init=1.0, mean_lam_init=0.5, gamma_init=0.5, delta_fixed_value=None, delta_init=0.5, max_metric_value=1.0, **kwargs)[source]
Bases: KernelFunction
Variant of the kernel function for modeling exponentially decaying learning curves, proposed in:
Swersky, K., Snoek, J., & Adams, R. P. (2014). Freeze-Thaw Bayesian Optimization. ArXiv:1406.3896 [cs, stat]. Retrieved from http://arxiv.org/abs/1406.3896
The argument in that paper actually justifies using a non-zero mean function (see ExponentialDecayResourcesMeanFunction) and centralizing the kernel proposed there. This is done here. Details in:
Tiao, Klein, Archambeau, Seeger (2020). Model-based Asynchronous Hyperparameter Optimization
We implement a new family of kernel functions, for which the additive Freeze-Thaw kernel is one instance (delta == 0). The kernel has parameters alpha, mean_lam, gamma > 0, and 0 <= delta <= 1. Note that beta = alpha / mean_lam is used in the Freeze-Thaw paper (where the Gamma distribution over lambda is parameterized differently). The additive Freeze-Thaw kernel is obtained for delta == 0 (use delta_fixed_value = 0); a sketch of its resource covariance follows this entry.
In fact, this class is configured with a kernel and a mean function over inputs x (dimension d) and represents a kernel (and mean function) over inputs (x, r) (dimension d + 1), where the resource attribute r >= 0 is last.
- forward(X1, X2, **kwargs)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
- param_encoding_pairs()[source]
Returns list of tuples (param_internal, encoding) over all Gluon parameters maintained here.
- Returns:
List [(param_internal, encoding)]
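For the additive Freeze-Thaw instance (delta == 0), the covariance over resources from the Swersky et al. paper is \(\kappa(r, r') = \beta^{\alpha} / (r + r' + \beta)^{\alpha}\) with beta = alpha / mean_lam. A minimal numpy sketch of this resource covariance alone; the class's gamma and delta terms, parameter encoding, and the combination with kernel_x are not reproduced here:
```python
import numpy as np

def freeze_thaw_resource_kernel(r1, r2, alpha=1.0, mean_lam=0.5):
    """kappa(r, r') = beta^alpha / (r + r' + beta)^alpha, beta = alpha / mean_lam.

    Exponential-decay covariance over resources from the Freeze-Thaw paper,
    i.e. the delta == 0 member of the kernel family described above.
    """
    beta = alpha / mean_lam
    return beta ** alpha / (np.add.outer(r1, r2) + beta) ** alpha

r = np.arange(1.0, 6.0)                     # resource levels, e.g. epochs
K_r = freeze_thaw_resource_kernel(r, r)     # covariance decays as r + r' grows
```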
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.ExponentialDecayResourcesMeanFunction(kernel, **kwargs)[source]
Bases: MeanFunction
- forward(X)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.FabolasKernelFunction(dimension=1, encoding_type='logarithm', u1_init=1.0, u3_init=0.0, **kwargs)[source]
Bases: KernelFunction
The kernel function proposed in:
Klein, A., Falkner, S., Bartels, S., Hennig, P., & Hutter, F. (2016). Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets, in AISTATS 2017. ArXiv:1605.07079 [cs, stat]. Retrieved from http://arxiv.org/abs/1605.07079
Please note this is only one of the components of the factorized kernel proposed in the paper. This is the finite-rank (“degenerate”) kernel for modeling data subset fraction sizes, defined as
\[k(x, y) = (U \phi(x))^T (U \phi(y)), \quad x, y \in [0, 1], \quad \phi(x) = [1, (1 - x)^2]^T, \quad U = \begin{bmatrix} u_1 & u_3 \\ 0 & u_2 \end{bmatrix},\]
where U is upper triangular with \(u_1, u_2 > 0\) (a numpy sketch follows this entry).
- forward(X1, X2)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
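Since the kernel is given in closed form, its finite-rank structure is easy to check directly. A minimal numpy sketch of the formula above (the standalone function is illustrative; the class manages u1, u2, u3 as encoded parameters):
```python
import numpy as np

def fabolas_kernel(x, y, u1=1.0, u2=1.0, u3=0.0):
    """k(x, y) = (U phi(x))^T (U phi(y)), phi(x) = [1, (1 - x)^2]^T."""
    U = np.array([[u1, u3],
                  [0.0, u2]])               # upper triangular, u1, u2 > 0
    phi_x = np.array([1.0, (1.0 - x) ** 2])
    phi_y = np.array([1.0, (1.0 - y) ** 2])
    return (U @ phi_x) @ (U @ phi_y)

# Rank-2 ("degenerate") kernel over dataset subset fractions in [0, 1]
print(fabolas_kernel(0.25, 0.5))
```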
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.ProductKernelFunction(kernel1, kernel2, name_prefixes=None, **kwargs)[source]
Bases: KernelFunction
Given two kernel functions K1, K2, this class represents the product kernel function given by
\[((x_1, x_2), (y_1, y_2)) \mapsto K_1(x_1, y_1) \cdot K_2(x_2, y_2)\]
We assume that the parameters of K1 and K2 are disjoint (a numpy sketch follows this entry).
- forward(X1, X2)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
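A minimal numpy sketch of the product construction, reusing the matern52 helper from the Matern52 entry for both factors. How the class splits the input columns between the two kernels is assumed here (first d1 columns to K1, the rest to K2):
```python
import numpy as np

def product_kernel(k1, k2, X1, X2, d1):
    """k((x1, x2), (y1, y2)) = K1(x1, y1) * K2(x2, y2), elementwise.

    Inputs are concatenated as [x1, x2]; the column split at d1 is an
    assumed convention for illustration.
    """
    return k1(X1[:, :d1], X2[:, :d1]) * k2(X1[:, d1:], X2[:, d1:])

X = np.random.rand(5, 4)
K = product_kernel(matern52, matern52, X, X, d1=2)   # shape (5, 5)
```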
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.FreezeThawKernelFunction(kernel_x, mean_x, encoding_type='logarithm', alpha_init=1.0, mean_lam_init=0.5, gamma_init=0.5, max_metric_value=1.0, **kwargs)[source]
Bases: KernelFunction
Variant of the kernel function for modeling exponentially decaying learning curves, proposed in:
Swersky, K., Snoek, J., & Adams, R. P. (2014). Freeze-Thaw Bayesian Optimization. ArXiv:1406.3896 [cs, stat]. Retrieved from http://arxiv.org/abs/1406.3896
The argument in that paper actually justifies using a non-zero mean function (see ExponentialDecayResourcesMeanFunction) and centralizing the kernel proposed there. This is done here. As in the Freeze-Thaw paper, learning curves for different configs are conditionally independent (see the sketch below this entry).
This class is configured with a kernel and a mean function over inputs x (dimension d) and represents a kernel (and mean function) over inputs (x, r) (dimension d + 1), where the resource attribute r >= 0 is last.
Note: This kernel is mostly for debugging! Its conditional independence assumptions allow for faster inference, as implemented in GaussProcExpDecayPosteriorState.
- forward(X1, X2, **kwargs)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
- param_encoding_pairs()[source]
Returns list of tuples (param_internal, encoding) over all Gluon parameters maintained here.
- Returns:
List [(param_internal, encoding)]
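To see what the conditional independence assumption buys, consider n configs, each observed at the same resource levels: the joint covariance is \(k_x(x_i, x_j) + \delta_{ij} \kappa(r, r')\), so the residual part is block-diagonal. A minimal numpy sketch of this structure, reusing the helpers from the entries above (purely illustrative; it does not reproduce GaussProcExpDecayPosteriorState):
```python
import numpy as np

def freeze_thaw_joint_cov(X, r, kernel_x, alpha=1.0, mean_lam=0.5):
    """Joint covariance over all (config, resource) pairs, config-major order.

    cov(e(x_i, r), e(x_j, r')) = k_x(x_i, x_j) + [i == j] * kappa(r, r').
    The per-config residual blocks sit on the diagonal, which is what
    enables the faster inference mentioned in the note above.
    """
    m = r.shape[0]
    Kx = kernel_x(X, X)                                       # (n, n)
    Kr = freeze_thaw_resource_kernel(r, r, alpha, mean_lam)   # (m, m)
    cov = np.kron(Kx, np.ones((m, m)))        # config-level covariance
    cov += np.kron(np.eye(X.shape[0]), Kr)    # block-diagonal residuals
    return cov

X = np.random.rand(3, 2)
r = np.arange(1.0, 5.0)
C = freeze_thaw_joint_cov(X, r, matern52)     # shape (12, 12)
```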
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.FreezeThawMeanFunction(kernel, **kwargs)[source]
Bases: MeanFunction
- forward(X)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.CrossValidationMeanFunction(kernel, **kwargs)[source]
Bases: MeanFunction
- forward(X)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.CrossValidationKernelFunction(kernel_main, kernel_residual, mean_main, num_folds, **kwargs)[source]
Bases: KernelFunction
Kernel function suitable for \(f(x, r)\) being the average of r validation metrics evaluated on different (train, validation) splits. More specifically, there are num_folds such splits, and \(f(x, r)\) is the average over the first r of them.
We model the score on fold k as \(e_k(x) = f(x) + g_k(x)\), where \(f(x)\) and the \(g_k(x)\) are a priori independent Gaussian processes with kernels kernel_main and kernel_residual (all \(g_k\) share the same kernel). Moreover, the \(g_k\) are zero-mean, while \(f(x)\) may have a mean function. Then:
\[f(x, r) = r^{-1} \sum_{k \le r} e_k(x),\]
\[k((x, r), (x', r')) = k_{main}(x, x') + \frac{k_{residual}(x, x')}{\mathrm{max}(r, r')}.\]
Note that kernel_main, kernel_residual are over inputs \(x\) (dimension d), while the kernel represented here is over inputs \((x, r)\) of dimension d + 1, where the resource attribute \(r\) (number of folds) is last (a sketch of this formula follows this entry).
Inputs are encoded. We assume a linear encoding for r with bounds 1 and num_folds. TODO: Right now, all HPs are encoded, and the resource attribute counts as an HP, even if it is not optimized over. This creates a dependence on how inputs are encoded.
- forward(X1, X2, **kwargs)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
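A minimal numpy sketch of the combination formula above, with kernel_main and kernel_residual as plain functions; the class additionally encodes inputs and extracts the resource r from the last column, which is omitted here:
```python
import numpy as np

def cross_validation_kernel(X1, r1, X2, r2, k_main, k_residual):
    """k((x, r), (x', r')) = k_main(x, x') + k_residual(x, x') / max(r, r')."""
    max_r = np.maximum.outer(r1, r2)           # (n1, n2) matrix of max(r, r')
    return k_main(X1, X2) + k_residual(X1, X2) / max_r

X = np.random.rand(6, 3)
r = np.random.randint(1, 5, size=6).astype(float)   # folds averaged so far
K = cross_validation_kernel(X, r, X, r, matern52, matern52)
# Averaging more folds shrinks the residual term, matching the derivation
```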
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.RangeKernelFunction(dimension, kernel, start, **kwargs)[source]
Bases: KernelFunction
Given a kernel function K and a range R, this class represents
\[(x, y) \mapsto K(x_R, y_R)\]
where \(x_R\) is the restriction of \(x\) to the components in R (a numpy sketch follows this entry).
- forward(X1, X2)[source]
Overrides to implement forward computation using NDArray. Only accepts positional arguments.
- Parameters:
*args – List of NDArray input tensors
- diagonal(X)[source]
- Parameters:
X – Input data, shape (n, d)
- Returns:
Diagonal of \(k(X, X)\), shape (n,)
- diagonal_depends_on_X()[source]
For stationary kernels, the diagonal does not depend on X.
- Returns:
Does diagonal() depend on X?
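A minimal numpy sketch of the restriction, reusing the matern52 helper from above. The exact slice convention is an assumption for illustration (presumably R starts at the constructor's start argument and spans the inner kernel's dimension):
```python
import numpy as np

def range_kernel(X1, X2, kernel, start, length):
    """(x, y) -> K(x_R, y_R): apply the inner kernel to a column slice R."""
    sl = slice(start, start + length)          # R = [start, start + length)
    return kernel(X1[:, sl], X2[:, sl])

X = np.random.rand(5, 4)
K = range_kernel(X, X, matern52, start=1, length=2)   # kernel over columns 1:3
```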
Submodules
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.base module
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.cross_validation module
decode_resource_values()
CrossValidationKernelFunction
CrossValidationKernelFunction.forward()
CrossValidationKernelFunction.diagonal()
CrossValidationKernelFunction.diagonal_depends_on_X()
CrossValidationKernelFunction.param_encoding_pairs()
CrossValidationKernelFunction.mean_function()
CrossValidationKernelFunction.get_params()
CrossValidationKernelFunction.set_params()
CrossValidationMeanFunction
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.exponential_decay module
ExponentialDecayResourcesKernelFunction
ExponentialDecayResourcesKernelFunction.forward()
ExponentialDecayResourcesKernelFunction.diagonal()
ExponentialDecayResourcesKernelFunction.diagonal_depends_on_X()
ExponentialDecayResourcesKernelFunction.param_encoding_pairs()
ExponentialDecayResourcesKernelFunction.mean_function()
ExponentialDecayResourcesKernelFunction.get_params()
ExponentialDecayResourcesKernelFunction.set_params()
ExponentialDecayResourcesMeanFunction
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.fabolas module
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.freeze_thaw module
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.product_kernel module
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.kernel.range_kernel module