syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op module
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op.AddJitterOp(*args, **kwargs)
Finds a small jitter to add to the diagonal of a square matrix in order to render the matrix positive definite (so that linalg.potrf works).
Given the input x (positive semi-definite matrix) and sigsq_init (nonnegative scalar), find sigsq_final (nonnegative scalar) such that sigsq_final = sigsq_init + jitter, jitter >= 0, and x + sigsq_final * Id is positive definite (so that the potrf call works). We return the matrix x + sigsq_final * Id, for which potrf has not failed. For the gradient, the dependence of jitter on the inputs is ignored.
The values tried for sigsq_final are sigsq_init, then sigsq_init + initial_jitter * (jitter_growth ** k) for k = 0, 1, 2, ..., where initial_jitter = initial_jitter_factor * max(mean(diag(x)), 1) (see the sketch after this entry).
Note: The scaling of initial_jitter with mean(diag(x)) is taken from GPy. The rationale is that the largest eigenvalue of x is >= mean(diag(x)), and likely of this magnitude.
There is no guarantee that the Cholesky factor returned is well-conditioned enough for subsequent computations to be reliable. A better solution would be to estimate the condition number of the Cholesky factor, and to add jitter until this is bounded below a threshold we tolerate. See
Higham, N.: A Survey of Condition Number Estimation for Triangular Matrices. MIMS EPrint 2007.10
Algorithm 4.1 there could work for us.
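Below is a minimal NumPy sketch of the jitter-growth schedule described above, not the AddJitterOp implementation itself: it uses numpy.linalg.cholesky in place of the internal potrf call, and the function name, retry cap, and default values for initial_jitter_factor and jitter_growth are illustrative assumptions.

```python
import numpy as np

def add_jitter_sketch(x, sigsq_init, initial_jitter_factor=1e-9,
                      jitter_growth=10.0, max_tries=10):
    """Sketch of the jitter-growth schedule; defaults are illustrative only."""
    n = x.shape[0]
    # initial_jitter scales with mean(diag(x)), as described above
    initial_jitter = initial_jitter_factor * max(np.mean(np.diag(x)), 1.0)
    jitter = 0.0
    for k in range(max_tries):
        sigsq_final = sigsq_init + jitter
        try:
            # Succeeds iff x + sigsq_final * Id is (numerically) positive definite
            np.linalg.cholesky(x + sigsq_final * np.eye(n))
            return x + sigsq_final * np.eye(n), sigsq_final
        except np.linalg.LinAlgError:
            # Next value tried: sigsq_init + initial_jitter * jitter_growth ** k
            jitter = initial_jitter * jitter_growth ** k
    raise np.linalg.LinAlgError("could not render matrix positive definite")
```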
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op.flatten_and_concat(x, sigsq_init)[source]
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op.cholesky_factorization(*args, **kwargs)
Replacement for autograd.numpy.linalg.cholesky(). Our backward (vjp) is faster and simpler, while somewhat less general (it only works if a.ndim == 2). See https://arxiv.org/abs/1710.08717 for a derivation of the backward (vjp) expression (a sketch follows this entry).
- Parameters:
a – Symmetric positive definite matrix A
- Returns:
Lower-triangular Cholesky factor L of A
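As a rough illustration, here is a NumPy/SciPy sketch of the standard level-3 reverse-mode (vjp) expression for the Cholesky factorization referenced above. It is not the custom_op implementation, and the function and variable names are made up for this example.

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_vjp_sketch(l_mat, l_bar):
    """Input cotangent a_bar for A, given L = cholesky(A) and cotangent l_bar.

    Uses a_bar = symmetrize(L^{-T} Phi(L^T l_bar) L^{-1}), where Phi keeps
    the lower triangle and halves the diagonal.
    """
    n = l_mat.shape[0]
    # Phi(L^T l_bar): lower triangle with halved diagonal
    phi = np.tril(l_mat.T @ l_bar) / (1.0 + np.eye(n))
    # Apply L^{-T} ... L^{-1} via two triangular solves against L^T
    tmp = solve_triangular(l_mat, phi.T, lower=True, trans='T').T  # = Phi @ L^{-1}
    s = solve_triangular(l_mat, tmp, lower=True, trans='T')        # = L^{-T} @ Phi @ L^{-1}
    return (s + s.T) / 2.0  # symmetrize
```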