syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op module
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op.AddJitterOp(*args, **kwargs)
Finds a small jitter to add to the diagonal of a square matrix so that the matrix becomes positive definite (in the sense that linalg.potrf works).
Given input x (positive semi-definite matrix) and sigsq_init (nonneg scalar), find sigsq_final (nonneg scalar), so that:
sigsq_final = sigsq_init + jitter, jitter >= 0,
x + sigsq_final * Id positive definite (so that potrf call works)
We return the matrix x + sigsq_final * Id, for which potrf has not failed.
For the gradient, the dependence of jitter on the inputs is ignored.
The values tried for sigsq_final are:
sigsq_init, sigsq_init + initial_jitter * (jitter_growth ** k), k = 0, 1, 2, ...,
initial_jitter = initial_jitter_factor * max(mean(diag(x)), 1)
Note: The scaling of initial_jitter with mean(diag(x)) is taken from GPy. The rationale is that the largest eigenvalue of x is >= mean(diag(x)), and likely of this magnitude.
There is no guarantee that the Cholesky factor returned is well-conditioned enough for subsequent computations to be reliable. A better solution would be to estimate the condition number of the Cholesky factor, and to add jitter until this is bounded below a threshold we tolerate. See
Higham, N. A Survey of Condition Number Estimation for Triangular Matrices. MIMS EPrint: 2007.10. Algorithm 4.1 could work for us.
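The search over sigsq_final described above can be sketched in plain NumPy. This is an illustration only, not the code behind AddJitterOp: the function name add_jitter, the default values for initial_jitter_factor and jitter_growth, and the max_tries cutoff are all assumptions made for the example, and np.linalg.cholesky stands in for the potrf call.

```python
import numpy as np


def add_jitter(x, sigsq_init, initial_jitter_factor=1e-9, jitter_growth=10.0,
               max_tries=20):
    """Return (x + sigsq_final * Id, sigsq_final) for the first candidate
    sigsq_final whose shifted matrix has a Cholesky factorization.

    Hypothetical helper: the defaults and max_tries are assumed values.
    """
    n = x.shape[0]
    initial_jitter = initial_jitter_factor * max(np.mean(np.diag(x)), 1.0)
    # Candidates: sigsq_init itself, then sigsq_init plus geometrically
    # growing jitter, initial_jitter * jitter_growth ** k for k = 0, 1, 2, ...
    candidates = [sigsq_init] + [
        sigsq_init + initial_jitter * jitter_growth ** k
        for k in range(max_tries)
    ]
    for sigsq_final in candidates:
        shifted = x + sigsq_final * np.eye(n)
        try:
            np.linalg.cholesky(shifted)  # stand-in for the potrf call
        except np.linalg.LinAlgError:
            continue  # factorization failed, try the next (larger) value
        return shifted, sigsq_final
    raise np.linalg.LinAlgError("no candidate jitter gave a positive definite matrix")
```

If x + sigsq_init * Id already factorizes, the first candidate is accepted and no jitter is added. The sketch stops at the first success and makes no attempt to control the conditioning of the factor, which mirrors the caveat in the note above.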
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op.flatten_and_concat(x, sigsq_init)
- syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.custom_op.cholesky_factorization(*args, **kwargs)
Replacement for autograd.numpy.linalg.cholesky(). Our backward (vjp) is faster and simpler, while somewhat less general (only works if a.ndim == 2).
See https://arxiv.org/abs/1710.08717 for derivation of backward (vjp) expression.
- Parameters:
a – Symmetric positive definite matrix A
- Returns:
Lower-triangular Cholesky factor L of A
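For orientation, a backward (vjp) expression for the Cholesky factor of a 2-D matrix can be sketched in NumPy as below. It uses the commonly stated form A_bar = 0.5 * L^{-T} copyltu(L^T L_bar) L^{-1}, where copyltu mirrors the lower triangle into the upper one. The helper names copyltu and cholesky_vjp are hypothetical, and this is not necessarily how cholesky_factorization implements its backward pass.

```python
import numpy as np


def copyltu(m):
    """Symmetric matrix built from the lower triangle (incl. diagonal) of m."""
    return np.tril(m) + np.tril(m, -1).T


def cholesky_vjp(l, l_bar):
    """Map a gradient l_bar w.r.t. L to a symmetric gradient w.r.t. A = L L^T.

    Hypothetical helper; only the lower triangle of l_bar influences the result.
    """
    inner = copyltu(l.T @ l_bar)
    # left = L^{-T} inner
    left = np.linalg.solve(l.T, inner)
    # left L^{-1}, computed as (L^{-T} left^T)^T
    return 0.5 * np.linalg.solve(l.T, left.T).T


# Usage: gradient of f(A) = sum(W * cholesky(A)) with respect to A
rng = np.random.default_rng(0)
b = rng.standard_normal((4, 4))
a = b @ b.T + 4.0 * np.eye(4)          # symmetric positive definite
w = np.tril(rng.standard_normal((4, 4)))
a_bar = cholesky_vjp(np.linalg.cholesky(a), w)
```

The sketch uses general solves for brevity. A triangular solver would exploit the structure of L and is the more natural choice in practice.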