syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model module

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.LinearCostModel[source]

Bases: CostModel

Deterministic cost model where both c0(x) and c1(x) are linear models of the form

c0(x) = np.dot(features0(x), weights0),
c1(x) = np.dot(features1(x), weights1)

The feature maps features0, features1 are supplied by subclasses. The weights are fit by ridge regression (sklearn.linear_model.RidgeCV), with the regularization constant set by leave-one-out cross-validation.
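For intuition, here is a minimal sketch of such a fit (not Syne Tune's actual code; features and targets below are synthetic). Since \(c(x, r) = c_0(x) + r c_1(x)\) is linear in the stacked features [features0(x), r * features1(x)], a single RidgeCV fit recovers both weight vectors; cv=None selects the regularization constant by an efficient leave-one-out scheme:

    import numpy as np
    from sklearn.linear_model import RidgeCV

    rng = np.random.default_rng(0)
    n, dim0, dim1 = 50, 3, 2
    features0 = rng.random((n, dim0))      # features0(x), shape (n, dim0)
    features1 = rng.random((n, dim1))      # features1(x), shape (n, dim1)
    r = rng.integers(1, 10, size=n)        # resource levels
    # Synthetic targets following c(x, r) = c0(x) + r * c1(x), plus noise
    w0, w1 = rng.random(dim0), rng.random(dim1)
    costs = features0 @ w0 + r * (features1 @ w1) + 0.01 * rng.normal(size=n)

    # One ridge fit over the stacked features recovers weights0, weights1
    features = np.hstack([features0, r[:, None] * features1])
    model = RidgeCV(alphas=np.logspace(-4, 2, 20))  # cv=None: efficient LOO
    model.fit(features, costs)
    weights0, weights1 = model.coef_[:dim0], model.coef_[dim0:]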

property cost_metric_name: str
Returns:

Name of metric in TrialEvaluations of cases in TuningJobState

feature_matrices(candidates)[source]

Has to be supplied by subclasses

Parameters:

candidates (List[Dict[str, Union[int, float, str]]]) – List of n candidate configs (non-extended)

Return type:

(ndarray, ndarray)

Returns:

Feature matrices features0 (n, dim0), features1 (n, dim1)

update(state)[source]

Update inner representation in order to be ready to return cost value samples.

Note: The metric cost_metric_name must be dict-valued in state, with keys being resource values \(r\). In order to support a proper estimation of \(c_0\) and \(c_1\), there should (ideally) be entries with the same \(x\) and different resource levels \(r\). The likelihood function takes into account that \(c(x, r) = c_0(x) + r c_1(x)\).

Parameters:

state (TuningJobState) – Current dataset (only trials_evaluations is used)
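For intuition (a hypothetical example, not part of the API): with cost values recorded at two resource levels for the same config, \(c_0\) and \(c_1\) can be separated by solving a small linear system:

    # Hypothetical dict-valued cost metric for one trial: keys are resource
    # levels r, values are observed costs c(x, r)
    cost_metric = {1: 12.3, 3: 30.1}

    # With c(x, r) = c0 + r * c1 and observations at r = 1 and r = 3:
    c1 = (cost_metric[3] - cost_metric[1]) / (3 - 1)  # 8.9
    c0 = cost_metric[1] - 1 * c1                      # 3.4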

sample_joint(candidates)[source]

Draws cost values \((c_0(x), c_1(x))\) for candidates (non-extended).

If the model is random, the sampling is done jointly. Moreover, if sample_joint() is called multiple times, the posterior must be updated after each call, so that the sample over the union of candidates from all calls is drawn jointly (but see resample()). If measurement noise is allowed in update(), this noise is not added here. A sample from \(c(x, r)\) is obtained as \(c_0(x) + r c_1(x)\). If the model is deterministic, the model fit in update() is simply evaluated.

Parameters:

candidates (List[Dict[str, Union[int, float, str]]]) – Non-extended configs

Return type:

List[CostValue]

Returns:

List of \((c_0(x), c_1(x))\)
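A hedged usage sketch (assuming cost_model is a LinearCostModel subclass instance on which update() has already been called, candidates is a list of configs, and CostValue exposes fields c0 and c1):

    r = 4
    cost_values = cost_model.sample_joint(candidates)  # List[CostValue]
    # c(x, r) = c0(x) + r * c1(x) for each candidate
    costs_at_r = [cv.c0 + r * cv.c1 for cv in cost_values]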

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.MLPLinearCostModel(num_inputs, num_outputs, num_hidden_layers, hidden_layer_width, batch_size, bs_exponent=None, extra_mlp=False, c0_mlp_feature=False, expected_hidden_layer_width=None)[source]

Bases: LinearCostModel

Deterministic linear cost model for a multi-layer perceptron.

If config is an HP configuration, num_hidden_layers(config) is the number of hidden layers, hidden_layer_width(config, layer) is the number of units in hidden layer layer (0-based), and batch_size(config) is the batch size.
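As a hedged illustration of these three callables (key names below are made up, not part of the API), for a config space storing widths under "units_0", "units_1", ...:

    def num_hidden_layers(config):
        return config["n_layers"]

    def hidden_layer_width(config, layer):
        return config[f"units_{layer}"]  # layer is 0-based

    def batch_size(config):
        return config["batch_size"]

    config = {"n_layers": 2, "units_0": 64, "units_1": 128, "batch_size": 32}
    assert hidden_layer_width(config, 1) == 128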

If expected_hidden_layer_width is given, it maps layer (0-based) to the expected layer width under random sampling. In this case, all MLP features are normalized to expected value 1 under random sampling (ignoring bs_exponent if it is != 1). Note: if needed, bs_exponent could be incorporated in general. If batch_size were uniform between a and b:

\[\mathrm{E}\left[ bs^{bs_{exp} - 1} \right] = \frac{b^{bs_{exp}} - a^{bs_{exp}}}{bs_{exp} \, (b - a)}\]
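A quick numerical sanity check of this expectation (a sketch; bounds and exponent are made up):

    import numpy as np

    a, b, k = 16.0, 256.0, 1.5  # k plays the role of bs_exponent
    bs = np.random.default_rng(0).uniform(a, b, size=1_000_000)
    mc = np.mean(bs ** (k - 1.0))                  # Monte Carlo estimate
    closed_form = (b**k - a**k) / (k * (b - a))    # formula above
    print(mc, closed_form)  # both approximately 11.2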

feature_matrices(candidates)[source]

Implements the abstract method of LinearCostModel.

Parameters:

candidates (List[Dict[str, Union[int, float, str]]]) – List of n candidate configs (non-extended)

Return type:

(ndarray, ndarray)

Returns:

Feature matrices features0 (n, dim0), features1 (n, dim1)

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.FixedLayersMLPCostModel(num_inputs, num_outputs, num_units_keys=None, bs_exponent=None, extra_mlp=False, c0_mlp_feature=False, expected_hidden_layer_width=None)[source]

Bases: MLPLinearCostModel

Linear cost model for an MLP with a fixed number of hidden layers.

static get_expected_hidden_layer_width(config_space, num_units_keys)[source]

Constructs the expected_hidden_layer_width function from the configuration space. This works because impute_points_to_evaluate imputes with the expected value under random sampling.

Parameters:
  • config_space (Dict) – Configuration space

  • num_units_keys (List[str]) – Keys into config_space for number of units of different layers

Returns:

expected_hidden_layer_width, exp_vals
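As a hedged illustration of what is returned (names are made up; this is not the actual helper): for a layer width drawn uniformly from an integer range, the expected width under random sampling is the midpoint, and the constructed function looks up such expectations per layer:

    num_units_keys = ["units_0", "units_1"]
    exp_vals = {"units_0": (16 + 256) / 2, "units_1": (16 + 256) / 2}

    def expected_hidden_layer_width(layer):
        # layer is 0-based, as elsewhere in this module
        return exp_vals[num_units_keys[layer]]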

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.NASBench201LinearCostModel(config_keys, map_config_values, conv_separate_features, count_sum)[source]

Bases: LinearCostModel

Deterministic linear cost model for NASBench201.

The cell graph is:

node1 = x0(node0)
node2 = x1(node0) + x2(node1)
node3 = x3(node0) + x4(node1) + x5(node2)

config_keys contains attribute names of x0, ..., x5 in a config, in this ordering. map_config_values maps values in the config (for fields corresponding to x0, ..., x5) to entries of Op.

Parameters:
  • config_keys (Tuple[str, ...]) – See above

  • map_config_values (Dict[str, int]) – See above

  • conv_separate_features (bool) – If True, we use separate features for nor_conv_1x1, nor_conv_3x3 (c1 has 4 features). Otherwise, these two are captured by a single feature (c1 has 3 features)

  • count_sum (bool) – If True, we use an additional feature for pointwise sum operators inside a cell (there are between 0 and 3)
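As a hedged sketch of the kind of statistic the feature maps are built from (key and value names below are made up for illustration), one can count operator types over the six cell edges:

    from collections import Counter

    config_keys = ("edge_0", "edge_1", "edge_2", "edge_3", "edge_4", "edge_5")
    map_config_values = {
        "skip_connect": 0, "none": 1, "nor_conv_1x1": 2,
        "nor_conv_3x3": 3, "avg_pool_3x3": 4,
    }
    config = {
        "edge_0": "nor_conv_3x3", "edge_1": "skip_connect", "edge_2": "none",
        "edge_3": "nor_conv_1x1", "edge_4": "avg_pool_3x3", "edge_5": "nor_conv_3x3",
    }

    # Count how often each operator occurs among the edges x0, ..., x5
    op_counts = Counter(map_config_values[config[k]] for k in config_keys)
    print(op_counts)  # Counter({3: 2, 0: 1, 1: 1, 2: 1, 4: 1})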

class Op(value)[source]

Bases: IntEnum

Operators which can occur on the cell edges x0, ..., x5.

SKIP_CONNECT = 0
NONE = 1
NOR_CONV_1x1 = 2
NOR_CONV_3x3 = 3
AVG_POOL_3x3 = 4

feature_matrices(candidates)[source]

Implements the abstract method of LinearCostModel.

Parameters:

candidates (List[Dict[str, Union[int, float, str]]]) – List of n candidate configs (non-extended)

Return type:

(ndarray, ndarray)

Returns:

Feature matrices features0 (n, dim0), features1 (n, dim1)

class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.BiasOnlyLinearCostModel[source]

Bases: LinearCostModel

Simple baseline: features0(x) = [1], features1(x) = [1]
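A minimal sketch of what these constant feature maps amount to (not the class's actual code):

    import numpy as np

    def bias_only_feature_matrices(candidates):
        # One bias column per candidate, for both c0 and c1
        ones = np.ones((len(candidates), 1))
        return ones, ones  # features0 (n, 1), features1 (n, 1)

With these features, ridge regression reduces to fitting two scalars, i.e. a constant cost model \(c(x, r) = w_0 + r w_1\).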

feature_matrices(candidates)[source]

Implements the abstract method of LinearCostModel.

Parameters:

candidates (List[Dict[str, Union[int, float, str]]]) – List of n candidate configs (non-extended)

Return type:

(ndarray, ndarray)

Returns:

Feature matrices features0 (n, dim0), features1 (n, dim1)