syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model module
- class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.LinearCostModel[source]
Bases: CostModel
Deterministic cost model where both \(c_0(x)\) and \(c_1(x)\) are linear models of the form c0(x) = np.dot(features0(x), weights0), c1(x) = np.dot(features1(x), weights1). The feature maps features0, features1 are supplied by subclasses. The weights are fit by ridge regression, using sklearn.linear_model.RidgeCV; the regularization constant is set by leave-one-out (LOO) cross-validation.
- property cost_metric_name: str
- Returns:
Name of metric in TrialEvaluations of cases in TuningJobState
- feature_matrices(candidates)[source]
Has to be supplied by subclasses.
- Parameters:
candidates (List[Dict[str, Union[int, float, str]]]) – List of n candidate configs (non-extended)
- Return type:
(ndarray, ndarray)
- Returns:
Feature matrices features0 of shape (n, dim0) and features1 of shape (n, dim1)
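For concreteness, here is a minimal sketch of a subclass (the class name, the config key batch_size, and the metric name elapsed_time are all hypothetical assumptions, not part of the library), showing the expected shapes of the two feature matrices:

```python
import numpy as np

from syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model import (
    LinearCostModel,
)


class BatchSizeLinearCostModel(LinearCostModel):
    """Hypothetical model: c0 is a constant, c1 scales with batch size."""

    @property
    def cost_metric_name(self) -> str:
        return "elapsed_time"  # assumed name of the cost metric

    def feature_matrices(self, candidates):
        n = len(candidates)
        features0 = np.ones((n, 1))  # bias-only features for c0 -> (n, 1)
        bs = np.array([float(config["batch_size"]) for config in candidates])
        features1 = np.stack([np.ones(n), bs], axis=1)  # features for c1 -> (n, 2)
        return features0, features1
```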
- update(state)[source]
Update the inner representation in order to be ready to return cost value samples.
Note: The metric cost_metric_name must be dict-valued in state, with keys being resource values \(r\). In order to support a proper estimation of \(c_0\) and \(c_1\), there should (ideally) be entries with the same \(x\) and different resource levels \(r\). The likelihood function takes into account that \(c(x, r) = c_0(x) + r c_1(x)\).
- Parameters:
state (TuningJobState) – Current dataset (only trials_evaluations is used)
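To illustrate the dict-valued metric, a sketch of one trial's evaluations as update() expects them (the metric name, resource keys, and values are hypothetical; the exact key type follows TrialEvaluations):

```python
# The cost metric maps resource levels r (e.g. epochs) to observed costs for
# the same config x, so that c(x, r) = c0(x) + r * c1(x) can be estimated.
metrics = {
    "elapsed_time": {"1": 12.3, "2": 24.1, "4": 47.8},  # hypothetical values
}
```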
- sample_joint(candidates)[source]
Draws cost values \((c_0(x), c_1(x))\) for candidates (non-extended). If the model is random, the sampling is done jointly. Moreover, if sample_joint() is called multiple times, the posterior is to be updated after each call, such that the sample over the union of candidates over all calls is drawn jointly (but see resample()). Also, if measurement noise is allowed in update(), this noise is not added here. A sample from \(c(x, r)\) is obtained as \(c_0(x) + r c_1(x)\). If the model is deterministic, the model determined in update() is just evaluated.
- Parameters:
candidates (List[Dict[str, Union[int, float, str]]]) – Non-extended configs
- Return type:
List[CostValue]
- Returns:
List of \((c_0(x), c_1(x))\)
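A minimal end-to-end sketch, reusing the hypothetical subclass from above and assuming a TuningJobState state and a list of candidate configs are in scope, and that CostValue carries the two components as c0 and c1 (consistent with the return description):

```python
model = BatchSizeLinearCostModel()
model.update(state)  # state: TuningJobState with the dict-valued cost metric
for config, cost in zip(candidates, model.sample_joint(candidates)):
    r = 4  # some resource level
    print(config, cost.c0 + r * cost.c1)  # cost of running config for r units
```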
- class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.MLPLinearCostModel(num_inputs, num_outputs, num_hidden_layers, hidden_layer_width, batch_size, bs_exponent=None, extra_mlp=False, c0_mlp_feature=False, expected_hidden_layer_width=None)[source]
Bases: LinearCostModel
Deterministic linear cost model for a multi-layer perceptron.
If config is a HP configuration, num_hidden_layers(config) is the number of hidden layers, hidden_layer_width(config, layer) is the number of units in hidden layer layer (0-based), and batch_size(config) is the batch size.
If expected_hidden_layer_width is given, it maps layer (0-based) to the expected layer width under random sampling. In this case, all MLP features are normalized to expected value 1 under random sampling (but ignoring bs_exponent if != 1). Note: If needed, we could incorporate bs_exponent in general. If batch_size were uniform between \(a\) and \(b\):
\[\mathbb{E}\left[ bs^{bs_{exp} - 1} \right] = \frac{b^{bs_{exp}} - a^{bs_{exp}}}{bs_{exp} (b - a)}\]
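A quick numerical sanity check of this expectation (the interval and exponent are illustrative values only):

```python
import numpy as np

# For bs ~ Uniform(a, b) and p = bs_exp, the closed form above gives
# E[bs^(p-1)] = (b^p - a^p) / (p * (b - a)); compare with Monte Carlo.
a, b, p = 16.0, 256.0, 0.5
rng = np.random.default_rng(0)
bs = rng.uniform(a, b, size=1_000_000)
print(np.mean(bs ** (p - 1)))          # ~0.1 (Monte Carlo estimate)
print((b**p - a**p) / (p * (b - a)))   # 0.1 exactly (closed form)
```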
- class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.FixedLayersMLPCostModel(num_inputs, num_outputs, num_units_keys=None, bs_exponent=None, extra_mlp=False, c0_mlp_feature=False, expected_hidden_layer_width=None)[source]
Bases: MLPLinearCostModel
Linear cost model for an MLP with num_hidden_layers hidden layers.
- static expected_hidden_layer_width(config_space, num_units_keys)[source]
Constructs the expected_hidden_layer_width function from the training evaluation function. Works because impute_points_to_evaluate imputes with the expected value under random sampling.
- Parameters:
config_space (Dict) – Configuration space
num_units_keys (List[str]) – Keys into config_space for the number of units of different layers
- Returns:
expected_hidden_layer_width, exp_vals
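A construction sketch under assumed names (the config space, its keys, and the input/output sizes are hypothetical), using the static helper documented above:

```python
from syne_tune.config_space import randint
from syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model import (
    FixedLayersMLPCostModel,
)

# Hypothetical config space: two hidden layer widths and a batch size
config_space = {
    "n_units_1": randint(4, 1024),
    "n_units_2": randint(4, 1024),
    "batch_size": randint(8, 256),
}
num_units_keys = ["n_units_1", "n_units_2"]
# Expected layer widths under random sampling, used to normalize features
expected_width, _ = FixedLayersMLPCostModel.expected_hidden_layer_width(
    config_space, num_units_keys
)
cost_model = FixedLayersMLPCostModel(
    num_inputs=28 * 28,  # e.g. flattened MNIST images
    num_outputs=10,
    num_units_keys=num_units_keys,
    expected_hidden_layer_width=expected_width,
)
```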
- class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.NASBench201LinearCostModel(config_keys, map_config_values, conv_separate_features, count_sum)[source]
Bases: LinearCostModel
Deterministic linear cost model for NASBench201.
The cell graph is:
node1 = x0(node0)
node2 = x1(node0) + x2(node1)
node3 = x3(node0) + x4(node1) + x5(node2)
config_keys contains the attribute names of x0, ..., x5 in a config, in this ordering. map_config_values maps values in the config (for fields corresponding to x0, ..., x5) to entries of Op.
- Parameters:
config_keys (Tuple[str, ...]) – See above
map_config_values (Dict[str, int]) – See above
conv_separate_features (bool) – If True, we use separate features for nor_conv_1x1, nor_conv_3x3 (c1 has 4 features). Otherwise, these two are captured by a single feature (c1 has 3 features)
count_sum (bool) – If True, we use an additional feature for pointwise sum operators inside a cell (there are between 0 and 3)
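A construction sketch: the config keys below and the integer codes in map_config_values are placeholders (assumptions for illustration); they must match your NASBench201 config space and the Op entries of this module:

```python
from syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model import (
    NASBench201LinearCostModel,
)

config_keys = ("x0", "x1", "x2", "x3", "x4", "x5")  # hypothetical key names
map_config_values = {  # placeholder codes; align with Op in this module
    "skip_connect": 0,
    "none": 1,
    "nor_conv_1x1": 2,
    "nor_conv_3x3": 3,
    "avg_pool_3x3": 4,
}
cost_model = NASBench201LinearCostModel(
    config_keys=config_keys,
    map_config_values=map_config_values,
    conv_separate_features=True,  # c1 gets 4 features
    count_sum=True,               # extra feature counting sum operators
)
```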
- class syne_tune.optimizer.schedulers.searchers.bayesopt.models.cost.linear_cost_model.BiasOnlyLinearCostModel[source]
Bases: LinearCostModel
Simple baseline: features0(x) = [1], features1(x) = [1]
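In other words, every candidate shares the same fitted constants \(c_0\) and \(c_1\); the implied feature map is simply (illustrative sketch, not the library's implementation):

```python
import numpy as np

def bias_only_feature_matrices(candidates):
    # Both c0 and c1 use a single constant feature, so ridge regression
    # fits one scalar weight each, shared by all configs.
    n = len(candidates)
    return np.ones((n, 1)), np.ones((n, 1))
```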