syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon module

Gluon APIs for autograd

class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon.Block(prefix=None, params=None)[source]

Bases: object

Base class for all neural network layers and models. Your models should subclass this class. Blocks can be nested recursively in a tree structure. You can create and assign child Blocks as regular attributes:

import mxnet as mx
from mxnet import ndarray as F
from mxnet.gluon import Block, nn

class Model(Block):
    def __init__(self, **kwargs):
        super(Model, self).__init__(**kwargs)
        # use name_scope to give child Blocks appropriate names.
        with self.name_scope():
            self.dense0 = nn.Dense(20)
            self.dense1 = nn.Dense(20)

    def forward(self, x):
        x = F.relu(self.dense0(x))
        return F.relu(self.dense1(x))

model = Model()
model.initialize(ctx=mx.cpu(0))
model(F.zeros((10, 10), ctx=mx.cpu(0)))

Child Blocks assigned this way will be registered, and collect_params() will collect their Parameters recursively. You can also manually register child blocks with register_child().

Parameters

prefix : str

Prefix acts like a name space. All children blocks created in parent block’s name_scope() will have parent block’s prefix in their name. Please refer to naming tutorial for more info on prefix and naming.

params : ParameterDict or None

ParameterDict for sharing weights with the new Block. For example, if you want dense1 to share dense0’s weights, you can do:

dense0 = nn.Dense(20)
dense1 = nn.Dense(20, params=dense0.collect_params())

property prefix

Prefix of this Block.

property name

Name of this Block, without '_' at the end.

name_scope()[source]

Returns a name space object managing a child Block and parameter names. Should be used within a with statement:

with self.name_scope():
    self.dense = nn.Dense(20)

Please refer to the naming tutorial for more info on prefix and naming.

property params

Returns this Block’s parameter dictionary (does not include its children’s parameters).

collect_params(select=None)[source]

Returns a ParameterDict containing this Block's and all of its children's Parameters (the default), or only the Parameters whose names match the given regular expressions. For example, collect the specified parameters in ['conv1_weight', 'conv1_bias', 'fc_weight', 'fc_bias']:

model.collect_params('conv1_weight|conv1_bias|fc_weight|fc_bias')

or collect all parameters whose names end with 'weight' or 'bias' by using regular expressions:

model.collect_params('.*weight|.*bias')

Parameters

select : str

Regular expressions used to select Parameters by name.

Returns

The selected ParameterDict

register_child(block, name=None)[source]

Registers block as a child of self. Blocks assigned to self as attributes will be registered automatically.
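
As a hedged illustration (Container and MyLayer are hypothetical names, and it is an assumption that blocks stored inside a plain Python list are not registered automatically), a child kept in such a container can be registered by hand:

class Container(Block):
    def __init__(self, **kwargs):
        super(Container, self).__init__(**kwargs)
        with self.name_scope():
            # blocks held in a list are not assigned as attributes,
            # so register each of them explicitly
            self.layers = [MyLayer(), MyLayer()]
            for layer in self.layers:
                self.register_child(layer)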

apply(fn)[source]

Applies fn recursively to every child block as well as self.

Parameters

fn : callable

Function to be applied to each submodule, of form fn(block).

Returns

this block
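
As a minimal sketch (freeze_block is a hypothetical helper), apply() can be combined with the params property to act on every Parameter in the tree:

def freeze_block(block):
    # disable gradients for the Parameters defined directly on this
    # block; apply() handles the recursion over all children
    for param in block.params.values():
        param.grad_req = 'null'

model.apply(freeze_block)

The same effect can also be achieved with model.collect_params().setattr('grad_req', 'null'), as described for ParameterDict.setattr() below.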

initialize(init=None, ctx=None, verbose=False, force_reinit=False)[source]

Initializes Parameters of this Block and its children. Equivalent to block.collect_params().initialize(...).

Parameters

init : Initializer

Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.

ctx : Context or list of Context

Keeps a copy of Parameters on one or many context(s).

verbose : bool, default False

Whether to verbosely print out details on initialization.

force_reinit : bool, default False

Whether to force re-initialization if parameter is already initialized.

hybridize(active=True, **kwargs)[source]

Please refer to the description of HybridBlock.hybridize().

cast(dtype)[source]

Cast this Block to use another data type.

Parameters

dtype : str or numpy.dtype

The new data type.
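
For example, assuming the model from the class docstring above has been created and initialized, the whole tree can be switched to single precision:

# cast this Block and all of its children to float32
model.cast('float32')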

forward(*args)[source]

Override to implement forward computation using NDArray. Only accepts positional arguments.

Parameters

*args : list of NDArray

Input tensors.

hybrid_forward(*args)[source]

class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon.Parameter(name, grad_req='write', shape=None, dtype=<class 'numpy.float64'>, lr_mult=1.0, wd_mult=1.0, init=None, allow_deferred_init=False, differentiable=True, stype='default', grad_stype='default')[source]

Bases: object

A Container holding parameters (weights) of Blocks. Parameter holds a copy of the parameter on each Context after it is initialized with Parameter.initialize(...). If grad_req is not 'null', it will also hold a gradient array on each Context:

import numpy as np
x = np.zeros((16, 100))
w = Parameter('fc_weight', shape=(16, 100), init=np.random.uniform)
w.initialize()
z = x + w.data()

Parameters

name : str

Name of this parameter.

grad_req : {'write', 'add', 'null'}, default 'write'

Specifies how to update gradient to grad arrays.

  • 'write' means the gradient is written to the grad NDArray every time.
  • 'add' means the gradient is added to the grad NDArray every time. You need to manually call zero_grad() to clear the gradient buffer before each iteration when using this option.
  • 'null' means gradient is not requested for this parameter. Gradient arrays will not be allocated.

shape : int or tuple of int, default None

Shape of this parameter. By default, the shape is not specified. A Parameter with unknown shape can be used with the Symbol API, but init will throw an error when using the NDArray API.

dtype : numpy.dtype or str, default 'float64'

Data type of this parameter. For example, numpy.float64 or 'float64'.

lr_mult : float, default 1.0

Learning rate multiplier. The learning rate will be multiplied by lr_mult when updating this parameter with an optimizer.

wd_mult : float, default 1.0

Weight decay multiplier (L2 regularizer coefficient). Works similarly to lr_mult.

init : Initializer, default None

Initializer of this parameter. Will use the global initializer by default.

stype : {'default', 'row_sparse', 'csr'}, default 'default'

The storage type of the parameter.

grad_stype : {'default', 'row_sparse', 'csr'}, default 'default'

The storage type of the parameter’s gradient.

Attributes

grad_req : {'write', 'add', 'null'}

This can be set before or after initialization. Setting grad_req to 'null' with x.grad_req = 'null' saves memory and computation when you don't need gradients w.r.t. x.

lr_mult : float

Local learning rate multiplier for this Parameter. The actual learning rate is calculated with learning_rate * lr_mult. You can set it with param.lr_mult = 2.0.

wd_mult : float

Local weight decay multiplier for this Parameter.

property grad_req
property dtype

The type of the parameter. Setting the dtype value is equivalent to casting the value of the parameter.

property shape

The shape of the parameter. By default, an unknown dimension size is 0. However, when NumPy semantics are turned on, an unknown dimension size is -1.

initialize(init=None, ctx=None, default_init=None, force_reinit=False)[source]

Initializes parameter and gradient arrays. Only used for the NDArray API.

Parameters

init : Initializer

The initializer to use. Overrides Parameter.init() and default_init.

ctx : Context or list of Context, defaults to context.current_context()

Initialize Parameter on given context. If ctx is a list of Context, a copy will be made for each context.

Note: Copies are independent arrays. The user is responsible for keeping their values consistent when updating. Normally gluon.Trainer does this for you.

default_init : Initializer

Default initializer is used when both init() and Parameter.init() are None.

force_reinit : bool, default False

Whether to force re-initialization if parameter is already initialized.

Examples

>>> weight = mx.gluon.Parameter('weight', shape=(2, 2))
>>> weight.initialize(ctx=mx.cpu(0))
>>> weight.data()
[[-0.01068833  0.01729892]
 [ 0.02042518 -0.01618656]]
<NDArray 2x2 @cpu(0)>
>>> weight.grad()
[[ 0.  0.]
 [ 0.  0.]]
<NDArray 2x2 @cpu(0)>
>>> weight.initialize(ctx=[mx.gpu(0), mx.gpu(1)])
>>> weight.data(mx.gpu(0))
[[-0.00873779 -0.02834515]
 [ 0.05484822 -0.06206018]]
<NDArray 2x2 @gpu(0)>
>>> weight.data(mx.gpu(1))
[[-0.00873779 -0.02834515]
 [ 0.05484822 -0.06206018]]
<NDArray 2x2 @gpu(1)>

reset_ctx(ctx)[source]

Re-assign Parameter to other contexts.

Parameters

ctx : Context or list of Context, default context.current_context()

Assign Parameter to given context. If ctx is a list of Context, a copy will be made for each context.

set_data(data)[source]

Sets this parameter’s value on all contexts.

data(ctx=None)[source]

Returns a copy of this parameter on one context. Must have been initialized on this context before. For sparse parameters, use Parameter.row_sparse_data() instead.

Parameters

ctx : Context

Desired context.

Returns

NDArray on ctx
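
Reusing the constructor example above (with numpy imported as np), a minimal sketch of reading and writing the value:

w = Parameter('fc_weight', shape=(16, 100), init=np.random.uniform)
w.initialize()
value = w.data()                   # copy of the current value
w.set_data(np.zeros((16, 100)))    # overwrite the value on all contexts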

list_data()[source]

Returns copies of this parameter on all contexts, in the same order as creation. For sparse parameters, use Parameter.list_row_sparse_data() instead.

Returns

list of NDArrays

grad(ctx=None)[source]

Returns a gradient buffer for this parameter on one context.

Parameters

ctx : Context

Desired context.

list_grad()[source]

Returns gradient buffers on all contexts, in the same order as values().

list_ctx()[source]

Returns a list of contexts this parameter is initialized on.

zero_grad()[source]

Sets gradient buffer on all contexts to 0. No action is taken if parameter is uninitialized or doesn’t require gradient.
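
A minimal sketch of the pattern described for grad_req='add' above (the training loop body is elided, and batches is a placeholder):

w = Parameter('fc_weight', shape=(16, 100), init=np.random.uniform,
              grad_req='add')
w.initialize()
for batch in batches:
    w.zero_grad()   # clear the accumulated gradient before each iteration
    ...             # forward and backward passes add into w.grad()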

cast(dtype)[source]

Cast data and gradient of this Parameter to a new data type.

Parameters

dtype : str or numpy.dtype

The new data type.

class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon.ParameterDict(prefix='', shared=None)[source]

Bases: object

A dictionary managing a set of parameters.

Parameters

prefix : str, default ''

The prefix to be prepended to all Parameters’ names created by this dict.

shared : ParameterDict or None

If not None, when this dict's get() method creates a new parameter, it will first try to retrieve it from the "shared" dict. Usually used for sharing parameters with another Block.

items()[source]
keys()[source]
values()[source]
property prefix

Prefix of this dict. It will be prepended to the names of Parameters created with get().

get(name, **kwargs)[source]

Retrieves a Parameter with name self.prefix+name. If not found, get() will first try to retrieve it from the "shared" dict. If still not found, get() will create a new Parameter with the given keyword arguments and insert it into self.

Parameters

name : str

Name of the desired Parameter. It will be prepended with this dictionary’s prefix.

**kwargs : Dict[str, Any]

The rest of the keyword arguments for the created Parameter.

Returns

Parameter

The created or retrieved Parameter.
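
A minimal sketch (the prefix and shape are made up for illustration):

params = ParameterDict(prefix='model_')
# creates a new Parameter named 'model_weight' and inserts it into the dict
weight = params.get('weight', shape=(20, 10))
# a later call with the same name retrieves the existing Parameter
assert params.get('weight') is weight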

update(other)[source]

Copies all Parameters in other to self.
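
As a hedged sketch (encoder and decoder stand for any two Blocks), update() can be used to gather the parameters of several blocks into one dictionary, e.g. to initialize them in a single call:

params = ParameterDict()
params.update(encoder.collect_params())
params.update(decoder.collect_params())
params.initialize()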

initialize(init=None, ctx=None, verbose=False, force_reinit=False)[source]

Initializes all Parameters managed by this dictionary to be used for the NDArray API. It has no effect when using the Symbol API.

Parameters

init : Initializer

Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.

ctx : Context or list of Context

Keeps a copy of Parameters on one or many context(s).

verbose : bool, default False

Whether to verbosely print out details on initialization.

force_reinit : bool, default False

Whether to force re-initialization if parameter is already initialized.

reset_ctx(ctx)[source]

Re-assign all Parameters to other contexts.

Parameters

ctx : Context or list of Context, default context.current_context()

Assign Parameter to given context. If ctx is a list of Context, a copy will be made for each context.

list_ctx()[source]

Returns a list of all the contexts on which the underlying Parameters are initialized.

setattr(name, value)[source]

Set an attribute to a new value for all Parameters. For example, set grad_req to 'null' if you don't need gradients w.r.t. a model's Parameters:

model.collect_params().setattr('grad_req', 'null')

or change the learning rate multiplier:

model.collect_params().setattr('lr_mult', 0.5)

Parameters

name : str

Name of the attribute.

value : valid type for attribute name

The new value for the attribute.