syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon module
Gluon APIs for autograd
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon.Block(prefix=None, params=None)[source]

  Bases: object

  Base class for all neural network layers and models. Your models should subclass this class.

  Block can be nested recursively in a tree structure. You can create and assign child Block objects as regular attributes:

  ```python
  import mxnet as mx
  from mxnet.gluon import Block, nn
  from mxnet import ndarray as F

  class Model(Block):
      def __init__(self, **kwargs):
          super(Model, self).__init__(**kwargs)
          # use name_scope to give child Blocks appropriate names.
          with self.name_scope():
              self.dense0 = nn.Dense(20)
              self.dense1 = nn.Dense(20)

      def forward(self, x):
          x = F.relu(self.dense0(x))
          return F.relu(self.dense1(x))

  model = Model()
  model.initialize(ctx=mx.cpu(0))
  model(F.zeros((10, 10), ctx=mx.cpu(0)))
  ```
  Child Blocks assigned this way will be registered, and collect_params() will collect their Parameters recursively. You can also manually register child blocks with register_child().

  Parameters
  - prefix : str
    Prefix acts like a name space. All child blocks created in a parent block's name_scope() will have the parent block's prefix in their names. Please refer to the naming tutorial for more info on prefix and naming.
  - params : ParameterDict or None
    ParameterDict for sharing weights with the new Block. For example, if you want dense1 to share dense0's weights, you can do:

    ```python
    dense0 = nn.Dense(20)
    dense1 = nn.Dense(20, params=dense0.collect_params())
    ```
- name_scope()[source]

  Returns a name space object managing a child Block and parameter names. Should be used within a with statement:

  ```python
  with self.name_scope():
      self.dense = nn.Dense(20)
  ```

  Please refer to the naming tutorial for more info on prefix and naming.
- property params

  Returns this Block's parameter dictionary (does not include its children's parameters).
- collect_params(select=None)[source]

  Returns a ParameterDict containing this Block's and all of its children's Parameters (the default). It can also return a ParameterDict restricted to Parameters whose names match given regular expressions. For example, collect the parameters named 'conv1_weight', 'conv1_bias', 'fc_weight', and 'fc_bias':

  ```python
  model.collect_params('conv1_weight|conv1_bias|fc_weight|fc_bias')
  ```

  or collect all parameters whose names end with 'weight' or 'bias' using regular expressions:

  ```python
  model.collect_params('.*weight|.*bias')
  ```

  Parameters
  - select : str
    Regular expressions to match parameter names against.

  Returns
  The selected ParameterDict.
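  The same pattern works with this module's own classes. Below is a minimal sketch using only the signatures documented on this page; the prefix and parameter names are illustrative:

  ```python
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import Block

  block = Block(prefix="net_")
  # params is this Block's ParameterDict; get() creates named Parameters
  block.params.get("mean", shape=(1,))
  block.params.get("variance", shape=(1,))

  all_params = block.collect_params()            # every Parameter
  var_only = block.collect_params(".*variance")  # regex-filtered subset
  ```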
- register_child(block, name=None)[source]

  Registers block as a child of self. Blocks assigned to self as attributes will be registered automatically.
- apply(fn)[source]

  Applies fn recursively to every child block as well as to self.

  Parameters
  - fn : callable
    Function to be applied to each submodule, of the form fn(block).

  Returns
  this block
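  As a minimal sketch of register_child() and apply() together (the prefixes and the report callback are illustrative, not part of the API):

  ```python
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import Block

  root = Block(prefix="root_")
  child = Block(prefix="child_")
  root.register_child(child)  # attribute assignment would register it implicitly

  def report(block):
      # fn(block) is called once for each child block and for root itself
      print(type(block).__name__)

  root.apply(report)
  ```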
- initialize(init=None, ctx=None, verbose=False, force_reinit=False)[source]

  Initializes Parameters of this Block and its children. Equivalent to block.collect_params().initialize(...).

  Parameters
  - init : Initializer
    Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
  - ctx : Context or list of Context
    Keeps a copy of Parameters on one or many context(s).
  - verbose : bool, default False
    Whether to verbosely print out details on initialization.
  - force_reinit : bool, default False
    Whether to force re-initialization if a parameter is already initialized.
- cast(dtype)[source]

  Casts this Block to use another data type.

  Parameters
  - dtype : str or numpy.dtype
    The new data type.
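  Putting the Block API together, here is a minimal end-to-end sketch using this module's classes. It assumes the global default initializer is used when none is given; the Scale class itself is illustrative:

  ```python
  import numpy as np
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import Block

  class Scale(Block):
      """Block holding a single learnable scaling factor."""

      def __init__(self, **kwargs):
          super().__init__(**kwargs)
          with self.name_scope():
              # get() creates a Parameter named '<prefix>scale' in self.params
              self.scale = self.params.get("scale", shape=(1,))

      def forward(self, x):
          return self.scale.data() * x

  block = Scale()
  block.initialize()  # equivalent to block.collect_params().initialize()
  y = block.forward(np.ones((3,)))
  ```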
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon.Parameter(name, grad_req='write', shape=None, dtype=<class 'numpy.float64'>, lr_mult=1.0, wd_mult=1.0, init=None, allow_deferred_init=False, differentiable=True, stype='default', grad_stype='default')[source]

  Bases: object

  A container holding parameters (weights) of Blocks.

  Parameter holds a copy of the parameter on each Context after it is initialized with Parameter.initialize(...). If grad_req is not 'null', it will also hold a gradient array on each Context:

  ```python
  import numpy as np
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import Parameter

  x = np.zeros((16, 100))
  w = Parameter('fc_weight', shape=(16, 100), init=np.random.uniform)
  b = Parameter('fc_bias', shape=(100,), init=np.random.uniform)
  w.initialize()
  b.initialize()
  z = x + w.data()
  ```
  Parameters
  - name : str
    Name of this parameter.
  - grad_req : {'write', 'add', 'null'}, default 'write'
    Specifies how gradients are updated in the grad arrays:
    - 'write' means the gradient is written to the grad NDArray every time.
    - 'add' means the gradient is added to the grad NDArray every time. You need to manually call zero_grad() to clear the gradient buffer before each iteration when using this option.
    - 'null' means gradient is not requested for this parameter; gradient arrays will not be allocated.
  - shape : int or tuple of int, default None
    Shape of this parameter. By default, the shape is not specified. A Parameter with unknown shape can be used with the Symbol API, but init will throw an error when using the NDArray API.
  - dtype : numpy.dtype or str, default 'float64'
    Data type of this parameter. For example, numpy.float64 or 'float64'.
  - lr_mult : float, default 1.0
    Learning rate multiplier. The learning rate will be multiplied by lr_mult when updating this parameter with an optimizer.
  - wd_mult : float, default 1.0
    Weight decay multiplier (L2 regularizer coefficient). Works similarly to lr_mult.
  - init : Initializer, default None
    Initializer of this parameter. Will use the global initializer by default.
  - stype : {'default', 'row_sparse', 'csr'}, default 'default'
    The storage type of the parameter.
  - grad_stype : {'default', 'row_sparse', 'csr'}, default 'default'
    The storage type of the parameter's gradient.
  Attributes
  - grad_req : {'write', 'add', 'null'}
    This can be set before or after initialization. Setting grad_req to 'null' with x.grad_req = 'null' saves memory and computation when you don't need the gradient w.r.t. x.
  - lr_mult : float
    Local learning rate multiplier for this Parameter. The actual learning rate is calculated as learning_rate * lr_mult. You can set it with param.lr_mult = 2.0.
  - wd_mult : float
    Local weight decay multiplier for this Parameter.
- property grad_req
- property dtype

  The data type of the parameter. Setting the dtype value is equivalent to casting the value of the parameter.
- property shape

  The shape of the parameter. By default, an unknown dimension size is 0. However, when NumPy semantics are turned on, an unknown dimension size is -1.
- initialize(init=None, ctx=None, default_init=None, force_reinit=False)[source]

  Initializes parameter and gradient arrays. Only used with the NDArray API.

  Parameters
  - init : Initializer
    The initializer to use. Overrides Parameter.init() and default_init.
  - ctx : Context or list of Context, defaults to context.current_context()
    Initialize Parameter on the given context. If ctx is a list of Context, a copy will be made on each context.

    Note: Copies are independent arrays. The user is responsible for keeping their values consistent when updating. Normally gluon.Trainer does this for you.
  - default_init : Initializer
    Default initializer, used when both init and Parameter.init() are None.
  - force_reinit : bool, default False
    Whether to force re-initialization if the parameter is already initialized.

  Examples

  ```python
  >>> weight = mx.gluon.Parameter('weight', shape=(2, 2))
  >>> weight.initialize(ctx=mx.cpu(0))
  >>> weight.data()
  [[-0.01068833  0.01729892]
   [ 0.02042518 -0.01618656]]
  <NDArray 2x2 @cpu(0)>
  >>> weight.grad()
  [[ 0.  0.]
   [ 0.  0.]]
  <NDArray 2x2 @cpu(0)>
  >>> weight.initialize(ctx=[mx.gpu(0), mx.gpu(1)])
  >>> weight.data(mx.gpu(0))
  [[-0.00873779 -0.02834515]
   [ 0.05484822 -0.06206018]]
  <NDArray 2x2 @gpu(0)>
  >>> weight.data(mx.gpu(1))
  [[-0.00873779 -0.02834515]
   [ 0.05484822 -0.06206018]]
  <NDArray 2x2 @gpu(1)>
  ```
- reset_ctx(ctx)[source]

  Re-assigns Parameter to other contexts.

  Parameters
  - ctx : Context or list of Context, default context.current_context()
    Assign Parameter to the given context. If ctx is a list of Context, a copy will be made on each context.
- data(ctx=None)[source]

  Returns a copy of this parameter on one context. Must have been initialized on this context before. For sparse parameters, use Parameter.row_sparse_data() instead.

  Parameters
  - ctx : Context
    Desired context.

  Returns
  NDArray on ctx
- list_data()[source]

  Returns copies of this parameter on all contexts, in the same order as creation. For sparse parameters, use Parameter.list_row_sparse_data() instead.

  Returns
  list of NDArrays
- grad(ctx=None)[source]

  Returns a gradient buffer for this parameter on one context.

  Parameters
  - ctx : Context
    Desired context.
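  A minimal sketch of creating a Parameter with this module and reading its value and gradient buffer (the name and shape are illustrative; the default context is assumed to behave as in the Gluon API this module mirrors):

  ```python
  import numpy as np
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import Parameter

  w = Parameter("w", shape=(3, 2), init=np.random.uniform)
  w.initialize()

  values = w.data()  # copy of the parameter values on the default context
  g = w.grad()       # gradient buffer (allocated since grad_req='write')
  print(values.shape, g.shape)
  ```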
- class syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon.ParameterDict(prefix='', shared=None)[source]

  Bases: object

  A dictionary managing a set of parameters.

  Parameters
  - prefix : str, default ''
    The prefix to be prepended to the names of all Parameters created by this dict.
  - shared : ParameterDict or None
    If not None, when this dict's get() method creates a new parameter, it will first try to retrieve it from the "shared" dict. Usually used for sharing parameters with another Block.
- property prefix

  Prefix of this dict. It will be prepended to the names of Parameters created with get().
- get(name, **kwargs)[source]

  Retrieves a Parameter with name self.prefix + name. If not found, get() will first try to retrieve it from the "shared" dict. If still not found, get() will create a new Parameter with the given keyword arguments and insert it into self.

  Parameters
  - name : str
    Name of the desired Parameter. It will be prepended with this dictionary's prefix.

  Returns
  - Parameter
    The created or retrieved Parameter.
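  A minimal sketch of the create/retrieve behavior and of sharing between dicts (the prefixes and names are illustrative; sharing is assumed to match on the full prefixed name, as in Gluon):

  ```python
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import ParameterDict

  params = ParameterDict(prefix="model_")
  w = params.get("weight", shape=(4, 4))  # creates Parameter 'model_weight'
  assert params.get("weight") is w        # second call retrieves it

  # A second dict that consults `params` before creating anything new
  shared = ParameterDict(prefix="model_", shared=params)
  w_shared = shared.get("weight")         # retrieved from the shared dict
  ```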
- initialize(init=None, ctx=None, verbose=False, force_reinit=False)[source]

  Initializes all Parameters managed by this dictionary, to be used with the NDArray API. It has no effect when using the Symbol API.

  Parameters
  - init : Initializer
    Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
  - ctx : Context or list of Context
    Keeps a copy of Parameters on one or many context(s).
  - verbose : bool, default False
    Whether to verbosely print out details on initialization.
  - force_reinit : bool, default False
    Whether to force re-initialization if a parameter is already initialized.
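  For example, a dict can be initialized once and then forcibly re-initialized, e.g. to restart a fit from fresh values (a sketch; the parameter name is illustrative and the global default initializer is assumed when init is None):

  ```python
  from syne_tune.optimizer.schedulers.searchers.bayesopt.gpautograd.gluon import ParameterDict

  pd = ParameterDict(prefix="m_")
  pd.get("weight", shape=(2, 2))
  pd.initialize()                   # allocates and initializes 'm_weight'
  pd.initialize(force_reinit=True)  # re-draws values despite being initialized
  ```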
- reset_ctx(ctx)[source]

  Re-assigns all Parameters to other contexts.

  Parameters
  - ctx : Context or list of Context, default context.current_context()
    Assign Parameters to the given context. If ctx is a list of Context, a copy will be made on each context.
- list_ctx()[source]
Returns a list of all the contexts on which the underlying Parameters are initialized.
- setattr(name, value)[source]

  Sets an attribute to a new value for all Parameters. For example, set grad_req to 'null' if you don't need gradients w.r.t. a model's Parameters:

  ```python
  model.collect_params().setattr('grad_req', 'null')
  ```

  or change the learning rate multiplier:

  ```python
  model.collect_params().setattr('lr_mult', 0.5)
  ```

  Parameters
  - name : str
    Name of the attribute.
  - value : valid type for the attribute name
    The new value for the attribute.