Initializers¶
Initializers provide initial values for network parameter blobs. In Caffe, they are called Fillers.

class
NullInitializer
¶ An initializer that does nothing. To initialize with zeros, use a ConstantInitializer.

class
ConstantInitializer
¶ Set everything to a constant.

value
¶ The value used to initialize a parameter blob. Typically this is set to 0.


class
XavierInitializer
¶ An initializer based on [BengioGlorot2010], but it does not use the fanout value. It fills the parameter blob by sampling uniformly at random from \([-S, S]\), where the scale \(S=\sqrt{3 / F_{\text{in}}}\). Here \(F_{\text{in}}\) is the fanin: the number of input nodes.
A heuristic is used to determine the fanin: for an ND tensor parameter blob, the product of the first N-1 dimensions is taken as the fanin, while the last dimension is taken as the fanout.
[BengioGlorot2010] Y. Bengio and X. Glorot, Understanding the difficulty of training deep feedforward neural networks, in Proceedings of AISTATS 2010, pp. 249–256.
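The sampling rule and the fanin heuristic above can be sketched as follows. This is a minimal NumPy illustration, not the framework's actual implementation; the function name `xavier_init` is hypothetical.

```python
import numpy as np

def xavier_init(shape, rng=np.random.default_rng(0)):
    # Hypothetical sketch of the Xavier (fanin-only) rule described above.
    # Fanin heuristic: product of the first N-1 dimensions of the blob shape.
    fan_in = int(np.prod(shape[:-1]))
    # Scale S = sqrt(3 / F_in); sample uniformly from [-S, S].
    scale = np.sqrt(3.0 / fan_in)
    return rng.uniform(-scale, scale, size=shape)

W = xavier_init((256, 128))  # fanin = 256, fanout = 128
```

Sampling uniformly on \([-S, S]\) with this scale gives each element variance \(1 / F_{\text{in}}\), which keeps activation magnitudes roughly constant across layers.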

class
GaussianInitializer
¶ Initialize the elements of the parameter blob as independent and identically distributed Gaussian random variables.

mean
¶ Default 0.

std
¶ Default 1.
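A minimal NumPy sketch of this initializer, with the mean and std defaults listed above; the function name `gaussian_init` is hypothetical.

```python
import numpy as np

def gaussian_init(shape, mean=0.0, std=1.0, rng=np.random.default_rng(0)):
    # Hypothetical sketch: each element drawn i.i.d. from N(mean, std^2),
    # with the defaults mean=0, std=1 documented above.
    return rng.normal(loc=mean, scale=std, size=shape)
```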


class
OrthogonalInitializer
¶ Initialize the parameter blob to be a random orthogonal matrix (i.e. \(W^TW=I\)), times a scalar gain factor. Based on [Saxe2013].
[Saxe2013] Andrew M. Saxe, James L. McClelland, and Surya Ganguli, Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, http://arxiv.org/abs/1312.6120, with a presentation at https://www.youtube.com/watch?v=Ap7atxKi3Q
gain
¶ Default 1. Use \(\sqrt{2}\) for layers with ReLU activations.
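One common way to produce such a matrix, sketched here in NumPy under the assumption that the blob is flattened to 2-D (first N-1 dimensions as rows, last dimension as columns, with at least as many rows as columns), is to take the Q factor of a QR decomposition of a Gaussian matrix; the function name `orthogonal_init` is hypothetical.

```python
import numpy as np

def orthogonal_init(shape, gain=1.0, rng=np.random.default_rng(0)):
    # Hypothetical sketch: flatten the blob to 2-D, rows >= columns assumed.
    rows = int(np.prod(shape[:-1]))
    cols = shape[-1]
    a = rng.standard_normal((rows, cols))
    # The Q factor of a Gaussian matrix has orthonormal columns: Q^T Q = I.
    q, r = np.linalg.qr(a)
    # Sign correction so the result is drawn uniformly from orthogonal matrices.
    q *= np.sign(np.diag(r))
    # Scale by the gain factor, e.g. sqrt(2) for ReLU layers.
    return gain * q.reshape(shape)
```

With a gain \(g\), the result satisfies \(W^T W = g^2 I\) rather than \(W^T W = I\).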
