Neural Network RNN Cells
Module for constructing RNN Cells.
Base interface for all RNN Cells
class tf.nn.rnn_cell.RNNCell
Abstract object representing an RNN cell.
The definition of cell in this package differs from the definition used in the literature. In the literature, cell refers to an object with a single scalar output. The definition in this package refers to a horizontal array of such units.
An RNN cell, in the most abstract setting, is anything that has a state and performs some operation that takes a matrix of inputs. This operation results in an output matrix with self.output_size columns. If self.state_size is an integer, this operation also results in a new state matrix with self.state_size columns. If self.state_size is a tuple of integers, then it results in a tuple of len(state_size) state matrices, each with a column size corresponding to values in state_size.
This module provides a number of basic, commonly used RNN cells, such as LSTM (Long Short Term Memory) or GRU (Gated Recurrent Unit), and a number of operators that allow adding dropouts, projections, or embeddings for inputs. Constructing multi-layer cells is supported by the class MultiRNNCell, or by calling the rnn ops several times. Every RNNCell must have the properties below and implement __call__ with the following signature.
tf.nn.rnn_cell.RNNCell.__call__(inputs, state, scope=None)
Run this RNN cell on inputs, starting from the given state.
Args:
inputs: 2-D tensor with shape [batch_size x input_size].
state: if self.state_size is an integer, this should be a 2-D Tensor with shape [batch_size x self.state_size]. Otherwise, if self.state_size is a tuple of integers, this should be a tuple with shapes [batch_size x s] for s in self.state_size.
scope: VariableScope for the created subgraph; defaults to class name.
Returns:
A pair containing:
- Output: A 2-D tensor with shape [batch_size x self.output_size].
- New state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of state.
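As a rough illustration of this calling contract, the sketch below uses nested Python lists in place of tensors. ToyCell and its update rule are hypothetical and are not part of tf.nn.rnn_cell; the sketch only mirrors the shapes described above, taking inputs of shape [batch_size x input_size] and a state of shape [batch_size x state_size], and returning an (output, new_state) pair.

```python
class ToyCell:
    """Hypothetical cell: each state entry accumulates the sum of its input row."""

    def __init__(self, num_units):
        self._num_units = num_units

    @property
    def output_size(self):
        return self._num_units

    @property
    def state_size(self):
        return self._num_units

    def __call__(self, inputs, state, scope=None):
        # inputs: batch_size rows of input_size floats.
        # state:  batch_size rows of state_size floats.
        new_state = [
            [s + sum(row) for s in state_row]
            for row, state_row in zip(inputs, state)
        ]
        # For this toy cell the output is the new state, so both have
        # output_size == state_size columns.
        return new_state, new_state


cell = ToyCell(num_units=3)
inputs = [[1.0, 2.0], [0.5, 0.5]]            # batch_size=2, input_size=2
state = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]   # batch_size=2, state_size=3
output, new_state = cell(inputs, state)
```

The scope argument is accepted only to match the documented signature; the sketch ignores it.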
tf.nn.rnn_cell.RNNCell.output_size
Integer or TensorShape: size of outputs produced by this cell.
tf.nn.rnn_cell.RNNCell.state_size
size(s) of state(s) used by this cell.
It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes.
tf.nn.rnn_cell.RNNCell.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
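The two return cases can be sketched in plain Python, again with nested lists standing in for tensors. This is a minimal sketch under stated assumptions, not the library's implementation: the helper takes state_size as an explicit argument, handles only ints and nested lists or tuples of ints (no TensorShape), and omits dtype.

```python
def zero_state(state_size, batch_size):
    """Build zero-filled state(s) as nested lists; a stand-in for tensors."""
    if isinstance(state_size, int):
        # Single state: a [batch_size x state_size] matrix of zeros.
        return [[0.0] * state_size for _ in range(batch_size)]
    # Nested list or tuple: recurse and keep the same container structure.
    return type(state_size)(zero_state(s, batch_size) for s in state_size)


single = zero_state(3, batch_size=2)        # one 2 x 3 matrix
pair = zero_state((2, 3), batch_size=2)     # tuple of a 2 x 2 and a 2 x 3 matrix
```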
RNN Cells for use with TensorFlow's core RNN methods
class tf.nn.rnn_cell.BasicRNNCell
The most basic RNN cell.
tf.nn.rnn_cell.BasicRNNCell.__call__(inputs, state, scope=None)
Most basic RNN: output = new_state = activation(W input + U state + B).
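The formula above can be written out as a small pure-Python sketch. The function name, the explicit W, U and B arguments, and the weight layouts are assumptions made for illustration; the real cell creates and manages its own variables.

```python
import math


def basic_rnn_step(inputs, state, W, U, B, activation=math.tanh):
    """One step of output = new_state = activation(W input + U state + B).

    inputs: [batch_size x input_size], state: [batch_size x num_units],
    W: [input_size x num_units], U: [num_units x num_units], B: [num_units].
    """
    num_units = len(B)
    new_state = []
    for x, h in zip(inputs, state):
        row = []
        for j in range(num_units):
            pre = B[j]
            pre += sum(x[i] * W[i][j] for i in range(len(x)))
            pre += sum(h[k] * U[k][j] for k in range(num_units))
            row.append(activation(pre))
        new_state.append(row)
    # The output and the new state are the same matrix.
    return new_state, new_state


W = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical weights: input_size=2, num_units=2
U = [[0.0, 0.0], [0.0, 0.0]]
B = [0.0, 0.0]
output, new_state = basic_rnn_step([[0.0, 1.0]], [[0.0, 0.0]], W, U, B)
```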
tf.nn.rnn_cell.BasicRNNCell.__init__(num_units, input_size=None, activation=tanh)
tf.nn.rnn_cell.BasicRNNCell.output_size
tf.nn.rnn_cell.BasicRNNCell.state_size
tf.nn.rnn_cell.BasicRNNCell.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.BasicLSTMCell
Basic LSTM recurrent network cell.
The implementation is based on: http://arxiv.org/abs/1409.2329.
We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting in the beginning of the training.
It does not allow cell clipping or a projection layer, and does not use peep-hole connections: it is the basic baseline.
For advanced models, please use the full LSTMCell that follows.
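To see what forget_bias does, the sketch below applies a common formulation of the LSTM update elementwise to one example. The gate equations, the function names, and the convention of passing the gate pre-activations i, j, f and o directly are assumptions for illustration, not a transcription of the library code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def basic_lstm_update(i, j, f, o, c_prev, forget_bias=1.0):
    """Elementwise LSTM update for one example, given gate pre-activations.

    i, j, f, o are the input gate, candidate, forget gate and output gate
    pre-activations (lists of num_units floats); c_prev is the previous cell state.
    """
    new_c = [
        c * sigmoid(f_k + forget_bias) + sigmoid(i_k) * math.tanh(j_k)
        for c, i_k, j_k, f_k in zip(c_prev, i, j, f)
    ]
    new_h = [math.tanh(c) * sigmoid(o_k) for c, o_k in zip(new_c, o)]
    return new_c, new_h


zeros = [0.0, 0.0]
c_prev = [1.0, -1.0]
# With all pre-activations at zero, forget_bias=1.0 keeps sigmoid(1) (about 0.73)
# of the old cell state, whereas forget_bias=0.0 keeps only sigmoid(0) = 0.5.
c_with_bias, _ = basic_lstm_update(zeros, zeros, zeros, zeros, c_prev)
c_no_bias, _ = basic_lstm_update(zeros, zeros, zeros, zeros, c_prev, forget_bias=0.0)
```

A larger retained fraction early in training is what "reducing the scale of forgetting" refers to.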
tf.nn.rnn_cell.BasicLSTMCell.__call__(inputs, state, scope=None)
Long short-term memory cell (LSTM).
tf.nn.rnn_cell.BasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh)
Initialize the basic LSTM cell.
Args:
num_units: int, The number of units in the LSTM cell.
forget_bias: float, The bias added to forget gates (see above).
input_size: Deprecated and unused.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. The latter behavior will soon be deprecated.
activation: Activation function of the inner states.
tf.nn.rnn_cell.BasicLSTMCell.output_size
tf.nn.rnn_cell.BasicLSTMCell.state_size
tf.nn.rnn_cell.BasicLSTMCell.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.GRUCell
Gated Recurrent Unit cell (cf. http://arxiv.org/abs/1406.1078).
tf.nn.rnn_cell.GRUCell.__call__(inputs, state, scope=None)
Gated recurrent unit (GRU) with num_units cells.
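A GRU step can be sketched for a single example as follows. The gate equations follow a common formulation (reset gate r, update gate u, candidate c); the params dictionary, its key names, and the weight layout are hypothetical and chosen only for this illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(vec, weights, bias):
    """Return vec @ weights + bias for one row vector (weights: [len(vec) x n])."""
    return [
        bias[j] + sum(v * weights[i][j] for i, v in enumerate(vec))
        for j in range(len(bias))
    ]


def gru_step(x, h, params):
    """One GRU step for a single example; x and h are lists of floats."""
    xh = x + h
    r = [sigmoid(v) for v in linear(xh, params["W_r"], params["b_r"])]  # reset gate
    u = [sigmoid(v) for v in linear(xh, params["W_u"], params["b_u"])]  # update gate
    # The candidate sees the previous state scaled by the reset gate.
    x_rh = x + [r_k * h_k for r_k, h_k in zip(r, h)]
    c = [math.tanh(v) for v in linear(x_rh, params["W_c"], params["b_c"])]
    new_h = [u_k * h_k + (1.0 - u_k) * c_k for u_k, h_k, c_k in zip(u, h, c)]
    # In this sketch the output equals the new state.
    return new_h, new_h


num_units, input_size = 2, 1
zero_w = [[0.0] * num_units for _ in range(input_size + num_units)]
params = {  # hypothetical all-zero parameters
    "W_r": zero_w, "b_r": [0.0] * num_units,
    "W_u": zero_w, "b_u": [0.0] * num_units,
    "W_c": zero_w, "b_c": [0.0] * num_units,
}
output, new_h = gru_step([1.0], [0.4, -0.2], params)
```

With all-zero parameters the update gate is 0.5 and the candidate is 0, so the new state is half the old one.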
tf.nn.rnn_cell.GRUCell.__init__(num_units, input_size=None, activation=tanh)
tf.nn.rnn_cell.GRUCell.output_size
tf.nn.rnn_cell.GRUCell.state_size
tf.nn.rnn_cell.GRUCell.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.LSTMCell
Long short-term memory unit (LSTM) recurrent network cell.
The default non-peephole implementation is based on:
http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf
S. Hochreiter and J. Schmidhuber. "Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.
The peephole implementation is based on:
https://research.google.com/pubs/archive/43905.pdf
Hasim Sak, Andrew Senior, and Francoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014.
The class uses optional peep-hole connections, optional cell clipping, and an optional projection layer.
tf.nn.rnn_cell.LSTMCell.__call__(inputs, state, scope=None)
Run one step of LSTM.
Args:
inputs: input Tensor, 2D, batch x num_units.
state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
scope: VariableScope for the created subgraph; defaults to "LSTMCell".
Returns:
A tuple containing:
- A 2-D, [batch x output_dim], Tensor representing the output of the LSTM after reading inputs when previous state was state. Here output_dim is: num_proj if num_proj was set, num_units otherwise.
- Tensor(s) representing the new state of LSTM after reading inputs when the previous state was state. Same type and shape(s) as state.
Raises:
ValueError: If input size cannot be inferred from inputs via static shape inference.
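The relationship between num_units, num_proj and the documented shapes can be sketched as a small helper. The function name is hypothetical, and the sketch assumes that the m_state column size equals output_dim, which the text above implies but does not state outright.

```python
def lstm_sizes(num_units, num_proj=None, state_is_tuple=True):
    """Return (output_dim, state_size) under the assumptions described above."""
    # The output width is num_proj when a projection is used, num_units otherwise.
    output_dim = num_proj if num_proj is not None else num_units
    if state_is_tuple:
        state_size = (num_units, output_dim)   # (c_state, m_state) column sizes
    else:
        state_size = num_units + output_dim    # concatenated along the column axis
    return output_dim, state_size


plain = lstm_sizes(4)                                      # no projection
projected = lstm_sizes(4, num_proj=2)                      # projected output
flat = lstm_sizes(4, num_proj=2, state_is_tuple=False)     # concatenated state
```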
tf.nn.rnn_cell.LSTMCell.__init__(num_units, input_size=None, use_peepholes=False, cell_clip=None, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=1, num_proj_shards=1, forget_bias=1.0, state_is_tuple=True, activation=tanh)
Initialize the parameters for an LSTM cell.
Args:
num_units: int, The number of units in the LSTM cell.
input_size: Deprecated and unused.
use_peepholes: bool, set True to enable diagonal/peephole connections.
cell_clip: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
initializer: (optional) The initializer to use for the weight and projection matrices.
num_proj: (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.
proj_clip: (optional) A float value. If num_proj > 0 and proj_clip is provided, then the projected values are clipped elementwise to within [-proj_clip, proj_clip].
num_unit_shards: How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
num_proj_shards: How to split the projection matrix. If >1, the projection matrix is stored across num_proj_shards.
forget_bias: Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. This latter behavior will soon be deprecated.
activation: Activation function of the inner states.
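The two clipping options can be illustrated with one elementwise helper. This is a minimal sketch: the clip function is hypothetical, and it assumes that cell_clip, like proj_clip, bounds values symmetrically to [-limit, limit].

```python
def clip(values, limit):
    """Clip each value to [-limit, limit]; a limit of None disables clipping."""
    if limit is None:
        return list(values)
    return [max(-limit, min(limit, v)) for v in values]


cell_state = [-7.5, 0.3, 4.0]
clipped_cell = clip(cell_state, 3.0)    # plays the role of cell_clip=3.0
projected = [1.2, -0.4]
clipped_proj = clip(projected, 1.0)     # plays the role of proj_clip=1.0
unclipped = clip(cell_state, None)      # cell_clip=None leaves values alone
```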
tf.nn.rnn_cell.LSTMCell.output_size
tf.nn.rnn_cell.LSTMCell.state_size
tf.nn.rnn_cell.LSTMCell.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
Classes storing split RNNCell state
class tf.nn.rnn_cell.LSTMStateTuple
Tuple used by LSTM Cells for state_size, zero_state, and output state.
Stores two elements: (c, h), in that order.
Only used when state_is_tuple=True.
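The difference between the tuple layout and the concatenated layout selected by state_is_tuple=False can be sketched with a named tuple. StateTuple, concat_state and split_state are hypothetical stand-ins written for this illustration, with nested lists in place of tensors.

```python
import collections

# Hypothetical stand-in for LSTMStateTuple: two fields, c then h.
StateTuple = collections.namedtuple("StateTuple", ("c", "h"))


def concat_state(state):
    """Join c and h along the column axis (the state_is_tuple=False layout)."""
    return [c_row + h_row for c_row, h_row in zip(state.c, state.h)]


def split_state(concatenated, c_size):
    """Recover (c, h) from a column-concatenated state matrix."""
    return StateTuple(
        c=[row[:c_size] for row in concatenated],
        h=[row[c_size:] for row in concatenated],
    )


state = StateTuple(c=[[1.0, 2.0]], h=[[3.0, 4.0]])
flat = concat_state(state)
restored = split_state(flat, c_size=2)
```

Keeping the two parts separate avoids having to track c_size when unpacking the state.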
tf.nn.rnn_cell.LSTMStateTuple.__getnewargs__()
Return self as a plain tuple. Used by copy and pickle.
tf.nn.rnn_cell.LSTMStateTuple.__getstate__()
Exclude the OrderedDict from pickling
tf.nn.rnn_cell.LSTMStateTuple.__new__(_cls, c, h)
Create new instance of LSTMStateTuple(c, h)
tf.nn.rnn_cell.LSTMStateTuple.__repr__()
Return a nicely formatted representation string
tf.nn.rnn_cell.LSTMStateTuple.c
Alias for field number 0
tf.nn.rnn_cell.LSTMStateTuple.dtype
tf.nn.rnn_cell.LSTMStateTuple.h
Alias for field number 1
RNN Cell wrappers (RNNCells that wrap other RNNCells)
class tf.nn.rnn_cell.MultiRNNCell
RNN cell composed sequentially of multiple simple cells.
tf.nn.rnn_cell.MultiRNNCell.__call__(inputs, state, scope=None)
Run this multi-layer cell on inputs, starting from state.
tf.nn.rnn_cell.MultiRNNCell.__init__(cells, state_is_tuple=True)
Create a RNN cell composed sequentially of a number of RNNCells.
Args:
cells: list of RNNCells that will be composed in this order.
state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). If False, the states are all concatenated along the column axis. This latter behavior will soon be deprecated.
Raises:
ValueError: if cells is empty (not allowed), or at least one of the cells returns a state tuple but the flag state_is_tuple is False.
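Sequential composition with an n-tuple of states can be sketched as below. The run_stack helper and the add_cell toy cell are hypothetical; the sketch covers only the state_is_tuple=True layout, where each cell receives its own entry of the state tuple and the output of one cell becomes the input of the next.

```python
def run_stack(cells, inputs, states):
    """Run cells in order; each cell's output is the next cell's input.

    cells:  list of callables taking (inputs, state), returning (output, new_state).
    states: tuple with one state per cell (the state_is_tuple=True layout).
    """
    if not cells:
        raise ValueError("Must specify at least one cell.")
    current = inputs
    new_states = []
    for cell, state in zip(cells, states):
        current, new_state = cell(current, state)
        new_states.append(new_state)
    return current, tuple(new_states)


def add_cell(inputs, state):
    """Hypothetical toy cell: output = new_state = inputs + state, elementwise."""
    out = [[x + s for x, s in zip(x_row, s_row)]
           for x_row, s_row in zip(inputs, state)]
    return out, out


output, new_states = run_stack(
    [add_cell, add_cell],
    inputs=[[1.0, 2.0]],
    states=([[10.0, 10.0]], [[100.0, 100.0]]),
)
```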
tf.nn.rnn_cell.MultiRNNCell.output_size
tf.nn.rnn_cell.MultiRNNCell.state_size
tf.nn.rnn_cell.MultiRNNCell.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.DropoutWrapper
Operator adding dropout to inputs and outputs of the given cell.
tf.nn.rnn_cell.DropoutWrapper.__call__(inputs, state, scope=None)
Run the cell with the declared dropouts.
tf.nn.rnn_cell.DropoutWrapper.__init__(cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None)
Create a cell with added input and/or output dropout.
Dropout is never used on the state.
Args:
cell: an RNNCell, dropout is added to its inputs and/or outputs.
input_keep_prob: unit Tensor or float between 0 and 1, input keep probability; if it is float and 1, no input dropout will be added.
output_keep_prob: unit Tensor or float between 0 and 1, output keep probability; if it is float and 1, no output dropout will be added.
seed: (optional) integer, the randomness seed.
Raises:
TypeError: if cell is not an RNNCell.
ValueError: if input_keep_prob or output_keep_prob is not between 0 and 1.
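The wrapper's behavior can be sketched in plain Python. The helper names and the identity_cell toy cell are hypothetical. The sketch assumes inverted dropout, where kept values are scaled by 1/keep_prob, and it treats the valid range as (0, 1]; both are assumptions, not statements taken from the text above.

```python
import random


def dropout(rows, keep_prob, rng):
    """Keep each element with probability keep_prob, scaling kept values by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1], got %r" % keep_prob)
    if keep_prob == 1.0:
        return rows  # no dropout is added
    return [
        [v / keep_prob if rng.random() < keep_prob else 0.0 for v in row]
        for row in rows
    ]


def dropout_call(cell, inputs, state,
                 input_keep_prob=1.0, output_keep_prob=1.0, seed=None):
    """Apply dropout to the inputs and outputs of cell, never to the state."""
    rng = random.Random(seed)
    output, new_state = cell(dropout(inputs, input_keep_prob, rng), state)
    return dropout(output, output_keep_prob, rng), new_state


def identity_cell(inputs, state):
    """Hypothetical toy cell that passes inputs through and keeps its state."""
    return inputs, state


output, new_state = dropout_call(
    identity_cell, [[1.0, 1.0, 1.0, 1.0]], [[5.0]], input_keep_prob=0.5, seed=0
)
```

Each output entry is either 0.0 (dropped) or 2.0 (kept and rescaled), while the state is returned untouched.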
tf.nn.rnn_cell.DropoutWrapper.output_size
tf.nn.rnn_cell.DropoutWrapper.state_size
tf.nn.rnn_cell.DropoutWrapper.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.EmbeddingWrapper
Operator adding input embedding to the given cell.
Note: in many cases it may be more efficient to not use this wrapper, but instead concatenate the whole sequence of your inputs in time, do the embedding on this batch-concatenated sequence, then split it and feed into your RNN.
tf.nn.rnn_cell.EmbeddingWrapper.__call__(inputs, state, scope=None)
Run the cell on embedded inputs.
tf.nn.rnn_cell.EmbeddingWrapper.__init__(cell, embedding_classes, embedding_size, initializer=None)
Create a cell with an added input embedding.
Args:
cell: an RNNCell, an embedding will be put before its inputs.
embedding_classes: integer, how many symbols will be embedded.
embedding_size: integer, the size of the vectors we embed into.
initializer: an initializer to use when creating the embedding; if None, the initializer from variable scope or a default one is used.
Raises:
TypeError: if cell is not an RNNCell.
ValueError: if embedding_classes is not positive.
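An input embedding amounts to a table lookup followed by the wrapped cell. In the sketch below the embedding table is a fixed nested list with embedding_classes rows and embedding_size columns; the function names, the table values, and the identity_cell toy cell are hypothetical, and a real table would be a trained variable.

```python
def embed(ids, embedding):
    """Look up one embedding row per integer symbol id."""
    return [list(embedding[i]) for i in ids]


def embedding_call(cell, ids, state, embedding):
    """Embed a batch of symbol ids, then run the wrapped cell on the result."""
    return cell(embed(ids, embedding), state)


embedding_classes, embedding_size = 3, 2
# Hypothetical fixed table of shape [embedding_classes x embedding_size].
embedding = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]


def identity_cell(inputs, state):
    """Hypothetical toy cell that returns its inputs as output."""
    return inputs, state


output, _ = embedding_call(identity_cell, [2, 0], [[0.0], [0.0]], embedding)
```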
tf.nn.rnn_cell.EmbeddingWrapper.output_size
tf.nn.rnn_cell.EmbeddingWrapper.state_size
tf.nn.rnn_cell.EmbeddingWrapper.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.InputProjectionWrapper
Operator adding an input projection to the given cell.
Note: in many cases it may be more efficient to not use this wrapper, but instead concatenate the whole sequence of your inputs in time, do the projection on this batch-concatenated sequence, then split it.
tf.nn.rnn_cell.InputProjectionWrapper.__call__(inputs, state, scope=None)
Run the input projection and then the cell.
tf.nn.rnn_cell.InputProjectionWrapper.__init__(cell, num_proj, input_size=None)
Create a cell with input projection.
Args:
cell: an RNNCell, a projection of inputs is added before it.
num_proj: Python integer. The dimension to project to.
input_size: Deprecated and unused.
Raises:
TypeError: if cell is not an RNNCell.
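The alternative mentioned in the note for this class, projecting the whole sequence at once instead of at every step, can be sketched as follows. The project helper, the weights, and the input values are hypothetical; the sketch only shows that concatenating the steps, projecting once, and splitting again gives the same result as projecting each step separately.

```python
def project(rows, weights, bias):
    """Return rows @ weights + bias, with rows as a list of row vectors."""
    return [
        [bias[j] + sum(x * weights[i][j] for i, x in enumerate(row))
         for j in range(len(bias))]
        for row in rows
    ]


weights = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]   # hypothetical: input_size=2, num_proj=3
bias = [0.5, 0.5, 0.5]
# Two time steps, each a [batch_size=2 x input_size=2] matrix.
sequence = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]

# Per-step projection, as a wrapper would do on every call.
per_step = [project(step, weights, bias) for step in sequence]

# Concatenate all steps into one tall matrix, project once, then split again.
batch_size = len(sequence[0])
tall = [row for step in sequence for row in step]
projected = project(tall, weights, bias)
split = [projected[t:t + batch_size] for t in range(0, len(projected), batch_size)]
```

With real tensors the single large matrix multiply is typically cheaper than many small ones, which is the efficiency argument the note makes.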
tf.nn.rnn_cell.InputProjectionWrapper.output_size
tf.nn.rnn_cell.InputProjectionWrapper.state_size
tf.nn.rnn_cell.InputProjectionWrapper.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
class tf.nn.rnn_cell.OutputProjectionWrapper
Operator adding an output projection to the given cell.
Note: in many cases it may be more efficient to not use this wrapper, but instead concatenate the whole sequence of your outputs in time, do the projection on this batch-concatenated sequence, then split it if needed or directly feed into a softmax.
tf.nn.rnn_cell.OutputProjectionWrapper.__call__(inputs, state, scope=None)
Run the cell and output projection on inputs, starting from state.
tf.nn.rnn_cell.OutputProjectionWrapper.__init__(cell, output_size)
Create a cell with output projection.
Args:
cell: an RNNCell, a projection to output_size is added to it.
output_size: integer, the size of the output after projection.
Raises:
TypeError: if cell is not an RNNCell.
ValueError: if output_size is not positive.
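A wrapper of this kind can be sketched as a small class that runs the inner cell and then applies a linear map to its output. The OutputProjection class below is hypothetical: it takes the projection weights and bias as explicit arguments, whereas a real wrapper would create them as variables, and it uses the identity_cell toy cell as the wrapped cell.

```python
class OutputProjection:
    """Hypothetical wrapper: runs a cell, then maps its output to output_size columns."""

    def __init__(self, cell, output_size, weights, bias):
        if output_size < 1:
            raise ValueError("output_size must be positive, got %d" % output_size)
        self._cell = cell
        self.output_size = output_size
        self._weights = weights   # [cell output width x output_size]
        self._bias = bias         # [output_size]

    def __call__(self, inputs, state):
        output, new_state = self._cell(inputs, state)
        projected = [
            [self._bias[j] + sum(v * self._weights[i][j] for i, v in enumerate(row))
             for j in range(self.output_size)]
            for row in output
        ]
        # Only the output is projected; the state passes through unchanged.
        return projected, new_state


def identity_cell(inputs, state):
    """Hypothetical toy cell that returns its inputs as output."""
    return inputs, state


wrapped = OutputProjection(identity_cell, 1, weights=[[1.0], [1.0]], bias=[0.0])
output, new_state = wrapped([[2.0, 3.0]], [[9.0]])
```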
tf.nn.rnn_cell.OutputProjectionWrapper.output_size
tf.nn.rnn_cell.OutputProjectionWrapper.state_size
tf.nn.rnn_cell.OutputProjectionWrapper.zero_state(batch_size, dtype)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.