# RNN (contrib)

[TOC]

Additional RNN operations and cells.

## This package provides additional contributed RNNCells.

### Block RNNCells

`class tf.contrib.rnn.LSTMBlockCell`

Basic LSTM recurrent network cell.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add `forget_bias` (default: 1) to the biases of the forget gate in order to
reduce the scale of forgetting in the beginning of the training.

Unlike `rnn_cell.LSTMCell`, this is a monolithic op and should be much faster.
The weight and bias matrices should be compatible as long as the variable
scope matches, and you use `use_compatible_names=True`.
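As a rough illustration of why a positive `forget_bias` reduces forgetting early in training, the forget gate can be viewed as a sigmoid of its pre-activation plus the bias. The sketch below is plain Python, not the op's implementation; `forget_gate` is a hypothetical helper, and the zero pre-activation is an assumption standing in for a freshly initialized cell.

```python
import math

def forget_gate(pre_activation, forget_bias=1.0):
    """Sigmoid forget gate with an additive bias (illustrative helper)."""
    return 1.0 / (1.0 + math.exp(-(pre_activation + forget_bias)))

# With a pre-activation near zero (assumed here for an untrained cell),
# the gate keeps half of the cell state without the bias and a larger
# fraction with forget_bias=1.0.
gate_without_bias = forget_gate(0.0, forget_bias=0.0)
gate_with_bias = forget_gate(0.0, forget_bias=1.0)
```

A gate value closer to 1 means more of the previous cell state is carried forward.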

`tf.contrib.rnn.LSTMBlockCell.__call__(x, states_prev, scope=None)`

Long short-term memory cell (LSTM).

`tf.contrib.rnn.LSTMBlockCell.__init__(num_units, forget_bias=1.0, use_peephole=False, use_compatible_names=False)`

Initialize the basic LSTM cell.

##### Args:

- `num_units`: int, The number of units in the LSTM cell.
- `forget_bias`: float, The bias added to forget gates (see above).
- `use_peephole`: Whether to use peephole connections or not.
- `use_compatible_names`: If True, use the same variable naming as rnn_cell.LSTMCell.

`tf.contrib.rnn.LSTMBlockCell.output_size`

`tf.contrib.rnn.LSTMBlockCell.state_size`

`tf.contrib.rnn.LSTMBlockCell.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.
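The two cases described for `zero_state` can be sketched in plain Python, with nested lists standing in for tensors. This is only an illustration of how the returned structure follows `state_size`; `zero_state_sketch` is a hypothetical name, and the real method returns tensors of the requested `dtype`.

```python
def zero_state_sketch(state_size, batch_size):
    """Build zeros that mirror the structure of state_size (illustrative)."""
    if isinstance(state_size, int):
        # A single [batch_size x state_size] block of zeros.
        return [[0.0] * state_size for _ in range(batch_size)]
    # Nested list or tuple: recurse and keep the container type.
    return type(state_size)(zero_state_sketch(s, batch_size) for s in state_size)

flat = zero_state_sketch(4, batch_size=2)          # 2 x 4 block
nested = zero_state_sketch((4, 3), batch_size=2)   # tuple of 2 x 4 and 2 x 3 blocks
```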

`class tf.contrib.rnn.GRUBlockCell`

Block GRU cell implementation.

The implementation is based on: http://arxiv.org/abs/1406.1078

Computes the GRU cell forward propagation for 1 time step.

Biases are initialized with:

- `b_ru`: constant_initializer(1.0)
- `b_c`: constant_initializer(0.0)

This kernel op implements the following mathematical equations:

```
x_h_prev = [x, h_prev]
[r_bar u_bar] = x_h_prev * w_ru + b_ru
r = sigmoid(r_bar)
u = sigmoid(u_bar)
h_prevr = h_prev \circ r
x_h_prevr = [x, h_prevr]
c_bar = x_h_prevr * w_c + b_c
c = tanh(c_bar)
h = (1-u) \circ c + u \circ h_prev
```
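The equations can be traced for a single example with plain Python lists. This is a sketch for checking the arithmetic, not the kernel itself; the helper names are hypothetical, and the column order of `w_ru` (reset gate first, then update gate) is an assumption.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(vec, mat):
    """Row vector times matrix: len(vec) == len(mat), result has len(mat[0]) entries."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def gru_step(x, h_prev, w_ru, b_ru, w_c, b_c):
    """One GRU step for a single example (no batch dimension)."""
    n = len(h_prev)
    x_h_prev = x + h_prev                                  # [x, h_prev]
    ru_bar = [a + b for a, b in zip(matvec(x_h_prev, w_ru), b_ru)]
    r = [sigmoid(v) for v in ru_bar[:n]]                   # reset gate (assumed first)
    u = [sigmoid(v) for v in ru_bar[n:]]                   # update gate
    h_prevr = [hp * ri for hp, ri in zip(h_prev, r)]       # h_prev \circ r
    x_h_prevr = x + h_prevr                                # [x, h_prevr]
    c_bar = [a + b for a, b in zip(matvec(x_h_prevr, w_c), b_c)]
    c = [math.tanh(v) for v in c_bar]                      # candidate state
    return [(1 - ui) * ci + ui * hi for ui, ci, hi in zip(u, c, h_prev)]

# Toy sizes: 2 inputs, 3 units; zero weights, biases as initialized above.
x, h_prev = [0.5, -0.5], [0.1, 0.2, 0.3]
w_ru = [[0.0] * 6 for _ in range(5)]
w_c = [[0.0] * 3 for _ in range(5)]
h = gru_step(x, h_prev, w_ru, [1.0] * 6, w_c, [0.0] * 3)
```

With zero weights the candidate `c` is zero, so the new state is simply `u` times `h_prev`, which makes the effect of the `b_ru` initialization easy to see.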

`tf.contrib.rnn.GRUBlockCell.__call__(x, h_prev, scope=None)`

GRU cell.

`tf.contrib.rnn.GRUBlockCell.__init__(cell_size)`

Initialize the Block GRU cell.

##### Args:

- `cell_size`: int, GRU cell size.

`tf.contrib.rnn.GRUBlockCell.output_size`

`tf.contrib.rnn.GRUBlockCell.state_size`

`tf.contrib.rnn.GRUBlockCell.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.

### Fused RNNCells

`class tf.contrib.rnn.FusedRNNCell`

Abstract object representing a fused RNN cell.

A fused RNN cell represents the entire RNN expanded over the time dimension. In effect, this represents an entire recurrent network.

Unlike RNN cells which are subclasses of `rnn_cell.RNNCell`, a `FusedRNNCell`
operates on the entire time sequence at once, by putting the loop over time
inside the cell. This usually leads to much more efficient, but more complex
and less flexible implementations.

Every `FusedRNNCell` must implement `__call__` with the following signature.

`tf.contrib.rnn.FusedRNNCell.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)`

Run this fused RNN on inputs, starting from the given state.

##### Args:

- `inputs`: `3-D` tensor with shape `[time_len x batch_size x input_size]` or a list of `time_len` tensors of shape `[batch_size x input_size]`.
- `initial_state`: either a tensor with shape `[batch_size x state_size]` or a tuple with shapes `[batch_size x s] for s in state_size`, if the cell takes tuples. If this is not provided, the cell is expected to create a zero initial state of type `dtype`.
- `dtype`: The data type for the initial state and expected output. Required if `initial_state` is not provided or RNN state has a heterogeneous dtype.
- `sequence_length`: Specifies the length of each sequence in inputs. An `int32` or `int64` vector (tensor) size `[batch_size]`, values in `[0, time_len)`. Defaults to `time_len` for each element.
- `scope`: `VariableScope` or `string` for the created subgraph; defaults to class name.

##### Returns:

A pair containing:

- Output: A `3-D` tensor of shape `[time_len x batch_size x output_size]` or a list of `time_len` tensors of shape `[batch_size x output_size]`, to match the type of the `inputs`.
- Final state: Either a single `2-D` tensor, or a tuple of tensors matching the arity and shapes of `initial_state`.

`class tf.contrib.rnn.FusedRNNCellAdaptor`

This is an adaptor for RNNCell classes to be used with `FusedRNNCell`.

`tf.contrib.rnn.FusedRNNCellAdaptor.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)`

`tf.contrib.rnn.FusedRNNCellAdaptor.__init__(cell, use_dynamic_rnn=False)`

Initialize the adaptor.

##### Args:

- `cell`: an instance of a subclass of a `rnn_cell.RNNCell`.
- `use_dynamic_rnn`: whether to use dynamic (or static) RNN.

`class tf.contrib.rnn.TimeReversedFusedRNN`

This is an adaptor to time-reverse a FusedRNNCell.

For example,

```
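import tensorflow as tf  # a TensorFlow release that still ships tf.contrib

# Time-major inputs [time_len, batch_size, input_size]; the sizes are illustrative.
inputs = tf.placeholder(tf.float32, [20, 32, 8])
cell = tf.nn.rnn_cell.BasicRNNCell(10)
fw_lstm = tf.contrib.rnn.FusedRNNCellAdaptor(cell, use_dynamic_rnn=True)
bw_lstm = tf.contrib.rnn.TimeReversedFusedRNN(fw_lstm)
# dtype is required because no initial_state is passed.
fw_out, fw_state = fw_lstm(inputs, dtype=tf.float32)
bw_out, bw_state = bw_lstm(inputs, dtype=tf.float32)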
```
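The idea behind time reversal can be shown without TensorFlow. The sketch below wraps a toy fused function so that it reads the sequence from the last step to the first and then flips the outputs back. It is an illustration under simplifying assumptions, not the class's implementation: `running_sum_rnn` and `time_reversed` are hypothetical names, and per-example `sequence_length` handling is ignored.

```python
def running_sum_rnn(inputs):
    """Toy stand-in for a fused RNN: output at step t is the sum of inputs up to t."""
    outputs, state = [], 0.0
    for x in inputs:
        state += x
        outputs.append(state)
    return outputs, state

def time_reversed(fused_fn):
    """Wrap fused_fn so that it reads the sequence from last step to first."""
    def wrapped(inputs):
        outputs, state = fused_fn(list(reversed(inputs)))
        # Flip outputs back so that outputs[t] lines up with inputs[t].
        return list(reversed(outputs)), state
    return wrapped

inputs = [1.0, 2.0, 3.0]
fw_out, fw_state = running_sum_rnn(inputs)
bw_out, bw_state = time_reversed(running_sum_rnn)(inputs)
```

Here `fw_out[t]` summarizes the steps up to `t`, while `bw_out[t]` summarizes the steps from `t` to the end.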

`tf.contrib.rnn.TimeReversedFusedRNN.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)`

`tf.contrib.rnn.TimeReversedFusedRNN.__init__(cell)`

`class tf.contrib.rnn.LSTMBlockFusedCell`

FusedRNNCell implementation of LSTM.

This is an extremely efficient LSTM implementation, that uses a single TF op for the entire LSTM. It should be both faster and more memory-efficient than LSTMBlockCell defined above.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting in the beginning of the training.

The variable naming is consistent with `rnn_cell.LSTMCell`.

`tf.contrib.rnn.LSTMBlockFusedCell.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)`

Run this LSTM on inputs, starting from the given state.

##### Args:

- `inputs`: `3-D` tensor with shape `[time_len, batch_size, input_size]` or a list of `time_len` tensors of shape `[batch_size, input_size]`.
- `initial_state`: a tuple `(initial_cell_state, initial_output)` with tensors of shape `[batch_size, self._num_units]`. If this is not provided, the cell is expected to create a zero initial state of type `dtype`.
- `dtype`: The data type for the initial state and expected output. Required if `initial_state` is not provided or RNN state has a heterogeneous dtype.
- `sequence_length`: Specifies the length of each sequence in inputs. An `int32` or `int64` vector (tensor) size `[batch_size]`, values in `[0, time_len)`. Defaults to `time_len` for each element.
- `scope`: `VariableScope` for the created subgraph; defaults to class name.

##### Returns:

A pair containing:

- Output: A `3-D` tensor of shape `[time_len, batch_size, output_size]` or a list of `time_len` tensors of shape `[batch_size, output_size]`, to match the type of the `inputs`.
- Final state: a tuple `(cell_state, output)` matching `initial_state`.

##### Raises:

- `ValueError`: in case of shape mismatches.

`tf.contrib.rnn.LSTMBlockFusedCell.__init__(num_units, forget_bias=1.0, cell_clip=None, use_peephole=False)`

Initialize the LSTM cell.

##### Args:

- `num_units`: int, The number of units in the LSTM cell.
- `forget_bias`: float, The bias added to forget gates (see above).
- `cell_clip`: clip the cell to this value. Defaults to `3`.
- `use_peephole`: Whether to use peephole connections or not.

`tf.contrib.rnn.LSTMBlockFusedCell.num_units`

Number of units in this cell (output dimension).

### LSTM-like cells

`class tf.contrib.rnn.CoupledInputForgetGateLSTMCell`

Long short-term memory unit (LSTM) recurrent network cell.

The default non-peephole implementation is based on:

http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf

S. Hochreiter and J. Schmidhuber. "Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.

The peephole implementation is based on:

https://research.google.com/pubs/archive/43905.pdf

Hasim Sak, Andrew Senior, and Francoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014.

The coupling of input and forget gate is based on:

http://arxiv.org/pdf/1503.04069.pdf

Greff et al. "LSTM: A Search Space Odyssey"

The class uses optional peep-hole connections, and an optional projection layer.
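Coupling the input and forget gates means the input gate is not learned separately but taken as one minus the forget gate. The scalar sketch below shows the resulting cell-state update. It is a simplified illustration: `coupled_cell_update` is a hypothetical helper, peepholes and projection are left out, and `tanh` is assumed as the activation.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def coupled_cell_update(c_prev, f_pre, j_pre, forget_bias=1.0):
    """New cell value when the input gate is tied to the forget gate."""
    f_gate = sigmoid(f_pre + forget_bias)
    i_gate = 1.0 - f_gate  # coupling: no separate input-gate parameters
    return f_gate * c_prev + i_gate * math.tanh(j_pre)

c_new = coupled_cell_update(c_prev=0.5, f_pre=0.0, j_pre=2.0)
```

Because the two gates sum to one, the new cell value is always a weighted average of the previous cell value and the candidate.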

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.__call__(inputs, state, scope=None)`

Run one step of LSTM.

##### Args:

- `inputs`: input Tensor, 2D, batch x num_units.
- `state`: if `state_is_tuple` is False, this must be a state Tensor, `2-D, batch x state_size`. If `state_is_tuple` is True, this must be a tuple of state Tensors, both `2-D`, with column sizes `c_state` and `m_state`.
- `scope`: VariableScope for the created subgraph; defaults to "LSTMCell".

##### Returns:

A tuple containing:

- A `2-D, [batch x output_dim]`, Tensor representing the output of the LSTM after reading `inputs` when previous state was `state`. Here output_dim is: num_proj if num_proj was set, num_units otherwise.
- Tensor(s) representing the new state of LSTM after reading `inputs` when the previous state was `state`. Same type and shape(s) as `state`.

##### Raises:

- `ValueError`: If input size cannot be inferred from inputs via static shape inference.

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.__init__(num_units, use_peepholes=False, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=1, num_proj_shards=1, forget_bias=1.0, state_is_tuple=False, activation=tanh)`

Initialize the parameters for an LSTM cell.

##### Args:

- `num_units`: int, The number of units in the LSTM cell.
- `use_peepholes`: bool, set True to enable diagonal/peephole connections.
- `initializer`: (optional) The initializer to use for the weight and projection matrices.
- `num_proj`: (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.
- `proj_clip`: (optional) A float value. If `num_proj > 0` and `proj_clip` is provided, then the projected values are clipped elementwise to within `[-proj_clip, proj_clip]`.
- `num_unit_shards`: How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
- `num_proj_shards`: How to split the projection matrix. If >1, the projection matrix is stored across num_proj_shards.
- `forget_bias`: Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
- `state_is_tuple`: If True, accepted and returned states are 2-tuples of the `c_state` and `m_state`. By default (False), they are concatenated along the column axis. This default behavior will soon be deprecated.
- `activation`: Activation function of the inner states.

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.output_size`

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.state_size`

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.

`class tf.contrib.rnn.TimeFreqLSTMCell`

Time-Frequency Long short-term memory unit (LSTM) recurrent network cell.

This implementation is based on:

Tara N. Sainath and Bo Li "Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks." submitted to INTERSPEECH, 2016.

It uses peep-hole connections and optional cell clipping.

`tf.contrib.rnn.TimeFreqLSTMCell.__call__(inputs, state, scope=None)`

Run one step of LSTM.

##### Args:

- `inputs`: input Tensor, 2D, batch x num_units.
- `state`: state Tensor, 2D, batch x state_size.
- `scope`: VariableScope for the created subgraph; defaults to "TimeFreqLSTMCell".

##### Returns:

A tuple containing:

- A 2D, batch x output_dim, Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
- A 2D, batch x state_size, Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".

##### Raises:

- `ValueError`: if an input_size was specified and the provided inputs have a different dimension.

`tf.contrib.rnn.TimeFreqLSTMCell.__init__(num_units, use_peepholes=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None)`

Initialize the parameters for an LSTM cell.

##### Args:

- `num_units`: int, The number of units in the LSTM cell.
- `use_peepholes`: bool, set True to enable diagonal/peephole connections.
- `cell_clip`: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
- `initializer`: (optional) The initializer to use for the weight and projection matrices.
- `num_unit_shards`: int, How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
- `forget_bias`: float, Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
- `feature_size`: int, The size of the input feature the LSTM spans over.
- `frequency_skip`: int, The amount the LSTM filter is shifted by in frequency.

`tf.contrib.rnn.TimeFreqLSTMCell.output_size`

`tf.contrib.rnn.TimeFreqLSTMCell.state_size`

`tf.contrib.rnn.TimeFreqLSTMCell.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.

`class tf.contrib.rnn.GridLSTMCell`

Grid Long short-term memory unit (LSTM) recurrent network cell.

The default is based on: Nal Kalchbrenner, Ivo Danihelka and Alex Graves "Grid Long Short-Term Memory," Proc. ICLR 2016. http://arxiv.org/abs/1507.01526

When peephole connections are used, the implementation is based on: Tara N. Sainath and Bo Li "Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks." submitted to INTERSPEECH, 2016.

The code uses optional peephole connections, shared_weights and cell clipping.

`tf.contrib.rnn.GridLSTMCell.__call__(inputs, state, scope=None)`

Run one step of LSTM.

##### Args:

- `inputs`: input Tensor, 2D, batch x num_units.
- `state`: state Tensor, 2D, batch x state_size.
- `scope`: VariableScope for the created subgraph; defaults to "LSTMCell".

##### Returns:

A tuple containing:

- A 2D, batch x output_dim, Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
- A 2D, batch x state_size, Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".

##### Raises:

- `ValueError`: if an input_size was specified and the provided inputs have a different dimension.

`tf.contrib.rnn.GridLSTMCell.__init__(num_units, use_peepholes=False, share_time_frequency_weights=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None, num_frequency_blocks=1, couple_input_forget_gates=False, state_is_tuple=False)`

Initialize the parameters for an LSTM cell.

##### Args:

- `num_units`: int, The number of units in the LSTM cell.
- `use_peepholes`: bool, default False. Set True to enable diagonal/peephole connections.
- `share_time_frequency_weights`: bool, default False. Set True to enable shared cell weights between time and frequency LSTMs.
- `cell_clip`: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
- `initializer`: (optional) The initializer to use for the weight and projection matrices.
- `num_unit_shards`: int, How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
- `forget_bias`: float, Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
- `feature_size`: int, The size of the input feature the LSTM spans over.
- `frequency_skip`: int, The amount the LSTM filter is shifted by in frequency.
- `num_frequency_blocks`: int, The total number of frequency blocks needed to cover the whole input feature.
- `couple_input_forget_gates`: bool, Whether to couple the input and forget gates, i.e. f_gate = 1.0 - i_gate, to reduce model parameters and computation cost.
- `state_is_tuple`: If True, accepted and returned states are 2-tuples of the `c_state` and `m_state`. By default (False), they are concatenated along the column axis. This default behavior will soon be deprecated.

`tf.contrib.rnn.GridLSTMCell.output_size`

`tf.contrib.rnn.GridLSTMCell.state_size`

`tf.contrib.rnn.GridLSTMCell.state_tuple_type`

`tf.contrib.rnn.GridLSTMCell.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.

### RNNCell wrappers

`class tf.contrib.rnn.AttentionCellWrapper`

Basic attention cell wrapper.

Implementation based on https://arxiv.org/pdf/1601.06733.pdf.

`tf.contrib.rnn.AttentionCellWrapper.__call__(inputs, state, scope=None)`

Long short-term memory cell with attention (LSTMA).

`tf.contrib.rnn.AttentionCellWrapper.__init__(cell, attn_length, attn_size=None, attn_vec_size=None, input_size=None, state_is_tuple=False)`

Create a cell with attention.

##### Args:

- `cell`: an RNNCell, an attention is added to it.
- `attn_length`: integer, the size of an attention window.
- `attn_size`: integer, the size of an attention vector. Equal to cell.output_size by default.
- `attn_vec_size`: integer, the number of convolutional features calculated on attention state and a size of the hidden layer built from base cell state. Equal to attn_size by default.
- `input_size`: integer, the size of a hidden linear layer, built from inputs and attention. Derived from the input tensor by default.
- `state_is_tuple`: If True, accepted and returned states are n-tuples, where `n = len(cells)`. By default (False), the states are all concatenated along the column axis.

##### Raises:

- `TypeError`: if cell is not an RNNCell.
- `ValueError`: if cell returns a state tuple but the flag `state_is_tuple` is `False` or if attn_length is zero or less.

`tf.contrib.rnn.AttentionCellWrapper.output_size`

`tf.contrib.rnn.AttentionCellWrapper.state_size`

`tf.contrib.rnn.AttentionCellWrapper.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.

## Other Functions and Classes

`class tf.contrib.rnn.LSTMBlockWrapper`

This is a helper class that provides housekeeping for LSTM cells.

This may be useful for alternative LSTM cells and similar types of cells.
Subclasses must implement the `_call_cell` method and the `num_units` property.

`tf.contrib.rnn.LSTMBlockWrapper.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)`

Run this LSTM on inputs, starting from the given state.

##### Args:

- `inputs`: `3-D` tensor with shape `[time_len, batch_size, input_size]` or a list of `time_len` tensors of shape `[batch_size, input_size]`.
- `initial_state`: a tuple `(initial_cell_state, initial_output)` with tensors of shape `[batch_size, self._num_units]`. If this is not provided, the cell is expected to create a zero initial state of type `dtype`.
- `dtype`: The data type for the initial state and expected output. Required if `initial_state` is not provided or RNN state has a heterogeneous dtype.
- `sequence_length`: Specifies the length of each sequence in inputs. An `int32` or `int64` vector (tensor) size `[batch_size]`, values in `[0, time_len)`. Defaults to `time_len` for each element.
- `scope`: `VariableScope` for the created subgraph; defaults to class name.

##### Returns:

A pair containing:

- Output: A `3-D` tensor of shape `[time_len, batch_size, output_size]` or a list of `time_len` tensors of shape `[batch_size, output_size]`, to match the type of the `inputs`.
- Final state: a tuple `(cell_state, output)` matching `initial_state`.

##### Raises:

- `ValueError`: in case of shape mismatches.

`tf.contrib.rnn.LSTMBlockWrapper.num_units`

Number of units in this cell (output dimension).

`class tf.contrib.rnn.LayerNormBasicLSTMCell`

LSTM unit with layer normalization and recurrent dropout.

This class adds layer normalization and recurrent dropout to a basic LSTM unit. Layer normalization implementation is based on:

https://arxiv.org/abs/1607.06450.

"Layer Normalization" Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

and is applied before the internal nonlinearities. Recurrent dropout is based on:

https://arxiv.org/abs/1603.05118

"Recurrent Dropout without Memory Loss" Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth.
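Layer normalization rescales a vector of pre-activations to zero mean and unit variance before applying a gain and a shift. The sketch below shows that computation on a plain Python list. It is illustrative only: `layer_norm` here is a hypothetical helper, the `epsilon` value is an assumption, and in the cell the gain and shift are trainable variables initialized from `norm_gain` and `norm_shift`.

```python
import math

def layer_norm(values, gain=1.0, shift=0.0, epsilon=1e-12):
    """Normalize one vector to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gain * (v - mean) / math.sqrt(variance + epsilon) + shift
            for v in values]

normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```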

`tf.contrib.rnn.LayerNormBasicLSTMCell.__call__(inputs, state, scope=None)`

LSTM cell with layer normalization and recurrent dropout.

`tf.contrib.rnn.LayerNormBasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, activation=tanh, layer_norm=True, norm_gain=1.0, norm_shift=0.0, dropout_keep_prob=1.0, dropout_prob_seed=None)`

Initializes the basic LSTM cell.

##### Args:

- `num_units`: int, The number of units in the LSTM cell.
- `forget_bias`: float, The bias added to forget gates (see above).
- `input_size`: Deprecated and unused.
- `activation`: Activation function of the inner states.
- `layer_norm`: If `True`, layer normalization will be applied.
- `norm_gain`: float, The layer normalization gain initial value. If `layer_norm` has been set to `False`, this argument will be ignored.
- `norm_shift`: float, The layer normalization shift initial value. If `layer_norm` has been set to `False`, this argument will be ignored.
- `dropout_keep_prob`: unit Tensor or float between 0 and 1 representing the recurrent dropout probability value. If float and 1.0, no dropout will be applied.
- `dropout_prob_seed`: (optional) integer, the randomness seed.

`tf.contrib.rnn.LayerNormBasicLSTMCell.output_size`

`tf.contrib.rnn.LayerNormBasicLSTMCell.state_size`

`tf.contrib.rnn.LayerNormBasicLSTMCell.zero_state(batch_size, dtype)`

Return zero-filled state tensor(s).

##### Args:

- `batch_size`: int, float, or unit Tensor representing the batch size.
- `dtype`: the data type to use for the state.

##### Returns:

If `state_size` is an int or TensorShape, then the return value is an
`N-D` tensor of shape `[batch_size x state_size]` filled with zeros.

If `state_size` is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of `2-D` tensors with
the shapes `[batch_size x s]` for each s in `state_size`.

`tf.contrib.rnn.stack_bidirectional_dynamic_rnn(cells_fw, cells_bw, inputs, initial_states_fw=None, initial_states_bw=None, dtype=None, sequence_length=None, scope=None)`

Creates a dynamic bidirectional recurrent neural network.

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as input of the next layer. tf.bidirectional_rnn does not allow sharing forward and backward information between layers. The input_size of the first forward and backward cells must match. The initial state for both directions is zero and no intermediate states are returned.
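The stacking scheme can be sketched with a toy layer in plain Python: each layer runs once forward and once over the reversed sequence, and the two outputs are depth-concatenated step by step before being passed to the next layer. The names `toy_layer` and `stack_bidirectional` are hypothetical, and the parameter-free running total stands in for the separate forward and backward cells that the real function takes per layer.

```python
def toy_layer(seq):
    """Toy recurrent layer: one output feature per step, the running total of all inputs."""
    outputs, total = [], 0.0
    for features in seq:
        total += sum(features)
        outputs.append([total])
    return outputs

def stack_bidirectional(seq, num_layers):
    """Feed depth-concatenated forward/backward outputs into the next layer."""
    layer_input = seq
    for _ in range(num_layers):
        fw = toy_layer(layer_input)
        bw = list(reversed(toy_layer(list(reversed(layer_input)))))
        # Depth concatenation: join forward and backward features step by step.
        layer_input = [f + b for f, b in zip(fw, bw)]
    return layer_input

outputs = stack_bidirectional([[1.0], [2.0], [3.0]], num_layers=2)
```

Each step of the second layer sees both the forward and the backward summary from the first layer, which is the information sharing between layers that the description refers to.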

##### Args:

- `cells_fw`: List of instances of RNNCell, one per layer, to be used for forward direction.
- `cells_bw`: List of instances of RNNCell, one per layer, to be used for backward direction.
- `inputs`: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
- `initial_states_fw`: (optional) A list of the initial states (one per layer) for the forward RNN. Each tensor must have an appropriate type and shape `[batch_size, cell_fw.state_size]`.
- `initial_states_bw`: (optional) Same as for `initial_states_fw`, but using the corresponding properties of `cells_bw`.
- `dtype`: (optional) The data type for the initial state. Required if either of the initial states are not provided.
- `sequence_length`: (optional) An int32/int64 vector, size `[batch_size]`, containing the actual lengths for each of the sequences.
- `scope`: VariableScope for the created subgraph; defaults to None.

##### Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

- `outputs`: Output `Tensor` shaped `[batch_size, max_time, layers_output]`, where layers_output are depth-concatenated forward and backward outputs.
- output_states_fw is the final states, one tensor per layer, of the forward rnn.
- output_states_bw is the final states, one tensor per layer, of the backward rnn.

##### Raises:

- `TypeError`: If `cell_fw` or `cell_bw` is not an instance of `RNNCell`.
- `ValueError`: If inputs is `None`, not a list or an empty list.

`tf.contrib.rnn.stack_bidirectional_rnn(cells_fw, cells_bw, inputs, initial_states_fw=None, initial_states_bw=None, dtype=None, sequence_length=None, scope=None)`

Creates a bidirectional recurrent neural network.

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as input of the next layer. tf.bidirectional_rnn does not allow sharing forward and backward information between layers. The input_size of the first forward and backward cells must match. The initial state for both directions is zero and no intermediate states are returned.

As described in https://arxiv.org/abs/1303.5778

##### Args:

- `cells_fw`: List of instances of RNNCell, one per layer, to be used for forward direction.
- `cells_bw`: List of instances of RNNCell, one per layer, to be used for backward direction.
- `inputs`: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
- `initial_states_fw`: (optional) A list of the initial states (one per layer) for the forward RNN. Each tensor must have an appropriate type and shape `[batch_size, cell_fw.state_size]`.
- `initial_states_bw`: (optional) Same as for `initial_states_fw`, but using the corresponding properties of `cells_bw`.
- `dtype`: (optional) The data type for the initial state. Required if either of the initial states are not provided.
- `sequence_length`: (optional) An int32/int64 vector, size `[batch_size]`, containing the actual lengths for each of the sequences.
- `scope`: VariableScope for the created subgraph; defaults to None.

##### Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

- outputs is a length `T` list of outputs (one for each input), which are depth-concatenated forward and backward outputs.
- output_states_fw is the final states, one tensor per layer, of the forward rnn.
- output_states_bw is the final states, one tensor per layer, of the backward rnn.

##### Raises:

- `TypeError`: If `cell_fw` or `cell_bw` is not an instance of `RNNCell`.
- `ValueError`: If inputs is None, not a list or an empty list.