RNN (contrib)

[TOC]

Additional RNN operations and cells.

This package provides additional contributed RNNCells.

Block RNNCells


class tf.contrib.rnn.LSTMBlockCell

Basic LSTM recurrent network cell.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.

Unlike rnn_cell.LSTMCell, this is a monolithic op and should be much faster. The weight and bias matrices are compatible with rnn_cell.LSTMCell as long as the variable scope matches and use_compatible_names=True is set.


tf.contrib.rnn.LSTMBlockCell.__call__(x, states_prev, scope=None)

Long short-term memory cell (LSTM).


tf.contrib.rnn.LSTMBlockCell.__init__(num_units, forget_bias=1.0, use_peephole=False, use_compatible_names=False)

Initialize the basic LSTM cell.

Args:
  • num_units: int, The number of units in the LSTM cell.
  • forget_bias: float, The bias added to forget gates (see above).
  • use_peephole: Whether to use peephole connections or not.
  • use_compatible_names: If True, use the same variable naming as rnn_cell.LSTMCell.
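
For example, a minimal usage sketch (assuming a TF 1.x graph with tf.contrib available; the tensor shapes below are made up for illustration):

import tensorflow as tf

# Hypothetical batch of 32 sequences, 20 time steps, 8 features each.
inputs = tf.placeholder(tf.float32, [32, 20, 8])
cell = tf.contrib.rnn.LSTMBlockCell(num_units=64, forget_bias=1.0)
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)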

tf.contrib.rnn.LSTMBlockCell.output_size


tf.contrib.rnn.LSTMBlockCell.state_size


tf.contrib.rnn.LSTMBlockCell.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
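
For example (a sketch, TF 1.x; sizes are illustrative), an explicit zero initial state for a batch of 32 can be built and passed back to the cell for a single step:

import tensorflow as tf

cell = tf.contrib.rnn.LSTMBlockCell(num_units=64)
init_state = cell.zero_state(batch_size=32, dtype=tf.float32)
x_t = tf.placeholder(tf.float32, [32, 8])          # one time step of inputs
output, new_state = cell(x_t, init_state)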


class tf.contrib.rnn.GRUBlockCell

Block GRU cell implementation.

The implementation is based on: http://arxiv.org/abs/1406.1078. It computes the GRU cell forward propagation for a single time step.

This kernel op implements the following mathematical equations.

Biases are initialized with:

  • b_ru - constant_initializer(1.0)
  • b_c - constant_initializer(0.0)

x_h_prev = [x, h_prev]

[r_bar u_bar] = x_h_prev * w_ru + b_ru

r = sigmoid(r_bar)
u = sigmoid(u_bar)

h_prevr = h_prev \circ r

x_h_prevr = [x, h_prevr]

c_bar = x_h_prevr * w_c + b_c
c = tanh(c_bar)

h = (1 - u) \circ c + u \circ h_prev
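
As an illustration of these equations (not the actual kernel), a NumPy sketch of one GRU step might look as follows; the weight and bias shapes are assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, w_ru, b_ru, w_c, b_c):
    # x: [batch, input_size], h_prev: [batch, cell_size]
    x_h_prev = np.concatenate([x, h_prev], axis=1)
    r_u_bar = np.dot(x_h_prev, w_ru) + b_ru          # [batch, 2 * cell_size]
    r, u = np.split(sigmoid(r_u_bar), 2, axis=1)     # reset and update gates
    x_h_prevr = np.concatenate([x, h_prev * r], axis=1)
    c = np.tanh(np.dot(x_h_prevr, w_c) + b_c)        # candidate activation
    return (1.0 - u) * c + u * h_prev                # new hidden state h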

tf.contrib.rnn.GRUBlockCell.__call__(x, h_prev, scope=None)

GRU cell.


tf.contrib.rnn.GRUBlockCell.__init__(cell_size)

Initialize the Block GRU cell.

Args:
  • cell_size: int, GRU cell size.

tf.contrib.rnn.GRUBlockCell.output_size


tf.contrib.rnn.GRUBlockCell.state_size


tf.contrib.rnn.GRUBlockCell.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

Fused RNNCells


class tf.contrib.rnn.FusedRNNCell

Abstract object representing a fused RNN cell.

A fused RNN cell represents the entire RNN expanded over the time dimension. In effect, this represents an entire recurrent network.

Unlike RNN cells, which are subclasses of rnn_cell.RNNCell, a FusedRNNCell operates on the entire time sequence at once, by putting the loop over time inside the cell. This usually leads to implementations that are much more efficient, but also more complex and less flexible.

Every FusedRNNCell must implement __call__ with the following signature.


tf.contrib.rnn.FusedRNNCell.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)

Run this fused RNN on inputs, starting from the given state.

Args:
  • inputs: 3-D tensor with shape [time_len x batch_size x input_size] or a list of time_len tensors of shape [batch_size x input_size].
  • initial_state: either a tensor with shape [batch_size x state_size] or a tuple with shapes [batch_size x s] for s in state_size, if the cell takes tuples. If this is not provided, the cell is expected to create a zero initial state of type dtype.
  • dtype: The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
  • sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) size [batch_size], values in [0, time_len). Defaults to time_len for each element.
  • scope: VariableScope or string for the created subgraph; defaults to class name.
Returns:

A pair containing:

  • Output: A 3-D tensor of shape [time_len x batch_size x output_size] or a list of time_len tensors of shape [batch_size x output_size], to match the type of the inputs.
  • Final state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of initial_state.

class tf.contrib.rnn.FusedRNNCellAdaptor

This is an adaptor for RNNCell classes to be used with FusedRNNCell.


tf.contrib.rnn.FusedRNNCellAdaptor.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)


tf.contrib.rnn.FusedRNNCellAdaptor.__init__(cell, use_dynamic_rnn=False)

Initialize the adaptor.

Args:
  • cell: an instance of a subclass of rnn_cell.RNNCell.
  • use_dynamic_rnn: whether to use dynamic (or static) RNN.
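
For example (a sketch assuming a TF 1.x graph and made-up shapes), an ordinary per-step cell can be wrapped so that it exposes the fused, whole-sequence interface:

import tensorflow as tf

# Time-major inputs: [time_len, batch_size, input_size].
inputs = tf.placeholder(tf.float32, [20, 32, 8])
cell = tf.nn.rnn_cell.BasicLSTMCell(64, state_is_tuple=True)
fused = tf.contrib.rnn.FusedRNNCellAdaptor(cell, use_dynamic_rnn=True)
outputs, final_state = fused(inputs, dtype=tf.float32)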

class tf.contrib.rnn.TimeReversedFusedRNN

This is an adaptor to time-reverse a FusedRNNCell.

For example,

import tensorflow as tf

# Time-major inputs: [time_len, batch_size, input_size] (example shapes).
inputs = tf.placeholder(tf.float32, [20, 32, 10])
cell = tf.nn.rnn_cell.BasicRNNCell(10)
fw_rnn = tf.contrib.rnn.FusedRNNCellAdaptor(cell, use_dynamic_rnn=True)
bw_rnn = tf.contrib.rnn.TimeReversedFusedRNN(fw_rnn)
fw_out, fw_state = fw_rnn(inputs, dtype=tf.float32)
bw_out, bw_state = bw_rnn(inputs, dtype=tf.float32)

tf.contrib.rnn.TimeReversedFusedRNN.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)


tf.contrib.rnn.TimeReversedFusedRNN.__init__(cell)


class tf.contrib.rnn.LSTMBlockFusedCell

FusedRNNCell implementation of LSTM.

This is an extremely efficient LSTM implementation that uses a single TF op for the entire LSTM. It should be both faster and more memory-efficient than LSTMBlockCell defined above.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.

The variable naming is consistent with rnn_cell.LSTMCell.


tf.contrib.rnn.LSTMBlockFusedCell.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)

Run this LSTM on inputs, starting from the given state.

Args:
  • inputs: 3-D tensor with shape [time_len, batch_size, input_size] or a list of time_len tensors of shape [batch_size, input_size].
  • initial_state: a tuple (initial_cell_state, initial_output) with tensors of shape [batch_size, self._num_units]. If this is not provided, the cell is expected to create a zero initial state of type dtype.
  • dtype: The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
  • sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) size [batch_size], values in [0, time_len). Defaults to time_len for each element.
  • scope: VariableScope for the created subgraph; defaults to class name.
Returns:

A pair containing:

  • Output: A 3-D tensor of shape [time_len, batch_size, output_size] or a list of time_len tensors of shape [batch_size, output_size], to match the type of the inputs.
  • Final state: a tuple (cell_state, output) matching initial_state.
Raises:
  • ValueError: in case of shape mismatches

tf.contrib.rnn.LSTMBlockFusedCell.__init__(num_units, forget_bias=1.0, cell_clip=None, use_peephole=False)

Initialize the LSTM cell.

Args:
  • num_units: int, The number of units in the LSTM cell.
  • forget_bias: float, The bias added to forget gates (see above).
  • cell_clip: (optional) A float value; if provided, the cell state is clipped by this value prior to the cell output activation.
  • use_peephole: Whether to use peephole connections or not.
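
A minimal usage sketch (TF 1.x; the time-major shapes are placeholders). Note that the fused cell is invoked once on the whole sequence rather than once per step:

import tensorflow as tf

# Time-major inputs: [time_len, batch_size, input_size].
inputs = tf.placeholder(tf.float32, [20, 32, 8])
lstm = tf.contrib.rnn.LSTMBlockFusedCell(num_units=64, use_peephole=False)
outputs, (cell_state, output) = lstm(inputs, dtype=tf.float32)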

tf.contrib.rnn.LSTMBlockFusedCell.num_units

Number of units in this cell (output dimension).

LSTM-like cells


class tf.contrib.rnn.CoupledInputForgetGateLSTMCell

Long short-term memory unit (LSTM) recurrent network cell.

The default non-peephole implementation is based on:

http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf

S. Hochreiter and J. Schmidhuber. "Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.

The peephole implementation is based on:

https://research.google.com/pubs/archive/43905.pdf

Hasim Sak, Andrew Senior, and Francoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014.

The coupling of input and forget gate is based on:

http://arxiv.org/pdf/1503.04069.pdf

Greff et al. "LSTM: A Search Space Odyssey"

The class uses optional peephole connections and an optional projection layer.


tf.contrib.rnn.CoupledInputForgetGateLSTMCell.__call__(inputs, state, scope=None)

Run one step of LSTM.

Args:
  • inputs: input Tensor, 2D, batch x num_units.
  • state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
  • scope: VariableScope for the created subgraph; defaults to "LSTMCell".
Returns:

A tuple containing:

  • A 2-D, [batch x output_dim], Tensor representing the output of the LSTM after reading inputs when previous state was state. Here output_dim is: num_proj if num_proj was set, num_units otherwise.
  • Tensor(s) representing the new state of LSTM after reading inputs when the previous state was state. Same type and shape(s) as state.
Raises:
  • ValueError: If input size cannot be inferred from inputs via static shape inference.

tf.contrib.rnn.CoupledInputForgetGateLSTMCell.__init__(num_units, use_peepholes=False, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=1, num_proj_shards=1, forget_bias=1.0, state_is_tuple=False, activation=tanh)

Initialize the parameters for an LSTM cell.

Args:
  • num_units: int, The number of units in the LSTM cell
  • use_peepholes: bool, set True to enable diagonal/peephole connections.
  • initializer: (optional) The initializer to use for the weight and projection matrices.
  • num_proj: (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.
  • proj_clip: (optional) A float value. If num_proj > 0 and proj_clip is provided, then the projected values are clipped elementwise to within [-proj_clip, proj_clip].
  • num_unit_shards: How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
  • num_proj_shards: How to split the projection matrix. If >1, the projection matrix is stored across num_proj_shards.
  • forget_bias: Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
  • state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. By default (False), they are concatenated along the column axis. This default behavior will soon be deprecated.
  • activation: Activation function of the inner states.
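
As an illustrative sketch (TF 1.x; sizes are arbitrary), the cell can be used with a projection layer and tuple state:

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [32, 20, 8])   # [batch, time, features], hypothetical
cell = tf.contrib.rnn.CoupledInputForgetGateLSTMCell(
    num_units=64, use_peepholes=True, num_proj=32, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
# With num_proj=32, the per-step output dimension is 32 rather than num_units.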

tf.contrib.rnn.CoupledInputForgetGateLSTMCell.output_size


tf.contrib.rnn.CoupledInputForgetGateLSTMCell.state_size


tf.contrib.rnn.CoupledInputForgetGateLSTMCell.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.


class tf.contrib.rnn.TimeFreqLSTMCell

Time-Frequency Long short-term memory unit (LSTM) recurrent network cell.

This implementation is based on:

Tara N. Sainath and Bo Li "Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks." submitted to INTERSPEECH, 2016.

It uses peephole connections and optional cell clipping.


tf.contrib.rnn.TimeFreqLSTMCell.__call__(inputs, state, scope=None)

Run one step of LSTM.

Args:
  • inputs: input Tensor, 2D, batch x num_units.
  • state: state Tensor, 2D, batch x state_size.
  • scope: VariableScope for the created subgraph; defaults to "TimeFreqLSTMCell".
Returns:

A tuple containing:

  • A 2D, batch x output_dim, Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
  • A 2D, batch x state_size, Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".
Raises:
  • ValueError: if an input_size was specified and the provided inputs have a different dimension.

tf.contrib.rnn.TimeFreqLSTMCell.__init__(num_units, use_peepholes=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None)

Initialize the parameters for an LSTM cell.

Args:
  • num_units: int, The number of units in the LSTM cell
  • use_peepholes: bool, set True to enable diagonal/peephole connections.
  • cell_clip: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
  • initializer: (optional) The initializer to use for the weight and projection matrices.
  • num_unit_shards: int, How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
  • forget_bias: float, Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
  • feature_size: int, The size of the input feature the LSTM spans over.
  • frequency_skip: int, The amount the LSTM filter is shifted by in frequency.
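
A construction-only sketch (TF 1.x; the frequency-related sizes are arbitrary and only illustrate the arguments):

import tensorflow as tf

# The cell spans feature_size contiguous frequency features per filter and
# shifts by frequency_skip features between filters.
cell = tf.contrib.rnn.TimeFreqLSTMCell(
    num_units=64, use_peepholes=True, feature_size=8, frequency_skip=1)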

tf.contrib.rnn.TimeFreqLSTMCell.output_size


tf.contrib.rnn.TimeFreqLSTMCell.state_size


tf.contrib.rnn.TimeFreqLSTMCell.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.


class tf.contrib.rnn.GridLSTMCell

Grid Long short-term memory unit (LSTM) recurrent network cell.

The default is based on: Nal Kalchbrenner, Ivo Danihelka and Alex Graves "Grid Long Short-Term Memory," Proc. ICLR 2016. http://arxiv.org/abs/1507.01526

When peephole connections are used, the implementation is based on: Tara N. Sainath and Bo Li "Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks." submitted to INTERSPEECH, 2016.

The code uses optional peephole connections, shared time-frequency weights (share_time_frequency_weights), and cell clipping.


tf.contrib.rnn.GridLSTMCell.__call__(inputs, state, scope=None)

Run one step of LSTM.

Args:
  • inputs: input Tensor, 2D, batch x num_units.
  • state: state Tensor, 2D, batch x state_size.
  • scope: VariableScope for the created subgraph; defaults to "LSTMCell".
Returns:

A tuple containing:

  • A 2D, batch x output_dim, Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
  • A 2D, batch x state_size, Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".
Raises:
  • ValueError: if an input_size was specified and the provided inputs have a different dimension.

tf.contrib.rnn.GridLSTMCell.__init__(num_units, use_peepholes=False, share_time_frequency_weights=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None, num_frequency_blocks=1, couple_input_forget_gates=False, state_is_tuple=False)

Initialize the parameters for an LSTM cell.

Args:
  • num_units: int, The number of units in the LSTM cell
  • use_peepholes: bool, default False. Set True to enable diagonal/peephole connections.
  • share_time_frequency_weights: bool, default False. Set True to enable shared cell weights between time and frequency LSTMs.
  • cell_clip: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
  • initializer: (optional) The initializer to use for the weight and projection matrices.
  • num_unit_shards: int, How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
  • forget_bias: float, Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
  • feature_size: int, The size of the input feature the LSTM spans over.
  • frequency_skip: int, The amount the LSTM filter is shifted by in frequency.
  • num_frequency_blocks: int, The total number of frequency blocks needed to cover the whole input feature.
  • couple_input_forget_gates: bool, Whether to couple the input and forget gates, i.e. f_gate = 1.0 - i_gate, to reduce model parameters and computation cost.
  • state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. By default (False), they are concatenated along the column axis. This default behavior will soon be deprecated.
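
Similarly, a construction-only sketch (TF 1.x; all sizes are placeholders) showing the weight-sharing and gate-coupling options:

import tensorflow as tf

cell = tf.contrib.rnn.GridLSTMCell(
    num_units=64, use_peepholes=True, share_time_frequency_weights=True,
    feature_size=8, frequency_skip=1, num_frequency_blocks=1,
    couple_input_forget_gates=True, state_is_tuple=True)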

tf.contrib.rnn.GridLSTMCell.output_size


tf.contrib.rnn.GridLSTMCell.state_size


tf.contrib.rnn.GridLSTMCell.state_tuple_type


tf.contrib.rnn.GridLSTMCell.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

RNNCell wrappers


class tf.contrib.rnn.AttentionCellWrapper

Basic attention cell wrapper.

Implementation based on https://arxiv.org/pdf/1601.06733.pdf.


tf.contrib.rnn.AttentionCellWrapper.__call__(inputs, state, scope=None)

Long short-term memory cell with attention (LSTMA).


tf.contrib.rnn.AttentionCellWrapper.__init__(cell, attn_length, attn_size=None, attn_vec_size=None, input_size=None, state_is_tuple=False)

Create a cell with attention.

Args:
  • cell: an RNNCell, an attention is added to it.
  • attn_length: integer, the size of an attention window.
  • attn_size: integer, the size of an attention vector. Equal to cell.output_size by default.
  • attn_vec_size: integer, the number of convolutional features calculated on attention state and the size of the hidden layer built from the base cell state. Equal to attn_size by default.
  • input_size: integer, the size of a hidden linear layer, built from inputs and attention. Derived from the input tensor by default.
  • state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). By default (False), the states are all concatenated along the column axis.
Raises:
  • TypeError: if cell is not an RNNCell.
  • ValueError: if cell returns a state tuple but the flag state_is_tuple is False or if attn_length is zero or less.
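
For example (a sketch assuming TF 1.x and made-up sizes), the wrapper is applied to an existing cell and then used like any other RNNCell:

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [32, 20, 8])   # [batch, time, features], hypothetical
base_cell = tf.contrib.rnn.LSTMBlockCell(num_units=64)
attn_cell = tf.contrib.rnn.AttentionCellWrapper(
    base_cell, attn_length=10, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(attn_cell, inputs, dtype=tf.float32)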

tf.contrib.rnn.AttentionCellWrapper.output_size


tf.contrib.rnn.AttentionCellWrapper.state_size


tf.contrib.rnn.AttentionCellWrapper.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

Other Functions and Classes


class tf.contrib.rnn.LSTMBlockWrapper

This is a helper class that provides housekeeping for LSTM cells.

This may be useful for alternative LSTM and similar types of cells. Subclasses must implement the _call_cell method and the num_units property.


tf.contrib.rnn.LSTMBlockWrapper.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)

Run this LSTM on inputs, starting from the given state.

Args:
  • inputs: 3-D tensor with shape [time_len, batch_size, input_size] or a list of time_len tensors of shape [batch_size, input_size].
  • initial_state: a tuple (initial_cell_state, initial_output) with tensors of shape [batch_size, self._num_units]. If this is not provided, the cell is expected to create a zero initial state of type dtype.
  • dtype: The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
  • sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) size [batch_size], values in [0, time_len). Defaults to time_len for each element.
  • scope: VariableScope for the created subgraph; defaults to class name.
Returns:

A pair containing:

  • Output: A 3-D tensor of shape [time_len, batch_size, output_size] or a list of time_len tensors of shape [batch_size, output_size], to match the type of the inputs.
  • Final state: a tuple (cell_state, output) matching initial_state.
Raises:
  • ValueError: in case of shape mismatches

tf.contrib.rnn.LSTMBlockWrapper.num_units

Number of units in this cell (output dimension).


class tf.contrib.rnn.LayerNormBasicLSTMCell

LSTM unit with layer normalization and recurrent dropout.

This class adds layer normalization and recurrent dropout to a basic LSTM unit. Layer normalization implementation is based on:

https://arxiv.org/abs/1607.06450.

"Layer Normalization" Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

and is applied before the internal nonlinearities. Recurrent dropout is based on:

https://arxiv.org/abs/1603.05118

"Recurrent Dropout without Memory Loss" Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth.


tf.contrib.rnn.LayerNormBasicLSTMCell.__call__(inputs, state, scope=None)

LSTM cell with layer normalization and recurrent dropout.


tf.contrib.rnn.LayerNormBasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, activation=tanh, layer_norm=True, norm_gain=1.0, norm_shift=0.0, dropout_keep_prob=1.0, dropout_prob_seed=None)

Initializes the basic LSTM cell.

Args:
  • num_units: int, The number of units in the LSTM cell.
  • forget_bias: float, The bias added to forget gates (see above).
  • input_size: Deprecated and unused.
  • activation: Activation function of the inner states.
  • layer_norm: If True, layer normalization will be applied.
  • norm_gain: float, The layer normalization gain initial value. If layer_norm has been set to False, this argument will be ignored.
  • norm_shift: float, The layer normalization shift initial value. If layer_norm has been set to False, this argument will be ignored.
  • dropout_keep_prob: unit Tensor or float between 0 and 1 representing the recurrent dropout probability value. If float and 1.0, no dropout will be applied.
  • dropout_prob_seed: (optional) integer, the randomness seed.
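
A brief usage sketch (TF 1.x; the shapes and keep probability are illustrative):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [32, 20, 8])   # [batch, time, features], hypothetical
cell = tf.contrib.rnn.LayerNormBasicLSTMCell(
    num_units=64, layer_norm=True, dropout_keep_prob=0.9)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)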

tf.contrib.rnn.LayerNormBasicLSTMCell.output_size


tf.contrib.rnn.LayerNormBasicLSTMCell.state_size


tf.contrib.rnn.LayerNormBasicLSTMCell.zero_state(batch_size, dtype)

Return zero-filled state tensor(s).

Args:
  • batch_size: int, float, or unit Tensor representing the batch size.
  • dtype: the data type to use for the state.
Returns:

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.


tf.contrib.rnn.stack_bidirectional_dynamic_rnn(cells_fw, cells_bw, inputs, initial_states_fw=None, initial_states_bw=None, dtype=None, sequence_length=None, scope=None)

Creates a dynamic bidirectional recurrent neural network.

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as the input of the next layer. tf.bidirectional_rnn does not allow sharing forward and backward information between layers. The input_size of the first forward and backward cells must match. The initial state for both directions is zero and no intermediate states are returned.

Args:
  • cells_fw: List of instances of RNNCell, one per layer, to be used for forward direction.
  • cells_bw: List of instances of RNNCell, one per layer, to be used for backward direction.
  • inputs: The RNN inputs. A Tensor of shape [batch_size, max_time, input_size], or a nested tuple of such elements.
  • initial_states_fw: (optional) A list of the initial states (one per layer) for the forward RNN. Each tensor must have an appropriate type and shape [batch_size, cell_fw.state_size].
  • initial_states_bw: (optional) Same as for initial_states_fw, but using the corresponding properties of cells_bw.
  • dtype: (optional) The data type for the initial state. Required if either of the initial states are not provided.
  • sequence_length: (optional) An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
  • scope: VariableScope for the created subgraph; defaults to None.
Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

  • outputs: Output Tensor shaped [batch_size, max_time, layers_output], where layers_output are the depth-concatenated forward and backward outputs.
  • output_state_fw: The final states, one tensor per layer, of the forward rnn.
  • output_state_bw: The final states, one tensor per layer, of the backward rnn.
Raises:
  • TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
  • ValueError: If inputs is None, not a list or an empty list.
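
An illustrative call with two stacked layers (TF 1.x; shapes and cell sizes are placeholders):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [32, 20, 8])   # [batch_size, max_time, input_size]
cells_fw = [tf.contrib.rnn.LSTMBlockCell(64) for _ in range(2)]
cells_bw = [tf.contrib.rnn.LSTMBlockCell(64) for _ in range(2)]
outputs, states_fw, states_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    cells_fw, cells_bw, inputs, dtype=tf.float32)
# outputs is [batch_size, max_time, 2 * 64]: forward and backward outputs concatenated.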

tf.contrib.rnn.stack_bidirectional_rnn(cells_fw, cells_bw, inputs, initial_states_fw=None, initial_states_bw=None, dtype=None, sequence_length=None, scope=None)

Creates a bidirectional recurrent neural network.

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as the input of the next layer. tf.bidirectional_rnn does not allow sharing forward and backward information between layers. The input_size of the first forward and backward cells must match. The initial state for both directions is zero and no intermediate states are returned.

As described in https://arxiv.org/abs/1303.5778

Args:
  • cells_fw: List of instances of RNNCell, one per layer, to be used for forward direction.
  • cells_bw: List of instances of RNNCell, one per layer, to be used for backward direction.
  • inputs: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
  • initial_states_fw: (optional) A list of the initial states (one per layer) for the forward RNN. Each tensor must have an appropriate type and shape [batch_size, cell_fw.state_size].
  • initial_states_bw: (optional) Same as for initial_states_fw, but using the corresponding properties of cells_bw.
  • dtype: (optional) The data type for the initial state. Required if either of the initial states are not provided.
  • sequence_length: (optional) An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
  • scope: VariableScope for the created subgraph; defaults to None.
Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

  • outputs: A length T list of outputs (one for each input), which are the depth-concatenated forward and backward outputs.
  • output_state_fw: The final states, one tensor per layer, of the forward rnn.
  • output_state_bw: The final states, one tensor per layer, of the backward rnn.

Raises:
  • TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
  • ValueError: If inputs is None, not a list or an empty list.
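
The static variant takes a Python list of per-time-step tensors instead of a single 3-D tensor (a sketch with made-up sizes):

import tensorflow as tf

# A length-20 list of [batch_size, input_size] tensors.
inputs = [tf.placeholder(tf.float32, [32, 8]) for _ in range(20)]
cells_fw = [tf.contrib.rnn.LSTMBlockCell(64) for _ in range(2)]
cells_bw = [tf.contrib.rnn.LSTMBlockCell(64) for _ in range(2)]
outputs, state_fw, state_bw = tf.contrib.rnn.stack_bidirectional_rnn(
    cells_fw, cells_bw, inputs, dtype=tf.float32)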
