# BayesFlow Monte Carlo (contrib)
[TOC]
Monte Carlo integration and helpers.
## Background
Monte Carlo integration refers to the practice of estimating an expectation
with a sample mean. For example, given random variable `Z` in `R^k` with
density `p`, the expectation of function `f` can be approximated like:

```
E_p[f(Z)] = \int f(z) p(z) dz
          ~ S_n := n^{-1} \sum_{i=1}^n f(z_i),  z_i iid samples from p.
```

If `E_p[|f(Z)|] < infinity`, then `S_n --> E_p[f(Z)]` by the strong law of
large numbers. If `E_p[f(Z)^2] < infinity`, then `S_n` is asymptotically
normal with variance `Var[f(Z)] / n`.
Practitioners of Bayesian statistics often find themselves wanting to estimate
`E_p[f(Z)]` when the distribution `p` is known only up to a constant. For
example, the joint distribution `p(z, x)` may be known, but the evidence
`p(x) = \int p(z, x) dz` may be intractable. In that case, a parameterized
distribution family `q_lambda(z)` may be chosen, and the optimal `lambda` is
the one minimizing the KL divergence between `q_lambda(z)` and `p(z | x)`. We
only know `p(z, x)`, but that is sufficient to find `lambda`.
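To see why knowing `p(z, x)` suffices, note that the intractable evidence
enters the KL divergence only as an additive constant (this standard
decomposition is added here for context; it is not part of the module API):

```
KL[ q_lambda(Z) || p(Z | x) ]
    = E_{q_lambda}[ Log[q_lambda(Z)] - Log[p(Z, x)] ] + Log[p(x)].
```

Since `Log[p(x)]` does not depend on `lambda`, minimizing the expectation term
(which requires only `p(z, x)`) also minimizes the KL divergence.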
## Log-space evaluation and subtracting the maximum
Care must be taken when the random variable lives in a high dimensional space.
For example, the naive importance sample estimate `E_q[f(Z) p(Z) / q(Z)]`
involves the ratio of two terms `p(Z) / q(Z)`, each of which must have tails
dropping off faster than `O(|z|^{-(k + 1)})` in order to have a finite
integral. This ratio would often be zero or infinity up to numerical
precision.
For that reason, we write

```
Log E_q[ f(Z) p(Z) / q(Z) ]
    = Log E_q[ exp{Log[f(Z)] + Log[p(Z)] - Log[q(Z)] - C} ] + C,  where
C := Max[ Log[f(Z)] + Log[p(Z)] - Log[q(Z)] ].
```

The maximum value of the exponentiated term will be 0.0, and the expectation
can be evaluated in a stable manner.
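To make the max-subtraction trick concrete, here is a minimal sketch in plain
NumPy (the helper name `log_mean_exp` and the toy values are illustrative, not
part of this module):

```python
import numpy as np

def log_mean_exp(values):
  # Stable evaluation of Log[mean(exp(values))] via max-subtraction:
  # with C := max(values), the largest exponentiated term is exp(0) = 1.0.
  c = np.max(values)
  return np.log(np.mean(np.exp(values - c))) + c

# values[i] plays the role of Log[f(z_i)] + Log[p(z_i)] - Log[q(z_i)].
values = np.array([-1000.0, -1001.0, -1002.0])

print(np.log(np.mean(np.exp(values))))  # -inf: exp underflows to zero.
print(log_mean_exp(values))             # Approximately -1000.69, stable.
```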
## Ops
### `tf.contrib.bayesflow.monte_carlo.expectation(f, p, z=None, n=None, seed=None, name='expectation')`
Monte Carlo estimate of an expectation: `E_p[f(Z)]` with sample mean.

This `Op` returns

```
n^{-1} sum_{i=1}^n f(z_i),  where z_i ~ p
\approx E_p[f(Z)]
```

User supplies either `Tensor` of samples `z`, or number of samples to draw
`n`.
Args:

*   `f`: Callable mapping samples from `p` to `Tensors`.
*   `p`: `tf.contrib.distributions.BaseDistribution`.
*   `z`: `Tensor` of samples from `p`, produced by `p.sample_n`.
*   `n`: Integer `Tensor`. Number of samples to generate if `z` is not
    provided.
*   `seed`: Python integer to seed the random number generator.
*   `name`: A name to give this `Op`.
Returns:

A `Tensor` with the same `dtype` as `p`.
Example:

```python
import numpy as np
import tensorflow as tf

monte_carlo = tf.contrib.bayesflow.monte_carlo
distributions = tf.contrib.distributions

N_samples = 10000

dist = distributions.Uniform([0.0, 0.0], [1.0, 2.0])
elementwise_mean = lambda x: x
mean_sum = lambda x: tf.reduce_sum(x, 1)

estimate_elementwise_mean_tf = monte_carlo.expectation(elementwise_mean,
                                                       dist,
                                                       n=N_samples)
estimate_mean_sum_tf = monte_carlo.expectation(mean_sum,
                                               dist,
                                               n=N_samples)

with tf.Session() as sess:
  estimate_elementwise_mean, estimate_mean_sum = sess.run(
      [estimate_elementwise_mean_tf, estimate_mean_sum_tf])

print(estimate_elementwise_mean)
>>> np.array([ 0.50018013,  1.00097895], dtype=np.float32)
print(estimate_mean_sum)
>>> 1.49571
```
### `tf.contrib.bayesflow.monte_carlo.expectation_importance_sampler(f, log_p, sampling_dist_q, z=None, n=None, seed=None, name='expectation_importance_sampler')`
Monte Carlo estimate of `E_p[f(Z)] = E_q[f(Z) p(Z) / q(Z)]`.

With `p(z) := exp{log_p(z)}`, this `Op` returns

```
n^{-1} sum_{i=1}^n [ f(z_i) p(z_i) / q(z_i) ],  z_i ~ q,
\approx E_q[ f(Z) p(Z) / q(Z) ]
=       E_p[f(Z)]
```

This integral is done in log-space with max-subtraction to better handle the
often extreme values that `f(z) p(z) / q(z)` can take on.

If `f >= 0`, it is up to 2x more efficient to exponentiate the result of
`expectation_importance_sampler_logspace` applied to `Log[f]`.

User supplies either `Tensor` of samples `z`, or number of samples to draw
`n`.
Args:

*   `f`: Callable mapping samples from `sampling_dist_q` to `Tensors` with
    shape broadcastable to `q.batch_shape`. For example, `f` works "just like"
    `q.log_prob`.
*   `log_p`: Callable mapping samples from `sampling_dist_q` to `Tensors` with
    shape broadcastable to `q.batch_shape`. For example, `log_p` works "just
    like" `sampling_dist_q.log_prob`.
*   `sampling_dist_q`: The sampling distribution.
    `tf.contrib.distributions.BaseDistribution`. `float64` `dtype`
    recommended. `log_p` and `q` should be supported on the same set.
*   `z`: `Tensor` of samples from `q`, produced by `q.sample_n`.
*   `n`: Integer `Tensor`. Number of samples to generate if `z` is not
    provided.
*   `seed`: Python integer to seed the random number generator.
*   `name`: A name to give this `Op`.
Returns:

The importance sampling estimate. `Tensor` with `shape` equal to the batch
shape of `q`, and `dtype` equal to `q.dtype`.
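A minimal usage sketch (not from the original docs): estimating
`E_p[Z^2] = 1` for a standard Normal target known only through its
log-density, sampling from a wider Normal. The constructor argument names
(`mu`, `sigma`) follow the contrib-era `tf.contrib.distributions` API and may
differ in other versions.

```python
import tensorflow as tf

monte_carlo = tf.contrib.bayesflow.monte_carlo
distributions = tf.contrib.distributions

# Target p: standard Normal, accessed only via its log-density.
p = distributions.Normal(mu=0.0, sigma=1.0)
# Sampling distribution q: wider than p, so the ratio p(z) / q(z) stays tame.
q = distributions.Normal(mu=0.0, sigma=2.0)

# Estimate E_p[Z^2] = Var[Z] = 1 using samples drawn from q.
estimate_tf = monte_carlo.expectation_importance_sampler(
    f=tf.square,
    log_p=p.log_prob,
    sampling_dist_q=q,
    n=10000)

with tf.Session() as sess:
  print(sess.run(estimate_tf))  # Approximately 1.0.
```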
### `tf.contrib.bayesflow.monte_carlo.expectation_importance_sampler_logspace(log_f, log_p, sampling_dist_q, z=None, n=None, seed=None, name='expectation_importance_sampler_logspace')`
Importance sampling with a positive function, in log-space.

With `p(z) := exp{log_p(z)}`, and `f(z) = exp{log_f(z)}`, this `Op` returns

```
Log[ n^{-1} sum_{i=1}^n [ f(z_i) p(z_i) / q(z_i) ] ],  z_i ~ q,
\approx Log[ E_q[ f(Z) p(Z) / q(Z) ] ]
=       Log[ E_p[f(Z)] ]
```

This integral is done in log-space with max-subtraction to better handle the
often extreme values that `f(z) p(z) / q(z)` can take on.

In contrast to `expectation_importance_sampler`, this `Op` returns values in
log-space.

User supplies either `Tensor` of samples `z`, or number of samples to draw
`n`.
Args:

*   `log_f`: Callable mapping samples from `sampling_dist_q` to `Tensors` with
    shape broadcastable to `q.batch_shape`. For example, `log_f` works "just
    like" `sampling_dist_q.log_prob`.
*   `log_p`: Callable mapping samples from `sampling_dist_q` to `Tensors` with
    shape broadcastable to `q.batch_shape`. For example, `log_p` works "just
    like" `q.log_prob`.
*   `sampling_dist_q`: The sampling distribution.
    `tf.contrib.distributions.BaseDistribution`. `float64` `dtype`
    recommended. `log_p` and `q` should be supported on the same set.
*   `z`: `Tensor` of samples from `q`, produced by `q.sample_n`.
*   `n`: Integer `Tensor`. Number of samples to generate if `z` is not
    provided.
*   `seed`: Python integer to seed the random number generator.
*   `name`: A name to give this `Op`.
Returns:

Logarithm of the importance sampling estimate. `Tensor` with `shape` equal to
the batch shape of `q`, and `dtype` equal to `q.dtype`.
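Continuing the hedged sketch from `expectation_importance_sampler` (same
assumptions about the contrib-era API): since `f(z) = z^2 >= 0`, we can pass
`log_f = Log[f]` and exponentiate the log-space result, as suggested in the
notes above.

```python
import tensorflow as tf

monte_carlo = tf.contrib.bayesflow.monte_carlo
distributions = tf.contrib.distributions

p = distributions.Normal(mu=0.0, sigma=1.0)  # Target, known via log-density.
q = distributions.Normal(mu=0.0, sigma=2.0)  # Sampling distribution.

# log_f = Log[Z^2]; the estimate is computed entirely in log-space.
log_estimate_tf = monte_carlo.expectation_importance_sampler_logspace(
    log_f=lambda z: tf.log(tf.square(z)),
    log_p=p.log_prob,
    sampling_dist_q=q,
    n=10000)

with tf.Session() as sess:
  # Exponentiate at the end to recover E_p[Z^2] ~= 1.0; intermediate values
  # never leave log-space, avoiding overflow/underflow in the ratio.
  print(sess.run(tf.exp(log_estimate_tf)))
```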