Evaluate log(1 - normal_cdf(x)) in a numerically stable way

459 views Asked by At

How to evaluate log(1 - normal_cdf(x)) in a numerically stable way? Here, normal_cdf is the cumulative distribution function of the standard Normal distribution.

For example, in Python:

import scipy 
from scipy.stats import norm

np.log(1 - norm.cdf(10))

gives -inf with RuntimeWarning: divide by zero encountered in log since norm.cdf(10) is almost equal to 1. Is there a function like logsumexp that can avoid numerical under-flow?

2

There are 2 answers

0
Hong Ooi On BEST ANSWER

Since the normal distribution is symmetric about 0, we have

1 - F(x) = P(X > x)
         = P(X < -x)
         = F(-x)

Hence

np.log(1 - norm.cdf(10)) = np.log(norm.cdf(-10))
                         = norm.logcdf(-10)
0
Warren Weckesser On

@HongOoi's suggestion to take advantage of the symmetry is great. But for an arbitrary distribution in scipy.stats (including norm), you can use the method logsf for exactly this computation. sf stands for survival function, which is the name of the function 1 - cdf(x).

For example,

In [25]: import numpy as np

In [26]: from scipy.stats import norm, gamma

Here's an example of norm.logsf:

In [27]: norm.logsf(3, loc=1, scale=1.5)
Out[27]: -2.3945773661586434

In [28]: np.log(1 - norm.cdf(3, loc=1, scale=1.5))
Out[28]: -2.3945773661586434

And here's an example of gamma.logsf:

In [29]: gamma.logsf(1.2345, a=2, scale=1.8)
Out[29]: -0.16357333194167956

In [30]: np.log(1 - gamma.cdf(1.2345, a=2, scale=1.8))
Out[30]: -0.16357333194167956

This shows why one would want to use logsf(x) instead of log(1 - cdf(x)):

In [35]: norm.logsf(50, loc=1, scale=1.5)
Out[35]: -537.96178420294677

In [36]: np.log(1 - norm.cdf(50, loc=1, scale=1.5))
/Users/warren/miniconda3scipy/bin/ipython:1: RuntimeWarning: divide by zero encountered in log
  #!/Users/warren/miniconda3scipy/bin/python
Out[36]: -inf