Truncated normal distribution

Figure: Probability density function for the truncated normal distribution for different sets of parameters. In all cases, a = −10 and b = 10. Black: μ = −8, σ = 2; blue: μ = 0, σ = 2; red: μ = 9, σ = 10; orange: μ = 0, σ = 10.
Figure: Cumulative distribution function for the truncated normal distribution, for the same parameter sets and colours.
Notation	$\xi ={\frac {x-\mu }{\sigma }},\ \alpha ={\frac {a-\mu }{\sigma }},\ \beta ={\frac {b-\mu }{\sigma }}$ $Z=\Phi (\beta )-\Phi (\alpha )$
Parameters	μ ∈ R (location), σ² > 0 (squared scale), a ∈ R (minimum value), b ∈ R (maximum value)
Support	x ∈ [a,b]
PDF	$f(x;\mu ,\sigma ,a,b)={\frac {\phi (\xi )}{\sigma Z}}\,$ ^[1]
CDF	$F(x;\mu ,\sigma ,a,b)={\frac {\Phi (\xi )-\Phi (\alpha )}{Z}}$
Mean	$\mu +{\frac {\phi (\alpha )-\phi (\beta )}{Z}}\sigma$
Mode	$\left\{{\begin{array}{ll}a,&\mathrm {if} \ \mu <a\\\mu ,&\mathrm {if} \ a\leq \mu \leq b\\b,&\mathrm {if} \ \mu >b\end{array}}\right.$
Variance	$\sigma ^{2}\left[1+{\frac {\alpha \phi (\alpha )-\beta \phi (\beta )}{Z}}-\left({\frac {\phi (\alpha )-\phi (\beta )}{Z}}\right)^{2}\right]$
Entropy	$\ln({\sqrt {2\pi e}}\sigma Z)+{\frac {\alpha \phi (\alpha )-\beta \phi (\beta )}{2Z}}$
MGF	$e^{\mu t+\sigma ^{2}t^{2}/2}\left[{\frac {\Phi (\beta -\sigma t)-\Phi (\alpha -\sigma t)}{\Phi (\beta )-\Phi (\alpha )}}\right]$

In probability and statistics, the truncated normal distribution is the probability distribution of a normally distributed random variable whose value is bounded below, above, or both. The truncated normal distribution has wide applications in statistics and econometrics. For example, it is used to model the probabilities of the binary outcomes in the probit model and to model censored data in the Tobit model.

Definition

Suppose $X\sim N(\mu ,\sigma ^{2})$ has a normal distribution, and let $-\infty \leq a<b\leq \infty$ . Then $X$ conditional on $a<X<b$ has a truncated normal distribution.

Its probability density function, $f$ , for $a\leq x\leq b$ , is given by

f(x;\mu ,\sigma ,a,b)={\frac {{\frac {1}{\sigma }}\phi ({\frac {x-\mu }{\sigma }})}{\Phi ({\frac {b-\mu }{\sigma }})-\Phi ({\frac {a-\mu }{\sigma }})}}

and by $f=0$ otherwise.

Here,

\phi (\xi )={\frac {1}{\sqrt {2\pi }}}\exp \left(-{\frac {1}{2}}\xi ^{2}\right)

is the probability density function of the standard normal distribution and $\Phi (\cdot )$ is its cumulative distribution function. By convention, if $b=\infty$ , then $\Phi \left({\tfrac {b-\mu }{\sigma }}\right)=1$ , and similarly, if $a=-\infty$ , then $\Phi \left({\tfrac {a-\mu }{\sigma }}\right)=0$ .
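As an illustration, the density and the distribution function can be evaluated directly from these definitions. The following is a minimal Python sketch using only the standard library; the function names (`phi`, `Phi`, `truncnorm_pdf`, `truncnorm_cdf`) are chosen here for illustration and do not come from any particular package.

```python
import math

def phi(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF; math.erf maps +/-inf to +/-1, so infinite bounds work."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncnorm_pdf(x, mu, sigma, a, b):
    """Density of N(mu, sigma^2) conditioned on a <= X <= b (zero elsewhere)."""
    if x < a or x > b:
        return 0.0
    alpha, beta = (a - mu) / sigma, (b - mu) / sigma
    Z = Phi(beta) - Phi(alpha)  # probability mass of the untruncated normal on (a, b)
    return phi((x - mu) / sigma) / (sigma * Z)

def truncnorm_cdf(x, mu, sigma, a, b):
    """Distribution function (Phi(xi) - Phi(alpha)) / Z, clipped to [0, 1] outside (a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    alpha, beta = (a - mu) / sigma, (b - mu) / sigma
    Z = Phi(beta) - Phi(alpha)
    return (Phi((x - mu) / sigma) - Phi(alpha)) / Z
```

Passing `float("inf")` or `-float("inf")` as a bound reproduces the convention stated above. The subtraction `Phi(beta) - Phi(alpha)` loses accuracy when both bounds lie far in the same tail, so this sketch is not suitable for extreme truncation.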

Moments

Let $\alpha =(a-\mu )/\sigma$ and $\beta =(b-\mu )/\sigma .$

Two-sided truncation:^[2]

\operatorname {E} (X\mid a<X<b)=\mu +\sigma {\frac {\phi ({\frac {a-\mu }{\sigma }})-\phi ({\frac {b-\mu }{\sigma }})}{\Phi ({\frac {b-\mu }{\sigma }})-\Phi ({\frac {a-\mu }{\sigma }})}}\!=\mu +\sigma {\frac {\phi (\alpha )-\phi (\beta )}{\Phi (\beta )-\Phi (\alpha )}}\!

\operatorname {Var} (X\mid a<X<b)=\sigma ^{2}\left[1+{\frac {{\frac {a-\mu }{\sigma }}\phi ({\frac {a-\mu }{\sigma }})-{\frac {b-\mu }{\sigma }}\phi ({\frac {b-\mu }{\sigma }})}{\Phi ({\frac {b-\mu }{\sigma }})-\Phi ({\frac {a-\mu }{\sigma }})}}-\left({\frac {\phi ({\frac {a-\mu }{\sigma }})-\phi ({\frac {b-\mu }{\sigma }})}{\Phi ({\frac {b-\mu }{\sigma }})-\Phi ({\frac {a-\mu }{\sigma }})}}\right)^{2}\right]\!=\sigma ^{2}\left[1+{\frac {\alpha \phi (\alpha )-\beta \phi (\beta )}{\Phi (\beta )-\Phi (\alpha )}}-\left({\frac {\phi (\alpha )-\phi (\beta )}{\Phi (\beta )-\Phi (\alpha )}}\right)^{2}\right]\!
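The two-sided formulas translate almost line by line into code. The sketch below is illustrative only: the name `truncnorm_mean_var` is introduced here, and the function assumes finite `a` and `b` (with an infinite bound the product of the bound and its density would evaluate to `nan` in floating point).

```python
import math

def phi(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncnorm_mean_var(mu, sigma, a, b):
    """Mean and variance of N(mu, sigma^2) conditioned on a < X < b (finite a < b)."""
    alpha, beta = (a - mu) / sigma, (b - mu) / sigma
    Z = Phi(beta) - Phi(alpha)
    ratio = (phi(alpha) - phi(beta)) / Z
    mean = mu + sigma * ratio
    var = sigma ** 2 * (1.0 + (alpha * phi(alpha) - beta * phi(beta)) / Z - ratio ** 2)
    return mean, var
```

For μ = −8, σ = 2, a = −10, b = 10 (one of the parameter sets in the figure captions), this gives a mean of about −7.42 and a variance of about 2.52, compared with −8 and 4 for the untruncated distribution.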

One-sided truncation (upper tail retained, $X>a$ ):^[3]

\operatorname {E} (X\mid X>a)=\mu +\sigma \lambda (\alpha )\!

\operatorname {Var} (X\mid X>a)=\sigma ^{2}[1-\delta (\alpha )],\!

where $\alpha =(a-\mu )/\sigma ,\;\lambda (\alpha )=\phi (\alpha )/[1-\Phi (\alpha )]\;$ and $\;\delta (\alpha )=\lambda (\alpha )[\lambda (\alpha )-\alpha ]$ .
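The quantities $\lambda(\alpha)$ and $\delta(\alpha)$ are straightforward to compute. The following sketch uses hypothetical function names (`mills_lambda`, `moments_given_greater_than`) and evaluates $1-\Phi(\alpha)$ through the complementary error function, a choice made here to avoid cancellation for large $\alpha$ rather than something prescribed by the cited source.

```python
import math

def mills_lambda(alpha):
    """lambda(alpha) = phi(alpha) / (1 - Phi(alpha))."""
    density = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(alpha / math.sqrt(2.0))  # 1 - Phi(alpha) without cancellation
    return density / survival

def moments_given_greater_than(mu, sigma, a):
    """Mean and variance of N(mu, sigma^2) conditioned on X > a."""
    alpha = (a - mu) / sigma
    lam = mills_lambda(alpha)
    delta = lam * (lam - alpha)
    return mu + sigma * lam, sigma ** 2 * (1.0 - delta)
```

As a check, truncating a standard normal at $a=0$ yields the half-normal distribution, whose mean is $\sqrt{2/\pi}$ and whose variance is $1-2/\pi$.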

One-sided truncation (lower tail retained, $X<b$ ):

\operatorname {E} (X\mid X<b)=\mu -\sigma {\frac {\phi (\beta )}{\Phi (\beta )}}\!

\operatorname {Var} (X\mid X<b)=\sigma ^{2}\left[1-\beta {\frac {\phi (\beta )}{\Phi (\beta )}}-\left({\frac {\phi (\beta )}{\Phi (\beta )}}\right)^{2}\right]
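These expressions are the limit of the two-sided formulas as $a\to -\infty$ . As a short worked step, letting $\alpha \to -\infty$ gives $\phi(\alpha)\to 0$, $\alpha\phi(\alpha)\to 0$ and $\Phi(\alpha)\to 0$, so that:

```latex
\begin{aligned}
\operatorname{E}(X \mid X<b)
  &= \mu + \sigma\,\frac{0-\phi(\beta)}{\Phi(\beta)-0}
   = \mu - \sigma\,\frac{\phi(\beta)}{\Phi(\beta)}, \\
\operatorname{Var}(X \mid X<b)
  &= \sigma^{2}\left[1 + \frac{0-\beta\phi(\beta)}{\Phi(\beta)}
     - \left(\frac{0-\phi(\beta)}{\Phi(\beta)}\right)^{2}\right]
   = \sigma^{2}\left[1 - \beta\,\frac{\phi(\beta)}{\Phi(\beta)}
     - \left(\frac{\phi(\beta)}{\Phi(\beta)}\right)^{2}\right].
\end{aligned}
```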

Barr and Sherrill (1999) give a simpler expression for the variance of one-sided truncations. Their formula is in terms of the chi-square CDF, which is implemented in standard software libraries. Bebu and Mathew (2009) provide formulas for (generalized) confidence intervals around the truncated moments.

Differential equation

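On its support, the density $f$ satisfies the following first-order linear differential equation; the stated initial condition at $x=0$ applies when $a\leq 0\leq b$ :

$\left\{\sigma ^{2}f'(x)+f(x)(x-\mu )=0,f(0)={\frac {{\sqrt {\frac {2}{\pi }}}e^{-{\frac {\mu ^{2}}{2\sigma ^{2}}}}}{\sigma \left({\text{erf}}\left({\frac {\mu -a}{{\sqrt {2}}\sigma }}\right)-{\text{erf}}\left({\frac {\mu -b}{{\sqrt {2}}\sigma }}\right)\right)}}\right\}$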
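The equation can be checked directly. On $a<x<b$ the density is a constant multiple of a Gaussian kernel; $C$ below is shorthand, introduced here, for the normalising constant:

```latex
\begin{aligned}
f(x) &= C \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right),
\qquad C = \frac{1}{\sigma\sqrt{2\pi}\,\left[\Phi(\beta)-\Phi(\alpha)\right]}, \\
f'(x) &= -\frac{x-\mu}{\sigma^{2}}\,f(x)
\quad\Longrightarrow\quad
\sigma^{2}f'(x) + (x-\mu)\,f(x) = 0 .
\end{aligned}
```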

A recursive formula

As in the non-truncated case, the moments of the truncated distribution satisfy a recursive formula.^[4]
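The text does not state the recursion itself. One standard form, obtained by integration by parts for a standard normal truncated to $(\alpha ,\beta )$, is $m_{k}=(k-1)\,m_{k-2}+[\alpha ^{k-1}\phi (\alpha )-\beta ^{k-1}\phi (\beta )]/[\Phi (\beta )-\Phi (\alpha )]$ with $m_{0}=1$ and $m_{-1}=0$; the presentation in the cited document may differ. A minimal Python sketch of this form, with illustrative function names, follows.

```python
import math

def phi(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def _boundary_term(z, power):
    """z**power * phi(z), taken as its limiting value 0 at z = +/-inf."""
    if math.isinf(z):
        return 0.0
    return z ** power * phi(z)

def standard_truncated_moments(alpha, beta, k_max):
    """Raw moments m_0..m_k_max of a standard normal truncated to (alpha, beta)."""
    Z = Phi(beta) - Phi(alpha)
    moments = [1.0]  # m_0 = 1
    for k in range(1, k_max + 1):
        two_back = moments[k - 2] if k >= 2 else 0.0  # m_{-1} is taken as 0
        boundary = (_boundary_term(alpha, k - 1) - _boundary_term(beta, k - 1)) / Z
        moments.append((k - 1) * two_back + boundary)
    return moments
```

With `alpha = -inf` and `beta = inf` the boundary terms vanish and the recursion reduces to the familiar one for the untruncated standard normal, giving moments 1, 0, 1, 0, 3. Moments of $X=\mu +\sigma Z$ can then be assembled with the binomial expansion.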

Simulating

A random variate $x$ defined as $x=\Phi ^{-1}(\Phi (\alpha )+U\cdot (\Phi (\beta )-\Phi (\alpha )))\sigma +\mu$ , with $\Phi$ the standard normal cumulative distribution function, $\Phi ^{-1}$ its inverse, and $U$ a uniform random number on $(0,1)$ , follows the normal distribution truncated to the range $(a,b)$ . This is simply the inverse transform method for simulating random variables. Although it is one of the simplest methods, it can fail when sampling in the tail of the normal distribution,^[5] or be much too slow. In practice, alternative methods of simulation are therefore often needed.
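The inverse transform step can be sketched in a few lines of Python with the standard library's `statistics.NormalDist`. The function name `sample_truncnorm_inverse`, the explicit error for a degenerate interval, and the clamping of the result are choices made for this illustration.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides cdf() and inv_cdf()

def sample_truncnorm_inverse(mu, sigma, a, b, rng=random):
    """One draw from N(mu, sigma^2) truncated to (a, b) by inverse transform."""
    alpha, beta = (a - mu) / sigma, (b - mu) / sigma
    lo, hi = _STD.cdf(alpha), _STD.cdf(beta)
    if hi <= lo:
        # Both CDF values rounded to the same float: the tail failure mode.
        raise ValueError("interval has no representable probability mass")
    p = lo + rng.random() * (hi - lo)
    # inv_cdf is undefined at exactly 0 and 1, so keep p strictly inside (0, 1).
    p = min(max(p, 1e-300), 1.0 - 2.0 ** -53)
    x = mu + sigma * _STD.inv_cdf(p)
    return min(max(x, a), b)  # absorb rounding at the interval ends
```

The tail failure mentioned above is easy to reproduce: for μ = 0, σ = 1 and the interval (10, 11), both Φ(10) and Φ(11) round to 1.0 in double precision, so the difference is zero and the sketch raises `ValueError`.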

One such truncated normal generator (implemented in Matlab and, as trandn.R, in R) is based on an acceptance-rejection idea due to Marsaglia.^[6] Despite the slightly suboptimal acceptance rate of Marsaglia (1964) in comparison with Robert (1995), Marsaglia's method is typically faster, because it does not require the costly numerical evaluation of the exponential function.
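Marsaglia's tail method, as commonly stated, proposes from the tail of a Rayleigh distribution and accepts with probability $a/x$. The sketch below shows only that core step for a standard normal conditioned on $X>a$ with $a>0$; it is not the trandn implementation, and the function name is hypothetical.

```python
import math
import random

def sample_normal_tail_marsaglia(a, rng=random):
    """One draw from a standard normal conditioned on X > a, assuming a > 0."""
    if a <= 0:
        raise ValueError("this sketch assumes a > 0")
    while True:
        u1 = 1.0 - rng.random()  # uniform on (0, 1], so log(u1) is finite
        u2 = rng.random()
        # Proposal with density proportional to x * exp(-x^2 / 2) on x >= a.
        x = math.sqrt(a * a - 2.0 * math.log(u1))
        # Accepting with probability a / x leaves a density proportional to exp(-x^2 / 2).
        if u2 * x <= a:
            return x
```

The loop uses a logarithm and a square root but no call to the exponential function, which is the property the comparison with Robert (1995) relies on.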

For more on simulating a draw from the truncated normal distribution, see Robert (1995), Lynch (2007, Section 8.1.3, pages 200–206), and Devroye (1986). The msm package in R has a function, rtnorm, that generates draws from a truncated normal. The truncnorm package in R also has functions to draw from a truncated normal.
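For comparison, the one-sided sampler of Robert (1995) is usually described as rejection sampling from a translated exponential proposal with rate $(a+{\sqrt {a^{2}+4}})/2$. The following is a minimal sketch of that idea for a standard normal conditioned on $X\geq a$, written from the commonly stated form of the algorithm; the function name is hypothetical and none of the R packages mentioned above is claimed to work this way.

```python
import math
import random

def sample_normal_tail_robert(a, rng=random):
    """One draw from a standard normal conditioned on X >= a (exponential proposal)."""
    rate = 0.5 * (a + math.sqrt(a * a + 4.0))  # proposal rate maximising acceptance
    while True:
        # Translated exponential proposal on [a, inf); 1 - random() lies in (0, 1].
        z = a - math.log(1.0 - rng.random()) / rate
        # Acceptance probability exp(-(z - rate)^2 / 2).
        if rng.random() <= math.exp(-0.5 * (z - rate) ** 2):
            return z
```

Unlike the Marsaglia sketch, each trial here evaluates the exponential function once, which illustrates the cost trade-off described above.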

Chopin (2011) proposed an algorithm inspired by the Ziggurat algorithm of Marsaglia and Tsang (1984, 2000), which is usually considered the fastest Gaussian sampler, and is also very close to Ahrens’s algorithm (1995). Implementations can be found in C, C++, Matlab and Python.

Sampling from the multivariate truncated normal distribution is considerably more difficult.^[7] Exact or perfect simulation is only feasible in the case of truncation of the normal distribution to a polytope region.^[7] For more general cases, Damien and Walker (2001) introduce a general methodology for sampling truncated densities within a Gibbs sampling framework. Their algorithm introduces one latent variable and, in that framework, is more computationally efficient than the algorithm of Robert (1995).
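The latent-variable idea can be illustrated in one dimension. The sketch below is written in the spirit of Damien and Walker (2001) rather than taken from their paper: it targets a standard normal truncated to a finite interval $(a,b)$, alternates between a latent uniform variable $y$ and the variable of interest $x$, and uses a function name and a starting point chosen for this example.

```python
import math
import random

def gibbs_truncnorm(a, b, n, rng=random):
    """Gibbs chain of length n targeting a standard normal truncated to (a, b)."""
    x = 0.5 * (a + b)  # arbitrary starting point inside the interval (finite a, b)
    draws = []
    for _ in range(n):
        # Latent step: y | x is uniform on (0, exp(-x^2 / 2)).
        # Track w = -2 ln(y) = x^2 - 2 ln(u) instead of y to avoid underflow.
        u = 1.0 - rng.random()  # uniform on (0, 1]
        w = x * x - 2.0 * math.log(u)
        r = math.sqrt(w)
        # x | y is uniform on the part of (a, b) where exp(-x^2 / 2) > y, i.e. |x| < r.
        lo, hi = max(a, -r), min(b, r)
        x = lo + (hi - lo) * rng.random()
        draws.append(x)
    return draws
```

Successive draws are correlated, as with any Gibbs sampler, so more iterations are needed than with an exact sampler for the same accuracy.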

References

[1] "Lecture 4: Selection" (pdf). web.ist.utl.pt. Instituto Superior Técnico. November 11, 2002. p. 1. Retrieved 14 July 2015.
[2] Johnson, N.L., Kotz, S., Balakrishnan, N. (1994) Continuous Univariate Distributions, Volume 1, Wiley. ISBN 0-471-58495-9 (Section 10.1)
[3] Greene, William H. (2003). Econometric Analysis (5th ed.). Prentice Hall. ISBN 0-13-066189-9.
[4] Document by Eric Orjebin, http://www.smp.uq.edu.au/people/YoniNazarathy/teaching_projects/studentWork/EricOrjebin_TruncatedNormalMoments.pdf
[5] Kroese, D. P.; Taimre, T.; Botev, Z. I. (2011). Handbook of Monte Carlo methods. John Wiley & Sons.
[6] Marsaglia, George (1964). "Generating a variable from the tail of the normal distribution". Technometrics. 6 (1): 101–102. doi:10.2307/1266749.
[7] Botev, Z. I. (2016). "The normal law under linear restrictions: simulation and estimation via minimax tilting". Journal of the Royal Statistical Society: Series B (Statistical Methodology). doi:10.1111/rssb.12162.

Greene, William H. (2003). Econometric Analysis (5th ed.). Prentice Hall. ISBN 0-13-066189-9.
Johnson, Norman L.; Kotz, Samuel (1970). Continuous univariate distributions-1, chapter 13. John Wiley & Sons.
Lynch, Scott (2007). Introduction to Applied Bayesian Statistics and Estimation for Social Scientists. New York: Springer. ISBN 978-1-4419-2434-6.
Robert, Christian P. (1995). "Simulation of truncated normal variables". Statistics and Computing. 5 (2): 121–125. doi:10.1007/BF00143942.
Barr, Donald R.; Sherrill, E.Todd (1999). "Mean and variance of truncated normal distributions". The American Statistician. 53 (4): 357–361. doi:10.1080/00031305.1999.10474490.
Bebu, Ionut; Mathew, Thomas (2009). "Confidence intervals for limited moments and truncated moments in normal and lognormal models". Statistics and Probability Letters. 79: 375–380. doi:10.1016/j.spl.2008.09.006.
Damien, Paul; Walker, Stephen G. (2001). "Sampling truncated normal, beta, and gamma densities". Journal of Computational and Graphical Statistics. 10 (2): 206–215. doi:10.1198/10618600152627906.
Chopin, Nicolas (2011). "Fast simulation of truncated Gaussian distributions". Statistics and Computing. 21 (2): 275–288. doi:10.1007/s11222-009-9168-1.


This article is issued from Wikipedia - version of the 7/19/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.
