Multivariate normal distribution

In probability theory and statistics, a random vector X = (X1, ..., Xn) follows a multivariate normal distribution, also called a multivariate Gaussian distribution (in honor of Carl Friedrich Gauss, although he was not the first to write about the normal distribution), if it satisfies either of the following equivalent conditions:

  • there is a random vector Z=(Z1, ..., Zm), whose components are independent standard normal random variables, a vector μ = (μ1, ..., μn) and an n×m matrix A such that X = A Z + μ.
  • there is a vector μ and a symmetric, positive semi-definite matrix Γ such that the characteristic function of X is
\varphi_X({\mathbf u})=\exp\left(i{\mathbf\mu}^T{\mathbf u}-\frac{1}{2}{\mathbf u}^T{\mathbf\Gamma}{\mathbf u}\right).

The following condition is not quite equivalent to the two above, because it does not allow the covariance matrix to be singular:

  • there is a vector μ=(μ1, ..., μn) and a symmetric, positive definite matrix Σ such that X has density
f_X(x_1,\ldots,x_n)\, dx_1\ldots dx_n= \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}({\mathbf x}-{\mathbf\mu})^T{\mathbf\Sigma}^{-1}({\mathbf x}-{\mathbf\mu}) \right)dx_1\ldots dx_n

where \left|{\mathbf\Sigma}\right| is the determinant of {\mathbf\Sigma}. Note how the equation above reduces to the density of the univariate normal distribution if Σ is a 1\times 1 matrix (i.e., a real number).
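The density above can be evaluated numerically. The following is a minimal sketch in Python with NumPy, assuming Σ is symmetric and positive definite; the function names mvn_pdf and normal_pdf are illustrative and do not come from any library. It also checks the remark that the formula reduces to the univariate normal density when Σ is 1×1.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Evaluate the multivariate normal density at x.

    Assumes Sigma is symmetric positive definite (the non-singular case).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    n = mu.size
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), computed with a
    # linear solve instead of an explicit matrix inverse.
    quad = diff @ np.linalg.solve(Sigma, diff)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(Sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

def normal_pdf(x, mu, sigma2):
    """Univariate normal density with mean mu and variance sigma2."""
    return float(np.exp(-0.5 * (x - mu) ** 2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2))

# With a 1x1 covariance matrix the two functions should agree.
multivariate_value = mvn_pdf(0.3, 1.0, 2.0)
univariate_value = normal_pdf(0.3, 1.0, 2.0)
```

For larger or ill-conditioned Σ, working with a Cholesky factor and log-densities is usually more stable; the direct formula is kept here only because it mirrors the expression in the text.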

The vector μ in these conditions is the expected value of X, and the matrix {\mathbf\Sigma}={\mathbf A}{\mathbf A}^T (which coincides with Γ in the characteristic function) is the covariance matrix of the components Xi. It is important to realize that the covariance matrix must be allowed to be singular. That case arises frequently in statistics; for example, in the distribution of the vector of residuals in ordinary linear regression problems. Note also that the Xi are in general not independent; they can be seen as the result of applying the linear transformation A to a collection of independent Gaussian variables Z and then shifting by μ.
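The construction X = A Z + μ doubles as a sampling recipe, and it works even when A A^T is singular. The sketch below, in Python with NumPy, draws samples this way and compares the empirical mean and covariance with μ and A A^T. The helper name sample_mvn, the particular μ and A, the seed, and the sample size are all illustrative assumptions.

```python
import numpy as np

def sample_mvn(mu, A, size, rng):
    """Draw `size` samples of X = A Z + mu, where Z has independent
    standard normal components and A is an n x m matrix."""
    mu = np.asarray(mu, dtype=float)
    A = np.asarray(A, dtype=float)
    m = A.shape[1]
    Z = rng.standard_normal((size, m))   # each row is one draw of Z
    return Z @ A.T + mu                  # each row is one draw of X

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])
# n = 3 and m = 2, so Sigma = A A^T has rank 2 and is singular.
A = np.array([[1.0, 0.0],
              [0.5, 2.0],
              [1.5, 2.0]])

X = sample_mvn(mu, A, 200_000, rng)
Sigma = A @ A.T
emp_mean = X.mean(axis=0)
emp_cov = np.cov(X, rowvar=False)
```

In this example the third row of A is the sum of the first two, so every sample satisfies X3 − μ3 = (X1 − μ1) + (X2 − μ2): the distribution is concentrated on a plane and has no density in three dimensions, which is exactly the singular case the text insists on allowing.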

Linear transformation

If {\mathbf y}={\mathbf B}{\mathbf x} is a linear transformation of {\mathbf x}, where {\mathbf x} is a p-dimensional multivariate normal vector with mean {\mathbf\mu} and covariance matrix {\mathbf\Sigma}, and {\mathbf B} is an m\times p matrix of rank m (so m\leq p), then {\mathbf y} has a multivariate normal distribution with mean {\mathbf B}{\mathbf\mu} and covariance matrix {\mathbf B}{\mathbf\Sigma}{\mathbf B}^T.

Corollary: any subset of the xi has a marginal distribution that is also multivariate normal. To see this, consider the following example: to extract the subset (x1, x2, x4)^T, use

{\mathbf B}= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \ldots & 0\\ 0 & 1 & 0 & 0 & 0 & \ldots & 0\\ 0 & 0 & 0 & 1 & 0 & \ldots & 0 \end{bmatrix}

which extracts the desired elements directly.
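The selection matrix above can be checked numerically. The following sketch in Python with NumPy builds B for the subset (x1, x2, x4) with p = 5 and confirms that B μ and B Σ B^T are simply the corresponding entries of μ and the corresponding sub-block of Σ. The function name linear_transform_params and the example values of μ and Σ are illustrative assumptions.

```python
import numpy as np

def linear_transform_params(B, mu, Sigma):
    """Mean and covariance of y = B x when x has mean mu and covariance Sigma."""
    B = np.asarray(B, dtype=float)
    return B @ mu, B @ Sigma @ B.T

p = 5
keep = [0, 1, 3]                 # zero-based indices of x1, x2, x4
B = np.eye(p)[keep]              # rows of the identity pick out the subset

mu = np.arange(1.0, p + 1.0)     # illustrative mean vector (1, 2, 3, 4, 5)
L = np.tril(np.ones((p, p)))
Sigma = L @ L.T                  # an illustrative positive definite covariance

mu_y, Sigma_y = linear_transform_params(B, mu, Sigma)
```

Because B only selects coordinates, the same result is obtained by indexing: mu[keep] for the mean and Sigma[np.ix_(keep, keep)] for the covariance.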

Marginal distributions

Suppose {\mathbf x} is partitioned into {\mathbf x}_1 and {\mathbf x}_2, so that {\mathbf x}=({\mathbf x}_1,{\mathbf x}_2)^T (note that vectors are column vectors by default). Say {\mathbf x}_1 has q elements, so that {\mathbf x}_2 has p - q elements.

Conditional distributions

Then if {\mathbf\mu} and {\mathbf\Sigma} are partitioned as follows

{\mathbf\mu}=\left(\begin{matrix} {\mathbf\mu}_1\\ {\mathbf\mu}_2 \end{matrix} \right) \qquad {\mathbf\Sigma}= \begin{bmatrix} {\mathbf\Sigma}_{11} & {\mathbf\Sigma}_{12} \\ {\mathbf\Sigma}_{21} & {\mathbf\Sigma}_{22} \end{bmatrix}

then the distribution of {\mathbf x}_1 conditional on {\mathbf x}_2={\mathbf a} is multivariate normal with mean

{\mathbf\mu}_1+{\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}\left({\mathbf a}-{\mathbf\mu}_2\right)

and covariance matrix

{\mathbf\Sigma}_{11}- {\mathbf\Sigma}_{12} {\mathbf\Sigma}_{22}^{-1} {\mathbf\Sigma}_{21}.

This matrix is the Schur complement of {\mathbf\Sigma_{22}} in {\mathbf\Sigma}.

Note that knowing the value of {\mathbf x}_2 to be {\mathbf a} alters the variance; perhaps more surprisingly, the mean is shifted by {\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1}\left({\mathbf a}-{\mathbf\mu}_2\right). Compare this with the situation of not knowing the value of {\mathbf x}_2, in which case {\mathbf x}_1 would have distribution N_q\left({\mathbf\mu}_1,{\mathbf\Sigma}_{11}\right).

The matrix {\mathbf\Sigma}_{12}{\mathbf\Sigma}_{22}^{-1} is known as the matrix of regression coefficients.
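The conditional mean, the conditional covariance, and the matrix of regression coefficients can be computed directly from the partitioned μ and Σ. The sketch below is in Python with NumPy and assumes Σ22 is invertible and Σ is symmetric; the function name conditional_mvn and the bivariate example values are illustrative.

```python
import numpy as np

def conditional_mvn(mu, Sigma, q, a):
    """Parameters of the distribution of x1 given x2 = a.

    x1 is the first q components of x and x2 the remaining ones.
    Assumes Sigma is symmetric and the block Sigma_22 is invertible.
    Returns (conditional mean, conditional covariance, regression coefficients).
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    a = np.asarray(a, dtype=float)
    mu1, mu2 = mu[:q], mu[q:]
    S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
    S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
    # Sigma_12 Sigma_22^{-1}: by symmetry this is the transpose of
    # Sigma_22^{-1} Sigma_21, which avoids forming an explicit inverse.
    coef = np.linalg.solve(S22, S21).T
    cond_mean = mu1 + coef @ (a - mu2)
    cond_cov = S11 - coef @ S21          # Schur complement of Sigma_22
    return cond_mean, cond_cov, coef

# Bivariate example: unit variances, correlation 0.8, observe x2 = 2.
rho = 0.8
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])
cond_mean, cond_cov, coef = conditional_mvn(mu, Sigma, q=1, a=[2.0])
```

In this bivariate case the formulas give a conditional mean of ρ·a = 1.6 and a conditional variance of 1 − ρ² = 0.36, smaller than the unconditional variance of 1.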

Estimation of parameters

The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is perhaps surprisingly subtle and elegant. See estimation of covariance matrices.
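Although the derivation is subtle, the resulting estimators are simple to compute. The sketch below, in Python with NumPy, uses the standard result (stated here as background, not derived in this article) that for independent observations the maximum-likelihood estimate of μ is the sample mean and that of Σ is the average outer product of the centered observations, with divisor n rather than n − 1. The function name mvn_mle and the tiny data set are illustrative assumptions.

```python
import numpy as np

def mvn_mle(X):
    """Maximum-likelihood estimates of the mean and covariance.

    X is an (n_samples, n_features) array of independent observations.
    The covariance estimate divides by n, not n - 1, so it is biased.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu_hat = X.mean(axis=0)
    centered = X - mu_hat
    Sigma_hat = centered.T @ centered / n
    return mu_hat, Sigma_hat

# Four illustrative observations at the corners of a square.
data = np.array([[0.0, 0.0],
                 [2.0, 0.0],
                 [0.0, 2.0],
                 [2.0, 2.0]])
mu_hat, Sigma_hat = mvn_mle(data)
```

The same covariance estimate is returned by np.cov(data, rowvar=False, bias=True); with the default bias=False, NumPy divides by n − 1 instead.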


This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia article. Browse Wikipedia for more information.