Thanks to convolution, we can obtain the probability distribution of a sum of independent random variables.

The Convolution Series

  1. Definition of convolution and intuition behind it
  2. Mathematical properties of convolution
  3. Convolution property of Fourier, Laplace, and z-transforms
  4. Identity element of the convolution
  5. Star notation of the convolution
  6. Circular vs. linear convolution
  7. Fast convolution
  8. Convolution vs. correlation
  9. Convolution in MATLAB, NumPy, and SciPy
  10. Deconvolution: Inverse convolution
  11. Convolution in probability: Sum of independent random variables

So far in this series, we have looked into various aspects of convolution. One of its important applications is in probability: thanks to convolution, we can obtain the probability density function (pdf) of a sum of two independent random variables (RVs). It turns out that the pdf of that sum is the convolution of the pdfs of the two random variables.

In this article, we prove this theorem. The proof takes advantage of the convolution property of the Fourier transform.

Convolution Theorem in Probability

The probability density function of a sum of statistically independent random variables is the convolution of the contributing probability density functions.
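Written out for two independent random variables $X$ and $Y$, the theorem reads as follows. The integration variable $\tau$ is a notational choice made here for illustration; it does not appear elsewhere in the article.

$$f_{X+Y}(x) = (f_X \ast f_Y)(x) = \int \limits_{-\infty}^{\infty} f_X(\tau) f_Y(x - \tau) \, d\tau.$$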

Proof

Before we conduct the actual proof, we need to introduce the concept of the characteristic function.

The Characteristic Function

The characteristic function $\Phi_X(j \omega)$ of a random variable $X$ is the Fourier transform of its probability density function $f_X$ with a negated argument $x$:

$$\Phi_X(j \omega) = \mathbb{E} \left[ e^{j\omega X} \right] = \int \limits_{-\infty}^{\infty} f_X(x) e^{j\omega x} \, dx = \int \limits_{-\infty}^{\infty} f_X(-x) e^{-j \omega x} \, dx = \mathcal{F} \{f_X(-x)\}. \quad (1)$$

Replacing $\omega$ with $-\omega$ in the first integral of Equation 1 turns the kernel into $e^{-j \omega x}$, so that

$$\Phi_X(-j \omega) = \mathcal{F} \{f_X(x)\}. \quad (2)$$
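Equations 1 and 2 can be checked numerically. The sketch below is only an illustration under assumed choices: the exponential pdf $f_X(x) = e^{-x}$ for $x \geq 0$, the grid, the value of `omega`, and the helper `trapezoid` are not part of the original proof. For this pdf, the closed form $\Phi_X(j \omega) = 1 / (1 - j \omega)$ is known, so the Fourier transform of $f_X$ should come out as $1 / (1 + j \omega)$.

```python
import numpy as np

def trapezoid(y, dx):
    """Trapezoidal rule on a uniform grid with spacing dx (hypothetical helper)."""
    return dx * (np.sum(y) - 0.5 * (y[0] + y[-1]))

# Assumed example pdf: exponential, f_X(x) = exp(-x) for x >= 0.
x = np.linspace(0.0, 40.0, 40001)
dx = x[1] - x[0]
f_X = np.exp(-x)

omega = 1.5  # arbitrary test frequency

# Equation 1: Phi_X(j*omega) = E[exp(j*omega*X)]
phi_pos = trapezoid(f_X * np.exp(1j * omega * x), dx)

# Equation 2: the Fourier transform of f_X equals Phi_X(-j*omega)
fourier = trapezoid(f_X * np.exp(-1j * omega * x), dx)

# Closed form of the characteristic function for this pdf
phi_closed = 1.0 / (1.0 - 1j * omega)

print("error in Phi_X(j*omega): ", abs(phi_pos - phi_closed))
print("error in Phi_X(-j*omega):", abs(fourier - 1.0 / (1.0 + 1j * omega)))
```

An asymmetric pdf is used on purpose: for a pdf symmetric around zero, $\Phi_X(j \omega)$ and $\Phi_X(-j \omega)$ coincide and the sign of the argument would not be visible.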

Another building block of the proof is the independence assumption, which we examine next.

Independence of Random Variables

Two random variables are called statistically independent if their joint probability density function factorizes into the product of their individual pdfs.

If we have two RVs, $X$ and $Y$, they are independent if and only if

$$f_{XY}(x,y) = f_X(x) f_Y(y), \quad (3)$$

where $f_{XY}$ is the joint pdf of $X$ and $Y$ (probability density of all possible combinations of $X$ and $Y$ values).

Sum of Two Independent Random Variables

Now to the main part of the proof!

We have two independent random variables, $X$ and $Y$, with probability density functions $f_X$ and $f_Y$, respectively. We want to find the probability density function of the sum of $X$ and $Y$, i.e., a formula for $f_{X+Y}$. To discover that formula, we calculate the characteristic function of $X+Y$:

$$\begin{aligned}
\Phi_{X+Y}(j \omega) &= \mathbb{E} \left[ e^{j\omega (X+Y)} \right] \\
&= \int \limits_{-\infty}^{\infty} \int \limits_{-\infty}^{\infty} f_{XY}(x, y) e^{j\omega (x+y)} \, dx \, dy \\
&= \int \limits_{-\infty}^{\infty} f_X(x) e^{j\omega x} \, dx \int \limits_{-\infty}^{\infty} f_Y(y) e^{j\omega y} \, dy \\
&= \mathbb{E} \left[ e^{j\omega X} \right] \mathbb{E} \left[ e^{j\omega Y} \right] = \Phi_X(j \omega) \Phi_Y(j \omega). \quad (4)
\end{aligned}$$

Note that we could separate the double integral only because the two random variables are independent: independence lets us split $f_{XY}$ into the product of $f_X$ and $f_Y$ (Equation 3).
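Equation 4 can be illustrated with a small Monte Carlo experiment. This is a sketch under assumptions: the two distributions (uniform and exponential), the sample size, the seed, and the helper `empirical_cf` are arbitrary choices made here, not part of the proof. The last lines show that the product rule fails for a clearly dependent pair (here $Y = X$).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed example distributions, drawn independently of each other.
X = rng.uniform(0.0, 1.0, n)
Y = rng.exponential(1.0, n)

def empirical_cf(samples, omega):
    """Sample-mean estimate of E[exp(j*omega*X)] (hypothetical helper)."""
    return np.mean(np.exp(1j * omega * samples))

omega = 2.0  # arbitrary test frequency

# Independent case: both sides of Equation 4 should agree up to sampling noise.
lhs = empirical_cf(X + Y, omega)
rhs = empirical_cf(X, omega) * empirical_cf(Y, omega)
print("independent, |lhs - rhs|:", abs(lhs - rhs))

# Dependent case (Y = X): the characteristic function of the sum
# is no longer the product of the individual characteristic functions.
dep_gap = abs(empirical_cf(X + X, omega) - empirical_cf(X, omega) ** 2)
print("dependent,   |lhs - rhs|:", dep_gap)
```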

Convolution Property of the Fourier Transform

We found that the characteristic function of a sum of two independent random variables equals the product of the individual characteristic functions of these random variables (Equation 4). Additionally, the characteristic function of a random variable with a negated argument is the Fourier transform of this RV’s probability density function (Equation 2). We thus have

$$f_{X+Y}(x) \stackrel{\mathcal{F}}{\longleftrightarrow} \Phi_{X+Y}(-j \omega), \quad (5)$$

and, replacing $\omega$ with $-\omega$ in Equation 4,

$$\Phi_{X+Y}(-j \omega) = \Phi_{X}(-j \omega) \Phi_{Y}(-j \omega). \quad (6)$$

The convolution property of the Fourier transform tells us that multiplication in the Fourier domain is equivalent to convolution in the other domain (here: the domain of the random variable). Therefore,

$$\Phi_{X+Y}(-j \omega) = \Phi_{X}(-j \omega) \Phi_{Y}(-j \omega) \stackrel{\mathcal{F}}{\longleftrightarrow} f_X(x) \ast f_Y(x) = f_{X+Y}(x), \quad (7)$$

which concludes the proof. $\Box$

Note: $x$ is used instead of $y$ as the argument of $f_Y$ in Equation 7 because it doesn’t matter what letter we use; $f_X$, $f_Y$, and $f_{X+Y}$ are all pdfs of one-dimensional random variables.
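The result can be tried out on sampled pdfs. The sketch below assumes two Uniform(0, 1) random variables, whose sum is known to have a triangular pdf on $[0, 2]$; the grid size `N` and all variable names are choices made for this illustration. It computes the pdf of the sum twice: directly with `np.convolve`, and by multiplying discrete Fourier transforms as in Equation 7. The factor `dx` approximates the $dx$ of the convolution integral.

```python
import numpy as np

N = 1000
dx = 1.0 / N

# Samples of the Uniform(0, 1) pdf on the grid 0, dx, ..., 1 - dx (assumed example).
f_X = np.ones(N)
f_Y = np.ones(N)

# Route 1: convolution in the domain of the random variable.
f_direct = np.convolve(f_X, f_Y) * dx

# Route 2: multiplication in the Fourier domain, then inverse transform.
# Zero-padding to the full output length L avoids circular wrap-around.
L = f_X.size + f_Y.size - 1
f_fft = np.fft.irfft(np.fft.rfft(f_X, n=L) * np.fft.rfft(f_Y, n=L), n=L) * dx

# Reference: triangular pdf of the sum of two Uniform(0, 1) variables.
z = np.arange(L) * dx
triangular = np.where(z <= 1.0, z, 2.0 - z)

print("max |direct - fft|:       ", np.max(np.abs(f_direct - f_fft)))
print("max |direct - triangular|:", np.max(np.abs(f_direct - triangular)))
print("area under f_direct:      ", np.sum(f_direct) * dx)
```

The deviation from the triangular pdf is a discretization error of the order of `dx`; it shrinks as the grid is refined.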

Final Remark

This proof can be extended to any finite number of random variables, provided that all of them are mutually independent: the pdf of the sum is then the convolution of all the individual pdfs.
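For several mutually independent random variables, the pdfs can be convolved one after another. The following sketch assumes three Uniform(0, 1) variables; the helper `pdf_of_sum` and the grid are hypothetical choices for this illustration. As a sanity check, it compares the mean and variance of the resulting pdf with the values expected for a sum of independent variables, $3 \cdot 1/2$ and $3 \cdot 1/12$.

```python
from functools import reduce

import numpy as np

N = 1000
dx = 1.0 / N
uniform_pdf = np.ones(N)  # samples of the Uniform(0, 1) pdf (assumed example)

def pdf_of_sum(pdfs, dx):
    """Convolve sampled pdfs one after another; each convolution is scaled by dx."""
    return reduce(lambda f, g: np.convolve(f, g) * dx, pdfs)

# pdf of X1 + X2 + X3 for three mutually independent Uniform(0, 1) variables
f_sum = pdf_of_sum([uniform_pdf] * 3, dx)

z = np.arange(f_sum.size) * dx
mean = np.sum(z * f_sum) * dx
variance = np.sum((z - mean) ** 2 * f_sum) * dx

print("area:    ", np.sum(f_sum) * dx)
print("mean:    ", mean)      # expected to be close to 1.5
print("variance:", variance)  # expected to be close to 0.25
```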

Summary

In this article, we have proven that the probability density function of a sum of independent random variables is the convolution of the probability density functions of these random variables.

Bibliography

[1] Walter Kellermann, Statistical Signal Processing Lecture Notes, Winter Semester 2019/2020, University of Erlangen-Nürnberg.