Inspecting the mathematical properties of convolution leads to interesting conclusions regarding digital signal processing.

In the previous article we discussed the definition of the convolution operation. Now, it is time to look more closely at its mathematical properties in the context of digital signal processing.

The Convolution Series

  1. Definition of convolution and intuition behind it
  2. Mathematical properties of convolution
  3. Convolution property of Fourier, Laplace, and z-transforms
  4. Identity element of the convolution
  5. Star notation of the convolution
  6. Circular vs. linear convolution
  7. Fast convolution
  8. Convolution vs. correlation
  9. Convolution in MATLAB, NumPy, and SciPy
  10. Deconvolution: Inverse convolution
  11. Convolution in probability: Sum of independent random variables


Let us recap the definition of the discrete convolution. With discrete signals x[n],h[n]x[n], h[n] being square-summable, their convolution is defined as x[n]h[n]=k=x[k]h[nk]=y[n],nZ.(1) x[n] \ast h[n] = \sum_{k=-\infty}^{\infty} x[k] h[n - k] = y[n], \quad n \in \mathbb{Z}. \quad (1) Convolution for continuous, square-integrable x,hx, h is defined as follows x(t)h(t)=x(τ)h(tτ)dτ,tR.(2) x(t) \ast h(t) = \int \limits_{-\infty}^{\infty} x(\tau) h(t - \tau) d\tau, \quad t \in \mathbb{R}. \quad (2)

In the following considerations we assume, that xx is some signal (e.g., an audio signal) and hh, h1h_1, h2h_2 are impulse responses of some filters.

Properties of convolution

In this article the following properties of the convolution are discussed

  • commutativity: xh=hxx \ast h = h \ast x
  • associativity: x(h1h2)=(xh1)h2x \ast (h_1 \ast h_2) = (x \ast h_1) \ast h_2
  • linearity:
    (αx1+βx2)h=α(x1h)+β(x2h)(\alpha x_1 + \beta x_2) \ast h = \alpha (x_1\ast h) + \beta (x_2 \ast h)

(Here, xx and hh can be both discrete or both continuous.)

Their formulations and proofs are provided for the discrete as well as continuous cases.


Commutativity of an operation means that its operands can be exchanged without affecting the result

xh=hx.(3) x \ast h = h \ast x. \quad (3)

This property has a very interesting interpretation in the context of signal processing. It turns out, we can interpret a system’s impact on a signal as the signal’s impact on the system’s impulse response. In particular, the filtering operation can be viewed as if the input signal was filtering the filter’s impulse response (Figure 1). As we have already seen in the previous article, it is completely true: the output of a filter is a sum of its repeatedly scaled and delayed (=filtered) impulse response.

Figure 1. Commutativity of convolution enables an exchange of filter with its input.

Another, even more profound interpretation of the commutativity of convolution is shown in Figure 2.

Figure 2. Commutativity of convolution makes the ordering of filters in a series irrelevant.

The commutativity property means that we can exchange the order in which we apply filters (here represented by impulse responses h1[n]h_1[n] and h2[n]h_2[n]). It doesn’t matter whether we filter input with h1h_1 and then with h2h_2 or the other way around; the result will be the same.

Proof for the discrete case

x[n]h[n]=k=x[k]h[nk]=k=nk;k=nk=k=x[nk]h[k]=k=h[k]x[nk]=h[n]x[n]. x[n] \ast h[n] = \sum_{k=-\infty}^{\infty} x[k] h[n - k] = | k' = n-k; k = n - k'| \\ =\sum_{k'=-\infty}^{\infty} x[n-k'] h[k'] = \sum_{k'=-\infty}^{\infty} h[k'] x[n-k'] = h[n] \ast x[n]. \quad \Box

Proof for the continuous case

x(t)h(t)=x(τ)h(tτ)dτ=τ=tτ;τ=tτ;dτ=dτ=x(tτ)h(τ)dτ=h(τ)x(tτ)dτ=h(t)x(t). x(t) \ast h(t) = \int \limits_{-\infty}^{\infty} x(\tau) h(t - \tau) d\tau \\ = | \tau' = t - \tau; \tau = t - \tau'; d\tau' = -d\tau | \\ = - \int \limits_{\infty}^{-\infty} x(t - \tau') h(\tau') d\tau' = \int \limits_{-\infty}^{\infty} h(\tau') x(t - \tau') d\tau' \\= h(t) \ast x(t). \quad \Box

Note the inversion of boundaries and the resulting change of the sign.


Associativity of an operation ensures that we can calculate the results of this operation in any order when given a couple of them in series x(h1h2)=(xh1)h2.(4)x \ast (h_1 \ast h_2) = (x \ast h_1) \ast h_2. \quad (4)

A practical interpretation of this would be that a series of filters applied one after another constitutes an equivalent system to a system whose impulse response is a convolution of the filters’ impulse responses. So the impulse response of filters arranged in a series is a convolution of their impulse responses (Figure 3).

Figure 3. Associativity of the convolution enables us to exchange successive filters with a single filter whose impulse response is a convolution of the initial filters’ impulse responses.

Proof for the discrete case

x[n](h1[n]h2[n])=x[n]k=h2[k]h1[nk]=l=k=x[l]h2[k]h1[nlk]=k=h2[k]l=x[l]h1[nkl]=k=h2[k](xh1)[nk]=((xh1)h2)[n]=(x[n]h1[n])h2[n].x[n] \ast (h_1[n] \ast h_2[n]) = x[n] \ast \sum_{k=-\infty}^{\infty} h_2[k]h_1[n-k] \\= \sum_{l=-\infty}^{\infty}\sum_{k=-\infty}^{\infty} x[l]h_2[k]h_1[n-l-k] \\= \sum_{k=-\infty}^{\infty} h_2[k] \sum_{l=-\infty}^{\infty} x[l]h_1[n-k-l] \\= \sum_{k=-\infty}^{\infty} h_2[k] (x \ast h_1)[n-k] \\= ((x \ast h_1) \ast h_2) [n] \\= (x[n] \ast h_1[n]) \ast h_2[n]. \quad \Box

Proof for the continuous case

x(t)(h1(t)h2(t))=x(t)h2(τ)h1(tτ)dτ=x(ψ)h2(τ)h1(tψτ)dτdψ=h2(τ)x(ψ)h1(tτψ)dψdτ=h2(τ)(xh1)(tτ)dτ=((xh1)h2)(t)=(x(t)h1(t))h2(t).x(t) \ast (h_1(t) \ast h_2(t)) \\= x(t) \ast \int \limits_{-\infty}^{\infty} h_2(\tau) h_1(t - \tau) d\tau \\= \int \limits_{-\infty}^{\infty} x(\psi) \int \limits_{-\infty}^{\infty} h_2(\tau) h_1(t - \psi - \tau) d\tau d\psi \\= \int \limits_{-\infty}^{\infty} h_2(\tau) \int \limits_{-\infty}^{\infty} x(\psi)h_1(t - \tau - \psi) d \psi d \tau \\= \int \limits_{-\infty}^{\infty} h_2(\tau) (x \ast h_1)(t - \tau) d\tau = ((x \ast h_1) \ast h_2)(t) \\= (x(t) \ast h_1(t)) \ast h_2(t). \quad \Box


The last property to be examined and proved is the linearity property of the convolution a(x(h1+h2))=(axh1)+(axh2)(5) a(x \ast (h_1 + h_2)) = (ax \ast h_1) + (ax \ast h_2) \quad (5) which consists of distributivity x(h1+h2)=xh1+xh2(6) x \ast (h_1 + h_2) = x \ast h_1 + x \ast h_2 \quad (6) and associativity with scalar multiplication a(xh)=(ax)h.(7) a (x \ast h) = (ax) \ast h. \quad (7)

Distributivity means that filtering a signal with a superposition (a sum) of filters is equivalent to summing the results of filtering with the individual filters. It is as if the signal was filtered by them independently (in parallel). Thus, summing in digital signal processing corresponds to joining two independent paths of processing.

Thanks to this property we can filter the signal only once (with filters’ superposition) rather than run the signal through each filter separately and only then sum the outputs. As you can imagine, it can be a significant optimization advantage.

Sometimes, however, it may be easier to analyze a complex filter by splitting its impulse response into constituent impulse responses, effectively replacing a filter with a set of parallel filters. This is the idea underlying the filterbanks. It is also used in synthesis of a room impulse response for artificial reverberation (so-called modal decomposition).

Associativity with scalar multiplication simply means that it doesn’t matter if we scale the input signal (before filtering) or the output signal (after filtering); the resulting signal will be the same.

These equivalencies are depicted in Figure 4.

Figure 4. Distributivity means that a signal filtered in parallel processing paths is effectively filtered by a superposition of these paths. Associativity with scalar multiplication enables us to scale either the input or the output of a system with an equal effect. These two properties determine the linearity of the convolution.

Proof for the discrete case

a(x[n](h1[n]+h2[n]))=ak=x[k](h1[nk]+h2[nk])=k=ax[k]h1[nk]+ax[k]h2[nk]=k=ax[k]h1[nk]+k=ax[k]h2[nk]=(ax[n])h1[n]+(ax[n])h2[n].a(x[n] \ast (h_1[n] + h_2[n])) \\= a \sum_{k=-\infty}^{\infty} x[k] (h_1[n-k] + h_2[n-k]) \\= \sum_{k=-\infty}^{\infty} ax[k]h_1[n-k] + ax[k]h_2[n-k] \\= \sum_{k=-\infty}^{\infty} ax[k]h_1[n-k] + \sum_{k=-\infty}^{\infty} ax[k]h_2[n-k] \\= (ax[n]) \ast h_1[n] + (ax[n]) \ast h_2[n]. \quad \Box

Proof for the continuous case

a(x(t)(h1(t)+h2(t)))=ax(τ)(h1(tτ)+h2(tτ))dτ=(ax(τ)h1(tτ)+ax(τ)h2(tτ))dτ=ax(τ)h1(tτ)dτ+ax(τ)h2(tτ)dτ=(ax(t))h1(t)+(ax(t))h2(t).a(x(t) \ast (h1(t) + h2(t))) = a \int \limits_{-\infty}^{\infty} x(\tau) (h_1(t - \tau) + h_2(t - \tau)) d\tau \\= \int \limits_{-\infty}^{\infty} (ax(\tau) h_1(t - \tau) + ax(\tau)h_2(t - \tau))d \tau \\= \int \limits_{-\infty}^{\infty} a x(\tau) h_1(t - \tau) d\tau + \int \limits_{-\infty}^{\infty} ax(\tau) h_2(t - \tau) d\tau \\= (ax(t)) \ast h_1(t) + (ax(t)) \ast h_2(t). \quad \Box


In this article we reviewed the most important mathematical properties of the convolution, namely

  • commutativity,
  • associativity, and
  • linearity.

These properties will prove themselves useful in our future considerations of convolution.


[1] Convolution on Wikipedia. Retrieved: 09.03.2021.

[2] Alan V Oppenheim, Ronald W. Schafer Discrete-Time Signal Processing, 3rd Edition, Pearson 2010.

[3] Alan V. Oppenheim, Alan S. Willsky, with S. Hamid Signals and Systems, 2nd Edition, Pearson 1997.

[4] Commutativity proof for the continuous case on Mathematics StackExchange.