Using Chebyshev’s inequality and the Borel-Cantelli lemma to study the variation of Brownian motion


Chebyshev’s inequality and the Borel-Cantelli lemma are seemingly disparate results from probability theory but they combine beautifully in demonstrating a curious property of Brownian motion: that it has finite quadratic variation even though it has unbounded linear variation. Not only do the proofs of Chebyshev’s inequality and the Borel-Cantelli lemma have some interesting features themselves, but their application to the variation of Brownian motion also provides a nice example of how to prove explicitly that a sequence of random variables converges almost surely by showing that the probability of the divergence set is zero. In this note I want to bring out this last aspect in particular.

There are two equivalent forms of Chebyshev’s inequality. Letting X denote an integrable random variable with mean \mu and finite non-zero variance \sigma^2, one form of Chebyshev’s inequality says

P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}

where k > 0 is a real number.

Equivalently, if we let \epsilon = k\sigma, then we can write Chebyshev’s inequality as

P(|X - \mu| \geq \epsilon) \leq \frac{\sigma^2}{\epsilon^2}

Proof of Chebyshev’s inequality

The following is a proof of the second form. Since \epsilon > 0, we have

|X - \mu| \geq \epsilon \iff (X - \mu)^2 \geq \epsilon^2

Let g(t) = (t - \mu)^2 so that \mathbb{E}[g(X)] = \sigma^2.


Define the event

A = \{\omega : g(X(\omega)) \geq \epsilon^2\}

\equiv \{\omega : (X - \mu)^2 \geq \epsilon^2\}

\equiv \{\omega : |X - \mu| \geq \epsilon\}

If \omega \in A then I_A(\omega) = 1 and

\epsilon^2I_A(\omega) \leq g(X(\omega))

(by definition of A). However, if \omega \in A^c then I_A(\omega) = 0 and, since g \geq 0, it is still true that

\epsilon^2I_A(\omega) \leq g(X(\omega))

Thus, this inequality always holds. Taking expectations we get

\mathbb{E}[\epsilon^2I_A(\omega)] = \epsilon^2P(A) = \epsilon^2P(|X - \mu| \geq \epsilon) \leq \mathbb{E}[g(X(\omega))] = \sigma^2

Division of both sides by \epsilon^2 gives Chebyshev’s inequality. QED
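As an illustrative numerical check (not part of the proof), the following Python sketch compares the empirical tail probabilities of a standard normal sample against the Chebyshev bound \sigma^2/\epsilon^2. The distribution, seed, and sample size are arbitrary choices of mine:

```python
import random

random.seed(0)
mu, sigma = 0.0, 1.0          # mean and standard deviation of the sampled distribution
n_samples = 100_000
xs = [random.gauss(mu, sigma) for _ in range(n_samples)]

for k in (1.5, 2.0, 3.0):
    eps = k * sigma
    # empirical estimate of P(|X - mu| >= eps)
    empirical = sum(1 for x in xs if abs(x - mu) >= eps) / n_samples
    # Chebyshev: P(|X - mu| >= eps) <= sigma^2 / eps^2
    bound = sigma ** 2 / eps ** 2
    print(f"k={k}: empirical tail {empirical:.4f} <= Chebyshev bound {bound:.4f}")
    assert empirical <= bound
```

For a normal distribution the bound is far from tight (the true tails decay much faster than 1/k^2), which is expected: Chebyshev's inequality holds for every distribution with finite variance, so it cannot be sharp for any particular one.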

The Borel-Cantelli lemma actually consists of a pair of results. Let \{A_k : 1 \leq k\} be a sequence of events, i.e., a sequence of subsets of \Omega. Then

\limsup \limits_{ k } A_k \equiv \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k

is the set of all elements \omega \in \Omega that are contained in infinitely many of the sets A_k. The Borel-Cantelli lemma says the following:

(a) If \sum_{k=1}^{\infty} P(A_k) < \infty then P\bigg(\limsup \limits_{ k } A_k\bigg) = 0

(b) If \{A_k : 1 \leq k\} is independent and \sum_{k=1}^{\infty}P(A_k) = \infty then P\bigg(\limsup \limits_{ k } A_k\bigg) = 1

Proof of the Borel-Cantelli lemma

(a) For any integer n \geq 1, we can write

0 \leq P\bigg(\limsup \limits_{ k } A_k\bigg) = P\big(\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k\big) \leq P\big(\bigcup_{k=n}^{\infty} A_k\big) \leq \sum_{k=n}^{\infty}P(A_k)

(the last inequality by subadditivity). Since \sum_{k=1}^{\infty} P(A_k) converges, the tail sums \sum_{k=n}^{\infty}P(A_k) \rightarrow 0 as n \rightarrow \infty. Therefore (by a ‘sandwich’ argument) P\bigg(\limsup \limits_{ k } A_k\bigg) = 0.

(b) We have

\bigg(\limsup \limits_{ k } A_k\bigg)^c = (\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k)^c

= \bigcup_{n=1}^{\infty} (\bigcup_{k=n}^{\infty} A_k)^c = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k^c

\equiv \bigcup_{n=1}^{\infty} F_n

(by De Morgan’s laws). Therefore

1 - P\bigg(\limsup \limits_{ k } A_k\bigg) = P\bigg(\bigg(\limsup \limits_{ k } A_k\bigg)^c\bigg) = P(\bigcup_{n=1}^{\infty} F_n) \leq \sum_{n=1}^{\infty}P(F_n)

By the product rule for independent events,

P(F_n) = P(\bigcap_{k=n}^{\infty} A_k^c) = \prod_{k=n}^{\infty}P(A_k^c) = \prod_{k=n}^{\infty}(1 - P(A_k))

Now, for all a \geq 0 we have 1 - a \leq e^{-a}. (To see this, let f(a) = e^{-a} - (1 - a). Then f(0) = 0 and f^{\prime}(a) = 1 - e^{-a} \geq 0 for a \geq 0. Therefore f is non-decreasing, so f(a) \geq f(0) = 0 for a \geq 0.) It follows that

1 - P(A_k) \leq \exp(-P(A_k))

and therefore also

\prod_{k=n}^{N}(1 - P(A_k)) \leq \exp(-\sum_{k=n}^N P(A_k))

But \exp(-\sum_{k=n}^N P(A_k)) \rightarrow 0 as N \rightarrow \infty, since \sum_{k=1}^{\infty} P(A_k) diverges. Letting N \rightarrow \infty in the product therefore gives P(F_n) = 0 for every n, hence \sum_{n=1}^{\infty}P(F_n) = 0, so 1 - P\bigg(\limsup \limits_{ k } A_k\bigg) = 0 \implies P\bigg(\limsup \limits_{ k } A_k\bigg) = 1. QED
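Both halves of the lemma can be illustrated numerically. In the following Python sketch the events are A_k = \{U_k < p_k\} with the U_k independent and uniform on [0, 1]; the choices p_k = 1/k^2 (summable) and p_k = 1/k (divergent), the seed, and the horizon N are all arbitrary choices of mine:

```python
import random

random.seed(1)
N = 100_000

# (a) summable case: sum of 1/k^2 converges, so almost surely only
#     finitely many of the events occur; the count stays small.
hits_a = sum(1 for k in range(1, N + 1) if random.random() < 1 / k ** 2)

# (b) divergent independent case: sum of 1/k diverges, so almost surely
#     infinitely many events occur; the count grows roughly like log N.
hits_b = sum(1 for k in range(1, N + 1) if random.random() < 1 / k)

print(hits_a, hits_b)
```

A finite simulation cannot, of course, verify an "infinitely often" statement; it only shows the expected qualitative contrast between the two counts.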

To apply these results to the variation of Brownian motion, let

\triangle_n \equiv \{a = t_0 < t_1 < \cdots < t_n = b\}

be a partition of the interval [a, b]. Define the mesh of the partition \triangle_n by

||\triangle_n|| \equiv \max_{1 \leq i \leq n}(t_i - t_{i-1})

Let us restrict ourselves only to sequences of partitions \{\triangle_n : 1 \leq n\} for which ||\triangle_n|| \rightarrow 0, and in particular for which \sum_{n=1}^{\infty} ||\triangle_n|| < \infty. For every p > 0 let

Q_p(f; a, b, \triangle_n) \equiv \sum_{i=1}^{n}|f(t_i) - f(t_{i-1})|^p

and let

V_p(f; a, b) \equiv \lim \limits_{||\triangle_n|| \rightarrow 0} Q_p(f; a, b, \triangle_n)

where f is a real-valued function defined on [a, b]. If V_p(f; a, b) < \infty we will say that f has finite p^{th} variation on [a, b]. In the case p = 1, if V_1(f; a, b) < \infty we will say that f is of bounded linear variation on [a, b]. In the case p = 2, if V_2(f; a, b) < \infty we will say that f is of bounded quadratic variation on [a, b].
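For a smooth function the behaviour of these sums is easy to see concretely. The Python sketch below computes Q_1 and Q_2 for the (arbitrarily chosen) function f(t) = t^2 on [0, 1] over uniform partitions: Q_1 converges to \int_0^1 |f^{\prime}(t)|\,dt = 1, while Q_2 shrinks to zero as the mesh does.

```python
def Q(f, p, grid):
    """Sum of |f(t_i) - f(t_{i-1})|^p over the partition points in grid."""
    return sum(abs(f(grid[i]) - f(grid[i - 1])) ** p for i in range(1, len(grid)))

def f(t):
    return t * t

for n in (10, 100, 1000):
    grid = [i / n for i in range(n + 1)]   # uniform partition of [0, 1]
    print(n, Q(f, 1, grid), Q(f, 2, grid))
```

This is the typical situation for continuously differentiable functions: finite linear variation forces the quadratic variation to vanish, which is exactly the dichotomy that Brownian motion violates below.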

If we replace f in the above by a Brownian motion B, then we find a curious result: B is of unbounded linear variation almost surely, but it has a finite quadratic variation equal to b - a almost surely. In other words,

V_1(B; a, b) = \infty    (a.s.)

V_2(B; a, b) = b - a    (a.s.)

We will use Chebyshev’s inequality and the Borel-Cantelli lemma to prove these.
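Before the proofs, a quick simulation sketch of both claims. The path construction (independent Gaussian increments over a uniform partition), the interval [0, 1], the seed, and the partition sizes are my own illustrative choices:

```python
import math
import random

random.seed(2)
a, b = 0.0, 1.0

def brownian_increments(n):
    """Increments of a Brownian path over a uniform n-interval partition of [a, b]."""
    dt = (b - a) / n
    return [random.gauss(0.0, math.sqrt(dt)) for _ in range(n)]

for n in (100, 1000, 10000):
    inc = brownian_increments(n)
    q1 = sum(abs(d) for d in inc)   # Q_1: linear variation sum
    q2 = sum(d * d for d in inc)    # Q_2: quadratic variation sum
    print(n, round(q1, 3), round(q2, 3))
```

As n grows, q2 stays close to b - a = 1, while q1 grows roughly like \sqrt{n} (each increment has size of order \sqrt{dt}, and there are n of them).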

Theorem 1

If \{\triangle_n : 1 \leq n\} is a sequence of partitions of [a, b] with \sum_{n=1}^{\infty} ||\triangle_n|| < \infty (and therefore ||\triangle_n|| \rightarrow 0), then

Q_2(B; a, b, \triangle_n) = \sum_{i=1}^n |B(t_i) - B(t_{i-1})|^2 \rightarrow b - a (a.s.)

as n \rightarrow \infty. In other words, Brownian motion has bounded quadratic variation on [a, b] equal to b - a, almost surely.


Proof of Theorem 1

Observe that, by ‘telescoping’,

\sum_{i=1}^{n}(t_i - t_{i-1}) = b - a


Now define

Y_n = \sum_{i=1}^{n}(B(t_i) - B(t_{i-1}))^2 - (b - a)

= \sum_{i=1}^{n}\big\{(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})\big\}

Note that \mathbb{E}[Y_n] = 0 because \mathbb{E}[(B(t_i) - B(t_{i-1}))^2] = t_i - t_{i-1}. Therefore

\mathbb{V}[Y_n] = \mathbb{E}[Y_n^2]

= \mathbb{E}\big[\big(\sum_{i=1}^n \big\{(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})\big\} \big)^2 \big]

= \sum_{i=1}^n \mathbb{E}\big[\big\{(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})\big\}^2 \big]

(the cross-product terms vanish since Brownian motion has independent increments and each bracketed term has mean zero)

= \sum_{i=1}^n \mathbb{E}\big[(B(t_i) - B(t_{i-1}))^4 - 2(t_i - t_{i-1})(B(t_i) - B(t_{i-1}))^2 + (t_i - t_{i-1})^2 \big]

= \sum_{i=1}^n \big(3(t_i - t_{i-1})^2 - 2(t_i - t_{i-1})^2 + (t_i - t_{i-1})^2 \big)

(since the fourth moment of a zero-mean normal random variable is 3\sigma^4 and here \sigma^2 = (t_i - t_{i-1}))

= \sum_{i=1}^n 2(t_i - t_{i-1})^2

\leq 2 ||\triangle_n||\sum_{i=1}^n (t_i - t_{i-1})

= 2(b - a)||\triangle_n||

(Since ||\triangle_n|| \rightarrow 0, we see already at this stage that Y_n \rightarrow 0 in \mathcal{L}^2).
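The bound \mathbb{E}[Y_n^2] \leq 2(b - a)||\triangle_n|| can be sanity-checked by Monte Carlo. The Python sketch below estimates \mathbb{E}[Y_n^2] for a uniform partition of [0, 1], where all subintervals have length equal to the mesh and the bound is in fact attained exactly; the partition size, seed, and trial count are my own illustrative choices:

```python
import math
import random

random.seed(3)
a, b = 0.0, 1.0
n, trials = 50, 20_000
dt = (b - a) / n                  # mesh of the uniform partition

def sample_Y():
    """One draw of Y_n = sum over the partition of (increment^2 - dt)."""
    return sum(random.gauss(0.0, math.sqrt(dt)) ** 2 - dt for _ in range(n))

# Monte Carlo estimate of E[Y_n^2], to compare with 2(b - a)||mesh||
estimate = sum(sample_Y() ** 2 for _ in range(trials)) / trials
bound = 2 * (b - a) * dt
print(estimate, bound)
```

For the uniform partition the exact variance is 2\sum_i (t_i - t_{i-1})^2 = 2n\,dt^2 = 2(b-a)\,dt, so the estimate should land close to the bound rather than strictly below it.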

By the second form of Chebyshev’s inequality we can now write for any \epsilon > 0 that

\sum_{n=1}^{\infty}P(|Y_n| > \epsilon) \leq \sum_{n=1}^{\infty} \frac{\mathbb{E}[Y_n^2]}{\epsilon^2}

\leq \frac{2(b - a)}{\epsilon^2}\sum_{n=1}^{\infty}||\triangle_n|| < \infty

In particular, we can write for any integer q \geq 1 that

\sum_{n=1}^{\infty} P\big(|Y_n| > \frac{1}{q}\big) < \infty

By the first result in the Borel-Cantelli lemma we can then say

P\bigg(\limsup \limits_{ k } A_k(q)\bigg) = 0


where

A_k(q) = \big\{\omega : |Y_k| > \frac{1}{q}\big\}


Note that

\limsup \limits_{ k } A_k(q) = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty}A_k(q)

so if P\bigg(\limsup \limits_{ k } A_k(q)\bigg) = 0 for every q \geq 1 then, by countable subadditivity, we must also have

P\big(\bigcup_{q=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty}A_k(q)\big) = 0


The set

\bigcup_{q=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty}A_k(q)

is the divergence set when considering the almost sure convergence of Y_n to zero. Since the probability of this set is zero, we conclude that Y_n \rightarrow 0 a.s., and therefore Q_2(B; a, b, \triangle_n) \rightarrow b - a (a.s.). QED

Theorem 2

If \{\triangle_n : 1 \leq n\} is a sequence of partitions of [a, b] with ||\triangle_n|| \rightarrow 0, then

Q_1(B; a, b, \triangle_n) = \sum_{i=1}^n |B(t_i) - B(t_{i-1})| \rightarrow \infty (a.s.)

as n \rightarrow \infty. In other words, Brownian motion has unbounded linear variation on [a, b], almost surely.


Proof of Theorem 2

By contradiction. Suppose that Brownian motion has bounded linear variation on [a, b], i.e., V_1(B; a, b) < \infty. Then we can write

\sum_{i=1}^n |B(t_i) - B(t_{i-1})|^2 \leq \max_{1 \leq i \leq n}|B(t_i) - B(t_{i-1})|\sum_{i=1}^n |B(t_i) - B(t_{i-1})|

\leq V_1(B; a, b)\max_{1 \leq i \leq n}|B(t_i) - B(t_{i-1})|

Since B is almost surely uniformly continuous on [a, b] (Brownian paths are continuous, and any continuous function is uniformly continuous when restricted to a compact set) we can write

\max_{1 \leq i \leq n}|B(t_i) - B(t_{i-1})| \rightarrow 0 as ||\triangle_n|| \rightarrow 0

which if V_1(B; a, b) < \infty implies

\sum_{i=1}^n |B(t_i) - B(t_{i-1})|^2 \rightarrow 0 (a.s.)

This contradicts Theorem 1. Therefore B cannot have bounded linear variation. QED