# Using Chebyshev’s inequality and the Borel-Cantelli lemma to study the variation of Brownian motion

Chebyshev’s inequality and the Borel-Cantelli lemma are seemingly disparate results from probability theory but they combine beautifully in demonstrating a curious property of Brownian motion: that it has finite quadratic variation even though it has unbounded linear variation. Not only do the proofs of Chebyshev’s inequality and the Borel-Cantelli lemma have some interesting features themselves, but their application to the variation of Brownian motion also provides a nice example of how to prove explicitly that a sequence of random variables converges almost surely by showing that the probability of the divergence set is zero. In this note I want to bring out this last aspect in particular.

There are two equivalent forms of Chebyshev’s inequality. Letting $X$ denote an integrable random variable with mean $\mu$ and finite non-zero variance $\sigma^2$, one form of Chebyshev’s inequality says

$P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$

where $k > 0$ is a real number.

Equivalently, if we let $\epsilon = k\sigma$, then we can write Chebyshev’s inequality as

$P(|X - \mu| \geq \epsilon) \leq \frac{\sigma^2}{\epsilon^2}$
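Both forms are easy to check numerically. The following sketch (using NumPy; the exponential distribution is my arbitrary choice of test case) compares empirical tail probabilities against the Chebyshev bound $\frac{1}{k^2}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential random variable with scale 1: mean mu = 1, variance sigma^2 = 1.
x = rng.exponential(scale=1.0, size=1_000_000)
mu, sigma = 1.0, 1.0

for k in [1.5, 2.0, 3.0]:
    # Empirical tail probability P(|X - mu| >= k*sigma) vs Chebyshev's 1/k^2
    p_emp = np.mean(np.abs(x - mu) >= k * sigma)
    bound = 1.0 / k**2
    print(f"k = {k}: empirical {p_emp:.4f} <= bound {bound:.4f}")
    assert p_emp <= bound
```

As the output shows, the bound is typically far from tight: Chebyshev trades sharpness for complete generality (only a finite variance is assumed).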

Proof of Chebyshev’s inequality

The following is a proof of the second form. Since $\epsilon > 0$, we have

$|X - \mu| \geq \epsilon \iff (X - \mu)^2 \geq \epsilon^2$

Let $g(t) = (t - \mu)^2$ so that $\mathbb{E}[g(X)] = \sigma^2$.

Let

$A = \{\omega : g(X(\omega)) \geq \epsilon^2\}$

$\equiv \{\omega : (X - \mu)^2 \geq \epsilon^2\}$

$\equiv \{\omega : |X - \mu| \geq \epsilon\}$

If $\omega \in A$ then $I_A(\omega) = 1$ and

$\epsilon^2I_A(\omega) \leq g(X(\omega))$

(by definition of $A$). However, if $\omega \in A^c$ then $I_A(\omega) = 0$ so it is still true that

$\epsilon^2I_A(\omega) \leq g(X(\omega))$

Thus, this inequality always holds. Taking expectations we get

$\mathbb{E}[\epsilon^2I_A(\omega)] = \epsilon^2P(A) = \epsilon^2P(|X - \mu| \geq \epsilon) \leq \mathbb{E}[g(X(\omega))] = \sigma^2$

Division of both sides by $\epsilon^2$ gives Chebyshev’s inequality. QED

The Borel-Cantelli lemma actually consists of a pair of results. Let $\{A_k : 1 \leq k\}$ be a sequence of events, i.e., a sequence of subsets of $\Omega$. Then

$\limsup \limits_{ k } A_k \equiv \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$

is the set of all elements $\omega \in \Omega$ that belong to infinitely many of the subsets $A_k$. The Borel-Cantelli lemma says the following:

(a) If $\sum_{k=1}^{\infty} P(A_k) < \infty$ then $P\bigg(\limsup \limits_{ k } A_k\bigg) = 0$

(b) If $\{A_k : 1 \leq k\}$ is a sequence of independent events and $\sum_{k=1}^{\infty}P(A_k) = \infty$ then $P\bigg(\limsup \limits_{ k } A_k\bigg) = 1$
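Both parts can be illustrated by simulation. In the sketch below (my own illustration: the probabilities $1/k^2$ and $1/k$ are standard examples of a summable and a divergent series, and the infinite sequence of events is necessarily truncated at a finite index $K$), we estimate the probability that at least one event occurs beyond a fixed index $n$:

```python
import numpy as np

rng = np.random.default_rng(1)
trials, K, n = 5_000, 2_000, 50
k = np.arange(1, K + 1)

# Part (a): P(A_k) = 1/k^2 is summable, so a.s. only finitely many A_k occur.
# Check the tail bound P(at least one A_k with k >= n) <= sum_{k >= n} P(A_k).
occ = rng.random((trials, K)) < 1.0 / k**2
frac_summable = np.mean(occ[:, n - 1:].any(axis=1))
tail_sum = np.sum(1.0 / k[n - 1:]**2)
assert frac_summable <= tail_sum + 0.01   # small Monte Carlo slack

# Part (b): P(A_k) = 1/k diverges and the A_k are independent, so a.s.
# infinitely many A_k occur; almost every trial should still contain
# events beyond the fixed index n.
occ = rng.random((trials, K)) < 1.0 / k
frac_divergent = np.mean(occ[:, n - 1:].any(axis=1))
print(f"P(some A_k, k >= {n}): summable {frac_summable:.3f}, divergent {frac_divergent:.3f}")
assert frac_divergent > 0.9
```

In the summable case events with large index are rare, in line with part (a); in the divergent case nearly every simulated trial contains events far out in the (truncated) tail, in line with part (b).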

Proof of the Borel-Cantelli lemma

(a) For any integer $n \geq 1$, we can write

$0 \leq P\bigg(\limsup \limits_{ k } A_k\bigg) = P(\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k) \leq P(\bigcup_{k=n}^{\infty} A_k) \leq \sum_{k=n}^{\infty}P(A_k)$

(the last inequality by subadditivity). Since $\sum_{k=1}^{\infty} P(A_k)$ converges, the tail sums $\sum_{k=n}^{\infty}P(A_k) \rightarrow 0$ as $n \rightarrow \infty$. Therefore (by a ‘sandwich’ argument) $P\bigg(\limsup \limits_{ k } A_k\bigg) = 0$.

(b) We have

$\bigg(\limsup \limits_{ k } A_k\bigg)^c = (\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k)^c$

$= \bigcup_{n=1}^{\infty} (\bigcup_{k=n}^{\infty} A_k)^c = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k^c$

$\equiv \bigcup_{n=1}^{\infty} F_n$

(by De Morgan’s laws). Therefore

$1 - P\bigg(\limsup \limits_{ k } A_k\bigg) = P\bigg(\bigg(\limsup \limits_{ k } A_k\bigg)^c\bigg) = P(\bigcup_{n=1}^{\infty} F_n) \leq \sum_{n=1}^{\infty}P(F_n)$

By the product rule for independent events,

$P(F_n) = P(\bigcap_{k=n}^{\infty} A_k^c) = \prod_{k=n}^{\infty}P(A_k^c) = \prod_{k=n}^{\infty}(1 - P(A_k))$

Now, for all $a \geq 0$ we have $1 - a \leq e^{-a}$. (To see this, let $f(a) = e^{-a} - (1 - a)$. Then $f(0) = 0$ and $f^{\prime}(a) = 1 - e^{-a} \geq 0$ for $a \geq 0$. Therefore $f$ is non-decreasing on $[0, \infty)$, so $f(a) \geq f(0) = 0$ for $a \geq 0$). It follows that

$1 - P(A_k) \leq \exp(-P(A_k))$

and therefore also

$\prod_{k=n}^{N}(1 - P(A_k)) \leq \exp(-\sum_{k=n}^N P(A_k))$

But $\exp(-\sum_{k=n}^N P(A_k)) \rightarrow 0$ as $N \rightarrow \infty$, because the series $\sum_{k=1}^{\infty}P(A_k)$ diverges. Since $F_n \subseteq \bigcap_{k=n}^{N} A_k^c$ for every $N$, letting $N \rightarrow \infty$ gives $P(F_n) = 0$ for every $n$, and hence $\sum_{n=1}^{\infty}P(F_n) = 0$, so $1 - P\bigg(\limsup \limits_{ k } A_k\bigg) = 0 \implies P\bigg(\limsup \limits_{ k } A_k\bigg) = 1$. QED

To apply these results to the variation of Brownian motion, let

$\triangle_n \equiv \{a = t_0 < t_1 < \cdots < t_n = b\}$

be a partition of the interval $[a, b]$. Define the mesh of the partition $\triangle_n$ by

$||\triangle_n|| \equiv \max_{1 \leq i \leq n}(t_i - t_{i-1})$

Let us restrict ourselves only to sequences of partitions $\{\triangle_n : 1 \leq n\}$ for which $||\triangle_n|| \rightarrow 0$, and in particular for which $\sum_{n=1}^{\infty} ||\triangle_n|| < \infty$. For every $p > 0$ let

$Q_p(f; a, b, \triangle_n) \equiv \sum_{i=1}^{n}|f(t_i) - f(t_{i-1})|^p$

and let

$V_p(f; a, b) \equiv \lim \limits_{||\triangle_n|| \rightarrow 0} Q_p(f; a, b, \triangle_n)$

where $f$ is a real-valued function defined on $[a, b]$. If $V_p(f; a, b) < \infty$ we will say that $f$ has finite $p^{th}$ variation on $[a, b]$. In the case $p = 1$, if $V_1(f; a, b) < \infty$ we will say that $f$ is of bounded linear variation on $[a, b]$. In the case $p = 2$, if $V_2(f; a, b) < \infty$ we will say that $f$ is of bounded quadratic variation on $[a, b]$.

If we replace $f$ in the above by a Brownian motion $B$, then we find a curious result: $B$ is of unbounded linear variation almost surely, but it has a finite quadratic variation equal to $b - a$ almost surely. In other words,

$V_1(B; a, b) = \infty$    (a.s.)

$V_2(B; a, b) = b - a$    (a.s.)

We will use Chebyshev’s inequality and the Borel-Cantelli lemma to prove these.
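Before proving these results it is instructive to see them numerically. The following sketch (my own illustration, not part of the proofs) simulates a single Brownian path on $[0, 1]$ and evaluates the sums $Q_1$ and $Q_2$ on dyadic partitions of increasing resolution:

```python
import numpy as np

rng = np.random.default_rng(2)

# One Brownian path on [a, b] = [0, 1], sampled on a fine grid of N steps.
a, b, N = 0.0, 1.0, 2**16
dt = (b - a) / N
B = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=N))])

# Evaluate Q_1 (linear) and Q_2 (quadratic) variation sums on nested
# dyadic partitions with m intervals each.
for m in [2**8, 2**12, 2**16]:
    diffs = np.diff(B[:: N // m])   # B(t_i) - B(t_{i-1}) on the m-interval partition
    Q1 = np.sum(np.abs(diffs))
    Q2 = np.sum(diffs**2)
    print(f"m = {m:6d}: Q1 = {Q1:8.2f}, Q2 = {Q2:.4f}")

# Q2 settles near b - a = 1, while Q1 keeps growing (roughly like sqrt(m)).
assert abs(Q2 - (b - a)) < 0.05
```

The quadratic sums concentrate near $b - a = 1$ as the mesh shrinks, while the linear sums blow up, exactly as Theorems 1 and 2 below assert.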

Theorem 1

If $\{\triangle_n : 1 \leq n\}$ is a sequence of partitions of $[a, b]$ with $\sum_{n=1}^{\infty} ||\triangle_n|| < \infty$ (and therefore $||\triangle_n|| \rightarrow 0$), then

$Q_2(B; a, b, \triangle_n) = \sum_{i=1}^n |B(t_i) - B(t_{i-1})|^2 \rightarrow b - a$ (a.s.)

as $n \rightarrow \infty$. In other words, Brownian motion has bounded quadratic variation on $[a, b]$ equal to $b - a$, almost surely.

Proof:

Observe that, by ‘telescoping’,

$\sum_{i=1}^{n}(t_i - t_{i-1}) = b - a$

Let

$Y_n = \sum_{i=1}^{n}(B(t_i) - B(t_{i-1}))^2 - (b - a)$

$= \sum_{i=1}^{n}\big\{(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})\big\}$

Note that $\mathbb{E}[Y_n] = 0$ because $\mathbb{E}[(B(t_i) - B(t_{i-1}))^2] = t_i - t_{i-1}$. Therefore

$\mathbb{V}[Y_n] = \mathbb{E}[Y_n^2]$

$= \mathbb{E}\big[\big(\sum_{i=1}^n \big\{(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})\big\} \big)^2 \big]$

$= \sum_{i=1}^n \mathbb{E}\big[\big\{(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})\big\}^2 \big]$

(the cross-product terms vanish because Brownian motion has independent increments and each summand $(B(t_i) - B(t_{i-1}))^2 - (t_i - t_{i-1})$ has zero mean)

$= \sum_{i=1}^n \mathbb{E}\big[(B(t_i) - B(t_{i-1}))^4 - 2(t_i - t_{i-1})(B(t_i) - B(t_{i-1}))^2 + (t_i - t_{i-1})^2 \big]$

$= \sum_{i=1}^n \big(3(t_i - t_{i-1})^2 - 2(t_i - t_{i-1})^2 + (t_i - t_{i-1})^2 \big)$

(since the fourth moment of a zero-mean normal random variable is $3\sigma^4$ and here $\sigma^2 = (t_i - t_{i-1})$)

$= \sum_{i=1}^n 2(t_i - t_{i-1})^2$

$\leq 2 ||\triangle_n||\sum_{i=1}^n (t_i - t_{i-1})$

$= 2(b - a)||\triangle_n||$

(Since $||\triangle_n|| \rightarrow 0$, we see already at this stage that $Y_n \rightarrow 0$ in $\mathcal{L}^2$).

By the second form of Chebyshev’s inequality we can now write for any $\epsilon > 0$ that

$\sum_{n=1}^{\infty}P(|Y_n| > \epsilon) \leq \sum_{n=1}^{\infty} \frac{\mathbb{E}[Y_n^2]}{\epsilon^2}$

$\leq \frac{2(b - a)}{\epsilon^2}\sum_{n=1}^{\infty}||\triangle_n|| < \infty$

In particular, we can write for any integer $q \geq 1$ that

$\sum_{n=1}^{\infty} P\big(|Y_n| > \frac{1}{q}\big) < \infty$

By the first result in the Borel-Cantelli lemma we can then say

$P\bigg(\limsup \limits_{ k } A_k(q)\bigg) = 0$

where

$A_k(q) = \big\{\omega : |Y_k| > \frac{1}{q}\big\}$

Now,

$\limsup \limits_{ k } A_k(q) = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty}A_k(q)$

so if $P\bigg(\limsup \limits_{ k } A_k(q)\bigg) = 0$ for every $q$, then by countable subadditivity (a countable union of null sets is null) we must also have

$P\big(\bigcup_{q=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty}A_k(q)\big) = 0$

But

$\bigcup_{q=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty}A_k(q)$

is the divergence set when considering the almost sure convergence of $Y_n$ to zero. Since the probability of the divergence set is zero, we conclude that $Y_n \rightarrow 0$ a.s., and therefore $Q_2(B; a, b, \triangle_n) \rightarrow b - a$ (a.s.). QED

Theorem 2

If $\{\triangle_n : 1 \leq n\}$ is a sequence of partitions of $[a, b]$ with $||\triangle_n|| \rightarrow 0$, then

$Q_1(B; a, b, \triangle_n) = \sum_{i=1}^n |B(t_i) - B(t_{i-1})| \rightarrow \infty$ (a.s.)

as $n \rightarrow \infty$. In other words, Brownian motion has unbounded linear variation on $[a, b]$, almost surely.

Proof:

The proof is by contradiction. Suppose that with positive probability Brownian motion has bounded linear variation, i.e., $V_1(B; a, b) < \infty$. On this event we can write

$\sum_{i=1}^n |B(t_i) - B(t_{i-1})|^2 \leq \max_{1 \leq i \leq n}|B(t_i) - B(t_{i-1})|\sum_{i=1}^n |B(t_i) - B(t_{i-1})|$

$\leq V_1(B; a, b)\max_{1 \leq i \leq n}|B(t_i) - B(t_{i-1})|$

Since $B$ is uniformly continuous on $[a, b]$ ($B$ is continuous on $[a, b]$ and any continuous function is uniformly continuous when restricted to a compact set) we can write

$\max_{1 \leq i \leq n}|B(t_i) - B(t_{i-1})| \rightarrow 0$ as $||\triangle_n|| \rightarrow 0$

which if $V_1(B; a, b) < \infty$ implies

$\sum_{i=1}^n |B(t_i) - B(t_{i-1})|^2 \rightarrow 0$

on an event of positive probability. This contradicts Theorem 1, which says that this sum converges to $b - a > 0$ almost surely. Therefore $B$ cannot have bounded linear variation. QED

# A note on Brownian motion and fractional Brownian motion as self-similar stochastic processes

While delving into some papers in mathematical finance recently, I was surprised to discover the extent to which the literature involving fractional Brownian motion has expanded over the past twenty years or so. Fractional Brownian motion actually has a long history, having been first introduced in the 1940s by the great Andrey Kolmogorov and then reintroduced and further developed in a seminal paper by Mandelbrot and Van Ness in 1968 (for an interesting personal account of the history of fractional Brownian motion, and further references, see Taqqu, M., 2013, Benoit Mandelbrot and Fractional Brownian Motion, arXiv:1302.5237v1). The reason for the recent interest is that fractional Brownian motion has a ‘long memory’ property that standard Brownian motion lacks (unlike standard Brownian motion, fractional Brownian motion does not have independent increments, as will be shown below), making it useful for modelling processes that exhibit long-term persistence of effects. Thus, for example, one finds in the recent literature that the standard Ornstein-Uhlenbeck process with stochastic differential equation

$dX(t) = \theta(\mu - X(t))dt + \sigma dB(t)$

driven by the standard Brownian motion term $dB(t)$ is modified to produce a fractional version with stochastic differential equation

$dX(t) = \theta(\mu - X(t))dt + \sigma dB_H(t)$

driven by the fractional Brownian motion term $dB_H(t)$ with Hurst parameter $\frac{1}{2} < H < 1$ (to be explained below). Similarly, one finds numerous attempts to modify the famous Black-Scholes framework of mathematical finance to incorporate fractional Brownian motion with a view to capturing more long-term memory effects on market movements.

The problem appears to be that stochastic calculus with fractional Brownian motion is much more difficult than with standard Brownian motion, for which we have the well-developed Itô calculus. Itô calculus cannot be used directly with fractional Brownian motion because the long memory property that makes fractional Brownian motion useful in applications also destroys the 'semimartingale' property of standard Brownian motion, which is needed for Itô calculus to work. Numerous attempts have been made in the last two decades to develop a stochastic calculus for fractional Brownian motion which is analogous to Itô's calculus. This aspect of the literature still seems to me to be exploratory and fraught with technical difficulties, compared to the Itô calculus framework.

Standard Brownian motion turns out to be a special case of fractional Brownian motion, being characterised by Hurst parameter $H = \frac{1}{2}$, as will be shown below. Both are self-similar processes (i.e., fractals, or ‘scale invariant’, or ‘self-affine’ – numerous equivalent terms are used in the literature). In this note I want to clarify for myself some basic aspects of standard Brownian motion and fractional Brownian motion as self-similar stochastic processes.

Mathematical characterisation of self-similarity

To highlight the self-similarity property of Brownian motion which stochastic differential equations capture, it is useful to contrast stochastic differential equations with ordinary differential equations in this regard. The typical solution of a ‘nice’ ordinary differential equation is a differentiable function like the one shown top left in the figure below.

Such a differentiable function lacks self-similarity because when we ‘zoom in’ at finer and finer resolutions, as shown in the remaining plates in the figure, we always end up with a straight line. Newtonian calculus is essentially based on this idea – finding the derivative of a smooth curve at a particular point means finding the slope of the straight line that is tangent to the curve at that point. In contrast, the typical solution of a stochastic differential equation driven by Brownian motion is a ‘jumpy’ curve which is actually nowhere differentiable, like the sample path of a standard Brownian motion shown top left in the figure below.

This curve does exhibit self-similarity in the sense that when we zoom in at finer and finer resolutions, as shown in the remaining plates in the figure, we do not get a straight line but rather just an equally ‘jumpy’ Brownian motion. Itô calculus is essentially based on this idea that ‘zooming in’ does not lead to a straight line in the case of Brownian motion, as will be shown when discussing Itô’s formula below.

No matter what scaling we apply in the x-direction, a Brownian motion remains a Brownian motion. This is the idea of self-similarity which is captured mathematically as follows:

Definition 1: A real-valued stochastic process $\{X(t): t \geq 0\}$ is self-similar if there exists a unique $H > 0$ (called a Hurst parameter) such that for any $a > 0$ we have

$\{X(at)\} \,{\buildrel d \over =}\, \{a^HX(t)\}$

(Note that this also implies $X(0) = 0$, a.s.)

Another key feature that standard Brownian motion and fractional Brownian motion have in common is that their increments are stationary. This can be formally defined as follows:

Definition 2: A stochastic process is said to have stationary increments if the finite-dimensional distributions of the process $\{X(t + h) - X(h) : t \geq 0\}$ do not depend on $h$.

It is usual in the literature to refer to a self-similar stochastic process with Hurst parameter H as being H-ss, and one which also has stationary increments as being H-sssi. Both standard Brownian motion and fractional Brownian motion are H-sssi, as will be shown shortly.

Result 1: Let $\{X(t)\}$ be H-sssi, and suppose $\mathbb{E}[X^2(1)] < \infty$. Then

$\mathbb{E}[X(t)X(s)] = \frac{1}{2}\{t^{2H} + s^{2H} - |t - s|^{2H}\}\mathbb{E}[X^2(1)]$

Proof: Observe that by virtue of the H-ss property we have

$\mathbb{E}[X^2(t)] = \mathbb{E}[X(t)X(t)] = \mathbb{E}[t^H X(1)t^H X(1)] = t^{2H}\mathbb{E}[X^2(1)]$

and due to the additional si property we have

$\mathbb{E}[(X(t) - X(s))^2] = \mathbb{E}[X^2(|t - s|)] = |t - s|^{2H}\mathbb{E}[X^2(1)]$

Therefore

$\mathbb{E}[X(t)X(s)] \equiv \frac{1}{2}\{\mathbb{E}[X^2(t)] + \mathbb{E}[X^2(s)] - \mathbb{E}[(X(t) - X(s))^2]\}$

$= \frac{1}{2}\{t^{2H} + s^{2H} - |t - s|^{2H}\}\mathbb{E}[X^2(1)]$

as claimed.

Standard Brownian motion

The key difference between standard Brownian motion and fractional Brownian motion is that the former has independent increments, the latter does not.

Definition 3: A real-valued stochastic process $\{X(t): t \geq 0\}$ is said to have independent increments if for any $m \geq 1$ and any times $0 \leq t_0 < t_1 < \cdots < t_m$, the increments $X(t_1) - X(t_0), \ldots, X(t_m) - X(t_{m-1})$ are independent.

Definition 4: A real-valued stochastic process $\{B(t): t \geq 0\}$ is a standard Brownian motion if it satisfies the following four conditions:
(i) $B(0) = 0$, a.s.
(ii) it has independent and stationary increments
(iii) for each $t > 0$, $B(t) \sim N(0, t)$
(iv) its sample paths are continuous, a.s.

Result 2: Standard Brownian motion $B(t)$ is $\frac{1}{2}$-sssi.

Proof: We need to prove that for any $a > 0$ we have

$\{B(at)\} \,{\buildrel d \over =}\, \{a^{\frac{1}{2}}B(t)\}$

It is actually easier to prove that

$\{a^{-\frac{1}{2}}B(at)\} \,{\buildrel d \over =}\, \{B(t)\}$

which is equivalent. It is obvious by inspection that conditions (i), (ii) and (iv) in Definition 4 apply to $\{a^{-\frac{1}{2}}B(at)\}$. With regard to (iii), the normality and mean zero of $\{a^{-\frac{1}{2}}B(at)\}$ are obvious by inspection and the variance is

$\mathbb{E}[(a^{-\frac{1}{2}}B(at))^2] = a^{-1} \cdot at = t$

Therefore $\{a^{-\frac{1}{2}}B(at)\} \,{\buildrel d \over =}\, \{B(t)\}$ as required.

Result 3: $\mathbb{E}[B(t)B(s)] = \text{min}\{t, s\}$.

Proof: Since standard Brownian motion is $\frac{1}{2}$-sssi, we have by Result 1 that

$\mathbb{E}[B(t)B(s)] = \frac{1}{2}\{t + s - |t - s|\} \equiv \text{min}\{t, s\}$.

With regard to the martingale property of Brownian motion, we have the following definitions and result.

Definition 5: Let $(\Omega, \mathscr{F}, \mathbb{P})$ be a probability space on which is defined a stochastic process $\{X(t) : t \geq 0\}$. A filtration for the stochastic process is a collection of $\sigma$-algebras $\mathbb{F} = \{\mathscr{F}(t) : t \geq 0\}$ satisfying:
(i) $\mathscr{F}(s) \subset \mathscr{F}(t)$ for $0 \leq s < t$
(ii) for each $t \geq 0$, the stochastic process $X(t)$ at time $t$ is $\mathscr{F}(t)$-measurable.
$(\Omega, \mathscr{F}, \mathbb{F}, \mathbb{P})$ is then called the filtered probability space for the process.

Intuitively, a filtration at time $t$ is the history of the wanderings of the stochastic process up to that time. Condition (i) says that there is at least as much information in the later $\sigma$-algebra $\mathscr{F}(t)$ as there is in any earlier $\sigma$-algebra $\mathscr{F}(s)$. Condition (ii) says that the information available at time $t$ is always enough to evaluate the stochastic process at that time (a condition known as the ‘adaptivity’ of $X(t)$ to $\mathscr{F}(t)$).

Definition 6: A stochastic process $\{X(t) : t \geq 0\}$ is a martingale if it is integrable (i.e., it has a finite expected value at each $t$) and for any $s > 0$ we have

$\mathbb{E}[X(t + s) | \mathscr{F}(t)] = X(t)$, a.s.

where $\mathscr{F}(t)$ is the information about the process up to time $t$, that is, $\mathbb{F} = \{\mathscr{F}(t) : t \geq 0\}$ is a filtration.

Intuitively, a stochastic process is a martingale if the best guess at time $t$ of its future value is simply its current value at time $t$. (Note that the martingale property is measure-specific: the stochastic process can be a martingale with respect to a measure $\mathbb{P}$, while failing to be a martingale with respect to a different measure $\mathbb{Q}$. It is often possible to ‘convert’ a Brownian-motion-based stochastic process which is not a martingale into a martingale by changing the probability measure appropriately. The conditions for being able to do this are given by a theorem which is well known in mathematical finance, called the Cameron-Martin-Girsanov theorem).

Result 4: Standard Brownian motion $\{B(t)\}$ is a martingale.

Proof: By definition, $B(t) \sim N(0, t)$, guaranteeing that the process is at all times integrable with $\mathbb{E}[B(t)] = 0$ for all $t$. Furthermore, for $s < t$ we have

$\mathbb{E}[B(t) | \mathscr{F}(s)] \equiv \mathbb{E}[B(s) | \mathscr{F}(s)] + \mathbb{E}[B(t) - B(s) | \mathscr{F}(s)]$

$= B(s) + \mathbb{E}[B(t) - B(s)]$

$= B(s)$

where the first term in the second equality follows from the fact that $B(s)$ is known for certain when $\mathscr{F}(s)$ is known, and the second term in the second equality follows from the independent increments property of Brownian motion which implies that future increments are independent of all past information.

With regard to stochastic calculus for functions of standard Brownian motion, the key result is Itô’s formula.

Result 5: (Itô’s formula). If $f$ is a deterministic twice continuously differentiable function, then for any $t$ we have

$f(B(t)) = f(B(0)) + \int_0^t f^{\prime}(B(u))dB(u) + \frac{1}{2}\int_0^t f^{\prime \prime}(B(u))du$

or equivalently, in stochastic differential form,

$df(B(t)) = f^{\prime}(B(t))dB(t) + \frac{1}{2}f^{\prime \prime}(B(t))dt$

Proof: (Sketch). An infinitesimal Taylor series expansion of $f(B(t))$ gives

$df(B(t)) = f^{\prime}(B(t))dB(t) + \frac{1}{2}f^{\prime \prime}(B(t))(dB(t))^2 + \frac{1}{3!}f^{\prime \prime \prime}(B(t))(dB(t))^3 + \cdots$

In a Newtonian context (as we saw above when considering the graph of a differentiable function) only the linear term is relevant as we ‘zoom in’, which in the context of this Taylor expansion would seem to mean that we could discard the terms involving $(dB(t))^2$, $(dB(t))^3$, and higher. However, ‘zooming in’ on the path of a Brownian motion does not lead to a straight line, and in the context of this Taylor expansion that translates into the fact that we can no longer discard the $(dB(t))^2$ term, although we can discard the higher order terms. To see this, suppose we divide the time interval $[0, t]$ into a partition

$\{0, \frac{t}{n}, \frac{2t}{n}, \ldots, \frac{(n-1)t}{n}, t\}$

for some $n > 0$. Then we have

$\int_0^t (dB(u))^2 = \lim\limits_{n \rightarrow \infty} \sum_{i=1}^n \big(B\big(\frac{ti}{n}\big) - B\big(\frac{t(i-1)}{n}\big)\big)^2 = \lim\limits_{n \rightarrow \infty} \sum_{i=1}^n \big(B\big(\frac{t}{n}\big)\big)^2$

$= \lim\limits_{n \rightarrow \infty} t \sum_{i=1}^n \frac{1}{n}\bigg(\frac{B\big(\frac{t}{n}\big)}{\sqrt{\frac{t}{n}}}\bigg)^2 = t \mathbb{E}[Z^2] = t \mathbb{V}[Z] = t$

where

$Z = \frac{B\big(\frac{t}{n}\big)}{\sqrt{\frac{t}{n}}} \sim N(0, 1)$

The differential form of $\int_0^t (dB(u))^2 = t$ is $(dB(t))^2 = dt$, which shows that in the case of Brownian motion the $(dB(t))^2$ term cannot be discarded in the above Taylor series expansion because it equals $dt$, not zero. However, the higher order terms can be discarded because for integers $m > 2$ we get

$\int_0^t (dB(u))^m = \lim\limits_{n \rightarrow \infty} \sum_{i=1}^n \big(B\big(\frac{t}{n}\big)\big)^m = \lim\limits_{n \rightarrow \infty} t^{m/2} \sum_{i=1}^n \frac{1}{n^{(m-2)/2}}\frac{1}{n}\bigg(\frac{B\big(\frac{t}{n}\big)}{\sqrt{\frac{t}{n}}}\bigg)^m$

$= t^{m/2} \bigg(\lim\limits_{n \rightarrow \infty} \frac{1}{n^{(m-2)/2}}\bigg) \cdot(\mathbb{E}[Z^m]) = 0$

Therefore in differential form we have $(dB(t))^m = 0$ for integers $m > 2$. (Note that now we can also immediately deduce that $dB(t)dt = 0$ from the fact that $(dB(t))^3 = 0$, since $(dB(t))^3 = (dB(t))^2dB(t) = dtdB(t)$). Using these results in the above Taylor series expansion we get the differential form of Itô’s formula. QED
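The key identity $(dB(t))^2 = dt$ behind this sketch can itself be checked pathwise by simulation. The following sketch (step count and tolerance are my own choices) discretises Itô’s formula for $f(x) = e^x$ and compares the two sides:

```python
import numpy as np

rng = np.random.default_rng(3)

# Discretised check of Ito's formula for f(x) = exp(x) on [0, T]:
# f(B(T)) - f(B(0)) should match
#   sum_i f'(B_{i-1}) dB_i + (1/2) sum_i f''(B_{i-1}) dt,
# with no further terms, precisely because (dB)^2 = dt and (dB)^m = 0 for m > 2.
T, n = 1.0, 2**16
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])

lhs = np.exp(B[-1]) - np.exp(B[0])
stochastic_integral = np.sum(np.exp(B[:-1]) * dB)   # left-point sums -> Ito integral
drift_correction = 0.5 * np.sum(np.exp(B[:-1])) * dt
rhs = stochastic_integral + drift_correction

print(f"f(B(T)) - f(B(0)) = {lhs:.4f}, Ito sum = {rhs:.4f}")
assert abs(lhs - rhs) < 0.05   # the discrepancy shrinks as the mesh is refined
```

Note the left-point evaluation $f^{\prime}(B(t_{i-1}))$ in the stochastic sum: this is the defining convention of the Itô integral.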

Itô’s formula greatly facilitates the problem of finding stochastic differential equations for Brownian-motion-based stochastic processes, and also the reverse problem of integrating stochastic differential equations to obtain the corresponding stochastic processes. For example, suppose we modify the standard Brownian motion process by giving it a volatility $\sigma$, to yield the stochastic process

$X(t) = \sigma B(t)$

and suppose we want to know the stochastic differential equation which the stochastic process $Y(t) = \exp(X(t))$ obeys. Taking the function $f$ in Itô’s formula to be the exponential function and noting that $(dX(t))^2 = \sigma^2 (dB(t))^2 = \sigma^2 dt$ we get

$dY(t) = f^{\prime}(X(t))dX(t) + \frac{1}{2}f^{\prime \prime}(X(t))(dX(t))^2 = \sigma Y(t) dB(t) + \frac{\sigma^2}{2}Y(t)dt$

As another example, suppose we modify the standard Brownian motion process by giving it a deterministic drift in addition to the volatility $\sigma$, to yield the stochastic process

$X(t) = \sigma B(t) + \mu t$

and suppose that we again want to know the stochastic differential equation which the stochastic process $Y(t) = \exp(X(t))$ obeys. Taking the function $f$ in Itô’s formula to be the exponential function and noting that

$(dX(t))^2 = (\sigma dB(t) + \mu dt)^2 = \sigma^2(dB(t))^2 + 2 \sigma \mu dB(t)dt + \mu^2 (dt)^2 = \sigma^2 dt$

we get

$dY(t) = f^{\prime}(X(t))dX(t) + \frac{1}{2}f^{\prime \prime}(X(t))(dX(t))^2$

$= Y(t)(\sigma dB(t) + \mu dt) + \frac{\sigma^2}{2}Y(t)dt$

$= \sigma Y(t) dB(t) + (\mu + \frac{\sigma^2}{2})Y(t)dt$
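This derivation can be sanity-checked by simulation: an Euler discretisation of the derived SDE, driven by a given set of Brownian increments, should track the exact process $Y(t) = \exp(\sigma B(t) + \mu t)$ built from the same increments. A sketch (the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# Exact process Y(t) = exp(sigma*B(t) + mu*t) versus an Euler discretisation
# of the derived SDE  dY = sigma*Y dB + (mu + sigma^2/2)*Y dt,
# both driven by the same Brownian increments dB_i.
mu, sigma, T, n = 0.05, 0.2, 1.0, 100_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.cumsum(dB)
t = dt * np.arange(1, n + 1)

exact = np.exp(sigma * B + mu * t)

# Euler step: Y_i = Y_{i-1} * (1 + sigma*dB_i + (mu + sigma^2/2)*dt), Y_0 = 1
euler = np.cumprod(1.0 + sigma * dB + (mu + 0.5 * sigma**2) * dt)

err = np.max(np.abs(euler - exact))
print(f"max |Euler - exact| over the path: {err:.2e}")
assert err < 0.01
```

If the drift coefficient were mistakenly taken as $\mu$ rather than $\mu + \frac{\sigma^2}{2}$ (i.e., if the Itô correction were dropped), the two paths would visibly drift apart.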

Fractional Brownian motion

Definition 7: Let $0 < H < 1$. A mean-zero Gaussian process $\{B_H(t): t \geq 0\}$ is called a fractional Brownian motion with Hurst parameter $H$ if

$\mathbb{E}[B_H(t)B_H(s)] = \frac{1}{2}\{t^{2H} + s^{2H} - |t - s|^{2H}\}\mathbb{E}[B_H^2(1)]$

Note that when $H = \frac{1}{2}$ (and with the normalisation $\mathbb{E}[B_H^2(1)] = 1$) this covariance structure reduces to that of standard Brownian motion in Result 3.
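Definition 7 also gives a direct (if $O(n^3)$) way to simulate fractional Brownian motion on a finite grid: form the covariance matrix and multiply its Cholesky factor by a standard normal vector. A sketch (normalising $\mathbb{E}[B_H^2(1)] = 1$; the grid and the value of $H$ are arbitrary), which also checks the reduction to $\min\{t, s\}$ at $H = \frac{1}{2}$:

```python
import numpy as np

# fBm covariance from Definition 7, with E[B_H(1)^2] normalised to 1.
def fbm_cov(t, s, H):
    return 0.5 * (t**(2 * H) + s**(2 * H) - abs(t - s)**(2 * H))

ts = np.linspace(0.01, 1.0, 50)   # grid of time points

# Sanity check: H = 1/2 recovers the Brownian covariance min(t, s).
C_half = np.array([[fbm_cov(t, s, 0.5) for s in ts] for t in ts])
assert np.allclose(C_half, np.minimum.outer(ts, ts))

# For 0 < H < 1 the covariance matrix on distinct times is positive
# definite, so a path can be drawn as L @ Z with L its Cholesky factor
# and Z a vector of independent standard normals.
H = 0.7
C = np.array([[fbm_cov(t, s, H) for s in ts] for t in ts])
L = np.linalg.cholesky(C)
rng = np.random.default_rng(5)
path = L @ rng.standard_normal(len(ts))
print(f"simulated B_H at t = 1: {path[-1]:.3f} (Var = {C[-1, -1]:.3f})")
```

For long grids specialised methods (e.g. circulant embedding) are preferred, but the Cholesky construction makes the role of the covariance structure completely explicit.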

Result 6: Fractional Brownian motion $\{B_H(t): t \geq 0\}$ is H-sssi, but unlike standard Brownian motion, fractional Brownian motion with $H \neq \frac{1}{2}$ does not have independent increments.

Proof: To prove si, simply note that

$\mathbb{E}[(B_H(t) - B_H(s))^2] = \mathbb{E}[B_H^2(t)] + \mathbb{E}[B_H^2(s)] - 2\mathbb{E}[B_H(t)B_H(s)]$

$= |t - s|^{2H}\mathbb{E}[B_H^2(1)]$

Therefore for every $h > 0$ the mean-zero Gaussian process $\{B_H(t + h) - B_H(h)\}$ has, by polarisation, a covariance structure that does not depend on $h$, and hence the same distribution as $\{B_H(t)\}$. To prove that fractional Brownian motion with $H \neq \frac{1}{2}$ does not have independent increments, define the increments

$\{b_H(j)\} = \{B_H(j+1) - B_H(j)\}$

for integers $j = 0, \pm 1, \ldots$. The increments $\{b_H(j)\}$ have zero mean, variance

$\mathbb{E}[b_H^2(j)] = \mathbb{E}[(B_H(j+1) - B_H(j))^2] = \mathbb{E}[B_H^2(1)]$

and autocovariance

$\mathbb{E}[b_H(j) b_H(k)]$

$= \mathbb{E}[B_H(j+1)B_H(k+1) - B_H(j+1)B_H(k) - B_H(k+1)B_H(j) + B_H(j)B_H(k)]$

$= \frac{1}{2} \bigg\{|j+1|^{2H} + |k+1|^{2H} - |j - k|^{2H}$

$-|j+1|^{2H} - |k|^{2H} + |j + 1 - k|^{2H}$

$-|k+1|^{2H} - |j|^{2H} + |k + 1 - j|^{2H}$

$+ |j|^{2H} + |k|^{2H} - |j - k|^{2H}\bigg\}\mathbb{E}[B_H^2(1)]$

$= \frac{1}{2}\bigg\{|j - k + 1|^{2H} - 2|j - k|^{2H} + |j - k - 1|^{2H}\bigg\}\mathbb{E}[B_H^2(1)]$

which for $j \neq k$ is nonzero unless $H = \frac{1}{2}$; since the increments are jointly Gaussian, a nonzero covariance means they are not independent. (To see that the covariance is zero when $H = \frac{1}{2}$, suppose without loss of generality that $j > k$. Then the expression reduces to

$\mathbb{E}[b_H(j) b_H(k)] = \frac{1}{2}\bigg\{(j - k + 1)^{2H} - 2(j - k)^{2H} + (j - k - 1)^{2H}\bigg\}\mathbb{E}[B_H^2(1)]$

which is zero when $H = \frac{1}{2}$). QED
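The sign of this autocovariance is what gives fractional Brownian motion its character. Writing $\rho(n) = \frac{1}{2}\{(n+1)^{2H} - 2n^{2H} + (n-1)^{2H}\}$ for the lag-$n$ autocovariance (normalising $\mathbb{E}[B_H^2(1)] = 1$), $\rho(n)$ is a second difference of $x \mapsto x^{2H}$, so it is positive for $H > \frac{1}{2}$ (persistent, ‘long memory’ increments), negative for $H < \frac{1}{2}$ (anti-persistent increments), and zero for $H = \frac{1}{2}$. A quick check:

```python
# Lag-n autocovariance of unit-spaced fBm increments, with E[B_H(1)^2] = 1:
#   rho(n) = (1/2) * ((n+1)^{2H} - 2*n^{2H} + (n-1)^{2H})
def rho(n, H):
    return 0.5 * ((n + 1)**(2 * H) - 2 * n**(2 * H) + (n - 1)**(2 * H))

for H in (0.3, 0.5, 0.7):
    print(f"H = {H}: " + ", ".join(f"rho({n}) = {rho(n, H):+.4f}" for n in (1, 2, 5)))

# Second difference of a convex (2H > 1) / concave (2H < 1) power function:
assert all(rho(n, 0.7) > 0 for n in range(1, 50))           # persistent
assert all(rho(n, 0.3) < 0 for n in range(1, 50))           # anti-persistent
assert all(abs(rho(n, 0.5)) < 1e-12 for n in range(1, 50))  # uncorrelated
```

For $H > \frac{1}{2}$ the correlations decay so slowly (like $n^{2H - 2}$) that they are not summable, which is the precise sense in which the increments have ‘long memory’.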

The largest class of stochastic processes to which Itô calculus is applicable is the class of semimartingales, within which martingales (such as Brownian motion) are included. However, fractional Brownian motion with $H \neq \frac{1}{2}$ is not a semimartingale, and therefore not amenable to Itô’s formula. (Clarifying why fractional Brownian motion is not a semimartingale requires the definition of some additional concepts from stochastic analysis which will not be done in this note). However, it is worth noting that fractional Brownian motion does have a stochastic integral representation in terms of standard Brownian motion, as follows.

Result 7: For $0 < H < 1$ and $t > 0$, fractional Brownian motion $\{B_H(t): t \geq 0\}$ has a stochastic integral representation of the form

$Z(t) = c_H\bigg\{\int_{-\infty}^0 ((t - u)^{H - \frac{1}{2}} - (-u)^{H - \frac{1}{2}})dB(u) + \int_0^t (t - u)^{H - \frac{1}{2}}dB(u)\bigg\}$

$= c_H\int_{\mathbb{R}} (I_{\{u \leq t\}}(t - u)^{H - \frac{1}{2}} - I_{\{u \leq 0\}}(-u)^{H - \frac{1}{2}})dB(u)$

where $B$ is a two-sided standard Brownian motion and $c_H$ is a normalising constant.

Proof: Let

$Z(t) = c_H\int_{\mathbb{R}} (I_{\{u \leq t\}}(t - u)^{H - \frac{1}{2}} - I_{\{u \leq 0\}}(-u)^{H - \frac{1}{2}})dB(u)$

Then $Z(t)$ is H-ss. To see this, make the change of variable $u = ts$. We get

$Z(t) \,{\buildrel d \over =}\, c_H\int_{\mathbb{R}} (I_{\{ts \leq t\}}(t - ts)^{H - \frac{1}{2}} - I_{\{ts \leq 0\}}(-ts)^{H - \frac{1}{2}})dB(ts)$

$\,{\buildrel d \over =}\, t^{H - \frac{1}{2}}c_H\int_{\mathbb{R}} (I_{\{s \leq 1\}}(1 - s)^{H - \frac{1}{2}} - I_{\{s \leq 0\}}(-s)^{H - \frac{1}{2}})t^{\frac{1}{2}}dB(s)$

$\,{\buildrel d \over =}\, t^H c_H\int_{\mathbb{R}} (I_{\{s \leq 1\}}(1 - s)^{H - \frac{1}{2}} - I_{\{s \leq 0\}}(-s)^{H - \frac{1}{2}})dB(s)$

$\,{\buildrel d \over =}\, t^H Z(1)$

(Note that $dB(ts) = t^{\frac{1}{2}}dB(s)$ because $(dB(ts))^2 = d(ts) = tds = (t^{\frac{1}{2}}dB(s))^2$). The process $Z(t)$ is also si. To see this, observe that

$\mathbb{E}[(Z(t) - Z(s))^2]$

$= c_H^2\int_{\mathbb{R}} [I_{\{u \leq t\}}(t - u)^{H - \frac{1}{2}} - I_{\{u \leq s\}}(s - u)^{H - \frac{1}{2}}]^2du$

(by the Itô isometry, the integrand being deterministic).

Making the change of variable $u = s + m$ this becomes

$= c_H^2\int_{\mathbb{R}} [I_{\{m \leq t - s\}}((t - s) - m)^{H - \frac{1}{2}} - I_{\{m \leq 0\}}(-m)^{H - \frac{1}{2}}]^2dm$

$= |t - s|^{2H}\mathbb{E}[Z^2(1)]$

Therefore $Z(t)$ is an H-sssi mean-zero Gaussian process and from Result 1 we can conclude that it has the same covariance structure as fractional Brownian motion with Hurst parameter $H$. QED