A note on the structure of the higher-order terms in multivariate Taylor expansions

I recently had to think a bit more deeply than is usually necessary about the nature of the higher-order terms in multivariate Taylor expansions of smooth functions of the form f : \mathbb{R}^m \rightarrow \mathbb{R}. The result was a realisation that the fundamental structure of these higher-order terms is actually quite simple and that working them out manually brings up some interesting combinatorial issues, including use of the partition function from advanced number theory. I want to record some of my thoughts about this here.

For a univariate function f : \mathbb{R} \rightarrow \mathbb{R}, Taylor’s theorem says that if f is n times differentiable at some point a \in \mathbb{R} then there exists a function h_n : \mathbb{R} \rightarrow \mathbb{R} such that

f(x) - f(a) =

f^{\prime}(a)(x - a) + \frac{1}{2!} f^{\prime \prime}(a)(x - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(x - a)^n + h_n(x)(x - a)^n

where h_n(x) \rightarrow 0 as x \rightarrow a. The n-th order Taylor polynomial

f(a) + f^{\prime}(a)(x - a) + \frac{1}{2!} f^{\prime \prime}(a)(x - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(x - a)^n

then serves as an approximation to the function f(x). If the function f is infinitely many times differentiable, we can approximate f to any desired level of accuracy by choosing the Taylor polynomial to be of high enough degree.

In the univariate case the n-th order term in the Taylor polynomial is

\frac{1}{n!}f^{(n)}(a)(x - a)^n

In the present note I am interested in how this generalises to the multivariate case f : \mathbb{R}^m \rightarrow \mathbb{R}, where the Taylor expansion is about a vector \mathbf{a} = (a_1, a_2, \ldots, a_m). In the bivariate case m = 2, where the Taylor expansion is about the point \mathbf{a} = (a_1, a_2), the n-th order term in the Taylor expansion is given by the formula

\frac{1}{n!}\sum_{k=0}^{n} \binom{n}{k} \frac{\partial^k f}{\partial x_1^{n-k} \partial x_2^{k}}\bigg|_{\mathbf{a}} (x_1 - a_1)^{n-k}(x_2 - a_2)^k

This enables any higher-order term in the Taylor polynomial to be quickly obtained (and notice that the binomial coefficient values can be quickly read from the appropriate line of Pascal’s triangle). For example, the fourth-order term would be

\frac{1}{4!} \bigg[\frac{\partial^4f}{\partial x_1^4}\bigg|_{\mathbf{a}} (x_1 - a_1)^4 + 4 \frac{\partial^4 f}{\partial x_1^3 \partial x_2}\bigg|_{\mathbf{a}} (x_1 - a_1)^3 (x_2 - a_2) \ \ +

6 \frac{\partial^4 f}{\partial x_1^2 \partial x_2^2}\bigg|_{\mathbf{a}} (x_1 - a_1)^2 (x_2 - a_2)^2 + 4 \frac{\partial^4 f}{\partial x_1 \partial x_2^3}\bigg|_{\mathbf{a}} (x_1 - a_1) (x_2 - a_2)^3 \  \  +

\frac{\partial^4f}{\partial x_2^4}\bigg|_{\mathbf{a}} (x_2 - a_2)^4\bigg]

In order to generalise this as easily as possible to multivariate cases with m > 2, I found it convenient to adopt the following notation:

\partial_i \equiv \frac{\partial f}{\partial x_i} \bigg|_{\mathbf{a}}

\partial_{ij} \equiv \frac{\partial^2 f}{\partial x_i \partial x_j} \bigg|_{\mathbf{a}}

\partial_{ijk} \equiv \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k} \bigg|_{\mathbf{a}}

and so on, and

v_i \equiv (x_i - a_i)

Given the function f : \mathbb{R}^m \rightarrow \mathbb{R}, we allow the indices above to take values from 1 to m. To my surprise, I then realised that if we use the Einstein summation convention, it is possible to represent any Taylor expansion as

f(\mathbf{x}) = f(\mathbf{a}) + \partial_i v_i + \frac{1}{2!} \partial_{ij} v_i v_j + \frac{1}{3!} \partial_{ijk} v_i v_j v_k + \frac{1}{4!} \partial_{ijkl} v_i v_j v_k v_l + \cdots

This generic formula will apply to any number of variables m. Notice that the number of indices in each term corresponds to the order of that term. So, for example, the fifth-order term in the Taylor expansion will involve five indices, the sixth-order term will involve six indices, and so on.

To test out this formula, let us use it to work out the fourth-order term for the bivariate case m = 2 again. The relevant term in the above form of the Taylor expansion is

\frac{1}{4!} \partial_{ijkl} v_i v_j v_k v_l

Since each of the indices i, j, k, l will be allowed to take values 1 or 2, the above expression represents a sum involving 2^4 = 16 terms. In general, the n-th order term in the Taylor expansion of a function f : \mathbb{R}^m \rightarrow \mathbb{R} will involve m^n terms. The 16 terms in the above case are

\frac{1}{4!} \bigg[ \partial_{1111} v_1 v_1 v_1 v_1 + \partial_{1112} v_1 v_1 v_1 v_2 + \partial_{1121} v_1 v_1 v_2 v_1 + \partial_{1122} v_1 v_1 v_2 v_2 \ +

\partial_{1211} v_1 v_2 v_1 v_1 + \partial_{1212} v_1 v_2 v_1 v_2 + \partial_{1221} v_1 v_2 v_2 v_1 + \partial_{1222} v_1 v_2 v_2 v_2 \ +

\partial_{2111} v_2 v_1 v_1 v_1 + \partial_{2112} v_2 v_1 v_1 v_2 + \partial_{2121} v_2 v_1 v_2 v_1 + \partial_{2122} v_2 v_1 v_2 v_2 \ +

\partial_{2211} v_2 v_2 v_1 v_1 + \partial_{2212} v_2 v_2 v_1 v_2 + \partial_{2221} v_2 v_2 v_2 v_1 + \partial_{2222} v_2 v_2 v_2 v_2\bigg]

However, due to the equality of cross-partials involving the same index numbers, this reduces to

\frac{1}{4!}\bigg[\partial_{1111} v_1^4 + 4 \partial_{1112} v_1^3 v_2 + 6 \partial_{1122} v_1^2 v_2^2 + 4 \partial_{1222} v_1 v_2^3 + \partial_{2222} v_2^4\bigg]

which matches what we previously obtained using the formula.

In a similar way, we could proceed to systematically calculate the fourth-order term in the trivariate case m = 3. The relevant term in the generic Taylor expansion is the same one as before,  namely

\frac{1}{4!} \partial_{ijkl} v_i v_j v_k v_l

However, since each of the indices will take values from 1 to 3 in the trivariate case, the above expression now represents a sum of 3^4 = 81 terms. The 81 terms are

\frac{1}{4!} \bigg[\partial_{1111} v_1 v_1 v_1 v_1 + \partial_{1121} v_1 v_1 v_2 v_1 + \partial_{1131} v_1 v_1 v_3 v_1 \ +

\partial_{1112} v_1 v_1 v_1 v_2 + \partial_{1122} v_1 v_1 v_2 v_2 + \partial_{1132} v_1 v_1 v_3 v_2 \ +

\partial_{1113} v_1 v_1 v_1 v_3 + \partial_{1123} v_1 v_1 v_2 v_3 + \partial_{1133} v_1 v_1 v_3 v_3 \ +

\partial_{1211} v_1 v_2 v_1 v_1 + \partial_{1221} v_1 v_2 v_2 v_1 + \partial_{1231} v_1 v_2 v_3 v_1 \ +

\partial_{1212} v_1 v_2 v_1 v_2 + \partial_{1222} v_1 v_2 v_2 v_2 + \partial_{1232} v_1 v_2 v_3 v_2 \ +

\partial_{1213} v_1 v_2 v_1 v_3 + \partial_{1223} v_1 v_2 v_2 v_3 + \partial_{1233} v_1 v_2 v_3 v_3 \ +

\partial_{1311} v_1 v_3 v_1 v_1 + \partial_{1321} v_1 v_3 v_2 v_1 + \partial_{1331} v_1 v_3 v_3 v_1 \ +

\partial_{1312} v_1 v_3 v_1 v_2 + \partial_{1322} v_1 v_3 v_2 v_2 + \partial_{1332} v_1 v_3 v_3 v_2 \ +

\partial_{1313} v_1 v_3 v_1 v_3 + \partial_{1323} v_1 v_3 v_2 v_3 + \partial_{1333} v_1 v_3 v_3 v_3 \ +

\partial_{2111} v_2 v_1 v_1 v_1 + \partial_{2121} v_2 v_1 v_2 v_1 + \partial_{2131} v_2 v_1 v_3 v_1 \ +

\partial_{2112} v_2 v_1 v_1 v_2 + \partial_{2122} v_2 v_1 v_2 v_2 + \partial_{2132} v_2 v_1 v_3 v_2 \ +

\partial_{2113} v_2 v_1 v_1 v_3 + \partial_{2123} v_2 v_1 v_2 v_3 + \partial_{2133} v_2 v_1 v_3 v_3 \ +

\partial_{2211} v_2 v_2 v_1 v_1 + \partial_{2221} v_2 v_2 v_2 v_1 + \partial_{2231} v_2 v_2 v_3 v_1 \ +

\partial_{2212} v_2 v_2 v_1 v_2 + \partial_{2222} v_2 v_2 v_2 v_2 + \partial_{2232} v_2 v_2 v_3 v_2 \ +

\partial_{2213} v_2 v_2 v_1 v_3 + \partial_{2223} v_2 v_2 v_2 v_3 + \partial_{2233} v_2 v_2 v_3 v_3 \ +

\partial_{2311} v_2 v_3 v_1 v_1 + \partial_{2321} v_2 v_3 v_2 v_1 + \partial_{2331} v_2 v_3 v_3 v_1 \ +

\partial_{2312} v_2 v_3 v_1 v_2 + \partial_{2322} v_2 v_3 v_2 v_2 + \partial_{2332} v_2 v_3 v_3 v_2 \ +

\partial_{2313} v_2 v_3 v_1 v_3 + \partial_{2323} v_2 v_3 v_2 v_3 + \partial_{2333} v_2 v_3 v_3 v_3 \ +

\partial_{3111} v_3 v_1 v_1 v_1 + \partial_{3121} v_3 v_1 v_2 v_1 + \partial_{3131} v_3 v_1 v_3 v_1 \ +

\partial_{3112} v_3 v_1 v_1 v_2 + \partial_{3122} v_3 v_1 v_2 v_2 + \partial_{3132} v_3 v_1 v_3 v_2 \ +

\partial_{3113} v_3 v_1 v_1 v_3 + \partial_{3123} v_3 v_1 v_2 v_3 + \partial_{3133} v_3 v_1 v_3 v_3 \ +

\partial_{3211} v_3 v_2 v_1 v_1 + \partial_{3221} v_3 v_2 v_2 v_1 + \partial_{3231} v_3 v_2 v_3 v_1 \ +

\partial_{3212} v_3 v_2 v_1 v_2 + \partial_{3222} v_3 v_2 v_2 v_2 + \partial_{3232} v_3 v_2 v_3 v_2 \ +

\partial_{3213} v_3 v_2 v_1 v_3 + \partial_{3223} v_3 v_2 v_2 v_3 + \partial_{3233} v_3 v_2 v_3 v_3 \ +

\partial_{3311} v_3 v_3 v_1 v_1 + \partial_{3321} v_3 v_3 v_2 v_1 + \partial_{3331} v_3 v_3 v_3 v_1 \ +

\partial_{3312} v_3 v_3 v_1 v_2 + \partial_{3322} v_3 v_3 v_2 v_2 + \partial_{3332} v_3 v_3 v_3 v_2 \ +

\partial_{3313} v_3 v_3 v_1 v_3 + \partial_{3323} v_3 v_3 v_2 v_3 + \partial_{3333} v_3 v_3 v_3 v_3 \bigg]

Again due to equality of cross-partials involving the same index numbers, this reduces to

\frac{1}{4!} \bigg[ \partial_{1111} v_1^4 + \partial_{2222} v_2^4 + \partial_{3333} v_3^4 + 4 \partial_{1112} v_1^3 v_2 + 4 \partial_{1113} v_1^3 v_3 +

4 \partial_{2223} v_2^3 v_3 + 4 \partial_{1222} v_1 v_2^3 + 4 \partial_{1333} v_1 v_3^3 + 4 \partial_{2333} v_2 v_3^3 +

6 \partial_{1122} v_1^2 v_2^2 + 6 \partial_{1133} v_1^2 v_3^2 + 6 \partial_{2233} v_2^2 v_3^2 + 12 \partial_{1123} v_1^2 v_2 v_3 + 12 \partial_{1223} v_1 v_2^2 v_3 + 12 \partial_{1233} v_1 v_2 v_3^2 \bigg]

Note that the coefficient of each of the terms is the multinomial coefficient

\binom{k}{k_1, k_2, \ldots, k_m} = \frac{k!}{k_1! k_2! \dots k_m!}

where k_1 + k_2 + \cdots + k_m = k. This gives the number of ordered arrangements of k objects in which there are k_1 objects of type 1, k_2 objects of type 2, \ldots, k_m objects of type m.

In our case we have k = 4 and m = 3. Thus, the coefficient of \partial_{1122} v_1^2 v_2^2 for example is 6, because there are two 1s, two 2s and zero 3s, and setting k_1 = 2, k_2 = 2, k_3 = 0 we get

\binom{k}{k_1, k_2, k_3} = \frac{k!}{k_1! k_2! k_3!} = \frac{4!}{2! 2! 0!} = \frac{24}{4} = 6

Similarly, in the case of \partial_{1123} v_1^2 v_2 v_3 the coefficient is 12 because k_1 = 2, k_2 = 1, k_3 = 1, so

\binom{k}{k_1, k_2, k_3} = \frac{4!}{2! 1! 1!} = \frac{24}{2} = 12

Note also that within the square brackets in the reduced expression above there are 15 different `types’ of terms. Why 15? The answer has to do with the concept of partitions from number theory. A partition of a positive integer is a way of writing that integer as a sum of positive integers. Two sums that differ only in the order of the integers are considered to be the same partition. Now, the number 4 has five partitions, namely

4

3 + 1

2 + 2

2 + 1 + 1

1 + 1 + 1 + 1

In the multinomial coefficients in the square brackets in the reduced expression above we are trying to obtain the sum k = 4 in all possible ways by assigning values to k_1, k_2 and k_3. There are 3 ways of achieving this with the partition 4 (one example is k_1 = 4, k_2 = 0, k_3 = 0); there are 6 ways of achieving this with the partition 3 + 1 (one example is k_1 = 3, k_2 = 1, k_3 = 0); there are 3 ways of achieving this with the partition 2 + 2 (one example is k_1 = 2, k_2 = 2, k_3 = 0); there are 3 ways of achieving this with the partition 2 + 1 + 1 (one example is k_1 = 2, k_2 =1, k_3 = 1); and finally it is impossible to use the partition 1 + 1 + 1 + 1 since we are only summing three terms. Thus there are 15 ways to achieve the sum k = 4 by applying the partitions of 4 to the values of k_1, k_2 and k_3, which explains why there are 15 terms in the square brackets in the reduced expression above.

Can we use these insights to quickly work out the fourth-order term in the Taylor expansion of the quadvariate function f : \mathbb{R}^4 \rightarrow \mathbb{R}? Since m = 4 and n = 4, this fourth-order term will be a sum of 4^4 = 256 terms. Can we work out how many terms there will be in the reduced expression in which equivalent cross-partials are grouped together? Yes we can, using the partitions of 4 again. In the multinomial coefficients in the square brackets in the reduced expression we will be trying to obtain the sum k = 4 in all possible ways by assigning values to k_1, k_2, k_3 and k_4. There are 4 ways of achieving this with the partition 4 (one example is k_1 = 4, k_2 = 0, k_3 = 0, k_4 = 0); there are 12 ways of achieving this with the partition 3 + 1 (one example is k_1 = 3, k_2 = 1, k_3 = 0, k_4 = 0); there are 6 ways of achieving this with the partition 2 + 2 (one example is k_1 = 2, k_2 = 2, k_3 = 0, k_4 = 0); there are 12 ways of achieving this with the partition 2 + 1 + 1 (one example is k_1 = 2, k_2 =1, k_3 = 1, k_4 = 0); and finally there is exactly one way to use the partition 1 + 1 + 1 + 1, namely k_1 = 1, k_2 = 1, k_3 = 1, k_4 = 1. Thus there are 35 ways to achieve the sum k = 4 by applying the partitions of 4 to the values of k_1, k_2, k_3 and k_4, so there should be 35 different types of terms in the reduced expression. These will be

\frac{1}{4!} \bigg[\partial_{1111} v_1^4 + \partial_{2222} v_2^4 + \partial_{3333} v_3^4 + \partial_{4444} v_4^4 \ +

4 \partial_{1112} v_1^3 v_2 + 4 \partial_{1222} v_1 v_2^3 + 4 \partial_{2223} v_2^3 v_3 \ +

4 \partial_{3444} v_3 v_4^3 + 4 \partial_{1113} v_1^3 v_3 + 4 \partial_{1333} v_1 v_3^3 \ +

4 \partial_{2333} v_2 v_3^3 + 4 \partial_{2224} v_2^3 v_4 + 4 \partial_{1114} v_1^3 v_4 \ +

4 \partial_{1444} v_1 v_4^3 + 4 \partial_{3334} v_3^3 v_4 + 4 \partial_{2444} v_2 v_4^3 \ +

6 \partial_{1122} v_1^2 v_2^2 + 6 \partial_{2233} v_2^2 v_3^2 + 6 \partial_{1133} v_1^2 v_3^2 \ +

6 \partial_{2244} v_2^2 v_4^2 + 6 \partial_{1144} v_1^2 v_4^2 + 6 \partial_{3344} v_3^2 v_4^2 \ +

12 \partial_{1123} v_1^2 v_2 v_3 + 12 \partial_{1124} v_1^2 v_2 v_4 + 12 \partial_{1134} v_1^2 v_3 v_4 \ +

12 \partial_{1223} v_1 v_2^2 v_3 + 12 \partial_{1224} v_1 v_2^2 v_4 + 12 \partial_{1334} v_1 v_3^2 v_4 \ +

12 \partial_{1344} v_1 v_3 v_4^2 + 12 \partial_{1244} v_1 v_2 v_4^2 + 12 \partial_{1233} v_1 v_2 v_3^2 \ +

12 \partial_{2334} v_2 v_3^2 v_4 + 12 \partial_{2234} v_2^2 v_3 v_4 + 12 \partial_{2344} v_2 v_3 v_4^2 \ +

24 \partial_{1234} v_1 v_2 v_3 v_4 \bigg]

The coefficient on each of these terms is the relevant multinomial coefficient, and note that these coefficients add up to 256. The following are examples of how I calculated the multinomial coefficients in this quadvariate case:

For the term involving \partial_{1112} we have k_1 = 3, k_2 = 1, k_3 = 0, k_4 = 0, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{3! 1! 0! 0!} = \frac{24}{6} = 4

For the term involving \partial_{1122} we have k_1 = 2, k_2 = 2, k_3 = 0, k_4 = 0, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{2! 2! 0! 0!} = \frac{24}{4} = 6

For the term involving \partial_{1123} we have k_1 = 2, k_2 = 1, k_3 = 1, k_4 = 0, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{2! 1! 1! 0!} = \frac{24}{2} = 12

Finally, for the term involving \partial_{1234} we have k_1 = 1, k_2 = 1, k_3 = 1, k_4 = 1, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{1! 1! 1! 1!} = \frac{24}{1} = 24