Alternative approaches for a differential equation involving the square of a derivative

In the course of working out a calculus of variations problem I came across the nonlinear first-order differential equation

\big( \frac{\mathrm{d} y}{\mathrm{d} x}\big)^2 + y = c

where c is a constant. The boundary conditions of the original variational problem were y(0) = 0 and y(1) = 1. Differential equations involving squares of derivatives can be tricky, so I took the opportunity to find at least a couple of different ways of solving this relatively simple one that might also be applicable in harder cases. I want to record these here. (The reader may wish to pause at this point and try to solve the equation for themselves before continuing).

One approach that worked nicely and that is generalisable to more difficult cases is to differentiate the equation, giving

2y^{\prime} y^{\prime \prime} + y^{\prime} = 0

\iff

y^{\prime}(2 y^{\prime \prime} + 1) = 0

This yields two simpler differential equations, namely

y^{\prime} = 0

and

2y^{\prime \prime} + 1 = 0

The first one implies that y is a constant, but this is not consistent with the boundary conditions. We therefore discard it, and the problem reduces to solving the simple second-order differential equation

\frac{\mathrm{d}^2 y}{\mathrm{d} x^2} = -\frac{1}{2}

Integrating twice we get the general solution

y = -\frac{1}{4} x^2 + c_1 x + c_2

where c_1 and c_2 are arbitrary constants. Using y(0) = 0 and y(1) = 1 we find that c_1 = \frac{5}{4} and c_2 = 0. Therefore the required final solution of the differential equation with the squared derivative is

y = -\frac{1}{4} x^2 + \frac{5}{4} x
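As a quick numerical sanity check (a short Python sketch of my own, not part of the derivation), we can confirm that this y satisfies the boundary conditions and makes (y')^2 + y constant, with c = 25/16:

```python
# Check that y(x) = -x^2/4 + 5x/4 satisfies (y')^2 + y = 25/16
# together with the boundary conditions y(0) = 0 and y(1) = 1.

def y(x):
    return -0.25 * x**2 + 1.25 * x

def dy(x):
    # Exact derivative: y'(x) = -x/2 + 5/4
    return -0.5 * x + 1.25

# Boundary conditions
assert abs(y(0.0) - 0.0) < 1e-12
assert abs(y(1.0) - 1.0) < 1e-12

# (y')^2 + y should equal the same constant c at every point
c = 25.0 / 16.0
for i in range(101):
    x = i / 100.0
    assert abs(dy(x)**2 + y(x) - c) < 1e-12

print("constant c =", c)
```

The x^2 terms in (y')^2 and y cancel identically, which is why the residual stays at machine precision across the whole interval.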

The other approach I tried, which also worked in this case, was to rearrange the equation with the squared derivative directly in order to use separation of variables. Again, this is something that might be worth trying in more difficult situations. In the present case we have (relabelling the constant as c_1, since a second constant of integration will appear below)

\big( \frac{\mathrm{d} y}{\mathrm{d} x}\big)^2 + y = c_1

so

(\mathrm{d} y)^2 = (c_1 - y) (\mathrm{d} x)^2

or

\big(\frac{\mathrm{d} y}{\sqrt{c_1 - y}}\big)^2 = (\mathrm{d} x)^2

and therefore

\big(\int \frac{\mathrm{d} y}{\sqrt{c_1 - y}}\big)^2 = (\int \mathrm{d} x)^2

Strictly speaking, taking square roots here introduces a sign ambiguity, but both the positive and negative branches lead to the same family of solutions once the sign is absorbed into the constants. So the problem reduces to solving

\int \frac{\mathrm{d} y}{\sqrt{c_1 - y}} = \int \mathrm{d} x

Carrying out the integrations we get

-2 \sqrt{c_1 - y} = x + c_2

which yields the general solution

y = c_1 - \frac{1}{4} x^2 - \frac{1}{2} c_2 x - \frac{1}{4} c_2^2

Using the boundary conditions we find that c_1 = \frac{25}{16} and c_2 = -\frac{5}{2}. Substituting these into the general solution we again find the final solution of the differential equation with the squared derivative to be

y = -\frac{1}{4} x^2 + \frac{5}{4} x

A note on the structure of the higher-order terms in multivariate Taylor expansions

I recently had to think a bit more deeply than is usually necessary about the nature of the higher-order terms in multivariate Taylor expansions of smooth functions of the form f : \mathbb{R}^m \rightarrow \mathbb{R}. The result was a realisation that the fundamental structure of these higher-order terms is actually quite simple and that working them out manually brings up some interesting combinatorial issues, including use of the partition function from advanced number theory. I want to record some of my thoughts about this here.

For a univariate function f : \mathbb{R} \rightarrow \mathbb{R}, Taylor’s theorem says that if f is n times differentiable at some point a \in \mathbb{R} then there exists a function h_n : \mathbb{R} \rightarrow \mathbb{R} such that

f(x) - f(a) =

f^{\prime}(a)(x - a) + \frac{1}{2!} f^{\prime \prime}(a)(x - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(x - a)^n + h_n(x)(x - a)^n

where h_n(x) \rightarrow 0 as x \rightarrow a. The n-th order Taylor polynomial

f(a) + f^{\prime}(a)(x - a) + \frac{1}{2!} f^{\prime \prime}(a)(x - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(x - a)^n

then serves as an approximation to the function f(x). If f is infinitely differentiable and, moreover, analytic (so that its Taylor series actually converges to f), we can approximate f near a to any desired level of accuracy by choosing the Taylor polynomial to be of high enough degree.

In the univariate case the n-th order term in the Taylor polynomial is

\frac{1}{n!}f^{(n)}(a)(x - a)^n

In the present note I am interested in how this generalises to the multivariate case f : \mathbb{R}^m \rightarrow \mathbb{R}, where the Taylor expansion is about a vector \mathbf{a} = (a_1, a_2, \ldots, a_m). In the bivariate case m = 2, where the Taylor expansion is about the point \mathbf{a} = (a_1, a_2), the n-th order term in the Taylor expansion is given by the formula

\frac{1}{n!}\sum_{k=0}^{n} \binom{n}{k} \frac{\partial^n f}{\partial x_1^{n-k} \partial x_2^{k}}\bigg|_{\mathbf{a}} (x_1 - a_1)^{n-k}(x_2 - a_2)^k

This enables any higher-order term in the Taylor polynomial to be quickly obtained (and notice that the binomial coefficient values can be quickly read from the appropriate line of Pascal’s triangle). For example, the fourth-order term would be

\frac{1}{4!} \bigg[\frac{\partial^4f}{\partial x_1^4}\bigg|_{\mathbf{a}} (x_1 - a_1)^4 + 4 \frac{\partial^4 f}{\partial x_1^3 \partial x_2}\bigg|_{\mathbf{a}} (x_1 - a_1)^3 (x_2 - a_2) \ \ +

6 \frac{\partial^4 f}{\partial x_1^2 \partial x_2^2}\bigg|_{\mathbf{a}} (x_1 - a_1)^2 (x_2 - a_2)^2 + 4 \frac{\partial^4 f}{\partial x_1 \partial x_2^3}\bigg|_{\mathbf{a}} (x_1 - a_1) (x_2 - a_2)^3 \  \  +

\frac{\partial^4f}{\partial x_2^4}\bigg|_{\mathbf{a}} (x_2 - a_2)^4\bigg]
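The binomial weights in the n-th order bivariate term can be generated mechanically; here is a small Python sketch (the function name is my own) that reproduces the Pascal row used above:

```python
import math

def bivariate_term_coefficients(n):
    """Binomial weights C(n, k), k = 0..n, appearing in the n-th order
    term of a bivariate Taylor expansion (row n of Pascal's triangle)."""
    return [math.comb(n, k) for k in range(n + 1)]

# Fourth-order term: weights on the mixed partials
# d^4f/dx1^4, d^4f/dx1^3 dx2, ..., d^4f/dx2^4
row = bivariate_term_coefficients(4)
assert row == [1, 4, 6, 4, 1]
# The weights sum to 2^4 = 16, the number of raw index combinations
assert sum(row) == 2**4
print(row)
```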

In order to generalise this as easily as possible to multivariate cases with m > 2, I found it convenient to adopt the following notation:

\partial_i \equiv \frac{\partial f}{\partial x_i} \bigg|_{\mathbf{a}}

\partial_{ij} \equiv \frac{\partial^2 f}{\partial x_i \partial x_j} \bigg|_{\mathbf{a}}

\partial_{ijk} \equiv \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k} \bigg|_{\mathbf{a}}

and so on, and

v_i \equiv (x_i - a_i)

Given the function f : \mathbb{R}^m \rightarrow \mathbb{R}, we allow the indices above to take values from 1 to m. To my surprise, I then realised that if we use the Einstein summation convention, it is possible to represent any Taylor expansion as

f(\mathbf{x}) = f(\mathbf{a}) + \partial_i v_i + \frac{1}{2!} \partial_{ij} v_i v_j + \frac{1}{3!} \partial_{ijk} v_i v_j v_k + \frac{1}{4!} \partial_{ijkl} v_i v_j v_k v_l + \cdots

This generic formula will apply to any number of variables m. Notice that the number of indices in each term corresponds to the order of that term. So, for example, the fifth-order term in the Taylor expansion will involve five indices, the sixth-order term will involve six indices, and so on.

To test out this formula, let us use it to work out the fourth-order term for the bivariate case m = 2 again. The relevant term in the above form of the Taylor expansion is

\frac{1}{4!} \partial_{ijkl} v_i v_j v_k v_l

Since each of the indices i, j, k, l will be allowed to take values 1 or 2, the above expression represents a sum involving 2^4 = 16 terms. In general, the n-th order term in the Taylor expansion of a function f : \mathbb{R}^m \rightarrow \mathbb{R} will involve m^n terms. The 16 terms in the above case are

\frac{1}{4!} \bigg[ \partial_{1111} v_1 v_1 v_1 v_1 + \partial_{1112} v_1 v_1 v_1 v_2 + \partial_{1121} v_1 v_1 v_2 v_1 + \partial_{1122} v_1 v_1 v_2 v_2 \ +

\partial_{1211} v_1 v_2 v_1 v_1 + \partial_{1212} v_1 v_2 v_1 v_2 + \partial_{1221} v_1 v_2 v_2 v_1 + \partial_{1222} v_1 v_2 v_2 v_2 \ +

\partial_{2111} v_2 v_1 v_1 v_1 + \partial_{2112} v_2 v_1 v_1 v_2 + \partial_{2121} v_2 v_1 v_2 v_1 + \partial_{2122} v_2 v_1 v_2 v_2 \ +

\partial_{2211} v_2 v_2 v_1 v_1 + \partial_{2212} v_2 v_2 v_1 v_2 + \partial_{2221} v_2 v_2 v_2 v_1 + \partial_{2222} v_2 v_2 v_2 v_2\bigg]

However, due to the equality of cross-partials involving the same index numbers, this reduces to

\frac{1}{4!}\bigg[\partial_{1111} v_1^4 + 4 \partial_{1112} v_1^3 v_2 + 6 \partial_{1122} v_1^2 v_2^2 + 4 \partial_{1222} v_1 v_2^3 + \partial_{2222} v_2^4\bigg]

which matches what we previously obtained using the formula.
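This regrouping of the 16 raw terms can also be checked mechanically. The following Python sketch enumerates the index tuples and groups them by their sorted multiset of indices, since mixed partials with the same indices in any order are equal:

```python
from itertools import product
from collections import Counter

# Enumerate all 2^4 = 16 index tuples (i, j, k, l) with entries in {1, 2}.
# Tuples with the same multiset of indices contribute to the same term;
# count how many raw tuples fall into each group.
groups = Counter(tuple(sorted(t)) for t in product((1, 2), repeat=4))

assert len(groups) == 5                            # five distinct term types
assert sorted(groups.values()) == [1, 1, 4, 4, 6]  # the binomial weights 1,4,6,4,1
assert sum(groups.values()) == 16                  # all raw terms accounted for
print(dict(groups))
```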

In a similar way, we could proceed to systematically calculate the fourth-order term in the trivariate case m = 3. The relevant term in the generic Taylor expansion is the same one as before, namely

\frac{1}{4!} \partial_{ijkl} v_i v_j v_k v_l

However, since each of the indices will take values from 1 to 3 in the trivariate case, the above expression now represents a sum of 3^4 = 81 terms. The 81 terms are

\frac{1}{4!} \bigg[\partial_{1111} v_1 v_1 v_1 v_1 + \partial_{1121} v_1 v_1 v_2 v_1 + \partial_{1131} v_1 v_1 v_3 v_1 \ +

\partial_{1112} v_1 v_1 v_1 v_2 + \partial_{1122} v_1 v_1 v_2 v_2 + \partial_{1132} v_1 v_1 v_3 v_2 \ +

\partial_{1113} v_1 v_1 v_1 v_3 + \partial_{1123} v_1 v_1 v_2 v_3 + \partial_{1133} v_1 v_1 v_3 v_3 \ +

\partial_{1211} v_1 v_2 v_1 v_1 + \partial_{1221} v_1 v_2 v_2 v_1 + \partial_{1231} v_1 v_2 v_3 v_1 \ +

\partial_{1212} v_1 v_2 v_1 v_2 + \partial_{1222} v_1 v_2 v_2 v_2 + \partial_{1232} v_1 v_2 v_3 v_2 \ +

\partial_{1213} v_1 v_2 v_1 v_3 + \partial_{1223} v_1 v_2 v_2 v_3 + \partial_{1233} v_1 v_2 v_3 v_3 \ +

\partial_{1311} v_1 v_3 v_1 v_1 + \partial_{1321} v_1 v_3 v_2 v_1 + \partial_{1331} v_1 v_3 v_3 v_1 \ +

\partial_{1312} v_1 v_3 v_1 v_2 + \partial_{1322} v_1 v_3 v_2 v_2 + \partial_{1332} v_1 v_3 v_3 v_2 \ +

\partial_{1313} v_1 v_3 v_1 v_3 + \partial_{1323} v_1 v_3 v_2 v_3 + \partial_{1333} v_1 v_3 v_3 v_3 \ +

\partial_{2111} v_2 v_1 v_1 v_1 + \partial_{2121} v_2 v_1 v_2 v_1 + \partial_{2131} v_2 v_1 v_3 v_1 \ +

\partial_{2112} v_2 v_1 v_1 v_2 + \partial_{2122} v_2 v_1 v_2 v_2 + \partial_{2132} v_2 v_1 v_3 v_2 \ +

\partial_{2113} v_2 v_1 v_1 v_3 + \partial_{2123} v_2 v_1 v_2 v_3 + \partial_{2133} v_2 v_1 v_3 v_3 \ +

\partial_{2211} v_2 v_2 v_1 v_1 + \partial_{2221} v_2 v_2 v_2 v_1 + \partial_{2231} v_2 v_2 v_3 v_1 \ +

\partial_{2212} v_2 v_2 v_1 v_2 + \partial_{2222} v_2 v_2 v_2 v_2 + \partial_{2232} v_2 v_2 v_3 v_2 \ +

\partial_{2213} v_2 v_2 v_1 v_3 + \partial_{2223} v_2 v_2 v_2 v_3 + \partial_{2233} v_2 v_2 v_3 v_3 \ +

\partial_{2311} v_2 v_3 v_1 v_1 + \partial_{2321} v_2 v_3 v_2 v_1 + \partial_{2331} v_2 v_3 v_3 v_1 \ +

\partial_{2312} v_2 v_3 v_1 v_2 + \partial_{2322} v_2 v_3 v_2 v_2 + \partial_{2332} v_2 v_3 v_3 v_2 \ +

\partial_{2313} v_2 v_3 v_1 v_3 + \partial_{2323} v_2 v_3 v_2 v_3 + \partial_{2333} v_2 v_3 v_3 v_3 \ +

\partial_{3111} v_3 v_1 v_1 v_1 + \partial_{3121} v_3 v_1 v_2 v_1 + \partial_{3131} v_3 v_1 v_3 v_1 \ +

\partial_{3112} v_3 v_1 v_1 v_2 + \partial_{3122} v_3 v_1 v_2 v_2 + \partial_{3132} v_3 v_1 v_3 v_2 \ +

\partial_{3113} v_3 v_1 v_1 v_3 + \partial_{3123} v_3 v_1 v_2 v_3 + \partial_{3133} v_3 v_1 v_3 v_3 \ +

\partial_{3211} v_3 v_2 v_1 v_1 + \partial_{3221} v_3 v_2 v_2 v_1 + \partial_{3231} v_3 v_2 v_3 v_1 \ +

\partial_{3212} v_3 v_2 v_1 v_2 + \partial_{3222} v_3 v_2 v_2 v_2 + \partial_{3232} v_3 v_2 v_3 v_2 \ +

\partial_{3213} v_3 v_2 v_1 v_3 + \partial_{3223} v_3 v_2 v_2 v_3 + \partial_{3233} v_3 v_2 v_3 v_3 \ +

\partial_{3311} v_3 v_3 v_1 v_1 + \partial_{3321} v_3 v_3 v_2 v_1 + \partial_{3331} v_3 v_3 v_3 v_1 \ +

\partial_{3312} v_3 v_3 v_1 v_2 + \partial_{3322} v_3 v_3 v_2 v_2 + \partial_{3332} v_3 v_3 v_3 v_2 \ +

\partial_{3313} v_3 v_3 v_1 v_3 + \partial_{3323} v_3 v_3 v_2 v_3 + \partial_{3333} v_3 v_3 v_3 v_3 \bigg]

Again due to equality of cross-partials involving the same index numbers, this reduces to

\frac{1}{4!} \bigg[ \partial_{1111} v_1^4 + \partial_{2222} v_2^4 + \partial_{3333} v_3^4 + 4 \partial_{1112} v_1^3 v_2 + 4 \partial_{1113} v_1^3 v_3 +

4 \partial_{2223} v_2^3 v_3 + 4 \partial_{1222} v_1 v_2^3 + 4 \partial_{1333} v_1 v_3^3 + 4 \partial_{2333} v_2 v_3^3 +

6 \partial_{1122} v_1^2 v_2^2 + 6 \partial_{1133} v_1^2 v_3^2 + 6 \partial_{2233} v_2^2 v_3^2 + 12 \partial_{1123} v_1^2 v_2 v_3 + 12 \partial_{1223} v_1 v_2^2 v_3 + 12 \partial_{1233} v_1 v_2 v_3^2 \bigg]
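The same mechanical check works in the trivariate case. This Python sketch confirms that the 81 raw terms collapse into 15 groups whose sizes are exactly the multinomial coefficients discussed below:

```python
from itertools import product
from collections import Counter
from math import factorial

# Group the 3^4 = 81 index tuples by their sorted multiset of indices.
groups = Counter(tuple(sorted(t)) for t in product((1, 2, 3), repeat=4))
assert len(groups) == 15           # 15 distinct term types
assert sum(groups.values()) == 81  # all raw terms accounted for

# Each group size equals the multinomial coefficient 4!/(k1! k2! k3!)
for indices, count in groups.items():
    k = Counter(indices)           # k[i] = multiplicity of index i
    coeff = factorial(4)
    for i in (1, 2, 3):
        coeff //= factorial(k[i])
    assert count == coeff

print(len(groups), "term types")
```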

Note that the coefficient of each of the terms is the multinomial coefficient

\binom{k}{k_1, k_2, \ldots, k_m} = \frac{k!}{k_1! k_2! \dots k_m!}

where k_1 + k_2 + \cdots + k_m = k. This gives the number of ordered arrangements of k objects in which there are k_1 objects of type 1, k_2 objects of type 2, \ldots, k_m objects of type m.

In our case we have k = 4 and m = 3. Thus, the coefficient of \partial_{1122} v_1^2 v_2^2 for example is 6, because there are two 1s, two 2s and zero 3s, and setting k_1 = 2, k_2 = 2, k_3 = 0 we get

\binom{k}{k_1, k_2, k_3} = \frac{k!}{k_1! k_2! k_3!} = \frac{4!}{2! 2! 0!} = \frac{24}{4} = 6

Similarly, in the case of \partial_{1123} v_1^2 v_2 v_3 the coefficient is 12 because k_1 = 2, k_2 = 1, k_3 = 1, so

\binom{k}{k_1, k_2, k_3} = \frac{4!}{2! 1! 1!} = \frac{24}{2} = 12

Note also that within the square brackets in the reduced expression above there are 15 different `types’ of terms. Why 15? The answer has to do with the concept of partitions from number theory. A partition of a positive integer is a way of writing that integer as a sum of positive integers. Two sums that differ only in the order of the integers are considered to be the same partition. Now, the number 4 has five partitions, namely

4

3 + 1

2 + 2

2 + 1 + 1

1 + 1 + 1 + 1

In the multinomial coefficients in the square brackets in the reduced expression above we are trying to obtain the sum k = 4 in all possible ways by assigning values to k_1, k_2 and k_3. There are 3 ways of achieving this with the partition 4 (one example is k_1 = 4, k_2 = 0, k_3 = 0); there are 6 ways of achieving this with the partition 3 + 1 (one example is k_1 = 3, k_2 = 1, k_3 = 0); there are 3 ways of achieving this with the partition 2 + 2 (one example is k_1 = 2, k_2 = 2, k_3 = 0); there are 3 ways of achieving this with the partition 2 + 1 + 1 (one example is k_1 = 2, k_2 =1, k_3 = 1); and finally it is impossible to use the partition 1 + 1 + 1 + 1 since we are only summing three terms. Thus there are 15 ways to achieve the sum k = 4 by applying the partitions of 4 to the values of k_1, k_2 and k_3, which explains why there are 15 terms in the square brackets in the reduced expression above.

Can we use these insights to quickly work out the fourth-order term in the Taylor expansion of the quadvariate function f : \mathbb{R}^4 \rightarrow \mathbb{R}? Since m = 4 and n = 4, this fourth-order term will be a sum of 4^4 = 256 terms. Can we work out how many terms there will be in the reduced expression in which equivalent cross-partials are grouped together? Yes we can, using the partitions of 4 again. In the multinomial coefficients in the square brackets in the reduced expression we will be trying to obtain the sum k = 4 in all possible ways by assigning values to k_1, k_2, k_3 and k_4. There are 4 ways of achieving this with the partition 4 (one example is k_1 = 4, k_2 = 0, k_3 = 0, k_4 = 0); there are 12 ways of achieving this with the partition 3 + 1 (one example is k_1 = 3, k_2 = 1, k_3 = 0, k_4 = 0); there are 6 ways of achieving this with the partition 2 + 2 (one example is k_1 = 2, k_2 = 2, k_3 = 0, k_4 = 0); there are 12 ways of achieving this with the partition 2 + 1 + 1 (one example is k_1 = 2, k_2 =1, k_3 = 1, k_4 = 0); and finally there is exactly one way to use the partition 1 + 1 + 1 + 1, namely k_1 = 1, k_2 = 1, k_3 = 1, k_4 = 1. Thus there are 35 ways to achieve the sum k = 4 by applying the partitions of 4 to the values of k_1, k_2, k_3 and k_4, so there should be 35 different types of terms in the reduced expression. These will be

\frac{1}{4!} \bigg[\partial_{1111} v_1^4 + \partial_{2222} v_2^4 + \partial_{3333} v_3^4 + \partial_{4444} v_4^4 \ +

4 \partial_{1112} v_1^3 v_2 + 4 \partial_{1222} v_1 v_2^3 + 4 \partial_{2223} v_2^3 v_3 \ +

4 \partial_{3444} v_3 v_4^3 + 4 \partial_{1113} v_1^3 v_3 + 4 \partial_{1333} v_1 v_3^3 \ +

4 \partial_{2333} v_2 v_3^3 + 4 \partial_{2224} v_2^3 v_4 + 4 \partial_{1114} v_1^3 v_4 \ +

4 \partial_{1444} v_1 v_4^3 + 4 \partial_{3334} v_3^3 v_4 + 4 \partial_{2444} v_2 v_4^3 \ +

6 \partial_{1122} v_1^2 v_2^2 + 6 \partial_{2233} v_2^2 v_3^2 + 6 \partial_{1133} v_1^2 v_3^2 \ +

6 \partial_{2244} v_2^2 v_4^2 + 6 \partial_{1144} v_1^2 v_4^2 + 6 \partial_{3344} v_3^2 v_4^2 \ +

12 \partial_{1123} v_1^2 v_2 v_3 + 12 \partial_{1124} v_1^2 v_2 v_4 + 12 \partial_{1134} v_1^2 v_3 v_4 \ +

12 \partial_{1223} v_1 v_2^2 v_3 + 12 \partial_{1224} v_1 v_2^2 v_4 + 12 \partial_{1334} v_1 v_3^2 v_4 \ +

12 \partial_{1344} v_1 v_3 v_4^2 + 12 \partial_{1244} v_1 v_2 v_4^2 + 12 \partial_{1233} v_1 v_2 v_3^2 \ +

12 \partial_{2334} v_2 v_3^2 v_4 + 12 \partial_{2234} v_2^2 v_3 v_4 + 12 \partial_{2344} v_2 v_3 v_4^2 \ +

24 \partial_{1234} v_1 v_2 v_3 v_4 \bigg]

The coefficient on each of these terms is the relevant multinomial coefficient, and note that these coefficients add up to 256. The following are examples of how I calculated the multinomial coefficients in this quadvariate case:

For the term involving \partial_{1112} we have k_1 = 3, k_2 = 1, k_3 = 0, k_4 = 0, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{3! 1! 0! 0!} = \frac{24}{6} = 4

For the term involving \partial_{1122} we have k_1 = 2, k_2 = 2, k_3 = 0, k_4 = 0, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{2! 2! 0! 0!} = \frac{24}{4} = 6

For the term involving \partial_{1123} we have k_1 = 2, k_2 = 1, k_3 = 1, k_4 = 0, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{2! 1! 1! 0!} = \frac{24}{2} = 12

Finally, for the term involving \partial_{1234} we have k_1 = 1, k_2 = 1, k_3 = 1, k_4 = 1, so

\binom{k}{k_1, k_2, k_3, k_4} = \frac{4!}{1! 1! 1! 1!} = \frac{24}{1} = 24
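These counting arguments can be confirmed with a short Python sketch (function names are my own) that enumerates all ways of writing k_1 + ... + k_m = 4 with nonnegative k_i, and checks that the multinomial coefficients sum to m^n:

```python
from itertools import product
from math import factorial

def term_types(n, m):
    """All tuples (k1, ..., km) of nonnegative integers summing to n,
    i.e. the distinct term types in the n-th order term of an
    m-variate Taylor expansion."""
    return [ks for ks in product(range(n + 1), repeat=m) if sum(ks) == n]

def multinomial(n, ks):
    """The multinomial coefficient n! / (k1! k2! ... km!)."""
    out = factorial(n)
    for k in ks:
        out //= factorial(k)
    return out

tri = term_types(4, 3)
quad = term_types(4, 4)
assert len(tri) == 15    # trivariate case, as found above
assert len(quad) == 35   # quadvariate case

# The multinomial coefficients over all types sum to m^n raw terms:
assert sum(multinomial(4, ks) for ks in tri) == 3**4    # 81
assert sum(multinomial(4, ks) for ks in quad) == 4**4   # 256
print(len(tri), len(quad))
```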

A study of Euler’s untranslated paper `De integratione aequationum differentialium’

The following are some translations and annotations I made while studying a previously untranslated paper in Latin by Leonhard Euler (1707-1783) entitled De integratione aequationum differentialium (On the integration of differential equations), 1763. (Eneström index number E269, obtained from the Euler archive). This contains, among other things, some interesting treatments of linear first-order equations as well as nonlinear first-order equations such as Bernoulli’s equation and Riccati’s equation. I was curious to see what Euler had to say about these, particularly Riccati’s equation. (Jacopo Francesco Riccati (1676-1754) was a Venetian mathematician who introduced an important class of nonlinear first-order differential equations which now bear his name, and which Euler investigated in greater depth).

To begin with, I will translate the first paragraph of the introduction to the paper as it gives a nice overview of the central idea as well as giving a flavour of Euler’s writing style:

1.

I consider herein differential equations of first order, involving only two variables, which one may represent in the general form M dx + N dy = 0, where M and N denote any functions of a pair of variables x and y. It is demonstrated that such equations always imply a certain relationship between the variables x and y, such that for any given value of one of them, the values for the other can be determined. Since moreover this definite relationship between the two variables is to be found through integration, the integral equation, regarded in its greatest generality, will accommodate a new constant quantity which provided that it depends completely on our choice enables as it were an infinite number of integral equations to be encompassed, which should all be equally compatible with the differential equation. 

Mathematical comments: What Euler is essentially saying in this first paragraph is that he is going to consider methods for integrating first-order differential equations of the form M(x, y) dx + N(x, y) dy = 0 in order to obtain corresponding relationships between the variables x and y which he calls `integral equations’ (aequationes integrales) of the form V(x, y) = Const. The term on the right-hand side is an arbitrary constant and no matter what value is assigned to this constant, the differential of this `integral equation’ will always be of the same form M dx + N dy = 0. \square

After some further introductory comments, the rest of Euler’s long paper consists of a multitude of problems, theorems, corollaries and examples concerning differential equations of this type. The first problem that Euler considers is important because it more or less sets the scene for the rest of the paper:

Problem 1.

7. Given that the differential equation M dx + N dy = 0 is such that \big(\frac{d M}{d y} \big) = \big(\frac{d N}{d x} \big), find its integral equation.   

Solution.

If it is to be that \big(\frac{d M}{d y} \big) = \big(\frac{d N}{d x} \big), then a function must exist defined over a pair of variables x and y, which when it is differentiated yields M dx + N dy. Let V be this function, and given that dV = M dx + N dy, M dx will be the differential of V when only the variable x is considered, and N dy will be its differential when only the variable y is considered. From here therefore V will be found, if either M dx is integrated with y regarded as a constant, or N dy is integrated with x regarded as a constant: and in this way the task is reduced to the integration of a differential formula involving a single variable, which might be achieved either by algebraic methods, or else quadrature of curves might be needed. But since it is the case that the quantity V can be found in two ways, and one integration treats as a constant any function whatever of y, while the other treats as a constant any function whatever of x, so it may then be that both V = \int M dx + Y and V = \int N dy + X hold, and it is always permissible to define these functions Y of y, and X of x, such that \int M dx + Y = \int N dy + X, and this is in all cases readily guaranteed. When this is done it is evident that since the quantity V is to be the integral of the formula M dx + N dy, the complete integral equation of the proposed equation M dx + N dy = 0 will then be V = Const., as this involves a constant which is dependent on our choice. 

Mathematical comments: The first thing to observe here is that Euler uses the notation \big(\frac{d M}{d y} \big) and \big(\frac{d N}{d x} \big) to represent the partial derivatives \frac{\partial M}{\partial y} and \frac{\partial N}{\partial x}. At the start of the solution he says that if \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} then a function V(x, y) must exist which yields M dx + N dy when it is totally differentiated. The reason for this immediately becomes clear if we totally differentiate V(x, y) to get dV = \frac{\partial V}{\partial x} dx + \frac{\partial V}{\partial y} dy. We see that M = \frac{\partial V}{\partial x} and N = \frac{\partial V}{\partial y} so the condition \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} simply amounts to saying that the cross-partials of V are equal, i.e., \frac{\partial^2 V}{\partial x \partial y} = \frac{\partial^2 V}{\partial y \partial x}. This must be true for any well-behaved function. So if the condition \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} holds it will always be possible to find a function V whose total differential is M dx + N dy. Conversely, if it is not true that \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}, there can be no function whose total differential is M dx + N dy. Euler deals with this latter case in a subsequent problem (see below). Since M = \frac{\partial V}{\partial x} we see that V = \int M dx + Y where Y is any function of y. Similarly, since N = \frac{\partial V}{\partial y} we have that V = \int N dy + X where X is any function of x. Equating these two expressions for V we can then work out what the functions X and Y must be in order to make \int M dx + Y and \int N dy + X mutually compatible. This is essentially the strategy that Euler is advocating here. \square

Euler provides a number of examples to illustrate the above procedure. The following is the first one:

Example 1.

12. Integrate this differential equation:

2axy dx + axx dy - y^3 dx - 3xyy dy = 0

The equation compared with the form M dx + N dy = 0 implies

M = 2axy - y^3 and N = axx - 3xyy.

Then the first thing to be considered is whether in this case the problem can be solved? To which end we inquire as to the values of:

\big(\frac{d M}{d y}\big) = 2ax - 3yy, and \big(\frac{d N}{d x}\big) = 2ax - 3yy,

which since they are equal, the suggested method must necessarily succeed. Then it will be discovered, with y taken to be constant:

\int M dx = axxy - y^3 x + Y

which is of such a form that if the differential is taken, treating x as a constant, it will yield:

axx dy - 3yyx dy + dY = N dy,

and upon replacing N with its value axx - 3xyy, the result will be dY = 0, from which we get Y = 0, or Y = const. Hence we will have the required integral equation:

axxy - y^3 x = Const.
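Euler's example can be checked numerically. The Python sketch below (with a = 1 chosen arbitrarily) verifies the exactness condition by finite differences, and that V = axxy - y^3 x has gradient (M, N), so that M dx + N dy = 0 indeed integrates to V = Const:

```python
# Numerical check of Euler's Example 1, taking a = 1 (an arbitrary choice).
# M = 2axy - y^3, N = ax^2 - 3xy^2, and V = ax^2 y - x y^3 should satisfy
# dV/dx = M and dV/dy = N.

a = 1.0
M = lambda x, y: 2 * a * x * y - y**3
N = lambda x, y: a * x**2 - 3 * x * y**2
V = lambda x, y: a * x**2 * y - x * y**3

h = 1e-6
for (x, y) in [(0.3, 0.7), (1.2, -0.5), (2.0, 1.5)]:
    # Exactness: dM/dy = dN/dx (central finite differences)
    dM_dy = (M(x, y + h) - M(x, y - h)) / (2 * h)
    dN_dx = (N(x + h, y) - N(x - h, y)) / (2 * h)
    assert abs(dM_dy - dN_dx) < 1e-5
    # The gradient of V recovers (M, N)
    dV_dx = (V(x + h, y) - V(x - h, y)) / (2 * h)
    dV_dy = (V(x, y + h) - V(x, y - h)) / (2 * h)
    assert abs(dV_dx - M(x, y)) < 1e-5
    assert abs(dV_dy - N(x, y)) < 1e-5

print("exactness and potential function verified")
```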

In the next extract, Euler introduces an integrating factor method to deal with situations in which the condition \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} does not hold:

Theorem.

16. If for the differential equation M dx + N dy = 0 it is not the case that \big(\frac{d M}{d y}\big) = \big(\frac{d N}{d x}\big), a multiplier is always available by means of which multiplication of the formula M dx + N dy will make it become integrable. 

Proof.

Since it is not the case that \big(\frac{d M}{d y}\big) = \big(\frac{d N}{d x}\big), also the formula M dx + N dy will not be integrable, or rather a function of x and y does not exist whose differential is M dx + N dy. In truth however it is not the formula M dx + N dy, but the equation M dx + N dy = 0, whose integral is being sought; and since this equation still stands if it is multiplied by any function L of x and y, so that it may be written L M dx + L N dy = 0, it is demonstrated that it is always possible to find a function L of this kind, such that the formula L M dx + L N dy = 0 will be integrable. For this to come about it will be necessary that:

\big(\frac{d LM}{d y}\big) = \big(\frac{d LN}{d x}\big)

or if it is supposed that dL = P dx + Q dy, then since \big(\frac{d L}{d y}\big) = Q, and \big(\frac{d L}{d x}\big) = P, the function L should be constructed so that:

L \big(\frac{d M}{d y}\big) + M Q = L \big(\frac{d N}{d x}\big) + NP

But it is apparent that this condition is sufficient to define a function L, which when it multiplies the formula M dx + N dy will make it integrable. 

Mathematical comments: What Euler is doing here is deriving a procedure for finding an integrating factor L(x, y) which will make LM dx + LN d y = 0 integrable. If this is integrable there will be a function V(x, y) whose total differential is LM dx + LN d y. But the total differential of V is d V = \frac{\partial V}{\partial x} dx + \frac{\partial V}{\partial y} dy from which we see that

\frac{\partial V}{\partial x} = LM and \frac{\partial V}{\partial y} = LN

Therefore Euler’s condition

\frac{\partial LM}{\partial y} = \frac{\partial LN}{\partial x}

again amounts to saying that the cross-partials of V must be equal. Expanding Euler’s condition we get

L \frac{\partial M}{\partial y} + M \frac{\partial L}{\partial y} = L \frac{\partial N}{\partial x} + N \frac{\partial L}{\partial x}

which when it is assumed that dL = P dx + Q dy we can write as

L \frac{\partial M}{\partial y} + MQ = L \frac{\partial N}{\partial x} + NP

This is the equation at the end of Euler’s proof and we can see that it is nothing more than a statement of the equality of the cross-partials of V. Euler does not explicitly take the final step, saying simply that it is evident from this that the required function L can be defined, but what he means is that L can now be defined in terms of the functions M, N, P and Q as

L = \frac{NP - MQ}{\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}}

When M dx + N dy is multiplied through by a function L defined in this way, the result will be a differential equation that is integrable. \square

I will now jump ahead in Euler’s paper to an application of the above procedure in the familiar case of a linear first-order differential equation:

Problem 4.

34. If the proposed differential equation is to be 

P dx + Q y dx + R dy = 0

where P, Q and R denote any functions whatever of x, and moreover the other variable y does not have more than one dimension, find the multiplier which makes it integrable. 

Solution.

Comparison of this equation with the form M dx + N dy implies M = P + Qy and N = R, and so we get 

\big(\frac{d M}{d y}\big) = Q and \big(\frac{d N}{d x}\big) = \frac{d R}{d x}

If now L is to be set up as the multiplier in question, let dL = p dx + q dy, and then the equation ought to satisfy

\frac{Np - Mq}{L} = Q - \frac{d R}{d x} = \frac{Rp - (P + Qy)q}{L}

Now since Q - \frac{d R}{d x} is to be a function only of x, L will likewise have to be accepted as a function only of x, so that q = 0 and dL = p dx; whence it will be the case that:

Q - \frac{d R}{d x} = \frac{Rp}{L}, or rather Q dx - dR = \frac{R dL}{L}

and therefore \frac{d L}{L} = \frac{Q dx}{R} - \frac{d R}{R}. Hence upon integration will be obtained lL = \int \frac{Q dx}{R} - lR, and taking e to be the number whose hyperbolic logarithm equals unity, it turns out that 

L = \frac{1}{R} e^{\int \frac{Q dx}{R}}

Then as a result of this discovered multiplier the integral equation will be:

\int \frac{P dx}{R} e^{\int \frac{Q dx}{R}} + y e^{\int \frac{Q dx}{R}} = Const.

Mathematical comments: In view of the previous theorem introducing the integrating factor approach using Euler’s idiosyncratic notation (note that l here denotes the natural logarithm, so lL means \ln L), Euler’s explanation of the solution to this problem is largely self-explanatory. Perhaps only the final step needs clarification. Having found the multiplier L we can multiply the original differential equation by it to get

L P dx + L Q y dx + L R dy = 0

or

\frac{P dx}{R} e^{\int \frac{Q dx}{R}} + \frac{Q dx}{R} y e^{\int \frac{Q dx}{R}} + e^{\int \frac{Q dx}{R}} dy = 0

or

\frac{P dx}{R} e^{\int \frac{Q dx}{R}} + d \big(y e^{\int \frac{Q dx}{R}}\big) = 0

This can now be integrated directly to get the final equation obtained by Euler. Although straightforward, it is interesting to compare Euler’s approach here with the type of argument usually given in modern treatments of linear first-order differential equations. In modern treatments the differential equation would usually be given as

\frac{d y}{d x} + Py = Q

We would then seek an integrating factor L(x) which would enable us to write

\frac{d}{dx}(y L) = QL

and this could then be integrated directly to find y. The unknown function L is found by first expanding the left-hand side to get

L \frac{d y}{d x} + y \frac{d L}{d x} = QL

Dividing through by L we get

\frac{d y}{d x} + \big(\frac{d L/d x}{L}\big)y = Q

Comparing this with the original equation we see that

\frac{1}{L} \frac{d L}{d x} = P

and so

L = e^{\int P dx}

Putting this into \frac{d}{dx}(y L) = QL and integrating directly we then get the required solution

y e^{\int P dx} = \int Q e^{\int P dx} dx + Const.

which is equivalent to the one obtained by Euler. \square
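As a concrete sanity check of the modern formula (a toy example of my own, with P = 1 and Q = x), the integrating factor is L = e^x and the solution works out to y = x - 1 + C e^{-x}, which the following Python sketch verifies against the differential equation:

```python
import math

# Toy instance of dy/dx + P y = Q with P = 1 and Q = x (my own example).
# Integrating factor: L = e^{int P dx} = e^x, and
# y e^x = int x e^x dx + C = (x - 1) e^x + C, hence y = x - 1 + C e^{-x}.

C = 2.0
y = lambda x: x - 1.0 + C * math.exp(-x)

h = 1e-6
for x in [0.0, 0.5, 1.7, 3.0]:
    dy_dx = (y(x + h) - y(x - h)) / (2 * h)
    # The ODE dy/dx + y = x should hold at every point
    assert abs(dy_dx + y(x) - x) < 1e-6

print("linear first-order solution verified")
```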

In the next problem Euler considers a nonlinear first-order differential equation which today is known as Bernoulli’s equation:

Problem 5.

37. If the proposed differential equation is to be: 

P y^n dx + Q y dx + R dy = 0

where P, Q and R denote any functions whatever of x, find the multiplier which makes it integrable. 

Solution.

Then it will be that M = P y^n + Qy and N = R, and hence

\big(\frac{d M}{d y}\big) = n P y^{n - 1} + Q, and \big(\frac{d N}{d x}\big) = \frac{d R}{d x}

Therefore assuming that the multiplier in question is L and dL = p dx + q dy, based on our previous findings it will be the case that:

 \frac{Rp - P y^n q - Q y q}{L} = n P y^{n - 1} + Q - \frac{d R}{d x}

If it is imagined that L = S y^m, where S is a function only of x, it will be the case that p = \frac{y^m d S}{d x}, and q = m S y^{m - 1}, so that substituting these values will yield:

\frac{R d S}{S d x} - m P y^{n - 1} - m Q = n P y^{n - 1} + Q - \frac{d R}{d x}

In order for this equation to hold one must take m = -n, whereupon it becomes


\frac{R d S}{S d x} = (1 - n) Q - \frac{d R}{d x} , or \frac{d S}{S} = \frac{(1 - n) Q dx}{R} - \frac{d R}{R}

Then since integration will yield S = \frac{1}{R} e^{(1 - n) \int \frac{Q dx}{R}}, the required multiplier, with m = -n, will be

L = \frac{y^{-n}}{R} e^{(1 - n) \int \frac{Q dx}{R}}

and the integral equation will be

\frac{y^{1 - n}}{1 - n} e^{(1 - n) \int \frac{Q dx}{R}} + \int \frac{P dx}{R} e^{(1 - n) \int \frac{Q dx}{R}} =  Const.

Mathematical comments: Again in view of the previous material, Euler’s solution to this problem is largely self-explanatory. To get the final equation we multiply the original differential equation by the multiplier to get

L P y^n dx + L Q y dx + L R dy = 0

or

\frac{P dx}{R} e^{(1 - n) \int \frac{Q dx}{R}} + \frac{Q dx}{R} y^{1 - n} e^{(1 - n) \int \frac{Q dx}{R}} + y^{-n} e^{(1 - n) \int \frac{Q dx}{R}} dy = 0

or

\frac{P dx}{R} e^{(1 - n) \int \frac{Q dx}{R}} + d \big(\frac{y^{1 - n}}{1 - n} e^{(1 - n) \int \frac{Q dx}{R}}\big) = 0

and this last equation can now be integrated directly. As with the previous problem, it is interesting to compare Euler’s approach to solving this problem with the way it would be done in modern treatments. Prior to Euler, the equation in this problem was also studied by two Bernoulli brothers, James Bernoulli (1654-1705) and John Bernoulli (1667-1748), as well as by Gottfried Wilhelm Leibniz (1646-1716). Ironically, the modern approach is essentially the one developed by John Bernoulli before Euler tackled it (John Bernoulli was Euler’s tutor). In modern times the equation would usually be written as

\frac{d y}{d x} + P y = Q y^n

and John Bernoulli’s approach was to make the change of variable z = y^{1 - n}. Then

\frac{d z}{d x} = \frac{1 - n}{y^n} \frac{d y}{d x}

and we can rewrite the original equation using this new variable as

\frac{d z}{d x} + (1 - n) P z = (1 - n) Q

This is now a first-order linear equation of the type solved in the previous problem. Following the approach discussed there, we seek a multiplier L which enables us to write the equation as

\frac{d}{dx} (z L) = (1 - n) Q L

Expanding the left-hand side and comparing with the original equation we find that

\frac{1}{L} \frac{d L}{d x} = (1 - n) P

so the multiplier is

L = e^{(1 - n) \int P dx}

Then we can write

\frac{d}{dx}\big(z e^{(1 - n) \int P dx}\big) = (1 - n) Q e^{(1 - n) \int P dx}

This can be integrated directly to get

z e^{(1 - n) \int P dx} = (1 - n) \int Q e^{(1 - n) \int P dx} dx + Const.

or

y^{1-n} e^{(1 - n) \int P dx} = (1 - n) \int Q e^{(1 - n) \int P dx} dx + Const.

which is equivalent to Euler’s solution. \square
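Again as a check (an illustrative sketch of my own, with n = 2, P = 1 and Q = x chosen arbitrarily), sympy confirms that undoing the substitution z = y^{1 - n} in the solution of the linear equation produces a solution of Bernoulli's equation:

```python
import sympy as sp

x, C = sp.symbols('x C')
n, P, Q = 2, sp.Integer(1), x  # illustrative choices

# Solve the transformed linear equation z' + (1-n)P z = (1-n)Q
# using the multiplier e^{(1-n)∫P dx}
L = sp.exp((1 - n)*sp.integrate(P, x))
z = ((1 - n)*sp.integrate(Q*L, x) + C) / L

# Undo the change of variable z = y^{1-n}
y = z**sp.Rational(1, 1 - n)

# y should satisfy the original equation y' + P y = Q y^n
print(sp.simplify(sp.diff(y, x) + P*y - Q*y**n))  # 0
```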

The final extract from Euler’s paper that I want to include here is a problem dealing with a nonlinear differential equation now known as Riccati’s equation. The equation was apparently first referred to as `Riccati’s equation’ in 1770 (only a few years after Euler wrote this paper) by the mathematician Jean-Baptiste le Rond d’Alembert (1717-1783):

Problem 9. 

56. For this proposed differential equation:

dy + P y dx + Q y y dx + R dx = 0

where P, Q and R are functions only of x, if it is determined that this equation satisfies y = v where v is a function of x, find the multipliers which make this equation integrable. 

Solution.

Since this equation is satisfied by the value y = v, it will be the case that

dv + P v dx + Q v v dx + R dx = 0;

if one then puts y = v + \frac{1}{z}, one will obtain

-\frac{d z}{zz} + \frac{P d x}{z} + \frac{2Qv d x}{z} + \frac{Q d x}{zz} = 0

or:

dz - (P + 2Qv)z dx - Qdx = 0

which is rendered integrable by means of the multiplier

e^{-\int(P + 2Qv)dx}.

This multiplier will thus fit the proposed equation when it itself is multiplied by zz. When therefore it is the case that z = \frac{1}{y - v}, the multiplier that will render the proposed equation integrable will be

\frac{1}{(y - v)^2} \ e^{-\int(P + 2Qv)dx}

For the sake of brevity let e^{-\int(P + 2Qv)dx} = S. Since the integral of the equation dz - (P + 2Qv)z dx - Qdx = 0 is

Sz - \int QS dx = Const.

all the required multipliers will be encompassed by this form:

\frac{S}{(y - v)^2} funct. \big(\frac{S}{y - v} - \int QS dx \big)

where by hypothesis v is a known function of x, and also S = e^{-\int(P + 2Qv)dx}

Mathematical comments: In this problem Euler has developed a method for solving Riccati’s equation when a particular integral is known. The basic idea is that if the particular solution v(x) is known, then the change of variables y = v + \frac{1}{z} gives a simple linear first-order equation for z of the form dz - (P + 2Qv)z dx - Qdx = 0. This is because substituting y = v + \frac{1}{z} into Riccati’s equation we get

dy + Py dx + Qyy dx + Rdx =

\big(dv - \frac{1}{zz} dz\big) + \big(Pv dx + \frac{P dx}{z}\big) + \big(Qvv dx + \frac{2Qvdx}{z} + \frac{Qdx}{zz}\big) + Rdx =0

which reduces to

-\frac{d z}{zz} + \frac{P d x}{z} + \frac{2Qv d x}{z} + \frac{Q d x}{zz} = 0

\iff

dz - (P + 2Qv)z dx - Qdx = 0

The integrating factor which makes this simple linear first-order differential equation integrable can be found straightforwardly using the method of Problem 4, and is e^{-\int(P + 2Qv)dx}. Then the integrating factor that makes the equation

-\frac{d z}{zz} + \frac{P d x}{z} + \frac{2Qv d x}{z} + \frac{Q d x}{zz} = 0

integrable will be

zz e^{-\int(P + 2Qv)dx} = \frac{1}{(y - v)^2} \ e^{-\int(P + 2Qv)dx}

To clarify the last part of Euler’s solution, note that if we multiply through the equation dz - (P + 2Qv)z dx - Qdx = 0 by S = e^{-\int(P + 2Qv)dx} we get

Sdz - S(P + 2Qv)z dx - SQdx = 0

\iff

d(Sz) - QSdx = 0

This last equation can be integrated directly to give

Sz - \int QS dx = Const.

or

\frac{S}{y - v} - \int QS dx = Const.

Euler makes the point that in general the integrating factor for all cases of Riccati’s equation in which a particular solution v(x) is known will be of the form

\frac{S}{(y - v)^2} f\big(\frac{S}{y - v} - \int QS dx\big)

where f is some unspecified function. This works because the argument of the unspecified function f is an arbitrary constant, so df = 0. Thus, for example, starting from

-\frac{d z}{zz} + \frac{P d x}{z} + \frac{2Qv d x}{z} + \frac{Q d x}{zz} = 0

we can multiply through by -\frac{Sf}{(y - v)^2} instead of just -\frac{S}{(y - v)^2} to get

Sf dz - Sf (P + 2Qv) z dx - SfQ dx = 0

and since df = 0 we can add the term zS df to get

Sf dz - Sf (P + 2Qv) z dx + zSdf - SfQ dx = 0

\iff

d\big(Sfz\big) - QSf dx = 0

and this can be integrated directly to get

Sfz - \int QSf dx = Const. \square
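Euler's reduction can also be checked symbolically for arbitrary P, Q and particular solution v. The sketch below (my own verification aid) substitutes y = v + \frac{1}{z} into Riccati's equation and confirms that multiplying through by -z^2 produces the linear equation dz - (P + 2Qv)z dx - Qdx = 0:

```python
import sympy as sp

x = sp.symbols('x')
P, Q, v, z = (sp.Function(f)(x) for f in ('P', 'Q', 'v', 'z'))

# R is fixed by the requirement that v is a particular solution:
# v' + P v + Q v^2 + R = 0
R = -(sp.diff(v, x) + P*v + Q*v**2)

# Substitute y = v + 1/z into y' + P y + Q y^2 + R
y = v + 1/z
expr = sp.diff(y, x) + P*y + Q*y**2 + R

# Multiplying through by -z^2 should give z' - (P + 2Qv)z - Q
linear = sp.expand(-z**2*expr)
print(sp.simplify(linear - (sp.diff(z, x) - (P + 2*Q*v)*z - Q)))  # 0
```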

Changing variables in the square of the quantum mechanical angular momentum operator

A particle of mass m with position vector \mathbf{r} and velocity \mathbf{v} (with respect to some specified origin) has a linear momentum vector \mathbf{p} = m \mathbf{v} and angular momentum vector

\mathbf{L} = \mathbf{r} \times \mathbf{p} = m \mathbf{r} \times \mathbf{v}

where \times is the vector product operation. The magnitude of the angular momentum vector is L = rmv\sin \theta where \theta is the angle between \mathbf{r} and \mathbf{v}. The direction of \mathbf{L} is given by the right-hand rule when the vectors \mathbf{r} and \mathbf{v} are placed with their tails at the same point: one curls the fingers of the right hand in the direction of rotation of \mathbf{r} into \mathbf{v} and the thumb then points in the direction of \mathbf{L}. One can find the components of \mathbf{L} in the x, y and z directions in Cartesian coordinates using

\mathbf{L} = \mathbf{r} \times \mathbf{p} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \ \\x & y & z \\ \ \\p_x & p_y & p_z \end{vmatrix}

= (yp_z - zp_y)\mathbf{i} + (zp_x - xp_z)\mathbf{j} + (xp_y - yp_x)\mathbf{k}

Therefore the components of \mathbf{L} in Cartesian coordinates are

L_x = yp_z - zp_y

L_y = zp_x - xp_z

L_z = xp_y - yp_x
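These component formulas can be reproduced mechanically with a computer algebra system; this small sympy sketch (my own addition) evaluates the cross product \mathbf{r} \times \mathbf{p} symbolically:

```python
import sympy as sp

x, y, z, px, py, pz = sp.symbols('x y z p_x p_y p_z')
r_vec = sp.Matrix([x, y, z])
p_vec = sp.Matrix([px, py, pz])

# L = r × p, giving the three Cartesian components of angular momentum
L_vec = r_vec.cross(p_vec)
print(L_vec.T)
```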

In classical mechanics the angular momentum magnitude L can take a continuum of values but in quantum mechanics only certain discrete values for L are permissible. Furthermore, the linear momentum vector \mathbf{p} appearing in \mathbf{L} must obey Heisenberg’s uncertainty principle in each direction, i.e.,

\Delta x \Delta p_x \geq \frac{\hbar}{2}

in the x-direction and similarly for the y and z directions. These features of quantized variables like \mathbf{p} and \mathbf{L} make it necessary to do calculations with them in quantum mechanics using their operator representations. (For example, on the quantum scale one cannot calculate the expectation of momentum in the x-direction using an integral involving momentum as a function of x because no such function can exist: Heisenberg’s uncertainty principle prevents accurate knowledge of momentum when the value of x is known exactly. One must therefore use the operator representation of momentum rather than momentum as some function of x in order to calculate the expectation). It is a basic postulate of quantum mechanics that every observable quantity characterising a physical system can be represented by a quantum mechanical operator obtained by expressing the quantity in terms of \mathbf{r} and \mathbf{p} and then replacing the vector \mathbf{p} by - i \hbar \nabla where

\nabla = \mathbf{i} \frac{\partial}{\partial x} + \mathbf{j} \frac{\partial}{\partial y} + \mathbf{k} \frac{\partial}{\partial z}

and its components p_x, p_y and p_z by

p_x = - i \hbar \frac{\partial}{\partial x}

p_y = - i \hbar \frac{\partial}{\partial y}

p_z = - i \hbar \frac{\partial}{\partial z}

Taking this on board we can then write \mathbf{L} and L_x, L_y and L_z in quantum mechanical operator form in Cartesian coordinates as

\mathbf{L} = - i \hbar (\mathbf{r} \times \nabla)

L_x = - i \hbar \big( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \big)

L_y = - i \hbar \big( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \big)

L_z = - i \hbar \big( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \big)

In order to perform some calculations involving Schrödinger’s equation I needed to employ the square of the quantum mechanical angular momentum operator L^2, but in spherical polar coordinates rather than Cartesian coordinates, where

L^2 = L_x^2 + L_y^2 + L_z^2

I used the matrix calculus approach in my previous note to achieve the necessary change of variables in L^2. In the present note I want to record the details of this calculation as I have never seen this approach used elsewhere. (This change of variables can also be done using a scale factor method based on vector calculus which I will not go into here).

As in my previous note we begin the matrix calculus approach with the standard conversion equations for spherical polar coordinates:

x = r \sin \theta \cos \phi

y = r \sin \theta \sin \phi

z = r \cos \theta

Differentiating with respect to the vector (r, \theta, \phi) we get

\begin{bmatrix} \frac{\partial x}{\partial r} & \frac{\partial y}{\partial r} & \frac{\partial z}{\partial r}\\ \ \\ \frac{\partial x}{\partial \theta} & \frac{\partial y}{\partial \theta} & \frac{\partial z}{\partial \theta}\\ \ \\ \frac{\partial x}{\partial \phi} & \frac{\partial y}{\partial \phi} & \frac{\partial z}{\partial \phi}\end{bmatrix} =  \begin{bmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & 0\end{bmatrix}

We can then use the matrix version of the chain rule to write

\begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi}\end{bmatrix} = \begin{bmatrix} \frac{\partial x}{\partial r} & \frac{\partial y}{\partial r} & \frac{\partial z}{\partial r}\\ \ \\ \frac{\partial x}{\partial \theta} & \frac{\partial y}{\partial \theta} & \frac{\partial z}{\partial \theta}\\ \ \\ \frac{\partial x}{\partial \phi} & \frac{\partial y}{\partial \phi} & \frac{\partial z}{\partial \phi}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial x}\\ \ \\ \frac{\partial F}{\partial y} \\ \ \\ \frac{\partial F}{\partial z} \end{bmatrix} = \begin{bmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & 0\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial x}\\ \ \\ \frac{\partial F}{\partial y} \\ \ \\ \frac{\partial F}{\partial z} \end{bmatrix}

We can solve this system by inverting the coefficient matrix to get

\begin{bmatrix} \frac{\partial F}{\partial x}\\ \ \\ \frac{\partial F}{\partial y} \\ \ \\ \frac{\partial F}{\partial z}\end{bmatrix} = \begin{bmatrix} \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\\ \ \\ \sin \theta \sin \phi & \frac{\cos \theta \sin \phi}{r} & \frac{\cos \phi}{r \sin \theta}\\ \ \\ \cos \theta & -\frac{\sin \theta}{r} & 0\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}
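Rather than inverting the coefficient matrix by hand, the inversion can be double-checked with sympy (a verification sketch of my own, not part of the derivation):

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x = r*sp.sin(th)*sp.cos(ph)
y = r*sp.sin(th)*sp.sin(ph)
z = r*sp.cos(th)

# Coefficient matrix of the chain-rule system: rows indexed by (r, θ, φ),
# columns by (x, y, z)
A = sp.Matrix([[sp.diff(c, v) for c in (x, y, z)] for v in (r, th, ph)])

# Its inverse is the matrix appearing in the solved system
Ainv = sp.simplify(A.inv())
print(Ainv)
```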

Using the equations in this system together with the standard conversion equations we then have

y \frac{\partial F}{\partial z} = r \sin \theta \sin \phi \big( \cos \theta \frac{\partial F}{\partial r} - \frac{\sin \theta}{r} \frac{\partial F}{\partial \theta} \big) = r \sin \theta \cos \theta \sin \phi \frac{\partial F}{\partial r} - \sin^2 \theta \sin \phi \frac{\partial F}{\partial \theta}

and

z \frac{\partial F}{\partial y} = r \cos \theta \big( \sin \theta \sin \phi \frac{\partial F}{\partial r} + \frac{\cos \theta \sin \phi}{r} \frac{\partial F}{\partial \theta} + \frac{\cos \phi}{r \sin \theta} \frac{\partial F}{\partial \phi}\big)

= r \cos \theta \sin \theta \sin \phi \frac{\partial F}{\partial r} + \cos^2 \theta \sin \phi \frac{\partial F}{\partial \theta} + \frac{\cos \theta \cos \phi}{\sin \theta}\frac{\partial F}{\partial \phi}

Subtracting the second expression from the first and suppressing the operand F in the partial derivatives, we can then write the angular momentum operator in the x-direction in terms of spherical polar coordinates as

L_x = - i \hbar \big( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \big)

= - i \hbar \big( - \cot \theta \cos \phi \frac{\partial}{\partial \phi} - \sin \phi \frac{\partial}{\partial \theta} \big)

= i \hbar \big(\cot \theta \cos \phi \frac{\partial}{\partial \phi} + \sin \phi \frac{\partial}{\partial \theta} \big)

Similarly we have

z \frac{\partial F}{\partial x} = r \cos \theta \big( \sin \theta \cos \phi \frac{\partial F}{\partial r} + \frac{\cos \theta \cos \phi}{r} \frac{\partial F}{\partial \theta} - \frac{\sin \phi}{r \sin \theta} \frac{\partial F}{\partial \phi}\big)

= r \cos \theta \sin \theta \cos \phi \frac{\partial F}{\partial r} + \cos^2 \theta \cos \phi \frac{\partial F}{\partial \theta} - \frac{\cos \theta \sin \phi}{\sin \theta}\frac{\partial F}{\partial \phi}

and

x \frac{\partial F}{\partial z} = r \sin \theta \cos \phi \big( \cos \theta \frac{\partial F}{\partial r} - \frac{\sin \theta}{r} \frac{\partial F}{\partial \theta} \big) = r \sin \theta \cos \theta \cos \phi \frac{\partial F}{\partial r} - \sin^2 \theta \cos \phi \frac{\partial F}{\partial \theta}

Therefore

L_y = - i \hbar \big( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \big)

= - i \hbar \big(\cos \phi \frac{\partial}{\partial \theta} - \cot \theta \sin \phi \frac{\partial}{\partial \phi} \big)

Finally, we have

x \frac{\partial F}{\partial y} = r \sin \theta \cos \phi \big( \sin \theta \sin \phi \frac{\partial F}{\partial r} + \frac{\cos \theta \sin \phi}{r} \frac{\partial F}{\partial \theta} + \frac{\cos \phi}{r \sin \theta} \frac{\partial F}{\partial \phi}\big)

= r \sin^2 \theta \cos \phi \sin \phi \frac{\partial F}{\partial r} + \sin \theta \cos \theta \sin \phi \cos \phi \frac{\partial F}{\partial \theta} + \cos^2 \phi \frac{\partial F}{\partial \phi}

and

y \frac{\partial F}{\partial x} = r \sin \theta \sin \phi \big( \sin \theta \cos \phi \frac{\partial F}{\partial r} + \frac{\cos \theta \cos \phi}{r} \frac{\partial F}{\partial \theta} - \frac{\sin \phi}{r \sin \theta} \frac{\partial F}{\partial \phi}\big)

= r \sin^2 \theta \cos \phi \sin \phi \frac{\partial F}{\partial r} + \sin \theta \cos \theta \sin \phi \cos \phi \frac{\partial F}{\partial \theta} - \sin^2 \phi \frac{\partial F}{\partial \phi}

Therefore

L_z = - i \hbar \big( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \big)

= - i \hbar \frac{\partial}{\partial \phi}

Having obtained the components of \mathbf{L} in spherical polar coordinates we can now finally calculate the operator representation of L^2 as

L^2 = L_x^2 + L_y^2 + L_z^2

= \big[ i \hbar \big(\cot \theta \cos \phi \frac{\partial}{\partial \phi} + \sin \phi \frac{\partial}{\partial \theta} \big) \big]^2 + \big[ - i \hbar \big(\cos \phi \frac{\partial}{\partial \theta} - \cot \theta \sin \phi \frac{\partial}{\partial \phi} \big) \big]^2 + \big[ - i \hbar \frac{\partial}{\partial \phi} \big]^2

= - \hbar^2 \big[ \sin^2 \phi \frac{\partial^2}{\partial \theta^2} + \big(\sin \phi \frac{\partial}{\partial \theta} \big) \big(\cot \theta \cos \phi \frac{\partial}{\partial \phi} \big) + \big(\cot \theta \cos \phi \frac{\partial}{\partial \phi} \big)\big(\sin \phi \frac{\partial}{\partial \theta} \big) + \cot^2 \theta \cos^2 \phi \frac{\partial^2}{\partial \phi^2} \big]

- \hbar^2 \big[ \cos^2 \phi \frac{\partial^2}{\partial \theta^2} - \big(\cos \phi \frac{\partial}{\partial \theta} \big) \big(\cot \theta \sin \phi \frac{\partial}{\partial \phi} \big) - \big(\cot \theta \sin \phi \frac{\partial}{\partial \phi} \big)\big(\cos \phi \frac{\partial}{\partial \theta} \big) + \cot^2 \theta \sin^2 \phi \frac{\partial^2}{\partial \phi^2} \big]

- \hbar^2 \frac{\partial^2}{\partial \phi^2}

Expanding the operator products (the mixed \frac{\partial^2}{\partial \theta \partial \phi} terms arising from the first and second brackets cancel) we get

= - \hbar^2 \big( \frac{\partial^2}{\partial \theta^2} + \cot^2 \theta \frac{\partial^2}{\partial \phi^2} + \frac{\partial^2}{\partial \phi^2} - \sin \phi \frac{1}{\sin^2 \theta} \cos \phi \frac{\partial}{\partial \phi} + \cot \theta \cos^2 \phi \frac{\partial}{\partial \theta}

+ \cos \phi \frac{1}{\sin^2 \theta} \sin \phi \frac{\partial}{\partial \phi} + \cot \theta \sin^2 \phi \frac{\partial}{\partial \theta}\big)

= - \hbar^2 \big( \frac{\partial^2}{\partial \theta^2} + \cot \theta \frac{\partial}{\partial \theta} + \cot^2 \theta \frac{\partial^2}{\partial \phi^2} + \frac{\partial^2}{\partial \phi^2}\big)

= - \hbar^2 \big( \frac{1}{\sin \theta}\frac{\partial}{\partial \theta} \big( \sin \theta \frac{\partial}{\partial \theta}\big) + \frac{1}{\sin^2 \theta}\frac{\partial^2}{\partial \phi^2} \big)
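As an independent check of the final operator (my own addition, not part of the derivation), one can apply both the Cartesian and the spherical forms of L^2 to a test function, here F = xy = r^2 \sin^2 \theta \sin \phi \cos \phi, and confirm that both return 6 \hbar^2 F (F is a combination of l = 2 spherical harmonics, so the eigenvalue is \hbar^2 l(l+1) = 6\hbar^2):

```python
import sympy as sp

hbar = sp.symbols('hbar', positive=True)
x, y, z = sp.symbols('x y z')
r, th, ph = sp.symbols('r theta phi', positive=True)

# Cartesian operator components
def Lx(f): return -sp.I*hbar*(y*sp.diff(f, z) - z*sp.diff(f, y))
def Ly(f): return -sp.I*hbar*(z*sp.diff(f, x) - x*sp.diff(f, z))
def Lz(f): return -sp.I*hbar*(x*sp.diff(f, y) - y*sp.diff(f, x))
def L2_cart(f): return Lx(Lx(f)) + Ly(Ly(f)) + Lz(Lz(f))

# The spherical form derived above
def L2_sph(f):
    return -hbar**2*(sp.diff(sp.sin(th)*sp.diff(f, th), th)/sp.sin(th)
                     + sp.diff(f, ph, 2)/sp.sin(th)**2)

# Test function F = x*y, which in spherical coordinates is
# r^2 sin^2(theta) sin(phi) cos(phi)
F_cart = x*y
F_sph = r**2*sp.sin(th)**2*sp.sin(ph)*sp.cos(ph)

print(sp.simplify(L2_cart(F_cart) - 6*hbar**2*F_cart))  # 0
print(sp.simplify(L2_sph(F_sph) - 6*hbar**2*F_sph))     # 0
```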

A matrix calculus approach to changing variables in Laplace’s equation

In three-dimensional rectangular coordinates, the partial differential equation known as Laplace’s equation takes the form

\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2} = 0

This equation is applicable to a wide range of problems in physics but it is often necessary to make a change of variables from rectangular to spherical polar coordinates in order to better match the spherical symmetry of particular contexts. Until now I had only ever worked out this change of variables using a scale factor method based on the formula for the Laplacian of a function F in general orthogonal curvilinear coordinates:

\nabla^2 F = \frac{1}{h_1 h_2 h_3}\bigg[ \frac{\partial }{\partial x_1}\big( \frac{h_2 h_3}{h_1}\frac{\partial F}{\partial x_1} \big) + \frac{\partial }{\partial x_2}\big( \frac{h_1 h_3}{h_2}\frac{\partial F}{\partial x_2} \big) + \frac{\partial }{\partial x_3}\big( \frac{h_1 h_2}{h_3}\frac{\partial F}{\partial x_3} \big) \bigg]

This formula is derived using basic results from vector calculus and allows Laplace’s equation to be easily expressed in any orthogonal coordinate system once the form of the Euclidean metric is known in that coordinate system. (Orthogonal coordinate systems are those for which the coordinate basis vectors are always orthogonal at each point). The Euclidean metric in general orthogonal coordinates x_1, x_2, x_3 will take the form

ds^2 = h_1^2 dx_1^2 + h_2^2 dx_2^2 + h_3^2 dx_3^2

The coefficients h_1, h_2 and h_3 are the scale factors appearing in the Laplacian formula. In the metric, these convert the coordinate differentials dx_i into lengths h_i dx_i.

In the case of rectangular coordinates we have x_1 = x, x_2 = y, x_3 = z and h_1 = h_2 = h_3 = 1 so the Euclidean metric and the Laplacian formula reduce to the familiar forms

ds^2 = dx^2 + dy^2 + dz^2

and

\nabla^2 F = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2}

The scale factors are all equal to 1 in this case since the coordinate differentials are already in units of length. In the case of spherical polar coordinates (an orthogonal coordinate system) we can find the scale factors and hence the Laplacian by using the standard conversion equations

x = r \sin \theta \cos \phi

y = r \sin \theta \sin \phi

z = r \cos \theta

Taking the differentials of these, squaring them and adding them we find that the Euclidean metric in spherical polar coordinates takes the form

ds^2 = dr^2 + r^2 d \theta^2 + r^2 \sin^2 \theta d \phi^2

The scale factors in this case are therefore h_1 = 1, h_2 = r and h_3 = r \sin \theta. Putting these into the Laplacian formula we immediately get

\nabla^2 F = \frac{1}{r^2 \sin \theta} \bigg[ \frac{\partial }{\partial r}\big( \frac{r^2 \sin \theta}{1}\frac{\partial F}{\partial r} \big) + \frac{\partial }{\partial \theta}\big( \frac{r \sin \theta}{r}\frac{\partial F}{\partial \theta} \big) + \frac{\partial }{\partial \phi}\big( \frac{r}{r \sin \theta}\frac{\partial F}{\partial \phi} \big) \bigg]

= \frac{1}{r^2} \frac{\partial }{\partial r}\big( r^2 \frac{\partial F}{\partial r}\big) + \frac{1}{r^2 \sin \theta}\frac{\partial }{\partial \theta}\big( \sin \theta \frac{\partial F}{\partial \theta} \big) + \frac{1}{r^2 \sin^2 \theta}\frac{\partial^2 F}{\partial \phi^2}

Therefore Laplace’s equation in spherical polar coordinates is

\frac{1}{r^2} \frac{\partial }{\partial r}\big( r^2 \frac{\partial F}{\partial r}\big) + \frac{1}{r^2 \sin \theta}\frac{\partial }{\partial \theta}\big( \sin \theta \frac{\partial F}{\partial \theta} \big) + \frac{1}{r^2 \sin^2 \theta}\frac{\partial^2 F}{\partial \phi^2} = 0
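It is easy to spot-check this form of Laplace's equation symbolically. The sketch below (my own verification aid) applies the spherical Laplacian to F = xyz = r^3 \sin^2 \theta \cos \theta \sin \phi \cos \phi, which is harmonic since its Cartesian Laplacian is zero, and confirms that the result vanishes:

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

def laplacian_spherical(F):
    # The Laplacian formula derived above via scale factors
    return (sp.diff(r**2*sp.diff(F, r), r)/r**2
            + sp.diff(sp.sin(th)*sp.diff(F, th), th)/(r**2*sp.sin(th))
            + sp.diff(F, ph, 2)/(r**2*sp.sin(th)**2))

# F = x*y*z expressed in spherical polar coordinates
F = r**3*sp.sin(th)**2*sp.cos(th)*sp.sin(ph)*sp.cos(ph)

result = sp.simplify(laplacian_spherical(F))
print(result)  # 0
```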

I recently had occasion to use Laplace’s equation in spherical polar coordinates again and this time I decided to have a go at justifying the change of variables to myself using a matrix calculus approach instead of the above scale factor method. This turned out to be a laborious but interesting process. In the present note I want to record my calculations using the matrix calculus approach in detail as I have never seen this approach used elsewhere. The advantage of the matrix calculus approach is that it works for any coordinate system irrespective of whether it is orthogonal or not.

In the matrix calculus approach we use the standard conversion equations for spherical polar coordinates

x = r \sin \theta \cos \phi

y = r \sin \theta \sin \phi

z = r \cos \theta

to work out the coefficient matrix in the system

\begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi}\end{bmatrix} = \begin{bmatrix} \frac{\partial x}{\partial r} & \frac{\partial y}{\partial r} & \frac{\partial z}{\partial r}\\ \ \\ \frac{\partial x}{\partial \theta} & \frac{\partial y}{\partial \theta} & \frac{\partial z}{\partial \theta}\\ \ \\ \frac{\partial x}{\partial \phi} & \frac{\partial y}{\partial \phi} & \frac{\partial z}{\partial \phi}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial x}\\ \ \\ \frac{\partial F}{\partial y} \\ \ \\ \frac{\partial F}{\partial z} \end{bmatrix}

\big(which of course is equivalent to the set of equations

\frac{\partial F}{\partial r} = \frac{\partial F}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial r} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial r}

\frac{\partial F}{\partial \theta} = \frac{\partial F}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial \theta}

\frac{\partial F}{\partial \phi} = \frac{\partial F}{\partial x} \frac{\partial x}{\partial \phi} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial \phi} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial \phi} \big)

We get the system

\begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi}\end{bmatrix} = \begin{bmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & 0\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial x}\\ \ \\ \frac{\partial F}{\partial y} \\ \ \\ \frac{\partial F}{\partial z} \end{bmatrix}

We can solve this system either by inverting the coefficient matrix or by using Cramer’s rule. Using Cramer’s rule we get

\frac{\partial F}{\partial x} = \frac{\begin{vmatrix} \frac{\partial F}{\partial r} & \sin \theta \sin \phi & \cos \theta\\ \ \\ \frac{\partial F}{\partial \theta} & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ \frac{\partial F}{\partial \phi} & r \sin \theta \cos \phi & 0\end{vmatrix}}{\begin{vmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & 0\end{vmatrix}} = \frac{r^2 \sin^2 \theta \cos \phi \frac{\partial F}{\partial r} + r \cos \theta \sin \theta \cos \phi \frac{\partial F}{\partial \theta} - r \sin \phi \frac{\partial F}{\partial \phi} }{r^2 \sin \theta}
= \sin \theta \cos \phi \frac{\partial F}{\partial r} + \frac{\cos \theta \cos \phi}{r} \frac{\partial F}{\partial \theta} - \frac{\sin \phi}{r \sin \theta} \frac{\partial F}{\partial \phi}

and similarly

\frac{\partial F}{\partial y} = \frac{\begin{vmatrix} \sin \theta \cos \phi & \frac{\partial F}{\partial r} & \cos \theta\\ \ \\ r \cos \theta \cos \phi & \frac{\partial F}{\partial \theta} & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & \frac{\partial F}{\partial \phi} & 0\end{vmatrix}}{\begin{vmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & 0\end{vmatrix}} = \frac{r^2 \sin^2 \theta \sin \phi \frac{\partial F}{\partial r} + r \cos \theta \sin \theta \sin \phi \frac{\partial F}{\partial \theta} + r \cos \phi \frac{\partial F}{\partial \phi} }{r^2 \sin \theta}
= \sin \theta \sin \phi \frac{\partial F}{\partial r} + \frac{\cos \theta \sin \phi}{r} \frac{\partial F}{\partial \theta} + \frac{\cos \phi}{r \sin \theta} \frac{\partial F}{\partial \phi}

and

\frac{\partial F}{\partial z} = \frac{\begin{vmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \frac{\partial F}{\partial r}\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & \frac{\partial F}{\partial \theta}\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & \frac{\partial F}{\partial \phi}\end{vmatrix}}{\begin{vmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta\\ \ \\ r \cos \theta \cos \phi & r \cos \theta \sin \phi & - r \sin \theta\\ \ \\ - r \sin \theta \sin \phi & r \sin \theta \cos \phi & 0\end{vmatrix}} = \frac{r^2 \cos \theta \sin \theta \frac{\partial F}{\partial r} - r \sin^2 \theta \frac{\partial F}{\partial \theta}}{r^2 \sin \theta}
= \cos \theta \frac{\partial F}{\partial r} - \frac{\sin \theta}{r} \frac{\partial F}{\partial \theta}

We can write these solutions more concisely in matrix form as

\begin{bmatrix} \frac{\partial F}{\partial x}\\ \ \\ \frac{\partial F}{\partial y} \\ \ \\ \frac{\partial F}{\partial z}\end{bmatrix} = \begin{bmatrix} \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\\ \ \\ \sin \theta \sin \phi & \frac{\cos \theta \sin \phi}{r} & \frac{\cos \phi}{r \sin \theta}\\ \ \\ \cos \theta & -\frac{\sin \theta}{r} & 0\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

The coefficient matrix here is, of course, the inverse of the coefficient matrix in the original system.

To find the second-order partials \frac{\partial^2F}{\partial x^2}, \frac{\partial^2F}{\partial y^2} and \frac{\partial^2F}{\partial z^2}, let G \equiv \frac{\partial F}{\partial x}, H \equiv \frac{\partial F}{\partial y} and I \equiv \frac{\partial F}{\partial z}. Then we need to find \frac{\partial G}{\partial x}, \frac{\partial H}{\partial y} and \frac{\partial I}{\partial z}. We can use the first, second and third equations respectively in the above matrix system to find these, with F replaced by G, H and I respectively. Thus,

\frac{\partial G}{\partial x} =  \begin{bmatrix} \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial G}{\partial r}\\ \ \\ \frac{\partial G}{\partial \theta} \\ \ \\ \frac{\partial G}{\partial \phi} \end{bmatrix}

\frac{\partial H}{\partial y} =  \begin{bmatrix} \sin \theta \sin \phi & \frac{\cos \theta \sin \phi}{r} & \frac{\cos \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial H}{\partial r}\\ \ \\ \frac{\partial H}{\partial \theta} \\ \ \\ \frac{\partial H}{\partial \phi} \end{bmatrix}

\frac{\partial I}{\partial z} =  \begin{bmatrix} \cos \theta & -\frac{\sin \theta}{r} \end{bmatrix} \begin{bmatrix} \frac{\partial I}{\partial r}\\ \ \\ \frac{\partial I}{\partial \theta} \end{bmatrix}

But differentiating

G \equiv \begin{bmatrix} \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

with respect to r, \theta and \phi we get

\begin{bmatrix} \frac{\partial G}{\partial r}\\ \ \\ \frac{\partial G}{\partial \theta} \\ \ \\ \frac{\partial G}{\partial \phi}\end{bmatrix} = \begin{bmatrix} \frac{\partial^2 F}{\partial r^2} & \frac{\partial^2 F}{\partial r \partial \theta} & \frac{\partial^2 F}{\partial r \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \theta \partial r} & \frac{\partial^2 F}{\partial \theta^2} & \frac{\partial^2 F}{\partial \theta \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \phi \partial r} & \frac{\partial^2 F}{\partial \phi \partial \theta} & \frac{\partial^2 F}{\partial \phi^2}\end{bmatrix} \begin{bmatrix} \sin \theta \cos \phi\\ \ \\ \frac{\cos \theta \cos \phi}{r} \\ \ \\ -\frac{\sin \phi}{r \sin \theta} \end{bmatrix}  + \begin{bmatrix} 0 & -\frac{\cos \theta \cos \phi}{r^2} & \frac{\sin \phi}{r^2 \sin \theta}\\ \ \\ \cos \theta \cos \phi & -\frac{\sin \theta \cos \phi}{r} & \frac{\sin \phi \cos \theta}{r \sin^2 \theta}\\ \ \\ -\sin \theta \sin \phi & -\frac{\cos \theta \sin \phi}{r} & -\frac{\cos \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

Similarly, differentiating

H \equiv \begin{bmatrix} \sin \theta \sin \phi & \frac{\cos \theta \sin \phi}{r} & \frac{\cos \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

with respect to r, \theta and \phi we get

\begin{bmatrix} \frac{\partial H}{\partial r}\\ \ \\ \frac{\partial H}{\partial \theta} \\ \ \\ \frac{\partial H}{\partial \phi}\end{bmatrix} = \begin{bmatrix} \frac{\partial^2 F}{\partial r^2} & \frac{\partial^2 F}{\partial r \partial \theta} & \frac{\partial^2 F}{\partial r \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \theta \partial r} & \frac{\partial^2 F}{\partial \theta^2} & \frac{\partial^2 F}{\partial \theta \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \phi \partial r} & \frac{\partial^2 F}{\partial \phi \partial \theta} & \frac{\partial^2 F}{\partial \phi^2}\end{bmatrix} \begin{bmatrix} \sin \theta \sin \phi\\ \ \\ \frac{\cos \theta \sin \phi}{r} \\ \ \\ \frac{\cos \phi}{r \sin \theta} \end{bmatrix}  + \begin{bmatrix} 0 & -\frac{\cos \theta \sin \phi}{r^2} & -\frac{\cos \phi}{r^2 \sin \theta}\\ \ \\ \cos \theta \sin \phi & -\frac{\sin \theta \sin \phi}{r} & -\frac{\cos \phi \cos \theta}{r \sin^2 \theta}\\ \ \\ \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

and differentiating

I \equiv \begin{bmatrix} \cos \theta & -\frac{\sin \theta}{r}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \end{bmatrix}

with respect to r and \theta we get

\begin{bmatrix} \frac{\partial I}{\partial r}\\ \ \\ \frac{\partial I}{\partial \theta} \end{bmatrix} = \begin{bmatrix} \frac{\partial^2 F}{\partial r^2} & \frac{\partial^2 F}{\partial r \partial \theta}\\ \ \\ \frac{\partial^2 F}{\partial \theta \partial r} & \frac{\partial^2 F}{\partial \theta^2} \end{bmatrix} \begin{bmatrix} \cos \theta \\ \ \\ -\frac{\sin \theta}{r} \end{bmatrix}  + \begin{bmatrix} 0 & \frac{\sin \theta}{r^2}\\ \ \\ -\sin \theta & -\frac{\cos \theta}{r}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \end{bmatrix}
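Each entry of these extra matrices is just a partial derivative of one of the coefficient functions with respect to r, \theta or \phi, which makes them easy to get wrong by hand but also easy to verify symbolically. The following Python snippet (a verification aid using the SymPy library, not part of the derivation itself) reproduces the matrix appearing in the expression for the derivatives of G:

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

# Coefficient vector of d/dx in spherical polars (the row vector defining G)
a = sp.Matrix([sp.sin(th) * sp.cos(ph),
               sp.cos(th) * sp.cos(ph) / r,
               -sp.sin(ph) / (r * sp.sin(th))])

# M[i, j] = derivative of the j-th coefficient with respect to the i-th
# variable, i.e. exactly the matrix multiplying the gradient of F above
M = sp.simplify(a.jacobian(sp.Matrix([r, th, ph])).T)
sp.pprint(M)
```

The same check works for the matrices in the expressions for the derivatives of H and I by swapping in the corresponding coefficient vectors.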

Therefore we have

\frac{\partial G}{\partial x} =  \begin{bmatrix} \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\end{bmatrix}\begin{bmatrix} \frac{\partial^2 F}{\partial r^2} & \frac{\partial^2 F}{\partial r \partial \theta} & \frac{\partial^2 F}{\partial r \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \theta \partial r} & \frac{\partial^2 F}{\partial \theta^2} & \frac{\partial^2 F}{\partial \theta \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \phi \partial r} & \frac{\partial^2 F}{\partial \phi \partial \theta} & \frac{\partial^2 F}{\partial \phi^2}\end{bmatrix} \begin{bmatrix} \sin \theta \cos \phi\\ \ \\ \frac{\cos \theta \cos \phi}{r} \\ \ \\ -\frac{\sin \phi}{r \sin \theta} \end{bmatrix}

+ \begin{bmatrix} \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\end{bmatrix}\begin{bmatrix} 0 & -\frac{\cos \theta \cos \phi}{r^2} & \frac{\sin \phi}{r^2 \sin \theta}\\ \ \\ \cos \theta \cos \phi & -\frac{\sin \theta \cos \phi}{r} & \frac{\sin \phi \cos \theta}{r \sin^2 \theta}\\ \ \\ -\sin \theta \sin \phi & -\frac{\cos \theta \sin \phi}{r} & -\frac{\cos \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

and similarly

\frac{\partial H}{\partial y} =  \begin{bmatrix} \sin \theta \sin \phi & \frac{\cos \theta \sin \phi}{r} & \frac{\cos \phi}{r \sin \theta}\end{bmatrix}\begin{bmatrix} \frac{\partial^2 F}{\partial r^2} & \frac{\partial^2 F}{\partial r \partial \theta} & \frac{\partial^2 F}{\partial r \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \theta \partial r} & \frac{\partial^2 F}{\partial \theta^2} & \frac{\partial^2 F}{\partial \theta \partial \phi}\\ \ \\ \frac{\partial^2 F}{\partial \phi \partial r} & \frac{\partial^2 F}{\partial \phi \partial \theta} & \frac{\partial^2 F}{\partial \phi^2}\end{bmatrix} \begin{bmatrix} \sin \theta \sin \phi\\ \ \\ \frac{\cos \theta \sin \phi}{r} \\ \ \\ \frac{\cos \phi}{r \sin \theta} \end{bmatrix}

+ \begin{bmatrix} \sin \theta \sin \phi & \frac{\cos \theta \sin \phi}{r} & \frac{\cos \phi}{r \sin \theta}\end{bmatrix}\begin{bmatrix} 0 & -\frac{\cos \theta \sin \phi}{r^2} & -\frac{\cos \phi}{r^2 \sin \theta}\\ \ \\ \cos \theta \sin \phi & -\frac{\sin \theta \sin \phi}{r} & -\frac{\cos \phi \cos \theta}{r \sin^2 \theta}\\ \ \\ \sin \theta \cos \phi & \frac{\cos \theta \cos \phi}{r} & -\frac{\sin \phi}{r \sin \theta}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \\ \ \\ \frac{\partial F}{\partial \phi} \end{bmatrix}

and

\frac{\partial I}{\partial z} =  \begin{bmatrix} \cos \theta & -\frac{\sin \theta}{r} \end{bmatrix}\begin{bmatrix} \frac{\partial^2 F}{\partial r^2} & \frac{\partial^2 F}{\partial r \partial \theta}\\ \ \\ \frac{\partial^2 F}{\partial \theta \partial r} & \frac{\partial^2 F}{\partial \theta^2} \end{bmatrix} \begin{bmatrix} \cos \theta \\ \ \\ -\frac{\sin \theta}{r} \end{bmatrix}  + \begin{bmatrix} \cos \theta & -\frac{\sin \theta}{r} \end{bmatrix}\begin{bmatrix} 0 & \frac{\sin \theta}{r^2}\\ \ \\ -\sin \theta & -\frac{\cos \theta}{r}\end{bmatrix} \begin{bmatrix} \frac{\partial F}{\partial r}\\ \ \\ \frac{\partial F}{\partial \theta} \end{bmatrix}

Laplace’s equation is given by \frac{\partial G}{\partial x} + \frac{\partial H}{\partial y} + \frac{\partial I}{\partial z} = 0 so we need to work out the three matrix expressions above and sum them. The result will be a sum involving the nine partials \frac{\partial^2 F}{\partial r^2}, \frac{\partial^2 F}{\partial \theta^2}, \frac{\partial^2 F}{\partial \phi^2}, \frac{\partial^2 F}{\partial r \partial \theta}, \frac{\partial^2 F}{\partial r \partial \phi}, \frac{\partial^2 F}{\partial \theta \partial \phi}, \frac{\partial F}{\partial r}, \frac{\partial F}{\partial \theta} and \frac{\partial F}{\partial \phi}, where the mixed second-order partials are counted once each since they are equal by symmetry. I found that the most convenient way to work out the sum was by working out the coefficients of each of these nine partials individually in turn. The result is
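This bookkeeping can also be delegated to a computer algebra system. The following SymPy sketch (my own verification aid, not part of the hand working) builds each Cartesian derivative as a first-order operator in r, \theta and \phi, applies it twice, and confirms that the sum collapses to the spherical polar Laplacian:

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
F = sp.Function('F')(r, th, ph)

# Coefficient row vectors of d/dx, d/dy, d/dz in spherical polar coordinates
ax = [sp.sin(th) * sp.cos(ph), sp.cos(th) * sp.cos(ph) / r, -sp.sin(ph) / (r * sp.sin(th))]
ay = [sp.sin(th) * sp.sin(ph), sp.cos(th) * sp.sin(ph) / r,  sp.cos(ph) / (r * sp.sin(th))]
az = [sp.cos(th), -sp.sin(th) / r, 0]

def d_cart(a, f):
    """Apply a Cartesian derivative to f via the chain rule."""
    return a[0] * sp.diff(f, r) + a[1] * sp.diff(f, th) + a[2] * sp.diff(f, ph)

# Second Cartesian derivatives: apply each operator twice and sum
laplacian = sum(d_cart(a, d_cart(a, F)) for a in (ax, ay, az))

# The spherical polar form of the Laplacian
target = (sp.diff(r**2 * sp.diff(F, r), r) / r**2
          + sp.diff(sp.sin(th) * sp.diff(F, th), th) / (r**2 * sp.sin(th))
          + sp.diff(F, ph, 2) / (r**2 * sp.sin(th)**2))

residual = sp.simplify(laplacian - target)
print(residual)  # a residual of 0 confirms the computation
```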

\frac{\partial G}{\partial x} + \frac{\partial H}{\partial y} + \frac{\partial I}{\partial z} =

\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2} =

\big \{\sin^2 \theta \cos^2 \phi + \sin^2 \theta \sin^2 \phi + \cos^2 \theta \big \} \frac{\partial^2 F}{\partial r^2}

+ \big \{ \frac{\cos^2 \theta \cos^2 \phi}{r^2} + \frac{\cos^2 \theta \sin^2 \phi}{r^2} + \frac{\sin^2 \theta}{r^2} \big \} \frac{\partial^2 F}{\partial \theta^2}

+ \big \{ \frac{\sin^2 \phi}{r^2 \sin^2 \theta} + \frac{\cos^2 \phi}{r^2 \sin^2 \theta}\big \} \frac{\partial^2 F}{\partial \phi^2}

+ \big \{ \frac{2\sin \theta \cos \theta \cos^2 \phi}{r} + \frac{2\sin \theta \cos \theta \sin^2 \phi}{r} - \frac{2\sin \theta \cos \theta}{r}\big \} \frac{\partial^2 F}{\partial r \partial \theta}

+ \big \{ -\frac{2\cos \phi \sin \phi}{r} + \frac{2\cos \phi \sin \phi}{r} \big \} \frac{\partial^2 F}{\partial r \partial \phi}

+ \big \{ -\frac{\cos \theta \cos \phi \sin \phi}{r^2 \sin \theta} -\frac{\cos \theta \cos \phi \sin \phi}{r^2 \sin \theta} + \frac{\cos \theta \cos \phi \sin \phi}{r^2 \sin \theta} + \frac{\cos \theta \cos \phi \sin \phi}{r^2 \sin \theta} \big \} \frac{\partial^2 F}{\partial \theta \partial \phi}

+ \big \{ \frac{\cos^2 \theta \cos^2 \phi}{r} + \frac{\sin^2 \phi}{r} + \frac{\cos^2 \theta \sin^2 \phi}{r} + \frac{\cos^2 \phi}{r} + \frac{\sin^2 \theta}{r}\big \} \frac{\partial F}{\partial r}

+ \big \{ -\frac{2\sin \theta \cos \theta \cos^2 \phi}{r^2} + \frac{\cos \theta \sin^2 \phi}{r^2 \sin \theta} -\frac{2\sin \theta \cos \theta \sin^2 \phi}{r^2} + \frac{\cos \theta \cos^2 \phi}{r^2 \sin \theta} + \frac{2\cos \theta \sin \theta}{r^2} \big \} \frac{\partial F}{\partial \theta}

+ \big \{ \frac{\cos \phi \sin \phi}{r^2} + \frac{\cos^2 \theta \cos \phi \sin \phi}{r^2 \sin^2 \theta} + \frac{\cos \phi \sin \phi}{r^2 \sin^2 \theta} - \frac{\cos \phi \sin \phi}{r^2} - \frac{\cos^2 \theta \cos \phi \sin \phi}{r^2 \sin^2 \theta} - \frac{\cos \phi \sin \phi}{r^2 \sin^2 \theta} \big \} \frac{\partial F}{\partial \phi}

which simplifies to

\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2} =

\frac{\partial^2 F}{\partial r^2} + \frac{1}{r^2} \frac{\partial^2 F}{\partial \theta^2} + \frac{1}{r^2 \sin^2 \theta} \frac{\partial^2 F}{\partial \phi^2} + \frac{2}{r} \frac{\partial F}{\partial r} + \frac{\cos \theta}{r^2 \sin \theta} \frac{\partial F}{\partial \theta}

= \frac{1}{r^2} \frac{\partial }{\partial r}\big( r^2 \frac{\partial F}{\partial r}\big) + \frac{1}{r^2 \sin \theta}\frac{\partial }{\partial \theta}\big( \sin \theta \frac{\partial F}{\partial \theta} \big) + \frac{1}{r^2 \sin^2 \theta}\frac{\partial^2 F}{\partial \phi^2}

Therefore Laplace’s equation in spherical polar coordinates is

\frac{1}{r^2} \frac{\partial }{\partial r}\big( r^2 \frac{\partial F}{\partial r}\big) + \frac{1}{r^2 \sin \theta}\frac{\partial }{\partial \theta}\big( \sin \theta \frac{\partial F}{\partial \theta} \big) + \frac{1}{r^2 \sin^2 \theta}\frac{\partial^2 F}{\partial \phi^2} = 0
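As an independent sanity check, one can pick a concrete function, compute its Laplacian in Cartesian coordinates, and confirm that the spherical polar formula gives the same answer. Here is a short SymPy snippet doing this (illustrative only; the test function is chosen arbitrarily):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
r, th, ph = sp.symbols('r theta phi', positive=True)

# Arbitrary smooth test function and its Cartesian Laplacian
F_cart = x**2 * y * z + y**3
lap_cart = sum(sp.diff(F_cart, v, 2) for v in (x, y, z))

# Rewrite everything in spherical polar coordinates
sub = {x: r * sp.sin(th) * sp.cos(ph),
       y: r * sp.sin(th) * sp.sin(ph),
       z: r * sp.cos(th)}
F_sph = F_cart.subs(sub)

# The spherical polar Laplacian derived above, applied to the test function
lap_sph = (sp.diff(r**2 * sp.diff(F_sph, r), r) / r**2
           + sp.diff(sp.sin(th) * sp.diff(F_sph, th), th) / (r**2 * sp.sin(th))
           + sp.diff(F_sph, ph, 2) / (r**2 * sp.sin(th)**2))

residual = sp.simplify(lap_sph - lap_cart.subs(sub))
print(residual)  # a residual of 0 means both forms agree on this function
```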

as before.