Notes on Sturm-Liouville theory

Sturm-Liouville theory was developed in the 19th century in the context of solving differential equations. When one studies it in depth, however, one suddenly realises that it is the mathematics underlying a lot of quantum mechanics. In quantum mechanics we envisage a quantum state (a time-dependent function) expressed as a superposition of eigenfunctions of a self-adjoint operator (usually referred to as a Hermitian operator) representing an observable. The coefficients of the eigenfunctions in this superposition are probability amplitudes. A measurement of the observable quantity represented by the Hermitian operator produces one of the eigenvalues of the operator, with a probability equal to the squared modulus of the probability amplitude attached to the corresponding eigenfunction in the superposition. It is the fact that the operator is self-adjoint that ensures the eigenvalues are real (and thus observable) and, furthermore, that the eigenfunctions form a complete and orthogonal set of functions, enabling quantum states to be represented as a superposition in the first place (i.e., an eigenfunction expansion akin to a Fourier series).

The Sturm-Liouville theory of the 19th century has essentially this same structure. Sturm-Liouville eigenvalue problems are in fact important more generally in mathematical physics, precisely because they frequently arise in attempting to solve commonly encountered partial differential equations (e.g., Poisson’s equation, the diffusion equation, the wave equation, etc.), particularly when the method of separation of variables is employed.

I want to get an overview of Sturm-Liouville theory in the present note and will begin by considering a nice discussion of a vibrating string problem in Courant & Hilbert’s classic text, Methods of Mathematical Physics (Volume I). Although the problem is simple and the treatment in Courant & Hilbert a bit terse, it (remarkably) brings up a lot of the key features of Sturm-Liouville theory which apply more generally in a wide variety of physics problems. I will then consider Sturm-Liouville theory in a more general setting emphasising the role of the Sturm-Liouville differential operator, and finally I will illustrate further the occurrence of Sturm-Liouville systems in physics by looking at the eigenvalue problems encountered when solving Schrödinger’s equation for the hydrogen atom.

On page 287 of Volume I, Courant & Hilbert write the following:

Equation (12) here is the one-dimensional wave equation

$\frac{\partial^2 u}{\partial x^2} = \mu^2 \frac{\partial^2 u}{\partial t^2}$

which (as usual) the authors are going to solve by using a separation of variables of the form

$u(x, t) = v(x) g(t)$

As Courant & Hilbert explain, the problem then involves finding the function $v(x)$ by solving the second-order homogeneous linear differential equation (their equation (13))

$\frac{\mathrm{d}^2 v}{\mathrm{d} x^2} + \lambda v = 0$

subject to the boundary conditions (their equation (13a))

$v(0) = v(\pi) = 0$

Although not explicitly mentioned by Courant & Hilbert at this stage, equations (13) and (13a) in fact constitute a full-blown Sturm-Liouville eigenvalue problem. Despite being very simple, this setup captures many of the typical features encountered in a wide variety of such problems in physics. It is instructive to explore the text underneath equation (13a):

Not all these requirements can be fulfilled for arbitrary values of the constant $\lambda$.

… the boundary conditions can be fulfilled if and only if $\lambda = n^2$ is the square of an integer $n$.

To clarify this, we can try to solve (13) and (13a) for the three possible cases: $\lambda < 0$, $\lambda = 0$ and $\lambda > 0$. Suppose first that $\lambda < 0$. Then $-\lambda > 0$ and the auxiliary equation for (13) is

$D^2 = - \lambda$

$\implies$

$D = \pm \sqrt{- \lambda}$

Thus, we can write the general solution of (13) in this case as

$v = \alpha e^{\sqrt{-\lambda} x} + \beta e^{-\sqrt{-\lambda} x} = A \mathrm{cosh} \big(\sqrt{-\lambda} x\big) + B \mathrm{sinh} \big(\sqrt{-\lambda} x\big)$

where $A$ and $B$ are constants to be determined from the boundary conditions. From the boundary condition $v(0) = 0$ we conclude that $A = 0$ so the equation reduces to

$v = B \mathrm{sinh} \big(\sqrt{-\lambda} x\big)$

But from the boundary condition $v(\pi) = 0$ we are forced to conclude that $B = 0$ since $\mathrm{sinh} \big(\sqrt{-\lambda} \pi\big) \neq 0$. Therefore there is only the trivial solution $v(x) = 0$ in the case $\lambda < 0$.

Next, suppose that $\lambda = 0$. Then equation (13) reduces to

$\frac{\mathrm{d}^2 v}{\mathrm{d} x^2} = 0$

$\implies$

$v = A + Bx$

From the boundary condition $v(0) = 0$ we must conclude that $A = 0$, and the boundary condition $v(\pi) = 0$ means we are also forced to conclude that $B = 0$. Thus, again, there is only the trivial solution $v(x) = 0$ in the case $\lambda = 0$.

We see that nontrivial solutions can only be obtained when $\lambda > 0$. In this case we have $-\lambda < 0$ and the auxiliary equation is

$D^2 = - \lambda$

$\implies$

$D = \pm i \sqrt{\lambda}$

Thus, we can write the general solution of (13) in this case as

$v = \alpha e^{i \sqrt{\lambda} x} + \beta e^{- i \sqrt{\lambda} x} = A \mathrm{cos} \big(\sqrt{\lambda} x\big) + B \mathrm{sin} \big(\sqrt{\lambda} x\big)$

where $A$ and $B$ are again to be determined from the boundary conditions. From the boundary condition $v(0) = 0$ we conclude that $A = 0$ so the equation reduces to

$v = B \mathrm{sin} \big(\sqrt{\lambda} x\big)$

The boundary condition $v(\pi) = 0$ now requires $B \mathrm{sin}\big(\sqrt{\lambda} \pi\big) = 0$, so if $B \neq 0$ we must have $\sqrt{\lambda} = n$ where $n = 1, 2, 3, \ldots$. Thus the eigenvalues of this Sturm-Liouville problem are $\lambda_n = n^2$ for $n = 1, 2, 3, \ldots$, and the corresponding eigenfunctions are $v = B \mathrm{sin}\big(n x\big)$. The coefficient $B$ is undetermined and must be specified through some normalisation process, for example by setting the integral of $v^2$ between $0$ and $\pi$ equal to $1$, which gives $B = \sqrt{2/\pi}$. Courant & Hilbert have (implicitly) simply set $B = 1$.
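The case analysis above can also be checked numerically. The following is a minimal sketch (my own illustration, not something taken from Courant & Hilbert) using a simple shooting method: integrate the differential equation from $x = 0$ with $v(0) = 0$ and $v^{\prime}(0) = 1$, and look for the values of $\lambda$ at which $v(\pi)$ changes sign. The function names, the scan range and the step counts are all arbitrary choices made for the illustration.

```python
import math

def shoot(lam, steps=1000):
    """Integrate v'' = -lam * v from x = 0 to x = pi with v(0) = 0, v'(0) = 1
    using the classical RK4 scheme, and return v(pi)."""
    h = math.pi / steps
    v, dv = 0.0, 1.0
    for _ in range(steps):
        # RK4 for the first-order system (v, dv)' = (dv, -lam * v)
        k1v, k1d = dv, -lam * v
        k2v, k2d = dv + 0.5 * h * k1d, -lam * (v + 0.5 * h * k1v)
        k3v, k3d = dv + 0.5 * h * k2d, -lam * (v + 0.5 * h * k2v)
        k4v, k4d = dv + h * k3d, -lam * (v + h * k3v)
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        dv += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6
    return v

def bisect(f, lo, hi, tol=1e-8):
    """Find a root of f in [lo, hi], assuming f changes sign there."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def eigenvalues(lam_min=-5.05, lam_max=29.95, samples=350):
    """Scan lambda for sign changes of v(pi) and refine each one by bisection."""
    grid = [lam_min + (lam_max - lam_min) * i / samples for i in range(samples + 1)]
    values = [shoot(lam) for lam in grid]
    roots = []
    for l0, l1, f0, f1 in zip(grid, grid[1:], values, values[1:]):
        if f0 * f1 < 0:
            roots.append(bisect(shoot, l0, l1))
    return roots

found = eigenvalues()
print([round(lam, 4) for lam in found])
```

The scan deliberately includes negative values of $\lambda$ and $\lambda = 0$. No sign change should be found there, and the printed values should be close to $1, 4, 9, 16, 25$, in line with $\lambda_n = n^2$.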

Some features of this solution are typical of Sturm-Liouville eigenvalue problems in physics more generally. For example, the eigenvalues are real (rather than complex) numbers, there is a minimum eigenvalue ($\lambda_1 = 1$) but not a maximum one, and for each eigenvalue there is a unique eigenfunction (up to a multiplicative constant). Also, importantly, the eigenfunctions here form a complete and orthogonal set of functions. Orthogonality refers to the fact that the integral of a product of any two distinct eigenfunctions over the interval $(0, \pi)$ is zero, i.e.,

$\int_0^{\pi} \mathrm{sin}(nx) \mathrm{sin}(mx) \mathrm{d} x = 0$

for $n \neq m$, as can easily be demonstrated in the same way as in the theory of Fourier series. Completeness refers to the fact that over the interval $(0, \pi)$ the infinite set of functions $\mathrm{sin} (nx)$, $n = 1, 2, 3, \ldots$, can be used to represent any sufficiently well behaved function $f(x)$ using a Fourier series of the form

$f(x) = \sum_{n=1}^{\infty} a_n \mathrm{sin} (nx)$
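Both properties are easy to check numerically. The sketch below is an illustration under my own assumptions: the test function $f(x) = x(\pi - x)$ is an arbitrary choice, the coefficients are computed with the standard Fourier sine formula $a_n = \frac{2}{\pi} \int_0^{\pi} f(x) \mathrm{sin}(nx) \mathrm{d} x$, and the integrals are approximated with Simpson's rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner(n, m):
    """Integral of sin(nx) sin(mx) over (0, pi)."""
    return simpson(lambda x: math.sin(n * x) * math.sin(m * x), 0.0, math.pi)

def sine_coefficient(f, n):
    """a_n = (2/pi) * integral of f(x) sin(nx) over (0, pi)."""
    return (2 / math.pi) * simpson(lambda x: f(x) * math.sin(n * x), 0.0, math.pi)

def partial_sum(coeffs, x):
    """Evaluate sum_{n=1}^{N} a_n sin(nx), with coeffs[0] holding a_1."""
    return sum(a * math.sin((n + 1) * x) for n, a in enumerate(coeffs))

f = lambda x: x * (math.pi - x)          # illustrative test function
coeffs = [sine_coefficient(f, n) for n in range(1, 26)]

print(inner(2, 5), inner(3, 3))          # roughly 0 and pi/2
print(partial_sum(coeffs, 1.0), f(1.0))  # the two values should nearly agree
```

For this particular $f$ the coefficients can also be found by hand: $a_n = 8/(\pi n^3)$ for odd $n$ and $a_n = 0$ for even $n$, which the computed values should reproduce.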

All of this is alluded to (without explicit explanation at this stage) in the subsequent part of this section of Courant & Hilbert’s text, where they go on to provide the general solution of the vibrating string problem. They write the following:

The properties of completeness and orthogonality of the eigenfunctions are again a typical feature of the solutions of Sturm-Liouville eigenvalue problems more generally, and this is one of the main reasons why Sturm-Liouville theory is so important to the solution of physical problems involving differential equations. To get a better understanding of this, I will now develop Sturm-Liouville theory in a more general setting by starting with a standard second-order homogeneous linear differential equation of the form

$\alpha(x) \frac{\mathrm{d}^2 y}{\mathrm{d} x^2} + \beta(x) \frac{\mathrm{d} y}{\mathrm{d} x} + \gamma(x) y = 0$

where the variable $x$ is confined to an interval $a \leq x \leq b$.

Let

$p(x) = \mathrm{exp} \bigg(\int \mathrm{d} x \frac{\beta(x)}{\alpha(x)}\bigg)$

$q(x) = \frac{\gamma(x)}{\alpha(x)} p(x)$

Dividing the differential equation by $\alpha(x)$ and multiplying through by $p(x)$ we get

$p(x) \frac{\mathrm{d}^2 y}{\mathrm{d} x^2} + \frac{\beta(x)}{\alpha(x)} p(x) \frac{\mathrm{d} y}{\mathrm{d} x} + \frac{\gamma(x)}{\alpha(x)} p(x) y = 0$

$\iff$

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} y}{\mathrm{d} x} \bigg) + q(x) y = 0$

$\iff$

$L y = 0$

where

$L = \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} }{\mathrm{d} x} \bigg) + q(x)$

is called the Sturm-Liouville differential operator. Thus, we see already that a wide variety of second-order differential equations encountered in physics can be put into a form involving the operator $L$, so results concerning the properties of $L$ will have wide applicability.
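As a quick illustration (my own choice of example, not one taken from Courant & Hilbert), consider Hermite’s equation

$\frac{\mathrm{d}^2 y}{\mathrm{d} x^2} - 2x \frac{\mathrm{d} y}{\mathrm{d} x} + 2n y = 0$

Here $\alpha(x) = 1$, $\beta(x) = -2x$ and $\gamma(x) = 2n$, so, taking the constant of integration to be zero,

$p(x) = \mathrm{exp} \bigg(\int \mathrm{d} x \, (-2x)\bigg) = e^{-x^2}$

$q(x) = 2n e^{-x^2}$

and the equation becomes

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(e^{-x^2} \frac{\mathrm{d} y}{\mathrm{d} x} \bigg) + 2n e^{-x^2} y = 0$

Expanding the derivative gives $e^{-x^2}$ times the original equation, which confirms the result.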

Using the Sturm-Liouville operator we can now write the defining differential equation of Sturm-Liouville theory in an eigenvalue-eigenfunction format that is very reminiscent of the setup in quantum mechanics outlined at the start of this note. The defining differential equation is

$L \phi = - \lambda w \phi$

where $w(x)$ is a real-valued positive weight function and $\lambda$ is an eigenvalue corresponding to the eigenfunction $\phi$. This differential equation is often written out in full as

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} \phi}{\mathrm{d} x} \bigg) + \big(q(x) + \lambda w(x)\big) \phi = 0$

with $x \in [a, b]$. In Sturm-Liouville problems, the functions $p(x)$, $q(x)$ and $w(x)$ are specified at the start and, crucially, the function $\phi$ is required to satisfy particular boundary conditions at $a$ and $b$. The boundary conditions are a key aspect of each Sturm-Liouville problem; for a given form of the differential equation, different boundary conditions can produce very different problems. Solving a Sturm-Liouville problem involves finding the values of $\lambda$ for which there exist non-trivial solutions of the defining differential equation above subject to the specified boundary conditions. The vibrating string problem in Courant & Hilbert (discussed above) is a simple example. We obtain the differential equation (13) in that problem by setting $p(x) = 1$, $q(x) = 0$ and $w(x) = 1$ in the defining Sturm-Liouville differential equation.

We would now like to prove that the eigenvalues in a Sturm-Liouville problem will always be real and that the eigenfunctions will form an orthogonal set of functions, as claimed earlier. To do this, we need to consider a few more developments. In Sturm-Liouville theory we can apply $L$ to both real and complex functions, and a key role is played by the concept of the inner product of such functions. Using the notation $f(x)^{*}$ to denote the complex conjugate of the function $f(x)$, we define the inner product of two functions $f$ and $g$ over the interval $a \leq x \leq b$ as

$(f, g) = \int_a^b \mathrm{d} x f(x)^{*} g(x)$

and we define the weighted inner product as

$(f, g)_w = \int_a^b \mathrm{d} x w(x) f(x)^{*} g(x)$

where $w(x)$ is the real-valued positive weight function mentioned earlier. A key result in the theory is Lagrange’s identity, which says that, provided $p(x)$ and $q(x)$ are real-valued, for any two (sufficiently differentiable) complex-valued functions of a real variable $u(x)$ and $v(x)$ we have

$v(Lu)^{*} - u^{*} Lv = \frac{\mathrm{d}}{\mathrm{d} x} \bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]$

This follows from the form of $L$, since

$v(Lu)^{*} - u^{*} Lv = v\bigg[\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d}u^{*}}{\mathrm{d} x} \bigg) + q(x) u^{*}\bigg] - u^{*} \bigg[\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} v}{\mathrm{d} x} \bigg) + q(x) v\bigg]$

$= v \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d}u^{*}}{\mathrm{d} x} \bigg) - u^{*} \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} v}{\mathrm{d} x} \bigg)$

$= v \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d}u^{*}}{\mathrm{d} x} \bigg) + \frac{\mathrm{d} v}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d}u^{*}}{\mathrm{d} x} \bigg) - u^{*} \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} v}{\mathrm{d} x} \bigg) - \frac{\mathrm{d} u^{*}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} v}{\mathrm{d} x} \bigg)$

$= \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) v \frac{\mathrm{d}u^{*}}{\mathrm{d} x} \bigg) - \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) u^{*} \frac{\mathrm{d} v}{\mathrm{d} x} \bigg)$

$= \frac{\mathrm{d}}{\mathrm{d} x} \bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]$
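A numerical spot check of the identity can be reassuring. In the sketch below the coefficient functions $p(x) = 1 + x^2$ and $q(x) = x$ and the test functions $u(x) = e^{ix}$ and $v(x) = (2 - i)x^3$ are arbitrary choices of mine, with derivatives worked out by hand. The left-hand side is evaluated directly and the right-hand side is approximated by a central difference.

```python
import cmath

# Illustrative real coefficient functions (assumptions for this check only).
p = lambda x: 1 + x * x
dp = lambda x: 2 * x
q = lambda x: x

# Complex test functions with hand-computed first and second derivatives.
u = lambda x: cmath.exp(1j * x)
du = lambda x: 1j * cmath.exp(1j * x)
d2u = lambda x: -cmath.exp(1j * x)
v = lambda x: (2 - 1j) * x ** 3
dv = lambda x: (2 - 1j) * 3 * x ** 2
d2v = lambda x: (2 - 1j) * 6 * x

def L(f, df, d2f, x):
    """Apply L = d/dx(p d/dx) + q, expanded as p f'' + p' f' + q f."""
    return p(x) * d2f(x) + dp(x) * df(x) + q(x) * f(x)

def bracket(x):
    """The quantity p (v du*/dx - u* dv/dx) inside the total derivative."""
    return p(x) * (v(x) * du(x).conjugate() - u(x).conjugate() * dv(x))

def lhs(x):
    """Left-hand side of Lagrange's identity: v (Lu)* - u* Lv."""
    return v(x) * L(u, du, d2u, x).conjugate() - u(x).conjugate() * L(v, dv, d2v, x)

def rhs(x, h=1e-5):
    """Right-hand side, using a central difference for d/dx of the bracket."""
    return (bracket(x + h) - bracket(x - h)) / (2 * h)

for x in (0.3, 1.0, 2.5):
    print(x, abs(lhs(x) - rhs(x)))   # differences should be tiny
```

The agreement is limited only by the finite-difference error. Note that the check relies on $p$ and $q$ being real, since that is what allows the complex conjugate to be moved inside $L$.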

Using the inner product notation, we can write Lagrange’s identity in an alternative form that reveals the crucial role played by the boundary conditions in a Sturm-Liouville problem. We have

$(Lu, v) - (u, Lv) = \int_a^b (Lu)^{*} v \mathrm{d} x - \int_a^b u^{*} Lv \mathrm{d} x$

$= \int_a^b \frac{\mathrm{d}}{\mathrm{d} x} \bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg] \mathrm{d} x$

$= \int_a^b \mathrm{d} \bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]$

$= \bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]_a^b$

For some boundary conditions the final term here is zero and then we will have

$(Lu, v) = (u, Lv)$

When this happens, the operator in conjunction with the boundary conditions is said to be self-adjoint. As an example, a so-called regular Sturm-Liouville problem involves solving the differential equation

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} \phi}{\mathrm{d} x} \bigg) + \big(q(x) + \lambda w(x)\big) \phi = 0$

subject to what are called separated boundary conditions, taking the form

$A_1 \phi(a) + A_2 \phi^{\prime}(a) = 0$

and

$B_1 \phi(b) + B_2 \phi^{\prime}(b) = 0$

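In this case, the operator $L$ is self-adjoint. To see this, suppose the functions $u$ and $v$ satisfy these boundary conditions, where the constants $A_1$, $A_2$, $B_1$ and $B_2$ are real, with $A_1$ and $A_2$ not both zero and likewise for $B_1$ and $B_2$. Then at $a$ we have (taking the complex conjugate of the condition on $u$)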

$A_1 u(a)^{*} + A_2 u^{\prime}(a)^{*} = 0$

and

$A_1 v(a) + A_2 v^{\prime}(a) = 0$

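from which we can deduce (assuming $A_2$, $u(a)$ and $v(a)$ are all non-zero; in the remaining cases the final relation below still holds, because the two conditions form a homogeneous linear system for $(A_1, A_2)$ with a non-trivial solution, so its determinant must vanish) that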

$\frac{u^{\prime}(a)^{*}}{u(a)^{*}} = -\frac{A_1}{A_2} = \frac{v^{\prime}(a)}{v(a)}$

$\implies$

$v(a) u^{\prime}(a)^{*} = u(a)^{*} v^{\prime}(a)$

Similarly, at the boundary point $b$ we find that

$v(b) u^{\prime}(b)^{*} = u(b)^{*} v^{\prime}(b)$

These results then imply

$\bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]_a^b = 0$

so the operator $L$ is self-adjoint as claimed. As another example, a singular Sturm-Liouville problem involves solving the same differential equation as in the regular problem, but with $p(x)$ equal to zero at either $a$ or $b$ or both, while being positive for $a < x < b$. At an endpoint where $p(x)$ vanishes, the solutions are typically required only to remain bounded; if $p(x)$ does not vanish at one of the boundary points, then $\phi$ is required to satisfy the same boundary condition at that point as in the regular problem. Provided $u$, $v$ and their first derivatives remain bounded at any endpoint where $p(x)$ vanishes, we will have

$\bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]_a^b = 0$

in this case too, so the operator $L$ will also be self-adjoint in the case of a singular Sturm-Liouville problem. As a final example, suppose the Sturm-Liouville problem involves solving the same differential equation as before, but with periodic boundary conditions of the form

$\phi(a) = \phi(b)$

$\phi^{\prime}(a) = \phi^{\prime}(b)$

and

$p(a) = p(b)$

Then if $u$ and $v$ are two functions satisfying these boundary conditions we will have

$\bigg[p(x) \bigg(v \frac{\mathrm{d}u^{*}}{\mathrm{d}x} - u^{*}\frac{\mathrm{d}v}{\mathrm{d}x}\bigg) \bigg]_a^b$

$= p(b) \bigg(v(b) u^{\prime}(b)^{*} - u(b)^{*} v^{\prime}(b)\bigg) - p(a) \bigg(v(a) u^{\prime}(a)^{*} - u(a)^{*} v^{\prime}(a)\bigg)$

$= p(a) \bigg[\bigg(v(b) u^{\prime}(b)^{*} - v(a) u^{\prime}(a)^{*}\bigg) + \bigg(u(a)^{*} v^{\prime}(a) - u(b)^{*} v^{\prime}(b)\bigg)\bigg] = 0$

So again, the operator $L$ will be self-adjoint in the case of periodic boundary conditions. We will see later that the singular and periodic cases arise when attempting to solve Schrödinger’s equation for the hydrogen atom.
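The vanishing of the boundary term is easy to check on concrete functions. In the sketch below all of the functions are hypothetical examples constructed by hand: two polynomials satisfying the separated conditions $\phi(0) = 0$ and $\phi(1) + \phi^{\prime}(1) = 0$ on $[0, 1]$, two complex exponentials satisfying periodic conditions on $[0, 2\pi]$, and a pair of functions that satisfies neither set of conditions, for contrast.

```python
import cmath
import math

def boundary_term(p, u, du, v, dv, a, b):
    """Evaluate [p (v du*/dx - u* dv/dx)] between a and b."""
    def at(x):
        return p(x) * (v(x) * du(x).conjugate() - u(x).conjugate() * dv(x))
    return at(b) - at(a)

# Separated conditions on [0, 1]: phi(0) = 0 and phi(1) + phi'(1) = 0.
# The polynomials below were constructed by hand to satisfy both.
p1 = lambda x: 1 + x * x
u1 = lambda x: complex(3 * x - 2 * x ** 2)
du1 = lambda x: complex(3 - 4 * x)
c = 2 - 1j                                  # arbitrary complex factor
v1 = lambda x: c * (4 * x ** 2 - 3 * x ** 3)
dv1 = lambda x: c * (8 * x - 9 * x ** 2)
separated = boundary_term(p1, u1, du1, v1, dv1, 0.0, 1.0)

# Periodic conditions on [0, 2*pi], with p(0) = p(2*pi).
p2 = lambda x: 2 + math.cos(x)
u2 = lambda x: cmath.exp(1j * x)
du2 = lambda x: 1j * cmath.exp(1j * x)
v2 = lambda x: cmath.exp(2j * x)
dv2 = lambda x: 2j * cmath.exp(2j * x)
periodic = boundary_term(p2, u2, du2, v2, dv2, 0.0, 2 * math.pi)

# Functions that ignore the boundary conditions: u = 1, v = x on [0, 1].
one = lambda x: complex(1.0)
zero = lambda x: complex(0.0)
ident = lambda x: complex(x)
violating = boundary_term(p1, one, zero, ident, one, 0.0, 1.0)

print(abs(separated), abs(periodic), violating)
```

The first two values should be zero up to rounding error, while the last one works out by hand to $-p(1) + p(0) = -1$. This shows that the self-adjoint property really does belong to the operator together with its boundary conditions, and not to $L$ alone.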

The key reason for focusing so much on the self-adjoint property of the operator $L$ is that the eigenvalues of a self-adjoint operator are always real, and eigenfunctions corresponding to distinct eigenvalues are orthogonal. Note that by orthogonality of the eigenfunctions in the more general context we mean that

$(\phi_n, \phi_m)_w = \int_a^b \mathrm{d} x w(x) \phi_n(x)^{*} \phi_m(x) = 0$

whenever $\phi_n(x)$ and $\phi_m(x)$ are eigenfunctions corresponding to two distinct eigenvalues.

To prove that the eigenvalues are always real, suppose that $\phi(x)$ is an eigenfunction corresponding to an eigenvalue $\lambda$. Then we have

$L \phi = - \lambda w \phi$

and so

$(L \phi, \phi) = (- \lambda w \phi, \phi) = \int_a^b (- \lambda w \phi)^{*} \phi \mathrm{d} x = -\lambda^{*} \int_a^b (w \phi)^{*} \phi \mathrm{d} x = -\lambda^{*}\int_a^b \mathrm{d}x w(x)|\phi(x)|^2$

But we also have

$(\phi, L \phi) = (\phi, - \lambda w \phi) = \int_a^b \phi^{*}(- \lambda w \phi) \mathrm{d} x = -\lambda \int_a^b \phi^{*} (w \phi) \mathrm{d} x = -\lambda\int_a^b \mathrm{d}x w(x)|\phi(x)|^2$

Therefore if the operator is self-adjoint we can write

$(L \phi, \phi) - (\phi, L \phi) = (\lambda - \lambda^{*}) \int_a^b \mathrm{d}x w(x)|\phi(x)|^2 = 0$

$\implies$

$\lambda = \lambda^{*}$

since $\int_a^b \mathrm{d}x w(x)|\phi(x)|^2 > 0$, so the eigenvalues must be real. In particular, this must be the case for regular and singular Sturm-Liouville problems, and for Sturm-Liouville problems involving periodic boundary conditions.

To prove that the eigenfunctions are orthogonal, let $\phi(x)$ and $\psi(x)$ denote two eigenfunctions corresponding to distinct eigenvalues $\lambda$ and $\mu$ respectively. Then we have

$L \phi = - \lambda w \phi$

$L \psi = - \mu w \psi$

and so by the self-adjoint property, and using the fact just proved that $\lambda$ is real, we can write

$(L \phi, \psi) - (\phi, L \psi) = \int_a^b (- \lambda w \phi)^{*} \psi \mathrm{d} x - \int_a^b \phi^{*} (- \mu w \psi) \mathrm{d} x$

$= (\mu - \lambda) \int_a^b \mathrm{d}x w(x)\phi(x)^{*} \psi(x) = 0$

Since the eigenvalues are distinct, the only way this can happen is if

$(\phi, \psi)_w = \int_a^b \mathrm{d}x w(x)\phi(x)^{*} \psi(x) = 0$

so the eigenfunctions must be orthogonal as claimed.
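To see the role of the weight function concretely, here is a small numerical sketch based on an example of my own choosing: the problem $\frac{\mathrm{d}}{\mathrm{d} x}\big(x \frac{\mathrm{d} \phi}{\mathrm{d} x}\big) + \frac{\lambda}{x} \phi = 0$ on $[1, e]$ with $\phi(1) = \phi(e) = 0$, for which $p(x) = x$, $q(x) = 0$ and $w(x) = 1/x$. Direct substitution shows that $\phi_n(x) = \mathrm{sin}(n \pi \ln x)$ is an eigenfunction with eigenvalue $\lambda_n = n^2 \pi^2$. The integrals are approximated with Simpson's rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phi(n):
    """Eigenfunction sin(n*pi*ln x) of d/dx(x phi') + (lam/x) phi = 0 on [1, e]
    with phi(1) = phi(e) = 0; the eigenvalue is lam_n = (n*pi)**2."""
    return lambda x: math.sin(n * math.pi * math.log(x))

w = lambda x: 1.0 / x        # weight function of this problem

def weighted_inner(f, g):
    """Weighted inner product (f, g)_w for real-valued f and g."""
    return simpson(lambda x: w(x) * f(x) * g(x), 1.0, math.e)

def plain_inner(f, g):
    """Unweighted inner product (f, g) for real-valued f and g."""
    return simpson(lambda x: f(x) * g(x), 1.0, math.e)

print(weighted_inner(phi(1), phi(2)))   # close to 0
print(weighted_inner(phi(1), phi(1)))   # close to 1/2
print(plain_inner(phi(1), phi(2)))      # clearly non-zero
```

The eigenfunctions are orthogonal with respect to the weighted inner product but not with respect to the unweighted one, which is why the weight function $w(x)$ has to appear in the definition of orthogonality.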

In addition to being orthogonal, the eigenfunctions $\phi_n(x)$, $n = 1, 2, 3, \dots$, of a Sturm-Liouville problem with specified boundary conditions also form a complete set of functions (I will not prove this here), which means that any sufficiently well-behaved function $f(x)$ for which $\int_a^b\mathrm{d} x |f(x)|^2$ exists can be represented by a generalised Fourier series of the form

$f(x) = \sum_{n=1}^{\infty} a_n \phi_n(x)$

for $x \in [a, b]$, where the coefficients $a_n$ are given by the formula

$a_n = \frac{(\phi_n, f)_w}{(\phi_n, \phi_n)_w} = \frac{\int_a^b \mathrm{d}x w(x) \phi_n(x)^{*} f(x)}{\int_a^b \mathrm{d}x w(x) |\phi_n(x)|^2}$

It is the completeness and orthogonality of the eigenfunctions that makes Sturm-Liouville theory so useful in solving linear differential equations, because (for example) it means that the solutions of many second-order inhomogeneous linear differential equations of the form

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} \phi}{\mathrm{d} x} \bigg) + q(x) \phi = F(x)$

with suitable boundary conditions can be expressed as a linear combination of the eigenfunctions of the corresponding Sturm-Liouville problem

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} \phi}{\mathrm{d} x} \bigg) + \big(q(x) + \lambda w(x)\big) \phi = 0$

with the same boundary conditions. To illustrate this, suppose this Sturm-Liouville problem with boundary conditions $\phi(a) = \phi(b) = 0$ has an infinite set of eigenvalues $\lambda_k$ and corresponding eigenfunctions $\phi_k(x)$, $k = 1, 2, 3, \dots$, which are orthogonal and form a complete set. We will assume that the solution of the inhomogeneous differential equation above is an infinite series of the form

$\phi(x) = \sum_{k = 1}^{\infty} a_k \phi_k(x)$

where the coefficients $a_k$ are constants, and we will find these coefficients using the orthogonality of the eigenfunctions. Since for each $k$ it is true that

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p \frac{\mathrm{d} \phi_k}{\mathrm{d} x} \bigg) + q \phi_k = - \lambda_k w(x) \phi_k$

we can write

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p \frac{\mathrm{d} \phi}{\mathrm{d} x} \bigg) + q \phi$

$= \frac{\mathrm{d}}{\mathrm{d} x} \bigg(p \sum_{k = 1}^{\infty} a_k \frac{\mathrm{d} \phi_k}{\mathrm{d} x} \bigg) + q \sum_{k=1}^{\infty} a_k \phi_k$

$= \sum_{k=1}^{\infty} a_k \bigg[\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p \frac{\mathrm{d} \phi_k}{\mathrm{d} x} \bigg) + q \phi_k\bigg]$

$= \sum_{k=1}^{\infty} a_k\big[- \lambda_k w(x) \phi_k\big]$

$= - \sum_{k=1}^{\infty} a_k \lambda_k w(x) \phi_k$

Thus, in the inhomogeneous equation

$\frac{\mathrm{d}}{\mathrm{d} x} \bigg(p(x) \frac{\mathrm{d} \phi}{\mathrm{d} x} \bigg) + q(x) \phi = F(x)$

we can put

$F(u) = - \sum_{k=1}^{\infty} a_k \lambda_k w(u) \phi_k(u)$

To find the $m$th coefficient $a_m$ we can multiply both sides by $\phi_m(u)^{*}$ and integrate. By orthogonality, all the terms in the sum on the right will vanish except the one involving $\phi_m(u)$. We will get

$\int_a^b \phi_m(u)^{*} F(u) \mathrm{d}u = - \int_a^b a_m \lambda_m w(u) \phi_m(u)^{*}\phi_m(u) \mathrm{d} u = -a_m \lambda_m (\phi_m, \phi_m)_w$

$\implies$

$a_m = -\int_a^b \frac{\phi_m(u)^{*} F(u)}{\lambda_m (\phi_m, \phi_m)_w}\mathrm{d} u$

Having found a formula for the coefficients $a_k$, we can now write the solution of the original inhomogeneous differential equation as

$\phi(x) = \sum_{k = 1}^{\infty} a_k \phi_k(x)$

$= \sum_{k = 1}^{\infty} \bigg(-\int_a^b \frac{\phi_k(u)^{*} F(u)}{\lambda_k (\phi_k, \phi_k)_w}\mathrm{d} u\bigg) \phi_k(x)$

$= \int_a^b \mathrm{d} u \bigg(-\sum_{k = 1}^{\infty} \frac{\phi_k(u)^{*} \phi_k(x)}{\lambda_k (\phi_k, \phi_k)_w}\bigg) F(u)$

$= \int_a^b \mathrm{d} u G(x, u) F(u)$

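where

$G(x, u) \equiv -\sum_{k = 1}^{\infty} \frac{\phi_k(u)^{*} \phi_k(x)}{\lambda_k (\phi_k, \phi_k)_w}$

is the Green’s function for the problem. Note that this construction assumes that none of the eigenvalues $\lambda_k$ is zero.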
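As a concrete check of this construction, consider the vibrating string operator with $p(x) = 1$, $q(x) = 0$ and $w(x) = 1$ on $[0, \pi]$, for which $\phi_k(x) = \mathrm{sin}(kx)$, $\lambda_k = k^2$ and $(\phi_k, \phi_k)_w = \pi/2$. The sketch below is my own illustration: the forcing term $F = 1$ is an arbitrary choice, for which the exact solution of $\phi^{\prime\prime} = 1$ with $\phi(0) = \phi(\pi) = 0$ is $\phi(x) = \frac{1}{2} x (x - \pi)$. The coefficients $a_k$ are computed numerically from the formula above and the truncated series is compared with the exact solution.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def solve_by_expansion(F, x, terms=60):
    """Approximate the solution of phi'' = F on (0, pi), phi(0) = phi(pi) = 0,
    by truncating phi(x) = sum_k a_k sin(kx) with
    a_k = -(phi_k, F) / (lambda_k (phi_k, phi_k)) and lambda_k = k**2."""
    norm = math.pi / 2                      # (phi_k, phi_k) for phi_k = sin(kx)
    total = 0.0
    for k in range(1, terms + 1):
        overlap = simpson(lambda s: math.sin(k * s) * F(s), 0.0, math.pi)
        a_k = -overlap / (k * k * norm)
        total += a_k * math.sin(k * x)
    return total

F = lambda s: 1.0                           # illustrative forcing term
exact = lambda x: 0.5 * x * (x - math.pi)   # solves phi'' = 1 with the same BCs

for x in (0.5, math.pi / 2, 2.5):
    print(x, solve_by_expansion(F, x), exact(x))
```

The two columns of values should agree to about three decimal places, the remaining discrepancy coming from truncating the series at a finite number of terms.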

To conclude this note, I want to go back to a previous note in which I explored in detail the solution of Schrödinger’s equation for the hydrogen atom by the method of separation of variables. This approach reduced Schrödinger’s partial differential equation into a set of three uncoupled ordinary differential equations which we can now see are in fact Sturm-Liouville problems. As discussed in my previous note, Schrödinger’s three-dimensional equation for the hydrogen atom can be written in spherical polar coordinates as

$\frac{1}{r^2} \frac{\partial }{\partial r}\big( r^2 \frac{\partial \psi}{\partial r}\big) + \frac{1}{r^2 \sin \theta}\frac{\partial }{\partial \theta}\big( \sin \theta \frac{\partial \psi}{\partial \theta} \big) + \frac{1}{r^2 \sin^2 \theta}\frac{\partial^2 \psi}{\partial \phi^2} + \frac{2m_e}{\hbar^2}(E - U) \psi = 0$

and after solving this by the usual separation of variables approach starting from the assumption that the $\psi$ function can be expressed as a product

$\psi(r, \theta, \phi) = R(r) \Phi(\phi) \Theta(\theta)$

we end up with an equation for $R$ (the radial equation) of the form

$\frac{1}{r^2} \frac{d}{d r}\big( r^2 \frac{d R}{d r}\big) + \big[ \frac{2m_e}{\hbar^2}(E - U) - \frac{\lambda}{r^2} \big] R = 0$

and equations for $\Phi$ and $\Theta$ of the forms

$\frac{d^2 \Phi}{d \phi^2} + k \Phi = 0$

and

$\frac{1}{\sin \theta}\frac{d}{d \theta}\big(\sin \theta \frac{d \Theta}{d \theta}\big) + \big( \lambda - \frac{k}{\sin^2 \theta}\big) \Theta = 0$

respectively. Taking each of these in turn, we first observe that multiplying the radial equation through by $r^2$ puts it in the Sturm-Liouville form with $p(r) = r^2$, $q(r) = -\lambda - \frac{2m_e}{\hbar^2} r^2 U$ and weight function $w(r) = \frac{2m_e}{\hbar^2} r^2$, with the energy $E$ playing the role of the eigenvalue (in this equation $\lambda$ is a separation constant rather than the eigenvalue). The variable $r$ can range between $0$ and $\infty$ and the boundary conditions are formulated in such a way that the solutions of the radial equation remain bounded as $r \rightarrow 0$ and go to zero as $r \rightarrow \infty$. Furthermore, since $p(0) = 0$, the radial equation is a singular Sturm-Liouville problem.

Next, we observe that the equation for $\Phi$ is essentially the same as equation (13) for the vibrating string in the extract from Courant & Hilbert discussed at the start of this note, with $k$ playing the role of the eigenvalue. The azimuth angle $\phi$ can take any value in $(-\infty, \infty)$ but the function $\Phi$ must take a single value at each point in space (since this is a required property of the quantum wave function of which $\Phi$ is a constituent). It follows that the function $\Phi$ must be periodic, since it must take the same value at $\phi$ and $\phi + 2\pi$ for any given $\phi$. This implies the conditions $\Phi(0) = \Phi(2 \pi)$ and $\Phi^{\prime}(0) = \Phi^{\prime}(2\pi)$. Furthermore, we have $p(\phi) = 1$ for all $\phi$, so $p(0) = p(2\pi)$. Thus, the equation for $\Phi$ is a Sturm-Liouville problem with periodic boundary conditions.

Finally, as discussed in my previous note, the $\Theta$ equation can be rewritten as

$(1 - x^2) \frac{d^2 \Theta}{d x^2} - 2x \frac{d \Theta}{d x} + \big(\lambda - \frac{m^2}{1 - x^2} \big) \Theta = 0$

$\iff$

$\frac{d}{d x}\bigg((1 - x^2) \frac{d \Theta}{d x}\bigg) + \big(\lambda - \frac{m^2}{1 - x^2} \big) \Theta = 0$

where $x = \cos \theta$ (so that $-1 \leq x \leq 1$) and the separation constant has been written as $k = m^2$. This is a Sturm-Liouville problem with $p(x) = 1 - x^2$, $q(x) = -\frac{m^2}{1 - x^2}$ and $w(x) = 1$, and the boundary conditions are given by the requirement that $\Theta$ should remain bounded for all $x \in [-1, 1]$. Since $p(x) = 0$ at both ends of the interval $[-1, 1]$, this equation can be classified as a singular Sturm-Liouville problem. The eigenvalue is $\lambda$ in this equation.
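As a small sanity check of this last case, the sketch below takes $m = 0$ and uses the standard result (not derived in this note) that the bounded solutions are then the Legendre polynomials $P_l(x)$ with eigenvalues $\lambda = l(l+1)$. The first few polynomials and their derivatives are entered by hand. The code checks that they satisfy the differential equation and that they are orthogonal on $[-1, 1]$ with weight $w(x) = 1$.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Legendre polynomials P_1, P_2, P_3 as tuples (P, P', P''), entered by hand.
legendre = {
    1: (lambda x: x, lambda x: 1.0, lambda x: 0.0),
    2: (lambda x: 0.5 * (3 * x ** 2 - 1), lambda x: 3 * x, lambda x: 3.0),
    3: (lambda x: 0.5 * (5 * x ** 3 - 3 * x),
        lambda x: 0.5 * (15 * x ** 2 - 3),
        lambda x: 15 * x),
}

def residual(l, x):
    """(1 - x^2) P'' - 2x P' + l(l+1) P, i.e. the Theta equation with m = 0."""
    P, dP, d2P = legendre[l]
    return (1 - x * x) * d2P(x) - 2 * x * dP(x) + l * (l + 1) * P(x)

def inner(l1, l2):
    """Inner product of P_l1 and P_l2 over [-1, 1] with weight w(x) = 1."""
    return simpson(lambda x: legendre[l1][0](x) * legendre[l2][0](x), -1.0, 1.0)

for l in (1, 2, 3):
    print(l, [residual(l, x) for x in (-0.9, -0.2, 0.4, 1.0)])  # all close to 0

print(inner(1, 3), inner(2, 2))   # close to 0 and 2/5
```

The value $2/5$ for $(P_2, P_2)$ is an instance of the standard normalisation $\int_{-1}^{1} P_l(x)^2 \mathrm{d} x = \frac{2}{2l + 1}$.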