Sturm-Liouville theory was developed in the 19th century in the context of solving differential equations. When one studies it in depth, however, one experiences a sudden realisation that *this is the mathematics underlying a lot of quantum mechanics*. In quantum mechanics we envisage a quantum state (a time-dependent function) expressed as a superposition of eigenfunctions of a self-adjoint operator (usually referred to as a Hermitian operator) representing an observable. The coefficients of the eigenfunctions in this superposition are probability amplitudes. A measurement of the observable quantity represented by the Hermitian operator produces one of the eigenvalues of the operator, with a probability equal to the squared modulus of the probability amplitude attached to the corresponding eigenfunction in the superposition. It is the fact that the operator is self-adjoint that ensures the eigenvalues are real (and thus observable) and, furthermore, that the eigenfunctions form a complete and orthogonal set of functions, enabling quantum states to be represented as a superposition in the first place (i.e., an eigenfunction expansion akin to a Fourier series).

The Sturm-Liouville theory of the 19th century has essentially this same structure. In fact, Sturm-Liouville eigenvalue problems are important more generally in mathematical physics precisely because they frequently arise in attempting to solve commonly-encountered partial differential equations (e.g., Poisson’s equation, the diffusion equation, the wave equation, etc.), particularly when the method of separation of variables is employed.

I want to get an overview of Sturm-Liouville theory in the present note and will begin by considering a nice discussion of a vibrating string problem in Courant & Hilbert’s classic text, *Methods of Mathematical Physics (Volume I)*. Although the problem is simple and the treatment in Courant & Hilbert a bit terse, it (remarkably) brings up a lot of the key features of Sturm-Liouville theory which apply more generally in a wide variety of physics problems. I will then consider Sturm-Liouville theory in a more general setting, emphasising the role of the Sturm-Liouville differential operator, and finally I will illustrate further the occurrence of Sturm-Liouville systems in physics by looking at the eigenvalue problems encountered when solving Schrödinger’s equation for the hydrogen atom.

On page 287 of Volume I, Courant & Hilbert write the following:

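Equation (12) here is the one-dimensional wave equation

$$\frac{\partial^2 u}{\partial x^2} = \mu^2 \frac{\partial^2 u}{\partial t^2}$$

which (as usual) the authors are going to solve by using a separation of variables of the form

$$u(x, t) = v(x)\, g(t)$$

As Courant & Hilbert explain, the problem then involves finding the function $v(x)$ by solving the second-order homogeneous linear differential equation

$$v'' + \lambda v = 0 \qquad (13)$$

subject to the boundary conditions

$$v(0) = v(\pi) = 0 \qquad (13a)$$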

Although not explicitly mentioned by Courant & Hilbert at this stage, equations (13) and (13a) in fact constitute a full-blown Sturm-Liouville eigenvalue problem. Despite being very simple, this setup captures many of the typical features encountered in a wide variety of such problems in physics. It is instructive to explore the text underneath equation (13a):

*Not all these requirements can be fulfilled for arbitrary values of the constant* $\lambda$*.*

*… the boundary conditions can be fulfilled if and only if* $\lambda = n^2$ *is the square of an integer* $n$*.*

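To clarify this, we can try to solve (13) and (13a) for the three possible cases: $\lambda < 0$, $\lambda = 0$ and $\lambda > 0$. Suppose first that $\lambda < 0$. Then $-\lambda > 0$ and the auxiliary equation for (13) is

$$r^2 + \lambda = 0 \implies r = \pm\sqrt{-\lambda}$$

Thus, we can write the general solution of (13) in this case as

$$v(x) = A\cosh\left(\sqrt{-\lambda}\,x\right) + B\sinh\left(\sqrt{-\lambda}\,x\right)$$

where $A$ and $B$ are constants to be determined from the boundary conditions. From the boundary condition $v(0) = 0$ we conclude that $A = 0$ so the equation reduces to

$$v(x) = B\sinh\left(\sqrt{-\lambda}\,x\right)$$

But from the boundary condition $v(\pi) = 0$ we are forced to conclude that $B = 0$ since $\sinh\left(\sqrt{-\lambda}\,\pi\right) \neq 0$. Therefore there is only the trivial solution $v(x) = 0$ in the case $\lambda < 0$.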

Next, suppose that $\lambda = 0$. Then equation (13) reduces to

$$v'' = 0 \implies v(x) = A + Bx$$

From the boundary condition $v(0) = 0$ we must conclude that $A = 0$, and the boundary condition $v(\pi) = 0$ means we are also forced to conclude that $B = 0$. Thus, again, there is only the trivial solution $v(x) = 0$ in the case $\lambda = 0$.

We see that nontrivial solutions can only be obtained when $\lambda > 0$. In this case we have $-\lambda < 0$ and the auxiliary equation is

$$r^2 + \lambda = 0 \implies r = \pm i\sqrt{\lambda}$$

Thus, we can write the general solution of (13) in this case as

$$v(x) = A\cos\left(\sqrt{\lambda}\,x\right) + B\sin\left(\sqrt{\lambda}\,x\right)$$

where $A$ and $B$ are again to be determined from the boundary conditions. From the boundary condition $v(0) = 0$ we conclude that $A = 0$ so the equation reduces to

$$v(x) = B\sin\left(\sqrt{\lambda}\,x\right)$$

But from the boundary condition $v(\pi) = 0$ we must conclude that, if $B \neq 0$, then we must have $\sqrt{\lambda} = n$ where $n = 1, 2, 3, \ldots$. Thus, we find that for each $n$ the eigenvalue of this Sturm-Liouville problem is $\lambda_n = n^2$, and the corresponding eigenfunction is $v_n(x) = B\sin(nx)$. The coefficient $B$ is undetermined and must be specified through some normalisation process, for example by setting the integral of $v_n^2$ between $0$ and $\pi$ equal to $1$ and then finding the value of $B$ that is consistent with this. In Courant & Hilbert they have (implicitly) simply set $B = 1$.
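As a rough numerical cross-check of the result that the eigenvalues are the squares of the integers, the following sketch discretises $-v''$ on $[0, \pi]$ with a standard second-difference matrix and zero boundary values. The function name `string_eigenvalues` and the grid size are illustrative choices of mine, not anything taken from Courant & Hilbert, and the finite-difference scheme is only one of many ways this could be done.

```python
import numpy as np

def string_eigenvalues(num_interior=400, num_eigs=4):
    """Approximate the smallest eigenvalues of v'' + lam * v = 0, v(0) = v(pi) = 0.

    Hypothetical helper: -v'' is replaced by the usual second-difference
    matrix on the interior grid points, so the boundary values are fixed at 0.
    """
    h = np.pi / (num_interior + 1)
    main = 2.0 * np.ones(num_interior)
    off = -1.0 * np.ones(num_interior - 1)
    A = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    # A is real and symmetric, so eigvalsh returns real eigenvalues in ascending order.
    return np.linalg.eigvalsh(A)[:num_eigs]

approx = string_eigenvalues()
print(approx)  # should be close to 1, 4, 9, 16
```

The approximations sit slightly below $n^2$ because of the discretisation error, which shrinks as the grid is refined.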

Some features of this solution are typical of Sturm-Liouville eigenvalue problems in physics more generally. For example, the eigenvalues are real (rather than complex) numbers, there is a minimum eigenvalue ($\lambda_1 = 1$) but not a maximum one, and for each eigenvalue there is a unique eigenfunction (up to a multiplicative constant). Also, importantly, the eigenfunctions here form a complete and orthogonal set of functions. Orthogonality refers to the fact that the integral of a product of any two distinct eigenfunctions over the interval $[0, \pi]$ is zero, i.e.,

$$\int_0^\pi \sin(mx)\sin(nx)\,dx = 0$$

for $m \neq n$, as can easily be demonstrated in the same way as in the theory of Fourier series. Completeness refers to the fact that over the interval $[0, \pi]$ the infinite set of functions $\sin(nx)$, $n = 1, 2, 3, \ldots$, can be used to represent any sufficiently well behaved function $f(x)$ using a Fourier series of the form

$$f(x) = \sum_{n=1}^{\infty} c_n \sin(nx)$$
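To make orthogonality and completeness on $[0, \pi]$ a little more concrete, here is a minimal numerical sketch. The test function $f(x) = x(\pi - x)$, the number of terms and the helper `trapezoid` are my own illustrative assumptions; the coefficients are computed with the usual sine-series formula $c_n = \frac{2}{\pi}\int_0^\pi f(x)\sin(nx)\,dx$.

```python
import numpy as np

def trapezoid(values, x):
    """Composite trapezoidal rule on a uniform grid (hypothetical helper)."""
    h = x[1] - x[0]
    return h * (np.sum(values) - 0.5 * (values[0] + values[-1]))

x = np.linspace(0.0, np.pi, 4001)

# Orthogonality: the matrix of integrals of sin(mx) sin(nx) should be (pi/2) * identity.
gram = np.array([[trapezoid(np.sin(m * x) * np.sin(n * x), x)
                  for n in range(1, 5)] for m in range(1, 5)])

# Completeness: a truncated sine series for the illustrative function f(x) = x (pi - x).
f = x * (np.pi - x)
coeffs = np.array([(2.0 / np.pi) * trapezoid(f * np.sin(n * x), x)
                   for n in range(1, 26)])
partial_sum = sum(c * np.sin(n * x) for n, c in enumerate(coeffs, start=1))
max_error = np.max(np.abs(partial_sum - f))

print(np.round(gram, 6))
print(max_error)
```

For this particular $f$ the exact coefficients are $8/(\pi n^3)$ for odd $n$ and zero for even $n$, so the partial sums converge quickly.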

All of this is alluded to (without explicit explanation at this stage) in the subsequent part of this section of Courant & Hilbert’s text, where they go on to provide the general solution of the vibrating string problem. They write the following:

The properties of completeness and orthogonality of the eigenfunctions are again a typical feature of the solutions of Sturm-Liouville eigenvalue problems more generally, and this is one of the main reasons why Sturm-Liouville theory is so important to the solution of physical problems involving differential equations. To get a better understanding of this, I will now develop Sturm-Liouville theory in a more general setting by starting with a standard second-order homogeneous linear differential equation of the form

$$P(x)\,y'' + Q(x)\,y' + R(x)\,y = 0$$

where the variable $x$ is confined to an interval $a \leq x \leq b$.

Let

$$p(x) = \exp\left(\int \frac{Q(x)}{P(x)}\,dx\right), \qquad q(x) = \frac{R(x)}{P(x)}\,p(x)$$

Dividing the differential equation by $P(x)$ (assumed non-zero inside the interval) and multiplying through by $p(x)$ we get, using $p' = (Q/P)\,p$,

$$p\,y'' + \frac{Q}{P}\,p\,y' + \frac{R}{P}\,p\,y = \frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)\,y = Ly = 0$$

where

$$L = \frac{d}{dx}\left(p(x)\frac{d}{dx}\right) + q(x)$$

is called the *Sturm-Liouville differential operator*. Thus, we see already that a wide variety of second-order differential equations encountered in physics can be put into a form involving the operator $L$, so results concerning the properties of $L$ will have wide applicability.
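The recipe for the coefficient functions of the Sturm-Liouville operator can be tried out numerically. The sketch below is only an illustration under my own assumptions: the name `sturm_liouville_coefficients` is hypothetical, the integral of the ratio of the first-derivative coefficient to the second-derivative coefficient is taken from the left end of the grid (a different lower limit only rescales the result by a constant), and Bessel's equation $x^2 y'' + x y' + (x^2 - \nu^2) y = 0$ on $[1, 3]$ is used as the example.

```python
import numpy as np

def sturm_liouville_coefficients(P, Q, R, x):
    """Return p and q on the grid x for P y'' + Q y' + R y = 0 (hypothetical helper).

    p = exp(integral of Q/P), with the integral started at x[0] so that p(x[0]) = 1,
    and q = (R/P) * p. The integral uses a cumulative trapezoidal rule.
    """
    ratio = Q(x) / P(x)
    increments = 0.5 * (ratio[1:] + ratio[:-1]) * np.diff(x)
    integral = np.concatenate(([0.0], np.cumsum(increments)))
    p = np.exp(integral)
    q = R(x) / P(x) * p
    return p, q

nu = 2.0  # illustrative order for Bessel's equation
x = np.linspace(1.0, 3.0, 2001)
p, q = sturm_liouville_coefficients(lambda t: t**2,
                                    lambda t: t,
                                    lambda t: t**2 - nu**2,
                                    x)
print(np.max(np.abs(p - x)))  # p(x) should be close to x
```

For this example the exact answer is $p(x) = x$ and $q(x) = x - \nu^2/x$, which corresponds to the familiar self-adjoint form $(x y')' + (x - \nu^2/x)\,y = 0$ of Bessel's equation.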

Using the Sturm-Liouville operator we can now write the defining differential equation of Sturm-Liouville theory in an eigenvalue-eigenfunction format that is very reminiscent of the setup in quantum mechanics outlined at the start of this note. The defining differential equation is

$$Ly = -\lambda\,w(x)\,y$$

where $w(x)$ is a real-valued positive weight function and $\lambda$ is an eigenvalue corresponding to the eigenfunction $y$. This differential equation is often written out in full as

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)\,y + \lambda\,w(x)\,y = 0$$

with $x \in [a, b]$. In Sturm-Liouville problems, the functions $p(x)$, $q(x)$ and $w(x)$ are specified at the start and, crucially, the function $y$ is required to satisfy particular boundary conditions at $a$ and $b$. The boundary conditions are a key aspect of each Sturm-Liouville problem; for a given form of the differential equation, different boundary conditions can produce very different problems. Solving a Sturm-Liouville problem involves finding the values of $\lambda$ for which there exist non-trivial solutions of the defining differential equation above subject to the specified boundary conditions. The vibrating string problem in Courant & Hilbert (discussed above) is a simple example. We obtain the differential equation (13) in that problem by setting $p(x) = 1$, $q(x) = 0$ and $w(x) = 1$ in the defining Sturm-Liouville differential equation.

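We would now like to prove that the eigenvalues in a Sturm-Liouville problem will always be real and that the eigenfunctions will form an orthogonal set of functions, as claimed earlier. To do this, we need to consider a few more developments. In Sturm-Liouville theory we can apply $L$ to both real and complex functions, and a key role is played by the concept of the *inner product* of such functions. Using the notation $\bar{f}$ to denote the complex conjugate of the function $f$, we define the inner product of two functions $f$ and $g$ over the interval $[a, b]$ as

$$\langle f, g \rangle = \int_a^b \bar{f}(x)\,g(x)\,dx$$

and we define the *weighted* inner product as

$$\langle f, g \rangle_w = \int_a^b w(x)\,\bar{f}(x)\,g(x)\,dx$$

where $w(x)$ is the real-valued positive weight function mentioned earlier. A key result in the theory is Lagrange’s identity, which says that for any two complex-valued functions of a real variable $f$ and $g$ (and real-valued $p$ and $q$), we have

$$\bar{f}\,Lg - g\,\overline{Lf} = \frac{d}{dx}\left[p\left(\bar{f}\,g' - g\,\bar{f}'\right)\right]$$

This follows from the form of $L$, since

$$\bar{f}\,Lg - g\,\overline{Lf} = \bar{f}\,(p\,g')' + q\,\bar{f}\,g - g\,(p\,\bar{f}')' - q\,\bar{f}\,g = \bar{f}\,(p\,g')' - g\,(p\,\bar{f}')' = \frac{d}{dx}\left[p\left(\bar{f}\,g' - g\,\bar{f}'\right)\right]$$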
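Lagrange's identity can be spot-checked with polynomial arithmetic. In the sketch below I assume the convention that the conjugated function is the first of the pair, so that the identity reads conj(f)·L(g) − g·conj(L(f)) = d/dx[p(conj(f)·g' − g·conj(f)')]; the polynomials chosen for `p`, `q`, `f` and `g` are arbitrary illustrations (with `p` and `q` real), and the helper names `conj` and `L` are mine.

```python
import numpy as np
from numpy.polynomial import Polynomial

def conj(poly):
    """Complex conjugate of a polynomial in a real variable (hypothetical helper)."""
    return Polynomial(np.conj(poly.coef))

def L(y, p, q):
    """Sturm-Liouville operator applied to y: (p y')' + q y."""
    return (p * y.deriv()).deriv() + q * y

# Arbitrary illustrative choices: p and q real, f and g complex-valued.
p = Polynomial([1.0, 0.0, 1.0])              # 1 + x^2
q = Polynomial([0.0, 2.0])                   # 2x
f = Polynomial([1.0 + 2.0j, -1.0j, 3.0])
g = Polynomial([0.5, 1.0 - 1.0j, 0.0, 2.0j])

lhs = conj(f) * L(g, p, q) - g * conj(L(f, p, q))
rhs = (p * (conj(f) * g.deriv() - g * conj(f).deriv())).deriv()
residual = np.max(np.abs((lhs - rhs).coef))
print(residual)  # should be zero up to rounding
```

A check like this is no substitute for the short proof, but it is a quick way of catching sign or conjugation slips in the convention being used.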

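Using the inner product notation, we can write Lagrange’s identity in an alternative form that reveals the crucial role played by the boundary conditions in a Sturm-Liouville problem. Integrating both sides of the identity over $[a, b]$, we have

$$\langle f, Lg \rangle = \langle Lf, g \rangle + \left[p\left(\bar{f}\,g' - g\,\bar{f}'\right)\right]_a^b$$

For some boundary conditions the final term here is zero and then we will have

$$\langle f, Lg \rangle = \langle Lf, g \rangle$$

When this happens, the operator $L$ in conjunction with the boundary conditions is said to be *self-adjoint*. As an example, a so-called *regular* Sturm-Liouville problem involves solving the differential equation

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)\,y + \lambda\,w(x)\,y = 0$$

subject to what are called *separated* boundary conditions, taking the form

$$\alpha_1\,y(a) + \alpha_2\,y'(a) = 0$$

and

$$\beta_1\,y(b) + \beta_2\,y'(b) = 0$$

where the constants are real, with $\alpha_1, \alpha_2$ not both zero and $\beta_1, \beta_2$ not both zero. In this case, the operator $L$ is self-adjoint. To see this, suppose the functions $f$ and $g$ satisfy these boundary conditions. Then at $a$ we have

$$\alpha_1\,\bar{f}(a) + \alpha_2\,\bar{f}'(a) = 0$$

and

$$\alpha_1\,g(a) + \alpha_2\,g'(a) = 0$$

from which we can deduce (since this homogeneous linear system has a non-trivial solution $(\alpha_1, \alpha_2)$, its determinant must vanish) that

$$\bar{f}(a)\,g'(a) - g(a)\,\bar{f}'(a) = 0$$

Similarly, at the boundary point $b$ we find that

$$\bar{f}(b)\,g'(b) - g(b)\,\bar{f}'(b) = 0$$

These results then imply

$$\left[p\left(\bar{f}\,g' - g\,\bar{f}'\right)\right]_a^b = 0$$

so the operator $L$ is self-adjoint as claimed. As another example, a *singular* Sturm-Liouville problem involves solving the same differential equation as in the regular problem, but subject to the condition that $p(x)$ is zero at either $a$ or $b$ or both, while being positive for $a < x < b$. If $p(x)$ does not vanish at one of the boundary points, then $y$ is required to satisfy the same boundary condition at that point as in the regular problem. Provided the functions and their derivatives remain bounded at any endpoint where $p(x)$ vanishes, we will clearly have

$$\left[p\left(\bar{f}\,g' - g\,\bar{f}'\right)\right]_a^b = 0$$

in this case too, so the operator $L$ will also be self-adjoint in the case of a singular Sturm-Liouville problem. As a final example, suppose the Sturm-Liouville problem involves solving the same differential equation as before, but with *periodic* boundary conditions of the form

$$y(a) = y(b)$$

and

$$y'(a) = y'(b)$$

together with the requirement that $p(a) = p(b)$. Then if $f$ and $g$ are two functions satisfying these boundary conditions we will have

$$\left[p\left(\bar{f}\,g' - g\,\bar{f}'\right)\right]_a^b = p(b)\left(\bar{f}(b)\,g'(b) - g(b)\,\bar{f}'(b)\right) - p(a)\left(\bar{f}(a)\,g'(a) - g(a)\,\bar{f}'(a)\right) = 0$$

So again, the operator $L$ will be self-adjoint in the case of periodic boundary conditions. We will see later that the singular and periodic cases arise when attempting to solve Schrödinger’s equation for the hydrogen atom.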

The key reason for focusing so much on the self-adjoint property of the operator $L$ is that the eigenvalues of a self-adjoint operator are always real, and eigenfunctions belonging to distinct eigenvalues are orthogonal. Note that by orthogonality of the eigenfunctions in the more general context we mean that

$$\langle y_m, y_n \rangle_w = \int_a^b w(x)\,\bar{y}_m(x)\,y_n(x)\,dx = 0$$

whenever $y_m$ and $y_n$ are eigenfunctions corresponding to two distinct eigenvalues.

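To prove that the eigenvalues are always real, suppose that $y$ is an eigenfunction corresponding to an eigenvalue $\lambda$. Then we have

$$Ly = -\lambda\,w\,y$$

and so

$$\langle y, Ly \rangle = \langle y, -\lambda\,w\,y \rangle = -\lambda \int_a^b w(x)\,|y(x)|^2\,dx$$

But we also have

$$\langle Ly, y \rangle = \langle -\lambda\,w\,y, y \rangle = -\bar{\lambda} \int_a^b w(x)\,|y(x)|^2\,dx$$

Therefore if the operator is self-adjoint we can write

$$\langle y, Ly \rangle = \langle Ly, y \rangle \implies \lambda = \bar{\lambda}$$

since $\int_a^b w(x)\,|y(x)|^2\,dx > 0$, so the eigenvalues must be real. In particular, this must be the case for regular and singular Sturm-Liouville problems, and for Sturm-Liouville problems involving periodic boundary conditions.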

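To prove that the eigenfunctions are orthogonal, let $y_m$ and $y_n$ denote two eigenfunctions corresponding to distinct eigenvalues $\lambda_m$ and $\lambda_n$ respectively. Then we have

$$Ly_m = -\lambda_m\,w\,y_m, \qquad Ly_n = -\lambda_n\,w\,y_n$$

and so by the self-adjoint property (and the fact that the eigenvalues are real) we can write

$$0 = \langle y_m, Ly_n \rangle - \langle Ly_m, y_n \rangle = (\lambda_m - \lambda_n) \int_a^b w(x)\,\bar{y}_m(x)\,y_n(x)\,dx$$

Since the eigenvalues are distinct, the only way this can happen is if

$$\int_a^b w(x)\,\bar{y}_m(x)\,y_n(x)\,dx = 0$$

so the eigenfunctions must be orthogonal as claimed.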
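Both results (real eigenvalues, weighted orthogonality) can be observed numerically on a discretised problem. The sketch below is a rough illustration under assumptions of my own: the equation is taken in the form $(p y')' + q y + \lambda w y = 0$ with zero boundary values at both ends, the coefficient functions $p = 1 + x$, $q = -x$, $w = 1 + x^2$ on $[0, 1]$ are arbitrary choices, and `discretise` is a hypothetical helper using a standard finite-difference scheme. The matrix is deliberately left unsymmetrised so that the reality of the eigenvalues is not built in by the solver.

```python
import numpy as np

def discretise(p, q, w, a, b, n):
    """Finite-difference form of (p y')' + q y + lam w y = 0 with y(a) = y(b) = 0.

    Returns the interior grid, the spacing, a matrix A approximating
    -(p y')' - q y, and the weight values, so that A y = lam * W y.
    """
    h = (b - a) / (n + 1)
    x = a + h * np.arange(1, n + 1)                # interior grid points
    p_half = p(a + h * (np.arange(n + 1) + 0.5))   # p at the midpoints
    diag = (p_half[:-1] + p_half[1:]) / h**2 - q(x)
    off = -p_half[1:-1] / h**2
    A = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return x, h, A, w(x)

x, h, A, weights = discretise(lambda t: 1.0 + t, lambda t: -t,
                              lambda t: 1.0 + t**2, 0.0, 1.0, 200)

# Solve W^{-1} A y = lam y with a general (non-symmetric) eigenvalue solver.
lam, Y = np.linalg.eig(A / weights[:, None])
order = np.argsort(lam.real)
lam, Y = lam[order], Y[:, order]
max_imag = np.max(np.abs(lam.imag))

# Weighted inner products of the first five eigenvectors, normalised to unit length.
Y5 = Y[:, :5].real
gram = h * (Y5.T * weights) @ Y5
gram /= np.sqrt(np.outer(np.diag(gram), np.diag(gram)))
max_off_diag = np.max(np.abs(gram - np.eye(5)))

print(lam[:5].real, max_imag, max_off_diag)
```

The imaginary parts and the off-diagonal weighted inner products should both vanish up to rounding, mirroring the two proofs above in a finite-dimensional setting.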

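In addition to being orthogonal, the eigenfunctions $y_n(x)$, $n = 1, 2, 3, \ldots$, of a Sturm-Liouville problem with specified boundary conditions also form a *complete* set of functions (I will not prove this here), which means that any sufficiently well-behaved function $f(x)$ for which $\int_a^b w(x)\,|f(x)|^2\,dx$ exists can be represented by a Fourier series of the form

$$f(x) = \sum_{n=1}^{\infty} c_n\,y_n(x)$$

for $x \in [a, b]$, where the coefficients $c_n$ are given by the formula

$$c_n = \frac{\langle y_n, f \rangle_w}{\langle y_n, y_n \rangle_w} = \frac{\int_a^b w(x)\,\bar{y}_n(x)\,f(x)\,dx}{\int_a^b w(x)\,|y_n(x)|^2\,dx}$$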

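It is the completeness and orthogonality of the eigenfunctions that makes Sturm-Liouville theory so useful in solving linear differential equations, because (for example) it means that the solutions of many second-order *inhomogeneous* linear differential equations of the form

$$Ly(x) = F(x)$$

with suitable boundary conditions can be expressed as a linear combination of the eigenfunctions of the corresponding Sturm-Liouville problem

$$Ly(x) = -\lambda\,w(x)\,y(x)$$

with the same boundary conditions. To illustrate this, suppose this Sturm-Liouville problem with boundary conditions has an infinite set of eigenvalues $\lambda_n$ (none of them zero) and corresponding eigenfunctions $y_n(x)$, $n = 1, 2, 3, \ldots$, which are orthogonal and form a complete set. We will assume that the solution of the inhomogeneous differential equation above is an infinite series of the form

$$y(x) = \sum_{n=1}^{\infty} c_n\,y_n(x)$$

where the coefficients $c_n$ are constants, and we will find these coefficients using the orthogonality of the eigenfunctions. Since for each $n$ it is true that

$$Ly_n = -\lambda_n\,w\,y_n$$

we can write

$$Ly = \sum_{n=1}^{\infty} c_n\,Ly_n = -\sum_{n=1}^{\infty} c_n\,\lambda_n\,w(x)\,y_n(x)$$

Thus, in the inhomogeneous equation

$$Ly(x) = F(x)$$

we can put

$$F(x) = -\sum_{n=1}^{\infty} c_n\,\lambda_n\,w(x)\,y_n(x)$$

To find the $m$th coefficient $c_m$ we can multiply both sides by $\bar{y}_m(x)$ and integrate. By orthogonality, all the terms in the sum on the right will vanish except the one involving $c_m$. We will get

$$\int_a^b \bar{y}_m(x)\,F(x)\,dx = -c_m\,\lambda_m \int_a^b w(x)\,|y_m(x)|^2\,dx \implies c_m = -\frac{\int_a^b \bar{y}_m(x)\,F(x)\,dx}{\lambda_m \int_a^b w(x)\,|y_m(x)|^2\,dx}$$

Having found a formula for the coefficients $c_n$, we can now write the solution of the original inhomogeneous differential equation as

$$y(x) = \sum_{n=1}^{\infty} c_n\,y_n(x)$$

where

$$c_n = -\frac{\int_a^b \bar{y}_n(x)\,F(x)\,dx}{\lambda_n \int_a^b w(x)\,|y_n(x)|^2\,dx}$$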
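As a small worked illustration of this eigenfunction-expansion method, the sketch below applies it to a toy problem of my own choosing: $y'' = x$ on $[0, \pi]$ with $y(0) = y(\pi) = 0$, so that the operator is just the second derivative, the weight is $1$, and the eigenpairs are those of the vibrating string ($\lambda_n = n^2$, $y_n = \sin nx$). Each coefficient is taken to be minus the integral of $y_n F$ divided by $\lambda_n$ times the integral of $y_n^2$. The helper `trapezoid` and the truncation at 60 terms are illustrative assumptions.

```python
import numpy as np

def trapezoid(values, x):
    """Composite trapezoidal rule on a uniform grid (hypothetical helper)."""
    h = x[1] - x[0]
    return h * (np.sum(values) - 0.5 * (values[0] + values[-1]))

x = np.linspace(0.0, np.pi, 4001)
F = x                 # illustrative right-hand side F(x) = x
num_terms = 60        # truncation of the infinite series

y_series = np.zeros_like(x)
for n in range(1, num_terms + 1):
    y_n = np.sin(n * x)          # eigenfunction of y'' + lam * y = 0, y(0) = y(pi) = 0
    lam_n = n**2                 # corresponding eigenvalue
    c_n = -trapezoid(y_n * F, x) / (lam_n * trapezoid(y_n**2, x))
    y_series += c_n * y_n

# Closed-form solution of y'' = x with y(0) = y(pi) = 0, for comparison.
y_exact = x * (x**2 - np.pi**2) / 6.0
max_error = np.max(np.abs(y_series - y_exact))
print(max_error)
```

The truncated series agrees with the closed-form solution to within the size of the neglected tail, which decays like $1/n^3$ for this right-hand side.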

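To conclude this note, I want to go back to a previous note in which I explored in detail the solution of Schrödinger’s equation for the hydrogen atom by the method of separation of variables. This approach reduced Schrödinger’s partial differential equation into a set of three uncoupled ordinary differential equations which we can now see are in fact Sturm-Liouville problems. As discussed in my previous note, Schrödinger’s three-dimensional equation for the hydrogen atom can be written in spherical polar coordinates as

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial \psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\frac{\partial \psi}{\partial \theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 \psi}{\partial \phi^2} + \frac{2m_e}{\hbar^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi = 0$$

and after solving this by the usual separation of variables approach starting from the assumption that the function $\psi$ can be expressed as a product

$$\psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi)$$

we end up with an equation for $R$ (the radial equation) of the form

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[\frac{2m_e}{\hbar^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right) - \frac{l(l+1)}{r^2}\right]R = 0$$

and equations for $\Phi$ and $\Theta$ of the forms

$$\frac{d^2\Phi}{d\phi^2} + m_l^2\,\Phi = 0$$

and

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left[l(l+1) - \frac{m_l^2}{\sin^2\theta}\right]\Theta = 0$$

respectively. Taking each of these in turn, we first observe that the radial equation (after multiplying through by $r^2$) is of the Sturm-Liouville form with $p(r) = r^2$, weight function $w(r) = r^2$, and eigenvalues corresponding to the energy term $E$ in the equation. The variable $r$ can range between $0$ and $\infty$ and the boundary conditions are formulated in such a way that the solutions of the radial equation remain bounded as $r \to 0$ and go to zero as $r \to \infty$. Furthermore, since $p(0) = 0$, the radial equation is a singular Sturm-Liouville problem.

Next, we observe that the equation for $\Phi$ is essentially the same as equation (13) for the vibrating string in the extract from Courant & Hilbert discussed at the start of this note, with $m_l^2$ playing the role of the eigenvalue. The azimuth angle $\phi$ can take any value in $(-\infty, \infty)$ but the function $\Phi$ must take a single value at each point in space (since this is a required property of the quantum wave function of which $\Phi$ is a constituent). It follows that the function $\Phi$ must be periodic since it must take the same value at $\phi$ and $\phi + 2\pi$ for any given $\phi$. This condition implies the conditions $\Phi(0) = \Phi(2\pi)$ and $\Phi'(0) = \Phi'(2\pi)$. Furthermore, we have $p(\phi) = 1$ for all $\phi$. Thus, the equation for $\Phi$ is a Sturm-Liouville problem with periodic boundary conditions.

Finally, as discussed in my previous note, the $\Theta$ equation can be rewritten as

$$\frac{d}{dx}\left((1 - x^2)\frac{d\Theta}{dx}\right) + \left[l(l+1) - \frac{m_l^2}{1 - x^2}\right]\Theta = 0$$

where $x = \cos\theta$ and thus $-1 \leq x \leq 1$. This is a Sturm-Liouville problem with $p(x) = 1 - x^2$ and the boundary conditions are given by the requirement that $\Theta$ should remain bounded for all $x \in [-1, 1]$. Since $p(x) = 0$ at both ends of the interval $[-1, 1]$, this equation can be classified as a singular Sturm-Liouville problem. The eigenvalue is $\lambda = l(l+1)$ in this equation.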
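For the special case of zero magnetic quantum number, the bounded solutions of the polar-angle equation on $[-1, 1]$ are the Legendre polynomials, and this can be checked directly with NumPy's polynomial classes. The sketch below is an illustration under that assumption only: it verifies that $\frac{d}{dx}\left((1 - x^2) P_l'\right) + l(l+1) P_l = 0$ for the first few $l$, and that distinct Legendre polynomials are orthogonal on $[-1, 1]$ with unit weight. The helper names `legendre_poly`, `residual` and `inner` are my own.

```python
import numpy as np
from numpy.polynomial import Polynomial, Legendre

one_minus_x2 = Polynomial([1.0, 0.0, -1.0])   # p(x) = 1 - x^2

def legendre_poly(l):
    """Legendre polynomial P_l expressed in the ordinary power basis."""
    return Legendre.basis(l).convert(kind=Polynomial)

def residual(l):
    """Left-hand side of ((1 - x^2) P')' + l(l+1) P = 0 evaluated for P = P_l."""
    P = legendre_poly(l)
    return (one_minus_x2 * P.deriv()).deriv() + l * (l + 1) * P

max_residual = max(np.max(np.abs(residual(l).coef)) for l in range(6))

def inner(l, m):
    """Integral of P_l * P_m over [-1, 1], computed from the exact antiderivative."""
    antiderivative = (legendre_poly(l) * legendre_poly(m)).integ()
    return antiderivative(1.0) - antiderivative(-1.0)

gram = np.array([[inner(l, m) for m in range(4)] for l in range(4)])
print(max_residual)
print(np.round(gram, 12))
```

The matrix of inner products comes out diagonal, with entries $2/(2l+1)$, which is the standard normalisation of the Legendre polynomials.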