Sturm-Liouville theory was developed in the 19th century in the context of solving differential equations. When one studies it in depth, however, one suddenly realises that this is the mathematics underlying a lot of quantum mechanics. In quantum mechanics we envisage a quantum state (a time-dependent function) expressed as a superposition of eigenfunctions of a self-adjoint operator (usually referred to as a Hermitian operator) representing an observable. The coefficients of the eigenfunctions in this superposition are probability amplitudes. A measurement of the observable quantity represented by the Hermitian operator produces one of the eigenvalues of the operator with a probability equal to the squared modulus of the probability amplitude attached to the eigenfunction corresponding to that eigenvalue in the superposition. It is the fact that the operator is self-adjoint that ensures the eigenvalues are real (and thus observable), and furthermore, that the eigenfunctions corresponding to the eigenvalues form a complete and orthogonal set of functions, enabling quantum states to be represented as a superposition in the first place (i.e., an eigenfunction expansion akin to a Fourier series). The Sturm-Liouville theory of the 19th century has essentially this same structure, and in fact Sturm-Liouville eigenvalue problems are important more generally in mathematical physics precisely because they frequently arise in attempting to solve commonly encountered partial differential equations (e.g., Poisson’s equation, the diffusion equation, the wave equation, etc.), particularly when the method of separation of variables is employed.
I want to get an overview of Sturm-Liouville theory in the present note and will begin by considering a nice discussion of a vibrating string problem in Courant & Hilbert’s classic text, Methods of Mathematical Physics (Volume I). Although the problem is simple and the treatment in Courant & Hilbert a bit terse, it (remarkably) brings up a lot of the key features of Sturm-Liouville theory which apply more generally in a wide variety of physics problems. I will then consider Sturm-Liouville theory in a more general setting emphasising the role of the Sturm-Liouville differential operator, and finally I will illustrate further the occurrence of Sturm-Liouville systems in physics by looking at the eigenvalue problems encountered when solving Schrödinger’s equation for the hydrogen atom.
On page 287 of Volume I, Courant & Hilbert write the following:
Equation (12) here is the one-dimensional wave equation
$$u_{xx} = \mu^2 u_{tt}$$
which (as usual) the authors are going to solve by using a separation of variables of the form
$$u(x, t) = v(x) g(t)$$
As Courant & Hilbert explain, the problem then involves finding the function $v(x)$ by solving the second-order homogeneous linear differential equation
$$v'' + \lambda v = 0 \qquad (13)$$
subject to the boundary conditions
$$v(0) = v(\pi) = 0 \qquad (13a)$$
Although not explicitly mentioned by Courant & Hilbert at this stage, equations (13) and (13a) in fact constitute a full-blown Sturm-Liouville eigenvalue problem. Despite being very simple, this setup captures many of the typical features encountered in a wide variety of such problems in physics. It is instructive to explore the text underneath equation (13a):
Not all these requirements can be fulfilled for arbitrary values of the constant $\lambda$.
… the boundary conditions can be fulfilled if and only if $\lambda = n^2$ is the square of an integer $n$.
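To clarify this, we can try to solve (13) and (13a) for the three possible cases: $\lambda < 0$, $\lambda = 0$ and $\lambda > 0$. Suppose first that $\lambda < 0$. Then $-\lambda > 0$ and the auxiliary equation for (13) is
$$r^2 = -\lambda \implies r = \pm\sqrt{-\lambda}$$
Thus, we can write the general solution of (13) in this case as
$$v(x) = A\cosh\left(\sqrt{-\lambda}\,x\right) + B\sinh\left(\sqrt{-\lambda}\,x\right)$$
where $A$ and $B$ are constants to be determined from the boundary conditions. From the boundary condition $v(0) = 0$ we conclude that $A = 0$ so the equation reduces to
$$v(x) = B\sinh\left(\sqrt{-\lambda}\,x\right)$$
But from the boundary condition $v(\pi) = 0$ we are forced to conclude that $B = 0$ since $\sinh\left(\sqrt{-\lambda}\,\pi\right) \neq 0$. Therefore there is only the trivial solution $v(x) = 0$ in the case $\lambda < 0$.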
Next, suppose that $\lambda = 0$. Then equation (13) reduces to
$$v'' = 0 \implies v(x) = A + Bx$$
From the boundary condition $v(0) = 0$ we must conclude that $A = 0$, and the boundary condition $v(\pi) = 0$ means we are also forced to conclude that $B = 0$. Thus, again, there is only the trivial solution $v(x) = 0$ in the case $\lambda = 0$.
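We see that nontrivial solutions can only be obtained when $\lambda > 0$. In this case we have $-\lambda < 0$ and the auxiliary equation is
$$r^2 = -\lambda \implies r = \pm i\sqrt{\lambda}$$
Thus, we can write the general solution of (13) in this case as
$$v(x) = A\cos\left(\sqrt{\lambda}\,x\right) + B\sin\left(\sqrt{\lambda}\,x\right)$$
where $A$ and $B$ are again to be determined from the boundary conditions. From the boundary condition $v(0) = 0$ we conclude that $A = 0$ so the equation reduces to
$$v(x) = B\sin\left(\sqrt{\lambda}\,x\right)$$
But from the boundary condition $v(\pi) = 0$ we must conclude that, if $B \neq 0$, then we must have $\sqrt{\lambda} = n$ where $n = 1, 2, 3, \ldots$. Thus, we find that for each $n = 1, 2, 3, \ldots$, the eigenvalues of this Sturm-Liouville problem are $\lambda_n = n^2$, and the corresponding eigenfunctions are $v_n(x) = B\sin(nx)$. The coefficient $B$ is undetermined and must be specified through some normalisation process, for example by setting the integral of $v_n^2$ between $0$ and $\pi$ equal to $1$ and then finding the value of $B$ that is consistent with this. In Courant & Hilbert they have (implicitly) simply set $B = 1$.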
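The case analysis above can also be checked numerically. The following is only a minimal sketch and not part of Courant & Hilbert's treatment: it assumes the problem is $v'' + \lambda v = 0$ on $[0, \pi]$ with $v(0) = v(\pi) = 0$, integrates from the left end with the arbitrary normalisation $v'(0) = 1$ (a shooting method, swapped in here for the analytic argument), and bisects on the value of $v$ at $x = \pi$. The names `shoot` and `find_eigenvalue` are illustrative.

```python
import math

def shoot(lam, steps=2000):
    """Integrate v'' + lam*v = 0 from x = 0 with v(0) = 0, v'(0) = 1; return v(pi)."""
    h = math.pi / steps
    v, dv = 0.0, 1.0
    for _ in range(steps):
        # Classical RK4 step for the first-order system v' = dv, dv' = -lam*v
        k1v, k1d = dv, -lam * v
        k2v, k2d = dv + 0.5 * h * k1d, -lam * (v + 0.5 * h * k1v)
        k3v, k3d = dv + 0.5 * h * k2d, -lam * (v + 0.5 * h * k2v)
        k4v, k4d = dv + h * k3d, -lam * (v + h * k3v)
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        dv += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6
    return v

def find_eigenvalue(lo, hi, tol=1e-9):
    """Bisect on lam in [lo, hi] until v(pi; lam) = 0 (assumes a sign change)."""
    f_lo = shoot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Search in a bracket around each candidate n^2 for n = 1, 2, 3
eigenvalues = [find_eigenvalue(n * n - 0.5, n * n + 0.5) for n in (1, 2, 3)]
print(eigenvalues)      # values close to 1, 4 and 9
print(shoot(-4.0))      # positive: no non-trivial solution for this negative lam
```

For negative values of the constant the shot solution grows like a hyperbolic sine and never returns to zero at the right-hand end, which mirrors the conclusion that only the trivial solution exists in that case.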
Some features of this solution are typical of Sturm-Liouville eigenvalue problems in physics more generally. For example, the eigenvalues are real (rather than complex) numbers, there is a minimum eigenvalue ($\lambda_1 = 1$) but not a maximum one, and for each eigenvalue there is a unique eigenfunction (up to a multiplicative constant). Also, importantly, the eigenfunctions here form a complete and orthogonal set of functions. Orthogonality refers to the fact that the integral of a product of any two distinct eigenfunctions over the interval $[0, \pi]$ is zero, i.e.,
$$\int_0^{\pi} \sin(nx)\sin(mx)\,dx = 0$$
for $m \neq n$, as can easily be demonstrated in the same way as in the theory of Fourier series. Completeness refers to the fact that over the interval $[0, \pi]$ the infinite set of functions $\sin(nx)$, $n = 1, 2, 3, \ldots$, can be used to represent any sufficiently well-behaved function $f(x)$ using a Fourier series of the form
$$f(x) = \sum_{n=1}^{\infty} a_n \sin(nx)$$
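As a rough numerical illustration of these two properties, the sketch below assumes the eigenfunctions are $\sin(nx)$ on $[0, \pi]$ and uses composite Simpson integration. The test function $f(x) = x(\pi - x)$ and the helper names `simpson`, `inner` and `sine_series` are choices made here for illustration only.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner(n, m):
    """Integral of sin(nx) sin(mx) over [0, pi]."""
    return simpson(lambda x: math.sin(n * x) * math.sin(m * x), 0.0, math.pi)

def sine_series(f, x, terms=25):
    """Partial sum of the sine series of f at x, with a_n = <sin(nx), f> / <sin(nx), sin(nx)>."""
    total = 0.0
    for n in range(1, terms + 1):
        a_n = simpson(lambda s: math.sin(n * s) * f(s), 0.0, math.pi) / inner(n, n)
        total += a_n * math.sin(n * x)
    return total

f = lambda x: x * (math.pi - x)          # illustrative function vanishing at both ends
orth = inner(1, 2)                       # close to 0 for distinct eigenfunctions
norm = inner(3, 3)                       # close to pi/2
approx = sine_series(f, math.pi / 2)     # close to f(pi/2) = pi^2/4
print(orth, norm, approx, f(math.pi / 2))
```

The coefficients are obtained purely from orthogonality, by projecting $f$ onto each eigenfunction in turn, which is the same mechanism used for general Sturm-Liouville eigenfunction expansions.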
All of this is alluded to (without explicit explanation at this stage) in the subsequent part of this section of Courant & Hilbert’s text, where they go on to provide the general solution of the vibrating string problem. They write the following:
The properties of completeness and orthogonality of the eigenfunctions are again a typical feature of the solutions of Sturm-Liouville eigenvalue problems more generally, and this is one of the main reasons why Sturm-Liouville theory is so important to the solution of physical problems involving differential equations. To get a better understanding of this, I will now develop Sturm-Liouville theory in a more general setting by starting with a standard second-order homogeneous linear differential equation of the form
$$\alpha(x)\frac{d^2 y}{dx^2} + \beta(x)\frac{dy}{dx} + \gamma(x) y = 0$$
where the variable $x$ is confined to an interval $a \leq x \leq b$.
Dividing the differential equation by $\alpha(x)$ and multiplying through by $p(x) = \exp\left(\int \frac{\beta(x)}{\alpha(x)}\,dx\right)$ we get
$$p(x)\frac{d^2 y}{dx^2} + p'(x)\frac{dy}{dx} + q(x) y = \frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x) y = L y = 0$$
where $q(x) = p(x)\gamma(x)/\alpha(x)$ and
$$L = \frac{d}{dx}\left(p(x)\frac{d}{dx}\right) + q(x)$$
$L$ is called the Sturm-Liouville differential operator. Thus, we see already that a wide variety of second-order differential equations encountered in physics can be put into a form involving the operator $L$, so results concerning the properties of $L$ will have wide applicability.
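The reduction to Sturm-Liouville form can be sketched numerically. The snippet assumes the general equation is written as $\alpha(x) y'' + \beta(x) y' + \gamma(x) y = 0$ (these coefficient names are hypothetical labels used only here) and computes the integrating factor $p(x) = \exp\left(\int_{x_0}^{x} \beta/\alpha \, dt\right)$ by Simpson's rule. The two test equations, Hermite's and Legendre's, are standard examples chosen for illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrating_factor(alpha, beta, x, x0=0.0):
    """p(x) = exp( integral from x0 to x of beta(t)/alpha(t) dt ), normalised so p(x0) = 1."""
    return math.exp(simpson(lambda t: beta(t) / alpha(t), x0, x))

# Hermite's equation y'' - 2x y' + 2n y = 0: alpha = 1, beta = -2x, so p(x) = exp(-x^2)
hermite_p = integrating_factor(lambda t: 1.0, lambda t: -2.0 * t, 0.7)

# Legendre's equation (1 - x^2) y'' - 2x y' + l(l+1) y = 0: p(x) = 1 - x^2
legendre_p = integrating_factor(lambda t: 1.0 - t * t, lambda t: -2.0 * t, 0.5)

print(hermite_p, math.exp(-0.49))
print(legendre_p, 0.75)
```

The lower limit `x0` only fixes an overall constant in $p(x)$, which does not affect the resulting equation.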
Using the Sturm-Liouville operator we can now write the defining differential equation of Sturm-Liouville theory in an eigenvalue-eigenfunction format that is very reminiscent of the setup in quantum mechanics outlined at the start of this note. The defining differential equation is
$$L\phi = -\lambda w(x)\phi$$
where $w(x)$ is a real-valued positive weight function and $\lambda$ is an eigenvalue corresponding to the eigenfunction $\phi$. This differential equation is often written out in full as
$$\frac{d}{dx}\left(p(x)\frac{d\phi}{dx}\right) + q(x)\phi + \lambda w(x)\phi = 0$$
with $x \in [a, b]$. In Sturm-Liouville problems, the functions $p(x)$, $q(x)$ and $w(x)$ are specified at the start and, crucially, the function $\phi$ is required to satisfy particular boundary conditions at $a$ and $b$. The boundary conditions are a key aspect of each Sturm-Liouville problem; for a given form of the differential equation, different boundary conditions can produce very different problems. Solving a Sturm-Liouville problem involves finding the values of $\lambda$ for which there exist non-trivial solutions of the defining differential equation above subject to the specified boundary conditions. The vibrating string problem in Courant & Hilbert (discussed above) is a simple example. We obtain the differential equation (13) in that problem by setting $p(x) = 1$, $q(x) = 0$ and $w(x) = 1$ in the defining Sturm-Liouville differential equation.
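We would now like to prove that the eigenvalues in a Sturm-Liouville problem will always be real and that the eigenfunctions will form an orthogonal set of functions, as claimed earlier. To do this, we need to consider a few more developments. In Sturm-Liouville theory we can apply $L$ to both real and complex functions, and a key role is played by the concept of the inner product of such functions. Using the notation $\bar{f}$ to denote the complex conjugate of the function $f$, we define the inner product of two functions $f$ and $g$ over the interval $[a, b]$ as
$$\langle f, g \rangle = \int_a^b \bar{f}(x) g(x)\,dx$$
and we define the weighted inner product as
$$\langle f, g \rangle_w = \int_a^b w(x)\bar{f}(x) g(x)\,dx$$
where $w(x)$ is the real-valued positive weight function mentioned earlier. A key result in the theory is Lagrange’s identity, which says that for any two complex-valued functions of a real variable $u(x)$ and $v(x)$, we have
$$\bar{u} L v - v\,\overline{L u} = \frac{d}{dx}\left[p(x)\left(\bar{u}\frac{dv}{dx} - v\frac{d\bar{u}}{dx}\right)\right]$$
This follows from the form of $L$, since (with $p$ and $q$ real)
$$\bar{u} L v - v\,\overline{L u} = \bar{u}\frac{d}{dx}\left(p\frac{dv}{dx}\right) + q\bar{u}v - v\frac{d}{dx}\left(p\frac{d\bar{u}}{dx}\right) - q v\bar{u} = \frac{d}{dx}\left[p\left(\bar{u}\frac{dv}{dx} - v\frac{d\bar{u}}{dx}\right)\right]$$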
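Using the inner product notation, we can write Lagrange’s identity in an alternative form that reveals the crucial role played by the boundary conditions in a Sturm-Liouville problem. We have
$$\langle u, L v \rangle = \langle L u, v \rangle + \left[p(x)\left(\bar{u}\frac{dv}{dx} - v\frac{d\bar{u}}{dx}\right)\right]_a^b$$
For some boundary conditions the final term here is zero and then we will have
$$\langle u, L v \rangle = \langle L u, v \rangle$$
When this happens, the operator $L$ in conjunction with the boundary conditions is said to be self-adjoint. As an example, a so-called regular Sturm-Liouville problem involves solving the differential equation
$$\frac{d}{dx}\left(p(x)\frac{d\phi}{dx}\right) + q(x)\phi + \lambda w(x)\phi = 0$$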
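subject to what are called separated boundary conditions, taking the form
$$A_1\phi(a) + A_2\phi'(a) = 0, \qquad B_1\phi(b) + B_2\phi'(b) = 0$$
where $A_1, A_2$ are real constants that are not both zero, and similarly for $B_1, B_2$.
In this case, the operator $L$ is self-adjoint. To see this, suppose the functions $u$ and $v$ satisfy these boundary conditions. Then at $a$ we have
$$A_1\bar{u}(a) + A_2\bar{u}'(a) = 0, \qquad A_1 v(a) + A_2 v'(a) = 0$$
from which we can deduce (since $A_1$ and $A_2$ are not both zero, the determinant of this homogeneous linear system must vanish) that
$$\bar{u}(a)v'(a) - v(a)\bar{u}'(a) = 0$$
Similarly, at the boundary point $b$ we find that
$$\bar{u}(b)v'(b) - v(b)\bar{u}'(b) = 0$$
These results then imply
$$\left[p(x)\left(\bar{u}\frac{dv}{dx} - v\frac{d\bar{u}}{dx}\right)\right]_a^b = 0$$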
so the operator $L$ is self-adjoint as claimed. As another example, a singular Sturm-Liouville problem involves solving the same differential equation as in the regular problem, but subject to the condition that $p(x)$ is zero at either $a$ or $b$ or both, while being positive for $a < x < b$. If $p(x)$ does not vanish at one of the boundary points, then $\phi$ is required to satisfy the same boundary condition at that point as in the regular problem. Provided the functions and their derivatives remain bounded at any endpoint where $p(x)$ vanishes, we will have
$$\left[p(x)\left(\bar{u}\frac{dv}{dx} - v\frac{d\bar{u}}{dx}\right)\right]_a^b = 0$$
in this case too, so the operator $L$ will also be self-adjoint in the case of a singular Sturm-Liouville problem. As a final example, suppose the Sturm-Liouville problem involves solving the same differential equation as before, but with periodic boundary conditions of the form
$$\phi(a) = \phi(b), \qquad \phi'(a) = \phi'(b), \qquad p(a) = p(b)$$
Then if $u$ and $v$ are two functions satisfying these boundary conditions we will have
$$\left[p\left(\bar{u}v' - v\bar{u}'\right)\right]_a^b = p(b)\left(\bar{u}(b)v'(b) - v(b)\bar{u}'(b)\right) - p(a)\left(\bar{u}(a)v'(a) - v(a)\bar{u}'(a)\right) = 0$$
So again, the operator $L$ will be self-adjoint in the case of periodic boundary conditions. We will see later that the singular and periodic cases arise when attempting to solve Schrödinger’s equation for the hydrogen atom.
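The dependence of self-adjointness on the boundary conditions can be illustrated with a small numerical experiment. This is only a sketch under stated assumptions: the operator is taken to be $Lf = (p f')' + q f$ with the illustrative choices $p(x) = 1 + x$ and $q(x) = x$ on $[0, 1]$, all functions are real (so no conjugation is needed), and the test functions `u`, `v`, `g` are picked by hand with their derivatives supplied analytically.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def apply_L(f, df, d2f):
    """Return L f = p f'' + p' f' + q f for the assumed p(x) = 1 + x, q(x) = x."""
    return lambda x: (1.0 + x) * d2f(x) + 1.0 * df(x) + x * f(x)

# u and v vanish at both ends (separated conditions with A_2 = B_2 = 0)
u, du, d2u = (lambda x: x * (1 - x)), (lambda x: 1 - 2 * x), (lambda x: -2.0)
v = lambda x: math.sin(math.pi * x)
dv = lambda x: math.pi * math.cos(math.pi * x)
d2v = lambda x: -math.pi ** 2 * math.sin(math.pi * x)
# g does NOT satisfy the condition at x = 1
g, dg, d2g = (lambda x: x), (lambda x: 1.0), (lambda x: 0.0)

Lu, Lv, Lg = apply_L(u, du, d2u), apply_L(v, dv, d2v), apply_L(g, dg, d2g)

def gap(f, Lf, h, Lh):
    """<f, L h> - <L f, h> for real functions on [0, 1]."""
    return simpson(lambda x: f(x) * Lh(x) - Lf(x) * h(x), 0.0, 1.0)

gap_dirichlet = gap(u, Lu, v, Lv)   # boundary term vanishes, so close to 0
gap_violating = gap(g, Lg, v, Lv)   # equals the boundary term p(1) g(1) v'(1) = -2*pi
print(gap_dirichlet, gap_violating)
```

When one of the functions fails the boundary condition, the difference between the two inner products is exactly the boundary term in Lagrange's identity, here $-2\pi$.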
The key reason for focusing so much on the self-adjoint property of the operator $L$ is that the eigenvalues of a self-adjoint operator are always real, and the eigenfunctions are orthogonal. Note that by orthogonality of the eigenfunctions in the more general context we mean that
$$\langle \phi_m, \phi_n \rangle_w = \int_a^b w(x)\bar{\phi}_m(x)\phi_n(x)\,dx = 0$$
whenever $\phi_m$ and $\phi_n$ are eigenfunctions corresponding to two distinct eigenvalues.
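To prove that the eigenvalues are always real, suppose that $\phi$ is an eigenfunction corresponding to an eigenvalue $\lambda$. Then we have
$$\langle \phi, L\phi \rangle = \langle \phi, -\lambda w\phi \rangle = -\lambda\langle \phi, \phi \rangle_w$$
But we also have
$$\langle L\phi, \phi \rangle = \langle -\lambda w\phi, \phi \rangle = -\bar{\lambda}\langle \phi, \phi \rangle_w$$
Therefore if the operator is self-adjoint we can write
$$\langle \phi, L\phi \rangle - \langle L\phi, \phi \rangle = (\bar{\lambda} - \lambda)\langle \phi, \phi \rangle_w = 0 \implies \bar{\lambda} = \lambda$$
since $\langle \phi, \phi \rangle_w > 0$, so the eigenvalues must be real. In particular, this must be the case for regular and singular Sturm-Liouville problems, and for Sturm-Liouville problems involving periodic boundary conditions.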
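To prove that the eigenfunctions are orthogonal, let $\phi_m$ and $\phi_n$ denote two eigenfunctions corresponding to distinct eigenvalues $\lambda_m$ and $\lambda_n$ respectively. Then we have (using the fact that the eigenvalues are real)
$$\langle \phi_m, L\phi_n \rangle = -\lambda_n\langle \phi_m, \phi_n \rangle_w, \qquad \langle L\phi_m, \phi_n \rangle = -\lambda_m\langle \phi_m, \phi_n \rangle_w$$
and so by the self-adjoint property we can write
$$(\lambda_m - \lambda_n)\langle \phi_m, \phi_n \rangle_w = 0$$
Since the eigenvalues are distinct, the only way this can happen is if
$$\langle \phi_m, \phi_n \rangle_w = 0$$
so the eigenfunctions must be orthogonal as claimed.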
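In addition to being orthogonal, the eigenfunctions $\phi_n$, $n = 1, 2, 3, \ldots$, of a Sturm-Liouville problem with specified boundary conditions also form a complete set of functions (I will not prove this here), which means that any sufficiently well-behaved function $f(x)$ for which $\int_a^b w(x)|f(x)|^2\,dx$ exists can be represented by a Fourier series of the form
$$f(x) = \sum_{n=1}^{\infty} a_n\phi_n(x)$$
for $x \in [a, b]$, where the coefficients $a_n$ are given by the formula
$$a_n = \frac{\langle \phi_n, f \rangle_w}{\langle \phi_n, \phi_n \rangle_w} = \frac{\int_a^b w(x)\bar{\phi}_n(x) f(x)\,dx}{\int_a^b w(x)|\phi_n(x)|^2\,dx}$$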
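It is the completeness and orthogonality of the eigenfunctions that makes Sturm-Liouville theory so useful in solving linear differential equations, because (for example) it means that the solutions of many second-order inhomogeneous linear differential equations of the form
$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x) y = F(x)$$
with suitable boundary conditions can be expressed as a linear combination of the eigenfunctions of the corresponding Sturm-Liouville problem
$$\frac{d}{dx}\left(p(x)\frac{d\phi}{dx}\right) + q(x)\phi + \lambda w(x)\phi = 0$$
with the same boundary conditions. To illustrate this, suppose this Sturm-Liouville problem with boundary conditions has an infinite set of eigenvalues $\lambda_n$ and corresponding eigenfunctions $\phi_n$, $n = 1, 2, 3, \ldots$, which are orthogonal and form a complete set. We will assume that the solution of the inhomogeneous differential equation above is an infinite series of the form
$$y(x) = \sum_{n=1}^{\infty} a_n\phi_n(x)$$
where the coefficients $a_n$ are constants, and we will find these coefficients using the orthogonality of the eigenfunctions. Since for each $n$ it is true that
$$L\phi_n = -\lambda_n w(x)\phi_n$$
we can write
$$L y = \sum_{n=1}^{\infty} a_n L\phi_n = -\sum_{n=1}^{\infty} a_n\lambda_n w(x)\phi_n$$
Thus, in the inhomogeneous equation
$$L y = F(x)$$
we can put
$$F(x) = -\sum_{n=1}^{\infty} a_n\lambda_n w(x)\phi_n(x)$$
To find the $m$th coefficient $a_m$ we can multiply both sides by $\bar{\phi}_m$ and integrate. By orthogonality, all the terms in the sum on the right will vanish except the one involving $a_m$. We will get (assuming no eigenvalue is zero)
$$\int_a^b \bar{\phi}_m F\,dx = -a_m\lambda_m\int_a^b w|\phi_m|^2\,dx \implies a_m = -\frac{\int_a^b \bar{\phi}_m F\,dx}{\lambda_m\int_a^b w|\phi_m|^2\,dx}$$
Having found a formula for the coefficients $a_m$, we can now write the solution of the original inhomogeneous differential equation as
$$y(x) = -\sum_{n=1}^{\infty}\frac{\int_a^b \bar{\phi}_n(s) F(s)\,ds}{\lambda_n\int_a^b w(s)|\phi_n(s)|^2\,ds}\,\phi_n(x)$$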
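A concrete instance may help make the procedure tangible. The sketch below assumes the simplest setting, $p = 1$, $q = 0$, $w = 1$ on $[0, \pi]$ with the solution vanishing at both ends, so that the eigenfunctions are $\sin(nx)$ with eigenvalues $n^2$ and the inhomogeneous problem reads $y'' = F(x)$. Substituting $y = \sum a_n \sin(nx)$ gives $-\sum a_n n^2 \sin(nx) = F$, hence $a_n = -\int F \sin(nx)\,dx \,/\, (n^2 \int \sin^2(nx)\,dx)$. The forcing $F(x) = -x$ is an illustrative choice with known exact solution $y = (\pi^2 x - x^3)/6$; the function name `solve_by_expansion` is hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def solve_by_expansion(F, x, terms=60):
    """Approximate y(x) where y'' = F on [0, pi], y(0) = y(pi) = 0, via a sine expansion."""
    total = 0.0
    for n in range(1, terms + 1):
        phi = lambda s, n=n: math.sin(n * s)        # eigenfunction, eigenvalue n^2
        norm = simpson(lambda s: phi(s) ** 2, 0.0, math.pi)
        proj = simpson(lambda s: phi(s) * F(s), 0.0, math.pi)
        a_n = -proj / (n * n * norm)                # coefficient from orthogonality
        total += a_n * phi(x)
    return total

F = lambda s: -s                                     # illustrative forcing term
exact = lambda x: (math.pi ** 2 * x - x ** 3) / 6    # satisfies y'' = -x, y(0) = y(pi) = 0

for x in (1.0, math.pi / 2):
    print(x, solve_by_expansion(F, x), exact(x))
```

The partial sums converge here like $1/n^3$, so a few dozen terms already agree with the exact solution to about three decimal places.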
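To conclude this note, I want to go back to a previous note in which I explored in detail the solution of Schrödinger’s equation for the hydrogen atom by the method of separation of variables. This approach reduced Schrödinger’s partial differential equation to a set of three uncoupled ordinary differential equations which we can now see are in fact Sturm-Liouville problems. As discussed in my previous note, Schrödinger’s three-dimensional equation for the hydrogen atom can be written in spherical polar coordinates as
$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2m_e}{\hbar^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi = 0$$
and after solving this by the usual separation of variables approach starting from the assumption that the function $\psi$ can be expressed as a product
$$\psi(r, \theta, \varphi) = R(r)\Theta(\theta)\Phi(\varphi)$$
we end up with an equation for $R$ (the radial equation) of the form
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[\frac{2m_e}{\hbar^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right) - \frac{l(l+1)}{r^2}\right]R = 0$$
and equations for $\Phi$ and $\Theta$ of the forms
$$\frac{d^2\Phi}{d\varphi^2} + m_l^2\Phi = 0$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left[l(l+1) - \frac{m_l^2}{\sin^2\theta}\right]\Theta = 0$$
respectively. Taking each of these in turn, we first observe that the radial equation (after multiplying through by $r^2$) is of the Sturm-Liouville form with $p(r) = r^2$ and eigenvalues corresponding to the energy term $E$ in the equation. The variable $r$ can range between $0$ and $\infty$ and the boundary conditions are formulated in such a way that the solutions of the radial equation remain bounded as $r \to 0$ and go to zero as $r \to \infty$. Furthermore, since $p(0) = 0$, the radial equation is a singular Sturm-Liouville problem. Next, we observe that the equation for $\Phi$ is essentially the same as equation (13) for the vibrating string in the extract from Courant & Hilbert discussed at the start of this note. The azimuth angle $\varphi$ can take any value in $(-\infty, \infty)$ but the function $\Phi$ must take a single value at each point in space (since this is a required property of the quantum wave function of which $\Phi$ is a constituent). It follows that the function $\Phi$ must be periodic since it must take the same value at $\varphi$ and $\varphi + 2\pi$ for any given $\varphi$. This condition implies the conditions $\Phi(0) = \Phi(2\pi)$ and $\Phi'(0) = \Phi'(2\pi)$. Furthermore, we have $p(\varphi) = 1$ for all $\varphi$. Thus, the equation for $\Phi$ is a Sturm-Liouville problem with periodic boundary conditions. Finally, as discussed in my previous note, the $\Theta$ equation can be rewritten as
$$\frac{d}{dx}\left[(1 - x^2)\frac{d\Theta}{dx}\right] + \left[l(l+1) - \frac{m_l^2}{1 - x^2}\right]\Theta = 0$$
where $x = \cos\theta$ and thus $-1 \leq x \leq 1$. This is a Sturm-Liouville problem with $p(x) = 1 - x^2$ and the boundary conditions are given by the requirement that $\Theta$ should remain bounded for all $x \in [-1, 1]$. Since $p(x) = 0$ at both ends of the interval $[-1, 1]$, this equation can be classified as a singular Sturm-Liouville problem. The eigenvalue is $l(l+1)$ in this equation.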
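As a small check of the orthogonality that the theory predicts for this last singular problem, the sketch below restricts attention to the special case of zero magnetic quantum number, where the bounded solutions on $[-1, 1]$ are the Legendre polynomials $P_l(x)$ with eigenvalue $l(l+1)$ and weight function $1$. That restriction, the use of Bonnet's recursion to generate $P_l$, and the helper names are choices made here for illustration.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def legendre(l, x):
    """Evaluate P_l(x) using Bonnet's recursion (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def inner(l, m):
    """Integral of P_l(x) P_m(x) over [-1, 1] (weight function equal to 1)."""
    return simpson(lambda x: legendre(l, x) * legendre(m, x), -1.0, 1.0)

cross = inner(2, 3)     # distinct eigenvalues 6 and 12, so close to 0
norm = inner(3, 3)      # close to 2 / (2*3 + 1) = 2/7
print(cross, norm)
```

No boundary values are imposed on the polynomials; boundedness at the endpoints, where $p(x) = 1 - x^2$ vanishes, is what makes the boundary term disappear.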