Study of a proof of Noether’s theorem and its application to conservation laws in physics

While I have for a long time been aware of Noether’s theorem and its relevance to symmetry and conservation laws in physics, I have only recently taken the time to closely explore its mathematical proof. In the present note I want to record some notes I made on the mathematical nuances involved in a proof of Noether’s theorem and the mathematical relevance of the theorem to some simple conservation laws in classical physics, namely, the conservation of energy and the conservation of linear momentum. Noether’s Theorem has important applications in a wide range of classical mechanics problems as well as in quantum mechanics and Einstein’s relativity theory. It is also used in the study of certain classes of partial differential equations that can be derived from variational principles.

The theorem was first published by Emmy Noether in 1918. An English translation of the full original paper is available here. An interesting book by Yvette Kosmann-Schwarzbach also presents an English translation of Noether’s 1918 paper and discusses in detail the history of the theorem’s development and its impact on theoretical physics in the 20th Century. (Kosmann-Schwarzbach, Y, 2011, The Noether Theorems: Invariance and Conservation Laws in the Twentieth Century. Translated by Bertram Schwarzbach. Springer). At the time of writing, the book is freely downloadable from here.

Mathematical setup of Noether’s theorem

The case I explore in detail here is that of a variational calculus functional of the form

S[y] = \int_a^b \mathrm{d}x F(x, y, y^{\prime})

where x is a single independent variable and y = (y_1, y_2, \ldots, y_n) is a vector of n dependent variables. The functional has stationary paths defined by the usual Euler-Lagrange equations of variational calculus. Noether’s theorem concerns how the value of this functional is affected by families of continuous transformations of the dependent and independent variables (e.g., translations, rotations) that are defined in terms of one or more real parameters. The case I explore in detail here involves transformations defined in terms of only a single parameter, call it \delta. The transformations can be represented in general terms as

\overline{x} = \Phi(x, y, y^{\prime}; \delta)

\overline{y}_k = \Psi_k(x, y, y^{\prime}; \delta)

for k = 1, 2, \ldots, n. The functions \Phi and \Psi_k are assumed to have continuous first derivatives with respect to all the variables, including the parameter \delta. Furthermore, the transformations must reduce to identities when \delta = 0, i.e.,

x \equiv \Phi(x, y, y^{\prime}; 0)

y_k \equiv \Psi_k(x, y, y^{\prime}; 0)

for k = 1, 2, \ldots, n. As concrete examples, translations and rotations are continuous differentiable transformations that can be defined in terms of a single parameter and that reduce to identities when the parameter takes the value zero.

Noether’s theorem is assumed to apply to infinitesimally small changes in the dependent and independent variables, so we can assume |\delta| \ll 1 and then use perturbation theory to prove the theorem. Treating \overline{x} and \overline{y}_k as functions of \delta and Taylor-expanding them about \delta = 0 we get

\overline{x}(\delta) = \overline{x}(0) + \frac{\partial \Phi}{\partial \delta} \big|_{\delta = 0}(\delta - 0) + O(\delta^2)

\iff

\overline{x}(\delta) = x + \delta \phi + O(\delta^2)

where

\phi(x, y, y^{\prime}) \equiv \frac{\partial \Phi}{\partial \delta} \big|_{\delta = 0}

and

\overline{y}_k (\delta) = \overline{y}_k (0) + \frac{\partial \Psi_k}{\partial \delta} \big|_{\delta = 0}(\delta - 0) + O(\delta^2)

\iff

\overline{y}_k (\delta) = y_k + \delta \psi_k + O(\delta^2)

where

\psi_k (x, y, y^{\prime}) \equiv \frac{\partial \Psi_k}{\partial \delta} \big|_{\delta = 0} for k = 1, 2, \ldots, n.

Noether’s theorem then says that whenever the functional S[y] is invariant under the above family of transformations, i.e., whenever

\int_{\overline{c}}^{\overline{d}} \mathrm{d} \overline{x} F(\overline{x}, \overline{y}, \overline{y}^{\prime}) = \int_c^d \mathrm{d}x F(x, y, y^{\prime})

for all c and d such that a \leq c < d \leq b, where \overline{c} = \Phi(c, y(c), y^{\prime}(c)) and \overline{d} = \Phi(d, y(d), y^{\prime}(d)), then for each stationary path of S[y] the following equation holds:

\sum_{k = 1}^n \frac{\partial F}{\partial y_k^{\prime}}\psi_k + \bigg(F - \sum_{k = 1}^n y_k^{\prime}\frac{\partial F}{\partial y_k^{\prime}}\bigg)\phi = \mathrm{constant}

As illustrated below, this remarkable equation encodes a number of conservation laws in physics, including conservation of energy, linear and angular momentum given that the relevant equations of motion are invariant under translations in time and space, and under rotations in space respectively. Thus, Noether’s theorem is often expressed as a statement along the lines that whenever a system has a continuous symmetry there must be corresponding quantities whose values are conserved.

Application of the theorem to familiar conservation laws in classical physics

It is, of course, not necessary to use the full machinery of Noether’s theorem for simple examples of conservation laws in classical physics. The theorem is most useful in unfamiliar situations in which it can reveal conserved quantities which were not previously known. However, going through the motions in simple cases clarifies how the mathematical machinery works in more sophisticated and less familiar situations.

To obtain the law of the conservation of energy in the simplest possible scenario, consider a particle of mass m moving along a straight line in a time-invariant potential field V(x) with position at time t given by the function x(t). The Lagrangian formulation of mechanics then says that the path followed by the particle will be a stationary path of the action functional

\int_0^{T} \mathrm{d}t L(x, \dot{x}) = \int_0^{T} \mathrm{d}t \big(\frac{1}{2}m\dot{x}^2 - V(x)\big)

The Euler-Lagrange equation for this functional would give Newton’s second law as the equation governing the particle’s motion. With regard to demonstrating energy conservation, we notice that the Lagrangian, which is more generally of the form L(t, x, \dot{x}) when there is a time-varying potential, here takes the simpler form L(x, \dot{x}) because there is no explicit dependence on time. Therefore we might expect the functional to be invariant under translations in time, and thus Noether’s theorem to hold. We will verify this. In the context of the mathematical setup of Noether’s theorem above, we can write the relevant transformations as

\overline{t}(\delta) = t + \delta \phi + O(\delta^2) \equiv t + \delta

and

\overline{x}(\delta) = x + \delta \cdot 0 + O(\delta^2) \equiv x

From the first equation we see that \phi = 1 in the case of a simple translation in time by an amount \delta, and from the second equation we see that \psi = 0, which simply reflects the fact that we are only translating in the time direction. The invariance of the functional under these transformations can easily be demonstrated by writing

\int_{\overline{0}}^{\overline{T}} \mathrm{d}\overline{t} L(\overline{x}, \dot{\overline{x}}) = \int_{\overline{0}-\delta}^{\overline{T}-\delta} \mathrm{d}t L(x, \dot{x}) = \int_0^{T} \mathrm{d}t L(x, \dot{x})

where the limits in the second integral follow from the change of the time variable from \overline{t} to t. Thus, Noether’s theorem holds and with \phi = 1 and \psi = 0 the fundamental equation in the theorem reduces to

L - \dot{x}\frac{\partial L}{\partial \dot{x}} = \mathrm{constant}

Evaluating the terms on the left-hand side we get

\frac{1}{2}m\dot{x}^2 - V(x) - \dot{x} m\dot{x} =\mathrm{constant}

\iff

\frac{1}{2}m\dot{x}^2 + V(x) = E = \mathrm{constant}

which is of course the statement of the conservation of energy.

To obtain the law of conservation of linear momentum in the simplest possible scenario, assume now that the above particle is moving freely in the absence of any potential field, so V(x) = 0 and the only energy involved is kinetic energy. The path followed by the particle will now be a stationary path of the action functional

\int_0^{T} \mathrm{d}t L(\dot{x}) = \int_0^{T} \mathrm{d}t \big(\frac{1}{2}m\dot{x}^2\big)

The Euler-Lagrange equation for this functional would give Newton’s first law as the equation governing the particle’s motion (constant velocity in the absence of any forces). To get the law of conservation of linear momentum we will consider a translation in space rather than time, and check that the action functional is invariant under such translations. In the context of the mathematical setup of Noether’s theorem above, we can write the relevant transformations as

\overline{t}(\delta) = t + \delta \cdot 0 + O(\delta^2) \equiv t

and

\overline{x}(\delta) = x + \delta \psi + O(\delta^2) \equiv x + \delta

From the first equation we see that \phi = 0 reflecting the fact that we are only translating in the space direction, and from the second equation we see that \psi = 1 in the case of a simple translation in space by an amount \delta. The invariance of the functional under these transformations can easily be demonstrated by noting that \dot{\overline{x}} = \dot{x}, so we can write

\int_{\overline{0}}^{\overline{T}} \mathrm{d}\overline{t} L(\dot{\overline{x}}) = \int_0^{T} \mathrm{d}t L(\dot{x})

since the limits of integration are not affected by the translation in space. Thus, Noether’s theorem holds and with \phi = 0 and \psi = 1 the fundamental equation in the theorem reduces to

\frac{\partial L}{\partial \dot{x}} = \mathrm{constant}

\iff

m\dot{x} = \mathrm{constant}

This is, of course, the statement of the conservation of linear momentum.

Proof of Noether’s theorem

To prove Noether’s theorem we will begin with the transformed functional

\int_{\overline{c}}^{\overline{d}} \mathrm{d} \overline{x} F(\overline{x}, \overline{y}, \overline{y}^{\prime})

We will substitute into this the linearised forms of the transformations, namely

\overline{x}(\delta) = x + \delta \phi + O(\delta^2)

and

\overline{y}_k (\delta) = y_k + \delta \psi_k + O(\delta^2)

for k = 1, 2, \ldots, n, and then expand to first order in \delta. Note that the integration limits are, to first order in \delta,

\overline{c} = c + \delta \phi(c)

and

\overline{d} = d + \delta \phi(d)

Using the linearised forms of the transformations and writing \psi = (\psi_1, \psi_2, \ldots, \psi_n) we get

\frac{\mathrm{d} \overline{y}}{\mathrm{d}\overline{x}} = \big(\frac{\mathrm{d}y}{\mathrm{d}x} + \delta \frac{\mathrm{d}\psi}{\mathrm{d}x} \big) \frac{\mathrm{d}x}{\mathrm{d}\overline{x}}

\frac{\mathrm{d}\overline{x}}{\mathrm{d}x} = 1 + \delta \frac{\mathrm{d}\phi}{\mathrm{d}x}

Inverting the second equation we get

\frac{\mathrm{d}x}{\mathrm{d}\overline{x}} = \big(1 + \delta \frac{\mathrm{d}\phi}{\mathrm{d}x}\big)^{-1} = 1 - \delta \frac{\mathrm{d}\phi}{\mathrm{d}x} + O(\delta^2)

Using this in the first equation we find, to first order in \delta,

\frac{\mathrm{d} \overline{y}}{\mathrm{d}\overline{x}} = \big(\frac{\mathrm{d}y}{\mathrm{d}x} + \delta \frac{\mathrm{d}\psi}{\mathrm{d}x} \big) \big(1 - \delta \frac{\mathrm{d}\phi}{\mathrm{d}x}\big) = \frac{\mathrm{d}y}{\mathrm{d}x} + \delta\big(\frac{\mathrm{d}\psi}{\mathrm{d}x} - \frac{\mathrm{d}y}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big)

Making the necessary substitutions we can then write the transformed functional as

\int_{\overline{c}}^{\overline{d}} \mathrm{d} \overline{x} F(\overline{x}, \overline{y}, \overline{y}^{\prime})

= \int_{\overline{c}-\delta \phi(c)}^{\overline{d}-\delta \phi(d)} \mathrm{d}x \frac{ \mathrm{d}\overline{x}}{\mathrm{d}x} F\bigg(x + \delta \phi, y+\delta \psi, \frac{\mathrm{d}y}{\mathrm{d}x} + \delta\big(\frac{\mathrm{d}\psi}{\mathrm{d}x} - \frac{\mathrm{d}y}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big) \bigg)

= \int_c^d \mathrm{d}x \frac{ \mathrm{d}\overline{x}}{\mathrm{d}x} F\bigg(x + \delta \phi, y+\delta \psi, \frac{\mathrm{d}y}{\mathrm{d}x} + \delta\big(\frac{\mathrm{d}\psi}{\mathrm{d}x} - \frac{\mathrm{d}y}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big) \bigg)

Treating F as a function of \delta and expanding about \delta = 0 to first order we get

F(\delta) = F(0) + \delta \frac{\partial F}{\partial \delta}\big|_{\delta = 0}

= F(x, y, y^{\prime}) + \delta \bigg(\frac{\partial F}{\partial x}\phi + \sum_{k=1}^n \big(\frac{\partial F}{\partial y_k}\psi_k + \frac{\partial F}{\partial y^{\prime}_k}\frac{\mathrm{d}\psi_k}{\mathrm{d}x} - \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big)\bigg)

Then using the expression for \frac{\mathrm{d}\overline{x}}{\mathrm{d}x} above, the transformed functional becomes

\int_c^d \mathrm{d}x \frac{ \mathrm{d}\overline{x}}{\mathrm{d}x} F\bigg(x + \delta \phi, y+\delta \psi, \frac{\mathrm{d}y}{\mathrm{d}x} + \delta\big(\frac{\mathrm{d}\psi}{\mathrm{d}x} - \frac{\mathrm{d}y}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big) \bigg)

= \int_c^d \mathrm{d}x F\bigg(x + \delta \phi, y+\delta \psi, \frac{\mathrm{d}y}{\mathrm{d}x} + \delta\big(\frac{\mathrm{d}\psi}{\mathrm{d}x} - \frac{\mathrm{d}y}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big) \bigg)

+ \int_c^d \mathrm{d}x \delta \frac{\mathrm{d}\phi}{\mathrm{d}x} F\bigg(x + \delta \phi, y+\delta \psi, \frac{\mathrm{d}y}{\mathrm{d}x} + \delta\big(\frac{\mathrm{d}\psi}{\mathrm{d}x} - \frac{\mathrm{d}y}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big) \bigg)

= \int_c^d \mathrm{d}x F(x, y, y^{\prime})

+ \int_c^d \mathrm{d}x \delta \bigg(\frac{\partial F}{\partial x}\phi + \sum_{k=1}^n \big(\frac{\partial F}{\partial y_k}\psi_k + \frac{\partial F}{\partial y^{\prime}_k}\frac{\mathrm{d}\psi_k}{\mathrm{d}x} - \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\frac{\mathrm{d}\phi}{\mathrm{d}x}\big)\bigg)

+ \int_c^d \mathrm{d}x \delta \frac{\mathrm{d}\phi}{\mathrm{d}x} F(x, y, y^{\prime}) + O(\delta^2)

Ignoring the second order term in \delta we can thus write

\int_{\overline{c}}^{\overline{d}} \mathrm{d} \overline{x} F(\overline{x}, \overline{y}, \overline{y}^{\prime}) = \int_c^d \mathrm{d}x F(x, y, y^{\prime})

+ \delta \int_c^d \mathrm{d}x \bigg(\big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)\frac{\mathrm{d}\phi}{\mathrm{d}x} + \frac{\partial F}{\partial x}\phi + \sum_{k=1}^n \big(\frac{\partial F}{\partial y_k}\psi_k + \frac{\partial F}{\partial y^{\prime}_k}\frac{\mathrm{d}\psi_k}{\mathrm{d}x}\big)\bigg)

Since the functional is invariant, however, this implies

\int_c^d \mathrm{d}x \bigg(\big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)\frac{\mathrm{d}\phi}{\mathrm{d}x} + \frac{\partial F}{\partial x}\phi + \sum_{k=1}^n \big(\frac{\partial F}{\partial y_k}\psi_k + \frac{\partial F}{\partial y^{\prime}_k}\frac{\mathrm{d}\psi_k}{\mathrm{d}x}\big)\bigg) = 0

We now manipulate this equation by integrating the terms involving \frac{\mathrm{d}\phi}{\mathrm{d}x} and \frac{\mathrm{d}\psi_k}{\mathrm{d}x} by parts. We get

\int_c^d \mathrm{d}x \big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)\frac{\mathrm{d}\phi}{\mathrm{d}x} = \bigg[\phi \big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)\bigg]_c^d

- \int_c^d \mathrm{d}x \phi \frac{\mathrm{d}}{\mathrm{d}x}\big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)

and

\int_c^d \mathrm{d}x \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}\frac{\mathrm{d}\psi_k}{\mathrm{d}x} = \bigg[\sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}\psi_k\bigg]_c^d - \int_c^d \mathrm{d}x \sum_{k=1}^n \frac{\mathrm{d}}{\mathrm{d}x}\big(\frac{\partial F}{\partial y^{\prime}_k}\big)\psi_k

Substituting these into the equation gives

\bigg[\phi \big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big) + \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}\psi_k\bigg]_c^d

+ \int_c^d \mathrm{d}x \phi \bigg(\frac{\partial F}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}x}\big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)\bigg)

+ \int_c^d \mathrm{d}x \sum_{k=1}^n \psi_k \bigg(\frac{\partial F}{\partial y_k} - \frac{\mathrm{d}}{\mathrm{d}x}\big(\frac{\partial F}{\partial y^{\prime}_k}\big)\bigg) = 0

We can manipulate this equation further by expanding the integrand in the second term on the left-hand side. We get

\frac{\partial F}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}x}\big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big)

= \frac{\partial F}{\partial x} - \frac{\partial F}{\partial x} - \sum_{k=1}^n \frac{\partial F}{\partial y_k}y^{\prime}_k - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}y^{\prime \prime}_k + \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}y^{\prime \prime}_k + \sum_{k=1}^n y^{\prime}_k \frac{\mathrm{d}}{\mathrm{d}x}\big(\frac{\partial F}{\partial y^{\prime}_k}\big)

= \sum_{k=1}^n y^{\prime}_k \bigg(\frac{\mathrm{d}}{\mathrm{d}x}\big(\frac{\partial F}{\partial y^{\prime}_k}\big) - \frac{\partial F}{\partial y_k}\bigg)

Thus, the equation becomes

\bigg[\phi \big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big) + \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}\psi_k\bigg]_c^d

+ \int_c^d \mathrm{d}x \phi \sum_{k=1}^n y^{\prime}_k \bigg(\frac{\mathrm{d}}{\mathrm{d}x}\big(\frac{\partial F}{\partial y^{\prime}_k}\big) - \frac{\partial F}{\partial y_k}\bigg)

+ \int_c^d \mathrm{d}x \sum_{k=1}^n \psi_k \bigg(\frac{\partial F}{\partial y_k} - \frac{\mathrm{d}}{\mathrm{d}x}\big(\frac{\partial F}{\partial y^{\prime}_k}\big)\bigg) = 0

We can now see at a glance that the second and third terms on the left-hand side must vanish because of the Euler-Lagrange expressions appearing in the brackets (which are identically zero on stationary paths). Thus we arrive at the equation

\bigg[\phi \big(F - \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k} \frac{\mathrm{d}y_k}{\mathrm{d}x}\big) + \sum_{k=1}^n \frac{\partial F}{\partial y^{\prime}_k}\psi_k\bigg]_c^d = 0

which proves that the formula inside the square brackets is constant as per Noether’s theorem.