On the electrodynamical part of Einstein’s paper ‘Zur Elektrodynamik bewegter Körper’

Albert Einstein (1879-1955) introduced his special theory of relativity in a 1905 paper, Zur Elektrodynamik bewegter Körper (On the Electrodynamics of Moving Bodies, Annalen der Physik. 17 (10): 891–921). The thoughtfulness and confidence of the 26-year-old Einstein are impressive, and it is also striking that the paper contains no references to other research publications. The only person Einstein acknowledges is his best friend Michele Angelo Besso (1873-1955), in the now famous concluding sentence: Zum Schlusse bemerke ich, daß mir beim Arbeiten an dem hier behandelten Probleme mein Freund und Kollege M. Besso treu zur Seite stand und daß ich demselben manche wertvolle Anregung verdanke. (‘In conclusion I wish to say that in working at the problem here dealt with I have had the loyal assistance of my friend and colleague M. Besso, and that I am indebted to him for several valuable suggestions.’) The two remained close until they died a month apart fifty years later.

While reading through Einstein’s work I could not help tinkering with his electrodynamical equations and comparing them to the modern equations of classical and relativistic electrodynamics. I want to record some of my jottings about this in the present note, particularly in relation to two key sections of his paper:

§3 Theorie der Koordinaten- und Zeittransformation… (Theory of the Transformation of Coordinates and Times…)


§6 Transformation der Maxwell-Hertzschen Gleichungen für den leeren Raum. (Transformation of the Maxwell-Hertz Equations for Empty Space.)

What Einstein envisages throughout these two sections is a pair of inertial reference frames K and k which are in what we now call ‘standard configuration’. That is, frame k is moving in the positive x-direction relative to frame K with speed v such that all the Cartesian coordinate axes in the two frames remain parallel and such that the origins and axes of the Cartesian coordinate systems in K and k coincide perfectly at some initial instant. Under these conditions the coordinates of an event using the coordinate system in the k-frame can be expressed in terms of the coordinates of the same event using the coordinate system in the K-frame by a set of transformation equations which Einstein derives in §3 of his paper, and which are now known as the Lorentz transformation equations. Einstein expresses these equations as follows on page 902 of the published paper:

(‘It follows from this relation and the one previously found that \phi(v) = 1, so that the transformation equations which have been found become: \cdots.’) Note that Einstein uses the symbol V to denote the speed of light whereas we now use the symbol c.
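For reference, the transformation equations on that page can be written out as follows. This is a transcription into LaTeX rather than a facsimile of the page, with (x, y, z, t) the coordinates of an event in the K-frame, (\xi, \eta, \zeta, \tau) the coordinates of the same event in the k-frame, and V the speed of light:

\tau = \beta \bigg(t - \frac{v}{V^2} x \bigg)

\xi = \beta (x - vt)

\eta = y

\zeta = z

\beta = \frac{1}{\sqrt{1 - \big(\frac{v}{V}\big)^2}}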

Shortly after Einstein published this in 1905, Hermann Minkowski (1864-1909) realised that the special theory of relativity could be better understood by positing the existence of a four-dimensional spacetime. In Minkowski spacetime we have four-vectors consisting of a time coordinate, ct, and three spatial coordinates, x, y and z. Note that the time coordinate takes the form ct in order to make it have the same units as the spatial coordinates (i.e., units of length). The full coordinate transformations between Einstein’s two inertial frames K and k in standard configuration would be obtained in Minkowski spacetime as

\begin{bmatrix} ct^{\prime}\\ \ \\x^{\prime} \\ \ \\ y^{\prime} \\ \ \\ z^{\prime} \end{bmatrix} = \begin{bmatrix} \beta & -\beta \big(\frac{v}{c}\big) & 0 & \  0\\ \ \\ -\beta \big(\frac{v}{c}\big) & \beta & 0 & \ 0 \\ \ \\0 & 0 & \  1 & \  \ 0 \\ \ \\ 0 & 0 & \  0 & \  \  1 \end{bmatrix} \begin{bmatrix} ct \\ \ \\x \\ \ \\ y \\ \ \\ z \end{bmatrix}

where, using the same notation as Einstein,

\beta = \frac{1}{\sqrt{1 - \big(\frac{v}{c}\big)^2}}

The coefficient matrix carries one upper and one lower index, in the manner of a rank-2 tensor of type (1, 1), and is usually denoted by \Lambda^{\mu}_{\hphantom{\mu} \nu}. It is the application of this matrix to a four-vector in the K-frame that would produce the required Lorentz transformation to a corresponding (primed) four-vector in the k-frame when K and k are in standard configuration.
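As a quick numerical illustration of the matrix form above, here is a minimal Python sketch that builds the boost matrix and applies it to a four-vector. The helper names (`boost_matrix`, `apply`, `interval`) and the sample event are my own choices for illustration and do not come from Einstein's paper. The sketch also compares the quantity (ct)^2 - x^2 - y^2 - z^2 before and after the boost, which a Lorentz transformation should leave unchanged.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def boost_matrix(v, c=C):
    """4x4 Lorentz boost along x for frames in standard configuration."""
    b = v / c
    beta = 1.0 / math.sqrt(1.0 - b * b)  # Einstein's beta (the modern gamma)
    return [
        [beta, -beta * b, 0.0, 0.0],
        [-beta * b, beta, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, vector):
    """Multiply a 4x4 matrix by a four-vector (ct, x, y, z)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def interval(four_vector):
    """Return (ct)^2 - x^2 - y^2 - z^2 for a four-vector."""
    ct, x, y, z = four_vector
    return ct * ct - x * x - y * y - z * z


# Hypothetical event in the K-frame: (ct, x, y, z) in metres.
event_K = [5.0, 3.0, 2.0, 1.0]

# The same event as described in the k-frame, moving at v = 0.6c.
event_k = apply(boost_matrix(0.6 * C), event_K)

print(event_k)
print(interval(event_K), interval(event_k))
```

With v = 0.6c the factor \beta is 1.25, so the arithmetic can also be followed by hand.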

At the start of §6 on page 907 of the published paper Einstein writes the following:

(‘Let the Maxwell-Hertz equations for empty space hold for the stationary system K, so that we have: \ldots where (X, Y, Z) denotes the vector of the electric force, and (L, M, N) that of the magnetic force.’) The key point that Einstein is trying to make in §6 is that when the coordinate transformations he derived in §3 are applied to electromagnetic processes satisfying the above Maxwell-Hertz equations, it is found that the vectors (X, Y, Z) and (L, M, N) themselves satisfy transformation equations of the form

(page 909 of Einstein’s published paper). Here the primed letters are the components of the vectors with respect to the coordinate system in the k-frame and the unprimed letters are the components with respect to the coordinate system in the K-frame. These equations show that the components of (X, Y, Z) and (L, M, N) do not all remain unchanged when we switch from one inertial reference frame to another. In the standard configuration scenario, only the X and L components remain unchanged. In contrast, looking at the Y^{\prime} equation, for example, we see that an event which from the point of view of the k-frame would be regarded as being due solely to an ‘electric force’ Y^{\prime} would be regarded from the point of view of the K-frame as being due to a combination of an electric force Y and a magnetic force N. Thus, Einstein writes on page 910 that die elektrischen und magnetischen Kräfte keine von dem Bewegungszustande des Koordinatensystems unabhängige Existenz besitzen. (‘electric and magnetic forces do not exist independently of the state of motion of the system of coordinates.’)
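For reference, the transformation equations discussed in this paragraph can be written out in Einstein's notation as follows. This is a transcription into LaTeX rather than a facsimile of the page, with V the speed of light and \beta the same factor as in the coordinate transformation:

X^{\prime} = X

Y^{\prime} = \beta \bigg(Y - \frac{v}{V} N \bigg)

Z^{\prime} = \beta \bigg(Z + \frac{v}{V} M \bigg)

L^{\prime} = L

M^{\prime} = \beta \bigg(M + \frac{v}{V} Z \bigg)

N^{\prime} = \beta \bigg(N - \frac{v}{V} Y \bigg)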

Not surprisingly, the terminology and notation that Einstein uses in 1905 seem rather archaic and obscure from a modern perspective and I could not help jotting down modern interpretations of what he was saying as I was reading his paper. To begin with, the ‘Maxwell-Hertz equations for empty space’ that Einstein refers to at the start can be viewed as arising from a simple scenario in which there is a changing charge and current distribution within a certain region of space, but we are considering the fields produced by this source of radiation in the free space outside the region. In this free space outside the region the charge and current densities (denoted by \rho and \vec{J} respectively) are everywhere zero and the differential form of the Maxwell equations in SI units reduces to

\nabla \cdot \vec{E} = 0

\nabla \cdot \vec{B} = 0

\mu_0 \epsilon_0 \frac{\partial \vec{E}}{\partial t} = \nabla \times \vec{B}

\frac{\partial \vec{B}}{\partial t} = - \nabla \times \vec{E}

where the parameters \epsilon_0 and \mu_0 and the speed of light c are related by the equation

c^2 = \frac{1}{\mu_0 \epsilon_0}
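This relation is easy to check numerically. The short Python sketch below uses the CODATA 2018 recommended values of \mu_0 and \epsilon_0 (these numbers are my addition, not part of Einstein's paper) and compares the result with the defined SI value of c.

```python
import math

MU_0 = 1.25663706212e-6       # vacuum permeability in N/A^2 (CODATA 2018)
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018)
C_DEFINED = 299_792_458.0     # speed of light in m/s (exact by definition)

# c^2 = 1 / (mu_0 * epsilon_0), so c is the inverse square root of the product.
c_computed = 1.0 / math.sqrt(MU_0 * EPSILON_0)

relative_error = abs(c_computed - C_DEFINED) / C_DEFINED
print(c_computed, relative_error)
```

The small residual difference reflects only the rounding of the two measured constants.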

The vector \vec{E} = (E_x, E_y, E_z) denotes the electric field which is defined at any given point as the electric force per unit charge on an infinitesimally small positive test charge placed at that point. This corresponds to the vector (X, Y, Z) in Einstein’s paper which he called den Vektor der elektrischen [Kraft] (‘the vector of the electric force’). The vector \vec{B} = (B_x, B_y, B_z) denotes the magnetic field and this corresponds to the vector (L, M, N) in Einstein’s paper. A moving charge creates a magnetic field in all of space and this magnetic field exerts a force on any other moving charge. (Charges at rest do not produce magnetic fields, nor do magnetic fields exert forces on them.)

In his paper Einstein actually employs an alternative formulation of the above Maxwell equations in which quantities are measured in Gaussian units rather than SI units. This involves defining

\epsilon_0 = \frac{1}{4 \pi c}

(and therefore \mu_0 = \frac{4 \pi}{c}) and also rescaling the electric field vector as

\vec{E} = c^{-1} \vec{E}_{SI}

where \vec{E}_{SI} denotes the electric field vector expressed in SI units. When these changes are substituted into the above Maxwell equations, the equations become

\nabla \cdot \vec{E} = 0

\nabla \cdot \vec{B} = 0

\frac{1}{c} \frac{\partial \vec{E}}{\partial t} = \nabla \times \vec{B}

\frac{1}{c} \frac{\partial \vec{B}}{\partial t} = - \nabla \times \vec{E}

The last two of these are the equations appearing at the start of §6 of Einstein’s paper, with V \equiv c there. Incidentally, these are also the equations that are responsible for the emergence of electromagnetic waves, as can easily be demonstrated by taking the curl of both sides of the equations to get

\frac{1}{c} \frac{\partial (\nabla \times \vec{E})}{\partial t} = \nabla \times (\nabla \times \vec{B})

\frac{1}{c} \frac{\partial (\nabla \times \vec{B})}{\partial t} = - \nabla \times (\nabla \times \vec{E})

On the left-hand sides I have taken the curl operator inside the partial derivative, which is permissible since the curl does not involve time. We can then replace the curl inside each partial derivative by the corresponding term in Maxwell’s equations. Using a standard identity in vector calculus the right-hand sides become

\nabla \times (\nabla \times \vec{B}) = \nabla (\nabla \cdot \vec{B}) - \nabla^{2} \vec{B}

\nabla \times (\nabla \times \vec{E}) = \nabla (\nabla \cdot \vec{E}) - \nabla^{2} \vec{E}

Since the divergence terms are zero in free space we are left only with the Laplacian terms on the right-hand sides. Putting the left and right-hand sides together we get the electromagnetic wave equations

\nabla^2 \vec{B} = \frac{1}{c^2} \frac{\partial^2 \vec{B}}{\partial t^2}

\nabla^2 \vec{E} = \frac{1}{c^2} \frac{\partial^2 \vec{E}}{\partial t^2}
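The derivation above can be sanity-checked numerically with a simple plane wave. The Python sketch below is only an illustration under assumptions of my own: a wave travelling in the x-direction with \vec{E} = (0, \cos k(x - ct), 0) and \vec{B} = (0, 0, \cos k(x - ct)), arbitrary units with c = 2 and k = 3, and central finite differences in place of exact derivatives. It evaluates the residuals of the two curl equations (in the Gaussian form given earlier) and of the wave equation for E_y, all of which should be close to zero.

```python
import math

c = 2.0  # wave speed in arbitrary units (assumed for illustration)
k = 3.0  # wavenumber, also arbitrary


def E_y(x, t):
    """y-component of the electric field of the assumed plane wave."""
    return math.cos(k * (x - c * t))


def B_z(x, t):
    """z-component of the magnetic field of the assumed plane wave."""
    return math.cos(k * (x - c * t))


def d_dx(f, x, t, h=1e-5):
    return (f(x + h, t) - f(x - h, t)) / (2 * h)


def d_dt(f, x, t, h=1e-5):
    return (f(x, t + h) - f(x, t - h)) / (2 * h)


def d2_dx2(f, x, t, h=1e-3):
    return (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h**2


def d2_dt2(f, x, t, h=1e-3):
    return (f(x, t + h) - 2 * f(x, t) + f(x, t - h)) / h**2


def residuals(x, t):
    """Left side minus right side of each equation at the point (x, t)."""
    # (1/c) dE/dt = curl B, y-component; here (curl B)_y = -dB_z/dx.
    ampere = d_dt(E_y, x, t) / c + d_dx(B_z, x, t)
    # (1/c) dB/dt = -curl E, z-component; here (curl E)_z = dE_y/dx.
    faraday = d_dt(B_z, x, t) / c + d_dx(E_y, x, t)
    # Wave equation for E_y; the Laplacian reduces to d^2/dx^2 here.
    wave = d2_dx2(E_y, x, t) - d2_dt2(E_y, x, t) / c**2
    return ampere, faraday, wave


print(residuals(0.7, 0.3))
```

The residuals are not exactly zero because of the finite-difference approximation, so they should be judged against the size of the individual terms, which are of order k and k^2 respectively.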

With regard to Einstein’s demonstration that the electric and magnetic fields get mixed together under a Lorentz transformation, we would show this now via a Lorentz transformation of a rank-2 electromagnetic field tensor of the form

F^{\mu \nu} = \begin{bmatrix} 0 & \frac{1}{c} E_x & \frac{1}{c} E_y & \frac{1}{c} E_z \\ \ \\ -\frac{1}{c} E_x & 0 & B_z & -B_y \\ \ \\-\frac{1}{c} E_y & -B_z & 0 & B_x \\ \ \\ -\frac{1}{c}E_z & B_y & -B_x & 0 \end{bmatrix}

Note that we are using SI units here, which is why the coefficient \frac{1}{c} appears in front of the electric field components. This coefficient would disappear if we were using Gaussian units. To apply the Lorentz transformation to the electromagnetic field tensor we observe that since F^{\mu \nu} is a rank-2 tensor of type (2, 0) it transforms according to

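F^{\prime \  \alpha \beta} = \frac{\partial x^{\prime \alpha}}{\partial x^{\mu}} \frac{\partial x^{\prime \beta}}{\partial x^{\nu}} F^{\mu \nu}

Since the coordinate transformation is linear, these partial derivatives are just the entries of the Lorentz transformation matrix. Writing

\Lambda^{\alpha}_{\hphantom{\alpha} \mu} \equiv \frac{\partial x^{\prime \alpha}}{\partial x^{\mu}}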

we obtain the required Lorentz transformation of the electromagnetic field tensor as

F^{\prime \  \alpha \beta} = \Lambda^{\alpha}_{\hphantom{\alpha} \mu} \Lambda^{\beta}_{\hphantom{\beta} \nu} F^{\mu \nu}

But \Lambda^{\beta}_{\hphantom{\beta} \nu} F^{\mu \nu} involves multiplying corresponding terms in each row of the matrices \Lambda^{\beta}_{\hphantom{\beta} \nu} and F^{\mu \nu} which is the same as the matrix multiplication

[F^{\mu \nu}] [\Lambda^{\beta}_{\hphantom{\beta} \nu}]^T

where the T denotes the matrix transpose. (Note that the transformation matrix for frames in standard configuration is symmetric, so it is unaffected by transposition). Applying \Lambda^{\alpha}_{\hphantom{\alpha} \mu} on the left then amounts to multiplying the above matrix product on the left by the matrix [\Lambda^{\alpha}_{\hphantom{\alpha} \mu}]. Thus, we are able to compute the required Lorentz transformation of the electromagnetic field tensor as a matrix product. We get

F^{\prime \  \alpha \beta} = \Lambda^{\alpha}_{\hphantom{\alpha} \mu} \Lambda^{\beta}_{\hphantom{\beta} \nu} F^{\mu \nu}

=  [\Lambda^{\alpha}_{\hphantom{\alpha} \mu}] [F^{\mu \nu}] [\Lambda^{\beta}_{\hphantom{\beta} \nu}]^T

= \begin{bmatrix} \beta & -\beta \big(\frac{v}{c}\big) & 0 & \  0\\ \ \\ -\beta \big(\frac{v}{c}\big) & \beta & 0 & \ 0 \\ \ \\0 & 0 & \  1 & \  \ 0 \\ \ \\ 0 & 0 & \  0 & \  \  1 \end{bmatrix} \begin{bmatrix} 0 & \frac{1}{c} E_x & \frac{1}{c} E_y & \frac{1}{c} E_z \\ \ \\ -\frac{1}{c} E_x & 0 & B_z & -B_y \\ \ \\-\frac{1}{c} E_y & -B_z & 0 & B_x \\ \ \\ -\frac{1}{c}E_z & B_y & -B_x & 0 \end{bmatrix} \begin{bmatrix} \beta & -\beta \big(\frac{v}{c}\big) & 0 & \  0\\ \ \\ -\beta \big(\frac{v}{c}\big) & \beta & 0 & \ 0 \\ \ \\0 & 0 & \  1 & \  \ 0 \\ \ \\ 0 & 0 & \  0 & \  \  1 \end{bmatrix}

= \begin{bmatrix} 0 & \frac{E_x}{c} & \beta \bigg(\frac{E_y}{c} - \big(\frac{v}{c} \big) B_z \bigg) & \beta \bigg(\frac{E_z}{c} + \big(\frac{v}{c} \big) B_y \bigg)\\ \ \\ -\frac{E_x}{c} & 0 & \beta \bigg(B_z - \big(\frac{v}{c} \big) \frac{E_y}{c} \bigg) & -\beta \bigg(B_y + \big(\frac{v}{c} \big) \frac{E_z}{c} \bigg) \\ \ \\ -\beta \bigg(\frac{E_y}{c} - \big(\frac{v}{c} \big) B_z \bigg) & -\beta \bigg(B_z - \big(\frac{v}{c} \big) \frac{E_y}{c} \bigg) & 0 & B_x \\ \ \\ -\beta \bigg(\frac{E_z}{c} + \big(\frac{v}{c} \big) B_y \bigg) & \beta \bigg(B_y + \big(\frac{v}{c} \big) \frac{E_z}{c} \bigg) & -B_x & 0 \end{bmatrix}

\equiv \begin{bmatrix} 0 & \frac{E_x^{\prime}}{c} & \frac{E_y^{\prime}}{c} & \frac{E_z^{\prime}}{c} \\ \ \\ -\frac{E_x^{\prime}}{c} & 0 & B_z^{\prime} & -B_y^{\prime} \\ \ \\-\frac{E_y^{\prime}}{c} & -B_z^{\prime} & 0 & B_x^{\prime} \\ \ \\ -\frac{E_z^{\prime}}{c} & B_y^{\prime} & -B_x^{\prime} & 0 \end{bmatrix}
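The claim that the double index contraction equals the matrix product \Lambda F \Lambda^T can be checked numerically. The following Python sketch is a minimal illustration under assumptions of my own: the helper names are hypothetical, the field values are arbitrary, and plain nested lists stand in for matrices. It computes the transformed tensor both ways and records the largest difference between the two results.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_matrix(v, c=C):
    """4x4 Lorentz boost along x for frames in standard configuration."""
    b = v / c
    beta = 1.0 / math.sqrt(1.0 - b * b)  # Einstein's beta
    return [
        [beta, -beta * b, 0.0, 0.0],
        [-beta * b, beta, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def field_tensor(E, B, c=C):
    """Contravariant field tensor in SI units, laid out as in the text."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [-Ex, 0.0, Bz, -By],
        [-Ey, -Bz, 0.0, Bx],
        [-Ez, By, -Bx, 0.0],
    ]


def transform_by_indices(L, F):
    """F'^{ab} = L^a_m L^b_n F^{mn}, summed explicitly over m and n."""
    return [
        [
            sum(L[a][m] * L[b][n] * F[m][n] for m in range(4) for n in range(4))
            for b in range(4)
        ]
        for a in range(4)
    ]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def transform_by_matrices(L, F):
    """The same transformation written as the matrix product L F L^T."""
    return matmul(matmul(L, F), transpose(L))


E = (1.0e3, 2.0e3, -5.0e2)     # electric field in V/m (arbitrary values)
B = (1.0e-6, -3.0e-6, 2.0e-6)  # magnetic field in T (arbitrary values)
L = boost_matrix(0.6 * C)
F = field_tensor(E, B)

F_idx = transform_by_indices(L, F)
F_mat = transform_by_matrices(L, F)
max_diff = max(abs(F_idx[a][b] - F_mat[a][b])
               for a in range(4) for b in range(4))
print(max_diff)
```

Any difference between the two results should be at the level of floating-point rounding only.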

Comparing the entries in the last two matrices we get exactly the same relations as Einstein did on page 909 of his paper, except that we are using SI units here so the electric field components are divided by c:

\frac{E_x^{\prime}}{c} = \frac{E_x}{c}

\frac{E_y^{\prime}}{c} = \beta \bigg(\frac{E_y}{c} - \big(\frac{v}{c} \big) B_z \bigg)

\frac{E_z^{\prime}}{c} = \beta \bigg(\frac{E_z}{c} + \big(\frac{v}{c} \big) B_y \bigg)

B_x^{\prime} = B_x

B_y^{\prime} = \beta \bigg(B_y + \big(\frac{v}{c} \big) \frac{E_z}{c} \bigg)

B_z^{\prime} = \beta \bigg(B_z - \big(\frac{v}{c} \big) \frac{E_y}{c} \bigg)
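To see the mixing of electric and magnetic fields concretely, the six relations can be coded directly. The Python sketch below is an illustration only: the function name `transform_fields` and the sample field values are my own, and the electric-field relations have been multiplied through by c so that the function works with E rather than E/c. It takes a purely electric field in the K-frame and shows that the k-frame, moving at v = 0.6c, also sees a magnetic component.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def transform_fields(E, B, v, c=C):
    """Fields seen in frame k, given the fields (E, B) in frame K (SI units)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)  # Einstein's beta
    E_prime = (Ex, beta * (Ey - v * Bz), beta * (Ez + v * By))
    B_prime = (Bx, beta * (By + v * Ez / c**2), beta * (Bz - v * Ey / c**2))
    return E_prime, B_prime


# Purely electric field along y in the K-frame (arbitrary value in V/m).
E_K = (0.0, 1.0, 0.0)
B_K = (0.0, 0.0, 0.0)

E_k, B_k = transform_fields(E_K, B_K, 0.6 * C)
print(E_k)
print(B_k)
```

With these inputs the relations give E_y^{\prime} = \beta E_y and B_z^{\prime} = -\beta v E_y / c^2, so the magnetic component in the k-frame is nonzero even though there is none in the K-frame.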