A note on the quaternion rotation operator

Sir William Rowan Hamilton famously discovered the key rules for quaternion algebra while walking with his wife past a bridge in Dublin in 1843. A plaque now commemorates this event.

I needed to use the quaternion rotation operator recently, and while digging around the literature on this topic I noticed that a lot of it is quite unclear and over-complicated. See, e.g., this Wikipedia article about it and references therein. A couple of simple yet vital ideas would make the key results seem less mysterious if they were spelt out, but they never seem to be mentioned. The vector notation often used in this area also seems to over-complicate things. In this note I want to record some thoughts about the quaternion rotation operator, bringing out the underlying ideas that (to me) make things far less mysterious.

Quaternions are hypercomplex numbers of the form

q = a + bi + cj + dk

In many ways (as I will show below) they can usefully be thought about using familiar ideas for two-dimensional complex numbers x + yi in which i \equiv \sqrt{-1}.

In the case of quaternions, the identities i^2 = j^2 = k^2 = ijk = -1 discovered by Hamilton determine all possible products of i, j and k. They imply a cyclic relationship when calculating their products (similar to that of the cross products of the three-dimensional basis vectors i, j, k, which is why authors often use these basis vectors when defining quaternions). Taking products in the cyclic order i \to j \to k \to i (clockwise around the cycle) one obtains

ij = k
jk = i
ki = j

and taking products in the reverse (anticlockwise) order one obtains

ik = -j
kj = -i
ji = -k
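The six products above, together with i^2 = j^2 = k^2 = -1, can be written down as a small lookup table. The following is only a minimal sketch: the names `TABLE` and `basis_mul` are made up for illustration, and each product is stored as a (sign, element) pair.

```python
# Products of distinct basis elements, stored as (sign, element).
TABLE = {
    ("i", "j"): (1, "k"), ("j", "k"): (1, "i"), ("k", "i"): (1, "j"),
    ("i", "k"): (-1, "j"), ("k", "j"): (-1, "i"), ("j", "i"): (-1, "k"),
}

def basis_mul(x, y):
    """Multiply two elements of {'1', 'i', 'j', 'k'}; return (sign, element)."""
    if x == "1":
        return (1, y)
    if y == "1":
        return (1, x)
    if x == y:
        return (-1, "1")  # i^2 = j^2 = k^2 = -1
    return TABLE[(x, y)]

# Check ijk = -1 by computing (ij)k.
sign_ij, ij = basis_mul("i", "j")
sign_ijk, ijk = basis_mul(ij, "k")
print(sign_ij * sign_ijk, ijk)  # -1 1
```

Swapping the order of the arguments flips the sign, e.g. `basis_mul("j", "i")` gives `(-1, "k")`, which is the non-commutativity in its simplest form.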

This algebraic structure leads to some interesting differences from the familiar algebra of ordinary two-dimensional complex numbers, particularly non-commutativity of multiplication. Technically, one says that the set of all quaternions with the operations of addition and multiplication constitutes a non-commutative division ring, i.e., every non-zero quaternion has a multiplicative inverse and quaternion products are generally non-commutative.

I was interested to notice that one way in which this algebraic difference with ordinary complex numbers manifests itself is in taking the complex conjugate of products. With ordinary complex numbers z_1 = x_1 + y_1 i and z_2 = x_2 + y_2 i one obtains

\overline{z_1 \cdot z_2} = \overline{z_1} \cdot \overline{z_2}

since

z_1 \cdot z_2 = x_1 x_2 + (x_1 y_2 + x_2 y_1)i - y_1 y_2 = (x_1 x_2 - y_1 y_2) + (x_1 y_2 + x_2 y_1)i

and therefore

\overline{z_1 \cdot z_2} = (x_1 x_2 - y_1 y_2) - (x_1 y_2 + x_2 y_1)i

but this is the same as

\overline{z_1} \cdot \overline{z_2} = (x_1 - y_1 i)(x_2 - y_2 i) = x_1 x_2 - (x_1 y_2 + x_2 y_1)i - y_1 y_2

= (x_1 x_2 - y_1 y_2) - (x_1 y_2 + x_2 y_1)i

For a quaternion q = a + bi + cj + dk the conjugate is \overline{q} = a - bi - cj - dk, and with the product of two quaternions q_1 and q_2 we get a different result:

\overline{q_1 \cdot q_2} = \overline{q_2} \cdot \overline{q_1}

In words, the complex conjugate of the product is the product of the complex conjugates in reverse order. To see this, let

q_1 = a_1 + b_1i + c_1j + d_1k

q_2 = a_2 + b_2i + c_2j + d_2k

Then

q_1 \cdot q_2 =

a_1a_2 + a_1b_2i + a_1c_2j + a_1d_2k

+ a_2b_1i - b_1b_2 + b_1c_2k - b_1d_2j

+ a_2c_1j - b_2c_1k - c_1c_2 + c_1d_2i

+ a_2d_1k + b_2d_1j - c_2d_1i - d_1d_2

=

(a_1a_2 - b_1b_2 - c_1c_2 - d_1d_2)

+ (a_1b_2 + a_2b_1 + c_1d_2 - c_2d_1)i

+ (a_1c_2 - b_1d_2 + a_2c_1 + b_2d_1)j

+ (a_1d_2 + b_1c_2 - b_2c_1 + a_2d_1)k

Therefore

\overline{q_1 \cdot q_2} =

(a_1a_2 - b_1b_2 - c_1c_2 - d_1d_2)

+ (c_2d_1 - a_1b_2 - a_2b_1 - c_1d_2)i

+ (b_1d_2 - a_1c_2 - a_2c_1 - b_2d_1)j

+ (b_2c_1 - a_1d_2 - b_1c_2 - a_2d_1)k

But this is the same as

\overline{q_2} \cdot \overline{q_1}

= (a_2 - b_2i - c_2j - d_2k)(a_1 - b_1i - c_1j - d_1k)

= a_1a_2 - a_2b_1i - a_2c_1j - a_2d_1k

- a_1b_2i - b_1b_2 + b_2c_1k - b_2d_1j

- a_1c_2j - b_1c_2k - c_1c_2 + c_2d_1i

- a_1d_2k + b_1d_2j - c_1d_2i - d_1d_2

=

(a_1a_2 - b_1b_2 - c_1c_2 - d_1d_2)

+ (c_2d_1 - a_1b_2 - a_2b_1 - c_1d_2)i

+ (b_1d_2 - a_1c_2 - a_2c_1 - b_2d_1)j

+ (b_2c_1 - a_1d_2 - b_1c_2 - a_2d_1)k
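The reversal of order can also be checked numerically. The sketch below is an illustration only: it represents a quaternion as a tuple (a, b, c, d), uses the hypothetical helper names `qmul` and `conj`, and takes the component formula for the product directly from the expansion above.

```python
def qmul(p, q):
    """Quaternion product, using the collected terms of q_1 . q_2 above."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + a2*b1 + c1*d2 - c2*d1,
        a1*c2 - b1*d2 + a2*c1 + b2*d1,
        a1*d2 + b1*c2 - b2*c1 + a2*d1,
    )

def conj(q):
    """Conjugate: negate the i, j and k components."""
    a, b, c, d = q
    return (a, -b, -c, -d)

q1 = (1, 2, 3, 4)
q2 = (5, 6, 7, 8)

print(qmul(q1, q2))                                      # (-60, 12, 30, 24)
print(conj(qmul(q1, q2)) == qmul(conj(q2), conj(q1)))    # True
print(conj(qmul(q1, q2)) == qmul(conj(q1), conj(q2)))    # False
```

Integer components are used so that the comparisons are exact; one example is of course not a proof, it only confirms the algebra above for a particular pair.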

The complex conjugate is used to define the length |q| of a quaternion

q = a + bi + cj + dk

as

|q| = \sqrt{\overline{q} \cdot q}

= \sqrt{(a - bi - cj - dk)(a + bi + cj + dk)}

= \sqrt{a^2 + b^2 + c^2 + d^2}

To find the inverse q^{-1} of a quaternion we observe that

q \cdot q^{-1} = 1

so

\overline{q} \cdot q \cdot q^{-1} = \overline{q}

\iff |q|^2 \cdot q^{-1} = \overline{q}

\iff q^{-1} = \frac{\overline{q}}{|q|^2}
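As a quick sanity check of the length and inverse formulas, here is a minimal sketch. The tuple representation (a, b, c, d) and the helper names `qmul`, `conj`, `norm` and `inverse` are assumptions made for illustration, not part of any particular library.

```python
import math

def qmul(p, q):
    """Quaternion product of two (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + a2*b1 + c1*d2 - c2*d1,
        a1*c2 - b1*d2 + a2*c1 + b2*d1,
        a1*d2 + b1*c2 - b2*c1 + a2*d1,
    )

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm(q):
    """|q| = sqrt(a^2 + b^2 + c^2 + d^2)."""
    return math.sqrt(sum(x * x for x in q))

def inverse(q):
    """q^{-1} = conj(q) / |q|^2, defined only for non-zero q."""
    n2 = sum(x * x for x in q)
    if n2 == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(x / n2 for x in conj(q))

q = (1.0, 2.0, 3.0, 4.0)
print(qmul(conj(q), q))     # real part is |q|^2 = 30, other parts are 0
print(qmul(q, inverse(q)))  # close to (1, 0, 0, 0), up to rounding
```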

For a quaternion

q = a + bi + cj + dk

let

r \equiv \sqrt{b^2 + c^2 + d^2}

A key result that helps to clarify the literature on quaternion rotation is that

\frac{bi + cj + dk}{r} = \sqrt{-1}

This seems mysterious at first, but it simply says that the term on the left hand side is a square root of -1 (assuming r \neq 0). This can easily be verified by confirming that when the term is multiplied by itself, the result is -1:

\frac{1}{r^2}(bi + cj + dk)(bi + cj + dk)

= \frac{1}{r^2}(-b^2 + bck - bdj -cbk - c^2 + cdi + dbj - dci - d^2)

= \frac{-b^2 - c^2 - d^2}{r^2}

= -\frac{r^2}{r^2} = -1

This result means that any quaternion of the above form can be written as

q = a + r \big( \frac{bi + cj + dk}{r} \big)

= a + r \sqrt{-1}

This has exactly the form of a familiar two-dimensional complex number! It therefore has an angle \theta associated with it, given by the equations

\cos \theta = \frac{a}{|q|}

\sin \theta = \frac{r}{|q|}

\tan \theta = \frac{r}{a}

We can express the quaternion in terms of this angle as

q = |q|(\cos \theta + \frac{bi + cj + dk}{r} \sin \theta)

If q is a unit quaternion (i.e., |q| = 1) we then get that

q = \cos \theta + \frac{bi + cj + dk}{r} \sin \theta
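The decomposition into a length, an angle and a unit "imaginary" part can be sketched in a few lines. This is an illustration under stated assumptions: quaternions are tuples (a, b, c, d), the helper names `qmul` and `polar` are hypothetical, r is assumed non-zero, and \theta is computed with `math.atan2(r, a)` so that it agrees with the cosine and sine equations above.

```python
import math

def qmul(p, q):
    """Quaternion product of two (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + a2*b1 + c1*d2 - c2*d1,
        a1*c2 - b1*d2 + a2*c1 + b2*d1,
        a1*d2 + b1*c2 - b2*c1 + a2*d1,
    )

def polar(q):
    """Return (|q|, theta, u) with q = |q| (cos theta + u sin theta)."""
    a, b, c, d = q
    r = math.sqrt(b*b + c*c + d*d)
    if r == 0:
        raise ValueError("the unit part is undefined when r = 0")
    length = math.sqrt(a*a + r*r)
    theta = math.atan2(r, a)           # cos = a/|q|, sin = r/|q|
    u = (0.0, b / r, c / r, d / r)     # (bi + cj + dk) / r
    return length, theta, u

q = (1.0, 2.0, 3.0, 4.0)
length, theta, u = polar(q)

print(qmul(u, u))  # close to (-1, 0, 0, 0): u is a square root of -1

# Rebuild q from its polar form.
one = (1.0, 0.0, 0.0, 0.0)
rebuilt = tuple(
    length * (math.cos(theta) * e + math.sin(theta) * x)
    for e, x in zip(one, u)
)
print(rebuilt)     # close to (1, 2, 3, 4), up to rounding
```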

This is the form that is needed in the context of quaternion rotation. It turns out that for any unit quaternion of this form and for any vector (v_1, v_2, v_3) \in \mathbb{R}^3, identified with the pure quaternion v_1i + v_2j + v_3k, the operation

L_q(v_1, v_2, v_3) = q \cdot (v_1i + v_2j + v_3k) \cdot \overline{q}

will result in a rotation of the vector (v_1, v_2, v_3) through an angle 2 \theta about the vector (b, c, d) as the axis of rotation. The direction of rotation is given by the familiar right-hand rule, i.e., the thumb of the right hand points in the direction of the vector (b, c, d) and the fingers then curl in the direction of rotation.

As an example, suppose we want to rotate the vector (0, 0, 1) through 90^{\circ} about the vector (0, 1, 0) in the sense of the right-hand rule.

[Figure: unit vectors]

Looking at the diagram above, we would expect the result to be the vector (1, 0, 0). Using the quaternion rotation operator to achieve this, we would specify

2 \theta = 90^{\circ} \implies \theta = 45^{\circ}

\cos \theta = \frac{1}{\sqrt{2}}

\sin \theta = \frac{1}{\sqrt{2}}

\frac{bi + cj + dk}{r} = \frac{0i + 1j + 0k}{1} = j

q = \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} j

\overline{q} = \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}} j

v_1i + v_2j + v_3k = 0i + 0j + 1k = k

The resulting vector would be

L_q(v_1, v_2, v_3) = q \cdot (v_1i + v_2j + v_3k) \cdot \overline{q}

= \big(\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}j\big)(k)\big(\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}j\big)

= \big(\frac{1}{\sqrt{2}}k + \frac{1}{\sqrt{2}}i\big)\big(\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}j\big)

= \frac{1}{2}k + \frac{1}{2}i + \frac{1}{2}i - \frac{1}{2}k

= i

= 1i + 0j + 0k

This result is interpreted as the vector (1, 0, 0) which is exactly what we expected based on the diagram above.
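The worked example can be reproduced numerically. The sketch below is one possible way to code the operator, not a reference implementation: the function names `qmul` and `rotate` are made up for illustration, quaternions are tuples (a, b, c, d), the angle is in radians, and the axis is normalised inside the function so it need not be a unit vector (but it must be non-zero).

```python
import math

def qmul(p, q):
    """Quaternion product of two (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + a2*b1 + c1*d2 - c2*d1,
        a1*c2 - b1*d2 + a2*c1 + b2*d1,
        a1*d2 + b1*c2 - b2*c1 + a2*d1,
    )

def rotate(v, axis, angle):
    """Rotate the 3-vector v by `angle` radians about `axis` (right-hand rule)."""
    n = math.sqrt(sum(x * x for x in axis))
    theta = angle / 2                      # the operator rotates through 2*theta
    s = math.sin(theta) / n
    q = (math.cos(theta), axis[0] * s, axis[1] * s, axis[2] * s)
    q_bar = (q[0], -q[1], -q[2], -q[3])
    p = (0.0, v[0], v[1], v[2])            # v as a pure quaternion
    return qmul(qmul(q, p), q_bar)[1:]     # drop the (zero) real part

# Rotate (0, 0, 1) through 90 degrees about (0, 1, 0).
result = rotate((0, 0, 1), (0, 1, 0), math.pi / 2)
print(result)  # approximately (1, 0, 0), up to floating-point rounding
```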

Note that to achieve the same result through conventional matrix algebra we would have to use the unwieldy rotation matrix

\begin{bmatrix} \cos \varphi & 0 & \sin \varphi\\0 & 1 & 0\\-\sin \varphi & 0 & \cos \varphi\end{bmatrix}

Setting \varphi = 90^{\circ} and applying this to the vector (0, 0, 1)^T we get

\begin{bmatrix} 0 & 0 & 1\\0 & 1 & 0\\-1 & 0 & 0\end{bmatrix} \begin{bmatrix}0\\0\\1\end{bmatrix}

= \begin{bmatrix}1\\0\\0\end{bmatrix}

This is the same result, but note that this approach is more complicated to implement because there are different rotation matrices for different axes of rotation and for different rotation conventions. The above rotation matrix happens to be the one required to achieve a rotation about the y-axis using the right-hand rule. In general, the correct rotation matrix would have to be computed each time to suit the particular rotation required. Once the angle of rotation and the axis of rotation are known, the quaternion rotation operator is much easier to specify and implement.
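To see that the two approaches agree beyond the single 90^{\circ} case, one can compare them for an arbitrary vector and angle. This is only a sketch under the same assumptions as before: `rotate_quat` and `rotate_y_matrix` are hypothetical names, the test vector and angle are arbitrary, and the matrix is the y-axis rotation matrix given above written out component by component.

```python
import math

def qmul(p, q):
    """Quaternion product of two (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + a2*b1 + c1*d2 - c2*d1,
        a1*c2 - b1*d2 + a2*c1 + b2*d1,
        a1*d2 + b1*c2 - b2*c1 + a2*d1,
    )

def rotate_quat(v, axis, angle):
    """Quaternion rotation of v by `angle` radians about a non-zero `axis`."""
    n = math.sqrt(sum(x * x for x in axis))
    s = math.sin(angle / 2) / n
    q = (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)
    q_bar = (q[0], -q[1], -q[2], -q[3])
    return qmul(qmul(q, (0.0, v[0], v[1], v[2])), q_bar)[1:]

def rotate_y_matrix(v, phi):
    """Apply the y-axis rotation matrix above to v."""
    c, s = math.cos(phi), math.sin(phi)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)

v = (0.3, -1.2, 2.5)   # arbitrary test vector
phi = 0.7              # arbitrary angle in radians

by_quat = rotate_quat(v, (0, 1, 0), phi)
by_matrix = rotate_y_matrix(v, phi)
print(all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(by_quat, by_matrix)))
```

Note that changing the axis only changes the `axis` argument of `rotate_quat`, whereas the matrix version would need a different function for each axis.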