Intro to Lie groups for rigid transformations
===
# Introduction
A **Lie group** is a group that deals with manifolds.
What is a manifold and what is a group?
When we think about geometry, most of us think about geometry in a [_Euclidean space_](https://en.wikipedia.org/wiki/Euclidean_space).
A Euclidean space has many nice properties.
One property is that, if you travel in one direction, you keep going in that direction, and never find yourself going in another direction.
A good example is the 2D Euclidean space $\mathbb R^2$.
If you're travelling in the $x$-direction, the $y$-coordinate couldn't care less!
In a Euclidean space, you can also keep travelling in a particular direction and never come back to where you started.
However, there are some spaces that are non-Euclidean.
For example, imagine standing on the surface of a sphere.
If you're standing in San Francisco facing north, and you keep walking forwards, you'll eventually find yourself in Kazakhstan, which is in the Eastern hemisphere, even though you started out in the West!
And then if you go even further, you'll end up right where you started!
pic little-prince.png cover image of The Little Prince : The Little Prince stands on a spherical surface which is a manifold. He can walk forwards, backwards, left, and right as though he's on a flat plane, but in the end the sphere is not a plane.
A **manifold** is a space that may not be Euclidean, but sure looks Euclidean in a small area around you.
In the sphere example, the small area around you looks pretty much just like a flat plane, which is Euclidean!
Cue the flat Earther jokes!
A **group** is a set that has some operation $\oplus$ to combine two things in the group to make a third thing, so that:
* **closure**: $A \oplus B$ is still in the group
* **associativity**: $(A \oplus B) \oplus C = A \oplus (B \oplus C)$
* **identity**: there is an [identity element](https://en.wikipedia.org/wiki/Identity_element) $I$ such that $I \oplus A = A \oplus I = A$
* **inverse**: for every $A$, there is an $A^{-1}$ so that $A^{-1} \oplus A = I$.
A **matrix group** uses matrix multiplication for $\oplus$ so that it automatically satisfies the last three requirements.
A **Lie group** is a group that is also associated with a **Lie algebra** to help us deal with manifolds.
If we proceed with the analogy of standing on a the spherical Earth...
* The Lie algebra handles the locally Euclidean stuff on the manifold, like walking forwards or walking left while on a spherical Earth.
* The Lie group deals with bigger motions, like an intercontinental flight, where you can no longer assume a flat Earth.
Before we get into details, let us begin with an example.
# Rotations in 2D
Now we can start with an example of a classical Lie group called $SO(2)$, which stands for the _special orthogonal group_.
Put simply, this is a 2D rotation.
2D rotations are not Euclidean, because if you rotate by 360 degrees, you'll end up at the same place, which is something that cannot happen in a Euclidean space.
However, locally, they are like $\mathbb R^1$, a trivial 1-dimensional Euclidean space.
Imagine a point
$ \mathbf p = \begin{bmatrix}x\\y\end{bmatrix}
and we want to rotate it anticlockwise a little bit, by a small angle $\omega$.
How does the point's position change?
Well, first we study how it changes depending on the point's $x$ position.
pic https://pics.dllu.net/file/dllu-sc/rsMYJGGVCQ978eHB.png how a point on the x-axis moves : How a point's position changes based on its $x$-position.
As you can see, the more the $x$-coordinate, the more the $y$-coordinate increases.
Yet, the way its $x$-coordinate changes doesn't depend on its previous $x$-coordinate at all.
So we can write:
$ y_\text{new} = y + \omega x
Likewise, the more the $y$-coordinate, the more the $x$-coordinate _decreases.
pic https://pics.dllu.net/file/dllu-sc/eZDXFt4IYIor9Z68.png how a point moves when rotated in 2D : How a point's position changes when rotated in 2D.
So we can write:
$ x_\text{new} = x - \omega y
In matrix notation,
$ \begin{bmatrix}x_\text{new}\\y_\text{new}\end{bmatrix} = \begin{bmatrix}x - \omega y\\y + \omega x\end{bmatrix} = \underbrace{\left(\mathbf I + \omega \begin{bmatrix}0 & -1\\1 & 0\end{bmatrix} \right)}_A \mathbf p
Well, that's cool, but we've only rotated it by a tiny amount.
To rotate it by a nontrivial amount, we need to do it a bunch of times!
pic https://pics.dllu.net/file/dllu-sc/IpbdzkJQomVJ35iX.png moving a point several times : Moving a point several times by applying the same transformation repeatedly.
Suppose we actually want to rotate things by an angle $\theta$.
We can make the tiny amount $\omega$ be a small fraction of $\theta$, say,
$ \omega = \frac{\theta}{n}
Then, to rotate it by the full angle $\theta$, we'd need to apply it $n$ times.
The biggger the $n$, the better!
$ \mathbf R \mathbf p = \lim_{n\rightarrow \infty} \left(\mathbf I + \frac{\theta}{n} \begin{bmatrix}0 & -1\\1 & 0\end{bmatrix}\right)^n \mathbf p
Let's define the _generator_ $\mathbf G$:
$ \mathbf G = \begin{bmatrix}0 & -1\\1 & 0\end{bmatrix}
So the rotation matrix is:
$ \mathbf R = \lim_{n\rightarrow \infty} \left(\mathbf I + \frac{\theta \mathbf G}{n}\right)^n
To find the limit, we can use the [binomial theorem](https://en.wikipedia.org/wiki/Binomial_theorem):
$ \begin{align}\left(\mathbf I + \frac{\theta \mathbf G}{n}\right)^n = \mathbf I
&+ \frac{n}{1!}\left(\frac{\theta \mathbf G}{n}\right)^1\\
&+ \frac{n(n-1)}{2!}\left(\frac{\theta \mathbf G}{n}\right)^2\\
&+ \frac{n(n-1)(n-2)}{3!}\left(\frac{\theta \mathbf G}{n}\right)^3\\
&+ \cdots\end{align}
But then as $n$ approaches infinity, all the $n$s cancel out.
We use the [monotone convergence theorem](https://en.wikipedia.org/wiki/Monotone_convergence_theorem#Convergence_of_a_monotone_series) and apply the limit to each term, e.g. the third one:
$ \lim_{n\rightarrow \infty} \frac{n(n-1)(n-2)}{n^3} = 1
Now we're left with:
$ \mathbf R = \mathbf I + \theta \mathbf G + \frac{(\theta \mathbf G)^2}{2!} + \frac{(\theta \mathbf G)^3}{3!} + \cdots
Notice it is the Taylor expansion of the exponential function.
This is called the **exponential map**, and the Taylor series definition applies to all Lie groups.
$ \mathbf R \equiv \exp{(\theta \mathbf G)}
In the case of 2D rotations, we can sum up this series with a closed form.
Notice that:
$ \mathbf G^2 = -\mathbf I.
Then we group odd and even terms and notice they resemble the Taylor series of cosine and sine,
$ \begin{align}
\mathbf R &= \mathbf I + \theta \mathbf G + \frac{(\theta \mathbf G)^2}{2!} + \frac{(\theta \mathbf G)^3}{3!} + \cdots\\
&= \mathbf I + \theta\mathbf G - \frac{\theta^2 \mathbf I}{2!} - \frac{\theta^3 \mathbf G}{3!} + \frac{\theta^4\mathbf I}{4!} + \frac{\theta^5\mathbf G}{5!} \cdots\\
&= \left(1 + \frac{\theta^2}{2!} + \frac{\theta^4}{4!} \cdots\right)\mathbf I + \left(\frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} -\cdots\right) \mathbf G\\
&= \mathbf I \cos\theta + \mathbf G \sin \theta\\
&= \begin{bmatrix}
\cos \theta & -\sin \theta\\
\sin\theta & \cos\theta
\end{bmatrix}
\end{align}
Hooray, we've just derived the rotation matrix!
To recap:
* The rotation matrix $\mathbf R$ is in the Lie group $SO(2)$.
* The matrix $\theta \mathbf G$ is in the Lie algebra $\mathfrak{so}(2)$, which is of dimensionality 1 and behaves just like $\mathbb R^1$. The value $\theta$ represents the angle of how much you're rotating. The local space that is sort of Euclidean-ish is known as the _tangent space_. In this case, as you've seen from the pictures, it literally resembles the tangent of a circle.
* The exponential map relates the two: $\exp : \mathfrak{so}(3)\rightarrow SO(3)$
The astute reader will notice many similarities between this exercise and imaginary numbers.
Indeed, an imaginary number $\theta i$, representing an angle, is related to rotation in the form of $\exp(\theta i)$ in much the same way.
# Rotations in 3D
Unlike in 2D, we can now rotate things in three axes.
pic https://pics.dllu.net/file/dllu-sc/A_DxINKfiXmw7SJW.png coordinate system : We can rotate things in three axes in 3D.
Based on the same idea as previously, we can think about what happens if we rotate stuff slightly by $\theta$ around the $z$-axis.
Well, the point's $z$-coordinate remains unchanged, but its $x$- and $y$-coordinates change in exactly the same way as previously.
$ \mathbf G_z = \begin{bmatrix}0 & -1 & 0\\
1 & 0 & 0 \\
0 & 0& 0
\end{bmatrix}
Notice the top-left $2\times 2$ submatrix is exactly the same as in the previous section.
We can likewise derive:
$ \mathbf G_x = \begin{bmatrix}0 & 0 & 0\\
0 & 0 & -1\\
0 & 1 & 0\end{bmatrix},
\mathbf G_y = \begin{bmatrix}0 & 0 & 1\\
0 & 0 & 0\\
-1 & 0 & 0
\end{bmatrix}.
Then, an element $\boldsymbol \omega_\times \in \mathfrak{so}(3)$ is the skew-symmetric matrix:
$ \begin{align} \boldsymbol \omega_\times &= \omega_x \mathbf G_x + \omega_y \mathbf G_y + \omega_z \mathbf G_z \\
&= \begin{bmatrix}0 & -\omega_3 & \omega_2\\
\omega_3 & 0 & -\omega_1\\
-\omega_2 & \omega_1 & 0\end{bmatrix}.
\end{align}
Note that the matrix $\boldsymbol \omega_\times$ is called the "cross-product matrix" because
$ \boldsymbol \omega_\times \mathbf p =
\begin{bmatrix}\omega_1\\\omega_2\\\omega_3\end{bmatrix}\times \mathbf p = \boldsymbol \omega \times \mathbf p.
The exponential can be computed again by grouping odd and even terms of the Taylor series.
$ \begin{align}
\mathbf R = \exp(\boldsymbol \omega_\times)
&\equiv \mathbf I + \boldsymbol \omega_\times + \frac{\boldsymbol\omega_\times^2}{2!} + \frac{\boldsymbol\omega_\times^3}{3!} + \cdots\\
&=\mathbf I + \boldsymbol \omega_\times \frac{\sin \|\boldsymbol \omega\|}{\|\boldsymbol \omega\|}
+ \boldsymbol \omega_\times^2 \frac{1 - \cos \|\boldsymbol \omega\|}{\|\boldsymbol \omega\|^2}.
\end{align}
This is called the [Rodrigues' rotation formula](https://en.wikipedia.org/wiki/Rodrigues%27_rotation_formula).
The vector $\boldsymbol \omega$ is also called the **axis-angle** representation because its norm $\|\boldsymbol \omega\|$ represents the angle, while the normalised vector is the axis of rotation.
Interestingly, in all the examples we've looked at so far, we're dealing with spheres whose surfaces look like planes of one fewer dimension.
* for 2D rotations, the 1D rotation angle $\theta$ is the tangent space of a 2D circle. We discussed an analogy to imaginary numbers
* the Little Prince is on a 3D sphere whose tangent space is a 2D plane
* here the 3D tangent space $\boldsymbol \omega$ is related to a 4D sphere! The 4D sphere is known as unit **quaternions**, an extension of imaginary numbers
Just like how a 3D sphere has $x^2 + y^2 + z^2 = 1$, a unit quaternion representing must have $q_x^2 + q_y^2 + q_z^2 + q_w^2 = 1$.
## Implementations of 3D rotations
The Lie Group SO(3) is a group of rotation matrices, but we can implement it in many ways.
For example we could imagine there being an abstract trait `SO3` in your favourite programming language, supporting common operators such as `rotate_point`, `compose`, `log`, and so on.
The following is Rust-like pseudocode.
~~~
lang rust
trait SO3 {
fn from_so3(so3: &Point3) -> Self;
fn rotate_point(&self, point: &Point3) -> Point3;
fn compose(&self, other: &Self) -> Self;
/* ... and so on ...*/
}
impl SO3 for Mat3 {
fn from_so3(so3: &Point3) -> Self {
// TODO: implement Rodrigues formula here lol
}
fn rotate_point(&self, point: &Point3) -> Point3 {
// using matrix multiplication
self * point
}
fn compose(&self, other: &Self) -> Self {
// using matrix multiplication
self * other
}
}
impl SO3 for Quaternion {
fn from_so3(so3: &Point3) -> Self {
let half_theta = so3.norm() / 2.0;
let s = theta.sin();
let c = theta.cos();
Self::new(s * so3.x, s * so3.y, s * so3.z, c)
}
fn rotate_point(&self, point: &Point3) -> Point3 {
// using quaternion multiplication
let q = Quaternion::new(point.x, point.y, point.z, 0.0);
self.conjugate() * q * self
}
fn compose(&self, other: &Self) -> Self {
// using quaternion multiplication
self * other
}
}
~~~
There are many other ways to implement `SO3` as [explained on Wikipedia](https://en.wikipedia.org/wiki/Rotation_formalisms_in_three_dimensions). Common ways to implement `SO3` could include:
* axis angle, which is basically just $\mathfrak{so}(3)$
** store the angle as the norm of the axis
** store the angle separately from the axis, which is normalized
** store the tangent of the angle (Gibbs vector)
* Euler angles, which have the problem of Gimbal lock
Which parameterisation of $SO(3)$ to use depends on what you want to do.
* Transforming many points with the same rotation? Use a rotation matrix
* Composing rotations often? Quaternion
* Controlling a pan tilt camera? Euler angles
# Rigid transformations SE(3)
The Special Euclidean Group $SE(3)$ deals with rigid transformations by combining a rotation in $SO(3)$ with a translation in $\mathbb R^3$.
$ \mathbf T_{4\times 4} = \begin{bmatrix}
\mathbf R_{3\times 3} & \mathbf t_{3\times 1}\\
\mathbf 0_{1\times 3} & 1
\end{bmatrix} \in SE(3).
Just like $SO(2)$ and $SO(3)$, each element in $SE(3)$ is associated with an element on the corresponding Lie algebra, $\mathfrak{se}(3)$:
$ \begin{align}
\boldsymbol\xi_{4\times 4} &= \begin{bmatrix}
[\boldsymbol\omega]_\times & \boldsymbol\tau\\
\mathbf 0_{1 \times 3} & 0
\end{bmatrix} \in \mathfrak{se}(3)\\
&= \begin{bmatrix}
0 & -\omega_3 & \omega_2 & \tau_1\\
\omega_3 & 0 & -\omega_1 & \tau_2\\
-\omega_2 & \omega_1 & 0 & \tau_3\\
0 & 0 & 0 & 0
\end{bmatrix} \in \mathfrak{se}(3)
\end{align}
where $[\boldsymbol\omega]_\times$ is the cross product matrix of $\omega$. We may also write it as a $6\times 1$ vector.
$ \boldsymbol\xi_{6\times 1} = \begin{bmatrix}
\boldsymbol\omega_{3\times 1}\\
\boldsymbol\tau_{3\times 1}
\end{bmatrix}.
Some textbooks put the rotational part on top and some put it on the bottom; it doesn't matter as long as you're consistent.
In future blog posts we will use the $6\times 1$ and $4\times 4$ representations interchangeably depending on context.
A transformation $\mathbf T \in SE(3)$ is used to transform a 3D point $\mathbf p \in \mathbb R^3$:
$ \mathbf p_{3\times 1} = \begin{bmatrix}x\\y\\z\end{bmatrix}.
The homogeneous representation adds a one at the end:
$ \mathbf p_{4\times 1} = \begin{bmatrix}x\\ y\\ z\\ 1\end{bmatrix}
so that points may be transformed rigidly
$ \mathbf T \mathbf p \equiv \mathbf R \mathbf p + \mathbf t.
# Things to do with Lie groups
Lie groups provide us with elegant tools to do:
* **Interpolation**. For example the element between $A$ and $B$ is $A \exp(0.5 \log(A^{-1} B))$. With Lie groups, it's possible to define all sorts of curves, such as splines, without running into pitfalls due to the non-Euclidean nature of things such as 3D rotations.
* **Linearisation**. Lie algebras allow us to differentiate with respect to small perturbations. This allows us to do non-linear least squares optimisation, as well as other estimation techniques based on linearisation, such as an extended Kalman filter.
In a next blog post we will cover optimization on the manifold of Lie groups for all kinds of robotics applications like SLAM!