SE(3) constraints for robotics

This document summarizes some common maths used for state estimation of rigid bodies such as in robotics.

1 Transformation parameterisation

Rigid transformations in 3 dimensions are known as the special Euclidean group, SE(3), and can be written in the homogeneous form.

1 \displaystyle{\begin{align}\mathbf T_{4\times 4} = \begin{bmatrix}
\mathbf R_{3\times 3} & \mathbf t_{3\times 1}\\
\mathbf 0_{1\times 3} & 1
\end{bmatrix} \in SE(3).\end{align}}

The rotation matrix \mathbf R is in the special orthogonal group SO(3), which means that it is orthogonal (its columns have unit length and are orthogonal to each other) and has determinant +1.

1.1 Transforming a point

An element \mathbf T \in SE(3) may transform a 3D point \mathbf p \in \mathbb R^3:

2 \displaystyle{\begin{align}\mathbf p_{3\times 1} &= \begin{bmatrix}x\\y\\z\end{bmatrix}.\end{align}}

In this document we use it interchangeably with the homogeneous representation:

3 \displaystyle{\begin{align}\mathbf p_{4\times 1} = \begin{bmatrix}x\\ y\\ z\\ 1\end{bmatrix}\end{align}}

so that points may be transformed rigidly

4 \displaystyle{\begin{align}\mathbf T \mathbf p \equiv \mathbf R \mathbf p + \mathbf t.\end{align}}
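As a minimal sketch of the equivalence above, the following NumPy snippet builds a homogeneous transform and applies it both ways. The rotation, translation and point values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical example pose: 90-degree rotation about z plus a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

# Assemble the 4x4 homogeneous matrix [R t; 0 1].
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

p = np.array([1.0, 0.0, 0.0])
p_h = np.append(p, 1.0)   # homogeneous form [x, y, z, 1]

q_h = T @ p_h             # 4x4 matrix times 4x1 point
q = R @ p + t             # equivalent 3D form
```

Both forms give the same transformed point; the homogeneous form simply carries a trailing 1.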

1.2 The Lie algebra

Each element in SE(3) is associated with an element on the corresponding Lie algebra, \mathfrak{se}(3):

5 \displaystyle{\begin{align}\boldsymbol{\xi}_{4\times 4} &= \begin{bmatrix}
[\boldsymbol{\omega}]_\times & \boldsymbol{\tau}\\
\mathbf 0_{1 \times 3} & 0
\end{bmatrix} \in \mathfrak{se}(3)\\
&= \begin{bmatrix}
0 & -\omega_3 & \omega_2 & \tau_1\\
\omega_3 & 0 & -\omega_1 & \tau_2\\
-\omega_2 & \omega_1 & 0 & \tau_3\\
0 & 0 & 0 & 0
\end{bmatrix} \in \mathfrak{se}(3)\end{align}}

where [\boldsymbol{\omega}]_\times is the skew-symmetric matrix form of the cross product by \boldsymbol{\omega}. We may also write \boldsymbol{\xi} as a 6\times 1 vector:

6 \displaystyle{\begin{align}\boldsymbol{\xi}_{6\times 1} = \begin{bmatrix}
\boldsymbol{\omega}_{3\times 1}\\
\boldsymbol{\tau}_{3\times 1}
\end{bmatrix}.\end{align}}

We will use the 6\times 1 and 4\times 4 representations interchangeably depending on context.

Note that some other textbooks put the translational part on top and the rotational part below. It doesn’t matter much, but it will affect our notation for things like the adjoint action, differentials, etc below.
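The two representations can be converted into each other mechanically. The sketch below assumes the rotation-first ordering used in this document; the helper names `skew`, `hat` and `vee` are illustrative and not taken from any particular library.

```python
import numpy as np

def skew(w):
    """Cross-product matrix [w]_x, so that skew(w) @ v equals np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def hat(xi):
    """6-vector (omega, tau) -> 4x4 element of se(3)."""
    X = np.zeros((4, 4))
    X[:3, :3] = skew(xi[:3])
    X[:3, 3] = xi[3:]
    return X

def vee(X):
    """Inverse of hat: 4x4 element of se(3) -> 6-vector (omega, tau)."""
    return np.array([X[2, 1], X[0, 2], X[1, 0], X[0, 3], X[1, 3], X[2, 3]])

xi = np.array([0.1, 0.2, 0.3, 1.0, 2.0, 3.0])
X = hat(xi)
```

With the translation-first convention used by some other texts, only the slicing in `hat` and the element order in `vee` would change.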

1.3 The exponential map

The group SE(3) and its algebra \mathfrak{se}(3) are related by the exponential map:

7 \displaystyle{\begin{align}\exp &: \mathfrak{se}(3) \rightarrow SE(3)\\
\log &: SE(3) \rightarrow \mathfrak{se}(3).\end{align}}

The definition of \exp is based on the Taylor series:

8 \displaystyle{\begin{align}\exp(\boldsymbol{\xi}) = \mathbf I_{4\times 4} + \boldsymbol{\xi}_{4\times 4}
+ \frac{1}{2!} \boldsymbol{\xi}_{4\times 4}^2
+ \frac{1}{3!} \boldsymbol{\xi}_{4\times 4}^3 \ldots\end{align}}

Closed forms exist. See: J. L. Blanco’s report jlblanco.
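For illustration only, the Taylor series can be evaluated directly by truncating it after a fixed number of terms. This is a sketch that assumes a moderately sized \boldsymbol{\xi}; the closed forms are preferable in practice. The function names and the number of terms are assumptions made for this example.

```python
import numpy as np

def hat(xi):
    """6-vector (omega, tau) -> 4x4 element of se(3)."""
    wx, wy, wz, tx, ty, tz = xi
    return np.array([[0.0, -wz,  wy, tx],
                     [ wz, 0.0, -wx, ty],
                     [-wy,  wx, 0.0, tz],
                     [0.0, 0.0, 0.0, 0.0]])

def exp_se3(xi, terms=30):
    """Truncated Taylor series I + X + X^2/2! + ... of the matrix exponential."""
    X = hat(xi)
    result = np.eye(4)
    term = np.eye(4)
    for k in range(1, terms):
        term = term @ X / k      # term now holds X^k / k!
        result = result + term
    return result

# Pure rotation by 90 degrees about z, pure translation, and a general element.
T_rot = exp_se3(np.array([0.0, 0.0, np.pi / 2, 0.0, 0.0, 0.0]))
T_trans = exp_se3(np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0]))
T = exp_se3(np.array([0.3, -0.2, 0.5, 1.0, 2.0, 3.0]))
```

The result always has an orthonormal rotation block and the bottom row [0, 0, 0, 1], up to truncation and rounding error.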

Note that SE(3) is not commutative. The adjoint action relates things in different orders:

9 \displaystyle{\begin{align}\mathbf T \exp(\boldsymbol{\delta}) = \exp\left(\operatorname{Ad}(\mathbf T) \boldsymbol{\delta}\right) \mathbf T.\end{align}}

The adjoint is a 6\times 6 matrix:

10 \displaystyle{\begin{align}\operatorname{Ad}(\mathbf T) = \begin{bmatrix}
\mathbf R & \mathbf 0_{3\times 3}\\
[\mathbf t]_\times \mathbf R & \mathbf R
\end{bmatrix}.\end{align}}
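The adjoint identity can be checked numerically. The sketch below assumes the rotation-first ordering of \boldsymbol{\xi} and uses a truncated Taylor series for the exponential; the helper names and the test values are hypothetical.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(xi, terms=30):
    """Truncated Taylor series of the matrix exponential on se(3)."""
    X = np.zeros((4, 4))
    X[:3, :3] = skew(xi[:3])
    X[:3, 3] = xi[3:]
    result, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result

def adjoint(T):
    """6x6 adjoint [[R, 0], [[t]_x R, R]] for the (omega, tau) ordering."""
    R, t = T[:3, :3], T[:3, 3]
    Ad = np.zeros((6, 6))
    Ad[:3, :3] = R
    Ad[3:, :3] = skew(t) @ R
    Ad[3:, 3:] = R
    return Ad

T = exp_se3(np.array([0.3, -0.2, 0.5, 1.0, 2.0, 3.0]))
delta = np.array([0.1, 0.2, -0.1, 0.05, -0.3, 0.2])

lhs = T @ exp_se3(delta)                    # right-multiplied perturbation
rhs = exp_se3(adjoint(T) @ delta) @ T       # same element, perturbation moved to the left
```

Moving the perturbation across \mathbf T without the adjoint gives a different element, which is the non-commutativity mentioned above.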

1.4 Notation summary

Here we define and summarise some notation for the following sections.

Description | Notation
--- | ---
skew-symmetric cross product matrix \begin{bmatrix}0 & -t_3 & t_2\\t_3 & 0 & -t_1\\-t_2 & t_1 & 0\end{bmatrix} | [\mathbf t]_\times
translational part of \mathbf T \in SE(3) | \mathbf t(\mathbf T)
rotational part of \mathbf T \in SE(3) | \mathbf R(\mathbf T)
translational part of \boldsymbol{\xi} \in \mathfrak{se}(3) | \boldsymbol{\tau}(\boldsymbol{\xi})
rotational part of \boldsymbol{\xi} \in \mathfrak{se}(3) | \boldsymbol{\omega}(\boldsymbol{\xi})
compose \mathbf T_1, \mathbf T_2 \in SE(3) | \mathbf T_1 \mathbf T_2
exp of \boldsymbol{\xi} \in \mathfrak{se}(3) | \exp(\boldsymbol{\xi})
log of \mathbf T \in SE(3) | \log(\mathbf T)
exp of \boldsymbol{\omega} \in \mathfrak{so}(3) | \exp(\boldsymbol{\omega})
log of \mathbf R \in SO(3) | \log(\mathbf R)
inverse of \boldsymbol{\xi} \in \mathfrak{se}(3) | -\boldsymbol{\xi}
inverse of \mathbf T \in SE(3) | \mathbf T^{-1}
inverse of \boldsymbol{\omega} \in \mathfrak{so}(3) | -\boldsymbol{\omega}
inverse of \mathbf R \in SO(3) | \mathbf R^T
adjoint of \mathbf T \in SE(3) | \operatorname{Ad}(\mathbf T)
ith element of \lbrace \boldsymbol{\xi}\vert \boldsymbol{\xi} \in \mathfrak{se}(3)\rbrace | \boldsymbol{\xi}_i
ith element of \lbrace \mathbf T \vert \mathbf T \in SE(3)\rbrace | \mathbf T_i
transform \mathbf p \in \mathbb R^3 by \mathbf T \in SE(3) | \mathbf T \mathbf p
transform \mathbf p \in \mathbb R^3 by \boldsymbol{\xi} \in \mathfrak{se}(3) | \exp(\boldsymbol{\xi}) \mathbf p

Table 1 Summary of notation and implementation. For the array ones, operations are applied elementwise.

2 Derivatives

Here, we only differentiate with respect to \boldsymbol{\delta} \in \mathfrak{se}(3) around the point \boldsymbol{\delta} = \mathbf 0. In other words, we have a function \mathbf F(\mathbf T) where \mathbf T \in SE(3) and we would like to perturb \mathbf T by a very small \boldsymbol{\delta}.

Suppose \boldsymbol{\delta} = \begin{bmatrix}\boldsymbol{\omega} & \boldsymbol{\tau}\end{bmatrix}^T. If \mathbf F is of dimension n, then the resulting derivative is an n\times 6 Jacobian.

11 \displaystyle{\begin{align}\mathbf J_{n\times 6} \equiv \left.\frac{\partial \mathbf F(\mathbf T \oplus \boldsymbol{\delta})}{\partial \boldsymbol{\delta}}\right]_{\boldsymbol{\delta} = \mathbf 0} &=
\begin{bmatrix}
\frac{\partial F_1}{\partial \omega_1} &
\frac{\partial F_1}{\partial \omega_2} &
\frac{\partial F_1}{\partial \omega_3} &
\frac{\partial F_1}{\partial \tau_1} &
\frac{\partial F_1}{\partial \tau_2} &
\frac{\partial F_1}{\partial \tau_3} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
\frac{\partial F_n}{\partial \omega_1} &
\frac{\partial F_n}{\partial \omega_2} &
\frac{\partial F_n}{\partial \omega_3} &
\frac{\partial F_n}{\partial \tau_1} &
\frac{\partial F_n}{\partial \tau_2} &
\frac{\partial F_n}{\partial \tau_3}
\end{bmatrix}\end{align}}

where \mathbf F(\mathbf T \oplus \boldsymbol{\delta}) = \mathbf F(\exp(\boldsymbol{\delta})\mathbf T) in the case of left-perturbations and \mathbf F(\mathbf T\exp(\boldsymbol{\delta})) in the case of right-perturbations.

The adjoint action can be used to relate the left and right derivatives by using the chain rule.

12 \displaystyle{\begin{align}\frac{\partial \mathbf F(\mathbf A \exp(\boldsymbol{\delta}) \mathbf B)}{\partial \boldsymbol{\delta}}
&= \frac{\partial \mathbf F(\exp(\operatorname{Ad}(\mathbf A) \boldsymbol{\delta}) \mathbf{AB})}{\partial \boldsymbol{\delta}}\\
&= \frac{\partial}{\partial \boldsymbol{\delta}} \mathbf F(\exp(\underbrace{\operatorname{Ad}(\mathbf A) \boldsymbol{\delta}}_{\boldsymbol{\epsilon}}) \mathbf{AB})\\
&= \frac{\partial}{\partial \boldsymbol{\epsilon}} \mathbf F(\exp(\boldsymbol{\epsilon}) \mathbf{AB}) \frac{\partial \boldsymbol{\epsilon}}{\partial \boldsymbol{\delta}}\\
&= \frac{\partial}{\partial \boldsymbol{\epsilon}} \mathbf F(\exp(\boldsymbol{\epsilon}) \mathbf{AB}) \operatorname{Ad}(\mathbf A).\end{align}}

We also have a derivative of the log function:

13 \displaystyle{\begin{align}\mathbf D_\text{log}(\mathbf T) &\equiv \left.\frac{\partial}{\partial \boldsymbol{\delta}}\right]_{\boldsymbol{\delta} = \mathbf 0_{6\times 1}}\log(\exp(\boldsymbol{\delta}) \mathbf T)\\
&= \begin{bmatrix}
\mathbf D_\text{log}(\boldsymbol{\omega}) &
\mathbf 0_{3\times 3} \\
-\mathbf D_\text{log}(\boldsymbol{\omega}) \mathbf B \mathbf D_\text{log}(\boldsymbol{\omega}) &
\mathbf D_\text{log}(\boldsymbol{\omega})
\end{bmatrix}\end{align}}


14 \displaystyle{\begin{align}\mathbf D_\text{log}(\boldsymbol{\omega}) &= \mathbf I - \frac{1}{2} [\boldsymbol{\omega}]_\times + e_\theta [\boldsymbol{\omega}]_\times^2\\
\mathbf B &\equiv b_\theta [ \mathbf u]_\times + c_\theta (\boldsymbol{\omega} \mathbf u^T + \mathbf u \boldsymbol{\omega}^T) + (\boldsymbol{\omega}^T \mathbf u) \cdot \mathbf W(\boldsymbol{\omega})\\
\theta &= \|\boldsymbol{\omega}\|\\
a_\theta &= \frac{\sin\theta}{\theta}\\
b_\theta &= \frac{1-\cos\theta}{\theta^2}\\
c_\theta &= \frac{1-a_\theta}{\theta^2}\\
e_\theta &= \frac{b_\theta - 2c_\theta}{2a_\theta}\end{align}}

The exact derivation is in Ethan Eade’s report eade.
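As a sanity check on the rotational block \mathbf D_\text{log}(\boldsymbol{\omega}), the sketch below evaluates it from the coefficients above and multiplies it by the standard closed-form left Jacobian of SO(3), \mathbf I + b_\theta [\boldsymbol{\omega}]_\times + c_\theta [\boldsymbol{\omega}]_\times^2. That left Jacobian is not defined in this document and is brought in here only as an assumption for the check; the product should be the identity. The function names are illustrative, and the formulas as written require \theta \neq 0.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def coefficients(w):
    """a, b, c, e coefficients from the text; valid for theta != 0."""
    theta = np.linalg.norm(w)
    a = np.sin(theta) / theta
    b = (1.0 - np.cos(theta)) / theta**2
    c = (1.0 - a) / theta**2
    e = (b - 2.0 * c) / (2.0 * a)
    return a, b, c, e

def dlog_so3(w):
    """D_log(omega) = I - [w]_x / 2 + e [w]_x^2."""
    _, _, _, e = coefficients(w)
    W = skew(w)
    return np.eye(3) - 0.5 * W + e * (W @ W)

def left_jacobian_so3(w):
    """Assumed standard form: I + b [w]_x + c [w]_x^2."""
    _, b, c, _ = coefficients(w)
    W = skew(w)
    return np.eye(3) + b * W + c * (W @ W)

omega = np.array([0.3, -0.2, 0.5])
product = dlog_so3(omega) @ left_jacobian_so3(omega)
```

Near \theta = 0 the coefficients should be replaced by their series expansions to avoid division by zero; that case is left out of this sketch.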

Let \mathbf T \in SE(3), \mathbf p \in \mathbb R^3, then the following table lists some useful derivatives.

As you will see later, these are basically all the derivatives you need, and all other derivatives can be easily derived from these, often together with using the adjoint.

\mathbf F(\boldsymbol{\delta}) | \left.\frac{\partial \mathbf F}{\partial \boldsymbol{\delta}} \right]_{\boldsymbol{\delta} = \mathbf 0}
--- | ---
\exp(\boldsymbol{\delta}) \mathbf T \mathbf p | \begin{bmatrix}-[\mathbf T \mathbf p]_\times & \mathbf I\end{bmatrix}
\mathbf T \exp(\boldsymbol{\delta}) \mathbf p | \mathbf R(\mathbf T) \begin{bmatrix}-[\mathbf p]_\times & \mathbf I\end{bmatrix}
\log(\exp(\boldsymbol{\delta}) \mathbf T) | \mathbf D_\text{log} (\mathbf T)

Table 2 Summary of derivatives
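The two point-transformation rows of Table 2 can be verified with central finite differences. This is a sketch assuming the rotation-first ordering of \boldsymbol{\delta}; the helper names, the step size and the test pose are hypothetical.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(xi, terms=30):
    """Truncated Taylor series of the matrix exponential on se(3)."""
    X = np.zeros((4, 4))
    X[:3, :3] = skew(xi[:3])
    X[:3, 3] = xi[3:]
    result, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result

def numerical_jacobian(f, eps=1e-6):
    """Central differences of a 3-vector function f(delta) around delta = 0."""
    J = np.zeros((3, 6))
    for k in range(6):
        d = np.zeros(6)
        d[k] = eps
        J[:, k] = (f(d) - f(-d)) / (2.0 * eps)
    return J

T = exp_se3(np.array([0.3, -0.2, 0.5, 1.0, 2.0, 3.0]))
R = T[:3, :3]
p = np.array([0.4, -1.0, 2.0])
p_h = np.append(p, 1.0)
Tp = (T @ p_h)[:3]

# Analytic Jacobians from Table 2.
J_left = np.hstack([-skew(Tp), np.eye(3)])
J_right = R @ np.hstack([-skew(p), np.eye(3)])

# Numerical Jacobians for left and right perturbations.
J_left_num = numerical_jacobian(lambda d: (exp_se3(d) @ T @ p_h)[:3])
J_right_num = numerical_jacobian(lambda d: (T @ exp_se3(d) @ p_h)[:3])
```

The same finite-difference helper is a convenient unit test for any hand-derived Jacobian in the later sections.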

3 Optimisation

Suppose we have a trajectory \mathbf S(t) \in SE(3) that is a smooth curve. We would like to minimise some objective function:

15 \displaystyle{\begin{align}\operatorname{cost}(\mathbf S) = \mathbf F(\mathbf S)^T \mathbf F(\mathbf S).\end{align}}

We seek to reduce the cost as much as possible. This is a nonlinear least squares problem. During the optimisation process, we perturb it by a small perturbation \boldsymbol{\delta}(t) \in \mathfrak{se}(3):

16 \displaystyle{\begin{align}\mathbf S_{\text{new}}(t) = \mathbf S(t) \oplus \boldsymbol{\delta}(t)\end{align}}

where the \oplus operator can either be the left-update:

17 \displaystyle{\begin{align}\mathbf S \oplus \boldsymbol{\delta} \equiv \exp(\boldsymbol{\delta}) \mathbf S\end{align}}

or the right-update:

18 \displaystyle{\begin{align}\mathbf S \oplus \boldsymbol{\delta} \equiv \mathbf S \exp(\boldsymbol{\delta}).\end{align}}

Both are valid; which one works better depends on the numerical properties of the problem.

In any case, we linearise the problem around \boldsymbol{\delta} = \mathbf 0.

Let the Jacobian matrix be:

19 \displaystyle{\begin{align}\mathbf J = \left. \frac{\partial}{\partial \boldsymbol{\delta}} \mathbf F(\mathbf S \oplus \boldsymbol{\delta}) \right]_{\boldsymbol{\delta} = 0}.\end{align}}

Then we can solve \boldsymbol{\delta} using Gauss-Newton:

20 \displaystyle{\begin{align}\mathbf J^T \mathbf J \boldsymbol{\delta}_\text{GN} = -\mathbf J^T \mathbf F\end{align}}

or steepest descent:

21 \displaystyle{\begin{align}\boldsymbol{\delta}_\text{s} = -\mathbf J^T \mathbf F\end{align}}

or more sophisticated algorithms. In practice, we use Powell’s Dog Leg. The update is

22 \displaystyle{\begin{align}\boldsymbol{\delta} = c_\text{GN} \boldsymbol{\delta}_\text{GN} + c_\text{s} \boldsymbol{\delta}_\text{s}\end{align}}

where scalar weights c_\text{GN}, c_\text{s} are chosen such that \|\boldsymbol{\delta} \| \leq \Delta where the scalar parameter \Delta is the radius of the trust region. When \Delta is small, the problem is behaving badly and the quadratic approximation that Gauss-Newton uses is not very valid. In this case, c_\text{GN} is zero, allowing the optimiser to take timid steps along the steepest descent direction. When \Delta is large, the quadratic approximation is good and we take bigger steps in the Gauss-Newton direction.

Heuristics are used to determine when to increase \Delta and when to decrease it.
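The following is a sketch of one textbook form of the dog leg step for a fixed trust-region radius, not necessarily the exact variant used here. It assumes \mathbf J^T \mathbf J is invertible, scales the steepest descent direction to the Cauchy point, and returns a combination of the two directions whose norm does not exceed the radius. The function name and the small test problem are hypothetical.

```python
import numpy as np

def dogleg_step(J, F, radius):
    """Combine Gauss-Newton and steepest descent steps inside a trust region."""
    g = J.T @ F                          # gradient of 0.5 * F^T F
    H = J.T @ J                          # Gauss-Newton approximation of the Hessian
    d_gn = np.linalg.solve(H, -g)
    if np.linalg.norm(d_gn) <= radius:
        return d_gn                      # full Gauss-Newton step fits
    # Cauchy point: minimiser of the quadratic model along -g.
    alpha = (g @ g) / (g @ H @ g)
    d_sd = -alpha * g
    if np.linalg.norm(d_sd) >= radius:
        return radius * d_sd / np.linalg.norm(d_sd)   # pure steepest descent
    # Otherwise walk from d_sd towards d_gn until the boundary is reached.
    diff = d_gn - d_sd
    a = diff @ diff
    b = 2.0 * (d_sd @ diff)
    c = d_sd @ d_sd - radius**2
    s = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return d_sd + s * diff

J = np.array([[1.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
F = np.array([1.0, 2.0, 3.0])

step_small = dogleg_step(J, F, radius=0.5)
step_mid = dogleg_step(J, F, radius=1.5)
step_large = dogleg_step(J, F, radius=10.0)
```

A small radius yields a step along the steepest descent direction, a large radius yields the Gauss-Newton step, and intermediate radii blend the two.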

3.1 Optimisation under uncertainty

When uncertainty is present, the data is associated with some covariance matrix \boldsymbol{\Sigma} which is an n\times n matrix where n is the size of the residual. The Gauss-Newton update then becomes:

23 \displaystyle{\begin{align}\mathbf J^T \boldsymbol{\Sigma}^{-1} \mathbf J \boldsymbol{\delta} = -\mathbf J^T \boldsymbol{\Sigma}^{-1} \mathbf F\end{align}}

The matrix \boldsymbol{\Sigma}^{-1} is also sometimes called the information matrix. In practice, it is useful to factor this matrix, for example, by taking the matrix square root. Let \mathbf W = \boldsymbol{\Sigma}^{-\frac{1}{2}}, then,

24 \displaystyle{\begin{align}(\mathbf W\mathbf J)^T (\mathbf W\mathbf J) \boldsymbol{\delta} = -(\mathbf W \mathbf J)^T (\mathbf W \mathbf F).\end{align}}

In other words, we just pre-multiply the Jacobian and residual by a weight matrix. This is a form of whitening. In the common case where \boldsymbol{\Sigma} is a diagonal matrix, we simply divide each row of the Jacobian and the residual by the standard deviation.

Another common factorisation is the eigendecomposition of the covariance matrix \boldsymbol{\Sigma}:

25 \displaystyle{\begin{align}\boldsymbol{\Sigma} = \mathbf Q \boldsymbol{\Lambda} \mathbf Q^T\end{align}}

where \mathbf Q \in SO(n) is the square matrix whose columns are the orthonormal eigenvectors of \boldsymbol{\Sigma} and \boldsymbol{\Lambda} is the diagonal matrix whose entries are the eigenvalues. Then,

26 \displaystyle{\begin{align}\mathbf W = \boldsymbol{\Lambda}^{-\frac{1}{2}} \mathbf Q^T.\end{align}}

This is useful, for example, in the case of surfel matches. Recall that the covariance matrix is always symmetric positive semidefinite, allowing for easy eigendecomposition.
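A short numerical sketch of the whitening step, assuming a positive definite covariance so that the inverse exists. The Jacobian, residual and covariance are random placeholders; `np.linalg.eigh` is used because the covariance is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(3, 6))       # placeholder Jacobian
F = rng.normal(size=3)            # placeholder residual

M = rng.normal(size=(3, 3))
Sigma = M @ M.T + np.eye(3)       # symmetric positive definite covariance

# Eigendecomposition Sigma = Q diag(eigvals) Q^T, then W = diag(eigvals)^(-1/2) Q^T.
eigvals, Q = np.linalg.eigh(Sigma)
W = np.diag(eigvals ** -0.5) @ Q.T

# Weighted normal equations versus the direct information-matrix form.
lhs_weighted = (W @ J).T @ (W @ J)
lhs_direct = J.T @ np.linalg.inv(Sigma) @ J
rhs_weighted = -(W @ J).T @ (W @ F)
rhs_direct = -J.T @ np.linalg.inv(Sigma) @ F

# Diagonal covariance: whitening is a row-wise division by the standard deviation.
sigma = np.array([0.5, 2.0, 4.0])
W_diag = np.diag(1.0 / sigma)
J_scaled = J / sigma[:, None]
```

Any \mathbf W with \mathbf W^T \mathbf W = \boldsymbol{\Sigma}^{-1} gives the same normal equations, so the choice of factorisation is a matter of convenience.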

3.2 Robust loss functions

A least squares problem can be thought of as minimising the negative log likelihood function of a Gaussian. The likelihood function is of the form

27 \displaystyle{\begin{align}\mathcal L = \prod \exp(-f_i^2)\end{align}}

Then the negative log likelihood has the least squares form:

28 \displaystyle{\begin{align}\operatorname{cost} = -\log \mathcal L = \sum f_i^2.\end{align}}

However, in many cases, the random variable is not Gaussian. The Gaussian has very thin tails, so it assigns a vanishingly small probability to outliers. In real life, outliers are common, which calls for a thicker-tailed distribution. To model such situations, we need robust loss functions.

Instead of optimising \sum f_i^2, we optimise:

29 \displaystyle{\begin{align}\operatorname{cost} = \sum \rho(f_i)^2\end{align}}

where \rho is a robust loss function. We can then weigh each row or block of the Jacobian \mathbf J and the residual \mathbf F with a robust reweighting factor:

30 \displaystyle{\begin{align}r_i \equiv \frac{\partial \rho(f_i)}{\partial f_i}.\end{align}}

Loss | \rho(x)
--- | ---
TrivialLoss | \rho(x) = x
CauchyLoss | \rho(x) = \log(1 + x)
HuberLoss | \rho(x) = \begin{cases} x & x < k^2\\2 k\sqrt{x} - k^2 & x \geq k^2\end{cases}

Table 3 Some loss functions for x = |f|.
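The loss functions in Table 3 can be written down directly. This sketch assumes the Huber threshold is k^2 in both branches, which is the choice that makes the function continuous; the function names are illustrative.

```python
import math

def trivial_loss(x):
    """rho(x) = x, i.e. ordinary least squares."""
    return x

def cauchy_loss(x):
    """rho(x) = log(1 + x), grows logarithmically for large x."""
    return math.log(1.0 + x)

def huber_loss(x, k):
    """rho(x) = x below k^2 and 2 k sqrt(x) - k^2 above it (assumed threshold)."""
    if x < k * k:
        return x
    return 2.0 * k * math.sqrt(x) - k * k

k = 1.5
at_threshold = huber_loss(k * k, k)     # both branches agree here
far_outlier = huber_loss(100.0, k)      # much smaller than the trivial loss
```

For large x both robust losses grow much more slowly than the trivial loss, which is what limits the influence of outliers.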

4 Trajectory representation

The goal is to recover the trajectory of a vehicle as a parametric curve \mathbf S(t) \in SE(3) as a function of time t.

We may assume the curve does not oscillate faster than some frequency (say, 10 Hz).

Now we should find a trajectory representation that satisfies the following properties:

Our solution to satisfy these requirements is a piecewise linear trajectory. The trajectory is represented as a sequence of elements \mathbf S_i that represent \mathbf S(t_i) for some t_i sampled with even spacing at a high frequency (say, 100 Hz). To evaluate \mathbf S(t), we use a geodesic interpolation for t_i \leq t < t_{i + 1}.

31 \displaystyle{\begin{align}\mathbf S(t) = \exp\left(\frac{t - t_i}{t_{i + 1} - t_i} \log(\mathbf S_{i + 1} \mathbf S_i^{-1})\right) \mathbf S_i.\end{align}}

To perturb this spline with a curve \boldsymbol{\delta}(t), we update each \mathbf S_i, like so:

32 \displaystyle{\begin{align}\mathbf S_{i, \text{new}} = \mathbf S_i \oplus \boldsymbol{\delta}(t_i)\end{align}}

Alternative approaches for parameterising trajectories include:

5 Parameterisation of the perturbation

When applying perturbations, the curve \boldsymbol{\delta} is a smooth curve which may be thought of as a vector of infinite dimension.

To ensure that the problem is computationally tractable and that \mathbf S_{\text{new}} remains twice-differentiable, we parameterise the perturbation \boldsymbol{\delta} by a finite vector \boldsymbol{\xi}. The vector \boldsymbol{\xi} is the concatenation of many six-dimensional vectors \boldsymbol{\xi}_i \in \mathfrak{se}(3), such that

33 \displaystyle{\begin{align}\boldsymbol{\delta}(t) = \sum_i^n \boldsymbol{\xi}_i \beta((t - i)\Delta t).\end{align}}

Notice that, since \boldsymbol{\delta}(t) only applies small local perturbations, it is possible to add together the \boldsymbol{\xi}_i treating them as vectors in \mathbb R^6. The scalar-valued function \beta is a basis function with compact support, which means that it is nonzero for a finite contiguous segment of t and zero everywhere else. A good choice is the piecewise polynomial for a cardinal cubic B-spline:

34 \displaystyle{\begin{align}\beta(t) = \begin{cases}
\frac{1}{6}t^3                         & t \in [0,1]\\
\frac{1}{6}\left(-3t^3 + 12t^2 - 12t+4 \right)& t \in [1,2]\\
\frac{1}{6}\left(3t^3 - 24t^2 +60t-44 \right)  & t \in [2,3]\\
\frac{1}{6}\left(-t^3 + 12t^2 -48t+64 \right)  & t \in [3,4]\\
0 & t \notin [0,4]
\end{cases}\end{align}}

We can now redefine the Jacobian to be with respect to the parameters:

35 \displaystyle{\begin{align}\mathbf J = \left. \frac{\partial}{\partial \boldsymbol{\xi}} \mathbf F(\mathbf S \oplus \boldsymbol{\delta}) \right|_{\boldsymbol{\delta} = 0}.\end{align}}

Since \beta does not depend on \boldsymbol{\xi},

36 \displaystyle{\begin{align}\frac{\partial}{\partial \boldsymbol{\xi}_i} = \beta((t - i)\Delta t) \frac{\partial}{\partial \boldsymbol{\delta}(t)}.\end{align}}

The elements \boldsymbol{\xi}_i \in \mathfrak{se}(3) are known as the spline knots. Since each knot is a 6\times 1 vector, the number of columns of \mathbf J is 6 \times n_\text{knots} where n_\text{knots} is the number of spline knots.

As you can see, the derivative of the trajectory at any point t in time is a linear combination of the derivatives of up to four spline knots.
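The compact support of the basis function can be seen in a few lines of code. This sketch assumes unit knot spacing, so that knot i contributes the weight \beta(t - i); the number of knots and the query time are hypothetical.

```python
def beta(t):
    """Cardinal cubic B-spline basis on [0, 4], zero elsewhere."""
    if 0.0 <= t < 1.0:
        return t**3 / 6.0
    if 1.0 <= t < 2.0:
        return (-3.0 * t**3 + 12.0 * t**2 - 12.0 * t + 4.0) / 6.0
    if 2.0 <= t < 3.0:
        return (3.0 * t**3 - 24.0 * t**2 + 60.0 * t - 44.0) / 6.0
    if 3.0 <= t <= 4.0:
        return (-t**3 + 12.0 * t**2 - 48.0 * t + 64.0) / 6.0
    return 0.0

n_knots = 10
t_query = 5.3

# Weight of each knot at the query time; most of them are exactly zero.
weights = [beta(t_query - i) for i in range(n_knots)]
active = [i for i, w in enumerate(weights) if w > 0.0]
```

At this query time only knots 2 to 5 are active and their weights sum to one, so a Jacobian block with respect to \boldsymbol{\delta}(t) is spread over four knot blocks with these scalar weights.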

In the next sections we will compute derivatives

37 \displaystyle{\begin{align}\frac{\partial}{\partial \boldsymbol{\delta}(t)}\end{align}}

knowing that each of these blocks will contribute up to four blocks, weighted with scalar weights \beta, to the actual Jacobian where we are optimizing the spline knots \boldsymbol{\xi}.

6 Constraints

The function \mathbf F is known as the residual. It consists of many constraints:

38 \displaystyle{\begin{align}\mathbf F = \begin{bmatrix}
\mathbf F_\text{position}\\
\mathbf F_\text{loop}\\
\mathbf F_\text{gravity}\\
\vdots
\end{bmatrix}.\end{align}}

For people familiar with the Ceres library, each constraint is a residual block.

6.1 Position constraint

The position constraint seeks to penalise the distance between the pose’s translational component and a 3D point.

For example, the 3D point may be the GPS position \mathbf p(t) measured at time t.

6.1.1 Residual

The residual is a 3\times 1 vector:

39 \displaystyle{\begin{align}\mathbf F_\text{position} = \mathbf t(\mathbf T(t)) - \mathbf p(t)\end{align}}

Recall from the notation section that \mathbf t(\mathbf T) is the translational component of the SE(3) element \mathbf T.

6.1.2 Left Jacobian

The left Jacobian is 3 \times 6:

40 \displaystyle{\begin{align}\mathbf J_\text{left}
&\equiv \frac{\partial \mathbf F_\text{position}}{\partial \boldsymbol{\delta}}
= \frac{\partial \mathbf t(\exp(\boldsymbol{\delta}) \mathbf T(t))}{\partial \boldsymbol{\delta}}\\
&= \begin{bmatrix}
-[\mathbf t(\mathbf T(t))]_\times & \mathbf I_{3 \times 3}
\end{bmatrix}\end{align}}

The trick is to view \mathbf t(\mathbf T) = \mathbf T \mathbf p where \mathbf p = \mathbf 0. Then we can apply the Jacobian for transforming a point in the derivatives table.

6.1.3 Right Jacobian

The right Jacobian is 3 \times 6:

41 \displaystyle{\begin{align}\mathbf J_\text{right}
&\equiv \frac{\partial \mathbf F_\text{position}}{\partial \boldsymbol{\delta}}
= \frac{\partial \mathbf t(\mathbf T(t)\exp(\boldsymbol{\delta}))}{\partial \boldsymbol{\delta}}\\
&= \begin{bmatrix}
\mathbf 0_{3 \times 3} & \mathbf R(\mathbf T(t))
\end{bmatrix}\end{align}}

6.2 Loop closure constraint

Suppose that we have aligned the poses from two points in time along a trajectory, e.g. using ICP.

This produces a relative transformation \mathbf A.

6.2.1 Residual

The residual is a 6\times 1 vector:

42 \displaystyle{\begin{align}\mathbf F_\text{loop closure}
= \log\left(
\underbrace{\mathbf T(t_1)^{-1} \mathbf T(t_2)}_\text{trajectory} \mathbf A
\right)\end{align}}

6.2.2 Left Jacobians

43 \displaystyle{\begin{align}\mathbf J_{\text{left}, t_1}
&\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \boldsymbol{\delta}(t_1)}\\
&= \frac{\partial \log\left(
(\exp(\boldsymbol{\delta}) \mathbf T(t_1))^{-1} \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= \frac{\partial \log\left(
\mathbf T(t_1)^{-1} \exp(-\boldsymbol{\delta}) \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= -
\frac{\partial \log\left(
\mathbf T(t_1)^{-1} \exp(\boldsymbol{\delta}) \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= -
\frac{\partial \log\left(
\exp(\operatorname{Ad}(\mathbf T(t_1)^{-1}) \boldsymbol{\delta}) \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= -
\frac{\partial \log\left(
\exp(\boldsymbol{\epsilon}) \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\epsilon}}\operatorname{Ad}(\mathbf T(t_1)^{-1}) \\
&= -
\mathbf D_\text{log}\left(
    \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)
\operatorname{Ad}(\mathbf T(t_1)^{-1})\end{align}}
44 \displaystyle{\begin{align}\mathbf J_{\text{left}, t_2}
&\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \boldsymbol{\delta}(t_2)}\\
&= \frac{\partial \log\left(
\mathbf T(t_1)^{-1} \exp(\boldsymbol{\delta}) \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= \mathbf D_\text{log}\left(
    \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)
\operatorname{Ad}(\mathbf T(t_1)^{-1})\end{align}}

6.2.3 Right Jacobians

45 \displaystyle{\begin{align}\mathbf J_{\text{right}, t_1}
&\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \boldsymbol{\delta}(t_1)}\\
&= \frac{\partial \log\left(
(\mathbf T(t_1) \exp(\boldsymbol{\delta}))^{-1} \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= \frac{\partial \log\left(
\exp(-\boldsymbol{\delta}) \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= -
\mathbf D_\text{log}\left(
    \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)\end{align}}
46 \displaystyle{\begin{align}\mathbf J_{\text{right}, t_2}
&\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \boldsymbol{\delta}(t_2)}\\
&= \frac{\partial \log\left(
\mathbf T(t_1)^{-1} \mathbf T(t_2) \exp(\boldsymbol{\delta}) \mathbf A
\right)}{\partial \boldsymbol{\delta}}\\
&= \mathbf D_\text{log}\left(
    \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A
\right)
\operatorname{Ad}(\mathbf T(t_1)^{-1}\mathbf T(t_2))\end{align}}

6.3 Gravity constraint

The gravity constraint seeks to penalise the distance between the up vector of the car \mathbf R(\mathbf T(t)) \mathbf u and the true up vector \mathbf u_\text{true} = [0, 0, 1]^T.

For example, if the sensor were mounted perfectly upright, then \mathbf u = \mathbf u_\text{true}.

We can also use the accelerometer reading to obtain a different \mathbf u at each point in time, provided that the inertial acceleration is subtracted first.

6.3.1 Residual

The residual is a 3 \times 1 vector:

47 \displaystyle{\begin{align}\mathbf F_\text{gravity} = \left(\mathbf R(\mathbf T(t)) \mathbf u - \mathbf u_\text{true}\right)\end{align}}

6.3.2 Left Jacobian

Since the gravity constraint only depends on rotation and not on translation, we only care about the derivative with respect to \boldsymbol{\omega}(\boldsymbol{\delta}), which we hereafter write as \boldsymbol{\omega}.

48 \displaystyle{\begin{align}\mathbf J_\text{left} \equiv \frac{\partial \mathbf F_\text{gravity}}{\partial \boldsymbol{\omega}}
&=\frac{\partial \exp(\boldsymbol{\omega}) \mathbf R(\mathbf T(t))\mathbf u}{\partial \boldsymbol{\omega}}\\
&= -[\mathbf R(\mathbf T(t)) \mathbf u]_\times\end{align}}

6.3.3 Right Jacobian

49 \displaystyle{\begin{align}\mathbf J_\text{right} \equiv \frac{\partial \mathbf F_\text{gravity}}{\partial \boldsymbol{\omega}}
&= \frac{\partial \mathbf R(\mathbf T(t)) \exp(\boldsymbol{\omega})\mathbf u}{\partial \boldsymbol{\omega}}\\
&= -\mathbf R(\mathbf T(t)) [\mathbf u]_\times\end{align}}

Recall that \mathbf R(\mathbf T) is the rotational component of the pose \mathbf T.
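Both gravity Jacobians can be checked against finite differences using only the rotational part of the pose. This is a sketch with a hypothetical orientation and a truncated Taylor series for the exponential on \mathfrak{so}(3); the helper names are illustrative.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w, terms=30):
    """Truncated Taylor series of the matrix exponential on so(3)."""
    W = skew(w)
    result, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ W / k
        result = result + term
    return result

def numerical_jacobian(f, eps=1e-6):
    """Central differences of a 3-vector function f(omega) around omega = 0."""
    J = np.zeros((3, 3))
    for k in range(3):
        d = np.zeros(3)
        d[k] = eps
        J[:, k] = (f(d) - f(-d)) / (2.0 * eps)
    return J

R = exp_so3(np.array([0.3, -0.2, 0.5]))     # hypothetical vehicle orientation
u = np.array([0.0, 0.0, 1.0])               # up vector in the sensor frame
u_true = np.array([0.0, 0.0, 1.0])

def residual(rotation):
    return rotation @ u - u_true

# Analytic Jacobians from the text.
J_left = -skew(R @ u)
J_right = -R @ skew(u)

# Numerical Jacobians for left and right rotational perturbations.
J_left_num = numerical_jacobian(lambda w: residual(exp_so3(w) @ R))
J_right_num = numerical_jacobian(lambda w: residual(R @ exp_so3(w)))
```

In the full 3 \times 6 Jacobian these blocks occupy the rotational columns, and the translational columns are zero.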

6.4 Point constraint

We are aligning a “moving” or “map” point \mathbf m to a “static” or “scene” point \mathbf s by transforming the moving point with the pose \mathbf T.

The matrix \mathbf A can be used to store the uncertainty of the point \mathbf s, i.e. an information matrix (the inverse of a covariance matrix).

6.4.1 Residual

The residual is a k\times 1 vector, where \mathbf A is k\times 3:

50 \displaystyle{\begin{align}\mathbf F_\text{point} = \mathbf A (\mathbf T \mathbf m - \mathbf s).\end{align}}

6.4.2 Left Jacobian

The left Jacobian is k\times 6, where \mathbf A is k\times 3.

51 \displaystyle{\begin{align}\mathbf J_\text{left} &= \mathbf A \begin{bmatrix}
-[\mathbf T \mathbf m]_\times & \mathbf I
\end{bmatrix}\end{align}}

6.4.3 Right Jacobian

The right Jacobian is k\times 6, where \mathbf A is k\times 3.

52 \displaystyle{\begin{align}\mathbf J_\text{right} &= \mathbf A \mathbf R(\mathbf T)\begin{bmatrix}
-[\mathbf m]_\times & \mathbf I
\end{bmatrix}\end{align}}

Recall that \mathbf R(\mathbf T) is the rotational component of the pose \mathbf T.

6.5 Velocity constraint

The velocity constraint seeks to penalise deviations in vehicle velocity from the 6 degree of freedom velocity estimates from another source.

For ease of implementation, the vehicle velocity is obtained by numerical differentiation, e.g. by evaluating the pose at times t_1 and t_2. Let h = 1 / (t_2 - t_1).

6.5.1 Residual

The residual is a 6\times 1 vector. For concise notation let us define the 4\times 4 matrix \boldsymbol{\Delta} \equiv \mathbf T(t_2)\mathbf T(t_1)^{-1} \in SE(3), such that

53 \displaystyle{\begin{align}\mathbf F_\text{velocity} &= h \log\left(\boldsymbol{\Delta}\right) - \boldsymbol{\xi}_\text{velocity}\\
&= h \log\left(
    \mathbf T(t_2)
    \mathbf T(t_1)^{-1}
\right) - \boldsymbol{\xi}_\text{velocity}\end{align}}

6.5.2 Left Jacobian

Consider differentiating with respect to left-updates of \mathbf T(t_1).

The Jacobian is 6\times 6.

54 \displaystyle{\begin{align}\mathbf J_\text{left}(t_1)
\equiv \frac{\partial \mathbf F_\text{velocity}}{\partial \boldsymbol{\delta}(t_1)}
&= \frac{\partial}{\partial \boldsymbol{\delta}}
h \log\left(
    \mathbf T(t_2)
    \left(\exp(\boldsymbol{\delta})
        \mathbf T(t_1)
    \right)^{-1}
\right) \\
&= \frac{\partial}{\partial \boldsymbol{\delta}}
h \log\left(
    \mathbf T(t_2)
    \mathbf T(t_1)^{-1}
    \exp(-\boldsymbol{\delta})
\right) \\
&= \frac{\partial}{\partial \boldsymbol{\delta}}
h \log\left(\exp(\underbrace{-\operatorname{Ad}(\boldsymbol{\Delta})\boldsymbol{\delta}}_{\boldsymbol{\epsilon}}) \boldsymbol{\Delta} \right) \\
&= \frac{\partial}{\partial \boldsymbol{\epsilon}}
h \log\left(\exp(\boldsymbol{\epsilon}) \boldsymbol{\Delta} \right) \frac{\partial \boldsymbol{\epsilon}}{\partial \boldsymbol{\delta}} \\
&= -h\mathbf D_{\log}\left(\boldsymbol{\Delta} \right) \operatorname{Ad}(\boldsymbol{\Delta})\\
&= -h\mathbf D_{\log}(\mathbf T(t_2)\mathbf T(t_1)^{-1}) \operatorname{Ad}(
    \mathbf T(t_2)
    \mathbf T(t_1)^{-1}
)\end{align}}

Now, consider differentiating with respect to left-updates to \mathbf T(t_2).

55 \displaystyle{\begin{align}\mathbf J_\text{left}(t_2)
\equiv \frac{\partial \mathbf F_\text{velocity}}{\partial \boldsymbol{\delta}(t_2)}
&= \frac{\partial}{\partial \boldsymbol{\delta}}
h \log\left(
    \exp(\boldsymbol{\delta})
    \mathbf T(t_2)
    \mathbf T(t_1)^{-1}
\right) \\
&= h\mathbf D_{\log}(
    \mathbf T(t_2)
    \mathbf T(t_1)^{-1}
)\end{align}}

6.6 Regularisation constraint

The regularisation constraint implements a basic Tikhonov regulariser. It seeks to dampen the problem to avoid divergent oscillations, overfitting, or other types of poor convergence.

6.6.1 Residual

The residual is 6\times 1:

56 \displaystyle{\begin{align}\mathbf F_\text{regulariser} = \boldsymbol{\xi}.\end{align}}

6.6.2 Jacobian

The Jacobian is the identity:

57 \displaystyle{\begin{align}\mathbf J_\text{regulariser} = \mathbf I.\end{align}}

7 References