SE(3) constraints for robotics

2021-09-08

This document summarises some common maths used for state estimation of rigid bodies, for example in robotics.

1 Transformation parameterisation

Rigid transformations in three dimensions form the special Euclidean group, $SE(3)$, and can be written in the homogeneous form

$$\begin{aligned}\mathbf T_{4\times 4} = \begin{bmatrix} \mathbf R_{3\times 3} & \mathbf t_{3\times 1}\\ \mathbf 0_{1\times 3} & 1 \end{bmatrix} \in SE(3).\end{aligned}\tag{1}$$

The rotation matrix $\mathbf R$ is in the special orthogonal group $SO(3)$, which means that it is orthogonal (its columns are unit-length and mutually orthogonal) and has determinant $+1$.

1.1 Transforming a point

An element $\mathbf T \in SE(3)$ may transform a 3D point $\mathbf p \in \mathbb R^3$:

$$\begin{aligned}\mathbf p_{3\times 1} = \begin{bmatrix}x\\y\\z\end{bmatrix}.\end{aligned}\tag{2}$$

In this document we use it interchangeably with the homogeneous representation:

$$\begin{aligned}\mathbf p_{4\times 1} = \begin{bmatrix}x\\ y\\ z\\ 1\end{bmatrix}\end{aligned}\tag{3}$$

so that points may be transformed rigidly

$$\begin{aligned}\mathbf T \mathbf p \equiv \mathbf R \mathbf p + \mathbf t.\end{aligned}\tag{4}$$
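As a quick sanity check, the homogeneous product and the rotate-then-translate form give the same answer. A minimal numpy sketch (the particular rotation and point are arbitrary examples):

```python
import numpy as np

# Example rotation: 90 degrees about the z-axis, plus a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

# Homogeneous 4x4 form of T.
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

p = np.array([1.0, 0.0, 0.0])

# T p in homogeneous coordinates versus R p + t: both give the same point.
p_transformed = (T @ np.append(p, 1.0))[:3]
assert np.allclose(p_transformed, R @ p + t)
```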

1.2 The Lie algebra

Each element of $SE(3)$ is associated with an element of the corresponding Lie algebra, $\mathfrak{se}(3)$:

$$\begin{aligned}\bm \xi_{4\times 4} &= \begin{bmatrix} [\bm \omega]_\times & \bm \tau\\ \mathbf 0_{1 \times 3} & 0 \end{bmatrix} \in \mathfrak{se}(3)\\ &= \begin{bmatrix} 0 & -\omega_3 & \omega_2 & \tau_1\\ \omega_3 & 0 & -\omega_1 & \tau_2\\ -\omega_2 & \omega_1 & 0 & \tau_3\\ 0 & 0 & 0 & 0 \end{bmatrix}\end{aligned}\tag{5}$$

where $[\bm \omega]_\times$ is the skew-symmetric matrix form of the cross product by $\bm \omega$. We may also write $\bm \xi$ as a $6\times 1$ vector:

$$\begin{aligned}\bm \xi_{6\times 1} = \begin{bmatrix} \bm \omega_{3\times 1}\\ \bm \tau_{3\times 1} \end{bmatrix}.\end{aligned}\tag{6}$$

We will use the $6\times 1$ and $4\times 4$ representations interchangeably depending on context.

Note that some other textbooks put the translational part on top and the rotational part below. The choice doesn't matter much, but it does affect our notation for things like the adjoint action and the differentials below.
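For concreteness, here is a sketch of the conversion between the $4\times 4$ and $6\times 1$ representations, with the rotational part on top as in Equation 6. The names `skew`, `hat` and `vee` are my labels, not established by the text:

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]_x, so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def hat(xi):
    """6x1 twist (omega on top, tau below) -> 4x4 element of se(3)."""
    m = np.zeros((4, 4))
    m[:3, :3] = skew(xi[:3])
    m[:3, 3] = xi[3:]
    return m

def vee(m):
    """4x4 element of se(3) -> 6x1 twist (omega on top, tau below)."""
    return np.array([m[2, 1], m[0, 2], m[1, 0],
                     m[0, 3], m[1, 3], m[2, 3]])

xi = np.array([0.1, 0.2, 0.3, 1.0, 2.0, 3.0])
assert np.allclose(vee(hat(xi)), xi)
```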

1.3 The exponential map

The group $SE(3)$ and its algebra $\mathfrak{se}(3)$ are related by the exponential map:

$$\begin{aligned}\exp &: \mathfrak{se}(3) \rightarrow SE(3)\\ \log &: SE(3) \rightarrow \mathfrak{se}(3).\end{aligned}\tag{7}$$

The definition of $\exp$ is based on the Taylor series:

$$\begin{aligned}\exp(\bm \xi) = \mathbf I_{4\times 4} + \bm \xi_{4\times 4} + \frac{1}{2!} \bm \xi_{4\times 4}^2 + \frac{1}{3!} \bm \xi_{4\times 4}^3 + \cdots\end{aligned}\tag{8}$$

Closed forms exist; see J. L. Blanco's report [jlblanco].
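Absent the closed form, a generic matrix exponential and logarithm (e.g. `scipy.linalg.expm` and `logm`) make a convenient reference implementation to test against. A sketch, assuming a `hat` helper like the one above:

```python
import numpy as np
from scipy.linalg import expm, logm

def hat(xi):
    """6x1 twist (omega, tau) -> 4x4 se(3) matrix."""
    o, tau = xi[:3], xi[3:]
    return np.array([[0.0, -o[2], o[1], tau[0]],
                     [o[2], 0.0, -o[0], tau[1]],
                     [-o[1], o[0], 0.0, tau[2]],
                     [0.0, 0.0, 0.0, 0.0]])

xi = np.array([0.1, -0.2, 0.3, 1.0, 2.0, 3.0])
T = expm(hat(xi))               # exp: se(3) -> SE(3)
xi_back = np.real(logm(T))      # log: SE(3) -> se(3); real for angles < pi
assert np.allclose(hat(xi), xi_back, atol=1e-9)
```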

Note that $SE(3)$ is not commutative. The adjoint action relates perturbations applied in different orders:

$$\begin{aligned}\mathbf T \exp(\bm \delta) = \exp\left(\operatorname{Ad}(\mathbf T) \bm \delta\right) \mathbf T.\end{aligned}\tag{9}$$

The adjoint is a $6\times 6$ matrix:

$$\begin{aligned}\operatorname{Ad}(\mathbf T) = \begin{bmatrix} \mathbf R & \mathbf 0_{3\times 3}\\ [\mathbf t]_\times \mathbf R & \mathbf R \end{bmatrix}.\end{aligned}\tag{10}$$
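A sketch of $\operatorname{Ad}(\mathbf T)$ in our $(\bm \omega, \bm \tau)$ ordering, with a numerical check of Equation 9 (the helper names are mine):

```python
import numpy as np
from scipy.linalg import expm

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def hat(xi):
    m = np.zeros((4, 4))
    m[:3, :3] = skew(xi[:3])
    m[:3, 3] = xi[3:]
    return m

def adjoint(T):
    """6x6 adjoint of T for twists ordered (omega, tau), as in Equation 10."""
    R, t = T[:3, :3], T[:3, 3]
    ad = np.zeros((6, 6))
    ad[:3, :3] = R
    ad[3:, :3] = skew(t) @ R
    ad[3:, 3:] = R
    return ad

rng = np.random.default_rng(0)
T = expm(hat(rng.normal(size=6)))   # a random SE(3) element
delta = rng.normal(size=6)          # an arbitrary twist

# Equation 9: T exp(delta) == exp(Ad(T) delta) T holds exactly.
lhs = T @ expm(hat(delta))
rhs = expm(hat(adjoint(T) @ delta)) @ T
assert np.allclose(lhs, rhs, atol=1e-9)
```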

1.4 Notation summary

Here we define and summarise some notation for the following sections.

| Description | Notation |
| --- | --- |
| skew-symmetric cross product matrix $\begin{bmatrix}0 & -t_3 & t_2\\t_3 & 0 & -t_1\\-t_2 & t_1 & 0\end{bmatrix}$ | $[\mathbf t]_\times$ |
| translational part of $\mathbf T \in SE(3)$ | $\mathbf t(\mathbf T)$ |
| rotational part of $\mathbf T \in SE(3)$ | $\mathbf R(\mathbf T)$ |
| translational part of $\bm \xi \in \mathfrak{se}(3)$ | $\bm \tau(\bm \xi)$ |
| rotational part of $\bm \xi \in \mathfrak{se}(3)$ | $\bm \omega(\bm \xi)$ |
| composition of $\mathbf T_1, \mathbf T_2 \in SE(3)$ | $\mathbf T_1 \mathbf T_2$ |
| exp of $\bm \xi \in \mathfrak{se}(3)$ | $\exp(\bm \xi)$ |
| log of $\mathbf T \in SE(3)$ | $\log(\mathbf T)$ |
| exp of $\bm \omega \in \mathfrak{so}(3)$ | $\exp(\bm \omega)$ |
| log of $\mathbf R \in SO(3)$ | $\log(\mathbf R)$ |
| inverse of $\bm \xi \in \mathfrak{se}(3)$ | $-\bm \xi$ |
| inverse of $\mathbf T \in SE(3)$ | $\mathbf T^{-1}$ |
| inverse of $\bm \omega \in \mathfrak{so}(3)$ | $-\bm \omega$ |
| inverse of $\mathbf R \in SO(3)$ | $\mathbf R^T$ |
| adjoint of $\mathbf T \in SE(3)$ | $\operatorname{Ad}(\mathbf T)$ |
| $i$th element of $\lbrace \bm \xi \mid \bm \xi \in \mathfrak{se}(3)\rbrace$ | $\bm \xi_i$ |
| $i$th element of $\lbrace \mathbf T \mid \mathbf T \in SE(3)\rbrace$ | $\mathbf T_i$ |
| transform $\mathbf p \in \mathbb R^3$ by $\mathbf T \in SE(3)$ | $\mathbf T \mathbf p$ |
| transform $\mathbf p \in \mathbb R^3$ by $\bm \xi \in \mathfrak{se}(3)$ | $\exp(\bm \xi) \mathbf p$ |

Table 1: Summary of notation. For arrays of elements, operations are applied elementwise.

2 Derivatives

Here, we only differentiate with respect to $\bm \delta \in \mathfrak{se}(3)$ around the point $\bm \delta = \mathbf 0$. In other words, we have a function $\mathbf F(\mathbf T)$ where $\mathbf T \in SE(3)$ and we would like to perturb $\mathbf T$ by a very small $\bm \delta$.

Suppose $\bm \delta = \begin{bmatrix}\bm \omega & \bm \tau\end{bmatrix}^T$. If $\mathbf F$ is of dimension $n$, then the resulting derivative is an $n\times 6$ Jacobian:

$$\begin{aligned}\mathbf J_{n\times 6} \equiv \left.\frac{\partial \mathbf F(\mathbf T \oplus \bm \delta)}{\partial \bm \delta}\right]_{\bm \delta = \mathbf 0} &= \begin{bmatrix} \frac{\partial F_1}{\partial \omega_1} & \frac{\partial F_1}{\partial \omega_2} & \frac{\partial F_1}{\partial \omega_3} & \frac{\partial F_1}{\partial \tau_1} & \frac{\partial F_1}{\partial \tau_2} & \frac{\partial F_1}{\partial \tau_3} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ \frac{\partial F_n}{\partial \omega_1} & \frac{\partial F_n}{\partial \omega_2} & \frac{\partial F_n}{\partial \omega_3} & \frac{\partial F_n}{\partial \tau_1} & \frac{\partial F_n}{\partial \tau_2} & \frac{\partial F_n}{\partial \tau_3} \end{bmatrix}\end{aligned}\tag{11}$$

where $\mathbf F(\mathbf T \oplus \bm \delta) = \mathbf F(\exp(\bm \delta)\mathbf T)$ in the case of left-perturbations and $\mathbf F(\mathbf T\exp(\bm \delta))$ in the case of right-perturbations.

The adjoint action can be used to relate the left and right derivatives via the chain rule:

$$\begin{aligned}\frac{\partial \mathbf F(\mathbf A \exp(\bm \delta) \mathbf B)}{\partial \bm \delta} &= \frac{\partial \mathbf F(\exp(\operatorname{Ad}(\mathbf A) \bm \delta) \mathbf{AB})}{\partial \bm \delta}\\ &= \frac{\partial}{\partial \bm \delta} \mathbf F(\exp(\underbrace{\operatorname{Ad}(\mathbf A) \bm \delta}_{\bm \epsilon}) \mathbf{AB})\\ &= \frac{\partial}{\partial \bm \epsilon} \mathbf F(\exp(\bm \epsilon) \mathbf{AB}) \frac{\partial \bm \epsilon}{\partial \bm \delta}\\ &= \frac{\partial}{\partial \bm \epsilon} \mathbf F(\exp(\bm \epsilon) \mathbf{AB}) \operatorname{Ad}(\mathbf A).\end{aligned}\tag{12}$$

We also have a derivative of the log function:

$$\begin{aligned}\mathbf D_\text{log}(\mathbf T) &\equiv \left.\frac{\partial}{\partial \bm \delta}\right]_{\bm \delta = \mathbf 0_{6\times 1}}\log(\exp(\bm \delta) \mathbf T)\\ &= \begin{bmatrix} \mathbf D_\text{log}(\bm \omega) & \mathbf 0_{3\times 3} \\ -\mathbf D_\text{log}(\bm \omega) \mathbf B \mathbf D_\text{log}(\bm \omega) & \mathbf D_\text{log}(\bm \omega) \end{bmatrix}\end{aligned}\tag{13}$$

where

$$\begin{aligned}\mathbf D_\text{log}(\bm \omega) &= \mathbf I - \frac{1}{2} [\bm \omega]_\times + e_\theta [\bm \omega]_\times^2\\ \mathbf B &\equiv b_\theta [\mathbf u]_\times + c_\theta (\bm \omega \mathbf u^T + \mathbf u \bm \omega^T) + (\bm \omega^T \mathbf u) \cdot \mathbf W(\bm \omega)\\ \theta &= \|\bm \omega\|\\ a_\theta &= \frac{\sin\theta}{\theta}\\ b_\theta &= \frac{1-\cos\theta}{\theta^2}\\ c_\theta &= \frac{1-a_\theta}{\theta^2}\\ e_\theta &= \frac{b_\theta - 2c_\theta}{2a_\theta}\end{aligned}\tag{14}$$

The exact derivation is in Ethan Eade's report [eade].

Let $\mathbf T \in SE(3)$, $\mathbf p \in \mathbb R^3$; then the following table lists some useful derivatives.

As you will see later, these are essentially all the derivatives you need; all other derivatives can be derived from them, often with the help of the adjoint.

| Function $\mathbf F(\bm \delta)$ | Derivative $\left.\frac{\partial \mathbf F}{\partial \bm\delta} \right]_{\bm \delta = \mathbf 0}$ |
| --- | --- |
| $\exp(\bm \delta) \mathbf T \mathbf p$ | $\begin{bmatrix}-[\mathbf T \mathbf p]_\times & \mathbf I\end{bmatrix}$ |
| $\mathbf T \exp(\bm \delta) \mathbf p$ | $\mathbf R(\mathbf T) \begin{bmatrix}-[\mathbf p]_\times & \mathbf I\end{bmatrix}$ |
| $\log(\exp(\bm \delta) \mathbf T)$ | $\mathbf D_\text{log}(\mathbf T)$ |

Table 2: Summary of derivatives.
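These formulas are easy to validate numerically. A sketch checking the first row of Table 2 against central finite differences:

```python
import numpy as np
from scipy.linalg import expm

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def hat(xi):
    m = np.zeros((4, 4))
    m[:3, :3] = skew(xi[:3])
    m[:3, 3] = xi[3:]
    return m

rng = np.random.default_rng(1)
T = expm(hat(rng.normal(size=6)))
p = rng.normal(size=3)

Tp = (T @ np.append(p, 1.0))[:3]
J_analytic = np.hstack([-skew(Tp), np.eye(3)])   # row 1 of Table 2

# Central finite differences over the six twist components.
eps = 1e-6
J_numeric = np.zeros((3, 6))
for k in range(6):
    d = np.zeros(6)
    d[k] = eps
    plus = (expm(hat(d)) @ T @ np.append(p, 1.0))[:3]
    minus = (expm(hat(-d)) @ T @ np.append(p, 1.0))[:3]
    J_numeric[:, k] = (plus - minus) / (2.0 * eps)

assert np.allclose(J_analytic, J_numeric, atol=1e-6)
```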

3 Optimisation

Suppose we have a trajectory $\mathbf S(t) \in SE(3)$ that is a smooth curve. We would like to minimise some objective function:

$$\begin{aligned}\operatorname{cost}(\mathbf S) = \mathbf F(\mathbf S)^T \mathbf F(\mathbf S).\end{aligned}\tag{15}$$

We seek to reduce the cost as much as possible; this is a nonlinear least squares problem. During optimisation, we perturb the trajectory by a small perturbation $\bm \delta(t) \in \mathfrak{se}(3)$:

$$\begin{aligned}\mathbf S_{\text{new}}(t) = \mathbf S(t) \oplus \bm\delta(t)\end{aligned}\tag{16}$$

where the $\oplus$ operator can either be the left-update:

$$\begin{aligned}\mathbf S \oplus \bm \delta \equiv \exp(\bm\delta) \mathbf S\end{aligned}\tag{17}$$

or the right-update:

$$\begin{aligned}\mathbf S \oplus \bm \delta \equiv \mathbf S \exp(\bm\delta).\end{aligned}\tag{18}$$

Both are valid; which to use depends on the numerical properties of the problem.

In any case, we linearise the problem around $\bm \delta = \mathbf 0$.

Let the Jacobian matrix be:

$$\begin{aligned}\mathbf J = \left. \frac{\partial}{\partial \bm \delta} \mathbf F(\mathbf S \oplus \bm\delta) \right]_{\bm \delta = \mathbf 0}.\end{aligned}\tag{19}$$

Then we can solve for $\bm \delta$ using Gauss-Newton:

$$\begin{aligned}\mathbf J^T \mathbf J \bm \delta_\text{GN} = -\mathbf J^T \mathbf F\end{aligned}\tag{20}$$

or steepest descent:

$$\begin{aligned}\bm \delta_\text{s} = -\mathbf J^T \mathbf F\end{aligned}\tag{21}$$

or more sophisticated algorithms. In practice, we use Powell’s Dog Leg. The update is

$$\begin{aligned}\bm \delta = c_\text{GN} \bm \delta_\text{GN} + c_\text{s} \bm \delta_\text{s}\end{aligned}\tag{22}$$

where the scalar weights $c_\text{GN}$ and $c_\text{s}$ are chosen such that $\|\bm \delta \| \leq \Delta$, where the scalar parameter $\Delta$ is the radius of the trust region. When $\Delta$ is small, the problem is behaving badly and the quadratic approximation that Gauss-Newton relies on is poor. In this case, $c_\text{GN}$ is zero, allowing the optimiser to take timid steps along the steepest descent direction. When $\Delta$ is large, the quadratic approximation is good and we take bigger steps in the Gauss-Newton direction.

Heuristics are used to determine when to increase $\Delta$ and when to decrease it.
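For reference, a schematic sketch of one dog-leg step on the linearised subproblem $\min_{\bm \delta} \|\mathbf J \bm \delta + \mathbf F\|^2$. The `dogleg_step` helper is illustrative only; production implementations (e.g. in Ceres) add safeguards and the $\Delta$ update heuristics:

```python
import numpy as np

def dogleg_step(J, F, trust_radius):
    """One Powell dog-leg step, combining Gauss-Newton and steepest descent."""
    g = J.T @ F                                  # gradient at delta = 0
    d_sd = -g                                    # steepest-descent direction
    t = (g @ g) / (g @ (J.T @ (J @ g)))          # optimal length along -g
    d_cauchy = t * d_sd

    d_gn = np.linalg.solve(J.T @ J, -J.T @ F)    # Gauss-Newton step

    if np.linalg.norm(d_gn) <= trust_radius:
        return d_gn                              # model trusted: full GN step
    if np.linalg.norm(d_cauchy) >= trust_radius:
        # Even the Cauchy point leaves the region: timid steepest-descent step.
        return trust_radius * d_sd / np.linalg.norm(d_sd)
    # Walk from the Cauchy point towards the GN step until the boundary.
    diff = d_gn - d_cauchy
    a = diff @ diff
    b = 2.0 * (d_cauchy @ diff)
    c = d_cauchy @ d_cauchy - trust_radius**2
    s = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return d_cauchy + s * diff
```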

3.1 Optimisation under uncertainty

When uncertainty is present, the data is associated with a covariance matrix $\bm \Sigma$, an $n\times n$ matrix where $n$ is the size of the residual. The Gauss-Newton update then becomes:

$$\begin{aligned}\mathbf J^T \bm \Sigma^{-1} \mathbf J \bm \delta = -\mathbf J^T \bm \Sigma^{-1} \mathbf F\end{aligned}\tag{23}$$

The matrix $\bm \Sigma^{-1}$ is also sometimes called the information matrix. In practice, it is useful to factor this matrix, for example by taking the matrix square root. Let $\mathbf W = \bm \Sigma^{-\frac{1}{2}}$; then,

$$\begin{aligned}(\mathbf W\mathbf J)^T (\mathbf W\mathbf J) \bm \delta = -(\mathbf W \mathbf J)^T (\mathbf W \mathbf F).\end{aligned}\tag{24}$$

In other words, we just pre-multiply the Jacobian and residual by a weight matrix. This is a form of whitening. In the common case where $\bm \Sigma$ is a diagonal matrix, we simply divide each row of the Jacobian and the residual by the standard deviation.

Another common factorisation is the eigendecomposition of the covariance matrix $\bm \Sigma$:

$$\begin{aligned}\bm \Sigma = \mathbf Q \bm \Lambda \mathbf Q^T\end{aligned}\tag{25}$$

where $\mathbf Q \in SO(n)$ is the square matrix whose columns are the orthonormal eigenvectors of $\bm \Sigma$ and $\bm \Lambda$ is the diagonal matrix whose entries are the eigenvalues. Then,

$$\begin{aligned}\mathbf W = \bm \Lambda^{-\frac{1}{2}} \mathbf Q^T.\end{aligned}\tag{26}$$

This is useful, for example, in the case of surfel matches. Recall that the covariance matrix is always symmetric positive semidefinite, allowing for easy eigendecomposition.
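A sketch of this eigendecomposition-based whitening, using `numpy.linalg.eigh` (appropriate for symmetric matrices):

```python
import numpy as np

def whitener(sigma):
    """W = Lambda^{-1/2} Q^T from the eigendecomposition of a covariance."""
    eigvals, Q = np.linalg.eigh(sigma)
    return np.diag(eigvals ** -0.5) @ Q.T

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
sigma = A @ A.T + 1e-6 * np.eye(3)   # a random SPD covariance

W = whitener(sigma)
# Whitening: W Sigma W^T = I, equivalently W^T W = Sigma^{-1}.
assert np.allclose(W @ sigma @ W.T, np.eye(3), atol=1e-9)
```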

3.2 Robust loss functions

A least squares problem can be thought of as minimising the negative log likelihood function of a Gaussian. The likelihood function is of the form

$$\begin{aligned}\mathcal L = \prod_i \exp(-f_i^2)\end{aligned}\tag{27}$$

Then, the negative log likelihood is of the least squares form:

$$\begin{aligned}\operatorname{cost} = -\log \mathcal L = \sum_i f_i^2.\end{aligned}\tag{28}$$

However, in many cases, the random variable is not Gaussian. The Gaussian has very thin tails, making outliers incredibly unlikely. In real life, there are often many outliers, which calls for a heavier-tailed distribution. To model such situations, we need robust loss functions.

Instead of optimising $\sum_i f_i^2$, we optimise:

$$\begin{aligned}\operatorname{cost} = \sum_i \rho(f_i^2)\end{aligned}\tag{29}$$

where $\rho$ is a robust loss function applied to the squared residual. We can then weigh each row or block of the Jacobian $\mathbf J$ and the residual $\mathbf F$ with a robust reweighting factor

$$\begin{aligned}r_i \equiv \left.\frac{\partial \rho(x)}{\partial x}\right]_{x = f_i^2},\end{aligned}\tag{30}$$

scaling the affected rows by $\sqrt{r_i}$.
| Name | Definition |
| --- | --- |
| TrivialLoss | $\rho(x) = x$ |
| CauchyLoss | $\rho(x) = \log(1 + x)$ |
| HuberLoss | $\rho(x) = \begin{cases} x & x \leq k^2\\ 2 k\sqrt{x} - k^2 & x > k^2\end{cases}$ |

Table 3: Some loss functions on $x = f^2$. These match the conventions of the Ceres library.
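A sketch of the Huber case under the convention above, in which each residual and its Jacobian rows are scaled by $\sqrt{\rho'(f^2)}$ (the helper names are mine):

```python
import numpy as np

def huber_rho(x, k):
    """Huber loss on the squared residual x = f**2, as in Table 3."""
    return x if x <= k * k else 2.0 * k * np.sqrt(x) - k * k

def robust_row_scale(f, k):
    """sqrt(rho'(f^2)): factor applied to a residual row and its Jacobian rows."""
    return 1.0 if f * f <= k * k else np.sqrt(k / abs(f))

# Inliers keep weight 1; outliers are progressively down-weighted.
assert robust_row_scale(0.5, k=1.0) == 1.0
assert abs(robust_row_scale(10.0, k=1.0) - np.sqrt(0.1)) < 1e-12
```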

4 Trajectory representation

The goal is to recover the trajectory of a vehicle as a parametric curve $\mathbf S(t) \in SE(3)$ as a function of time $t$.

We may assume the curve does not oscillate faster than some frequency (say, 10 Hz).

Now we should find a trajectory representation that can be evaluated at any time $t$ and cheaply perturbed during optimisation.

Our solution to satisfy these requirements is a piecewise linear trajectory. The trajectory is represented as a sequence of elements $\mathbf S_i$ that represent $\mathbf S(t_i)$ for some $t_i$ sampled with even spacing at a high frequency (say, 100 Hz). To evaluate $\mathbf S(t)$, we use a geodesic interpolation for $t_i \leq t < t_{i + 1}$:

$$\begin{aligned}\mathbf S(t) = \exp((t - t_i) \log(\mathbf S_{i + 1} \mathbf S_i^{-1})) \mathbf S_i.\end{aligned}\tag{31}$$
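A sketch of this interpolation using a generic matrix exponential and logarithm, with $u = (t - t_i)/(t_{i+1} - t_i)$ as the normalised position within the segment (for the evenly spaced knots above, this is just $t - t_i$ in knot units):

```python
import numpy as np
from scipy.linalg import expm, logm

def interpolate(S_i, S_j, u):
    """Geodesic interpolation between adjacent poses, u in [0, 1]."""
    return expm(u * np.real(logm(S_j @ np.linalg.inv(S_i)))) @ S_i

def pose_z(theta, t):
    """Example pose: rotation by theta about z, plus translation t."""
    T = np.eye(4)
    c, s = np.cos(theta), np.sin(theta)
    T[:2, :2] = [[c, -s], [s, c]]
    T[:3, 3] = t
    return T

S0 = pose_z(0.1, [0.0, 0.0, 0.0])
S1 = pose_z(0.4, [1.0, 0.0, 0.0])
assert np.allclose(interpolate(S0, S1, 0.0), S0, atol=1e-9)
assert np.allclose(interpolate(S0, S1, 1.0), S1, atol=1e-9)
```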

To perturb this trajectory with a curve $\bm \delta(t)$, we update each $\mathbf S_i$ like so:

$$\begin{aligned}\mathbf S_{i, \text{new}} = \mathbf S_i \oplus \bm \delta(t_i)\end{aligned}\tag{32}$$

Alternative approaches for parameterising trajectories exist, but are not covered here.

5 Parameterisation of the perturbation

When applying perturbations, $\bm \delta$ is a smooth curve which may be thought of as an infinite-dimensional vector.

To ensure that the problem is computationally tractable and that $\mathbf S_{\text{new}}$ remains twice-differentiable, we parameterise the perturbation $\bm \delta$ by a finite vector $\bm \xi$. The vector $\bm \xi$ is the concatenation of many six-dimensional vectors $\bm \xi_i \in \mathfrak{se}(3)$, such that

$$\begin{aligned}\bm \delta(t) = \sum_i^n \bm \xi_i \beta((t - i)\Delta t).\end{aligned}\tag{33}$$

Notice that, since $\bm \delta(t)$ only applies small local perturbations, it is possible to add the $\bm \xi_i$ together, treating them as vectors in $\mathbb R^6$. The scalar-valued function $\beta$ is a basis function with compact support, which means that it is nonzero on a finite contiguous segment of $t$ and zero everywhere else. A good choice is the piecewise polynomial of a cardinal cubic B-spline:

$$\begin{aligned}\beta(t) = \begin{cases} \frac{1}{6}t^3 & t \in [0,1]\\ \frac{1}{6}\left(-3t^3 + 12t^2 - 12t+4 \right)& t \in [1,2]\\ \frac{1}{6}\left(3t^3 - 24t^2 +60t-44 \right) & t \in [2,3]\\ \frac{1}{6}\left(-t^3 + 12t^2 -48t+64 \right) & t \in [3,4]\\ 0 & t \notin [0,4] \end{cases}\end{aligned}\tag{34}$$
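A direct transcription of $\beta$ (with half-open intervals so the pieces are disjoint). Shifted copies of the cardinal cubic B-spline basis sum to one, which the check below exercises:

```python
import numpy as np

def beta(t):
    """Cardinal cubic B-spline basis with support [0, 4] (Equation 34)."""
    if 0.0 <= t < 1.0:
        return t**3 / 6.0
    if 1.0 <= t < 2.0:
        return (-3.0 * t**3 + 12.0 * t**2 - 12.0 * t + 4.0) / 6.0
    if 2.0 <= t < 3.0:
        return (3.0 * t**3 - 24.0 * t**2 + 60.0 * t - 44.0) / 6.0
    if 3.0 <= t <= 4.0:
        return (-t**3 + 12.0 * t**2 - 48.0 * t + 64.0) / 6.0
    return 0.0

# Partition of unity: the four overlapping shifted copies sum to one.
for t in np.linspace(3.0, 3.99, 7):
    assert abs(sum(beta(t - i) for i in range(4)) - 1.0) < 1e-12
```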

We can now redefine the Jacobian to be with respect to the parameters:

$$\begin{aligned}\mathbf J = \left. \frac{\partial}{\partial \bm \xi} \mathbf F(\mathbf S \oplus \bm\delta) \right]_{\bm \delta = \mathbf 0}.\end{aligned}\tag{35}$$

Since $\beta$ is a constant scalar weight,

$$\begin{aligned}\frac{\partial}{\partial \bm \xi_i} = \beta((t - i)\Delta t) \frac{\partial}{\partial \bm \delta(t)}.\end{aligned}\tag{36}$$

The elements $\bm \xi_i \in \mathfrak{se}(3)$ are known as the spline knots. Since each knot is a $6\times 1$ vector, the number of columns of $\mathbf J$ is $6 \times n_\text{knots}$, where $n_\text{knots}$ is the number of spline knots.

As you can see, the derivative of the trajectory at any time $t$ is a linear combination of the derivatives with respect to up to four spline knots.

In the next sections we will compute derivatives

$$\begin{aligned}\frac{\partial}{\partial \bm \delta(t)}\end{aligned}\tag{37}$$

knowing that each of these blocks will contribute up to four blocks, weighted by the scalar weights $\beta$, to the actual Jacobian in which we optimise the spline knots $\bm \xi$.

6 Constraints

The function $\mathbf F$ is known as the residual. It consists of many constraints:

$$\begin{aligned}\mathbf F = \begin{bmatrix} \mathbf F_\text{position}\\ \mathbf F_\text{loop}\\ \mathbf F_\text{gravity}\\ \vdots \end{bmatrix}\end{aligned}\tag{38}$$

For people familiar with the Ceres library, each constraint is a residual block.

6.1 Position constraint

The position constraint seeks to penalise the distance between the pose’s translational component and a 3D point.

For example, the 3D point may be the GPS position $\mathbf p(t)$ measured at time $t$.

6.1.1 Residual

The residual is a $3\times 1$ vector:

$$\begin{aligned}\mathbf F_\text{position} = \mathbf t(\mathbf T(t)) - \mathbf p(t)\end{aligned}\tag{39}$$

Recall from the notation section that $\mathbf t(\mathbf T)$ is the translational component of the $SE(3)$ element $\mathbf T$.

6.1.2 Left Jacobian

The left Jacobian is $3 \times 6$:

$$\begin{aligned}\mathbf J_\text{left} \equiv \frac{\partial \mathbf F_\text{position}}{\partial \bm \delta} &= \frac{\partial \mathbf t(\exp(\bm \delta) \mathbf T(t))}{\partial \bm \delta}\\ &= \begin{bmatrix} -[\mathbf t(\mathbf T(t))]_\times & \mathbf I_{3 \times 3} \end{bmatrix}\end{aligned}\tag{40}$$

The trick is to view $\mathbf t(\mathbf T) = \mathbf T \mathbf p$ where $\mathbf p = \mathbf 0$. Then we can apply the point-transformation Jacobian from the derivatives table.

6.1.3 Right Jacobian

The right Jacobian is $3 \times 6$:

$$\begin{aligned}\mathbf J_\text{right} \equiv \frac{\partial \mathbf F_\text{position}}{\partial \bm \delta} &= \frac{\partial \mathbf t(\mathbf T(t)\exp(\bm \delta))}{\partial \bm \delta}\\ &= \begin{bmatrix} \mathbf 0_{3 \times 3} & \mathbf R(\mathbf T(t)) \end{bmatrix}\end{aligned}\tag{41}$$
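Putting the residual and both Jacobians together, a sketch (the function name is mine; the twist ordering is $(\bm \omega, \bm \tau)$ as before):

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def position_constraint(T, p):
    """Residual t(T) - p with its 3x6 left and right Jacobians (Eqs 39-41)."""
    R, t = T[:3, :3], T[:3, 3]
    residual = t - p
    J_left = np.hstack([-skew(t), np.eye(3)])
    J_right = np.hstack([np.zeros((3, 3)), R])
    return residual, J_left, J_right
```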

6.2 Loop closure constraint

Suppose that we have aligned the poses from two points in time along a trajectory, e.g. using ICP.

This produces a relative transformation $\mathbf A$.

6.2.1 Residual

The residual is a $6\times 1$ vector:

$$\begin{aligned}\mathbf F_\text{loop closure} = \log\left( \underbrace{\mathbf T(t_1)^{-1} \mathbf T(t_2)}_\text{trajectory} \mathbf A \right)\end{aligned}\tag{42}$$

6.2.2 Left Jacobians

$$\begin{aligned}\mathbf J_{\text{left}, t_1} &\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \bm \delta(t_1)}\\ &= \frac{\partial \log\left( (\exp(\bm \delta) \mathbf T(t_1))^{-1} \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= \frac{\partial \log\left( \mathbf T(t_1)^{-1} \exp(-\bm \delta) \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= - \frac{\partial \log\left( \mathbf T(t_1)^{-1} \exp(\bm \delta) \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= - \frac{\partial \log\left( \exp(\operatorname{Ad}(\mathbf T(t_1)^{-1}) \bm \delta) \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= - \frac{\partial \log\left( \exp(\bm \epsilon) \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right) }{\partial \bm \epsilon}\operatorname{Ad}(\mathbf T(t_1)^{-1}) \\ &= - \mathbf D_\text{log}\left( \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right) \operatorname{Ad}(\mathbf T(t_1)^{-1})\end{aligned}\tag{43}$$

$$\begin{aligned}\mathbf J_{\text{left}, t_2} &\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \bm \delta(t_2)}\\ &= \frac{\partial \log\left( \mathbf T(t_1)^{-1} \exp(\bm \delta) \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= \mathbf D_\text{log}\left( \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right) \operatorname{Ad}(\mathbf T(t_1)^{-1})\end{aligned}\tag{44}$$

6.2.3 Right Jacobians

$$\begin{aligned}\mathbf J_{\text{right}, t_1} &\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \bm \delta(t_1)}\\ &= \frac{\partial \log\left( (\mathbf T(t_1) \exp(\bm \delta))^{-1} \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= \frac{\partial \log\left( \exp(-\bm \delta) \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right) }{\partial \bm \delta}\\ &= - \mathbf D_\text{log}\left( \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right)\end{aligned}\tag{45}$$

$$\begin{aligned}\mathbf J_{\text{right}, t_2} &\equiv \frac{\partial \mathbf F_\text{loop closure}}{\partial \bm \delta(t_2)}\\ &= \frac{\partial \log\left( \mathbf T(t_1)^{-1} \mathbf T(t_2) \exp(\bm \delta) \mathbf A \right) }{\partial \bm \delta}\\ &= \mathbf D_\text{log}\left( \mathbf T(t_1)^{-1} \mathbf T(t_2) \mathbf A \right) \operatorname{Ad}(\mathbf T(t_1)^{-1}\mathbf T(t_2))\end{aligned}\tag{46}$$

6.3 Gravity constraint

The gravity constraint seeks to penalise the distance between the up vector of the car, $\mathbf R(\mathbf T(t)) \mathbf u$, and the true up vector $\mathbf u_\text{true} = [0, 0, 1]^T$.

For example, if the sensor were mounted perfectly upright, then $\mathbf u = \mathbf u_\text{true}$.

We can also use the accelerometer reading to obtain a different $\mathbf u$ at each point in time, as long as we remember to subtract the inertial acceleration.

6.3.1 Residual

The residual is a $3 \times 1$ vector:

$$\begin{aligned}\mathbf F_\text{gravity} = \mathbf R(\mathbf T(t)) \mathbf u - \mathbf u_\text{true}\end{aligned}\tag{47}$$

6.3.2 Left Jacobian

Since the gravity constraint only depends on rotation and not on translation, we only care about the derivative with respect to $\bm \omega(\bm \delta)$, which we hereafter write as $\bm \omega$.

$$\begin{aligned}\mathbf J_\text{left} \equiv \frac{\partial \mathbf F_\text{gravity}}{\partial \bm \omega} &= \frac{\partial \exp(\bm \omega) \mathbf R(\mathbf T(t))\mathbf u}{\partial \bm \omega}\\ &= -[\mathbf R(\mathbf T(t)) \mathbf u]_\times\end{aligned}\tag{48}$$

6.3.3 Right Jacobian

$$\begin{aligned}\mathbf J_\text{right} \equiv \frac{\partial \mathbf F_\text{gravity}}{\partial \bm \omega} &= \frac{\partial \mathbf R(\mathbf T(t)) \exp(\bm \omega)\mathbf u}{\partial \bm \omega}\\ &= -\mathbf R(\mathbf T(t)) [\mathbf u]_\times\end{aligned}\tag{49}$$

Recall that $\mathbf R(\mathbf T)$ is the rotational component of the pose $\mathbf T$.

6.4 Point constraint

We are aligning a "moving" or "map" point $\mathbf m$ to a "static" or "scene" point $\mathbf s$ by transforming the moving point with the pose $\mathbf T$.

The matrix $\mathbf A$ can be used to encode the uncertainty of the point $\mathbf s$, e.g. as a square root of an information matrix (the inverse of a covariance matrix).

6.4.1 Residual

The residual is a $k\times 1$ vector, where $\mathbf A$ is $k\times 3$:

$$\begin{aligned}\mathbf F_\text{point} = \mathbf A (\mathbf T \mathbf m - \mathbf s).\end{aligned}\tag{50}$$

6.4.2 Left Jacobian

The left Jacobian is $k\times 6$:

$$\begin{aligned}\mathbf J_\text{left} = \mathbf A \begin{bmatrix} -[\mathbf T \mathbf m]_\times & \mathbf I \end{bmatrix}\end{aligned}\tag{51}$$

6.4.3 Right Jacobian

The right Jacobian is $k\times 6$:

$$\begin{aligned}\mathbf J_\text{right} = \mathbf A \mathbf R(\mathbf T)\begin{bmatrix} -[\mathbf m]_\times & \mathbf I \end{bmatrix}\end{aligned}\tag{52}$$

Recall that $\mathbf R(\mathbf T)$ is the rotational component of the pose $\mathbf T$.
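A sketch analogous to the position constraint above (the function name is mine):

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def point_constraint(T, m, s, A):
    """Residual A (T m - s) with its k x 6 left and right Jacobians (Eqs 50-52)."""
    R, t = T[:3, :3], T[:3, 3]
    Tm = R @ m + t
    residual = A @ (Tm - s)
    J_left = A @ np.hstack([-skew(Tm), np.eye(3)])
    J_right = A @ R @ np.hstack([-skew(m), np.eye(3)])
    return residual, J_left, J_right
```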

6.5 Velocity constraint

The velocity constraint seeks to penalise deviations of the vehicle velocity from the 6-degree-of-freedom velocity estimates of another source.

For ease of implementation, the vehicle velocity is obtained by numerical differentiation, e.g. by evaluating the pose at times $t_1$ and $t_2$. Let $h = 1 / (t_2 - t_1)$.

6.5.1 Residual

The residual is a $6\times 1$ vector. For concise notation, let us define the $4\times 4$ matrix $\bm \Delta$ such that

$$\begin{aligned}\mathbf F_\text{velocity} &= h \log\left(\bm \Delta\right) - \bm \xi_\text{velocity}\\ &= h \log\left( \mathbf T(t_2) \mathbf T(t_1)^{-1} \right) - \bm \xi_\text{velocity}\end{aligned}\tag{53}$$

6.5.2 Left Jacobian

Consider differentiating with respect to left-updates of $\mathbf T(t_1)$.

The Jacobian is $6\times 6$:

$$\begin{aligned}\mathbf J_\text{left}(t_1) \equiv \frac{\partial \mathbf F_\text{velocity}}{\partial \bm \delta(t_1)} &= \frac{\partial}{\partial \bm \delta} h \log\left( \mathbf T(t_2) \left( \exp(\bm \delta) \mathbf T(t_1) \right)^{-1} \right)\\ &= \frac{\partial}{\partial \bm \delta} h \log\left( \underbrace{ \mathbf T(t_2) \mathbf T(t_1)^{-1} }_{\bm \Delta} \exp(-\bm \delta) \right)\\ &= \frac{\partial}{\partial \bm \delta} h \log\left(\exp(\underbrace{-\operatorname{Ad}(\bm \Delta)\bm \delta}_{\bm \epsilon}) \bm \Delta \right) \\ &= \frac{\partial}{\partial \bm \epsilon} h \log\left(\exp(\bm \epsilon) \bm \Delta \right) \frac{\partial \bm \epsilon}{\partial \bm \delta} \\ &= -h\mathbf D_{\log}\left(\bm \Delta \right) \operatorname{Ad}(\bm \Delta)\\ &= -h\mathbf D_{\log}(\mathbf T(t_2)\mathbf T(t_1)^{-1}) \operatorname{Ad}( \mathbf T(t_2) \mathbf T(t_1)^{-1} ).\end{aligned}\tag{54}$$

Now, consider differentiating with respect to left-updates to $\mathbf T(t_2)$.

$$\begin{aligned}\mathbf J_\text{left}(t_2) \equiv \frac{\partial \mathbf F_\text{velocity}}{\partial \bm \delta(t_2)} &= \frac{\partial}{\partial \bm \delta} h \log\left( \exp(\bm \delta) \mathbf T(t_2) \mathbf T(t_1)^{-1} \right)\\ &= h\mathbf D_{\log}( \mathbf T(t_2) \mathbf T(t_1)^{-1} ).\end{aligned}\tag{55}$$

6.6 Regularisation constraint

The regularisation constraint implements a basic Tikhonov regulariser. It seeks to dampen the problem to avoid divergent oscillations, overfitting, or other types of poor convergence.

6.6.1 Residual

The residual is $6\times 1$:

$$\begin{aligned}\mathbf F_\text{regulariser} = \bm \xi.\end{aligned}\tag{56}$$

6.6.2 Jacobian

The Jacobian is the identity:

$$\begin{aligned}\mathbf J_\text{regulariser} = \mathbf I.\end{aligned}\tag{57}$$

7 References

[jlblanco] J. L. Blanco. A tutorial on SE(3) transformation parameterizations and on-manifold optimization. Technical report, University of Málaga.

[eade] E. Eade. Lie groups for 2D and 3D transformations. Technical report.