SE(3) constraints for robotics
This document summarises some common maths used for state estimation of rigid bodies, such as in robotics.
1 Transformation parameterisation
Rigid transformations in 3 dimensions form the special Euclidean group, $\mathrm{SE}(3)$, and can be written in the homogeneous form:

$$\mathbf{T} = \begin{bmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix}$$

The rotation matrix $\mathbf{R}$ is in the special orthogonal group $\mathrm{SO}(3)$, which means that it is orthogonal (its columns are unit length and mutually orthogonal) and it has determinant $+1$.
1.1 Transforming a point
An element $\mathbf{T} = (\mathbf{R}, \mathbf{t}) \in \mathrm{SE}(3)$ may transform a 3D point $\mathbf{x} \in \mathbb{R}^3$:

$$\mathbf{T}\mathbf{x} = \mathbf{R}\mathbf{x} + \mathbf{t}$$

In this document we use this interchangeably with the homogeneous representation:

$$\tilde{\mathbf{x}} = \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}$$

so that points may be transformed rigidly by a single matrix multiplication, $\tilde{\mathbf{x}} \mapsto \mathbf{T}\tilde{\mathbf{x}}$.
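As a quick illustration, here is a minimal numpy sketch of both representations (the rotation and translation values are arbitrary):

```python
# Build a homogeneous SE(3) matrix from an illustrative rotation R and
# translation t, then transform a point both ways.
import numpy as np

R = np.array([[0.0, -1.0, 0.0],   # 90 degree rotation about z
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

x = np.array([1.0, 0.0, 0.0])

# Non-homogeneous form: T acts on x as R x + t.
y1 = R @ x + t

# Homogeneous form: append a 1, multiply, drop the 1.
x_h = np.append(x, 1.0)
y2 = (T @ x_h)[:3]

assert np.allclose(y1, y2)
```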
1.2 The Lie algebra
Each element in $\mathrm{SE}(3)$ is associated with an element of the corresponding Lie algebra, $\mathfrak{se}(3)$:

$$\boldsymbol{\xi}^\wedge = \begin{bmatrix} \boldsymbol{\omega}^\wedge & \mathbf{v} \\ \mathbf{0}^\top & 0 \end{bmatrix}$$

where $\boldsymbol{\omega}^\wedge$ is the skew-symmetric form of the cross product by $\boldsymbol{\omega}$. We may also write it as a vector:

$$\boldsymbol{\xi} = \begin{bmatrix} \boldsymbol{\omega} \\ \mathbf{v} \end{bmatrix} \in \mathbb{R}^6$$

We will use the matrix form $\boldsymbol{\xi}^\wedge \in \mathfrak{se}(3)$ and the vector form $\boldsymbol{\xi} \in \mathbb{R}^6$ interchangeably depending on context.

Note that some other textbooks put the translational part on top and the rotational part below. It doesn't matter much, but it will affect our notation for things like the adjoint action, differentials, etc. below.
1.3 The exponential map
The group and its algebra are related by the exponential map:

$$\mathbf{T} = \exp(\boldsymbol{\xi}^\wedge)$$

The definition of $\exp$ is based on the Taylor series:

$$\exp(\boldsymbol{\xi}^\wedge) = \sum_{k=0}^{\infty} \frac{(\boldsymbol{\xi}^\wedge)^k}{k!}$$

Closed forms exist; see J. L. Blanco's report [jlblanco].
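As an illustration, here is a minimal numpy sketch of the closed-form exponential map using the standard Rodrigues-style coefficients (with the rotational part of $\boldsymbol{\xi}$ on top, matching our convention):

```python
# Closed-form exp for SE(3): xi = (omega, v) with the rotation on top.
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def se3_exp(xi):
    omega, v = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    W = skew(omega)
    if theta < 1e-8:
        R = np.eye(3) + W          # first-order fallback near zero
        V = np.eye(3)
    else:
        a = np.sin(theta) / theta
        b = (1 - np.cos(theta)) / theta**2
        c = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + a * W + b * (W @ W)      # Rodrigues' formula
        V = np.eye(3) + b * W + c * (W @ W)      # maps v to the translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T
```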
Note that $\mathrm{SE}(3)$ is not commutative. The adjoint action relates multiplication in different orders:

$$\mathbf{T}\exp(\boldsymbol{\xi}^\wedge) = \exp\big((\mathrm{Adj}_{\mathbf{T}}\,\boldsymbol{\xi})^\wedge\big)\,\mathbf{T}$$

The adjoint is a $6 \times 6$ matrix:

$$\mathrm{Adj}_{\mathbf{T}} = \begin{bmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{t}^\wedge\mathbf{R} & \mathbf{R} \end{bmatrix}$$
1.4 Notation summary
In general,
- bold lowercase refers to vectors (e.g. translation $\mathbf{t}$)
- bold uppercase refers to matrices (e.g. transformation $\mathbf{T}$)
- non-bold lowercase refers to scalars (e.g. time $t$)
Here we define and summarise some notation for the following sections.
Description | Notation |
---|---|
skew-symmetric cross product matrix | $\boldsymbol{\omega}^\wedge$ |
translational part of $\mathbf{T}$ | $\mathrm{trans}(\mathbf{T}) = \mathbf{t}$ |
rotational part of $\mathbf{T}$ | $\mathrm{rot}(\mathbf{T}) = \mathbf{R}$ |
translational part of $\boldsymbol{\xi}$ | $\mathbf{v}$ |
rotational part of $\boldsymbol{\xi}$ | $\boldsymbol{\omega}$ |
compose $\mathbf{T}_1$ and $\mathbf{T}_2$ | $\mathbf{T}_1\mathbf{T}_2$ |
exp of $\boldsymbol{\xi}$ | $\exp(\boldsymbol{\xi}^\wedge) \in \mathrm{SE}(3)$ |
log of $\mathbf{T}$ | $\log(\mathbf{T})^\vee \in \mathbb{R}^6$ |
exp of $\boldsymbol{\omega}$ | $\exp(\boldsymbol{\omega}^\wedge) \in \mathrm{SO}(3)$ |
log of $\mathbf{R}$ | $\log(\mathbf{R})^\vee \in \mathbb{R}^3$ |
inverse of $\mathbf{T}$ | $\mathbf{T}^{-1}$ |
inverse of $\mathbf{R}$ | $\mathbf{R}^{-1} = \mathbf{R}^\top$ |
inverse of $\boldsymbol{\xi}$ | $-\boldsymbol{\xi}$ |
inverse of $\boldsymbol{\omega}$ | $-\boldsymbol{\omega}$ |
adjoint of $\mathbf{T}$ | $\mathrm{Adj}_{\mathbf{T}}$ |
$i$th element of $\boldsymbol{\xi}$ | $\xi_i$ |
$i$th element of $\boldsymbol{\omega}$ | $\omega_i$ |
transform $\mathbf{x}$ by $\mathbf{T}$ | $\mathbf{T}\mathbf{x} = \mathbf{R}\mathbf{x} + \mathbf{t}$ |
transform $\mathbf{x}$ by $\mathbf{R}$ | $\mathbf{R}\mathbf{x}$ |
2 Derivatives
Here, we only differentiate with respect to $\boldsymbol{\xi}$ around the point $\boldsymbol{\xi} = \mathbf{0}$. In other words, we have a function $f(\mathbf{T})$ where $\mathbf{T} \in \mathrm{SE}(3)$, and we would like to perturb $\mathbf{T}$ by a very small $\boldsymbol{\xi} \in \mathbb{R}^6$.

Suppose $f : \mathrm{SE}(3) \to \mathbb{R}^n$. If $f(\mathbf{T})$ is of dimension $n$, then the resulting derivative is an $n \times 6$ Jacobian:

$$\frac{\partial f}{\partial \boldsymbol{\xi}} = \frac{\partial}{\partial \boldsymbol{\xi}} f(\boldsymbol{\xi} \oplus \mathbf{T}) \bigg|_{\boldsymbol{\xi} = \mathbf{0}}$$

where $\boldsymbol{\xi} \oplus \mathbf{T} = \exp(\boldsymbol{\xi}^\wedge)\,\mathbf{T}$ in the case of left-perturbations and $\boldsymbol{\xi} \oplus \mathbf{T} = \mathbf{T}\exp(\boldsymbol{\xi}^\wedge)$ in the case of right-perturbations.
The adjoint action can be used to relate the left and right derivatives by using the chain rule: since $\exp(\boldsymbol{\xi}^\wedge)\,\mathbf{T} = \mathbf{T}\exp\big((\mathrm{Adj}_{\mathbf{T}^{-1}}\,\boldsymbol{\xi})^\wedge\big)$, we have $\mathbf{J}_{\text{left}} = \mathbf{J}_{\text{right}}\,\mathrm{Adj}_{\mathbf{T}^{-1}}$.
We also have a derivative of the log function:

$$\frac{\partial}{\partial \boldsymbol{\xi}} \log\big(\exp(\boldsymbol{\xi}^\wedge)\,\mathbf{T}\big)^\vee \bigg|_{\boldsymbol{\xi} = \mathbf{0}} = \mathbf{J}_l^{-1}\big(\log(\mathbf{T})^\vee\big)$$

where $\mathbf{J}_l$ is the left Jacobian of $\mathrm{SE}(3)$. The exact derivation is in Ethan Eade's report [eade].
Let $\mathbf{x}$ be a 3D point and $\mathbf{T} \in \mathrm{SE}(3)$; then the following table lists some useful derivatives, taken with respect to left-perturbations.

As you will see later, these are basically all the derivatives you need, and all other derivatives can be easily derived from these, often together with using the adjoint.

Function | Derivative |
---|---|
$\mathbf{T}\mathbf{x}$ | $\begin{bmatrix} -(\mathbf{T}\mathbf{x})^\wedge & \mathbf{I} \end{bmatrix}$ |
$\mathbf{R}\mathbf{x}$ | $-(\mathbf{R}\mathbf{x})^\wedge$ (with respect to $\boldsymbol{\omega}$ only) |
$\log(\mathbf{T})^\vee$ | $\mathbf{J}_l^{-1}\big(\log(\mathbf{T})^\vee\big)$ |
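To build confidence in such derivations, the left Jacobian of $\mathbf{T}\mathbf{x}$ can be checked numerically. Below is a minimal numpy/scipy sketch, assuming the conventions above (left perturbation, rotational part on top); `wedge` is an illustrative helper mapping $\boldsymbol{\xi} \in \mathbb{R}^6$ to its $4 \times 4$ matrix form.

```python
# Finite-difference check of the left Jacobian of f(T) = T x,
# which the table above gives as [ -(T x)^ , I ].
import numpy as np
from scipy.linalg import expm

def skew(w):
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def wedge(xi):                          # 4x4 matrix form of xi = (omega, v)
    M = np.zeros((4, 4))
    M[:3, :3] = skew(xi[:3])
    M[:3, 3] = xi[3:]
    return M

rng = np.random.default_rng(0)
T = expm(wedge(rng.normal(size=6)))     # an arbitrary pose
x = rng.normal(size=3)
Tx = T[:3, :3] @ x + T[:3, 3]

J_analytic = np.hstack([-skew(Tx), np.eye(3)])

eps = 1e-6
J_numeric = np.zeros((3, 6))
for i in range(6):
    xi = np.zeros(6); xi[i] = eps
    T_pert = expm(wedge(xi)) @ T        # left perturbation
    J_numeric[:, i] = ((T_pert[:3, :3] @ x + T_pert[:3, 3]) - Tx) / eps

assert np.allclose(J_analytic, J_numeric, atol=1e-4)
```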
3 Optimisation
Suppose we have a trajectory $\mathbf{T}(t)$ that is a smooth curve in $\mathrm{SE}(3)$. We would like to minimise some objective function:

$$E = \frac{1}{2}\,\mathbf{r}^\top\mathbf{r}$$

where $\mathbf{r}$ is a vector of residuals (defined in the constraints section below). We seek to reduce the cost as much as possible. This is a nonlinear least squares problem. During the optimisation process, we perturb the trajectory by a small perturbation $\boldsymbol{\xi}$:

$$\mathbf{T} \leftarrow \boldsymbol{\xi} \oplus \mathbf{T}$$
where the $\oplus$ operator can either be the left-update:

$$\boldsymbol{\xi} \oplus \mathbf{T} = \exp(\boldsymbol{\xi}^\wedge)\,\mathbf{T}$$

or the right-update:

$$\boldsymbol{\xi} \oplus \mathbf{T} = \mathbf{T}\exp(\boldsymbol{\xi}^\wedge)$$
Both are valid depending on numerical properties of the problem.
In any case, we linearise the problem around $\boldsymbol{\xi} = \mathbf{0}$. Let the Jacobian matrix be:

$$\mathbf{J} = \frac{\partial \mathbf{r}}{\partial \boldsymbol{\xi}} \bigg|_{\boldsymbol{\xi} = \mathbf{0}} \qquad \text{so that} \qquad \mathbf{r}(\boldsymbol{\xi} \oplus \mathbf{T}) \approx \mathbf{r} + \mathbf{J}\boldsymbol{\xi}$$
Then we can solve using Gauss-Newton:

$$\boldsymbol{\xi}_{\mathrm{GN}} = -(\mathbf{J}^\top\mathbf{J})^{-1}\mathbf{J}^\top\mathbf{r}$$

or steepest descent:

$$\boldsymbol{\xi}_{\mathrm{SD}} = -\lambda\,\mathbf{J}^\top\mathbf{r}$$

or more sophisticated algorithms. In practice, we use Powell's Dog Leg. The update is

$$\boldsymbol{\xi} = \alpha\,\boldsymbol{\xi}_{\mathrm{SD}} + \beta\,\boldsymbol{\xi}_{\mathrm{GN}}$$

where the scalar weights $\alpha$, $\beta$ are chosen such that $\|\boldsymbol{\xi}\| \le \Delta$, where the scalar parameter $\Delta$ is the radius of the trust region. When $\Delta$ is small, the problem is behaving badly and the quadratic approximation that Gauss-Newton uses is not very valid. In this case, $\beta$ is zero, allowing the optimiser to take timid steps along the steepest descent direction. When $\Delta$ is large, the quadratic approximation is good and we take bigger steps in the Gauss-Newton direction.

Heuristics are used to determine when to increase $\Delta$ and when to decrease it.
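Below is a minimal sketch of a single dog-leg step under these rules, assuming a dense Jacobian `J`, residual `r`, and trust region radius `Delta`; real implementations cache factorisations and handle degenerate cases.

```python
# One Powell dog-leg step for the linearised problem r + J xi.
import numpy as np

def dogleg_step(J, r, Delta):
    g = J.T @ r                                       # gradient of 0.5*||r||^2
    h_sd = -(g.dot(g) / np.square(J @ g).sum()) * g   # Cauchy (steepest descent) point
    h_gn = -np.linalg.lstsq(J, r, rcond=None)[0]      # Gauss-Newton step
    if np.linalg.norm(h_gn) <= Delta:
        return h_gn                                   # quadratic model trusted
    if np.linalg.norm(h_sd) >= Delta:
        return Delta * h_sd / np.linalg.norm(h_sd)    # timid steepest-descent step
    # Otherwise walk from the Cauchy point towards Gauss-Newton until we hit
    # the trust region boundary: ||h_sd + beta*(h_gn - h_sd)|| = Delta.
    d = h_gn - h_sd
    a, b, c = d.dot(d), 2 * h_sd.dot(d), h_sd.dot(h_sd) - Delta**2
    beta = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return h_sd + beta * d
```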
3.1 Optimisation under uncertainty
When uncertainty is present, the data is associated with some covariance matrix $\boldsymbol{\Sigma}$, which is an $n \times n$ matrix where $n$ is the size of the residual. The Gauss-Newton update then becomes:

$$\boldsymbol{\xi}_{\mathrm{GN}} = -(\mathbf{J}^\top\boldsymbol{\Sigma}^{-1}\mathbf{J})^{-1}\mathbf{J}^\top\boldsymbol{\Sigma}^{-1}\mathbf{r}$$

The matrix $\boldsymbol{\Sigma}^{-1}$ is also sometimes called the information matrix. In practice, it is useful to factor this matrix, for example, by taking the matrix square root. Let $\mathbf{W}^\top\mathbf{W} = \boldsymbol{\Sigma}^{-1}$; then,

$$\mathbf{J}^\top\boldsymbol{\Sigma}^{-1}\mathbf{J} = (\mathbf{W}\mathbf{J})^\top(\mathbf{W}\mathbf{J}), \qquad \mathbf{J}^\top\boldsymbol{\Sigma}^{-1}\mathbf{r} = (\mathbf{W}\mathbf{J})^\top(\mathbf{W}\mathbf{r})$$

In other words, we just pre-multiply the Jacobian and residual by a weight matrix. This is a form of whitening. In the common case where $\boldsymbol{\Sigma}$ is a diagonal matrix, we simply divide each row of the Jacobian and the residual by the corresponding standard deviation.
Another common factorisation is the eigendecomposition of the covariance matrix $\boldsymbol{\Sigma}$:

$$\boldsymbol{\Sigma} = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^\top$$

where $\mathbf{Q}$ is the square matrix whose columns are the orthonormal eigenvectors of $\boldsymbol{\Sigma}$ and $\boldsymbol{\Lambda}$ is the diagonal matrix whose entries are the eigenvalues. Then,

$$\mathbf{W} = \boldsymbol{\Lambda}^{-1/2}\mathbf{Q}^\top$$

This is useful, for example, in the case of surfel matches. Recall that the covariance matrix is always symmetric positive semidefinite, allowing for easy eigendecomposition.
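Here is a minimal sketch of both factorisations on an illustrative $3 \times 3$ covariance, verifying that they produce the same normal equations:

```python
# Whitening a residual block with W such that W^T W = Sigma^{-1},
# via Cholesky and via eigendecomposition of the covariance.
import numpy as np

rng = np.random.default_rng(1)
J = rng.normal(size=(3, 6))
r = rng.normal(size=3)
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + 1e-3 * np.eye(3)     # a symmetric positive definite covariance

# Option 1: transposed Cholesky factor of the information matrix.
W = np.linalg.cholesky(np.linalg.inv(Sigma)).T
J_w, r_w = W @ J, W @ r

# Option 2: Sigma = Q diag(lam) Q^T, so W = diag(lam^-1/2) Q^T.
lam, Q = np.linalg.eigh(Sigma)
W2 = np.diag(lam ** -0.5) @ Q.T
J_w2, r_w2 = W2 @ J, W2 @ r

# Both give the same normal equations J^T Sigma^{-1} J.
assert np.allclose(J_w.T @ J_w, J_w2.T @ J_w2)
```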
3.2 Robust loss functions
A least squares problem can be thought of as minimising the negative log likelihood function of a Gaussian. The likelihood function is of the form

$$L(\mathbf{r}) \propto \exp\left(-\tfrac{1}{2}\,\mathbf{r}^\top\boldsymbol{\Sigma}^{-1}\mathbf{r}\right)$$

Then, the negative log likelihood is of the least squares form:

$$-\log L(\mathbf{r}) = \tfrac{1}{2}\,\mathbf{r}^\top\boldsymbol{\Sigma}^{-1}\mathbf{r} + \text{const}$$

However, in many cases, the random variable is not Gaussian. The Gaussian has very thin tails, making it incredibly unlikely to produce outliers. In real life, there are often many outliers, which necessitates a more thick-tailed distribution. To model such situations, we need robust loss functions.

Instead of optimising $\sum_i \|\mathbf{r}_i\|^2$, we optimise:

$$\sum_i \rho\big(\|\mathbf{r}_i\|^2\big)$$

where $\rho$ is a robust loss function. We can then weigh each row or block of the Jacobian and the residual with a robust reweighting factor $\sqrt{\rho'(s)}$, where $s = \|\mathbf{r}_i\|^2$:
Name | Definition |
---|---|
TrivialLoss | $\rho(s) = s$ |
CauchyLoss | $\rho(s) = \log(1 + s)$ |
HuberLoss | $\rho(s) = s$ if $s \le 1$, else $2\sqrt{s} - 1$ |
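Below is a minimal sketch of the corresponding reweighting factors, following the Ceres convention where the loss is applied to the squared residual norm $s$:

```python
# Reweighting factors sqrt(rho'(s)) for the losses above.
import numpy as np

def trivial_weight(s):
    return 1.0                        # rho(s) = s, so rho'(s) = 1

def cauchy_weight(s):
    return np.sqrt(1.0 / (1.0 + s))   # rho(s) = log(1 + s)

def huber_weight(s):
    # rho'(s) = 1 for s <= 1, and 1/sqrt(s) beyond the threshold.
    return 1.0 if s <= 1.0 else np.sqrt(1.0 / np.sqrt(s))

r_block = np.array([0.1, 3.0, -2.0])  # one residual block
s = r_block.dot(r_block)
w = cauchy_weight(s)
# Scale this block's rows of the Jacobian and residual by w before solving.
r_scaled = w * r_block
```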
4 Trajectory representation
The goal is to recover the trajectory of a vehicle as a parametric curve $\mathbf{T}(t)$, a function of time $t$.
We may assume the curve does not oscillate faster than some frequency (say, 10 Hz).
Now we should find a trajectory representation that satisfies the following properties:
- For some finite period of time, the trajectory should be approximately correct and differentiable.
- For any given real number $t$ within the domain of the curve, we can efficiently evaluate $\mathbf{T}(t)$ in constant time.
- Local perturbations may be applied to the curve, so that $\mathbf{T}(t) \leftarrow \boldsymbol{\xi}(t) \oplus \mathbf{T}(t)$, such that the perturbation may be parameterised with a finite number of parameters, each with compact support.
- The trajectory must be independent of the choice of the inertial reference frame.
Our solution to satisfy these requirements is a piecewise linear trajectory. The trajectory is represented as a sequence of $\mathrm{SE}(3)$ elements $\mathbf{T}_i$ that represent $\mathbf{T}(t_i)$ for times $t_i$ sampled with even spacing at a high frequency (say, 100 Hz). To evaluate $\mathbf{T}(t)$, we use a geodesic interpolation for $t_i \le t < t_{i+1}$:

$$\mathbf{T}(t) = \exp\big(s \log(\mathbf{T}_{i+1}\mathbf{T}_i^{-1})\big)\,\mathbf{T}_i, \qquad s = \frac{t - t_i}{t_{i+1} - t_i}$$

To perturb this spline with a curve $\boldsymbol{\xi}(t)$, we update each $\mathbf{T}_i$ like so:

$$\mathbf{T}_i \leftarrow \exp\big(\boldsymbol{\xi}(t_i)^\wedge\big)\,\mathbf{T}_i$$
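Here is a minimal sketch of the geodesic interpolation, assuming evenly spaced knots and using scipy's generic matrix exponential/logarithm in place of the closed forms:

```python
# Geodesic (screw-linear) interpolation between two knot poses.
import numpy as np
from scipy.linalg import expm, logm

def interpolate(T_i, T_j, s):
    """Pose at fraction s in [0, 1] along the geodesic from T_i to T_j."""
    delta = logm(T_j @ np.linalg.inv(T_i))   # relative twist, 4x4 matrix form
    return expm(s * delta.real) @ T_i

def evaluate(T, t0, dt, t):
    """Evaluate at time t given knots T[k] sampled at times t0 + k*dt."""
    k = int((t - t0) // dt)
    s = (t - t0) / dt - k
    return interpolate(T[k], T[k + 1], s)
```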
Alternative approaches for parameterising trajectories exist, but are not covered here.
5 Parameterisation of the perturbation
When applying perturbations, the curve $\boldsymbol{\xi}(t)$ is a smooth curve which may be thought of as a vector of infinite dimension.

To ensure that the problem is computationally tractable and that $\boldsymbol{\xi}(t)$ remains twice-differentiable, we parameterise the perturbation by a finite vector $\boldsymbol{\delta}$. The vector $\boldsymbol{\delta}$ is the concatenation of many six-dimensional vectors $\boldsymbol{\delta}_k$, such that

$$\boldsymbol{\xi}(t) = \sum_k B_k(t)\,\boldsymbol{\delta}_k$$

Notice that, since $\boldsymbol{\xi}(t)$ only applies small local perturbations, it is possible to add the $\boldsymbol{\delta}_k$ together, treating them as vectors in $\mathbb{R}^6$. The scalar-valued function $B_k(t)$ is a basis function with compact support, which means that it is nonzero for a finite contiguous segment of $t$ and zero everywhere else. A good choice is the piecewise polynomial for a cardinal cubic B-spline:

$$B(u) = \begin{cases} \frac{2}{3} - u^2 + \frac{1}{2}|u|^3 & |u| < 1 \\ \frac{1}{6}(2 - |u|)^3 & 1 \le |u| < 2 \\ 0 & \text{otherwise} \end{cases}$$

with $B_k(t) = B\big((t - t_k)/\Delta t\big)$ for knot time $t_k$ and knot spacing $\Delta t$.
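A minimal sketch of this basis function; the partition-of-unity check at the end illustrates that shifted copies on an integer grid sum to one:

```python
# Cardinal cubic B-spline basis; nonzero only for |u| < 2, so each
# perturbation parameter has compact support.
import numpy as np

def bspline3(u):
    a = abs(u)
    if a < 1.0:
        return 2.0 / 3.0 - a * a + 0.5 * a ** 3
    if a < 2.0:
        return (2.0 - a) ** 3 / 6.0
    return 0.0

# Partition of unity on integer shifts: sum_k B(u - k) == 1.
assert abs(sum(bspline3(0.4 - k) for k in range(-3, 4)) - 1.0) < 1e-12
```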
We can now redefine the Jacobian to be with respect to the parameters:

$$\frac{\partial \mathbf{r}}{\partial \boldsymbol{\delta}_k} = \frac{\partial \mathbf{r}}{\partial \boldsymbol{\xi}} \frac{\partial \boldsymbol{\xi}}{\partial \boldsymbol{\delta}_k}$$

Since $\partial \boldsymbol{\xi} / \partial \boldsymbol{\delta}_k = B_k(t)\,\mathbf{I}$ is a constant,

$$\frac{\partial \mathbf{r}}{\partial \boldsymbol{\delta}_k} = B_k(t)\,\frac{\partial \mathbf{r}}{\partial \boldsymbol{\xi}}$$

The elements $\boldsymbol{\delta}_k$ are known as the spline knots. Since each knot is a six-dimensional vector, the number of columns of the full Jacobian is $6K$ where $K$ is the number of spline knots.

As you can see, the derivative of the trajectory at any point in time is a linear combination of the derivatives with respect to up to four spline knots.
In the next sections we will compute derivatives

$$\frac{\partial \mathbf{r}}{\partial \boldsymbol{\xi}}$$

knowing that each of these blocks will contribute up to four blocks, weighted with scalar weights $B_k(t)$, to the actual Jacobian where we are optimising the spline knots $\boldsymbol{\delta}_k$.
6 Constraints
The function $\mathbf{r}$ is known as the residual. It consists of many stacked constraints:

$$\mathbf{r} = \begin{bmatrix} \mathbf{r}_1^\top & \mathbf{r}_2^\top & \cdots \end{bmatrix}^\top$$
For people familiar with the Ceres library, each constraint is a residual block.
6.1 Position constraint
The position constraint seeks to penalise the distance between the pose's translational component and a measured 3D point $\mathbf{p}$.
For example, the 3D point $\mathbf{p}$ may be the GPS position measured at time $t$.
6.1.1 Residual
The residual is a $3 \times 1$ vector:

$$\mathbf{r} = \mathrm{trans}(\mathbf{T}(t)) - \mathbf{p}$$

Recall from the notation section that $\mathrm{trans}(\mathbf{T})$ is the translational component of the element $\mathbf{T} \in \mathrm{SE}(3)$.
6.1.2 Left Jacobian
The left Jacobian is $3 \times 6$:

$$\mathbf{J}_{\text{left}} = \begin{bmatrix} -\mathrm{trans}(\mathbf{T})^\wedge & \mathbf{I} \end{bmatrix}$$

The trick is to view $\mathrm{trans}(\mathbf{T}) = \mathbf{T}\mathbf{x}$ where $\mathbf{x} = \mathbf{0}$. Then we can apply the Jacobian for transforming a point from the derivatives table.
6.1.3 Right Jacobian
The right Jacobian is $3 \times 6$:

$$\mathbf{J}_{\text{right}} = \begin{bmatrix} \mathbf{0} & \mathbf{R} \end{bmatrix}$$
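Putting the residual and both Jacobians together, here is a minimal sketch (the function name and the $4 \times 4$ pose input are illustrative):

```python
# Position constraint block: pose T (4x4) evaluated at the measurement
# time, measured point p (e.g. from GPS).
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def position_constraint(T, p):
    t = T[:3, 3]                                        # translational part of T
    r = t - p                                           # residual
    J_left = np.hstack([-skew(t), np.eye(3)])           # d r / d xi, left update
    J_right = np.hstack([np.zeros((3, 3)), T[:3, :3]])  # d r / d xi, right update
    return r, J_left, J_right
```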
6.2 Loop closure constraint
Suppose that we have aligned the poses from two points in time, $t_1$ and $t_2$, along a trajectory, e.g. using ICP.
This produces a relative transformation measurement $\hat{\mathbf{T}}_{12}$ relating $\mathbf{T}(t_1)$ and $\mathbf{T}(t_2)$.
6.2.1 Residual
The residual is a $6 \times 1$ vector:

$$\mathbf{r} = \log\big(\hat{\mathbf{T}}_{12}^{-1}\,\mathbf{T}(t_1)\,\mathbf{T}(t_2)^{-1}\big)^\vee$$
6.2.2 Left Jacobians

The Jacobians with respect to left-updates of $\mathbf{T}(t_1)$ and $\mathbf{T}(t_2)$ follow from the derivative of the log function in the derivatives table, together with the adjoint.

6.2.3 Right Jacobians

The right Jacobians are related to the left Jacobians through the adjoint action, as described in the derivatives section.
6.3 Gravity constraint
The gravity constraint seeks to penalise the distance between the up vector of the car, $\mathbf{R}\mathbf{u}$, and the true up vector $\mathbf{g}$.
For example, if the sensor were mounted perfectly upright, then $\mathbf{u} = [0\ 0\ 1]^\top$.
We can also use the accelerometer reading to obtain a different $\mathbf{g}$ at each point in time, as long as you remember to subtract the inertial acceleration.
6.3.1 Residual
The residual is a $3 \times 1$ vector:

$$\mathbf{r} = \mathbf{R}\mathbf{u} - \mathbf{g}, \qquad \mathbf{R} = \mathrm{rot}(\mathbf{T}(t))$$
6.3.2 Left Jacobian

Since the gravity constraint only depends on rotation and not on translation, we only care about the derivative with respect to $\boldsymbol{\omega}$, which we hereafter write as $\partial \mathbf{r} / \partial \boldsymbol{\omega}$. From the derivatives table,

$$\frac{\partial \mathbf{r}}{\partial \boldsymbol{\omega}} = -(\mathbf{R}\mathbf{u})^\wedge$$

6.3.3 Right Jacobian

$$\frac{\partial \mathbf{r}}{\partial \boldsymbol{\omega}} = -\mathbf{R}\,\mathbf{u}^\wedge$$

Recall that $\mathbf{R}$ is the rotational component of the pose $\mathbf{T}(t)$.
6.4 Point constraint
We are aligning a “moving” or “map” point $\mathbf{x}$ to a “static” or “scene” point $\mathbf{y}$ by transforming the moving point with the pose $\mathbf{T}$.
The matrix $\mathbf{S}$ can be used to store the uncertainty of the point $\mathbf{y}$, i.e. an information matrix (the inverse of a covariance matrix).
- If you are registering the point to a line, you can let $\mathbf{S}$ be the $2 \times 3$ matrix whose rows are orthonormal and perpendicular to the line. A common choice is to compute an orthonormal basis from the direction of the ray, e.g. using Duff et al.'s approach. This is commonly used in visual odometry.
- If you are registering the point to a plane, you can let $\mathbf{S}$ be the $1 \times 3$ plane normal $\mathbf{n}^\top$.
6.4.1 Residual
The residual is a vector:

$$\mathbf{r} = \mathbf{S}\,(\mathbf{T}\mathbf{x} - \mathbf{y})$$
6.4.2 Left Jacobian
The left Jacobian is $\mathbf{S}\begin{bmatrix} -\mathbf{z}^\wedge & \mathbf{I} \end{bmatrix}$, where $\mathbf{z}$ is $\mathbf{T}\mathbf{x}$.
6.4.3 Right Jacobian
The right Jacobian is $\mathbf{S}\begin{bmatrix} -\mathbf{R}\,\mathbf{x}^\wedge & \mathbf{R} \end{bmatrix}$, where $\mathbf{R}$ is $\mathrm{rot}(\mathbf{T})$.
Recall that $\mathbf{R}$ is the rotational component of the pose $\mathbf{T}$.
6.5 Velocity constraint
The velocity constraint seeks to penalise deviations of the vehicle velocity from the 6-degree-of-freedom velocity estimates of another source.
For ease of implementation, the vehicle velocity is obtained by numerical differentiation, e.g. by evaluating the pose at times $t - h$ and $t + h$ for a small step $h$. Let $\mathbf{A} = \mathbf{T}(t + h)$ and $\mathbf{B} = \mathbf{T}(t - h)$.
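Here is a minimal sketch of this numerical differentiation, assuming a callable `trajectory(t)` returning a $4 \times 4$ pose; the helper `vee` (illustrative) flattens a twist matrix to $\mathbb{R}^6$ with the rotational part on top.

```python
# Central-difference 6-DoF velocity from a pose-valued trajectory.
import numpy as np
from scipy.linalg import logm

def vee(M):
    """Flatten a 4x4 twist matrix to (omega, v)."""
    return np.array([M[2, 1], M[0, 2], M[1, 0], M[0, 3], M[1, 3], M[2, 3]])

def numerical_velocity(trajectory, t, h=1e-3):
    A = trajectory(t + h)
    B = trajectory(t - h)
    return vee(logm(A @ np.linalg.inv(B)).real) / (2.0 * h)
```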
6.5.1 Residual
The residual is a $6 \times 1$ vector. For concise notation, let us define the matrix $\mathbf{Q}$ such that

$$\mathbf{Q} = \mathbf{A}\mathbf{B}^{-1}$$

so the residual compares the numerically differentiated velocity against the measured velocity $\tilde{\boldsymbol{\nu}}$:

$$\mathbf{r} = \frac{1}{2h}\log(\mathbf{Q})^\vee - \tilde{\boldsymbol{\nu}}$$
6.5.2 Left Jacobian
Consider differentiating with respect to left-updates of $\mathbf{A}$:

$$\mathbf{Q} \leftarrow \exp(\boldsymbol{\xi}^\wedge)\,\mathbf{A}\mathbf{B}^{-1} = \exp(\boldsymbol{\xi}^\wedge)\,\mathbf{Q}$$

The Jacobian is $\frac{1}{2h}\,\mathbf{J}_l^{-1}\big(\log(\mathbf{Q})^\vee\big)$, by the derivative of the log function.

Now, consider differentiating with respect to left-updates to $\mathbf{B}$:

$$\mathbf{Q} \leftarrow \mathbf{A}\big(\exp(\boldsymbol{\xi}^\wedge)\,\mathbf{B}\big)^{-1} = \mathbf{Q}\exp(-\boldsymbol{\xi}^\wedge) = \exp\big(-(\mathrm{Adj}_{\mathbf{Q}}\,\boldsymbol{\xi})^\wedge\big)\,\mathbf{Q}$$

so the Jacobian is $-\frac{1}{2h}\,\mathbf{J}_l^{-1}\big(\log(\mathbf{Q})^\vee\big)\,\mathrm{Adj}_{\mathbf{Q}}$, using the adjoint to convert the right-perturbation into a left-perturbation of $\mathbf{Q}$.
6.6 Regularisation constraint
The regularisation constraint implements a basic Tikhonov regulariser. It seeks to dampen the problem to avoid divergent oscillations, overfitting, or other types of poor convergence.
6.6.1 Residual
The residual is the perturbation itself:

$$\mathbf{r} = \boldsymbol{\xi}$$
6.6.2 Jacobian
The Jacobian is the identity:

$$\mathbf{J} = \mathbf{I}$$