Summary of LIDAR scan matching algorithms
===
# Introduction
These days I'm working on [LIDAR](https://en.wikipedia.org/wiki/Lidar) odometry and mapping. That is, for a mobile robot equipped with a LIDAR sensor, we wish to find out what the robot's pose is, and generate a map of the world. This is known as simultaneous localization and mapping (SLAM).
Algorithms that do this can be broadly divided into the following categories:
* **Odometry** algorithms run in an online setting where new data is constantly being gathered, and update the robot's pose estimate as soon as possible. This then allows the point cloud data to be corrected for motion distortion (de-warping).
* **Mapping**, or **scan matching**, algorithms combine two or more point clouds into a single consistent one. This is more time-consuming and is not run as often, or may be omitted altogether in favour of loop closure algorithms.
* **Loop closure** corrects for drift by performing a global optimization over all past data, and is especially useful if a robot returns to a previously-visited area. This is very computationally expensive, and gets much worse as more data is collected.
Odometry by itself is prone to drifting away from the true trajectory over time, so combining the de-warped point clouds in the mapping step can mitigate this. Even then there may be some drift, but some such open-loop algorithms (e.g. (#4)) are quite successful.
pic http://i.imgur.com/RVuLgSQ.png Odometry vs mapping : Picture showing point sets obtained from odometry being registered to a bigger point set during the mapping step. Image credit (#4).
This blog post will focus on the scan matching process, since I was given some de-warped data that need to be aligned with minimal drift.
# Point to point registration
A LIDAR sensor measures distance to the nearest obstacle in many directions. So its output is a point cloud with one point for each of those measurements.
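For a 2-D scanner, this conversion from ranges and bearing angles to Cartesian points is a simple polar-to-Cartesian transform. A minimal sketch in Python (the function and array names here are my own):

```python
import numpy as np

def ranges_to_points(ranges, angles):
    """Convert 2-D LIDAR range measurements (metres) taken at the
    given bearing angles (radians) into an (N, 2) point cloud."""
    ranges = np.asarray(ranges, dtype=float)
    angles = np.asarray(angles, dtype=float)
    return np.stack([ranges * np.cos(angles),
                     ranges * np.sin(angles)], axis=1)

# Two beams hitting obstacles 2 m away, straight ahead and to the left.
pts = ranges_to_points([2.0, 2.0], [0.0, np.pi / 2])
```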
pic https://upload.wikimedia.org/wikipedia/commons/0/00/Registration_outdoor.png Point set registration : Registration of two scans (blue, and grey), with the red lines showing correspondences between features. Image credit [Martin Holzkothen, Michael Korn](http://docs.pointclouds.org/trunk/group__registration.html).
[Point set registration](https://en.wikipedia.org/wiki/Point_set_registration)
is the problem of aligning two point sets, often denoted _M_ and _S_ (for map
and scene). Here we will talk about registering _S_ to _M_ but in many papers
they talk about registering _M_ (model) to _S_ instead.
Point set registration algorithms often use the following iterative algorithm:
1. Estimate correspondences between points.
2. Solve for the optimal transform assuming those correspondences.
Since step 2 might change the correspondences, these two steps are repeated until
convergence. There are multiple ways to estimate correspondences.
* In **iterative closest point** (ICP) (#0), every point in _S_ is assumed
to correspond to the closest point in _M_.
* In **robust point matching** (#1), every point in _S_ is assumed to have some
correspondence value between 0 and 1 to every point in _M_. As the algorithm
progresses, it applies deterministic annealing to converge to a
one-to-one correspondence between points in _M_ and points in _S_.
* In **coherent point drift** (#2), correspondences are given by a Bayesian
approach: the points of one set are treated as the centroids of a Gaussian mixture
model, and each point in the other set is assigned a posterior probability of
having been generated by each centroid. The variance, shared by all
Gaussians, decreases as the algorithm approaches convergence.
ICP and its variants are so ubiquitous that almost all point set registration
and scan matching programs use them. However, ICP sometimes gets stuck in local
minima when the initial positions are not well-aligned.
By relaxing the correspondence assumptions, the latter two algorithms are more
robust against poor initial alignment, noisy outliers and missing points.
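As a concrete reference, the basic ICP loop can be sketched in Python. This toy version uses brute-force nearest neighbours and a closed-form SVD (Kabsch) solve for a rigid 2-D transform; a real implementation would use a k-d tree and a convergence test instead of a fixed iteration count:

```python
import numpy as np

def best_rigid_transform(S, M):
    """Least-squares rotation R and translation t mapping S onto M,
    assuming row i of S corresponds to row i of M (Kabsch algorithm)."""
    cs, cm = S.mean(axis=0), M.mean(axis=0)
    U, _, Vt = np.linalg.svd((S - cs).T @ (M - cm))
    if np.linalg.det(Vt.T @ U.T) < 0:   # avoid reflections
        Vt[-1] *= -1
    R = Vt.T @ U.T
    return R, cm - R @ cs

def icp(S, M, iters=20):
    """Register scene S to map M by alternating correspondence
    estimation with the closed-form rigid solve above."""
    S = S.copy()
    for _ in range(iters):
        # Step 1: each point in S corresponds to its closest point in M.
        d = np.linalg.norm(S[:, None, :] - M[None, :, :], axis=2)
        corr = M[d.argmin(axis=1)]
        # Step 2: solve for the optimal transform and apply it.
        R, t = best_rigid_transform(S, corr)
        S = S @ R.T + t
    return S
```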
In all cases, the cost function is some (possibly weighted) sum of distances
of each point in _S_ to its corresponding point in _M_.
There are various ways to estimate the transformation that minimizes the cost,
depending on how you parametrize the transformation (e.g. rigid transformation,
vs non-rigid using thin plate splines, etc). Then, you can use least squares, the
Levenberg-Marquardt algorithm, or something else.
There are also other point set registration algorithms that don't adhere to the
idea of iteratively estimating correspondence and then solving for a transformation.
For example, the kernel correlation method (#3) assigns a kernel (e.g. a Gaussian)
centered at each of the points and then uses gradient descent to maximize the
correlation between those kernels.
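The kernel correlation objective itself is easy to sketch with Gaussian kernels. This only evaluates the objective; (#3) goes further and optimizes it over the transform parameters with gradient descent:

```python
import numpy as np

def kernel_correlation(M, S, sigma=1.0):
    """Sum over all point pairs (m, s) of a Gaussian kernel on their
    distance; larger values mean the two point sets overlap better."""
    d2 = ((M[:, None, :] - S[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2)).sum()

M = np.array([[0.0, 0.0], [1.0, 0.0]])
aligned    = kernel_correlation(M, M)               # perfectly overlapping
misaligned = kernel_correlation(M, M + [5.0, 0.0])  # shifted far away
```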
Note that in addition to spatial distance, point set registration algorithms can
also make use of normals information and colour information where available.
# Higher level features
In laser scan matching, especially using data produced by LIDAR sensors that have
a small number of lasers, data sets are often sparse and may appear to have
spurious patterns (i.e. "rows" of points along each scan line) caused by the way
the scanner collects data. Thus, assuming that each point in _S_ corresponds to
a point in _M_ may not always work. One way to get around this is to use a model
specifically tailored to the pattern of points produced by a LIDAR, e.g. polar
matching (#7). This does not work in the general case, for example when points
have been de-warped using an external source. A higher-level representation of
the point cloud can be used for robust scan matching.
* **Segmentation** methods separate points into local structures, most commonly
planes and edges. The cost function is the distance of plane-like points
in _S_ to planar patches in _M_; and of edge-like points
in _S_ to edge lines in _M_.
** The **LOAM** algorithm (#4) takes advantage of the assumption that
points are produced as a laser beam rotates at a constant speed. This
yields a simple heuristic for estimating curvature: a point is likely to be
in a plane if it is close to the mean position of its neighbours obtained
before and after it, and likely part of an edge otherwise.
* **Grid** or **voxel** methods divide the space into a regular grid, and obtain
a representation of the points within each grid cell. A regular grid naturally
removes any nonuniformity in the manner in which points are collected.
** The **normal distribution transform** (#5) uses a Gaussian to model the
distribution of points within each grid cell. This is used in (#8) for a
mining robot and compared favourably against ICP.
The cost function is the negative of the goodness function obtained by
evaluating the NDT of _M_ at the positions of each point in _S_.
Since Gaussians are differentiable, the authors of this method proposed
optimizing the cost function using Newton's method.
** **Surface elements** or **surfels** are local surface patches estimated by
fitting a plane or ellipsoid to the points within each grid cell. These are used
in (#6) for a mine exploration robot equipped with a LIDAR.
The cost function is a weighted Euclidean sum of the distances between the
centers of corresponding surfels and the differences in their orientations.
* **Invariant feature** methods (#9)(#10) use geometric descriptors
to associate each point with a scalar value. A point is a feature if it has
a rare descriptor value. Using a histogram of descriptor values to identify
features in less populated bins, the algorithm extracts pairs of features in
_S_ and _M_ with known correspondences. The scans can be matched by minimizing
the cost that is the distance between corresponding pairs. If needed, this can
subsequently be refined with ICP.
** **Curvature** can be computed using moments and spherical harmonics from
surrounding points, but is sensitive to noise and sampling rate. (#10)
** The **integral volume descriptor** (#9) is the fraction of the volume
in a ball centered at each point that is considered to be inside the point
cloud. The "insideness" is determined using methods such as ray shooting, and
the volumes are calculated using a voxel occupancy grid.
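The LOAM-style curvature heuristic described above can be sketched for points along a single scan line. The neighbourhood size and the corner example below are illustrative choices of mine, not taken from (#4):

```python
import numpy as np

def curvature(scan, k=2):
    """For each point on a scan line, measure how far it lies from the
    mean of its k neighbours on each side: a value near zero suggests a
    plane, a large value suggests an edge. Endpoints are left as NaN."""
    n = len(scan)
    c = np.full(n, np.nan)
    for i in range(k, n - k):
        nbrs = np.vstack([scan[i - k:i], scan[i + 1:i + k + 1]])
        c[i] = np.linalg.norm(nbrs.mean(axis=0) - scan[i])
    return c

# Points along a right-angled corner: the corner point (index 3)
# should score highest.
scan = np.array([[0.0, 2.0], [1.0, 2.0], [2.0, 2.0], [3.0, 2.0],
                 [3.0, 1.0], [3.0, 0.0], [3.0, -1.0]])
c = curvature(scan)
```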
pic http://i.imgur.com/CjX9PZU.png some NDTs : Some two-dimensional NDTs
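A minimal sketch of the NDT idea: fit a Gaussian to the points of _M_ in each grid cell, then score the points of _S_ against the Gaussian of whichever cell they land in. The cell size and regularization constant below are illustrative, and the real method goes on to optimize the transform against this score:

```python
import numpy as np

def build_ndt(M, cell=2.0):
    """Fit a Gaussian (mean, inverse covariance) to the points of M
    falling in each grid cell; cells with fewer than 3 points are skipped."""
    ndt = {}
    keys = np.floor(M / cell).astype(int)
    for key in {tuple(k) for k in keys}:
        pts = M[(keys == key).all(axis=1)]
        if len(pts) >= 3:
            cov = np.cov(pts.T) + 1e-3 * np.eye(M.shape[1])  # regularize
            ndt[key] = (pts.mean(axis=0), np.linalg.inv(cov))
    return ndt

def ndt_score(ndt, S, cell=2.0):
    """Goodness of fit: sum of Gaussian scores of each point of S,
    evaluated in the NDT cell it falls into (0 for empty cells)."""
    score = 0.0
    for p in S:
        hit = ndt.get(tuple(np.floor(p / cell).astype(int)))
        if hit is not None:
            mu, prec = hit
            score += np.exp(-0.5 * (p - mu) @ prec @ (p - mu))
    return score

# Score a point set against itself vs against a badly shifted copy.
M = np.array([[0.1, 0.1], [0.5, 0.5], [0.9, 0.2], [0.3, 0.8], [0.7, 0.6]])
ndt = build_ndt(M)
good = ndt_score(ndt, M)
bad = ndt_score(ndt, M + 10.0)   # lands in cells with no Gaussian
```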
In most cases, the cost function may be minimized using any nonlinear
least-squares method such as Levenberg-Marquardt. As with point set registration,
for additional robustness against outliers, we can pass the residuals through a
robust loss such as the Cauchy function; this is known as an M-estimator.
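A sketch of the M-estimator idea with the Cauchy loss (the scale parameter `c` is illustrative): a single large outlier dominates a squared-error cost, but contributes only logarithmically to the robust one.

```python
import numpy as np

def cauchy_loss(r, c=1.0):
    """Cauchy (Lorentzian) robust loss: approximately r^2/2 for small
    residuals, but growing only logarithmically for large ones."""
    return (c ** 2 / 2) * np.log1p((r / c) ** 2)

residuals = np.array([0.1, 0.2, 0.1, 50.0])   # last one is an outlier
robust_cost = cauchy_loss(residuals).sum()
squared_cost = (residuals ** 2 / 2).sum()
```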
# Other constraints
In the specific context of a mobile robot, there are other constraints useful
for laser scan matching, which arise from physical assumptions about the robot.
These introduce additional terms in the cost function to penalize infeasible
solutions that would otherwise arise from noise and numerical instability. The following
constraints have been used in (#6) for a mine exploration robot.
* The **smoothness constraints** take into consideration a model of the physical
limitations on the robot's linear and rotational accelerations and penalize large
transformations in a short time.
* The **gravity constraint** assumes that the top of the robot
is mostly pointing straight upwards or else it would have flipped over. This is
especially useful when combined with an IMU to determine the up direction
(after filtering out accelerations caused by motion).
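These constraints could be expressed as extra penalty terms; the weights, limits, and pose representation below are my own illustrative choices, not those of (#6):

```python
import numpy as np

def smoothness_penalty(v_prev, v_curr, dt, a_max, w=1.0):
    """Penalize accelerations (linear or angular) beyond the robot's
    physical limit a_max between consecutive velocity estimates."""
    a = np.linalg.norm(np.asarray(v_curr) - np.asarray(v_prev)) / dt
    return w * max(0.0, a - a_max) ** 2

def gravity_penalty(up, w=1.0):
    """Penalize the robot's up vector tilting away from world +z
    (the up direction could come from a filtered IMU)."""
    up = np.asarray(up, dtype=float)
    up = up / np.linalg.norm(up)
    return w * (1.0 - up @ np.array([0.0, 0.0, 1.0]))
```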
# Ideas for improvement
* Poisson disk sampling instead of a regular grid
* Fit multiple planes to each cell using RANSAC instead of a single surfel
* Fit multiple Gaussians to each cell using EM instead of one
# References
* [#0] [Besl and McKay (1992) A method for registration of 3-D shapes](http://dx.doi.org/10.1109%2F34.121791)
* [#1] [Gold et al (1998) New algorithms for 2d and 3d point matching](http://dx.doi.org/10.1016%2FS0031-3203%2898%2980010-1)
* [#2] [Myronenko and Song (2010) Point set registration: Coherent point drift](http://dx.doi.org/10.1109%2Ftpami.2010.46)
* [#3] [Tsin and Kanade (2004) A correlation-based approach to robust point set registration](http://dx.doi.org/10.1007%2F978-3-540-24672-5_44)
* [#4] [Zhang and Singh (2014) LOAM: Lidar odometry and mapping in real time](http://www.frc.ri.cmu.edu/~jizhang03/Publications/RSS_2014.pdf)
* [#5] [Biber (2003) The normal distributions transform](http://www.gris.uni-tuebingen.de/fileadmin/user_upload/Paper/Biber-2003-TheNormal-TR.pdf)
* [#6] [Zlot and Bosse (2012) Efficient large scale 3D mobile mapping and surface reconstruction of an underground mine](http://dx.doi.org/10.1007%2F978-3-642-40686-7_32)
* [#7] [Diosi and Kleeman (2005) Laser scan matching in polar coordinates with application to SLAM](http://www.diosirobotics.net/publications/diosi_kleeman_iros05.pdf)
* [#8] [Magnusson et al (2007) Scan registration for autonomous mining vehicles using 3D-NDT](http://eprints.lincoln.ac.uk/1615/1/mm_al_td_jfr2007.pdf)
* [#9] [Gelfand et al (2005) Robust global registration](http://vecg.cs.ucl.ac.uk/Projects/SmartGeometry/global_registration/paper_docs/global_registration_sgp_05.pdf)
* [#10] [Sharp et al (2002) ICP registration using invariant features](http://dx.doi.org/10.1109%2F34.982886)