---
title: "Summary of lidar scan matching algorithms"
date: "2014-11-25"
---


# Introduction

These days I'm working on [lidar](https://en.wikipedia.org/wiki/lidar) odometry and mapping. That is, for a mobile robot equipped with a lidar sensor, we wish to find out what the robot's pose is, and generate a map of the world. This is known as simultaneous localization and mapping (SLAM). 

Algorithms that do this can be largely divided into broad categories:

- **Odometry** algorithms run in an online setting when new data is constantly being gathered, and updates the robot's pose estimate as soon as possible. This then allows the point set data to be corrected of motion error (de-warping).
- **Mapping**, or **scan matching** algorithms combine two or more point clouds into a single consistent one. This is more time consuming and is not run as often, or may be omitted altogether in favour of loop closure algorithms.
- **Loop closure** corrects for drift by performing a global optimization over all past data, and is especially useful if a robot returns to a previously-visited area. This is very computationally expensive, and gets much worse as more data is collected.

Just odometry by itself is prone to drifting away from the true trajectory over time, so combining the de-warped point clouds in the mapping part can mitigate this. Even then there may be some drift, but some such open-loop algorithms (e.g. [4](#4) ) are quite successful.

![Picture showing point sets obtained from odometry being registered to a bigger point set during the mapping step. Image credit [4](#4).](http://i.imgur.com/RVuLgSQ.png)

This blog post will focus on the scan matching process, since I was given some de-warped data that need to be aligned with minimal drift.

# Point to point registration

A lidar sensor measures distance to the nearest obstacle in many directions. So its output is a point cloud with one point for each of those measurements. 

![Registration of two scans (blue, and grey), with the red lines showing correspondences between features. Image credit [Martin Holzkothen, Michael Korn](http://docs.pointclouds.org/trunk/group__registration.html).](https://upload.wikimedia.org/wikipedia/commons/0/00/Registration_outdoor.png)

[Point set registration](https://en.wikipedia.org/wiki/Point_set_registration)
is the problem of aligning two point sets, often denoted _M_ and _S_ (for map
and scene). Here we will talk about registering _S_ to _M_ but in many papers
they talk about registering _M_ (model) to _S_ instead.

Point set registration algorithms often use the following iterative algorithm:

1. Estimate correspondences between points.
2. Solve for the optimal transform assuming those correspondences.

Since step 2 might change the correspondences, these two steps are repeated until
convergence. There are multiple ways to estimate correspondences.

- In **iterative closest point** (ICP) [0](#0), every point in _S_ is assumed
  to correspond to the closest point in _M_.
- In **robust point matching** [1](#1), every point in _S_ is assumed to have some
  correspondence value between 0 and 1 to every point in _M_. As the algorithm
  progresses, it applies deterministic annealing to converge to a
  one-to-one correspondence between points in _M_ and points in _S_.
- In **coherent point drift** [2](#2), every point in _S_ is assumed to have some
  correspondence value to every point in _M_ given by a Bayesian approach assuming
  each point has a Gaussian centered at it and then computing the posterior
  probability of finding the new Gaussian given the old distribution. The sigma,
  same for all points, decreases as the algorithm approaches convergence.

ICP and its variants are so ubiquitous that almost all point set registration
and scan matching programs use it. However, it sometimes get stuck in local minima
when the initial positions are not well-aligned.
By relaxing the correspondence assumptions, the latter two algorithms are more
robust against poor initial alignment, noisy outliers and removed points.
In all cases, the cost function is some (possibly weighted) sum of distances
of each point in _S_ to its corresponding point in _M_.

There are various ways to estimate the transformation that minimizes the cost.
It depends on how you parametrize the transformation (e.g. rigid transformation,
vs non-rigid using thin plate splines, etc). Then, you can use least squares, the
Levenberg-Marquadt algorithm, or something else.

There are also other point set registration algorithms that don't adhere to the
idea of iteratively estimating correspondence and then solving for a transformation.
For example, the kernel correlation method [3](#3) assigns a kernel (e.g. a Gaussian)
centered at each of the points and then uses gradient descent to maximize the
correlation between those kernels.

Note that in addition to spatial distance, point set registration algorithms can
also make use of normals information and colour information where available.

# Higher level features

In laser scan matching, especially using data produced by lidar sensors that have
a small number of lasers, data sets are often sparse and may appear to have
spurious patterns (i.e. "rows" of points along each scan line) caused by the way
the scanner collects data. Thus, assuming that each point in _S_ corresponds to
points in _M_ may not always work. One way to get around this is to use a model
specifically tailored to the pattern of points produced by a lidar, e.g. polar
matching [7](#7). This does not work in the general case, for example when points
have been de-warped using an external source. A higher-level representation of
the point cloud can be used for robust scan matching.

- **Segmentation** methods separate points into local structures, most commonly
  planes and edges. The cost function is the distance of plane-like points
  in _S_ to planar patches in _M_; and of edge-like points
  in _S_ to edge lines _M_.
  - LOAM method: The LOAM algorithm [4](#4) takes advantage of the assumption that
    points are produced as a laser beam is rotated at a constant speed. This
    yields the simple heuristic for finding curvature: a point is likely to be
    in a plane if it is close to the mean position of its neighbours obtained
    before and after it, and likely part of an edge otherwise.
- **Grid** or **voxel** methods divide the space into a regular grid, and obtains
  a representation of points within each grid cell. A regular grid naturally
  removes any nonuniformity in the manner in which points are collected.
  - The **normal distribution transform** [5](#5) uses a Gaussian to model the
    distribution of points within each grid cell. This is used in [8](#8) for a
    mining robot and compared favourably against ICP.
    The cost function is the negative of the goodness function obtained by
    evaluating the NDT of _M_ at the positions of each point in _S_.
    Since Gaussians are differentiable, the authors of this method proposed
    optimizing the cost function using Newton's method.
  - **Surface elements** or **surfels** are local surface patch estimates by
    fitting a plane or ellipsoid to points within each grid cell. These are used
    in [6](#6) for a mine exploration robot equipped with a lidar.
    The cost function is a weighted Euclidean sum of distance between the centers of
    surfels, and the difference in orientations thereof.
- **Invariant feature** methods [9](#9)[10](#10) use geometric descriptors
  to associate each point with a scalar value. A point is a feature if it has
  a rare descriptor value. Using a histogram of descriptor values to identify
  features in less populated bins, the algorithm extracts pairs of features in
  _S_ and _M_ with known correspondences. The scans can be matched by minimizing
  the cost that is the distance between corresponding pairs. If needed, this can
  subsequently be refined with ICP.
  - **Curvature** can be computed using moments and spherical harmonics from
    surrounding points, but is sensitive to noise and sampling rate. [10](#10)
  - The **integral volume descriptor** [9](#9) is the fraction of the volume
    in a ball centered at each point that is considered to be inside the point
    cloud. The "insideness" is determined using methods such as ray shooting, and
    the volumes are calculated using a voxel occupancy grid.

![Some two-dimensional NDTs](http://i.imgur.com/CjX9PZU.png)

In most cases, the cost function may be minimized using any nonlinear regression
method such as the Levenberg-Marquadt. As with point set registration, for
additional robustness against outliers, we can pass the cost through a robust
function such as the Cauchy distribution --- this is known as an M-estimator.

# Other constraints

In the specific context of a mobile robot, there are other constraints useful
for laser scan matching, which arise from physical assumptions about the robot.
These introduce additional terms in the cost function to penalize unfeasible
solutions caused by numerical stability issues due to noise. The following
constraints have been used in [6](#6) for a mine exploration robot.

- The **smoothness constraints** take into consideration a model of the physical
  limitations on the robot's linear and rotational accelerations and penalize large
  transformations in a short time.
- The **gravity constraint** assumes that the top of the robot
  is mostly pointing straight upwards or else it would have flipped over. This is
  especially useful when combined with an IMU to determine the up direction
  (after filtering out accelerations caused by motion).

# Ideas for improvement

- Poisson disk sampling instead of regular grid
- Fit multiple planes using RANSAC to each cell instead of one surfel
- Fit multiple Gaussians to cell using EM instead of one

# References

- <cite class="refname" id="0">0</cite> [Besl and McKay (1992) A method for registration of 3-D shapes](http://dx.doi.org/10.1109%2F34.121791)
- <cite class="refname" id="1">1</cite> [Gold et al (1998) New algorithms for 2d and 3d point matching](http://dx.doi.org/10.1016%2FS0031-3203%2898%2980010-1)
- <cite class="refname" id="2">2</cite> [Myronenko and Song (2010) Point set registration, coherent point drift.](http://dx.doi.org/10.1109%2Ftpami.2010.46)
- <cite class="refname" id="3">3</cite> [Tsin and Kanade (2004) A correlation-based approach to robust point set registration](http://dx.doi.org/10.1007%2F978-3-540-24672-5_44)
- <cite class="refname" id="4">4</cite> [Zhang and Singh (2014) LOAM: LIDAR odometry and mapping in real time](http://www.frc.ri.cmu.edu/~jizhang03/Publications/RSS_2014.pdf)
- <cite class="refname" id="5">5</cite> [Biber (2003) The normal distributions transform](http://www.gris.uni-tuebingen.de/fileadmin/user_upload/Paper/Biber-2003-TheNormal-TR.pdf)
- <cite class="refname" id="6">6</cite> [Zlot and Bosse (2012) Efficient large scale 3D mobile mapping and surface reconstruction of an underground mine](http://dx.doi.org/10.1007%2F978-3-642-40686-7_32)
- <cite class="refname" id="7">7</cite> [Diosi and Kleeman (2005) Laser scan matching in polar coordinates with application to SLAM](http://www.diosirobotics.net/publications/diosi_kleeman_iros05.pdf)
- <cite class="refname" id="8">8</cite> [Magnusson et al (2007) Scan registration for autonomous minimg vehicles using 3D-NDT](http://eprints.lincoln.ac.uk/1615/1/mm_al_td_jfr2007.pdf)
- <cite class="refname" id="9">9</cite> [Gelfand et al (2005) Robust global registration](http://vecg.cs.ucl.ac.uk/Projects/SmartGeometry/global_registration/paper_docs/global_registration_sgp_05.pdf)
- <cite class="refname" id="10">10</cite> [Sharp et al (2002) ICP registration using invariant features](http://dx.doi.org/10.1109%2F34.982886)
