Marginalization

2021-09-03

Suppose that

\begin{aligned}\mathbf z_1 = \mathbf x_1 + \mathbf n_1\end{aligned}

where

$\mathbf z_1, \mathbf x_1, \mathbf n_1$ are 3D vectors in $\mathbb R^3$, and
$\mathbf n_1$ is normally distributed as $\mathcal N(\mathbf 0, \bm \Sigma_1)$, with $\bm \Sigma_1$ being $3\times 3$

and

\begin{aligned}\mathbf z_2 = \mathbf A \begin{bmatrix}\mathbf x_1\\\mathbf x_2\end{bmatrix} + \mathbf b + \mathbf n_2\end{aligned}

where

$\mathbf A$ is a $6\times 6$ matrix,
$\mathbf x_2$ is a 3D vector in $\mathbb R^3$,
$\mathbf z_2, \mathbf b, \mathbf n_2$ are 6D vectors in $\mathbb R^6$, and
$\mathbf n_2$ is normally distributed as $\mathcal N(\mathbf 0, \bm \Sigma_2)$, with $\bm \Sigma_2$ being $6\times 6$

Supposing that $\mathbf x_1$ and $\mathbf x_2$ are unknowns, what is the optimal estimate of $\mathbf x_2$ given everything else?

We can first observe that $\mathbf A$ can be partitioned into its left and right $6\times 3$ submatrices:

\begin{aligned}\mathbf A = \begin{bmatrix}\mathbf A_1 & \mathbf A_2\end{bmatrix}\end{aligned}

such that

\begin{aligned}\mathbf z_2 &= \mathbf A_1 \mathbf x_1 + \mathbf A_2 \mathbf x_2 + \mathbf b + \mathbf n_2\\ &=\mathbf A_1 \mathbf z_1 - \mathbf A_1 \mathbf n_1 + \mathbf A_2 \mathbf x_2 + \mathbf b + \mathbf n_2\end{aligned}

Let

\begin{aligned}\mathbf n_3 = \mathbf n_2 - \mathbf A_1 \mathbf n_1\end{aligned}

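Assuming $\mathbf n_1$ and $\mathbf n_2$ are independent, $\mathbf n_3$ is a linear combination of independent Gaussians. From the Matrix Cookbook [cookbook] we can see that $\mathbf n_3$ is normally distributed as $\mathcal N(\mathbf 0, \bm \Sigma_2 + \mathbf A_1 \bm \Sigma_1 \mathbf A_1^T)$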
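
The covariance can also be checked directly. With zero means and no correlation between $\mathbf n_1$ and $\mathbf n_2$ (an assumption the setup implies but does not state), the cross terms drop out:

\begin{aligned}\operatorname{Cov}(\mathbf n_3) &= \mathbb E\left[(\mathbf n_2 - \mathbf A_1 \mathbf n_1)(\mathbf n_2 - \mathbf A_1 \mathbf n_1)^T\right]\\ &= \mathbb E\left[\mathbf n_2 \mathbf n_2^T\right] - \mathbb E\left[\mathbf n_2 \mathbf n_1^T\right]\mathbf A_1^T - \mathbf A_1 \mathbb E\left[\mathbf n_1 \mathbf n_2^T\right] + \mathbf A_1 \mathbb E\left[\mathbf n_1 \mathbf n_1^T\right]\mathbf A_1^T\\ &= \bm \Sigma_2 + \mathbf A_1 \bm \Sigma_1 \mathbf A_1^T\end{aligned}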

Now we have:

\begin{aligned}\mathbf z_2 = \mathbf A_1 \mathbf z_1 + \mathbf A_2 \mathbf x_2 + \mathbf b + \mathbf n_3\end{aligned}

Let

\begin{aligned}\mathbf y = \mathbf A_2 \mathbf x_2 + \mathbf A_1 \mathbf z_1 + \mathbf b - \mathbf z_2\end{aligned}

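Since $\mathbf n_3 = -\mathbf y$ is zero-mean Gaussian with covariance $\bm \Sigma_3 = \bm \Sigma_2 + \mathbf A_1 \bm \Sigma_1 \mathbf A_1^T$, the likelihood of observing $\mathbf z_2$ given $\mathbf z_1$ and a candidate $\mathbf x_2$ is, up to a normalizing constant,

\begin{aligned}p(\mathbf z_2 \mid \mathbf z_1, \mathbf x_2) \propto \exp\left(-\tfrac{1}{2}\mathbf y^T \bm \Sigma_3^{-1} \mathbf y\right)\end{aligned}

To maximize the likelihood we can simply minimize the negative log-likelihood, which after dropping constants is:

\begin{aligned}\operatorname{min}_{\mathbf x_2} \mathbf y^T \bm \Sigma_3^{-1} \mathbf y\end{aligned}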

Minimizing this quadratic is straightforward.
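
As a rough illustration of that last step: setting the gradient with respect to $\mathbf x_2$ to zero gives the normal equations $\mathbf A_2^T \bm \Sigma_3^{-1} \mathbf A_2 \mathbf x_2 = \mathbf A_2^T \bm \Sigma_3^{-1} (\mathbf z_2 - \mathbf A_1 \mathbf z_1 - \mathbf b)$, where $\bm \Sigma_3 = \bm \Sigma_2 + \mathbf A_1 \bm \Sigma_1 \mathbf A_1^T$ is the covariance of $\mathbf n_3$. The NumPy sketch below solves them, assuming $\mathbf A_2$ has full column rank so that the $3\times 3$ system is invertible. The function name `estimate_x2` and its argument order are illustrative choices, not part of the original derivation.

```python
import numpy as np


def estimate_x2(A, b, z1, z2, Sigma1, Sigma2):
    """Weighted least-squares estimate of x2 with x1 marginalized out.

    Assumes A is 6x6, z1 is length 3, z2 and b are length 6, and the
    right 6x3 block of A has full column rank (hypothetical helper).
    """
    A1, A2 = A[:, :3], A[:, 3:]            # left and right 6x3 blocks of A
    Sigma3 = Sigma2 + A1 @ Sigma1 @ A1.T   # covariance of n3 = n2 - A1 n1
    r = z2 - A1 @ z1 - b                   # part of z2 that A2 x2 must explain
    W_A2 = np.linalg.solve(Sigma3, A2)     # Sigma3^{-1} A2 without an explicit inverse
    H = A2.T @ W_A2                        # A2^T Sigma3^{-1} A2, a 3x3 matrix
    g = W_A2.T @ r                         # A2^T Sigma3^{-1} r (Sigma3 is symmetric)
    return np.linalg.solve(H, g)           # solve the normal equations for x2


if __name__ == "__main__":
    # Noise-free sanity check: the estimate should reproduce the true x2.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(6, 6))
    b = rng.normal(size=6)
    x1_true = rng.normal(size=3)
    x2_true = rng.normal(size=3)
    z1 = x1_true
    z2 = A @ np.concatenate([x1_true, x2_true]) + b
    x2_hat = estimate_x2(A, b, z1, z2, np.eye(3), np.eye(6))
    print(np.allclose(x2_hat, x2_true))
```

Solving against $\bm \Sigma_3$ with `np.linalg.solve` rather than forming $\bm \Sigma_3^{-1}$ explicitly is a numerical-stability choice; the result is the same minimizer.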

1 References

[cookbook] Matrix Cookbook, Kaare Brandt Petersen, Michael Syskind Pedersen. link, link2.