Marginalization

2021-09-03

Suppose that

$$\begin{aligned}\mathbf z_1 = \mathbf x_1 + \mathbf n_1\end{aligned} \tag{1}$$

where $\mathbf n_1 \sim \mathcal N(\mathbf 0, \bm \Sigma_1)$, and

$$\begin{aligned}\mathbf z_2 = \mathbf A \begin{bmatrix}\mathbf x_1\\\mathbf x_2\end{bmatrix} + \mathbf b + \mathbf n_2\end{aligned} \tag{2}$$

where $\mathbf n_2 \sim \mathcal N(\mathbf 0, \bm \Sigma_2)$ is independent of $\mathbf n_1$.

Supposing that $\mathbf x_1$ and $\mathbf x_2$ are unknowns, what is the optimal estimate of $\mathbf x_2$ given everything else?

We can first observe that $\mathbf A$ can be partitioned into its left and right $6\times 3$ submatrices:

$$\begin{aligned}\mathbf A = \begin{bmatrix}\mathbf A_1 & \mathbf A_2\end{bmatrix}\end{aligned} \tag{3}$$

such that

$$\begin{aligned}\mathbf z_2 &= \mathbf A_1 \mathbf x_1 + \mathbf A_2 \mathbf x_2 + \mathbf b + \mathbf n_2\\ &=\mathbf A_1 \mathbf z_1 - \mathbf A_1 \mathbf n_1 + \mathbf A_2 \mathbf x_2 + \mathbf b + \mathbf n_2\end{aligned} \tag{4}$$

Let

$$\begin{aligned}\mathbf n_3 = \mathbf n_2 - \mathbf A_1 \mathbf n_1\end{aligned} \tag{5}$$

From the Matrix Cookbook we can see that, because $\mathbf n_1$ and $\mathbf n_2$ are independent zero-mean Gaussians, $\mathbf n_3$ is normally distributed as $\mathcal N(\mathbf 0, \bm \Sigma_3)$ with $\bm \Sigma_3 = \bm \Sigma_2 + \mathbf A_1 \bm \Sigma_1 \mathbf A_1^T$.
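The covariance of $\mathbf n_3$ can be checked numerically. The following is a minimal Monte Carlo sketch, assuming independent $\mathbf n_1$ and $\mathbf n_2$; the matrices `A1`, `Sigma1`, and `Sigma2` are arbitrary values made up for illustration, not taken from any real system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example values: a random 6x3 block A1 and random
# symmetric positive-definite covariances for n1 (3x3) and n2 (6x6).
A1 = rng.standard_normal((6, 3))
L1 = rng.standard_normal((3, 3))
Sigma1 = L1 @ L1.T + 3.0 * np.eye(3)
L2 = rng.standard_normal((6, 6))
Sigma2 = L2 @ L2.T + 6.0 * np.eye(6)

# Draw independent samples of n1 and n2, then form n3 = n2 - A1 n1.
num_samples = 200_000
n1 = rng.multivariate_normal(np.zeros(3), Sigma1, size=num_samples)
n2 = rng.multivariate_normal(np.zeros(6), Sigma2, size=num_samples)
n3 = n2 - n1 @ A1.T  # each row is one sample of n3

# Compare the sample covariance with Sigma2 + A1 Sigma1 A1^T.
Sigma3_predicted = Sigma2 + A1 @ Sigma1 @ A1.T
Sigma3_empirical = np.cov(n3, rowvar=False)

relative_error = (np.linalg.norm(Sigma3_empirical - Sigma3_predicted)
                  / np.linalg.norm(Sigma3_predicted))
print(f"relative error: {relative_error:.4f}")
```

The relative error should shrink roughly as the inverse square root of `num_samples`.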

Now we have:

$$\begin{aligned}\mathbf z_2 = \mathbf A_1 \mathbf z_1 + \mathbf A_2 \mathbf x_2 + \mathbf b + \mathbf n_3\end{aligned} \tag{6}$$

Let

$$\begin{aligned}\mathbf y = \mathbf A_2 \mathbf x_2 + \mathbf A_1 \mathbf z_1 + \mathbf b - \mathbf z_2\end{aligned} \tag{7}$$

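Since $\mathbf y = -\mathbf n_3$, the probability density of observing $\mathbf z_2$ is, up to a normalizing constant,

$$\begin{aligned}p(\mathbf z_2) \propto \exp\left(-\tfrac{1}{2}\mathbf y^T \bm \Sigma_3^{-1} \mathbf y\right)\end{aligned} \tag{8}$$

where $\bm \Sigma_3$ is the covariance of $\mathbf n_3$.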

To maximize the likelihood we can simply minimize the negative log-likelihood:

$$\begin{aligned}\operatorname{min}_{\mathbf x_2} \mathbf y^T \bm \Sigma_3^{-1} \mathbf y\end{aligned} \tag{9}$$

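Because $\mathbf y$ is affine in $\mathbf x_2$, this is a weighted linear least-squares problem, and minimizing the quadratic is straightforward.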
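As one way to carry out the minimization, setting the gradient with respect to $\mathbf x_2$ to zero gives $\mathbf x_2 = (\mathbf A_2^T \bm \Sigma_3^{-1} \mathbf A_2)^{-1} \mathbf A_2^T \bm \Sigma_3^{-1} (\mathbf z_2 - \mathbf A_1 \mathbf z_1 - \mathbf b)$, assuming $\mathbf A_2$ has full column rank. The sketch below implements this with NumPy. The function name `estimate_x2` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_x2(z1, z2, A, b, Sigma1, Sigma2):
    """Weighted least-squares estimate of x2 with x1 marginalized out.

    Assumes A = [A1 A2], where A1 has as many columns as z1 has entries,
    and that A2 has full column rank.
    """
    k = z1.shape[0]
    A1, A2 = A[:, :k], A[:, k:]
    # Covariance of n3 = n2 - A1 n1 for independent n1 and n2.
    Sigma3 = Sigma2 + A1 @ Sigma1 @ A1.T
    # Residual that A2 x2 has to explain.
    r = z2 - A1 @ z1 - b
    # Use linear solves instead of forming Sigma3^{-1} explicitly.
    W_A2 = np.linalg.solve(Sigma3, A2)  # Sigma3^{-1} A2
    W_r = np.linalg.solve(Sigma3, r)    # Sigma3^{-1} r
    return np.linalg.solve(A2.T @ W_A2, A2.T @ W_r)

# Usage on made-up, noise-free data: the estimate should match x2 exactly.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
b = rng.standard_normal(6)
x1_true = rng.standard_normal(3)
x2_true = rng.standard_normal(3)
Sigma1 = 0.01 * np.eye(3)
Sigma2 = 0.04 * np.eye(6)

z1 = x1_true
z2 = A @ np.concatenate([x1_true, x2_true]) + b

x2_hat = estimate_x2(z1, z2, A, b, Sigma1, Sigma2)
print(np.allclose(x2_hat, x2_true))
```

Solving against $\bm \Sigma_3$ rather than inverting it is a numerical-stability choice; for larger problems a Cholesky factorization of $\bm \Sigma_3$ would be a natural alternative.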
