In this section I will rederive the coherent point-drift algorithm, described in A. Myronenko and X. Song, "Point Set Registration: Coherent Point Drift", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 12, pp. 2262–2275, Dec. 2010.
| Meaning | Notation |
|---|---|
| Probability space | $(\Omega, \mathcal{F}, \mathbb{P})$ |
| Actual data; random element | $X \colon \Omega \to \mathcal{X}$ |
| Measurable space | $(\mathcal{X}, \Sigma_{\mathcal{X}})$ |
| Measurable space | $(\mathcal{Y}, \Sigma_{\mathcal{Y}})$ |
| Observation function; measurable | $g \colon \mathcal{X} \to \mathcal{Y}$ |
| Observed data | $Y = g \circ X$ |
| Set of distribution parameters | $\Theta$ |
| Set of probability densities | $\{f_{\theta} \colon \mathcal{X} \to \mathbb{R}_{\geq 0} \mid \theta \in \Theta\}$ |
The maximum-likelihood estimator for the distribution parameter — given $X$ — is a random element $\hat{\theta} \colon \Omega \to \Theta$ such that

$$\hat{\theta} \in \underset{\theta \in \Theta}{\operatorname{arg\,max}}\, f_{\theta}(X).$$

Since $\log$ is an increasing function, and probability densities are non-negative, this is equivalent to

$$\hat{\theta} \in \underset{\theta \in \Theta}{\operatorname{arg\,max}}\, \log f_{\theta}(X).$$
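For concreteness, here is a minimal numerical sketch of this definition — not from the paper — for a normal distribution with unknown mean and variance. The grid search directly implements the arg-max of the log-likelihood, and the closed-form estimates (sample mean and biased sample variance) serve as a check.

```python
import numpy as np

def log_likelihood(x, mu, sigma2):
    # Sum of log N(x_i; mu, sigma2); taking logs is safe, the density is positive.
    return np.sum(-0.5 * np.log(2 * np.pi * sigma2) - (x - mu)**2 / (2 * sigma2))

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)

# Crude grid search over (mu, sigma2), implementing the arg-max directly.
mus = np.linspace(0, 4, 201)
sigma2s = np.linspace(0.5, 5, 201)
ll = np.array([[log_likelihood(x, m, s2) for s2 in sigma2s] for m in mus])
i, j = np.unravel_index(np.argmax(ll), ll.shape)

print(mus[i], sigma2s[j])  # grid maximum-likelihood estimate
print(x.mean(), x.var())   # closed-form maximum-likelihood estimate
```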
The expectation-maximization algorithm, or the EM algorithm, is an iterative algorithm for finding a maximum-likelihood estimator in a neighborhood of the initial parameter $\theta^{(0)} \in \Theta$ — given observed data $Y = g \circ X$. The EM-iteration is given by

$$\theta^{(k + 1)} \in \underset{\theta \in \Theta}{\operatorname{arg\,max}}\, \mathbb{E}_{\theta^{(k)}}\left[\log f_{\theta}(X) \mid Y\right]$$

for all $k \in \mathbb{N}$.
Ideally, we would like to compute $\hat{\theta}$ from the maximum-likelihood condition above. However, this is not possible, since we only observe $Y$, and not $X$.
Therefore, we use the conditional expectation to integrate out of $\log f_{\theta}(X)$ the information that is not contained in $Y$, and then try to maximize the remaining expected log-likelihood. However, this is often not possible in closed form.
Therefore, we change the strategy to an iterative approximation instead. We assume that the conditional expectation is taken with respect to constant parameters — those from the previous step. If the resulting expression can be maximized in closed form, then we can improve the approximation by repeating the local maximization. This provides us with a local maximum in the limit. The local maximum may or may not be a global maximum.
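To make the iteration concrete, here is a minimal EM sketch under illustrative assumptions: a one-dimensional mixture of two unit-variance normal distributions with equal weights, where only the two means are estimated and the component labels play the role of the unobserved part of the actual data.

```python
import numpy as np

rng = np.random.default_rng(1)
# Actual data: (label, value) pairs; only the values are observed.
labels = rng.integers(0, 2, size=500)
y = rng.normal(loc=np.where(labels == 0, -2.0, 3.0), scale=1.0)

mu = np.array([-1.0, 1.0])  # initial parameter theta^(0)
for _ in range(50):
    # E-step: posterior probability of each label given y, computed with
    # the previous parameters (unit variances, equal weights).
    log_p = -0.5 * (y[:, None] - mu[None, :])**2
    p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    # M-step: maximize the expected complete-data log-likelihood;
    # for normal components this is a posterior-weighted mean.
    mu = (p * y[:, None]).sum(axis=0) / p.sum(axis=0)

print(mu)  # close to (-2, 3) for this initialization
```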
The coherent point-drift algorithm is a point-set registration algorithm. It treats the points in one point-set as the centroids of normal distributions, and then finds the parameters for the normal distributions that best fit the other point-set. This is done by maximizing likelihood using the EM algorithm. This is useful, because the information on which normal distribution generated each point is lost — something the EM algorithm can deal with.
The normal distributions share the same isotropic variance. The coherent point-drift algorithm includes the variance as an estimated parameter. This requires adding yet another component to the mixture model. The choice here is an improper uniform "distribution". It is here that the probabilistic interpretability of the algorithm suffers somewhat. However, from the viewpoint of likelihood maximization there are no problems.
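As a sketch of the resulting model — with names and conventions chosen to match the derivation below — the mixture density at a point combines $M$ normal components with shared isotropic variance and a uniform component with weight $w$:

```python
import numpy as np

def mixture_density(x, centroids, sigma2, w, n_scene):
    """Density of the mixture at a point x in R^D.

    Each of the M centroids carries weight (1 - w) / M; the improper
    uniform component carries weight w and constant density 1 / n_scene.
    """
    m, d = centroids.shape
    sq_dist = np.sum((centroids - x)**2, axis=1)
    normal = np.exp(-sq_dist / (2 * sigma2)) / (2 * np.pi * sigma2)**(d / 2)
    return (1 - w) / m * normal.sum() + w / n_scene

q = np.array([[0.0, 0.0], [1.0, 0.0]])
print(mixture_density(np.array([0.5, 0.0]), q, sigma2=0.1, w=0.1, n_scene=100))
```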
| Meaning | Notation |
|---|---|
| Model point-set | $Q = \{q_1, \dots, q_M\} \subset \mathbb{R}^D$ |
| Scene point-set | $S = \{s_1, \dots, s_N\} \subset \mathbb{R}^D$ |
| Set of transformation parameters | $A$ |
| Set of transformations | $\{T_a \colon \mathbb{R}^D \to \mathbb{R}^D \mid a \in A\}$ |
| Initial transformation parameter | $a^{(0)} \in A$ |
| Initial variance parameter | $(\sigma^2)^{(0)} \in \mathbb{R}_{> 0}$ |
| Noise tolerance | $w \in [0, 1)$ |
| Name | Notation |
|---|---|
| Transformation parameter | $a \in A$ |
| Variance parameter | $\sigma^2 \in \mathbb{R}_{> 0}$ |
Here the transformed model point-set $T_a(Q) = \{T_a(q_1), \dots, T_a(q_M)\}$ matches the scene point-set $S$ well — in some sense.
Let $(X_1, K_1), \dots, (X_N, K_N)$ be i.i.d. random elements — the actual data — such that

$$\mathbb{P}(K_n = m) =
\begin{cases}
\frac{1 - w}{M}, & m \in [1, M], \\
w, & m = M + 1,
\end{cases}$$

for all $n \in [1, N]$ and $m \in [1, M + 1]$. Here $K_n$ is the index of the mixture component which generates the point $X_n$, and the components $m \in [1, M]$ are normal:

$$f_{a, \sigma^2}(x \mid K_n = m) = \frac{1}{(2 \pi \sigma^2)^{D / 2}} \exp\left(-\frac{\lVert x - T_a(q_m) \rVert^2}{2 \sigma^2}\right).$$

The missing component is an improper uniform distribution:

$$f_{a, \sigma^2}(x \mid K_n = M + 1) = \frac{1}{N}.$$

We assume the scene point-set to be a realization of the actual data — there exists $\omega \in \Omega$ such that $s_n = X_n(\omega)$ for all $n \in [1, N]$ — and that the observation function is such that the component indices are lost: $g((x_1, k_1), \dots, (x_N, k_N)) = (x_1, \dots, x_N)$.
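A sketch of this generative model, with illustrative names; sampling the component index first and then the point shows exactly which information the observation function discards. Since the improper uniform distribution cannot be sampled, a bounding box stands in for it here:

```python
import numpy as np

def sample_scene(q_transformed, sigma2, w, n_points, rng):
    """Sample n_points from the mixture; q_transformed is (M, D).

    Returns both points and component indices, but only the points
    are 'observed' in the registration problem. The improper uniform
    component is approximated by a box around the centroids.
    """
    m, d = q_transformed.shape
    points, indices = [], []
    lo, hi = q_transformed.min(0) - 1, q_transformed.max(0) + 1
    for _ in range(n_points):
        if rng.random() < w:
            indices.append(m)                 # noise component
            points.append(rng.uniform(lo, hi))
        else:
            k = rng.integers(0, m)            # centroid index, lost by g
            indices.append(k)
            points.append(rng.normal(q_transformed[k], np.sqrt(sigma2)))
    return np.array(points), np.array(indices)
```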
The probability density of $(X_n, K_n)$ is given by

$$f_{a, \sigma^2}(x, m) = \frac{1 - w}{M} \cdot \frac{1}{(2 \pi \sigma^2)^{D / 2}} \exp\left(-\frac{\lVert x - T_a(q_m) \rVert^2}{2 \sigma^2}\right)$$

for $m \in [1, M]$, and

$$f_{a, \sigma^2}(x, M + 1) = \frac{w}{N}.$$

The logarithm of the former is

$$\log f_{a, \sigma^2}(x, m) = \log \frac{1 - w}{M} - \frac{D}{2} \log(2 \pi \sigma^2) - \frac{\lVert x - T_a(q_m) \rVert^2}{2 \sigma^2}.$$
As a marginal distribution, we have that

$$f_{a, \sigma^2}(x) = \sum_{m = 1}^{M + 1} f_{a, \sigma^2}(x, m) = \frac{1 - w}{M} \sum_{m = 1}^{M} \frac{1}{(2 \pi \sigma^2)^{D / 2}} \exp\left(-\frac{\lVert x - T_a(q_m) \rVert^2}{2 \sigma^2}\right) + \frac{w}{N}$$

for all $x \in \mathbb{R}^D$. Substituting location-index and location, the conditional probability of the component index is

$$\mathbb{P}(K_n = m \mid X_n = s_n) = \frac{f_{a, \sigma^2}(s_n, m)}{f_{a, \sigma^2}(s_n)}.$$
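These conditional probabilities form the E-step. A vectorized sketch, with names matching the earlier snippets; the entry at $(n, m)$ of the returned matrix is $\mathbb{P}(K_n = m \mid X_n = s_n)$ for the normal components, after cancelling the factors shared by numerator and denominator:

```python
import numpy as np

def posterior(scene, q_transformed, sigma2, w):
    """E-step: P(K_n = m | X_n = s_n) for the M normal components.

    scene is (N, D), q_transformed is (M, D); returns an (N, M) matrix.
    The column for the uniform component is 1 - p.sum(axis=1).
    """
    n, d = scene.shape
    m = q_transformed.shape[0]
    sq_dist = np.sum((scene[:, None, :] - q_transformed[None, :, :])**2, axis=2)
    normal = np.exp(-sq_dist / (2 * sigma2))  # unnormalized, (N, M)
    # Constant contributed by the uniform component, after cancellation.
    c = (2 * np.pi * sigma2)**(d / 2) * (w / (1 - w)) * (m / n)
    return normal / (normal.sum(axis=1, keepdims=True) + c)
```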
The conditional expectation of $\log f_{a, \sigma^2}(X_n, K_n)$ given $X_n = s_n$ is given by

$$\mathbb{E}\left[\log f_{a, \sigma^2}(X_n, K_n) \mid X_n = s_n\right] = \sum_{m = 1}^{M + 1} \mathbb{P}(K_n = m \mid X_n = s_n) \log f_{a, \sigma^2}(s_n, m).$$

The conditional expectation of $\sum_{n = 1}^{N} \log f_{a, \sigma^2}(X_n, K_n)$ given $(X_1, \dots, X_N) = (s_1, \dots, s_N)$ is given by

$$\sum_{n = 1}^{N} \sum_{m = 1}^{M + 1} \mathbb{P}(K_n = m \mid X_n = s_n) \log f_{a, \sigma^2}(s_n, m),$$

by the independence of the pairs. Let us denote

$$P_{n m} = \mathbb{P}(K_n = m \mid X_n = s_n), \qquad N_P = \sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m}.$$
Since the first and the last term of the joint conditional expectation do not depend on $a$ or $\sigma^2$, maximizing it is equivalent to maximizing

$$-\sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \left(\frac{D}{2} \log(2 \pi \sigma^2) + \frac{\lVert s_n - T_a(q_m) \rVert^2}{2 \sigma^2}\right).$$

However, this does not seem to be possible in closed form, since the $P_{n m}$ also depend on $a$ and $\sigma^2$. Instead — following the EM algorithm — we assume constant parameters $(a^{(k)}, (\sigma^2)^{(k)})$ in the $P_{n m}$, and minimize

$$E(a, \sigma^2) = \frac{1}{2 \sigma^2} \sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \lVert s_n - T_a(q_m) \rVert^2 + \frac{N_P D}{2} \log \sigma^2,$$

where the additive constant $\frac{N_P D}{2} \log(2 \pi)$ has been dropped.
A necessary condition for minimizing $E$ is that the partial derivative of $E$ with respect to $\sigma^2$ is zero:

$$\frac{\partial E}{\partial (\sigma^2)} = -\frac{1}{2 \sigma^4} \sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \lVert s_n - T_a(q_m) \rVert^2 + \frac{N_P D}{2 \sigma^2} = 0.$$

This can be used to solve for $\sigma^2$:

$$\sigma^2 = \frac{1}{N_P D} \sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \lVert s_n - T_a(q_m) \rVert^2.$$
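As a sketch, this update is a posterior-weighted mean squared distance; names match the earlier snippets:

```python
import numpy as np

def update_sigma2(scene, q_transformed, p):
    """M-step variance update: posterior-weighted mean squared distance.

    scene is (N, D), q_transformed is (M, D), p is the (N, M) posterior.
    """
    d = scene.shape[1]
    sq_dist = np.sum((scene[:, None, :] - q_transformed[None, :, :])**2, axis=2)
    return np.sum(p * sq_dist) / (np.sum(p) * d)
```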
Substituting back, we get that we must minimize

$$\frac{N_P D}{2} \log\left(\frac{1}{N_P D} \sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \lVert s_n - T_a(q_m) \rVert^2\right) + \frac{N_P D}{2}.$$

Since $\log$ is increasing, minimizing this is equivalent to minimizing

$$\sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \lVert s_n - T_a(q_m) \rVert^2.$$

Therefore,

$$a^{(k + 1)} \in \underset{a \in A}{\operatorname{arg\,min}} \sum_{n = 1}^{N} \sum_{m = 1}^{M} P_{n m} \lVert s_n - T_a(q_m) \rVert^2.$$
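For the full affine group — one of the subgroups mentioned below — this weighted least-squares problem has a closed-form solution: eliminating the translation by centering at the posterior-weighted centroids reduces it to a linear system. A sketch, with names matching the earlier snippets:

```python
import numpy as np

def update_affine(scene, q, p):
    """M-step for the affine group: minimize sum_nm P_nm ||s_n - (B q_m + t)||^2.

    scene is (N, D), q is the untransformed model point-set (M, D), p is (N, M).
    Returns the matrix B and translation t of the closed-form minimizer;
    assumes the weighted covariance of the model point-set is nonsingular.
    """
    n_p = p.sum()
    mu_s = p.sum(axis=1) @ scene / n_p  # weighted scene centroid
    mu_q = p.sum(axis=0) @ q / n_p      # weighted model centroid
    s_hat, q_hat = scene - mu_s, q - mu_q
    # Weighted cross- and auto-covariances of the centered point-sets.
    cross = s_hat.T @ p @ q_hat
    auto = q_hat.T @ np.diag(p.sum(axis=0)) @ q_hat
    b = cross @ np.linalg.inv(auto)
    t = mu_s - b @ mu_q
    return b, t
```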
The solution of this problem for many subgroups of affine transformations can be found here.
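Assuming the sketch functions from the previous snippets are in scope, the whole algorithm is then a loop that alternates the E-step with the two M-step updates. The noise tolerance and iteration count below are illustrative choices, and the initial variance follows the common choice of the mean squared distance between the point-sets:

```python
import numpy as np

def register_affine(scene, q, w=0.1, iterations=50):
    """Coherent point drift for the affine group, built from the sketches above."""
    n, d = scene.shape
    b, t = np.eye(d), np.zeros(d)  # initial transformation parameter
    q_t = q @ b.T + t
    # Initial variance: mean squared distance between the point-sets.
    sigma2 = np.sum((scene[:, None, :] - q_t[None, :, :])**2) / (d * n * q.shape[0])
    for _ in range(iterations):
        p = posterior(scene, q_t, sigma2, w)   # E-step
        b, t = update_affine(scene, q, p)      # M-step: transformation
        q_t = q @ b.T + t
        sigma2 = update_sigma2(scene, q_t, p)  # M-step: variance
    return b, t
```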