Least-Squares Transformations between Point-sets

Back to Publications


Least-Squares Transformations between Point-Sets, Kalle Rutanen, Germán Gómez-Herrero, Sirkka-Liisa Eriksson, Karen Egiazarian, 18th Scandinavian Conference on Image Analysis, pp.501-511, June 17-20, 2013.

Official paper


This paper derives formulas for least-squares transformations between point-sets in . We consider affine transformations, with optional constraints for linearity, scaling, and orientation. We base the derivations hierarchically on reductions, and use trace manipulation to achieve short derivations. For the unconstrained problems, we provide a new formula which maximizes the orthogonality of the transform matrix.

Thought process

When working on printed electronics with Tapio Manninen in 2007, we were introduced to the problem of point-pattern matching, which is about matching point-sets for which no correspondence is known. To implement point-pattern matching, one first needs to solve the corresponding problem for the paired point-sets. When we researched the literature, it seemed hard to find general results, where the consideration of several restrictions to the solutions created a combinatorial explosion of similar problems.

In 2008 I posted to the Usenet newsgroup sci.math with the title Least squares similarity transformation, requesting help with deriving the least-squares transformation between point-sets in the case of an conformal affine transformation. It was the use of the first variation by Rob Johnson that was one of the essential ideas in this paper. Embarrassingly, I forgot to acknowledge Rob in the paper! At first I used Rob’s first variation technique to find the solutions. However, then I found out a more direct way for solving the rigid problem, which is given in the paper. The variation technique is still used for the scaling problem.

It occurred to me that I could reduce the affine problem to a linear problem, thus halving the number of problems. This led me to search for more of such reductions. The next important step was to decompose the matrix part of the transformation into a scaling matrix, and an orthogonal matrix. By these matrices I was able to formulate an orthogonal set of properties, which generated a combinatorial amount of different problems.

Next I became interested in the freedom that the Moore-Penrose pseudo-inverse left for the solution of the linear problem. I used that freedom to select the transformation as orthogonal as possible. At first I worked with the singular value decomposition. After doing a lot of derivations, my intuition started to suggest that I am missing some important symmetry in the problem. The answer to this intuition was given by the generalized singular value decomposition, which provided an elegant formula.

The paper on the coherent point drift algorithm provided a generalized form of the problem, where the transformation included an additional weighting matrix. I was interested in whether I could reduce that problem to the problems I already had. Indeed, that was the case. As a result, I was able to include weighting to the error function.

Missing transformations

Due to lack of space I had to leave out the derivation for when the scaling matrix is diagonal, and the orthogonal matrix is the identity. Using the first variation the derivation is not that hard; it is left as an exercise:) But the following problem is open: scaling matrix diagonal, and a free orthogonal matrix. I found this a bit annoying, since it was the only one I could not solve, but unfortunately I haven’t had the time to go back to it. Leave me a post if you manage to solve this.


I have implemented a Matlab-function that can solve all of the problems by using the reductions given in the paper. It is available in the Pastel library. However, it uses the pseudo-inverse for the solution of the linear problem, because Matlab’s implementation of the generalized singular value decomposition (GSVD) does not correspond to the one I rely on.

I have implemented the generalized singular value decomposition by following the constructive proof of how it is derived. While this worked to demonstrate that the derivation is correct, I did not want to plug it into Pastel, since I have no idea about its numerical behaviour. If you want, you can get that implementation from here. Only the files test_my_gsvd.m and the my_gsvd.m are written by me. The other files are Matlab code written by Ivo Houtzager to compute the QR/RQ/QL/LQ matrix factorizations (which has unfortunately been prebuilt only for Windows). It seems to crash on the 64-bit Matlab 2013b (the prebuild is old), so you should run the test on a 32-bit Matlab.


Rob Johnson pointed me to the first variation, and applied it to solve the affine problem and the conformal affine problem. It was from these posts where I first learned of the variation technique.

The research was done under funding from the Finnish National Doctoral Programme in Mathematics and its Applications.


In the paper I use the term directional derivative, although I really mean first variation. First variation is easier to compute, since one can then apply differentiation rules, rather than taking a limit.