Back to Kozachenko-Leonenko estimator
This page takes a deeper look at the implementation of the Kozachenko-Leonenko estimator and derives an upper bound to the maximum absolute error introduced by allowing relative error in nearest neighbor searching.
Let be a point-set and such that is the distance from to its k:th nearest neighbor. The Kozachenko-Leonenko estimator for Shannon differential entropy is computed by
where
In case all are zero, a NaN (Not-A-Number) is returned.
Because of the guaranteed efficiency of approximate nearest neighbor searching, it would be interesting to see if some relative error can be allowed in without introducing too much error in the result. Assume the distances have a amount of relative error in them, where . Then the estimator is affected as follows:
That is, the absolute error is bounded by
Since in practice needs to be greater than 1 to get the benefits of approximate nearest neighbor searching, the resulting error is unfortunately too large. Therefore we shall always compute exact nearest neighbor distances.