Nearest neighbors searching

Back to Subset searches

Let ''P = {p_1, ..., p_m} sub RR^n'' be a finite point-set, and ''d : RR^n xx RR^n -> RR'' a metric. Let ''q in RR^n'' and ''<_q sub RR^n xx RR^n'' be the (total) relation

''<_q = {(p_i, p_j) in RR^n xx RR^n | d(p_i, q) < d(p_j, q) or [d(p_i, q) = d(p_j, q) and i < j]}''.

If ''p_i <_q p_j'' then ''p_i'' is said to be nearer to q than ''p_j''. If ''P'' is ordered with ''<_q'', then the k:th nearest neighbor of ''q'' in ''P'' is the k:th smallest element of ''P'', denoted ''p_{w_k}''. The exact nearest neighbor problem asks for the ''k'' points nearest to ''q'', i.e., ''p_w = {p_{w_1}, ..., p_{w_k}} sub P''. The approximate nearest neighbor problem asks for ''k'' points ''P_v = {p_{v_1}, ..., p_{v_k}} sub P'' such that

''forall i in [1, k]: d(p_{v_i}, q) < (1 + epsilon) d(p_{w_i}, q)'',

i.e., the found points have no more than ''(1 + epsilon)'' amount of relative error when comparing the distances of the found neighbors to the distances of the exact neighbors.


Pastel implements nearest-neighbor searching and counting in both approximate and exact forms. There are several algorithms that might be appropriate in different situations. The brute-force algorithm is a naive try-all algorithm, which scales in performance as ''O(n m^2)''. A more elegant algorithm is made possible by a branch-and-bound algorithm using a kd-tree. The advantage of the brute-force algorithm is that it has minimal dependence to dimension. Due to curse of dimensionality, in dimensions high enough (e.g. dimension > 32), it is the case that the kd-tree has to inspect almost all of the points, giving the same asymptotic performance as the brute-force search. In these cases, it can make sense to use the brute-force algorithm instead for better practical performance. However, in all other cases you should use the kd-tree.

Learn more


An aggregate file nearest neighbors searching.


Generic neighbor searching for PointKdTree



K-nearest-neighbors searching for PointKdTree



Nearest neighbor counting for PointKdTree



Nearest neighbor searching for PointKdTree