pyann.nn2(data, query=None, k=None, treetype='kd', searchtype='standard', radius=0.0, eps=0.0)

Overview

Uses a kd-tree to find the k nearest neighbours of each query point among the points in data. The advantage of the kd-tree is that it runs in \(O(M \log {M})\) time, where M is the number of data points.

The Pyann package uses the Approximate Nearest Neighbor (ANN) C++ library, which can return either the exact nearest neighbours or approximate nearest neighbours to within a specified error bound.

For more information on the ANN library please visit http://www.cs.umd.edu/~mount/ANN/.
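
For intuition, the quantity that nn2 returns can be sketched with a brute-force search in plain Python. This is only a reference sketch under stated assumptions, not how Pyann or ANN is implemented: the helper name brute_force_knn is hypothetical, and the 1-based indices are an assumption inferred from the example output later in this page. It costs O(N·M) distance evaluations, which is the work a kd-tree is meant to avoid.

```python
import math

def brute_force_knn(data, query, k):
    """Return (nn_idx, nn_dists), one row of length k per query point.

    Hypothetical helper for illustration only. Indices are 1-based to
    mirror the example output in this page (an assumption, not a
    confirmed guarantee of the library).
    """
    nn_idx, nn_dists = [], []
    for q in query:
        # Euclidean distance from q to every data point, paired with its index.
        candidates = [(math.dist(q, p), i + 1) for i, p in enumerate(data)]
        candidates.sort()            # nearest first
        nearest = candidates[:k]     # keep the k closest
        nn_idx.append([i for _, i in nearest])
        nn_dists.append([d for d, _ in nearest])
    return nn_idx, nn_dists

# Same points as the Example section of this page.
data = [[1, 0], [2, 0]]
query = [[1.01, 0], [3, 0], [4.0, 0]]
idx, dists = brute_force_knn(data, query, k=1)
print(idx)
print(dists)
```

For these points the nearest data point is the first one for the first query and the second one for the other two queries, which agrees with the nn_idx values shown in the Example section.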

Search types

  • priority: visits cells in increasing order of distance from the query point, and hence, should converge more rapidly on the true nearest neighbour, but standard is usually faster for exact searches.
  • radius: only searches for neighbours within a specified radius of the point. If there are no neighbours then nn_idx will contain 0 and nn_dists will contain \(1.340781 \times 10^{154}\) for that point.
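
The radius behaviour can be illustrated with a small brute-force sketch. The sentinel values 0 and 1.340781e154 come from the description above; everything else is an assumption made for illustration: the function name radius_search is hypothetical, the radius test is taken to be inclusive, indices are taken to be 1-based, and rows with fewer than k hits are assumed to be padded with the same sentinels.

```python
import math

NO_NEIGHBOUR_IDX = 0               # index reported when no neighbour is found
NO_NEIGHBOUR_DIST = 1.340781e154   # distance reported in that case

def radius_search(data, q, k, radius):
    """Up to k nearest neighbours of q within `radius` (hypothetical helper).

    Assumptions: inclusive radius test, 1-based indices, and sentinel
    padding for any of the k slots that cannot be filled.
    """
    hits = []
    for i, p in enumerate(data):
        d = math.dist(q, p)
        if d <= radius:
            hits.append((d, i + 1))
    hits.sort()                    # nearest first
    hits = hits[:k]
    pad = k - len(hits)            # unfilled slots get sentinel values
    idx = [i for _, i in hits] + [NO_NEIGHBOUR_IDX] * pad
    dists = [d for d, _ in hits] + [NO_NEIGHBOUR_DIST] * pad
    return idx, dists

data = [[1, 0], [2, 0]]
near = radius_search(data, [1.5, 0], k=2, radius=0.6)   # both points in range
far = radius_search(data, [10, 0], k=2, radius=0.6)     # nothing in range
print(near)
print(far)
```

Because a valid index is never 0 under the 1-based assumption, a caller can filter out missing neighbours by testing for nn_idx == 0 rather than comparing against the very large sentinel distance.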

Parameters

  • data: array_like

    • \(\small{M\times D}\) np.matrix, where each of the M rows is a point
    • \(\small{M\times D}\) np.ndarray, where D == 1 or None.
  • query: array_like, optional

    points that will be queried against data.

    • \(\small{N\times D}\) np.matrix
    • \(\small{N\times D}\) np.ndarray, where D == 1 or None

    query.shape[1] must equal data.shape[1]. If None (default), data is used as the query set.

  • k: float, int, optional

    The maximum number of nearest neighbours to compute. If None (default), k is set to the smaller of data.shape[0] and 10.

  • treetype: str, optional

    Options:

    • 'kd': standard kd tree
    • 'bd': bd (box-decomposition, AMNSW98) tree which may perform better for larger point sets

    Default is 'kd'.

  • searchtype: str, optional

    Options:

    • 'standard'
    • 'priority'
    • 'radius'

    See Search types above for more detail. Default is 'standard'.

  • radius: float, int, optional

    Search radius used when searchtype='radius'. Default is 0.0.

  • eps: float, int, optional

    Error bound. The default of 0.0 implies an exact nearest neighbour search.
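
Two of the parameter rules above can be written out as a short sketch. Both helper names (resolve_k, within_error_bound) are hypothetical and are not part of Pyann. The first mirrors the stated default for k. The second illustrates the usual meaning of the ANN error bound, namely that a reported neighbour distance is at most (1 + eps) times the true nearest-neighbour distance; that reading of eps is an assumption here, since this page only calls it an error bound.

```python
def resolve_k(n_data_points, k=None):
    """Default rule for k as described above (hypothetical helper)."""
    if k is None:
        # Smaller of the number of data points and 10.
        return min(n_data_points, 10)
    return k

def within_error_bound(reported_dist, true_dist, eps=0.0):
    """Check the assumed (1 + eps) approximation guarantee.

    With eps == 0.0 only an exact answer passes the check.
    """
    return reported_dist <= (1.0 + eps) * true_dist

print(resolve_k(3))        # fewer than 10 data points
print(resolve_k(500))      # capped at 10
print(within_error_bound(1.05, 1.0, eps=0.1))
print(within_error_bound(1.05, 1.0, eps=0.0))
```

Under this reading, a larger eps lets the search stop earlier at the cost of possibly returning a slightly farther neighbour.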

Return

  • <class 'pyann.nn2.NN2Results'>

    Object of class NN2Results with two attributes:

    • nn_idx: An \(\small{N\times k}\) integer np.matrix of near neighbour indices.
    • nn_dists: An \(\small{N\times k}\) np.matrix of near neighbour Euclidean distances.

Example

Run pyann.nn2 and assign output object to results:

import numpy as np
import pyann

results = pyann.nn2(np.matrix([[1, 0],
                               [2, 0]]),
                    np.matrix([[1.01, 0],
                               [3,    0],
                               [4.0,  0]]),
                    k=1)

The results object

results is now an instance of the class pyann.nn2.NN2Results:

print(type(results))
## <class 'pyann.nn2.NN2Results'>

str representation of results:

print(results)
##     nn_dists
##            0
##   0     0.01
##   1     1.00
##   2     2.00
##     nn_idx
##          0
##   0      1
##   1      2
##   2      2

Attributes of results:

print(results.__dict__.keys())
## dict_keys(['nn_idx', 'nn_dists'])

Access the values of the attributes of results:

print(results.nn_idx)
## matrix([[1],
##         [2],
##         [2]])
print(results.nn_dists)
## matrix([[0.01],
##         [1.  ],
##         [2.  ]])

Convert results to <class 'numpy.ndarray'>:

print(results.to_array())
## array([[[1.  ],
##         [2.  ],
##         [2.  ]],
##        [[0.01],
##         [1.  ],
##         [2.  ]]])