flucoma · tremblap · Dec 9, 2022 · Dec 8, 2022 · Dec 9, 2022 · Dec 9, 2022
diff --git a/doc/DataSet.rst b/doc/DataSet.rst
@@ -89,6 +89,14 @@
 
    Merge sourceDataSet in the current DataSet. It will update the value of points with the same identifier if overwrite is set to 1. To add columns instead, see the 'transformJoin' method of FluidDataSetQuery.
 
+:message kNearest:
+
+   :arg buffer: A |buffer| containing a data point to match against. The number of frames in the buffer must match the dimensionality of the DataSet.
+
+   :arg k: The number of nearest neighbours to return. The identifiers will be sorted, beginning with the nearest.
+
+   Returns the identifiers of the ``k`` points nearest to the one passed. Note that this is a brute-force search that measures the distance from the query to every point in the DataSet, so it is comparatively inefficient for repeated queries against large datasets. For such cases, :fluid-obj:`KDTree` will be more efficient.
+
 :message print:
 
    Post an abbreviated content of the DataSet in the window by default, but you can supply a custom action instead. 

diff --git a/doc/KDTree.rst b/doc/KDTree.rst
@@ -7,6 +7,8 @@
 :discussion: 
    :fluid-obj:`KDTree` facilitates efficient nearest neighbour searches of multi-dimensional data stored in a :fluid-obj:`DataSet`. 
 
+   k-d trees are most useful for *repeated* querying of a dataset, because there is a cost associated with building them. If you just need to do a single lookup, then using the ``kNearest`` message of :fluid-obj:`DataSet` will probably be quicker.
+
    Whilst k-d trees can offer very good performance relative to naïve search algorithms, they suffer from something called “the curse of dimensionality” (like many algorithms for multi-dimensional data). In practice, this means that as the number of dimensions of your data goes up, the relative performance gains of a k-d tree go down.
 
 :control numNeighbours:

diff --git a/example-code/sc/DataSet.scd b/example-code/sc/DataSet.scd
@@ -250,4 +250,29 @@ fork{
 }
 )
 
+::
+strong::Nearest Neighbour Search in a DataSet::
+
+Note: A FluidDataSet can be queried with an input point to return the nearest matches to that point. This feature can be computationally expensive on a large dataset, as it needs to compute the distance from the queried point to each point in the dataset. If you need to perform multiple nearest neighbour queries on a FluidDataSet, it is recommended to use FluidKDTree. This facility is most useful with smaller, ephemeral datasets such as those returned by FluidDataSetQuery.
+
+code::
+
+// create a small DataSet...
+f = FluidDataSet(s);
+// and fill it with a grid of data
+f.load(Dictionary.newFrom(["cols", 2, "data", Dictionary.newFrom(9.collect{|i|["item-%".format(i), [i.div(3), i.mod(3)] / 2]}.flatten(1))]));
+
+// the data looks like this
+// (item-0 -> [ 0.0, 0.0 ]) (item-1 -> [ 0.0, 0.5 ]) (item-2 -> [ 0.0, 1.0 ])
+// (item-3 -> [ 0.5, 0.0 ]) (item-4 -> [ 0.5, 0.5 ]) (item-5 -> [ 0.5, 1.0 ])
+// (item-6 -> [ 1.0, 0.0 ]) (item-7 -> [ 1.0, 0.5 ]) (item-8 -> [ 1.0, 1.0 ])
+
+// create a query buffer...
+b = Buffer.alloc(s,2);
+
+// and fill it with a point
+b.sendCollection([1,0]);
+
+// and request all 9 points, sorted from nearest to farthest
+f.kNearest(b,9,{|x|x.postln;});
 ::
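
For readers who want to see what a brute-force nearest neighbour lookup amounts to, here is a minimal, language-neutral sketch in Python. It is not FluCoMa's implementation: the helper name ``k_nearest``, the use of Euclidean distance, and the tie-breaking order are all assumptions made for illustration. It reuses the 3 x 3 grid and the query point ``[1, 0]`` from the example above.

```python
import math


def k_nearest(dataset, point, k):
    """Hypothetical helper: return the identifiers of the k entries in
    `dataset` (a dict of identifier -> list of floats) closest to `point`.

    Assumes Euclidean distance. Every entry is measured against the query,
    which is why the cost grows linearly with the size of the dataset.
    """
    if any(len(values) != len(point) for values in dataset.values()):
        raise ValueError("query point must match the dataset dimensionality")
    # sorted() is stable, so equidistant entries keep their insertion order;
    # a real implementation may order ties differently.
    ranked = sorted(dataset, key=lambda ident: math.dist(dataset[ident], point))
    return ranked[:k]


# The same grid as in the example: item-0 .. item-8 on a 3 x 3 lattice.
grid = {f"item-{i}": [(i // 3) / 2, (i % 3) / 2] for i in range(9)}

nearest = k_nearest(grid, [1, 0], 3)
print(nearest)
```

In this sketch ``item-6`` comes first because it sits exactly on the query point, followed by ``item-3`` and ``item-7``, which are both 0.5 away. Because every query scans the whole dataset, building a tree once and reusing it tends to pay off when many queries are made against the same data.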