KDTree¶
geoindex_rs.kdtree ¶
ArrayLike
module-attribute
¶
ArrayLike = Union[ndarray, ArrowArrayExportable, Buffer]
A type alias for accepted input to the KDTreeBuilder.add
method.
Accepted input includes numpy arrays, Arrow arrays, and C-contiguous buffer protocol input.
IndexLike
module-attribute
¶
IndexLike = Union[ndarray, ArrowArrayExportable, Buffer, KDTree]
A type alias for accepted input as a KDTree.
KDTree ¶
Bases: Buffer
A fast, memory-efficient, zero-copy-compatible, 2D KDTree.
Use KDTreeBuilder
, and then call
finish
to construct this.
This uses a binary-stable memory layout under the hood. So you can write the index to disk, then later load it and perform queries on it.
This class implements the Python buffer protocol, so you can pass it to the Python
bytes
constructor to copy the underlying binary memory into a Python bytes
object.
KDTreeBuilder ¶
A builder class to create a KDTree.
Example:
import numpy as np
from geoindex_rs import kdtree as kd
builder = kd.KDTreeBuilder(3)
x = np.arange(0, 3)
y = np.arange(2, 5)
builder.add(x, y)
tree = builder.finish()
results = kd.range(tree, 2, 4, 7, 9)
assert results.to_pylist() == [2]
__init__ ¶
__init__(
num_items: int,
node_size: int = 64,
coord_type: Literal["float32", "float64"] = "float64",
) -> None
Initialize a KDTree with the given parameters.
This will provision memory for the KDTree, to be filled in the add
method.
Parameters:
-
num_items
(int
) –The number of items in the tree
-
node_size
(int
, default:64
) –The node size of the tree. Defaults to 64.
-
coord_type
(Literal['float32', 'float64']
, default:'float64'
) –The coordinate type to use in the tree. Currently only float32 and float64 are permitted. Defaults to "float64".
add ¶
Insert points in this KDTree.
There are multiple ways to call this method:
- Two numpy or Arrow numeric arrays, passed as
x
,y
. - A 2D numpy array, with shape
(N, 2)
. It must be C-contiguous. - One argument with an Arrow array of type FixedSizeList or Struct. The
FixedSizeListArray must have a
list_size
of 2, and theStructArray
must have 2 children, orderedx
andy
.
In this case, all other parameters should be left as None
.
Note
Most of the time, you should call add
only once. This function is
vectorized, and you should avoid calling it in a loop with a few rows of
data at a time.
In some cases, it may be useful to call this in a loop, such as if the input geometries do not all fit in memory at once.
It's important to add arrays at a time. This should usually not be called in a loop.
Parameters:
-
x
(ArrayLike
) –array-like input
-
y
(ArrayLike | None
, default:None
) –array-like input. Defaults to None.
Returns:
-
Array
–An Arrow array with the insertion index of each element, which provides a lookup back into the original data.
This can be converted to a
pyarrow.Array
by passing topyarrow.array
.
KDTreeMetadata ¶
Common metadata to describe a KDTree.
This can be used to know the number of items, node information, or total byte size of a KDTree.
Additionally, this can be used to know how much memory a KDTree would use with the given number of items and node size. A KDTree with 1 million items and a node size of 64 (the default) would take up 20 MiB.
from geoindex_rs import kdtree as kd
metadata = kd.KDTreeMetadata(num_items=1_000_000, node_size=64)
assert metadata.num_bytes == 20_000_008
num_bytes
property
¶
num_bytes: int
The number of bytes that a KDTree with this metadata would have.
__init__ ¶
__init__(
num_items: int,
node_size: int = 64,
coord_type: Literal["float32", "float64"] = "float64",
) -> None
Create a new KDTreeMetadata given a number of items and node size.
Parameters:
from_index
classmethod
¶
from_index(index: IndexLike) -> KDTreeMetadata
Create from an existing KDTree buffer.
range ¶
range(
index: IndexLike,
min_x: int | float,
min_y: int | float,
max_x: int | float,
max_y: int | float,
) -> Array
Search a KDTree for elements intersecting the provided bounding box.
Results are the insertion indexes of items that match the query.
Example:
import numpy as np
from geoindex_rs import kdtree as kd
builder = kd.KDTreeBuilder(3)
x = np.arange(0, 3)
y = np.arange(2, 5)
builder.add(x, y)
tree = builder.finish()
results = kd.range(tree, 2, 4, 7, 9)
assert results.to_pylist() == [2]
Parameters:
-
index
(IndexLike
) –the KDTree to search.
-
min_x
(int | float
) –The
min_x
coordinate of the query bounding box. -
min_y
(int | float
) –The
min_y
coordinate of the query bounding box. -
max_x
(int | float
) –The
max_x
coordinate of the query bounding box. -
max_y
(int | float
) –The
max_y
coordinate of the query bounding box.
Returns:
-
Array
–A uint32-typed Arrow array with the insertion indexes of query results.
within ¶
Search a KDTree for elements within the given distance of the query point.
Results are the insertion indexes of items that match the query.
Parameters:
-
index
(IndexLike
) –the KDTree to search.
-
qx
(int | float
) –The
x
coordinate of the query point. -
qy
(int | float
) –The
y
coordinate of the query point. -
r
(int | float
) –The radius from the query point to use for searching.
Returns:
-
Array
–A uint32-typed Arrow array with the insertion indexes of query results.