Tiered Index

Python module: cuvs.neighbors.tiered_index

Index

cdef class Index

Tiered Index object.

Members

| Name | Kind |
| --- | --- |
| trained | property |

trained

def trained(self)

IndexParams

cdef class IndexParams

Parameters for building a Tiered Index for nearest neighbor search.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| metric | str, default = "sqeuclidean" | String denoting the metric type. Valid values: "sqeuclidean", "inner_product", "euclidean", "cosine". sqeuclidean is the Euclidean distance without the square root operation, i.e. distance(a, b) = \sum_i (a_i - b_i)^2; euclidean is the Euclidean distance; inner_product is defined as distance(a, b) = \sum_i a_i * b_i; cosine is defined as distance(a, b) = 1 - \sum_i a_i * b_i / (\lVert a \rVert_2 * \lVert b \rVert_2). |
| algo | str, default = "cagra" | The algorithm to use for the ANN portion of the tiered index. |
| upstream_params | object, optional | The IndexParams for the upstream ANN object to use (i.e. the CAGRA IndexParams for "cagra", etc.). |
| min_ann_rows | int | The minimum number of rows necessary to create an ANN index. |
| create_ann_index_on_extend | bool | Whether to create a new ANN index on extend if the number of rows in the incremental (brute-force kNN) portion is above min_ann_rows. |
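
The four metric definitions given for the metric parameter can be written out directly. The following is a minimal pure-Python sketch for checking values by hand on small vectors; the function names are illustrative and are not part of the cuVS API.

```python
import math

def sqeuclidean(a, b):
    # Squared Euclidean distance: sum_i (a_i - b_i)^2
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    # Euclidean distance: square root of the squared distance
    return math.sqrt(sqeuclidean(a, b))

def inner_product(a, b):
    # Inner product: sum_i a_i * b_i
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine distance: 1 - <a, b> / (||a||_2 * ||b||_2)
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 2.0, 2.0]
b = [2.0, 0.0, 4.0]
print(sqeuclidean(a, b))    # 1 + 4 + 4 = 9.0
print(euclidean(a, b))      # 3.0
print(inner_product(a, b))  # 2 + 0 + 8 = 10.0
print(cosine(a, b))
```

Note that the inner product grows with similarity while the other three shrink, so "nearest" under inner_product means the largest value.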

Constructor

def __init__(self, *, metric="sqeuclidean", algo="cagra", upstream_params=None, min_ann_rows=None, create_ann_index_on_extend=None)

Members

| Name | Kind |
| --- | --- |
| metric | property |
| algo | property |
| min_ann_rows | property |
| create_ann_index_on_extend | property |
| upstream_params | property |

metric

def metric(self)

algo

def algo(self)

min_ann_rows

def min_ann_rows(self)

create_ann_index_on_extend

def create_ann_index_on_extend(self)

upstream_params

def upstream_params(self)

build

@auto_sync_resources

def build(IndexParams index_params, dataset, resources=None)

Build the Tiered index from the dataset for efficient search.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| index_params | cuvs.neighbors.tiered_index.IndexParams | |
| dataset | CUDA array interface compliant matrix shape (n_samples, dim) | Supported dtype [float32] |
| resources | cuvs.common.Resources, optional | |

Returns

| Name | Type | Description |
| --- | --- | --- |
| index | cuvs.neighbors.tiered_index.Index | |

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import cagra, tiered_index
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build a tiered index that uses CAGRA for the ANN portion
>>> build_params = tiered_index.IndexParams(metric="sqeuclidean",
...                                         algo="cagra")
>>> index = tiered_index.build(build_params, dataset)
>>> # Search the index, here using the dataset rows themselves as queries
>>> distances, neighbors = tiered_index.search(cagra.SearchParams(),
...                                            index, dataset, k)
>>> distances = cp.asarray(distances)
>>> neighbors = cp.asarray(neighbors)

extend

@auto_sync_resources

def extend(Index index, new_vectors, resources=None)

Extend an existing index with new vectors.

The input array can be either a CUDA array interface compliant matrix or an array interface compliant matrix in host memory.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| index | tiered_index.Index | Trained tiered_index object. |
| new_vectors | array interface compliant matrix shape (n_samples, dim) | Supported dtype [float32] |
| resources | cuvs.common.Resources, optional | |

Returns

| Name | Type | Description |
| --- | --- | --- |
| index | cuvs.neighbors.tiered_index.Index | |

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import tiered_index
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> index = tiered_index.build(tiered_index.IndexParams(), dataset)
>>> n_rows = 100
>>> more_data = cp.random.random_sample((n_rows, n_features),
...                                     dtype=cp.float32)
>>> index = tiered_index.extend(index, more_data)
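
The way min_ann_rows and create_ann_index_on_extend interact on extend can be pictured with a toy row-count model. This is only one reading of the IndexParams descriptions, not the cuVS implementation: the class ToyTieredIndex and its fields are hypothetical, no vectors are stored, and "above min_ann_rows" is assumed to mean strictly greater.

```python
from dataclasses import dataclass

@dataclass
class ToyTieredIndex:
    """Hypothetical row-count model of a tiered index (not the cuVS class)."""
    min_ann_rows: int
    create_ann_index_on_extend: bool
    ann_rows: int = 0  # rows covered by the ANN tier
    bf_rows: int = 0   # rows held in the incremental brute-force tier

    def extend(self, n_new_rows):
        # New rows always land in the incremental brute-force tier first.
        self.bf_rows += n_new_rows
        # Assumption: once that tier exceeds min_ann_rows and the flag is set,
        # its rows are folded into a newly created ANN index.
        if self.create_ann_index_on_extend and self.bf_rows > self.min_ann_rows:
            self.ann_rows += self.bf_rows
            self.bf_rows = 0
        return self

eager = ToyTieredIndex(min_ann_rows=1000, create_ann_index_on_extend=True)
eager.extend(600)   # 600 rows: still below the threshold
eager.extend(600)   # 1200 rows: threshold exceeded, ANN tier created
print(eager.ann_rows, eager.bf_rows)  # 1200 0

lazy = ToyTieredIndex(min_ann_rows=1000, create_ann_index_on_extend=False)
lazy.extend(600).extend(600)
print(lazy.ann_rows, lazy.bf_rows)    # 0 1200
```

In this model, disabling the flag keeps extend cheap at the cost of a growing brute-force tier; whether that trade-off matches the real library's behavior should be checked against the cuVS source.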

search

@auto_sync_resources
@auto_convert_output

def search(search_params, Index index, queries, k, neighbors=None, distances=None, resources=None, filter=None)

Find the k nearest neighbors for each query.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| search_params | SearchParams for the upstream ANN index | |
| index | cuvs.neighbors.tiered_index.Index | Trained Tiered index. |
| queries | CUDA array interface compliant matrix shape (n_queries, dim) | Supported dtype [float32] |
| k | int | The number of neighbors. |
| neighbors | Optional CUDA array interface compliant matrix shape (n_queries, k), dtype int64_t | If supplied, neighbor indices will be written here in-place. (default None) |
| distances | Optional CUDA array interface compliant matrix shape (n_queries, k) | If supplied, the distances to the neighbors will be written here in-place. (default None) |
| filter | Optional cuvs.neighbors.cuvsFilter | Can be used to filter neighbors based on a given bitset. (default None) |
| resources | cuvs.common.Resources, optional | |

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import cagra, tiered_index
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build the index
>>> index = tiered_index.build(tiered_index.IndexParams(algo="cagra"),
...                            dataset)
>>>
>>> # Search using the built index
>>> queries = cp.random.random_sample((n_queries, n_features),
...                                   dtype=cp.float32)
>>> k = 10
>>> search_params = cagra.SearchParams()
>>>
>>> distances, neighbors = tiered_index.search(search_params, index,
...                                            queries, k)
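
To make the shape and meaning of the two search outputs concrete, the following is a brute-force reference in pure Python using the sqeuclidean metric. It does not call cuVS and is not how the library performs the search; the function name brute_force_knn is hypothetical. It returns one row of k distances and one row of k dataset indices per query, which is the (n_queries, k) layout described for distances and neighbors.

```python
def brute_force_knn(dataset, queries, k):
    """Exact k nearest neighbors by squared Euclidean distance."""
    distances, neighbors = [], []
    for q in queries:
        # Score every dataset row against the query, then keep the k smallest.
        scored = sorted(
            (sum((x - y) ** 2 for x, y in zip(row, q)), i)
            for i, row in enumerate(dataset)
        )[:k]
        distances.append([d for d, _ in scored])
        neighbors.append([i for _, i in scored])
    return distances, neighbors

dataset = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
queries = [[0.9, 0.1], [2.5, 2.5]]
distances, neighbors = brute_force_knn(dataset, queries, k=2)
print(neighbors)  # [[1, 0], [3, 2]]
```

On small inputs, a reference like this can serve as a sanity check for approximate results, for example by counting how many returned indices per query also appear in the exact answer.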