Multi-GPU IVF Flat

Python module: cuvs.neighbors.mg.ivf_flat

Index

cdef class Index

Multi-GPU IVF-Flat index object. Stores the trained multi-GPU IVF-Flat index state which can be used to perform nearest neighbors searches across multiple GPUs.

Members

| Name | Kind |
| --- | --- |
| trained | property |

trained

def trained(self)

IndexParams

cdef class IndexParams(SingleGpuIndexParams)

Parameters to build multi-GPU IVF-Flat index for efficient search. Extends single-GPU IndexParams with multi-GPU specific parameters.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| distribution_mode | str, default = "sharded" | Distribution mode for multi-GPU setup. Valid values: ["replicated", "sharded"] |
| **kwargs | | Additional parameters passed to single-GPU IndexParams |

Constructor

def __init__(self, *, distribution_mode="sharded", **kwargs)
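As a rough illustration of the two distribution modes, the pure-Python sketch below shows which dataset rows each GPU could end up holding. It assumes that "replicated" keeps a full copy of the index on every GPU and that "sharded" gives each GPU a disjoint slice. The helper `assign_rows` and its contiguous, near-equal split are hypothetical and not part of cuVS, whose actual partitioning may differ.

```python
VALID_DISTRIBUTION_MODES = ("replicated", "sharded")


def assign_rows(n_rows, n_gpus, distribution_mode="sharded"):
    """Return, per GPU, the range of dataset rows it would hold.

    Hypothetical helper for illustration only; not a cuVS function.
    """
    if distribution_mode not in VALID_DISTRIBUTION_MODES:
        raise ValueError(
            f"distribution_mode must be one of {VALID_DISTRIBUTION_MODES}"
        )
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")

    if distribution_mode == "replicated":
        # Every GPU holds all rows.
        return [range(n_rows) for _ in range(n_gpus)]

    # "sharded": assumed contiguous slices whose sizes differ by at most one.
    base, extra = divmod(n_rows, n_gpus)
    shards, start = [], 0
    for gpu in range(n_gpus):
        size = base + (1 if gpu < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards


sharded = assign_rows(10, 3, "sharded")
replicated = assign_rows(10, 3, "replicated")
print([len(r) for r in sharded])     # rows held per GPU when sharded
print([len(r) for r in replicated])  # rows held per GPU when replicated
```

Under these assumptions, sharding reduces the per-GPU memory footprint, while replication leaves every GPU able to answer any query on its own.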

Members

| Name | Kind |
| --- | --- |
| get_handle | method |
| distribution_mode | property |

get_handle

def get_handle(self)

distribution_mode

def distribution_mode(self)

SearchParams

cdef class SearchParams(SingleGpuSearchParams)

Parameters to search multi-GPU IVF-Flat index.

Constructor

def __init__(self, *, n_probes=1, search_mode="load_balancer", merge_mode="merge_on_root_rank", n_rows_per_batch=1000, **kwargs)

Members

| Name | Kind |
| --- | --- |
| get_handle | method |
| search_mode | property |
| search_mode | method |
| merge_mode | property |
| merge_mode | method |
| n_rows_per_batch | property |
| n_rows_per_batch | method |

get_handle

def get_handle(self)

search_mode

def search_mode(self)

Get the search mode for multi-GPU search.

search_mode

def search_mode(self, value)

Set the search mode for multi-GPU search.

merge_mode

def merge_mode(self)

Get the merge mode for multi-GPU search.

merge_mode

def merge_mode(self, value)

Set the merge mode for multi-GPU search.
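The default merge_mode, "merge_on_root_rank", suggests that when the index is split across GPUs, each GPU's partial results are gathered and merged in one place. A minimal sketch of such a merge for a single query follows. The function `merge_top_k` is hypothetical, is not the cuVS implementation, and assumes a metric where smaller distances are better (such as "sqeuclidean").

```python
import heapq


def merge_top_k(per_gpu_results, k):
    """Merge per-GPU candidate lists for one query into a global top-k.

    per_gpu_results: list (one entry per GPU) of lists of
    (distance, neighbor_id) pairs. Hypothetical helper, illustration only.
    """
    all_candidates = (
        pair for gpu_result in per_gpu_results for pair in gpu_result
    )
    # nsmallest sorts by distance first, then by id to break ties.
    return heapq.nsmallest(k, all_candidates)


# Two GPUs each return their local top-2 for the same query.
gpu0 = [(0.1, 7), (0.5, 3)]
gpu1 = [(0.2, 12), (0.3, 15)]
merged = merge_top_k([gpu0, gpu1], k=3)
print(merged)
```

For a similarity-style metric where larger values are better, `heapq.nlargest` would take the place of `heapq.nsmallest`.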

n_rows_per_batch

def n_rows_per_batch(self)

Get the number of rows per batch for multi-GPU search.

n_rows_per_batch

def n_rows_per_batch(self, value)

Set the number of rows per batch for multi-GPU search.
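To make n_rows_per_batch concrete, the sketch below shows one plausible reading of it: the query matrix is processed in consecutive row batches of at most that size. The generator `iter_batches` is a hypothetical helper, not a cuVS function, and how the library schedules those batches across GPUs is not specified here.

```python
def iter_batches(n_queries, n_rows_per_batch=1000):
    """Yield (start, stop) row ranges covering n_queries rows.

    Hypothetical helper for illustration only.
    """
    if n_rows_per_batch <= 0:
        raise ValueError("n_rows_per_batch must be positive")
    for start in range(0, n_queries, n_rows_per_batch):
        # The last batch may be shorter than n_rows_per_batch.
        yield start, min(start + n_rows_per_batch, n_queries)


batches = list(iter_batches(2500, n_rows_per_batch=1000))
print(batches)
```

Under this reading, 2500 queries with the default batch size would be handled as two full batches and one partial batch.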

build

@auto_sync_multi_gpu_resources

def build(IndexParams index_params, dataset, resources=None)

Build the multi-GPU IVF-Flat index from the dataset for efficient search.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| index_params | cuvs.neighbors.ivf_flat.IndexParams | |
| dataset | Array interface compliant matrix shape (n_samples, dim) | Supported dtype [float32, float16, int8, uint8]. IMPORTANT: For multi-GPU IVF-Flat, the dataset MUST be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array). |
| resources | cuvs.common.Resources, optional | |

Returns

| Name | Type | Description |
| --- | --- | --- |
| index | cuvs.neighbors.ivf_flat.Index | |

Examples

>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> distances, neighbors = ivf_flat.search(
...     ivf_flat.SearchParams(),
...     index, dataset, k)
>>> # Results are already in host memory (NumPy arrays)
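Because the dataset must live in host memory, a small guard that moves device arrays to the host before calling build can be convenient. The sketch below is one way to write it. `to_host` is a hypothetical helper, not part of cuVS. It detects device arrays through the `__cuda_array_interface__` attribute and assumes they offer a CuPy-style `.get()` method; other device array types may need a different transfer call. `_FakeDeviceArray` is a stand-in so the snippet runs without a GPU.

```python
def to_host(array):
    """Return a host-memory version of `array`.

    Hypothetical helper: device arrays are recognised by the
    __cuda_array_interface__ attribute and copied with .get(),
    which is the CuPy spelling. Host arrays pass through unchanged.
    """
    if hasattr(array, "__cuda_array_interface__"):
        return array.get()
    return array


class _FakeDeviceArray:
    """Stand-in for a CuPy array so the example runs without a GPU."""

    __cuda_array_interface__ = {}

    def __init__(self, host_data):
        self._host_data = host_data

    def get(self):
        return self._host_data


host_rows = [[0.0, 1.0], [2.0, 3.0]]
from_device = to_host(_FakeDeviceArray(host_rows))
from_host = to_host(host_rows)
print(from_device is host_rows, from_host is host_rows)
```

With real data, the result of `to_host(dataset)` would be what is passed to `ivf_flat.build`.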

extend

@auto_sync_multi_gpu_resources

def extend(Index index, new_vectors, new_indices=None, resources=None)

Extend the multi-GPU IVF-Flat index with new vectors.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| index | cuvs.neighbors.ivf_flat.Index | |
| new_vectors | Array interface compliant matrix shape (n_new_vectors, dim) | Supported dtype [float32, float16, int8, uint8]. IMPORTANT: For multi-GPU IVF-Flat, new_vectors MUST be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array). |
| new_indices | Array interface compliant matrix shape (n_new_vectors,), optional | If provided, these indices will be used for the new vectors. If not provided, indices will be automatically assigned. IMPORTANT: Must be in host memory (CPU) for multi-GPU IVF-Flat. |
| resources | cuvs.common.Resources, optional | |

Examples

>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> n_new_vectors = 1000
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> new_vectors = np.random.random_sample(
...     (n_new_vectors, n_features)).astype(np.float32)
>>> # One index per new vector, continuing after the existing rows
>>> new_indices = np.arange(
...     n_samples, n_samples + n_new_vectors, dtype=np.int64)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> ivf_flat.extend(index, new_vectors, new_indices)
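When new_indices is supplied explicitly, it needs exactly one entry per new vector. A common convention, assumed here rather than mandated by the API, is to continue numbering from the current number of indexed rows. The helper `make_new_indices` below is hypothetical and only illustrates the arithmetic.

```python
def make_new_indices(current_size, n_new_vectors):
    """Return ids current_size, ..., current_size + n_new_vectors - 1.

    Hypothetical helper for illustration only; not a cuVS function.
    """
    if current_size < 0 or n_new_vectors < 0:
        raise ValueError("sizes must be non-negative")
    # The stop value is current_size + n_new_vectors, not n_new_vectors.
    return list(range(current_size, current_size + n_new_vectors))


new_ids = make_new_indices(50000, 1000)
print(len(new_ids), new_ids[0], new_ids[-1])
```

The NumPy equivalent is `np.arange(current_size, current_size + n_new_vectors, dtype=np.int64)`. Passing `n_new_vectors` alone as the stop value would produce an empty array whenever the index already holds more rows than are being added.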

search

@auto_sync_multi_gpu_resources
@auto_convert_output

def search(SearchParams search_params, Index index, queries, k, neighbors=None, distances=None, resources=None)

Search the multi-GPU IVF-Flat index for the k-nearest neighbors of each query.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| search_params | cuvs.neighbors.ivf_flat.SearchParams | |
| index | cuvs.neighbors.ivf_flat.Index | |
| queries | Array interface compliant matrix shape (n_queries, dim) | Supported dtype [float32, float16, int8, uint8]. IMPORTANT: For multi-GPU IVF-Flat, queries MUST be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array). |
| k | int | The number of neighbors to search for each query. |
| neighbors | Array interface compliant matrix shape (n_queries, k), optional | If provided, this array will be filled with the indices of the k-nearest neighbors. If not provided, a new host array will be allocated. IMPORTANT: Must be in host memory (CPU) for multi-GPU IVF-Flat. |
| distances | Array interface compliant matrix shape (n_queries, k), optional | If provided, this array will be filled with the distances to the k-nearest neighbors. If not provided, a new host array will be allocated. IMPORTANT: Must be in host memory (CPU) for multi-GPU IVF-Flat. |
| resources | cuvs.common.Resources, optional | |

Returns

| Name | Type | Description |
| --- | --- | --- |
| distances | numpy.ndarray | The distances to the k-nearest neighbors for each query (in host memory). |
| neighbors | numpy.ndarray | The indices of the k-nearest neighbors for each query (in host memory). |

Examples

>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> queries = np.random.random_sample((n_queries, n_features)).astype(
...     np.float32)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> distances, neighbors = ivf_flat.search(
...     ivf_flat.SearchParams(),
...     index, queries, k)
>>> # Results are already in host memory (NumPy arrays)

save

@auto_sync_multi_gpu_resources

def save(Index index, filename, resources=None)

Serialize the multi-GPU IVF-Flat index to a file.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| index | cuvs.neighbors.ivf_flat.Index | |
| filename | str | The filename to serialize the index to. |
| resources | cuvs.common.Resources, optional | |

Examples

>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> ivf_flat.save(index, "index.bin")

load

@auto_sync_multi_gpu_resources

def load(filename, resources=None)

Deserialize the multi-GPU IVF-Flat index from a file.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| filename | str | The filename to deserialize the index from. |
| resources | cuvs.common.Resources, optional | |

Returns

| Name | Type | Description |
| --- | --- | --- |
| index | Index | The deserialized index. |

Examples

>>> from cuvs.neighbors.mg import ivf_flat
>>> index = ivf_flat.load("index.bin")  # doctest: +SKIP

distribute

@auto_sync_multi_gpu_resources

def distribute(filename, resources=None)

Distribute a single-GPU IVF-Flat index across multiple GPUs from a file.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| filename | str | The filename to distribute the index from. |
| resources | cuvs.common.Resources, optional | |

Returns

| Name | Type | Description |
| --- | --- | --- |
| index | Index | The distributed index. |

Examples

>>> from cuvs.neighbors.mg import ivf_flat
>>> index = ivf_flat.distribute("single_gpu_index.bin")  # doctest: +SKIP