IVF PQ

View as Markdown

Python module: cuvs.neighbors.ivf_pq

Index

1cdef class Index

IvfPq index object. This object stores the trained IvfPq index state which can be used to perform nearest neighbors searches.

Members

NameKind
trainedproperty
n_listsproperty
dimproperty
pq_dimproperty
pq_lenproperty
pq_bitsproperty
centersproperty
centers_paddedproperty
pq_centersproperty
centers_rotproperty
rotation_matrixproperty
list_sizesproperty
listsmethod
list_datamethod
list_indicesmethod

trained

1def trained(self)

n_lists

1def n_lists(self)

The number of inverted lists (clusters)

dim

1def dim(self)

dimensionality of the cluster centers

pq_dim

1def pq_dim(self)

The dimensionality of an encoded vector after compression by PQ

pq_len

1def pq_len(self)

The dimensionality of a subspace, i.e. the number of vector components mapped to a subspace

pq_bits

1def pq_bits(self)

The bit length of an encoded vector element after compression by PQ.

centers

1def centers(self)

Get the cluster centers corresponding to the lists in the original space

centers_padded

1def centers_padded(self)

Get the padded cluster centers [n_lists, dim_ext] where dim_ext = round_up(dim + 1, 8). This returns contiguous data suitable for build_precomputed.

pq_centers

1def pq_centers(self)

Get the PQ cluster centers

centers_rot

1def centers_rot(self)

Get the rotated cluster centers [n_lists, rot_dim] where rot_dim = pq_len * pq_dim

rotation_matrix

1def rotation_matrix(self)

Get the rotation matrix [rot_dim, dim] Transform matrix (original space -> rotated padded space)

list_sizes

1def list_sizes(self)

Get the sizes of each list

lists

1def lists(self, resources=None)

Iterates through the pq-encoded list data

This function returns an iterator over each list, with each value being the pq-encoded data for the entire list

Parameters

NameTypeDescription
resourcescuvs.common.Resources, optional

list_data

1def list_data(self, label, n_rows=0, offset=0, out_codes=None, resources=None)

Gets unpacked list data for a single list (cluster)

Parameters

NameTypeDescription
label, intThe cluster to get data for
n_rows, intThe number of rows to return for the cluster (0 is all rows)
offset, intThe row to start getting data at out_codes, CAI Optional buffer to hold memory. Will be created if None
resourcescuvs.common.Resources, optional

list_indices

1def list_indices(self, label, n_rows=0)

Gets indices for a single cluster (list)

Parameters

NameTypeDescription
label, intThe cluster to get data for n_rows, int, optional Number of rows in the list

IndexParams

1cdef class IndexParams

Parameters to build index for IvfPq nearest neighbor search

Parameters

NameTypeDescription
n_listsint, default = 1024The number of clusters used in the coarse quantizer.
metricstr, default="sqeuclidean"String denoting the metric type. Valid values for metric: [“sqeuclidean”, “inner_product”, “euclidean”, “cosine”], where:

- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \sum_i (a_i - b_i)^2,
- euclidean is the euclidean distance
- inner product distance is defined as distance(a, b) = \sum_i a_i * b_i.
- cosine distance is defined as distance(a, b) = 1 - \sum_i a_i * b_i / ( ||a||_2 * ||b||_2).
kmeans_n_itersint, default = 20The number of iterations searching for kmeans centers during index building.
kmeans_trainset_fractionint, default = 0.5If kmeans_trainset_fraction is less than 1, then the dataset is subsampled, and only n_samples * kmeans_trainset_fraction rows are used for training.
pq_bitsint, default = 8The bit length of the vector element after quantization.
pq_dimint, default = 0The dimensionality of a the vector after product quantization. When zero, an optimal value is selected using a heuristic. Note pq_dim * pq_bits must be a multiple of 8. Hint: a smaller ‘pq_dim’ results in a smaller index size and better search performance, but lower recall. If ‘pq_bits’ is 8, ‘pq_dim’ can be set to any number, but multiple of 8 are desirable for good performance. If ‘pq_bits’ is not 8, ‘pq_dim’ should be a multiple of 8. For good performance, it is desirable that ‘pq_dim’ is a multiple of 32. Ideally, ‘pq_dim’ should be also a divisor of the dataset dim.
codebook_kindstring, default = "subspace"Valid values [“subspace”, “cluster”]
force_random_rotationbool, default = FalseApply a random rotation matrix on the input data and queries even if dim % pq_dim == 0. Note: if dim is not multiple of pq_dim, a random rotation is always applied to the input data and queries to transform the working space from dim to rot_dim, which may be slightly larger than the original space and and is a multiple of pq_dim (rot_dim % pq_dim == 0). However, this transform is not necessary when dim is multiple of pq_dim (dim == rot_dim, hence no need in adding “extra” data columns / features). By default, if dim == rot_dim, the rotation transform is initialized with the identity matrix. When force_random_rotation == True, a random orthogonal transform matrix is generated regardless of the values of dim and pq_dim.
add_data_on_buildbool, default = TrueAfter training the coarse and fine quantizers, we will populate the index with the dataset if add_data_on_build == True, otherwise the index is left empty, and the extend method can be used to add new vectors to the index.
conservative_memory_allocationbool, default = TrueBy default, the algorithm allocates more space than necessary for individual clusters (list_data). This allows to amortize the cost of memory allocation and reduce the number of data copies during repeated calls to extend (extending the database). To disable this behavior and use as little GPU memory for the database as possible, set this flat to True.
max_train_points_per_pq_codeint, default = 256The max number of data points to use per PQ code during PQ codebook training. Using more data points per PQ code may increase the quality of PQ codebook but may also increase the build time. The parameter is applied to both PQ codebook generation methods, i.e., PER_SUBSPACE and PER_CLUSTER. In both cases, we will use pq_book_size * max_train_points_per_pq_code training points to train each codebook.
codes_layoutstring, default = "interleaved"Memory layout of the IVF-PQ list data. Valid values [“flat”, “interleaved”]

- flat: Codes are stored contiguously, one vector’s codes after another.
- interleaved: Codes are interleaved for optimized search performance. This is the default and recommended for search workloads.

Constructor

1def __init__(self, *, n_lists=1024, metric="sqeuclidean", metric_arg=2.0, kmeans_n_iters=20, kmeans_trainset_fraction=0.5, pq_bits=8, pq_dim=0, codebook_kind="subspace", force_random_rotation=False, add_data_on_build=True, conservative_memory_allocation=False, max_train_points_per_pq_code=256, codes_layout="interleaved")

Members

NameKind
get_handlemethod
metricproperty
metric_argproperty
add_data_on_buildproperty
n_listsproperty
kmeans_n_itersproperty
kmeans_trainset_fractionproperty
pq_bitsproperty
pq_dimproperty
codebook_kindproperty
force_random_rotationproperty
add_data_on_buildproperty
conservative_memory_allocationproperty
max_train_points_per_pq_codeproperty
codes_layoutproperty
get_handlemethod

get_handle

1def get_handle(self)

metric

1def metric(self)

metric_arg

1def metric_arg(self)

add_data_on_build

1def add_data_on_build(self)

n_lists

1def n_lists(self)

kmeans_n_iters

1def kmeans_n_iters(self)

kmeans_trainset_fraction

1def kmeans_trainset_fraction(self)

pq_bits

1def pq_bits(self)

pq_dim

1def pq_dim(self)

codebook_kind

1def codebook_kind(self)

force_random_rotation

1def force_random_rotation(self)

add_data_on_build

1def add_data_on_build(self)

conservative_memory_allocation

1def conservative_memory_allocation(self)

max_train_points_per_pq_code

1def max_train_points_per_pq_code(self)

codes_layout

1def codes_layout(self)

get_handle

1def get_handle(self)

SearchParams

1cdef class SearchParams

Supplemental parameters to search IVF-Pq index

Parameters

NameTypeDescription
n_probesintThe number of clusters to search.
lut_dtypedefault = np.float32Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimansionality. Note that the recall is slightly degraded when low-precision type is selected. Possible values [np.float32, np.float16, np.uint8]
internal_distance_dtypedefault = np.float32Storage data type for distance/similarity computation. Possible values [np.float32, np.float16]
coarse_search_dtypedefault = np.float32[Experimental] The data type to use as the GEMM element type when searching the clusters to probe. Possible values: [np.float32, np.float16, np.int8].
- Legacy default: np.float32
- Recommended for performance: np.float16 (half)
- Experimental/low-precision: np.int8
max_internal_batch_sizedefault = 4096Set the internal batch size to improve GPU utilization at the cost of larger memory footprint.

Constructor

1def __init__(self, *, n_probes=20, lut_dtype=np.float32, internal_distance_dtype=np.float32, coarse_search_dtype=np.float32, max_internal_batch_size=4096)

Members

NameKind
get_handlemethod
n_probesproperty
lut_dtypeproperty
internal_distance_dtypeproperty
coarse_search_dtypeproperty
max_internal_batch_sizeproperty
get_handlemethod

get_handle

1def get_handle(self)

n_probes

1def n_probes(self)

lut_dtype

1def lut_dtype(self)

internal_distance_dtype

1def internal_distance_dtype(self)

coarse_search_dtype

1def coarse_search_dtype(self)

max_internal_batch_size

1def max_internal_batch_size(self)

get_handle

1def get_handle(self)

build

@auto_sync_resources

1def build(IndexParams index_params, dataset, resources=None)

Build the IvfPq index from the dataset for efficient search.

The input dataset array can be either CUDA array interface compliant matrix or an array interface compliant matrix in host memory.

Parameters

NameTypeDescription
index_paramscuvs.neighbors.ivf_pq.IndexParamsParameters on how to build the index
datasetArray interface compliant matrix shape (n_samples, dim)Supported dtype [float32, float16, int8, uint8]
resourcescuvs.common.Resources, optional

Returns

NameTypeDescription
indexcuvs.neighbors.ivf_pq.Index

Examples

1>>> import cupy as cp
2>>> from cuvs.neighbors import ivf_pq
3>>> n_samples = 50000
4>>> n_features = 50
5>>> n_queries = 1000
6>>> k = 10
7>>> dataset = cp.random.random_sample((n_samples, n_features),
8... dtype=cp.float32)
9>>> build_params = ivf_pq.IndexParams(metric="sqeuclidean")
10>>> index = ivf_pq.build(build_params, dataset)
11>>> distances, neighbors = ivf_pq.search(ivf_pq.SearchParams(),
12... index, dataset,
13... k)
14>>> distances = cp.asarray(distances)
15>>> neighbors = cp.asarray(neighbors)

build_precomputed

@auto_sync_resources

1def build_precomputed(IndexParams index_params, uint32_t dim, pq_centers, centers, centers_rot, rotation_matrix, resources=None)

Build a view-type IVF-PQ index from precomputed centroids and codebook.

This function creates a non-owning index that stores a reference to the provided device data. All parameters must be provided with correct extents. The caller is responsible for ensuring the lifetime of the input data exceeds the lifetime of the returned index.

The index_params must be consistent with the provided matrices. Specifically:

  • index_params.codebook_kind determines the expected shape of pq_centers
  • index_params.metric will be stored in the index
  • index_params.conservative_memory_allocation will be stored in the index

Parameters

NameTypeDescription
index_paramscuvs.neighbors.ivf_pq.IndexParamsParameters that must be consistent with the provided matrices
dimintDimensionality of the input data
pq_centersCUDA array interface compliant tensorPQ codebook on device memory with required shape:
- codebook_kind “subspace”: [pq_dim, pq_len, pq_book_size]
- codebook_kind “cluster”: [n_lists, pq_len, pq_book_size] Supported dtype: float32
centersCUDA array interface compliant matrixCluster centers in the original space [n_lists, dim_ext] where dim_ext = round_up(dim + 1, 8). Supported dtype: float32
centers_rotCUDA array interface compliant matrixRotated cluster centers [n_lists, rot_dim] where rot_dim = pq_len * pq_dim. Supported dtype: float32
rotation_matrixCUDA array interface compliant matrixTransform matrix (original space -> rotated padded space) [rot_dim, dim]. Supported dtype: float32
resourcescuvs.common.Resources, optional

Returns

NameTypeDescription
indexcuvs.neighbors.ivf_pq.Index

Examples

1>>> import cupy as cp
2>>> from cuvs.neighbors import ivf_pq
3>>> n_lists = 100
4>>> dim = 128
5>>> pq_dim = 16
6>>> pq_bits = 8
7>>> pq_len = (dim + pq_dim - 1) // pq_dim # ceil division
8>>> pq_book_size = 1 << pq_bits
9>>> rot_dim = pq_len * pq_dim
10>>> dim_ext = ((dim + 1 + 7) // 8) * 8 # round_up(dim + 1, 8)
11>>>
12>>> # Prepare precomputed matrices (example with random data)
13>>> pq_centers = cp.random.random((pq_dim, pq_len, pq_book_size),
14... dtype=cp.float32)
15>>> centers = cp.random.random((n_lists, dim_ext), dtype=cp.float32)
16>>> centers_rot = cp.random.random((n_lists, rot_dim), dtype=cp.float32)
17>>> rotation_matrix = cp.random.random((rot_dim, dim), dtype=cp.float32)
18>>>
19>>> # Build index from precomputed data
20>>> build_params = ivf_pq.IndexParams(n_lists=n_lists, pq_dim=pq_dim,
21... pq_bits=pq_bits,
22... codebook_kind="subspace")
23>>> index = ivf_pq.build_precomputed(build_params, dim, pq_centers,
24... centers, centers_rot, rotation_matrix)

extend

@auto_sync_resources

1def extend(Index index, new_vectors, new_indices, resources=None)

Extend an existing index with new vectors.

The input array can be either CUDA array interface compliant matrix or array interface compliant matrix in host memory.

Parameters

NameTypeDescription
indexivf_pq.IndexTrained ivf_pq object.
new_vectorsarray interface compliant matrix shape (n_samples, dim)Supported dtype [float, int8, uint8]
new_indicesarray interface compliant vector shape (n_samples)Supported dtype [int64]
resourcescuvs.common.Resources, optional

Returns

NameTypeDescription
indexcuvs.neighbors.ivf_pq.Index

Examples

1>>> import cupy as cp
2>>> from cuvs.neighbors import ivf_pq
3>>> n_samples = 50000
4>>> n_features = 50
5>>> n_queries = 1000
6>>> dataset = cp.random.random_sample((n_samples, n_features),
7... dtype=cp.float32)
8>>> index = ivf_pq.build(ivf_pq.IndexParams(), dataset)
9>>> n_rows = 100
10>>> more_data = cp.random.random_sample((n_rows, n_features),
11... dtype=cp.float32)
12>>> indices = n_samples + cp.arange(n_rows, dtype=cp.int64)
13>>> index = ivf_pq.extend(index, more_data, indices)
14>>> # Search using the built index
15>>> queries = cp.random.random_sample((n_queries, n_features),
16... dtype=cp.float32)
17>>> distances, neighbors = ivf_pq.search(ivf_pq.SearchParams(),
18... index, queries,
19... k=10)

load

@auto_sync_resources

1def load(filename, resources=None)

Loads index from file.

Saving / loading the index is experimental. The serialization format is subject to change, therefore loading an index saved with a previous version of cuvs is not guaranteed to work.

Parameters

NameTypeDescription
filenamestringName of the file.
resourcescuvs.common.Resources, optional

Returns

NameTypeDescription
indexIndex

save

@auto_sync_resources

1def save(filename, Index index, bool include_dataset=True, resources=None)

Saves the index to a file.

Saving / loading the index is experimental. The serialization format is subject to change.

Parameters

NameTypeDescription
filenamestringName of the file.
indexIndexTrained IVF-PQ index.
resourcescuvs.common.Resources, optional

Examples

1>>> import cupy as cp
2>>> from cuvs.neighbors import ivf_pq
3>>> n_samples = 50000
4>>> n_features = 50
5>>> dataset = cp.random.random_sample((n_samples, n_features),
6... dtype=cp.float32)
7>>> # Build index
8>>> index = ivf_pq.build(ivf_pq.IndexParams(), dataset)
9>>> # Serialize and deserialize the ivf_pq index built
10>>> ivf_pq.save("my_index.bin", index)
11>>> index_loaded = ivf_pq.load("my_index.bin")

@auto_sync_resources @auto_convert_output

1def search(SearchParams search_params, Index index, queries, k, neighbors=None, distances=None, resources=None)

Find the k nearest neighbors for each query.

Parameters

NameTypeDescription
search_paramscuvs.neighbors.ivf_pq.SearchParamsParameters on how to search the index
indexcuvs.neighbors.ivf_pq.IndexTrained IvfPq index.
queriesCUDA array interface compliant matrix shape (n_samples, dim)Supported dtype [float, int8, uint8]
kintThe number of neighbors.
neighborsOptional CUDA array interface compliant matrix shape(n_queries, k), dtype int64_t. If supplied, neighbor indices will be written here in-place. (default None)
distancesOptional CUDA array interface compliant matrix shape(n_queries, k) If supplied, the distances to the neighbors will be written here in-place. (default None)
resourcescuvs.common.Resources, optional

Examples

1>>> import cupy as cp
2>>> from cuvs.neighbors import ivf_pq
3>>> n_samples = 50000
4>>> n_features = 50
5>>> n_queries = 1000
6>>> dataset = cp.random.random_sample((n_samples, n_features),
7... dtype=cp.float32)
8>>> # Build the index
9>>> index = ivf_pq.build(ivf_pq.IndexParams(), dataset)
10>>>
11>>> # Search using the built index
12>>> queries = cp.random.random_sample((n_queries, n_features),
13... dtype=cp.float32)
14>>> k = 10
15>>> search_params = ivf_pq.SearchParams(n_probes=20)
16>>>
17>>> distances, neighbors = ivf_pq.search(search_params, index, queries,
18... k)

transform

@auto_sync_resources

1def transform(Index index, input_dataset, output_labels=None, output_dataset=None, resources=None)

Transform a dataset by applying pq-encoding to the vectors.

Parameters

NameTypeDescription
indexivf_pq.IndexTrained ivf_pq object.
input_datasetarray interface compliant matrix shape (n_samples, dim)Supported dtype [float]
new_indicesOptional array interface compliant vector shape (n_samples)Supported dtype [uint32]
output_datasetOptional array interface compliant matrix shape (n_samples, pq_dim)Supported dtype [uint8]
resourcescuvs.common.Resources, optional

Returns

NameTypeDescription
output_labels, output_datasetThe cluster that each point in the dataset belongs to, and the transformed dataset