IVF SQ

Source header: cuvs/neighbors/ivf_sq.hpp

IVF-SQ index build parameters

kIndexGroupSize

IVF-SQ index build parameters

```cpp
constexpr static uint32_t kIndexGroupSize = 32;
```

neighbors::ivf_sq::index_params

IVF-SQ index build parameters.

IVF-SQ currently uses 8-bit scalar quantization, storing one uint8_t code per vector dimension.

```cpp
struct index_params : cuvs::neighbors::index_params {
  uint32_t n_lists;
  uint32_t kmeans_n_iters;
  uint32_t max_train_points_per_cluster;
  bool conservative_memory_allocation;
  bool add_data_on_build;
};
```

Fields

| Name | Type | Description |
| --- | --- | --- |
| `n_lists` | `uint32_t` | The number of inverted lists (clusters). |
| `kmeans_n_iters` | `uint32_t` | The number of k-means iterations used to search for cluster centers during index building. |
| `max_train_points_per_cluster` | `uint32_t` | The number of data vectors (per cluster) to use during iterative k-means building. |
| `conservative_memory_allocation` | `bool` | By default, the algorithm allocates more space than necessary for individual clusters (`list_data`). This amortizes the cost of memory allocation and reduces the number of data copies during repeated calls to `extend` (extending the database). The alternative is conservative allocation: when enabled, the algorithm always allocates the minimum amount of memory required to store the given number of records. Set this flag to `true` to use as little GPU memory for the database as possible. |
| `add_data_on_build` | `bool` | Whether to add the dataset content to the index. If `true`, the index is filled with the dataset vectors and is ready to search after calling `build`. If `false`, `build` only trains the underlying model (e.g. quantizer or clustering) and leaves the index empty; call `extend` on the index afterwards to populate it. |
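
The trade-off behind conservative_memory_allocation can be illustrated with a small host-side model. This is only a sketch: `simulate_extends` and `growth_stats` are hypothetical names, and the doubling policy used for the non-conservative case is an assumption chosen for illustration, not the allocation strategy cuVS actually implements.

```cpp
#include <cstddef>
#include <vector>

// Result of the simulation: how often the list had to be reallocated
// (and therefore copied), and how much capacity it ended up with.
struct growth_stats {
  std::size_t reallocations;
  std::size_t capacity;
};

// Hypothetical model of one inverted list growing over repeated extends.
// conservative == true : capacity always equals the exact record count.
// conservative == false: capacity doubles until it fits (assumed policy,
//                        not taken from the cuVS sources).
inline growth_stats simulate_extends(const std::vector<std::size_t>& batch_sizes,
                                     bool conservative)
{
  std::size_t size          = 0;
  std::size_t capacity      = 0;
  std::size_t reallocations = 0;
  for (std::size_t batch : batch_sizes) {
    size += batch;
    if (size > capacity) {
      if (conservative) {
        capacity = size;  // minimum memory, but reallocates on every growth
      } else {
        std::size_t c = capacity == 0 ? 1 : capacity;
        while (c < size) { c *= 2; }  // over-allocate to amortize future extends
        capacity = c;
      }
      ++reallocations;
    }
  }
  return {reallocations, capacity};
}
```

Under this model, ten extends of 100 records each trigger a reallocation on every call in conservative mode, while the doubling policy reallocates far less often at the cost of some unused capacity.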

IVF-SQ index search parameters

neighbors::ivf_sq::search_params

IVF-SQ index search parameters

```cpp
struct search_params : cuvs::neighbors::search_params {
  uint32_t n_probes;
};
```

Fields

| Name | Type | Description |
| --- | --- | --- |
| `n_probes` | `uint32_t` | The number of clusters to search. |
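
To make the role of n_probes concrete, here is a minimal host-side sketch of how an IVF search might pick which clusters to scan: rank the cluster centers by distance to the query and keep the closest n_probes. The helper names `squared_l2` and `select_probes` are hypothetical, the squared L2 metric is an assumption, and the real search runs on the GPU with its own selection kernels.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Squared Euclidean distance between two vectors of equal length.
inline float squared_l2(const std::vector<float>& a, const std::vector<float>& b)
{
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float d = a[i] - b[i];
    acc += d * d;
  }
  return acc;
}

// Hypothetical helper: return the ids of the n_probes clusters whose
// centers are closest to the query, nearest first.
inline std::vector<std::size_t> select_probes(const std::vector<std::vector<float>>& centroids,
                                              const std::vector<float>& query,
                                              std::size_t n_probes)
{
  std::vector<float> dist(centroids.size());
  for (std::size_t i = 0; i < centroids.size(); ++i) {
    dist[i] = squared_l2(centroids[i], query);
  }
  std::vector<std::size_t> order(centroids.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  // Never probe more clusters than exist.
  n_probes = std::min(n_probes, order.size());
  std::partial_sort(order.begin(),
                    order.begin() + static_cast<std::ptrdiff_t>(n_probes),
                    order.end(),
                    [&](std::size_t a, std::size_t b) { return dist[a] < dist[b]; });
  order.resize(n_probes);
  return order;
}
```

A larger n_probes scans more lists per query, which generally raises recall and lowers throughput; setting it to n_lists degenerates to scanning every list.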

IVF-SQ list storage spec

neighbors::ivf_sq::list_spec

IVF-SQ list storage spec

```cpp
template <typename SizeT, typename CodeT, typename IdxT>
struct list_spec {
  SizeT align_max;
  SizeT align_min;
  uint32_t dim;
};
```

Fields

| Name | Type | Description |
| --- | --- | --- |
| `align_max` | `SizeT` | |
| `align_min` | `SizeT` | |
| `dim` | `uint32_t` | |

IVF-SQ index

neighbors::ivf_sq::index

IVF-SQ index.

In the IVF-SQ index, a database vector is first assigned to the nearest cluster center using an inverted file (IVF) structure, and then compressed using scalar quantization (SQ).

Scalar quantization independently maps each dimension of the per-cluster residual (the input vector minus its assigned centroid) to a fixed-width integer code. For 8-bit quantization (uint8_t), each residual component is linearly mapped to an integer in [0, 255] using learned per-dimension minimum (sq_vmin) and step-size (sq_delta) values.

For a vector component x_i, centroid component centroid_i, residual minimum vmin_i, and quantization step delta_i, the stored code is:

$$\mathrm{code}_i = \mathrm{clamp}\left(\mathrm{round}\left(\frac{x_i - \mathrm{centroid}_i - \mathrm{vmin}_i}{\mathrm{delta}_i}\right),\ 0,\ 255\right)$$

The corresponding reconstructed component is:

$$x_i \approx \mathrm{centroid}_i + \mathrm{vmin}_i + \mathrm{code}_i \cdot \mathrm{delta}_i$$

This provides a compact representation (1 byte per dimension) while preserving the relative distances between vectors with high fidelity, offering a good trade-off between index size, search speed, and recall compared to flat (uncompressed) and product-quantized (PQ) representations.
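
The encode and reconstruct formulas above can be sketched in plain host-side C++. The function names `sq_encode`, `sq_decode`, and `sq_encode_vector` are hypothetical and exist only for this illustration; the library performs the equivalent work on the GPU, and its exact rounding behaviour may differ from `std::round`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode one component: quantize the residual (x - centroid) relative to
// the per-dimension minimum vmin, in steps of delta, clamped to [0, 255].
inline std::uint8_t sq_encode(float x, float centroid, float vmin, float delta)
{
  const float q = std::round((x - centroid - vmin) / delta);
  return static_cast<std::uint8_t>(std::clamp(q, 0.0f, 255.0f));
}

// Reconstruct an approximation of the original component from its code.
inline float sq_decode(std::uint8_t code, float centroid, float vmin, float delta)
{
  return centroid + vmin + static_cast<float>(code) * delta;
}

// Encode a whole vector with per-dimension vmin and delta
// (all inputs are assumed to have the same length).
inline std::vector<std::uint8_t> sq_encode_vector(const std::vector<float>& x,
                                                  const std::vector<float>& centroid,
                                                  const std::vector<float>& vmin,
                                                  const std::vector<float>& delta)
{
  std::vector<std::uint8_t> codes(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    codes[i] = sq_encode(x[i], centroid[i], vmin[i], delta[i]);
  }
  return codes;
}
```

For residuals that fall inside the learned range, the reconstruction error per component is at most half a step (delta_i / 2); residuals outside the range are clamped to code 0 or 255 and can incur a larger error.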

Note: CodeT is the storage type for scalar-quantized residual codes in the inverted lists, not the input dataset type. The public build and search APIs accept float and half input vectors, then store the quantized residual components as CodeT inside the index. Currently, IVF-SQ supports only uint8_t codes, so use CodeT = uint8_t. Each code uses the full 8-bit range [0, 255].
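
As a rough way to reason about index size, the sketch below compares the bytes needed for the stored codes with the bytes of the raw input. The helpers `sq_code_bytes` and `raw_bytes` are hypothetical, and the estimate deliberately ignores cluster centers, vector ids, the sq_vmin and sq_delta arrays, and any padding or over-allocation inside the lists.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes used by the quantized codes alone: one CodeT per dimension per vector.
template <typename CodeT = std::uint8_t>
constexpr std::size_t sq_code_bytes(std::size_t n_rows, std::size_t dim)
{
  return n_rows * dim * sizeof(CodeT);
}

// Bytes used by the uncompressed dataset with element type T
// (e.g. float, or a 2-byte half type).
template <typename T>
constexpr std::size_t raw_bytes(std::size_t n_rows, std::size_t dim)
{
  return n_rows * dim * sizeof(T);
}
```

For one million 128-dimensional float vectors this gives 128,000,000 bytes of codes against 512,000,000 bytes of raw data, i.e. a 4x reduction before any of the ignored overheads are added back.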

```cpp
template <typename CodeT>
struct index;
```

IVF-SQ index build

neighbors::ivf_sq::build

Build the index from the dataset for efficient search.

```cpp
auto build(raft::resources const& handle,
           const cuvs::neighbors::ivf_sq::index_params& index_params,
           raft::device_matrix_view<const float, int64_t, raft::row_major> dataset)
  -> cuvs::neighbors::ivf_sq::index<uint8_t>;
```

NB: Currently, the following distance metrics are supported:

  • L2Expanded
  • L2SqrtExpanded
  • InnerProduct
  • CosineExpanded
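
For orientation, the four metric names can be read as the following host-side reference formulas. This is a sketch based on the usual meaning of these names (the "Expanded" variants compute L2 from norms and a dot product, and the cosine variant is a distance, 1 minus the cosine similarity); the function names are hypothetical and the GPU kernels in the library are not implemented this way.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product of two vectors of equal length.
inline float dot(const std::vector<float>& a, const std::vector<float>& b)
{
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) { acc += a[i] * b[i]; }
  return acc;
}

// L2Expanded: squared L2 distance via ||a||^2 + ||b||^2 - 2 * <a, b>.
inline float l2_expanded(const std::vector<float>& a, const std::vector<float>& b)
{
  return dot(a, a) + dot(b, b) - 2.0f * dot(a, b);
}

// L2SqrtExpanded: square root of the expanded form (clamped at zero to
// guard against small negative values from floating-point rounding).
inline float l2_sqrt_expanded(const std::vector<float>& a, const std::vector<float>& b)
{
  return std::sqrt(std::max(0.0f, l2_expanded(a, b)));
}

// InnerProduct: a similarity, so larger values mean closer vectors.
inline float inner_product(const std::vector<float>& a, const std::vector<float>& b)
{
  return dot(a, b);
}

// CosineExpanded: 1 - cosine similarity (inputs assumed non-zero).
inline float cosine_expanded(const std::vector<float>& a, const std::vector<float>& b)
{
  return 1.0f - dot(a, b) / (std::sqrt(dot(a, a)) * std::sqrt(dot(b, b)));
}
```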

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `index_params` | in | `const cuvs::neighbors::ivf_sq::index_params&` | Configures the index building. |
| `dataset` | in | `raft::device_matrix_view<const float, int64_t, raft::row_major>` | A device matrix view of a row-major matrix [n_rows, dim]. |

Returns

cuvs::neighbors::ivf_sq::index<uint8_t>

Additional overload: neighbors::ivf_sq::build

Build the index from the dataset for efficient search.

```cpp
auto build(raft::resources const& handle,
           const cuvs::neighbors::ivf_sq::index_params& index_params,
           raft::device_matrix_view<const half, int64_t, raft::row_major> dataset)
  -> cuvs::neighbors::ivf_sq::index<uint8_t>;
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `index_params` | in | `const cuvs::neighbors::ivf_sq::index_params&` | Configures the index building. |
| `dataset` | in | `raft::device_matrix_view<const half, int64_t, raft::row_major>` | A device matrix view of a row-major matrix [n_rows, dim]. |

Returns

cuvs::neighbors::ivf_sq::index<uint8_t>

Additional overload: neighbors::ivf_sq::build

Build the index from the dataset for efficient search.

```cpp
auto build(raft::resources const& handle,
           const cuvs::neighbors::ivf_sq::index_params& index_params,
           raft::host_matrix_view<const float, int64_t, raft::row_major> dataset)
  -> cuvs::neighbors::ivf_sq::index<uint8_t>;
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `index_params` | in | `const cuvs::neighbors::ivf_sq::index_params&` | Configures the index building. |
| `dataset` | in | `raft::host_matrix_view<const float, int64_t, raft::row_major>` | A host matrix view of a row-major matrix [n_rows, dim]. |

Returns

cuvs::neighbors::ivf_sq::index<uint8_t>

Additional overload: neighbors::ivf_sq::build

Build the index from the dataset for efficient search.

```cpp
auto build(raft::resources const& handle,
           const cuvs::neighbors::ivf_sq::index_params& index_params,
           raft::host_matrix_view<const half, int64_t, raft::row_major> dataset)
  -> cuvs::neighbors::ivf_sq::index<uint8_t>;
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `index_params` | in | `const cuvs::neighbors::ivf_sq::index_params&` | Configures the index building. |
| `dataset` | in | `raft::host_matrix_view<const half, int64_t, raft::row_major>` | A host matrix view of a row-major matrix [n_rows, dim]. |

Returns

cuvs::neighbors::ivf_sq::index<uint8_t>

IVF-SQ index extend

neighbors::ivf_sq::extend

Extend the index with the new data in-place.

```cpp
void extend(raft::resources const& handle,
            raft::device_matrix_view<const float, int64_t, raft::row_major> new_vectors,
            std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
            cuvs::neighbors::ivf_sq::index<uint8_t>* idx);
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `new_vectors` | in | `raft::device_matrix_view<const float, int64_t, raft::row_major>` | a device matrix view to a row-major matrix [n_rows, idx.dim()] |
| `new_indices` | in | `std::optional<raft::device_vector_view<const int64_t, int64_t>>` | a device vector view to a vector of indices [n_rows]. If the original index is empty (`idx.size() == 0`), you can pass `std::nullopt` here to imply a continuous range [0...n_rows). |
| `idx` | inout | `cuvs::neighbors::ivf_sq::index<uint8_t>*` | pointer to ivf_sq::index |
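
The std::nullopt convention for new_indices can be illustrated with a small host-side sketch. `resolve_ids` is a hypothetical helper, not part of the library, and it models only the documented case of an initially empty index: explicit ids are used as given, and a missing value stands for the range [0, n_rows).

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical helper: produce the ids that the new rows would receive
// when extending an empty index.
inline std::vector<std::int64_t> resolve_ids(
  const std::optional<std::vector<std::int64_t>>& new_indices, std::int64_t n_rows)
{
  if (new_indices.has_value()) {
    return *new_indices;  // caller-supplied ids, one per row
  }
  // std::nullopt: implied consecutive ids 0, 1, ..., n_rows - 1
  std::vector<std::int64_t> ids(static_cast<std::size_t>(n_rows));
  std::iota(ids.begin(), ids.end(), std::int64_t{0});
  return ids;
}
```

Passing explicit ids is useful when the rows carry identifiers from an external system; the sketch does not cover what happens when std::nullopt is passed for an index that already holds data.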

Returns

void

Additional overload: neighbors::ivf_sq::extend

Extend the index with the new data in-place.

```cpp
void extend(raft::resources const& handle,
            raft::device_matrix_view<const half, int64_t, raft::row_major> new_vectors,
            std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
            cuvs::neighbors::ivf_sq::index<uint8_t>* idx);
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `new_vectors` | in | `raft::device_matrix_view<const half, int64_t, raft::row_major>` | a device matrix view to a row-major matrix [n_rows, idx.dim()] |
| `new_indices` | in | `std::optional<raft::device_vector_view<const int64_t, int64_t>>` | a device vector view to a vector of indices [n_rows]. If the original index is empty (`idx.size() == 0`), you can pass `std::nullopt` here to imply a continuous range [0...n_rows). |
| `idx` | inout | `cuvs::neighbors::ivf_sq::index<uint8_t>*` | pointer to ivf_sq::index |

Returns

void

Additional overload: neighbors::ivf_sq::extend

Extend the index with the new data in-place.

```cpp
void extend(raft::resources const& handle,
            raft::host_matrix_view<const float, int64_t, raft::row_major> new_vectors,
            std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
            cuvs::neighbors::ivf_sq::index<uint8_t>* idx);
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `new_vectors` | in | `raft::host_matrix_view<const float, int64_t, raft::row_major>` | a host matrix view to a row-major matrix [n_rows, idx.dim()] |
| `new_indices` | in | `std::optional<raft::host_vector_view<const int64_t, int64_t>>` | a host vector view to a vector of indices [n_rows]. If the original index is empty (`idx.size() == 0`), you can pass `std::nullopt` here to imply a continuous range [0...n_rows). |
| `idx` | inout | `cuvs::neighbors::ivf_sq::index<uint8_t>*` | pointer to ivf_sq::index |

Returns

void

Additional overload: neighbors::ivf_sq::extend

Extend the index with the new data in-place.

```cpp
void extend(raft::resources const& handle,
            raft::host_matrix_view<const half, int64_t, raft::row_major> new_vectors,
            std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
            cuvs::neighbors::ivf_sq::index<uint8_t>* idx);
```

Usage example:

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | |
| `new_vectors` | in | `raft::host_matrix_view<const half, int64_t, raft::row_major>` | a host matrix view to a row-major matrix [n_rows, idx.dim()] |
| `new_indices` | in | `std::optional<raft::host_vector_view<const int64_t, int64_t>>` | a host vector view to a vector of indices [n_rows]. If the original index is empty (`idx.size() == 0`), you can pass `std::nullopt` here to imply a continuous range [0...n_rows). |
| `idx` | inout | `cuvs::neighbors::ivf_sq::index<uint8_t>*` | pointer to ivf_sq::index |

Returns

void

IVF-SQ index serialize

neighbors::ivf_sq::serialize

Save the index to file.

```cpp
void serialize(raft::resources const& handle,
               const std::string& filename,
               const cuvs::neighbors::ivf_sq::index<uint8_t>& index);
```

Experimental: both the API and the serialization format are subject to change.

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | the raft handle |
| `filename` | in | `const std::string&` | the file name for saving the index |
| `index` | in | `const cuvs::neighbors::ivf_sq::index<uint8_t>&` | IVF-SQ index |

Returns

void

neighbors::ivf_sq::deserialize

Load index from file.

```cpp
void deserialize(raft::resources const& handle,
                 const std::string& filename,
                 cuvs::neighbors::ivf_sq::index<uint8_t>* index);
```

Experimental: both the API and the serialization format are subject to change.

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| `handle` | in | `raft::resources const&` | the raft handle |
| `filename` | in | `const std::string&` | the name of the file that stores the index |
| `index` | out | `cuvs::neighbors::ivf_sq::index<uint8_t>*` | IVF-SQ index |

Returns

void