Vamana

Source header: cuvs/neighbors/vamana.h

C API for Vamana index build

cuvsVamanaIndexParams

Supplemental parameters to build Vamana Index

struct cuvsVamanaIndexParams { ... };

Fields

| Name | Type | Description |
| --- | --- | --- |
| metric | cuvsDistanceType | Distance type. |
| graph_degree | uint32_t | Maximum degree of the graph; corresponds to the R parameter of the Vamana algorithm in the literature. |
| visited_size | uint32_t | Maximum number of visited nodes per search during the Vamana algorithm. Loosely corresponds to the L parameter in the literature. |
| vamana_iters | float | The number of times all vectors are inserted into the graph. If > 1, all vectors are re-inserted to improve graph quality. |
| alpha | float | Used to determine how aggressive the pruning will be. |
| max_fraction | float | The maximum batch size is this fraction of the total dataset size. Larger values give a faster build but lower graph quality. |
| batch_base | float | Base of the growth rate of batch sizes. |
| queue_size | uint32_t | Size of the candidate queue structure; should be (2^x)-1. |
| reverse_batchsize | uint32_t | Maximum batch size for reverse edge processing (reduces memory footprint). |
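
The constraints in the table can be checked before a build. The following is a minimal sketch in plain C, not part of the cuVS API: the helper names `is_valid_queue_size`, `max_batch_size`, and `batch_size_at` are hypothetical, and the geometric growth of batch sizes is an assumption made to illustrate `batch_base`; the schedule cuVS actually uses may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when queue_size has the form (2^x)-1, e.g. 127 or 255. */
bool is_valid_queue_size(uint32_t queue_size)
{
  return queue_size != 0 && (queue_size & (queue_size + 1u)) == 0;
}

/* Largest batch allowed: max_fraction of the dataset, but at least one vector. */
size_t max_batch_size(size_t n_rows, float max_fraction)
{
  assert(max_fraction > 0.0f && max_fraction <= 1.0f);
  size_t cap = (size_t)((double)max_fraction * (double)n_rows);
  return cap > 0 ? cap : 1;
}

/* ASSUMPTION (illustration only): batch number `step` holds batch_base^step
 * vectors until the cap from max_fraction is reached. */
size_t batch_size_at(size_t n_rows, float max_fraction, float batch_base, unsigned step)
{
  assert(batch_base >= 1.0f);
  size_t cap  = max_batch_size(n_rows, max_fraction);
  double size = 1.0;
  for (unsigned i = 0; i < step && size < (double)cap; ++i) {
    size *= (double)batch_base;
  }
  return size < (double)cap ? (size_t)size : cap;
}
```

Under these assumptions, a dataset of 1000 rows with max_fraction 0.25 and batch_base 2.0 would see batches grow as 1, 2, 4, and so on, until they are capped at 250 vectors.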

cuvsVamanaIndexParamsCreate

Allocate Vamana Index params, and populate with default values

CUVS_EXPORT cuvsError_t cuvsVamanaIndexParamsCreate(cuvsVamanaIndexParams_t* params);

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| params | in | cuvsVamanaIndexParams_t* | cuvsVamanaIndexParams_t to allocate |

Returns

cuvsError_t

cuvsVamanaIndexParamsDestroy

De-allocate Vamana Index params

CUVS_EXPORT cuvsError_t cuvsVamanaIndexParamsDestroy(cuvsVamanaIndexParams_t params);

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| params | in | cuvsVamanaIndexParams_t | cuvsVamanaIndexParams_t to de-allocate |

Returns

cuvsError_t

Vamana index

cuvsVamanaIndex

Struct to hold address of cuvs::neighbors::vamana::index and its active trained dtype

typedef struct { ... } cuvsVamanaIndex;

Fields

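| Name | Type | Description |
| --- | --- | --- |
| addr | uintptr_t | Address of the underlying cuvs::neighbors::vamana::index. |
| dtype | DLDataType | Data type the index was trained on. |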

cuvsVamanaIndexCreate

Allocate Vamana index

CUVS_EXPORT cuvsError_t cuvsVamanaIndexCreate(cuvsVamanaIndex_t* index);

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| index | in | cuvsVamanaIndex_t* | cuvsVamanaIndex_t to allocate |

Returns

cuvsError_t

cuvsVamanaIndexDestroy

De-allocate Vamana index

CUVS_EXPORT cuvsError_t cuvsVamanaIndexDestroy(cuvsVamanaIndex_t index);

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| index | in | cuvsVamanaIndex_t | cuvsVamanaIndex_t to de-allocate |

Returns

cuvsError_t

cuvsVamanaIndexGetDims

Get the dimension of the index

CUVS_EXPORT cuvsError_t cuvsVamanaIndexGetDims(cuvsVamanaIndex_t index, int* dim);

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| index | in | cuvsVamanaIndex_t | cuvsVamanaIndex_t to get the dimension of |
| dim | out | int* | Pointer to the dimension to set |

Returns

cuvsError_t

Vamana index build

cuvsVamanaBuild

Build Vamana index

CUVS_EXPORT cuvsError_t cuvsVamanaBuild(cuvsResources_t res,
                                        cuvsVamanaIndexParams_t params,
                                        DLManagedTensor* dataset,
                                        cuvsVamanaIndex_t index);

Build the index from the dataset for efficient DiskANN search.

The build uses the Vamana insertion-based algorithm to create the graph. The algorithm starts with an empty graph and iteratively inserts batches of nodes. For each vector in a batch, it performs a greedy search and inserts the vector with edges to all nodes traversed during that search. Reverse edges are also inserted, and robustPrune is applied to improve graph quality. The graph_degree field of the params struct bounds the degree of the final graph.
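
The pruning step named above can be illustrated on the CPU. The following is a minimal sketch of the robustPrune rule as it is usually stated in the DiskANN literature, not the GPU implementation in cuVS: the function name `robust_prune`, the fixed `DIM`, and the flat row-major point layout are assumptions made for the example. A candidate c is dropped once a selected neighbor s satisfies alpha * d(s, c) <= d(p, c); the code compares squared L2 distances, so it uses alpha squared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { DIM = 2 }; /* dimensionality used by this sketch */

/* Squared L2 distance between two DIM-dimensional points. */
static float l2_squared(const float* a, const float* b)
{
  float sum = 0.0f;
  for (size_t d = 0; d < DIM; ++d) {
    float diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

/* Picks up to max_degree out-neighbors for point p from cand[0..n_cand).
 * Writes the chosen ids to out (closest first) and returns how many were chosen. */
size_t robust_prune(const float* points, size_t p, const size_t* cand, size_t n_cand,
                    float alpha, size_t max_degree, size_t* out)
{
  assert(points != NULL && cand != NULL && out != NULL);
  unsigned char* alive = (unsigned char*)malloc(n_cand ? n_cand : 1);
  if (alive == NULL) return 0;
  for (size_t i = 0; i < n_cand; ++i) alive[i] = (cand[i] != p);

  const float* pv      = points + p * DIM;
  const float alpha_sq = alpha * alpha;
  size_t n_out         = 0;

  while (n_out < max_degree) {
    /* Find the remaining candidate closest to p. */
    size_t best     = n_cand;
    float best_dist = 0.0f;
    for (size_t i = 0; i < n_cand; ++i) {
      if (!alive[i]) continue;
      float dist = l2_squared(pv, points + cand[i] * DIM);
      if (best == n_cand || dist < best_dist) {
        best      = i;
        best_dist = dist;
      }
    }
    if (best == n_cand) break; /* no candidates left */

    out[n_out++]    = cand[best];
    const float* bv = points + cand[best] * DIM;

    /* Drop candidates that the new neighbor already covers (including itself). */
    for (size_t i = 0; i < n_cand; ++i) {
      if (!alive[i]) continue;
      const float* cv = points + cand[i] * DIM;
      if (alpha_sq * l2_squared(bv, cv) <= l2_squared(pv, cv)) alive[i] = 0;
    }
  }
  free(alive);
  return n_out;
}
```

In this sketch a larger alpha makes the drop condition harder to satisfy, so more candidates survive and the graph stays denser, while max_degree plays the role of graph_degree.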

The following distance metrics are supported:

  • L2

Usage example:
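
A minimal sketch of the build flow, assuming the caller supplies an already-populated `DLManagedTensor` with a supported dtype and layout. The helper name `build_and_save_vamana`, the graph_degree value, and the use of a file prefix argument are illustrative; `cuvsResourcesCreate`, `cuvsResourcesDestroy`, and `CUVS_SUCCESS` are assumed to come from cuvs/core/c_api.h, which this page does not document.

```c
#include <stdbool.h>

#include <cuvs/core/c_api.h>
#include <cuvs/neighbors/vamana.h>

/* Hypothetical helper: builds a Vamana index from `dataset` and saves it
 * under `file_prefix`. Returns the first non-success status encountered. */
cuvsError_t build_and_save_vamana(DLManagedTensor* dataset, const char* file_prefix)
{
  cuvsResources_t res;
  cuvsVamanaIndexParams_t params;
  cuvsVamanaIndex_t index;

  cuvsError_t status = cuvsResourcesCreate(&res);
  if (status != CUVS_SUCCESS) return status;

  /* Allocate params populated with default values. */
  status = cuvsVamanaIndexParamsCreate(&params);
  if (status != CUVS_SUCCESS) goto destroy_res;

  status = cuvsVamanaIndexCreate(&index);
  if (status != CUVS_SUCCESS) goto destroy_params;

  params->graph_degree = 64; /* illustrative value, not a recommendation */

  status = cuvsVamanaBuild(res, params, dataset, index);
  if (status == CUVS_SUCCESS) {
    /* Write the index, including the dataset, for use with DiskANN. */
    status = cuvsVamanaSerialize(res, file_prefix, index, true);
  }

  cuvsVamanaIndexDestroy(index);
destroy_params:
  cuvsVamanaIndexParamsDestroy(params);
destroy_res:
  cuvsResourcesDestroy(res);
  return status;
}
```

The cleanup labels release handles in the reverse order of their creation, so each early exit frees only what was already allocated.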

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| res | in | cuvsResources_t | cuvsResources_t opaque C handle |
| params | in | cuvsVamanaIndexParams_t | cuvsVamanaIndexParams_t used to build the Vamana index |
| dataset | in | DLManagedTensor* | DLManagedTensor* training dataset |
| index | out | cuvsVamanaIndex_t | cuvsVamanaIndex_t Vamana index |

Returns

cuvsError_t

Vamana index serialize

cuvsVamanaSerialize

Save Vamana index to file

CUVS_EXPORT cuvsError_t cuvsVamanaSerialize(cuvsResources_t res,
                                            const char* filename,
                                            cuvsVamanaIndex_t index,
                                            bool include_dataset);

Matches the file format used by the DiskANN open-source repository, allowing cross-compatibility.

The serialized index is intended to be loaded by the DiskANN open-source library for graph search.

Parameters

| Name | Direction | Type | Description |
| --- | --- | --- | --- |
| res | in | cuvsResources_t | cuvsResources_t opaque C handle |
| filename | in | const char* | File prefix under which the index is saved |
| index | in | cuvsVamanaIndex_t | cuvsVamanaIndex_t to serialize |
| include_dataset | in | bool | Whether to include the dataset in the serialized index |

Returns

cuvsError_t