cuVS Bench Backends
This page explains how cuVS Bench separates benchmark orchestration from the system that actually builds and searches an index. Use it when you want to understand the built-in C++ benchmark backend, add a new backend for another product or service, or add a new indexing algorithm to the existing C++ backend.
cuVS Bench uses two pieces for each backend: a config loader, which decides what to run, and a backend, which decides how to run it.
Both pieces are registered under the same backend type name. The default backend type is cpp_gbench, which runs the C++ Google Benchmark executables.
How a benchmark run works
- The user calls BenchmarkOrchestrator(...).run_benchmark(...).
- The orchestrator finds the config loader registered for the requested backend type.
- The config loader returns a DatasetConfig and one or more BenchmarkConfig objects.
- The orchestrator creates the backend registered for the same backend type.
- The backend runs build(...) and search(...), then returns BuildResult and SearchResult objects.
The config loader decides what to run. The backend decides how to run it.
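The flow above can be sketched in a few lines of Python. This is only an illustration of dispatching by backend type name: the dataclasses, the LOADERS and BACKENDS dictionaries, ToyBackend, toy_loader, and this standalone run_benchmark function are all hypothetical stand-ins, not the cuvs_bench implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Stand-in types for illustration only; the real cuvs_bench classes may differ.
@dataclass
class DatasetConfig:
    name: str


@dataclass
class BenchmarkConfig:
    index_name: str


@dataclass
class BuildResult:
    index_name: str
    success: bool


@dataclass
class SearchResult:
    index_name: str
    success: bool


class ToyBackend:
    """Hypothetical backend that pretends every build and search succeeds."""

    def build(self, dataset: DatasetConfig, config: BenchmarkConfig) -> BuildResult:
        return BuildResult(config.index_name, True)

    def search(self, dataset: DatasetConfig, config: BenchmarkConfig) -> SearchResult:
        return SearchResult(config.index_name, True)


def toy_loader(dataset: str, **options) -> Tuple[DatasetConfig, List[BenchmarkConfig]]:
    """Hypothetical loader: one dataset and one benchmark config."""
    return DatasetConfig(dataset), [BenchmarkConfig(f"{dataset}.toy_index")]


# Both registries are keyed by the same backend type name (assumed layout).
LOADERS: Dict[str, Callable[..., Tuple[DatasetConfig, List[BenchmarkConfig]]]] = {
    "toy": toy_loader
}
BACKENDS: Dict[str, Callable[[], ToyBackend]] = {"toy": ToyBackend}


def run_benchmark(backend_type: str, **kwargs):
    # The loader decides what to run.
    dataset_config, benchmark_configs = LOADERS[backend_type](**kwargs)
    # The backend decides how to run it.
    backend = BACKENDS[backend_type]()
    results = []
    for config in benchmark_configs:
        build_result = backend.build(dataset_config, config)
        search_result = backend.search(dataset_config, config)
        results.append((build_result, search_result))
    return results
```

Calling run_benchmark("toy", dataset="demo") in this sketch returns one (BuildResult, SearchResult) pair, because the toy loader produces a single benchmark config.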
Configuration contract
A config loader receives the arguments passed to run_benchmark(), such as dataset, dataset_path, algorithms, count, batch_size, groups, and backend-specific options. It returns a DatasetConfig and a list of one or more BenchmarkConfig objects.
Each IndexConfig describes one index to benchmark:
The following minimal loader creates one dataset, one index, and one search configuration:
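As a rough sketch of such a loader, the block below uses stand-in dataclasses rather than the real cuvs_bench types. Every field name (name, path, algo, build_param, search_params, indexes), the default values for count and batch_size, and the example_* parameter names are assumptions chosen for illustration. A real loader would subclass ConfigLoader instead of being a plain class.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


# Stand-in types; field names are assumptions, not the cuvs_bench definitions.
@dataclass
class DatasetConfig:
    name: str
    path: str


@dataclass
class IndexConfig:
    name: str
    algo: str
    build_param: Dict[str, Any]
    search_params: List[Dict[str, Any]]


@dataclass
class BenchmarkConfig:
    indexes: List[IndexConfig]
    count: int
    batch_size: int


class MinimalLoader:
    """Hypothetical loader producing one dataset, one index, one search config."""

    def load(
        self,
        dataset: str,
        dataset_path: str,
        count: int = 10,        # arbitrary default for this sketch
        batch_size: int = 1,    # arbitrary default for this sketch
        **backend_options: Any,
    ) -> Tuple[DatasetConfig, List[BenchmarkConfig]]:
        dataset_config = DatasetConfig(name=dataset, path=dataset_path)
        index = IndexConfig(
            name=f"{dataset}.example_index",
            algo="example_algo",
            build_param={"example_build_knob": 32},
            # One entry in this list means one search configuration.
            search_params=[{"example_search_knob": 64}],
        )
        return dataset_config, [BenchmarkConfig([index], count, batch_size)]
```

The return value has the (DatasetConfig, List[BenchmarkConfig]) shape described above, so an orchestrator can hand each BenchmarkConfig to a backend without knowing how the loader produced it.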
Adding a backend
Add a new backend when cuVS Bench needs to drive a different execution path, such as a vector database, remote service, or custom benchmark runner.
- Implement a config loader by subclassing ConfigLoader from cuvs_bench.orchestrator.config_loaders. Its load() method should return (DatasetConfig, List[BenchmarkConfig]).
- Implement a backend by subclassing BenchmarkBackend from cuvs_bench.backends.base. Its build() method should return BuildResult; its search() method should return SearchResult.
- Register both pieces with the same backend type name.
- Run benchmarks with BenchmarkOrchestrator(backend_type="my_backend").
Example: Elasticsearch backend
This example shows the shape of a network backend. The loader creates the dataset and benchmark configs. The backend uses backend_config to connect to the service, build the index, run search, and return cuVS Bench result objects.
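To make the shape of a network backend concrete, here is a minimal sketch that swaps the real service for an in-memory fake. FakeVectorService is not the Elasticsearch client API, and the "host" key in backend_config, the result fields, and the NetworkBackend class are all hypothetical. A real backend would subclass BenchmarkBackend and call the actual service client in the same two places.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Sequence


# Stand-in result types; field names are assumptions for this sketch.
@dataclass
class BuildResult:
    build_time_s: float
    success: bool


@dataclass
class SearchResult:
    neighbors: List[List[int]]
    search_time_s: float


class FakeVectorService:
    """In-memory stand-in for a remote vector service client."""

    def __init__(self, host: str):
        self.host = host
        self.vectors: Dict[int, Sequence[float]] = {}

    def index(self, doc_id: int, vector: Sequence[float]) -> None:
        self.vectors[doc_id] = vector

    def knn(self, query: Sequence[float], k: int) -> List[int]:
        # Brute-force nearest neighbors by squared Euclidean distance.
        def sq_dist(v: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(v, query))

        ranked = sorted(self.vectors, key=lambda i: sq_dist(self.vectors[i]))
        return ranked[:k]


class NetworkBackend:
    """Hypothetical backend that drives a service through a client object."""

    def __init__(self, backend_config: Dict[str, str]):
        # "host" is an assumed key; real options depend on the service.
        self.client = FakeVectorService(backend_config["host"])

    def build(self, base_vectors: Sequence[Sequence[float]]) -> BuildResult:
        start = time.perf_counter()
        for doc_id, vector in enumerate(base_vectors):
            self.client.index(doc_id, vector)
        return BuildResult(time.perf_counter() - start, True)

    def search(self, queries: Sequence[Sequence[float]], k: int) -> SearchResult:
        start = time.perf_counter()
        neighbors = [self.client.knn(q, k) for q in queries]
        return SearchResult(neighbors, time.perf_counter() - start)
```

Keeping the client behind the backend object means the orchestrator only ever sees build(), search(), and the result objects, regardless of which service sits on the other end.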
Components at a glance
C++ Backend
The built-in CppGoogleBenchmarkBackend uses backend_type="cpp_gbench". Its config loader reads YAML under config/datasets and config/algos, expands parameter combinations, and validates constraints. Its backend runs the C++ benchmark executables and merges their results.
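Expanding parameter combinations and validating constraints can be pictured as a filtered Cartesian product. The sketch below is not the cpp_gbench loader's code; expand_params, the parameter names, and the constraint are illustrative assumptions.

```python
from itertools import product
from typing import Any, Callable, Dict, List


def expand_params(
    grid: Dict[str, List[Any]],
    is_valid: Callable[[Dict[str, Any]], bool] = lambda params: True,
) -> List[Dict[str, Any]]:
    """Return every combination of the listed values that passes is_valid."""
    keys = list(grid)
    combos = (
        dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))
    )
    return [combo for combo in combos if is_valid(combo)]


# Illustrative parameter grid, as it might appear after reading a YAML file.
build_grid = {
    "graph_degree": [32, 64],
    "intermediate_graph_degree": [32, 64, 128],
}


# Illustrative constraint: drop combinations where the first value exceeds the second.
def degree_constraint(params: Dict[str, Any]) -> bool:
    return params["graph_degree"] <= params["intermediate_graph_degree"]


valid_build_params = expand_params(build_grid, degree_constraint)
```

The grid has six raw combinations; the constraint removes the one where graph_degree is 64 and intermediate_graph_degree is 32, leaving five.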
Adding a new C++ algorithm usually means adding another executable and YAML config for this backend. It does not require a new backend type.
Implementation and configuration
New algorithms should be C++ classes that inherit from class ANN, defined in cpp/bench/ann/src/ann.h, and implement all of its pure virtual functions.
Define separate build and search parameter structs. The search parameter struct should inherit from struct ANN<T>::AnnSearchParam.
The benchmark program consumes generated JSON files for indexes, build parameters, and search parameters. The JSON objects map to YAML build_param objects and search_param arrays.
Parse build and search parameters from JSON:
Add matching if cases to create_algo() and create_search_param() in cpp/bench/ann/. The string literal must match the algo value in the configuration file.
Adding a CMake target
cuvs/cpp/bench/ann/CMakeLists.txt provides a CMake helper for new benchmark targets:
Example target for HNSWLIB:
This creates HNSWLIB_ANN_BENCH, which runs HNSWLIB benchmarks.
Add an algos.yaml entry that maps the algorithm name to its executable and declares whether the algorithm requires a GPU:
The executable field specifies the binary used to build and search the index. cuVS Bench expects it to be available in cuvs/cpp/build/. The requires_gpu field tells cuVS Bench whether the algorithm must run on a GPU node.
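One way to picture how such an entry could be consumed is the lookup below. It is a sketch under assumptions: the "hnswlib" and "example_gpu_algo" keys, the EXAMPLE_GPU_ANN_BENCH name, and the resolve_executable helper are hypothetical, and the mapping is written as a Python dictionary instead of being parsed from algos.yaml. Only HNSWLIB_ANN_BENCH and the cuvs/cpp/build/ location come from the text above.

```python
from pathlib import Path
from typing import Any, Dict

# Assumed in-memory form of two algos.yaml entries.
ALGOS: Dict[str, Dict[str, Any]] = {
    "hnswlib": {"executable": "HNSWLIB_ANN_BENCH", "requires_gpu": False},
    "example_gpu_algo": {"executable": "EXAMPLE_GPU_ANN_BENCH", "requires_gpu": True},
}


def resolve_executable(
    algo: str,
    algos: Dict[str, Dict[str, Any]],
    build_dir: str,
    gpu_available: bool,
) -> Path:
    """Map an algorithm name to its benchmark binary, honoring requires_gpu."""
    if algo not in algos:
        raise KeyError(f"no algos entry for {algo!r}")
    entry = algos[algo]
    if entry["requires_gpu"] and not gpu_available:
        raise RuntimeError(f"{algo!r} requires a GPU node")
    return Path(build_dir) / entry["executable"]
```

With these entries, resolving "hnswlib" succeeds on a CPU-only node, while resolving "example_gpu_algo" without a GPU raises an error before any executable is launched.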
Summary
cuVS Bench backends let the same benchmark workflow run against different execution targets. A config loader describes the dataset and parameter combinations, while a backend performs the build and search work. Use a new backend type for a new execution environment, and use the existing C++ backend when you are only adding another C++ ANN benchmark executable.