Brute-force
Brute-force
Brute-force, or a flat index, is the simplest nearest-neighbor index. It stores the dataset and checks every vector during search.
Brute-force works well when exact results matter, the dataset is small enough to scan, or filters remove most of the dataset before search.
Example API Usage
C API | C++ API | Python API | Java API | Rust API | Go API
Building an index
C
C++
Python
Java
Rust
Go
Searching an index
C
C++
Python
Java
Rust
Go
Saving and loading an index
Serialize a Brute-force index when you want to reuse the same stored dataset without rebuilding it. The examples below save the dataset with the index so the loaded index can be searched immediately.
Rust and Go do not currently expose Brute-force save/load wrappers.
C
C++
Python
Java
How Brute-force works
Brute-force stores the vectors and compares each query against every vector in the dataset. There is no partitioning step, graph traversal, or approximation step.
The build step is mostly a data setup step. The search step does the real work: for each query, cuVS computes distances to every vector, keeps the best k candidates, and returns exact nearest neighbors.
When to use Brute-force
Use brute-force when exact nearest neighbors are required.
Use brute-force when the dataset fits comfortably in device memory and search latency is acceptable.
Use brute-force as a ground-truth baseline for tuning approximate indexes.
Use brute-force for heavily filtered queries. If a filter excludes most vectors, brute-force can search only the remaining candidates while IVF and graph indexes may skip useful vectors before the filter is applied.
Use an approximate index instead when the dataset is large enough that scanning every vector is too slow.
Using Filters
Brute-force supports filtered search. A filter tells the search which vectors are allowed before distances are computed.
This is useful when queries should only search a small subset of the dataset. For example, if a filter removes 90%-99% of the vectors, brute-force may do much less work while still returning exact results within the allowed subset.
Unlike partitioned or graph-based indexes, brute-force does not need to guess which part of the index might contain the answer. If a vector passes the filter, it can be considered.
The examples below use a bitmap filter. A bitmap can express a different allow list for each query. A bit value of 1 means the query may consider that vector; a bit value of 0 means the vector is filtered out for that query.
These examples cover the Brute-force bindings that currently expose filters. The Rust and Go Brute-force wrappers currently search without a filter argument.
C
C++
Python
Java
Configuration parameters
Build parameters
Search parameters
The C++ brute_force::search_params struct currently has no tunable fields.
Filters are passed as search function arguments in bindings that expose filtered search, not as fields in brute_force::search_params.
Tuning Considerations
Brute-force has very few tuning knobs. The main decisions are distance metric, query batch size, output k, and whether a filter can reduce the amount of work.
Brute-force is exact, but ties can make result order look different across runs. If many vectors have the same distance near the cutoff, increasing k can make comparisons against ground truth more stable.
For large batches, memory allocation can affect performance. Reuse output buffers and memory resources when the API supports it.
Memory footprint
Brute-force memory is dominated by the dataset, optional vector norms, current query batch, and output buffers. It does not build an additional graph or partitioned index.
Variables:
N: Number of database vectors.D: Vector dimension.B: Bytes per dataset value, such as4for fp32.Q: Query batch size.K: Number of neighbors returned per query.S_idx: Bytes per returned neighbor ID.R_norm:1when the selected distance metric stores vector norms, otherwise0.
Scratch and maximum vectors
The scratch term covers temporary distance tiles, allocator padding, CUDA library workspaces, and memory held by the active memory resource. Use H = 0.10 for a first search estimate. If you can measure a representative run, use:
Then set:
The capacity variables in this subsection are:
M_free: Free memory in the relevant memory space before the operation starts. Use device memory for GPU-resident formulas and host memory for formulas explicitly marked as host memory.M_other: Memory reserved for arrays, memory pools, concurrent work, or application buffers that are not included in the formula.H: Scratch headroom fraction reserved for temporary buffers and allocator overhead.M_usable: Memory budget left for the formula after subtractingM_otherand reserving headroom.observed_peak: Peak memory observed during a smaller representative run.formula_without_scratch: Value of the selected peak formula with explicitscratchterms removed and without applying headroom.peak_without_scratch(count): The selected peak formula rewritten as a function of the count being estimated, excluding scratch and headroom. The count is usuallyNfor rows or vectors andBfor K-selection batch rows.B_per_row/B_per_vector: Bytes added by one more row or vector in the selected formula. For linear formulas, add the coefficients of the count being estimated after fixed values such asD,K,Q, andLare substituted.B_fixed: Bytes in the selected formula that do not change with the estimated count, such as codebooks, centroids, fixed query batches, capped training buffers, or metadata.N_max/B_max: Estimated largest row, vector, or batch-row count that fits inM_usable.
For fixed D, Q, and K, solve:
where B_per_vector = D * B + R_norm * 4. B_fixed includes query and result buffers.
Index footprint
The raw vectors use:
Vector norms, for distance metrics that use them, use:
Search peak memory usage
Search adds the query batch and output buffers:
The total search peak is approximately: