Integrations
cuVS can be adopted at several layers of a vector-search stack. Applications can call cuVS directly, use a database or search engine with cuVS-backed indexing, rely on data platforms that bring NVIDIA accelerated computing closer to enterprise data, or use libraries that expose cuVS algorithms through familiar APIs.
This page summarizes where each integration fits and links to vendor documentation for setup, supported configurations, and operational details.
Databases
Use these integrations when vector search should be managed by a database or search engine. The system owns ingestion, indexing, query APIs, and operations, while cuVS-backed paths accelerate supported index build or search workflows.
Amazon OpenSearch Service
Amazon OpenSearch Service provides managed GPU acceleration for vector indexing on OpenSearch 3.1+ domains and OpenSearch Serverless vector collections. OpenSearch Service detects supported Faiss vector index builds, offloads the build work to managed GPU capacity, and applies acceleration to indexing and force-merge operations without requiring users to manage GPU instances. See the AWS docs for enabling GPU acceleration, creating GPU-accelerated vector indexes, and indexing vector data.
CyborgDB
CyborgDB is an encrypted vector database proxy for confidential vector search across backing stores such as PostgreSQL, Redis, S3, and in-memory storage. Cyborg and NVIDIA demonstrated a cuVS-accelerated proof of concept that speeds encrypted vector index build and retrieval while preserving CyborgDB’s confidentiality model. See the CyborgDB overview, quickstart, embedded deployment docs, and NVIDIA’s Cyborg and cuVS technical blog.
Elasticsearch
Elasticsearch GPU-accelerated vector indexing uses NVIDIA cuVS to accelerate dense-vector HNSW index construction. This path is intended for ingestion-heavy workloads where CPU HNSW graph construction is a bottleneck. Elastic’s implementation builds graph structures on the GPU and converts them for Elasticsearch search workflows. See Elastic’s GPU vector indexing reference, plus chapter 1 and chapter 2 of the GPU acceleration writeups.
Kinetica
Kinetica is a GPU-accelerated database with native vector columns, SQL vector operators, and Python APIs for vector search. Kinetica supports CAGRA indexes on vector columns for GPU-oriented approximate nearest-neighbor search, while HNSW remains the automatically maintained option for mutable vector data. See Kinetica’s CAGRA index docs, NVIDIA partner page, vector I/O example, and vector search example.
KX / KDB.AI
KDB.AI is KX’s vector database for AI applications and similarity search workflows. The cuVS-enabled KDB.AI Server image, kdbai-db-cuvs, bundles the GPU dependencies needed to build and search CAGRA indexes with NVIDIA cuVS while keeping the standard KDB.AI client APIs. KDB.AI also integrates with kdb+, allowing existing q and kdb+ datasets to participate in vector search workflows. See KX’s kdb+ product page and NVIDIA partner page for broader KX and NVIDIA integration context.
Milvus
Milvus GPU indexes provide GPU-accelerated options for high-throughput and high-recall vector search. Supported index types include GPU_CAGRA, GPU_IVF_FLAT, GPU_IVF_PQ, and GPU_BRUTE_FORCE. GPU_CAGRA also supports hybrid deployments where a GPU-built graph can be adapted at load time for CPU search in Milvus 2.6.4 and newer. See the Milvus docs for building indexes and installing Milvus with GPU support.
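As a rough illustration of how a GPU_CAGRA index might be requested, the sketch below assembles index and search parameter dictionaries in plain Python. The parameter names (intermediate_graph_degree, graph_degree, adapt_for_cpu, itopk_size, ef) and the string form of adapt_for_cpu follow the Milvus GPU index documentation as best understood here, so check them against the Milvus version in use. The helper names and default values are hypothetical. No Milvus client is imported; the dictionaries would still need to be passed to the client's index-creation and search calls.

```python
"""Sketch: parameter dictionaries for a Milvus GPU_CAGRA index.

Parameter names are assumptions based on the Milvus GPU index docs;
the helper functions and defaults are hypothetical.
"""


def cagra_index_params(metric_type="L2", graph_degree=32,
                       intermediate_graph_degree=64, adapt_for_cpu=False):
    """Return index parameters for a GPU_CAGRA index."""
    if graph_degree > intermediate_graph_degree:
        # The final graph is pruned from the intermediate graph,
        # so it should not be denser than the graph it is pruned from.
        raise ValueError("graph_degree should not exceed intermediate_graph_degree")
    params = {
        "intermediate_graph_degree": intermediate_graph_degree,
        "graph_degree": graph_degree,
    }
    if adapt_for_cpu:
        # Assumed flag for the hybrid mode: build on GPU, search on CPU.
        params["adapt_for_cpu"] = "true"
    return {"index_type": "GPU_CAGRA", "metric_type": metric_type, "params": params}


def cagra_search_params(top_k, cpu_search=False, itopk_size=128, ef=64):
    """Return search parameters sized so the candidate list can hold top_k results."""
    if cpu_search:
        # An index adapted for CPU is searched with an HNSW-style ef value.
        return {"params": {"ef": max(ef, top_k)}}
    return {"params": {"itopk_size": max(itopk_size, top_k)}}


if __name__ == "__main__":
    print(cagra_index_params(adapt_for_cpu=True))
    print(cagra_search_params(top_k=10, cpu_search=True))
```

Keeping the build and search parameters in small helpers like these makes it easier to switch between GPU search and the CPU-adapted mode without touching the rest of the application.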
OpenSearch
OpenSearch Vector Engine supports vector search, hybrid search, and RAG-oriented retrieval in OpenSearch. The OpenSearch GPU indexing design offloads vector index builds to GPU workers, uses cuVS through Faiss to build CAGRA graphs, converts those graphs into an HNSW-compatible form for CPU search, and falls back to CPU index building when needed. See the OpenSearch vector search docs, GPU-accelerated vector search blog, k-NN RFCs for GPU acceleration and remote vector index build, and Amazon OpenSearch Service GPU acceleration docs.
Oracle AI Database 26ai
Oracle AI Database 26ai includes Oracle AI Vector Search for storing embeddings, creating vector indexes, and running similarity search inside Oracle Database. Oracle’s Vector Index Service uses GPU-enabled containers and NVIDIA cuVS to generate Oracle In-Memory Neighbor Graph vector indexes, then returns the index to Oracle AI Database for query execution. See Oracle’s NVIDIA collaboration announcement, Vector Index Service introduction, and getting started guide.
Solr
Apache Solr dense vector search can use cuVS through the cuvs-lucene vector format. Solr exposes this through CuVSCodecFactory and the CuVSCodec module, allowing CAGRA graph construction on GPU and serialization into an HNSW graph for search. The Solr guide covers setup steps, including the cuvs module jars, knnAlgorithm="cagra_hnsw" on DenseVectorField, and codec configuration in solrconfig.xml.
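To make the schema side of this concrete, the following sketch uses Python's standard XML library to generate a DenseVectorField type definition with knnAlgorithm="cagra_hnsw". The field type name, the 768-dimension value, and the cosine similarity choice are hypothetical, and the codec configuration in solrconfig.xml is not shown because its exact form should be taken from the Solr guide.

```python
import xml.etree.ElementTree as ET


def dense_vector_field_type(name, dimension, similarity="cosine"):
    """Return a <fieldType> element for a dense vector field that uses cagra_hnsw.

    The name, dimension, and similarity values are illustrative placeholders.
    """
    return ET.Element("fieldType", {
        "name": name,
        "class": "solr.DenseVectorField",
        "vectorDimension": str(dimension),
        "similarityFunction": similarity,
        "knnAlgorithm": "cagra_hnsw",
    })


if __name__ == "__main__":
    field_type = dense_vector_field_type("knn_vector_gpu", 768)
    # Print the element so it can be pasted into a schema file for review.
    print(ET.tostring(field_type, encoding="unicode"))
```

The generated element is only the field type; a field that references it and the cuVS codec setup are still required before Solr will build the index on GPU.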
Data Platforms
Use these integrations when data locality, storage throughput, and RAG pipeline architecture are central to the deployment. These platforms usually do not expose cuVS APIs directly; they bring NVIDIA accelerated computing, networking, and AI software closer to enterprise data through the NVIDIA AI Data Platform reference design.
Cloudian
Cloudian HyperScale AI Data Platform combines HyperStore S3-compatible object storage with NVIDIA accelerated computing and NVIDIA AI Enterprise software for on-premises AI factories. Cloudian positions the platform for agentic RAG, semantic search, and enterprise knowledge retrieval, with integrated vector database capabilities for ingesting, embedding, and indexing multimodal content. See Cloudian’s platform launch, AI inferencing overview, and NVIDIA integration page.
DDN
DDN Infinia is DDN’s data platform for AI workflows across core, cloud, and edge environments. In DDN’s NVIDIA AI Data Platform example, Infinia is paired with NVIDIA NIM microservices, NVIDIA Spectrum-X, BlueField DPUs, and Milvus to support RAG, vector search, and inference-serving pipelines. See DDN’s RAG workflow writeup and Data Intelligence Platform overview.
Dell AI Data Platform
Dell AI Data Platform with NVIDIA combines Dell storage, modular data engines, NVIDIA accelerated computing, networking, and NVIDIA AI Enterprise software for enterprise AI data pipelines. Dell describes a GPU Accelerated Data Search Engine that applies NVIDIA cuVS to vector indexing and search over unstructured data, alongside NVIDIA cuDF for data processing and Dell data orchestration for preparing governed AI-ready datasets. See Dell’s platform launch blog, press release, and GPU-fed AI data platform overview.
MinIO
MinIO AIStor is an S3-compatible object data platform for NVIDIA AI Factory and NVIDIA STX deployments. In vector-search and RAG pipelines, AIStor provides the durable object layer for embeddings, segment objects, index artifacts, and retrieval data while GPU-accelerated services such as Milvus and cuVS perform index construction and search. MinIO has published a Milvus and cuVS benchmark showing how AIStor, NVIDIA GPUDirect RDMA for S3-compatible storage, and cuVS fit together in large-scale vector index creation. See also MinIO’s NVIDIA STX announcement and GPUDirect RDMA overview.
NetApp
NetApp AIPod supports NVIDIA AI Data Platform deployments with NetApp ONTAP data management, scalable storage, and NVIDIA accelerated computing. The integration is designed for governed RAG and inference pipelines that scan, index, classify, and retrieve enterprise documents for AI agents. See NetApp’s AI Data Platform announcement and AI infrastructure documentation.
Pure Storage
Pure Storage FlashBlade integrates with the NVIDIA AI Data Platform reference design for agentic AI, RAG, and inference workflows. Pure Storage positions FlashBlade as a high-throughput storage layer for NVIDIA accelerated compute and NVIDIA AI Enterprise software, while FlashBlade//EXA targets large-scale AI and HPC workloads that need high metadata performance and low-latency access to multimodal data.
WEKA
WEKA Data Platform integrates with the NVIDIA AI Data Platform reference design to provide a high-performance storage foundation for agentic AI reasoning and inference. WEKA’s NVIDIA work includes Augmented Memory Grid, NVIDIA Cloud Partner certification, NVIDIA-Certified Systems Storage validation, and DGX reference architectures. See WEKA’s NVIDIA Cloud Partner certification and DGX BasePOD reference architecture.
Libraries
Use these integrations when you want library-level control inside an application or service. These options expose familiar APIs while letting developers use cuVS-backed indexing or search paths where supported.
Faiss
Faiss integrates NVIDIA cuVS as an optional backend for GPU vector indexes. The cuVS-backed path keeps the Faiss API model while accelerating supported GPU indexes such as GpuIndexFlat, GpuIndexIVFFlat, GpuIndexIVFPQ, and GpuIndexCagra. Faiss can also move between CPU and GPU indexes, including converting GPU-built CAGRA indexes into HNSW-compatible CPU indexes with IndexHNSWCagra. See the Faiss installation guide, GPU Faiss with cuVS, cuVS usage guide, and example notebook.
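The GPU-build, CPU-search flow described above can be sketched in Python as follows. This is a minimal sketch, assuming a Faiss build with GPU and cuVS support and an available NVIDIA GPU; the function name and default graph degrees are hypothetical, and the expectation that faiss.index_gpu_to_cpu returns an IndexHNSWCagra for a CAGRA index should be verified against the installed Faiss version. Imports are done inside the function so the module still loads on machines without that build.

```python
def build_cagra_then_hnsw(xb, graph_degree=32, intermediate_graph_degree=64):
    """Build a CAGRA graph on the GPU and return a CPU index converted from it.

    xb is an (n, d) array of float32 vectors. Requires numpy and a
    GPU-enabled Faiss build with cuVS; both are imported lazily.
    """
    import numpy as np
    import faiss

    xb = np.ascontiguousarray(xb, dtype=np.float32)
    d = xb.shape[1]

    # Graph degrees control graph density; these defaults are illustrative only.
    config = faiss.GpuIndexCagraConfig()
    config.graph_degree = graph_degree
    config.intermediate_graph_degree = intermediate_graph_degree

    res = faiss.StandardGpuResources()
    gpu_index = faiss.GpuIndexCagra(res, d, faiss.METRIC_L2, config)
    gpu_index.train(xb)  # CAGRA builds its graph from the dataset during train()

    # Convert to a CPU index; for CAGRA this is expected to be an IndexHNSWCagra.
    cpu_index = faiss.index_gpu_to_cpu(gpu_index)
    return cpu_index
```

The returned index can then be queried on the CPU with the usual Faiss call, for example `distances, ids = cpu_index.search(xq, 10)` for a float32 query array `xq`.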
KIOXIA AiSAQ
KIOXIA AiSAQ is an SSD-oriented ANN library based on DiskANN and Vamana for vector collections that are too large to keep entirely in DRAM. AiSAQ can use cuVS to accelerate index build, including Vamana graph construction and k-means workflows. It is also available as a disk-based vector index in Milvus 2.6.4 and newer. See KIOXIA’s 4.8B-vector scaling demo, technical blog, and the cuVS Vamana indexing guide.
Lucene
cuVS Lucene provides a Lucene KnnVectorFormat that lets Java search applications use NVIDIA cuVS through Lucene codecs. The package is published as com.nvidia.cuvs.lucene:cuvs-lucene and builds on the cuVS Java APIs. The integration targets GPU-accelerated vector indexing and search paths for Lucene-based systems, including CAGRA graph construction, filtering, index merge support, and off-heap data movement. See the SearchScale and NVIDIA writeup on Apache Lucene accelerated with cuVS.