Integrations

NVIDIA cuVS can be adopted at several layers of a vector-search stack. Applications can call NVIDIA cuVS directly, use a database or search engine with NVIDIA cuVS-backed indexing, rely on data platforms that bring NVIDIA accelerated computing closer to enterprise data, or use libraries that expose NVIDIA cuVS algorithms through familiar APIs.

This page summarizes where each integration fits and links to vendor documentation for setup, supported configurations, and operational details.

Databases

Use these integrations when vector search should be managed by a database or search engine. The system owns ingestion, indexing, query APIs, and operations, while NVIDIA cuVS-backed paths accelerate supported index build or search workflows.

Amazon OpenSearch Service

Amazon OpenSearch Service provides managed GPU acceleration for vector indexing on OpenSearch 3.1+ domains and OpenSearch Serverless vector collections. OpenSearch Service detects supported Faiss vector index builds, offloads the build work to managed GPU capacity, and applies acceleration to indexing and force-merge operations without requiring users to manage GPU instances. See the AWS docs for enabling GPU acceleration, creating GPU-accelerated vector indexes, and indexing vector data.

CyborgDB

CyborgDB is an encrypted vector database proxy for confidential vector search across backing stores such as PostgreSQL, Redis, S3, and in-memory storage. Cyborg and NVIDIA demonstrated a proof of concept accelerated by NVIDIA cuVS that speeds encrypted vector index build and retrieval while preserving CyborgDB’s confidentiality model. See the CyborgDB overview, quickstart, embedded deployment docs, and NVIDIA’s technical blog about Cyborg and NVIDIA cuVS.

Elasticsearch

Elasticsearch GPU-accelerated vector indexing uses NVIDIA cuVS to accelerate dense-vector HNSW index construction. This path is intended for ingestion-heavy workloads where CPU HNSW graph construction is a bottleneck. Elastic’s implementation builds graph structures on the GPU and converts them for Elasticsearch search workflows. See Elastic’s GPU vector indexing reference and its two-part GPU acceleration writeup (chapter 1 and chapter 2).

Kinetica

Kinetica is a GPU-accelerated database with native vector columns, SQL vector operators, and Python APIs for vector search. Kinetica supports CAGRA indexes on vector columns for GPU-oriented approximate nearest-neighbor search, while HNSW remains the automatically maintained option for mutable vector data. See Kinetica’s CAGRA index docs, NVIDIA partner page, vector I/O example, and vector search example.

KX / KDB.AI

KDB.AI is KX’s vector database for AI applications and similarity search workflows. The NVIDIA cuVS-enabled KDB.AI Server image, kdbai-db-cuvs, bundles the GPU dependencies needed to build and search CAGRA indexes with NVIDIA cuVS while keeping the standard KDB.AI client APIs. KDB.AI also integrates with kdb+, allowing existing q and kdb+ datasets to participate in vector search workflows. See KX’s kdb+ product page and NVIDIA partner page for broader KX and NVIDIA integration context.

Milvus

Milvus GPU indexes provide GPU-accelerated options for high-throughput and high-recall vector search. Supported index types include GPU_CAGRA, GPU_IVF_FLAT, GPU_IVF_PQ, and GPU_BRUTE_FORCE. In Milvus 2.6.4 and newer, GPU_CAGRA also supports hybrid deployments, in which a graph built on the GPU is adapted at load time for CPU search. See the Milvus docs for building indexes and installing Milvus with GPU support.
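As a rough illustration of how a GPU_CAGRA index might be requested, the sketch below assembles index parameters as a plain dictionary of the kind Milvus clients accept. The helper name gpu_cagra_index_params is hypothetical. The parameter names (intermediate_graph_degree, graph_degree, build_algo, adapt_for_cpu) and their string-valued flags are assumptions based on the Milvus GPU_CAGRA documentation, so check them against the docs for your Milvus version.

```python
def gpu_cagra_index_params(metric_type="L2",
                           graph_degree=32,
                           intermediate_graph_degree=64,
                           build_algo="IVF_PQ",
                           adapt_for_cpu=False):
    """Assemble a GPU_CAGRA index-parameter dictionary (hypothetical helper)."""
    # The graph is pruned from the intermediate degree down to the final
    # degree, so the final degree should not exceed the intermediate one.
    if graph_degree > intermediate_graph_degree:
        raise ValueError("graph_degree must not exceed intermediate_graph_degree")
    if build_algo not in ("IVF_PQ", "NN_DESCENT"):
        raise ValueError("build_algo must be 'IVF_PQ' or 'NN_DESCENT'")
    return {
        "index_type": "GPU_CAGRA",
        "metric_type": metric_type,
        "params": {
            "intermediate_graph_degree": intermediate_graph_degree,
            "graph_degree": graph_degree,
            "build_algo": build_algo,
            # Assumed flag for the hybrid mode: build on GPU, search on CPU.
            "adapt_for_cpu": "true" if adapt_for_cpu else "false",
        },
    }


# Hybrid deployment: GPU build, CPU search.
hybrid_params = gpu_cagra_index_params(adapt_for_cpu=True)
print(hybrid_params)
```

The resulting dictionary would be passed to whichever index-creation call your Milvus client exposes. In the hybrid mode, search requests are expected to carry HNSW-style search parameters rather than CAGRA ones.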

OpenSearch

OpenSearch Vector Engine supports vector search, hybrid search, and RAG-oriented retrieval in OpenSearch. The OpenSearch GPU indexing design offloads vector index builds to GPU workers, uses NVIDIA cuVS through Faiss to build CAGRA graphs, converts those graphs into an HNSW-compatible form for CPU search, and falls back to CPU index building when needed. See the OpenSearch vector search docs, GPU-accelerated vector search blog, k-NN RFCs for GPU acceleration and remote vector index build, and Amazon OpenSearch Service GPU acceleration docs.
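To make the GPU build path more concrete, the following sketch assembles a create-index request body for a Faiss HNSW field with remote (GPU) index build switched on. The helper name knn_index_body and the field name embedding are hypothetical. The setting index.knn.remote_index_build.enabled is an assumption taken from the OpenSearch remote index build documentation, and a real cluster also needs the cluster-level build service and repository settings described there.

```python
import json


def knn_index_body(field, dimension, gpu_build=True):
    """Build a create-index body for a Faiss HNSW vector field (hypothetical helper)."""
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    settings = {"index.knn": True}
    if gpu_build:
        # Assumed index-level switch that lets eligible segment builds be
        # offloaded; OpenSearch falls back to CPU builds when offload fails.
        settings["index.knn.remote_index_build.enabled"] = True
    mappings = {
        "properties": {
            field: {
                "type": "knn_vector",
                "dimension": dimension,
                "method": {
                    "name": "hnsw",     # search still runs on an HNSW-compatible graph
                    "engine": "faiss",  # the GPU path goes through Faiss
                    "space_type": "l2",
                },
            }
        }
    }
    return {"settings": settings, "mappings": mappings}


print(json.dumps(knn_index_body("embedding", 768), indent=2))
```

The printed JSON is the body of a create-index request. Nothing else changes on the query side, because the GPU-built graph is converted before it is searched on CPU.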

Oracle AI Database 26ai

Oracle AI Database 26ai includes Oracle AI Vector Search for storing embeddings, creating vector indexes, and running similarity search inside Oracle Database. Oracle’s Vector Index Service uses GPU-enabled containers and NVIDIA cuVS to generate Oracle In-Memory Neighbor Graph vector indexes, then returns the index to Oracle AI Database for query execution. See Oracle’s NVIDIA collaboration announcement, Vector Index Service introduction, and getting started guide.

Solr

Apache Solr dense vector search can use NVIDIA cuVS through the cuvs-lucene vector format. Solr exposes this through CuVSCodecFactory and the CuVSCodec module, allowing CAGRA graph construction on GPU and serialization into an HNSW graph for search. The Solr guide covers setup steps, including the cuvs module jars, knnAlgorithm="cagra_hnsw" on DenseVectorField, and codec configuration in solrconfig.xml.
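The schema side of that setup can be sketched by rendering a DenseVectorField type with knnAlgorithm="cagra_hnsw". The helper name cagra_field_type, the field type name, the dimension, and the similarity function are illustrative assumptions. The codec factory registration in solrconfig.xml is deliberately left out, since its exact class path should be copied from the Solr guide rather than guessed.

```python
import xml.etree.ElementTree as ET


def cagra_field_type(name, dimension, similarity="cosine"):
    """Render a <fieldType> element for a GPU-built dense vector field (hypothetical helper)."""
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    field_type = ET.Element("fieldType", {
        "name": name,
        "class": "solr.DenseVectorField",
        "vectorDimension": str(dimension),
        "similarityFunction": similarity,
        # Value named in the Solr guide: build with CAGRA, serialize as HNSW.
        "knnAlgorithm": "cagra_hnsw",
    })
    return ET.tostring(field_type, encoding="unicode")


# Illustrative name and dimension; paste the result into the schema.
print(cagra_field_type("knn_vector_gpu", 768))
```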

Data Platforms

Use these integrations when data locality, storage throughput, and RAG pipeline architecture are central to the deployment. These platforms usually do not expose NVIDIA cuVS APIs directly; they bring NVIDIA accelerated computing, networking, and AI software closer to enterprise data through the NVIDIA AI Data Platform reference design.

Cloudian

Cloudian HyperScale AI Data Platform combines HyperStore S3-compatible object storage with NVIDIA accelerated computing and NVIDIA AI Enterprise software for on-premises AI factories. Cloudian positions the platform for agentic RAG, semantic search, and enterprise knowledge retrieval, with integrated vector database capabilities for ingesting, embedding, and indexing multimodal content. See Cloudian’s platform launch, AI inferencing overview, and NVIDIA integration page.

DDN

DDN Infinia is DDN’s data platform for AI workflows across core, cloud, and edge environments. In DDN’s NVIDIA AI Data Platform example, Infinia is paired with NVIDIA NIM microservices, NVIDIA Spectrum-X, BlueField DPUs, and Milvus to support RAG, vector search, and inference-serving pipelines. See DDN’s RAG workflow writeup and Data Intelligence Platform overview.

Dell AI Data Platform

Dell AI Data Platform with NVIDIA combines Dell storage, modular data engines, NVIDIA accelerated computing, networking, and NVIDIA AI Enterprise software for enterprise AI data pipelines. Dell describes a GPU Accelerated Data Search Engine that applies NVIDIA cuVS to vector indexing and search over unstructured data, alongside NVIDIA cuDF for data processing and Dell data orchestration for preparing governed AI-ready datasets. See Dell’s platform launch blog, press release, and GPU-fed AI data platform overview.

MinIO

MinIO AIStor is an S3-compatible object data platform for NVIDIA AI Factory and NVIDIA STX deployments. In vector-search and RAG pipelines, AIStor provides the durable object layer for embeddings, segment objects, index artifacts, and retrieval data while GPU-accelerated services such as Milvus and NVIDIA cuVS perform index construction and search. MinIO has published a Milvus and NVIDIA cuVS benchmark showing how AIStor, NVIDIA GPUDirect RDMA for S3-compatible storage, and NVIDIA cuVS fit together in large-scale vector index creation. See also MinIO’s NVIDIA STX announcement and GPUDirect RDMA overview.

NetApp

NetApp AIPod supports NVIDIA AI Data Platform deployments with NetApp ONTAP data management, scalable storage, and NVIDIA accelerated computing. The integration is designed for governed RAG and inference pipelines that scan, index, classify, and retrieve enterprise documents for AI agents. See NetApp’s AI Data Platform announcement and AI infrastructure documentation.

Everpure

Everpure FlashBlade integrates with the NVIDIA AI Data Platform reference design for agentic AI, RAG, and inference workflows. Everpure positions FlashBlade as a high-throughput storage layer for NVIDIA accelerated compute and NVIDIA AI Enterprise software, while FlashBlade//EXA targets large-scale AI and HPC workloads that need high metadata performance and low-latency access to multimodal data.

WEKA

WEKA Data Platform integrates with the NVIDIA AI Data Platform reference design to provide a high-performance storage foundation for agentic AI reasoning and inference. WEKA’s NVIDIA work includes Augmented Memory Grid, NVIDIA Cloud Partner certification, NVIDIA-Certified Systems Storage validation, and DGX reference architectures. See WEKA’s NVIDIA Cloud Partner certification and DGX BasePOD reference architecture.

Libraries

Use these integrations when you want library-level control inside an application or service. These options expose familiar APIs while letting developers use NVIDIA cuVS-backed indexing or search paths where supported.

Faiss

Faiss integrates NVIDIA cuVS as an optional backend for GPU vector indexes. The NVIDIA cuVS-backed path keeps the Faiss API model while accelerating supported GPU indexes such as GpuIndexFlat, GpuIndexIVFFlat, GpuIndexIVFPQ, and GpuIndexCagra. Faiss can also move between CPU and GPU indexes, including converting GPU-built CAGRA indexes into HNSW-compatible CPU indexes with IndexHNSWCagra. See the Faiss installation guide, GPU Faiss with NVIDIA cuVS, NVIDIA cuVS usage guide, and example notebook.
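The GPU-to-CPU flow described above might look like the following sketch. It assumes a Faiss build with GPU and NVIDIA cuVS support, an available GPU, and float32 input arrays. The function name build_on_gpu_search_on_cpu and the degree values are illustrative, and the CPU index returned by index_gpu_to_cpu is expected, not guaranteed here, to be an IndexHNSWCagra.

```python
def build_on_gpu_search_on_cpu(xb, xq, k=10, graph_degree=32):
    """Build a CAGRA graph on GPU, convert it to a CPU index, then search on CPU.

    xb: (n, d) database vectors; xq: (m, d) query vectors.
    Imports are local so the module loads even where Faiss is not installed.
    """
    import numpy as np
    import faiss  # needs a GPU-enabled build with cuVS support

    xb = np.ascontiguousarray(xb, dtype="float32")
    xq = np.ascontiguousarray(xq, dtype="float32")
    d = xb.shape[1]

    res = faiss.StandardGpuResources()
    config = faiss.GpuIndexCagraConfig()
    config.graph_degree = graph_degree
    config.intermediate_graph_degree = 2 * graph_degree

    # For CAGRA, train() builds the graph over the supplied vectors.
    gpu_index = faiss.GpuIndexCagra(res, d, faiss.METRIC_L2, config)
    gpu_index.train(xb)

    # Convert the GPU-built graph into a CPU-searchable index.
    cpu_index = faiss.index_gpu_to_cpu(gpu_index)
    distances, ids = cpu_index.search(xq, k)
    return distances, ids
```

A call such as build_on_gpu_search_on_cpu(database, queries, k=5) returns the distance and ID arrays from the CPU search. Keeping the build on GPU and the search on CPU matches the pattern several of the database integrations on this page follow.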

KIOXIA AiSAQ

KIOXIA AiSAQ is an SSD-oriented ANN library based on DiskANN and Vamana for vector collections that are too large to keep entirely in DRAM. AiSAQ can use NVIDIA cuVS to accelerate index build, including Vamana graph construction and k-means workflows. It is also available as a disk-based vector index in Milvus 2.6.4 and newer. See KIOXIA’s 4.8B-vector scaling demo, technical blog, and the NVIDIA cuVS Vamana indexing guide.

Lucene

NVIDIA cuVS Lucene provides a Lucene KnnVectorFormat that lets Java search applications use NVIDIA cuVS through Lucene codecs. The package is published as com.nvidia.cuvs.lucene:cuvs-lucene and builds on the NVIDIA cuVS Java APIs. The integration targets GPU-accelerated vector indexing and search paths for Lucene-based systems, including CAGRA graph construction, filtering, index merge support, and off-heap data movement. See the SearchScale and NVIDIA writeup on Apache Lucene accelerated with NVIDIA cuVS.