Get an API Key#
You need to generate an API key to access NVIDIA NIM Microservices from the NVIDIA RAG Blueprint. You need an API key to access models hosted in the NVIDIA API Catalog, and to download models on-premises. For more information, refer to NGC API Keys.
To generate an API key, use the following procedure.
Go to https://org.ngc.nvidia.com/setup/api-keys.
Click Generate Personal Key.
Enter a Key Name.
For Expiration, choose Never Expire.
For Services Included, select NGC Catalog and Public API Endpoints.
Click Generate Personal Key.
Copy your key and save it somewhere safe and private.
(Important) Export your key as an environment variable by using the following code.
export NGC_API_KEY="<your-ngc-api-key>"
API Key Expiration#
If your API key expires, do one of the following:
Create a new key by using the previous procedure, and then delete the expired key.
Use the Action menu to Rotate your key.
You must update the new key information in your environment variables and code.
Service-Specific API Keys#
The RAG Blueprint supports service-specific API keys to provide fine-grained control over API access for different components. This enables separate billing accounts, different rate limits, security segregation, usage tracking per service, or mixed cloud/on-premises deployments.
How It Works#
Service-specific API keys override global NVIDIA_API_KEY or NGC_API_KEY. The system uses the following fallback order to determine which API key to use:
service-specific key → NVIDIA_API_KEY → NGC_API_KEY → None
For example, if you set APP_LLM_APIKEY, the LLM service will use that key instead of the global NVIDIA_API_KEY.
Supported Service Keys#
Service Key |
Purpose |
Used By |
|---|---|---|
|
Main language model for RAG responses |
RAG Server (generate endpoint) |
|
Text embedding generation |
Ingestor Server, RAG Server (search) |
|
Document reranking |
RAG Server (reranker) |
|
Query rewriting and optimization |
RAG Server (query decomposition) |
|
Metadata filter generation from natural language |
RAG Server (search with filters) |
|
Vision-Language Model for image understanding |
RAG Server (multimodal queries) |
|
Document summarization |
Ingestor Server (summary generation) |
|
Self-reflection and answer validation |
RAG Server (reflection mode) |
Library Mode (Python Package)#
Configure in config.yaml.
llm:
api_key: "your-llm-api-key" # Optional: overrides NVIDIA_API_KEY
embeddings:
api_key: "your-embeddings-api-key" # Optional: overrides NVIDIA_API_KEY
Docker Compose#
export APP_LLM_APIKEY="your-llm-api-key"
export APP_EMBEDDINGS_APIKEY="your-embeddings-api-key"
# Additional service-specific keys can be set as needed
Note: For security reasons, API keys must be configured (via NvidiaRAGConfig object or environment variables) before initialization and cannot be passed as runtime parameters in API requests, unlike model names and endpoints which can be overridden at runtime.
Helm#
Use --set flags to pass API keys securely via command line:
helm upgrade --install rag -n rag <chart-path-or-url> \
--set imagePullSecret.password=$NGC_API_KEY \
--set ngcApiSecret.password=$NGC_API_KEY \
--set apiKeysSecret.llmApiKey=$APP_LLM_APIKEY \
--set apiKeysSecret.embeddingsApiKey=$APP_EMBEDDINGS_APIKEY
Additional service-specific keys can be configured as needed: rankingApiKey, queryRewriterApiKey, filterExpressionGeneratorApiKey, vlmApiKey, summaryLlmApiKey, reflectionLlmApiKey.