API Key Authentication for RAG and Retriever Evaluations

NeMo Evaluator supports API key authentication for external services used in RAG (Retrieval Augmented Generation) and retriever evaluation workflows. This feature enables secure integration with NVIDIA NIM services while keeping your authentication credentials protected.

Overview

The API key authentication feature allows you to securely provide authentication credentials for various components in your RAG and retriever evaluation pipelines:

  • Query embedding models - For encoding user queries using NVIDIA models

  • Index embedding models - For encoding documents in your knowledge base using NVIDIA models

  • Reranking models - For improving retrieval relevance

  • Large Language Models - For answer generation

  • Judge models - For evaluation metrics

API keys are handled securely and are never logged or exposed in evaluation outputs.
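
The guarantee above applies to the service itself. If you also print or store configurations on the client side, you may want to mask keys there too. The following is a minimal sketch of such a helper; the function name `redact_api_keys` and the `***` placeholder are illustrative assumptions, not part of NeMo Evaluator.

```python
import copy
import json


def redact_api_keys(config, placeholder="***"):
    """Return a deep copy of `config` with every "api_key" string masked.

    Hypothetical client-side helper; not a NeMo Evaluator API.
    """
    redacted = copy.deepcopy(config)

    def _walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "api_key" and isinstance(value, str):
                    node[key] = placeholder  # overwrite in the copy only
                else:
                    _walk(value)
        elif isinstance(node, list):
            for item in node:
                _walk(item)

    _walk(redacted)
    return redacted


example = {
    "api_endpoint": {
        "url": "https://integrate.api.nvidia.com/v1/embeddings",
        "model_id": "nvidia/nv-embedqa-e5-v5",
        "api_key": "your-nvidia-api-key",
    }
}

# Safe to print: the key is masked, the original dict is unchanged.
print(json.dumps(redact_api_keys(example), indent=2))
```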


Supported Authentication Methods

API Key Authentication

API key authentication is the most common method for external services. Specify the api_key field in each model endpoint configuration.

{
  "api_endpoint": {
    "url": "https://integrate.api.nvidia.com/v1/embeddings",
    "model_id": "nvidia/nv-embedqa-e5-v5",
    "api_key": "your-nvidia-api-key"
  }
}
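
To avoid hardcoding the key in a file, the same api_endpoint block can be assembled in code from an environment variable. This is a sketch only: the helper name `make_api_endpoint` and the variable name `NVIDIA_API_KEY` are assumptions chosen for illustration.

```python
import os


def make_api_endpoint(url, model_id, env_var="NVIDIA_API_KEY"):
    """Build an api_endpoint block, reading the key from the environment.

    Hypothetical helper; `env_var` is an assumed variable name.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "api_endpoint": {
            "url": url,
            "model_id": model_id,
            "api_key": api_key,
        }
    }
```

For example, `make_api_endpoint("https://integrate.api.nvidia.com/v1/embeddings", "nvidia/nv-embedqa-e5-v5")` returns a dictionary with the same shape as the JSON above, provided the variable is set.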

Supported Models and Services

Warning

Current implementation: NeMo Evaluator supports only NVIDIA embedding models. Third-party embedding services are not yet implemented.

The authentication feature works with NVIDIA NIM services, including:

  • NVIDIA Embedding Models - For embeddings (e.g., nvidia/nv-embedqa-e5-v5)

  • NVIDIA Reranking Models - For reranking (e.g., nvidia/nv-rerankqa-mistral-4b-v3)

  • LLM Services - For answer generation and judge models

  • OpenAI-compatible endpoints - Any service following OpenAI-compatible format


Configuration Examples

RAG Pipeline with Authentication

The following example shows a complete RAG pipeline target that uses authenticated NVIDIA services:

{
  "type": "rag",
  "name": "my-authenticated-rag-target",
  "namespace": "my-organization",
  "rag": {
    "pipeline": {
      "retriever": {
        "pipeline": {
          "query_embedding_model": {
            "api_endpoint": {
              "url": "https://integrate.api.nvidia.com/v1/embeddings",
              "model_id": "nvidia/nv-embedqa-e5-v5",
              "api_key": "your-nvidia-api-key"
            }
          },
          "index_embedding_model": {
            "api_endpoint": {
              "url": "https://integrate.api.nvidia.com/v1/embeddings",
              "model_id": "nvidia/nv-embedqa-e5-v5",
              "api_key": "your-nvidia-api-key"
            }
          },
          "reranker_model": {
            "api_endpoint": {
              "url": "http://nemo-ranking-ms.nemo-retrieval.svc.cluster.local:8080/v1/ranking",
              "model_id": "nvidia/nv-rerankqa-mistral-4b-v3",
              "api_key": "your-nvidia-api-key"
            }
          },
          "top_k": 5
        }
      },
      "model": {
        "api_endpoint": {
          "url": "https://integrate.api.nvidia.com/v1/chat/completions",
          "model_id": "meta/llama-3.1-70b-instruct",
          "api_key": "your-nvidia-api-key"
        }
      }
    }
  }
}

The matching evaluation configuration passes API keys to the judge models:

{
  "type": "rag",
  "name": "my-rag-evaluation-config",
  "namespace": "my-organization",
  "params": {
    "temperature": 0.1,
    "max_tokens": 512
  },
  "tasks": {
    "my-beir-task": {
      "type": "beir",
      "dataset": {
        "files_url": "file://nfcorpus/"
      },
      "params": {
        "judge_llm": {
          "api_endpoint": {
            "url": "https://integrate.api.nvidia.com/v1",
            "model_id": "meta/llama-3.1-8b-instruct",
            "api_key": "your-judge-api-key"
          }
        },
        "judge_embeddings": {
          "api_endpoint": {
            "url": "https://integrate.api.nvidia.com/v1/embeddings",
            "model_id": "nvidia/nv-embedqa-e5-v5",
            "api_key": "your-nvidia-api-key"
          }
        },
        "judge_timeout": 300,
        "judge_max_retries": 5,
        "judge_max_workers": 16
      },
      "metrics": {
        "recall_5": {"type": "recall_5"},
        "ndcg_cut_5": {"type": "ndcg_cut_5"},
        "recall_10": {"type": "recall_10"},
        "ndcg_cut_10": {"type": "ndcg_cut_10"},
        "faithfulness": {"type": "faithfulness"},
        "answer_relevancy": {"type": "answer_relevancy"}
      }
    }
  }
}
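
With this many nested endpoints, it is easy to leave one without a key. The sketch below walks a target or configuration dictionary and reports every api_endpoint block that has no api_key. The function name `find_endpoints_missing_key` is hypothetical, and the check runs purely on the client side before anything is submitted.

```python
def find_endpoints_missing_key(node, path=""):
    """Return dotted paths of api_endpoint blocks with no (or empty) api_key.

    Hypothetical pre-submission check; not a NeMo Evaluator API.
    """
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if key == "api_endpoint" and isinstance(value, dict):
                if not value.get("api_key"):
                    missing.append(child_path)
            else:
                missing.extend(find_endpoints_missing_key(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            missing.extend(find_endpoints_missing_key(item, f"{path}[{index}]"))
    return missing


# Trimmed-down version of the configuration above, with one key left out.
config = {
    "tasks": {
        "my-beir-task": {
            "params": {
                "judge_llm": {
                    "api_endpoint": {
                        "url": "https://integrate.api.nvidia.com/v1",
                        "model_id": "meta/llama-3.1-8b-instruct",
                        "api_key": "your-judge-api-key",
                    }
                },
                "judge_embeddings": {
                    "api_endpoint": {
                        "url": "https://integrate.api.nvidia.com/v1/embeddings",
                        "model_id": "nvidia/nv-embedqa-e5-v5",
                    }
                },
            }
        }
    }
}

print(find_endpoints_missing_key(config))
# ['tasks.my-beir-task.params.judge_embeddings.api_endpoint']
```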

Retriever Pipeline with Authentication

For retriever-only evaluations using NVIDIA embedding models:

{
  "type": "retriever",
  "name": "my-authenticated-retriever-target",
  "namespace": "my-organization",
  "retriever": {
    "pipeline": {
      "query_embedding_model": {
        "api_endpoint": {
          "url": "https://integrate.api.nvidia.com/v1/embeddings",
          "model_id": "nvidia/nv-embedqa-e5-v5",
          "api_key": "your-nvidia-api-key"
        }
      },
      "index_embedding_model": {
        "api_endpoint": {
          "url": "https://integrate.api.nvidia.com/v1/embeddings",
          "model_id": "nvidia/nv-embedqa-e5-v5",
          "api_key": "your-nvidia-api-key"
        }
      },
      "top_k": 10
    }
  }
}
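
Because the query and index embedding models above share one endpoint and key, the target can also be generated rather than written by hand. The following sketch assumes a hypothetical helper named `build_retriever_target`; it only reproduces the JSON structure shown above and does not call any service.

```python
import json


def build_retriever_target(name, namespace, embedding_url, embedding_model_id,
                           api_key, top_k=10):
    """Build a retriever target dict with the same shape as the JSON above.

    Hypothetical helper for illustration; not a NeMo Evaluator API.
    """

    def endpoint():
        # Return a fresh dict each time so the two entries share no state.
        return {
            "api_endpoint": {
                "url": embedding_url,
                "model_id": embedding_model_id,
                "api_key": api_key,
            }
        }

    return {
        "type": "retriever",
        "name": name,
        "namespace": namespace,
        "retriever": {
            "pipeline": {
                "query_embedding_model": endpoint(),
                "index_embedding_model": endpoint(),
                "top_k": top_k,
            }
        },
    }


target = build_retriever_target(
    name="my-authenticated-retriever-target",
    namespace="my-organization",
    embedding_url="https://integrate.api.nvidia.com/v1/embeddings",
    embedding_model_id="nvidia/nv-embedqa-e5-v5",
    api_key="your-nvidia-api-key",  # placeholder, as in the JSON example
)
print(json.dumps(target, indent=2))
```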