RAG Pipeline Targets#

Retrieval-augmented generation (RAG) pipelines combine a NeMo Retriever pipeline with an LLM. The retriever pipeline retrieves the documents relevant to a query, and the LLM generates an answer from the query and the retrieved documents. For more information, refer to RAG Evaluation Flow.

Authentication Support#

RAG pipeline targets support API key authentication for external services. This enables secure integration with third-party embedding models, reranking services, and LLMs in your pipeline configuration.

Note

All target configuration examples below can include api_key fields for external services. For comprehensive authentication guidance, refer to API Key Authentication.

Example with Authentication#

{
  "query_embedding_model": {
    "api_endpoint": {
      "url": "<my-external-embedding-url>",
      "model_id": "<my-external-embedding-model>",
      "api_key": "<my-api-key>"
    }
  }
}
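Hard-coding a key in a target definition makes it easy to leak through version control or logs. The following is a minimal sketch of one way to avoid that, assuming the key is stored in an environment variable. The helper name `endpoint_from_env` and the variable name `EMBEDDING_API_KEY` are hypothetical and are not part of NeMo Evaluator; only the `url`, `model_id`, and `api_key` field names come from the example above.

```python
import os


def endpoint_from_env(url, model_id, key_var):
    """Build an api_endpoint dict, reading the API key from an environment variable.

    Hypothetical helper for illustration only.
    """
    api_key = os.environ.get(key_var)
    if not api_key:
        raise RuntimeError(f"Environment variable {key_var} is not set")
    return {"url": url, "model_id": model_id, "api_key": api_key}


# Demo only: a real key would be set outside the program, for example in the shell.
os.environ.setdefault("EMBEDDING_API_KEY", "<my-api-key>")

query_embedding_model = {
    "api_endpoint": endpoint_from_env(
        "<my-external-embedding-url>",
        "<my-external-embedding-model>",
        "EMBEDDING_API_KEY",
    )
}

# Print only the field names so the key itself never reaches the console.
print(sorted(query_embedding_model["api_endpoint"]))
```

The resulting dictionary has the same shape as the JSON above, so it can be placed wherever a target configuration expects a `query_embedding_model` entry.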

Answer Evaluation#

NeMo Evaluator supports Answer Evaluation RAG pipelines. In this case, the rag field does not define a pipeline. Instead, it contains a cached_outputs field that points to pre-generated retrieved documents and pre-generated answers. For more information, refer to Using Custom Data.

To Create a RAG Answer Evaluation Target#

Choose one of the following options to create a RAG answer evaluation target.

import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create a RAG answer evaluation target
client.evaluation.targets.create(
    type="rag",
    name="my-target-rag-1",
    namespace="my-organization",
    rag={
        "cached_outputs": {
            "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
        }
    }
)

print("RAG answer evaluation target created successfully")

curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
     "type": "rag",
     "name": "my-target-rag-1",
     "namespace": "my-organization",
     "rag": {
        "cached_outputs": {
           "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
        }
     }
  }'
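The files_url value in both examples follows the pattern hf://datasets/<namespace>/<name>/<file-path>. As a rough sketch, the URL can be assembled from its parts so that stray slashes do not produce a malformed path. The helper `hf_files_url` and the sample names `my-org`, `my-rag-data`, and `outputs/` are hypothetical; only the URL pattern is taken from the examples above.

```python
def hf_files_url(namespace, dataset, file_path):
    """Compose an hf://datasets/... URL in the shape used by cached_outputs.files_url.

    Hypothetical helper; it only joins the three parts shown in the examples.
    """
    parts = [namespace.strip("/"), dataset.strip("/"), file_path.strip("/")]
    if not all(parts):
        raise ValueError("namespace, dataset, and file_path must be non-empty")
    return "hf://datasets/" + "/".join(parts)


# The rag field of an answer evaluation target, built from the parts.
rag = {
    "cached_outputs": {
        "files_url": hf_files_url("my-org", "my-rag-data", "outputs/")
    }
}
print(rag["cached_outputs"]["files_url"])  # hf://datasets/my-org/my-rag-data/outputs
```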

Answer Generation and Evaluation#

NeMo Evaluator supports Answer Generation + Answer Evaluation RAG pipelines. The retriever pipeline is replaced by a cached_outputs field that contains pre-generated retrieved documents. For more information, refer to Using Custom Data.

To Create a RAG Answer Generation + Evaluation Target#

Choose one of the following options to create a RAG answer generation + evaluation target.

import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create a RAG answer generation + evaluation target
client.evaluation.targets.create(
    type="rag",
    name="my-target-rag-2",
    namespace="my-organization",
    rag={
        "pipeline": {
            "retriever": {
                "cached_outputs": {
                    "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/v1/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
)

print("RAG answer generation + evaluation target created successfully")

curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
     "type": "rag",
     "name": "my-target-rag-2",
     "namespace": "my-organization",
     "rag": {
        "pipeline": {
           "retriever": {
              "cached_outputs": {
                 "files_url": "hf://datasets/<my-dataset-namespace>/<my-dataset-name>/<my-dataset-file-path>"
              }
           },
           "model": {
              "api_endpoint": {
                 "url": "<my-nim-deployment-base-url>/v1/chat/completions",
                 "model_id": "<my-model>"
              }
           }
        }
     }
  }'

Retrieval#

Embedding + Answer Generation and Evaluation#

NeMo Evaluator supports Retrieval + Answer Generation + Answer Evaluation RAG pipelines. The rag pipeline field contains a retriever pipeline and a model.

To Create a RAG Embedding + Answer Generation + Evaluation Target#

Choose one of the following options to create a RAG embedding + answer generation + evaluation target.

import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create a RAG embedding + answer generation + evaluation target
client.evaluation.targets.create(
    type="rag",
    name="my-target-rag-3",
    namespace="my-organization",
    rag={
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>"
                        }
                    },
                    "top_k": 3
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/v1/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
)

print("RAG embedding + generation + evaluation target created successfully")

curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
     "type": "rag",
     "name": "my-target-rag-3",
     "namespace": "my-organization",
     "rag": {
        "pipeline": {
           "retriever": {
              "pipeline": {
                 "query_embedding_model": {
                    "api_endpoint": {
                       "url": "<my-query-embedding-url>",
                       "model_id": "<my-query-embedding-model>"
                    }
                 },
                 "index_embedding_model": {
                    "api_endpoint": {
                       "url": "<my-index-embedding-url>",
                       "model_id": "<my-index-embedding-model>"
                    }
                 },
                 "top_k": 3
              }
           },
           "model": {
              "api_endpoint": {
                 "url": "<my-nim-deployment-base-url>/v1/chat/completions",
                 "model_id": "<my-model>"
              }
           }
        }
     }
  }'
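The three target shapes shown so far differ only in where cached_outputs appears: at the top level of rag, inside retriever, or not at all. The sketch below makes that mapping explicit by inspecting a rag dictionary and listing the stages it implies. The function `stages_for` is a hypothetical illustration based solely on the examples on this page and is not an Evaluator API.

```python
def stages_for(rag):
    """Return the stages implied by the shape of a target's rag field.

    Hypothetical helper derived from the example payloads only.
    """
    if "cached_outputs" in rag:
        # Retrieved documents and answers are both pre-generated.
        return ["answer evaluation"]
    retriever = rag["pipeline"]["retriever"]
    if "cached_outputs" in retriever:
        # Retrieved documents are pre-generated; the model generates answers.
        return ["answer generation", "answer evaluation"]
    # A live retriever pipeline runs before answer generation.
    return ["retrieval", "answer generation", "answer evaluation"]


model = {"api_endpoint": {"url": "<my-model-url>", "model_id": "<my-model>"}}

answer_eval = {"cached_outputs": {"files_url": "<my-files-url>"}}
generation_eval = {
    "pipeline": {
        "retriever": {"cached_outputs": {"files_url": "<my-files-url>"}},
        "model": model,
    }
}
full_pipeline = {
    "pipeline": {
        "retriever": {"pipeline": {"top_k": 3}},
        "model": model,
    }
}

for name, rag in [
    ("answer_eval", answer_eval),
    ("generation_eval", generation_eval),
    ("full_pipeline", full_pipeline),
]:
    print(name, "->", stages_for(rag))
```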

Embedding + Answer Generation and Evaluation (with External APIs)#

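When an embedding model or LLM is hosted by an external service that requires authentication, add an api_key field to the corresponding api_endpoint. In the following example, the query embedding model and the LLM are configured with API keys.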

import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create a RAG target with authenticated embedding models
client.evaluation.targets.create(
    type="rag",
    name="my-target-rag-external",
    namespace="my-organization",
    rag={
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>",
                            "api_key": "<my-api-key>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>",
                            "format": "nim"
                        }
                    }
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-model-url>",
                    "model_id": "<my-model-id>",
                    "api_key": "<my-api-key>"
                }
            }
        }
    }
)

print("RAG target with authenticated embedding models created successfully")

curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
     "type": "rag",
     "name": "my-target-rag-external",
     "namespace": "my-organization",
     "rag": {
        "pipeline": {
           "retriever": {
              "pipeline": {
                 "query_embedding_model": {
                    "api_endpoint": {
                       "url": "<my-query-embedding-url>",
                       "model_id": "<my-query-embedding-model>",
                       "api_key": "<my-api-key>"
                    }
                 },
                 "index_embedding_model": {
                    "api_endpoint": {
                       "url": "<my-index-embedding-url>",
                       "model_id": "<my-index-embedding-model>",
                       "format": "nim"
                    }
                 }
              }
           },
           "model": {
              "api_endpoint": {
                 "url": "<my-model-url>",
                 "model_id": "<my-model-id>",
                 "api_key": "<my-api-key>"
              }
           }
        }
     }
  }'
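Once a configuration carries api_key values, printing or logging it verbatim exposes the secrets. A minimal sketch of one precaution, assuming the configuration is held as a plain Python dictionary: replace every api_key value in a copy before displaying it. The helper `mask_api_keys` is hypothetical and is not part of the nemo_microservices SDK.

```python
import copy
import json


def mask_api_keys(config, placeholder="***"):
    """Return a deep copy of config with every api_key value replaced.

    Hypothetical helper; the original config is left untouched.
    """
    masked = copy.deepcopy(config)

    def _walk(node):
        if isinstance(node, dict):
            for key, value in list(node.items()):
                if key == "api_key":
                    node[key] = placeholder
                else:
                    _walk(value)
        elif isinstance(node, list):
            for item in node:
                _walk(item)

    _walk(masked)
    return masked


rag = {
    "pipeline": {
        "retriever": {
            "pipeline": {
                "query_embedding_model": {
                    "api_endpoint": {
                        "url": "<my-query-embedding-url>",
                        "model_id": "<my-query-embedding-model>",
                        "api_key": "<my-api-key>",
                    }
                }
            }
        },
        "model": {
            "api_endpoint": {
                "url": "<my-model-url>",
                "model_id": "<my-model-id>",
                "api_key": "<my-api-key>",
            }
        },
    }
}

# Safe to print or log: keys are masked, everything else is unchanged.
print(json.dumps(mask_api_keys(rag), indent=2))
```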

Embedding, Reranking + Answer Generation and Evaluation#

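NeMo Evaluator supports Retrieval + Reranking + Answer Generation + Answer Evaluation RAG pipelines. The rag pipeline field contains a retriever pipeline and a model, and the retriever pipeline includes a reranker_model in addition to the query and index embedding models.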

To Create a RAG Embedding + Reranking + Answer Generation + Evaluation Target#

Choose one of the following options to create a RAG embedding + reranking + answer generation + evaluation target.

import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['EVALUATOR_BASE_URL']
)

# Create a RAG embedding + reranking + answer generation + evaluation target
client.evaluation.targets.create(
    type="rag",
    name="my-target-rag-4",
    namespace="my-organization",
    rag={
        "pipeline": {
            "retriever": {
                "pipeline": {
                    "query_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-query-embedding-url>",
                            "model_id": "<my-query-embedding-model>"
                        }
                    },
                    "index_embedding_model": {
                        "api_endpoint": {
                            "url": "<my-index-embedding-url>",
                            "model_id": "<my-index-embedding-model>"
                        }
                    },
                    "reranker_model": {
                        "api_endpoint": {
                            "url": "<my-ranker-url>",
                            "model_id": "<my-ranker-model>"
                        }
                    },
                    "top_k": 3
                }
            },
            "model": {
                "api_endpoint": {
                    "url": "<my-nim-deployment-base-url>/v1/chat/completions",
                    "model_id": "<my-model>"
                }
            }
        }
    }
)

print("RAG embedding + reranking + generation + evaluation target created successfully")

curl -X "POST" "${EVALUATOR_BASE_URL}/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
     "type": "rag",
     "name": "my-target-rag-4",
     "namespace": "my-organization",
     "rag": {
        "pipeline": {
           "retriever": {
              "pipeline": {
                 "query_embedding_model": {
                    "api_endpoint": {
                       "url": "<my-query-embedding-url>",
                       "model_id": "<my-query-embedding-model>"
                    }
                 },
                 "index_embedding_model": {
                    "api_endpoint": {
                       "url": "<my-index-embedding-url>",
                       "model_id": "<my-index-embedding-model>"
                    }
                 },
                 "reranker_model": {
                    "api_endpoint": {
                       "url": "<my-ranker-url>",
                       "model_id": "<my-ranker-model>"
                    }
                 },
                 "top_k": 3
              }
           },
           "model": {
              "api_endpoint": {
                 "url": "<my-nim-deployment-base-url>/v1/chat/completions",
                 "model_id": "<my-model>"
              }
           }
        }
     }
  }'
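The reranking target differs from the embedding-only target by a single reranker_model entry in the retriever pipeline. The sketch below builds the retriever section with the reranker as an optional argument, so both variants come from one function. The helper `retriever_pipeline` is hypothetical, and its `top_k=3` default simply mirrors the examples on this page rather than a documented default.

```python
def retriever_pipeline(query_endpoint, index_endpoint, top_k=3, reranker_endpoint=None):
    """Build the retriever section of a RAG target, with an optional reranker.

    Hypothetical helper; field names are taken from the examples above.
    """
    pipeline = {
        "query_embedding_model": {"api_endpoint": query_endpoint},
        "index_embedding_model": {"api_endpoint": index_endpoint},
    }
    if reranker_endpoint is not None:
        # Only the reranking variant carries this entry.
        pipeline["reranker_model"] = {"api_endpoint": reranker_endpoint}
    pipeline["top_k"] = top_k
    return {"pipeline": pipeline}


query = {"url": "<my-query-embedding-url>", "model_id": "<my-query-embedding-model>"}
index = {"url": "<my-index-embedding-url>", "model_id": "<my-index-embedding-model>"}
ranker = {"url": "<my-ranker-url>", "model_id": "<my-ranker-model>"}

embedding_only = retriever_pipeline(query, index)
with_reranker = retriever_pipeline(query, index, reranker_endpoint=ranker)

print(sorted(embedding_only["pipeline"]))
print(sorted(with_reranker["pipeline"]))
```

Either result can be used as the value of "retriever" inside the rag pipeline field, next to the "model" entry.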