Custom HTTP Headers#

Note

The time to complete this tutorial is approximately 15 minutes.

Authorization Headers#

By default, the microservice reads the NIM_ENDPOINT_API_KEY environment variable for the API key to send to the LLM.

As an alternative to setting the environment variable, you can pass the API key using the X-Model-Authorization header. When the microservice receives the request, it extracts the token from the header and uses it for authorization. The header is sent only with the request to the LLM, not to any other services or endpoints.

If the LLM response includes headers, they are available in the X-Model-Response-Headers header.
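
For example, the following Python sketch reads the header from the microservice response. It uses the requests package and the self-check guardrails configuration that appears later in this tutorial; treat it as an illustration rather than a definitive recipe.

import os

import requests

url = f"http://0.0.0.0:{os.environ['GUARDRAILS_PORT']}/v1/guardrail/chat/completions"
response = requests.post(
    url,
    headers={"X-Model-Authorization": os.environ["NVIDIA_API_KEY"]},
    json={
        "model": "meta/llama-3.1-70b-instruct",
        "messages": [{"role": "user", "content": "Hello! How are you?"}],
        "guardrails": {"config_id": "self-check"},
    },
)

# If the LLM returned headers, the microservice exposes them here.
print(response.headers.get("X-Model-Response-Headers"))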

The following sample curl command shows one way to send the API key.

curl -X 'POST' \
  "http://0.0.0.0:${GUARDRAILS_PORT}/v1/guardrail/chat/completions" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "X-Model-Authorization: ${NVIDIA_API_KEY}" \
  -d '{
  "model": "meta/llama-3.1-70b-instruct",
  "messages": [
    {
      "role": "user",
      "content": "Hello! How are you?"
    }
  ],
  "guardrails": {
    "config_id": "self-check",
  },
  "max_tokens": 256,
  "temperature": 1,
  "top_p": 1
}'

The following sample Python code shows one way to send the API key.

import os
from openai import OpenAI

nvidia_api_key = os.getenv("NVIDIA_API_KEY")
guardrails_port = os.getenv("GUARDRAILS_PORT")

# Send the API key in the X-Model-Authorization header with every request.
x_model_authorization = {"X-Model-Authorization": nvidia_api_key}

client = OpenAI(
    base_url=f"http://0.0.0.0:{guardrails_port}/v1/guardrail",
    default_headers=x_model_authorization,
)
...
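
One way to complete the elided portion is to send a chat completion through the client. The following continuation is a sketch: it assumes the self-check configuration from the preceding curl command and passes the guardrails settings through the extra_body parameter of the OpenAI SDK.

# Request fields that are not part of the OpenAI API, such as
# "guardrails", are passed through extra_body.
completion = client.chat.completions.create(
    model="meta/llama-3.1-70b-instruct",
    messages=[{"role": "user", "content": "Hello! How are you?"}],
    extra_body={"guardrails": {"config_id": "self-check"}},
    max_tokens=256,
    temperature=1,
    top_p=1,
)
print(completion.choices[0].message.content)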

Other Custom Headers#

In addition to the X-Model-Authorization header, you can specify custom headers with names that begin with x or X. The microservice passes these headers to the LLM, provided the LLM supports custom headers.

The following sample curl command specifies several custom headers.

curl -v -X 'POST' \
  "http://localhost:${GUARDRAILS_PORT}/v1/guardrail/chat/completions" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "X-Model-Authorization: $OPENAI_API_KEY" \
  -H "X-Model-Response-Time: 100ms" \
  -H "X-Model-Request-ID: 12345" \
  -H "X-Model-Client: my-client" \
  -H "X-Model-Version: 1.0.0" \
  -d '{
  "model": "meta/llama-3.1-70b-instruct",
  "messages":[{"role": "user", "content": "how does internet work"}],
  "max_tokens": 160,
  "stream": false,
  "temperature": 1,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}'

When the microservice receives this request, it forwards the X-prefixed headers to the LLM, provided the LLM supports custom headers.
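
In Python, one way to send custom headers is the default_headers parameter of the OpenAI client, as in the earlier example. The following sketch mirrors the curl command above; the header names and values are illustrative only, and the client reads OPENAI_API_KEY from the environment.

import os

from openai import OpenAI

# Custom header names must begin with x or X; these are illustrative.
custom_headers = {
    "X-Model-Authorization": os.getenv("OPENAI_API_KEY"),
    "X-Model-Request-ID": "12345",
    "X-Model-Client": "my-client",
    "X-Model-Version": "1.0.0",
}

client = OpenAI(
    base_url=f"http://localhost:{os.getenv('GUARDRAILS_PORT')}/v1/guardrail",
    default_headers=custom_headers,
)

# The microservice forwards the X-prefixed headers to the LLM,
# provided the LLM supports custom headers.
completion = client.chat.completions.create(
    model="meta/llama-3.1-70b-instruct",
    messages=[{"role": "user", "content": "how does internet work"}],
    max_tokens=160,
)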