# Chat Completions
Generate chat completions from a model through the NIM Proxy microservice by making a POST API call.
## Prerequisites
Before you can generate chat completions, make sure that you have:
- Access to the NIM Proxy microservice through the base URL where the service is deployed. Store the base URL in the `NIM_PROXY_BASE_URL` environment variable.
- A valid model name. To retrieve the list of models deployed as NIM microservices in your environment, use the `${NIM_PROXY_BASE_URL}/v1/models` API, as shown in the sketch after this list. For more information, see List Models.
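To check both prerequisites at once, you can query the models endpoint from Python. This is a minimal sketch, assuming `NIM_PROXY_BASE_URL` is exported in your environment and that the endpoint returns an OpenAI-style model list; add authentication headers if your deployment requires them.

```python
import os

import requests

# Assumes NIM_PROXY_BASE_URL is already exported in your shell.
base_url = os.environ["NIM_PROXY_BASE_URL"]

# List the models deployed as NIM microservices. The response is assumed
# to follow the OpenAI-style {"object": "list", "data": [...]} convention.
response = requests.get(f"{base_url}/v1/models", timeout=30)
response.raise_for_status()

for model in response.json().get("data", []):
    print(model["id"])
```

Any model ID printed here is a valid value for the `model` field in the chat completions request below.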
## Options
You can generate chat completions in the following ways.
### API
1. Perform a `POST` request to the `/v1/chat/completions` endpoint.

   Use the following cURL command. For more details on the request body, refer to the NIM for LLMs API reference and find the API named `v1/chat/completions`. The NIM Proxy API endpoint routes your requests to the NIM for LLMs microservice.

   ```bash
   curl -X POST \
     "${NIM_PROXY_BASE_URL}/v1/chat/completions" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
           "model": "meta/llama-3.1-8b-instruct",
           "messages": [
             {
               "role": "user",
               "content": "Hello! How are you?"
             }
           ],
           "max_tokens": 32
         }' | jq
   ```
2. Review the response.

   Example Response

   ```json
   {
     "id": "chatcmpl-123",
     "object": "chat.completion",
     "created": 1677652288,
     "model": "meta/llama-3.1-8b-instruct",
     "choices": [
       {
         "index": 0,
         "message": {
           "role": "assistant",
           "content": "I'm doing well, thank you for asking! How can I help you today?"
         },
         "finish_reason": "stop"
       }
     ],
     "usage": {
       "prompt_tokens": 15,
       "completion_tokens": 12,
       "total_tokens": 27
     }
   }
   ```