Completions#

Generate text completions from a model by sending a POST request to the NIM Proxy microservice.

Prerequisites#

Before you can generate completions, make sure that you have:

  • Access to the NIM Proxy microservice through the base URL where the service is deployed. Store the base URL in an environment variable NIM_PROXY_BASE_URL.

  • A valid model name. To retrieve the list of models deployed as NIM microservices in your environment, use the ${NIM_PROXY_BASE_URL}/v1/models API. For more information, see List Models. An example that sets the environment variable and lists the models follows this list.

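For example, the following commands set the environment variable and list the deployed models. The base URL shown is a placeholder; replace it with the URL where the NIM Proxy microservice is deployed in your environment.

    # Placeholder base URL; substitute your deployment's URL
    export NIM_PROXY_BASE_URL="https://nim-proxy.example.com"

    # List the models deployed as NIM microservices
    curl -X GET \
      "${NIM_PROXY_BASE_URL}/v1/models" \
      -H 'accept: application/json' | jq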

Options#

You can generate completions in the following way.

API#

  1. Send a POST request to the /v1/completions endpoint.

    Use the following cURL command. For details on the request body, refer to the NIM for LLMs API reference and look up the v1/completions endpoint. The NIM Proxy API endpoint routes your request to the NIM for LLMs microservice.

    curl -X POST \
      "${NIM_PROXY_BASE_URL}/v1/completions" \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "meta/llama-3.1-8b-instruct",
        "prompt": "Once upon a time",
        "max_tokens": 64
      }' | jq
    
  2. Review the response.

    Example Response
    {
      "id": "cmpl-123",
      "object": "text_completion",
      "created": 1677652288,
      "model": "llama2-7b",
      "choices": [
        {
          "text": "In a sunlit studio, a robot named Pixel carefully dipped its metallic fingers into a palette of vibrant colors. Its optical sensors studied the blank canvas with intense focus, analyzing the interplay of light and shadow. With precise movements, it began to paint, each stroke a calculated expression of its growing understanding of art. The robot's journey from mechanical precision to artistic intuition was captured in the evolving masterpiece before it.",
          "index": 0,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 89,
        "total_tokens": 99
      }
    }
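
  3. Optionally, extract only the generated text.

    The choices array contains the completion. The following command is a sketch that repeats the same request and uses jq to print the text field of the first choice.

    curl -s -X POST \
      "${NIM_PROXY_BASE_URL}/v1/completions" \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "meta/llama-3.1-8b-instruct",
        "prompt": "Once upon a time",
        "max_tokens": 64
      }' | jq -r '.choices[0].text'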