Create Configuration#

Create a new deployment configuration for a NIM microservice you want to deploy.

Prerequisites#

Before you can create a NIM deployment configuration, make sure that you have:

- Access to the NeMo Deployment Management service, with its base URL stored in the DEPLOYMENT_MANAGEMENT_BASE_URL (Python SDK) or DEPLOYMENT_BASE_URL (cURL) environment variable, as used in the examples below.
- For the Python option, the NeMo Microservices Python SDK (nemo_microservices) installed, and the NIM Proxy base URL stored in the NIM_PROXY_BASE_URL environment variable.

To Create a Configuration#

Choose one of the following options to create a configuration: use the NeMo Microservices Python SDK, or call the REST API directly.

To use the NeMo Microservices Python SDK, first set up the client:

import os

from nemo_microservices import NeMoMicroservices

client = NeMoMicroservices(
    base_url=os.environ["DEPLOYMENT_MANAGEMENT_BASE_URL"],
    inference_base_url=os.environ["NIM_PROXY_BASE_URL"]
)

# For NVIDIA NGC models

response = client.deployment.configs.create(
    name="your-custom-config",
    namespace="your-namespace",
    description="Custom configuration for NIM deployment",
    model="meta/llama-3.1-8b-instruct",
    nim_deployment={
        # Example image following the NGC NIM naming convention;
        # check NGC for the image and tag that match your model.
        "image_name": "nvcr.io/nim/meta/llama-3.1-8b-instruct",
        "image_tag": "1.8.3",
        # Number of GPUs to allocate to the deployment.
        "gpu": 1,
        # Optional environment variables to pass to the NIM container.
        "additional_envs": {
            "NIM_LOG_LEVEL": "INFO"
        },
        "namespace": "your-namespace"
    },
    project="your-project",
)
print(response)

# For external models such as OpenAI or build.nvidia.com

response = client.deployment.configs.create(
    name="your-custom-config",
    namespace="your-namespace",
    description="External endpoint configuration",
    external_endpoint={
        # Base URL of the external OpenAI-compatible endpoint;
        # https://integrate.api.nvidia.com is the build.nvidia.com API host.
        "host_url": "https://integrate.api.nvidia.com",
        # Read the key from the environment rather than hard-coding it
        # (EXTERNAL_API_KEY is a placeholder variable name).
        "api_key": os.environ["EXTERNAL_API_KEY"],
        # Models on the endpoint that this configuration makes available.
        "enabled_models": [
            "meta/llama-3.1-8b-instruct"
        ]
    },
    project="your-project",
)
print(response)

Alternatively, make a POST request to the /v1/deployment/configs endpoint.

For more details on the request body, see the Deployment Management API reference.

For NVIDIA NGC Models

curl -X POST \
  "${DEPLOYMENT_BASE_URL}/v1/deployment/configs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "string",
    "namespace": "string",
    "description": "string",
    "model": "string",
    "nim_deployment": {
      "image_name": "string",
      "image_tag": "string",
      "gpu": 0,
      "additional_envs": {
        "additionalProp1": "string",
        "additionalProp2": "string",
        "additionalProp3": "string"
      },
      "namespace": "string"
    },
    "project": "string",
    "custom_fields": {},
    "ownership": {
      "created_by": "",
      "access_policies": {}
    }
  }' | jq
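
For reference, here is the same request with illustrative values filled in. The image name follows the NGC NIM naming convention; the tag, namespace, project, and environment variable shown are placeholders to adapt to your environment:

curl -X POST \
  "${DEPLOYMENT_BASE_URL}/v1/deployment/configs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "llama-3-1-8b-config",
    "namespace": "default",
    "description": "Custom configuration for NIM deployment",
    "model": "meta/llama-3.1-8b-instruct",
    "nim_deployment": {
      "image_name": "nvcr.io/nim/meta/llama-3.1-8b-instruct",
      "image_tag": "1.8.3",
      "gpu": 1,
      "additional_envs": {
        "NIM_LOG_LEVEL": "INFO"
      },
      "namespace": "default"
    },
    "project": "my-project"
  }' | jq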

For External Models such as OpenAI ChatGPT and build.nvidia.com

curl -X POST \
  "${DEPLOYMENT_BASE_URL}/v1/deployment/configs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "string",
    "namespace": "string",
    "description": "string",
    "model": "string",
    "external_endpoint": {
      "host_url": "https://example.com/",
      "api_key": "string",
      "enabled_models": [
        "string"
      ]
    },
    "project": "string",
    "custom_fields": {},
    "ownership": {
      "created_by": "",
      "access_policies": {}
    }
  }' | jq
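
Here is an illustrative version of the external-endpoint request. The host URL shown is the build.nvidia.com API endpoint, and NVIDIA_API_KEY is a placeholder environment variable holding your key:

curl -X POST \
  "${DEPLOYMENT_BASE_URL}/v1/deployment/configs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "build-nvidia-config",
    "namespace": "default",
    "description": "External endpoint configuration",
    "external_endpoint": {
      "host_url": "https://integrate.api.nvidia.com",
      "api_key": "'"${NVIDIA_API_KEY}"'",
      "enabled_models": [
        "meta/llama-3.1-8b-instruct"
      ]
    },
    "project": "my-project"
  }' | jq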

Example Response

{
  "created_at": "2025-05-30T23:45:33.033Z",
  "updated_at": "2025-05-30T23:45:33.033Z",
  "name": "string",
  "namespace": "string",
  "description": "string",
  "model": "string",
  "nim_deployment": {
    "image_name": "string",
    "image_tag": "string",
    "gpu": 0,
    "additional_envs": {
      "additionalProp1": "string",
      "additionalProp2": "string",
      "additionalProp3": "string"
    },
    "namespace": "string"
  },
  "external_endpoint": {
    "host_url": "https://example.com/",
    "api_key": "string",
    "enabled_models": [
      "string"
    ]
  },
  "schema_version": "1.0",
  "project": "string",
  "custom_fields": {},
  "ownership": {
    "created_by": "",
    "access_policies": {}
  }
}

For more information about the response fields, see the Deployment Management API reference. In practice, a configuration contains either a nim_deployment or an external_endpoint section, depending on which option you used; the schema example above shows both.

Tip

The configuration is created immediately and can be used for deployments right away.
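
For example, a follow-up request along these lines would create a deployment from the configuration. This is a minimal sketch that assumes the model-deployments endpoint and the namespace/name config reference described in the Deployment Management API reference; check that reference for the exact request body:

curl -X POST \
  "${DEPLOYMENT_BASE_URL}/v1/deployment/model-deployments" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "llama-deployment",
    "namespace": "default",
    "config": "your-namespace/your-custom-config"
  }' | jq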