Deploy NIM Microservice
Deploy a NIM microservice.
Prerequisites
Before you can deploy a NIM microservice, make sure that you have:
Access to the NeMo Deployment Management service, either through the NeMo platform host (if you installed the NeMo platform) or through the service's own base URL (if you installed the service individually). Store the base URL in the DEPLOYMENT_BASE_URL environment variable.
The details and deployment specifications of the model you want to deploy. To find the models supported by NVIDIA NIM, see Models in the NVIDIA NIM for LLMs documentation.
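As a quick sanity check before sending any requests, the following sketch reads DEPLOYMENT_BASE_URL and fails early if it is missing. The helper name get_deployment_base_url is hypothetical and is not part of the NeMo SDK. Trimming the trailing slash is only a convenience assumption so that API paths can be appended consistently.

```python
import os


def get_deployment_base_url(env=None):
    """Return the Deployment Management base URL without a trailing slash.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    This helper is illustrative only and is not part of the NeMo SDK.
    """
    env = os.environ if env is None else env
    url = env.get("DEPLOYMENT_BASE_URL", "").strip()
    if not url:
        raise RuntimeError(
            "DEPLOYMENT_BASE_URL is not set; export it before deploying a NIM microservice."
        )
    return url.rstrip("/")
```

Calling get_deployment_base_url() at the top of a script surfaces a missing variable immediately, rather than as a less obvious KeyError or connection error later on.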
To Deploy a NIM Microservice
Choose one of the following options to deploy a NIM microservice.
Use the following Python SDK code.
import os

from nemo_microservices import NeMoMicroservices

# Initialize the client.
# DEPLOYMENT_BASE_URL is the base URL stored in the prerequisites;
# NIM_PROXY_BASE_URL is the base URL used for inference requests.
client = NeMoMicroservices(
    base_url=os.environ["DEPLOYMENT_BASE_URL"],
    inference_base_url=os.environ["NIM_PROXY_BASE_URL"],
)

# Create the NIM deployment. Replace the placeholder values with your own.
response = client.deployment.model_deployments.create(
    name="your-nim-deployment",
    namespace="your-namespace",
    description="NIM deployment for inference",
    models=["meta/llama-3.1-8b-instruct"],
    async_enabled=False,
    config="default",
    project="your-project",
    custom_fields={},
    ownership={
        "created_by": "",
        "access_policies": {},
    },
)
print(response)
Use the following cURL command. For more details on the request body, see the Deployment Management API reference.
curl -X POST \
  "${DEPLOYMENT_BASE_URL}/v1/deployment/model-deployments" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "string",
    "namespace": "string",
    "description": "string",
    "models": [
      "string"
    ],
    "async_enabled": false,
    "config": "string",
    "project": "string",
    "custom_fields": {},
    "ownership": {
      "created_by": "",
      "access_policies": {}
    }
  }' | jq
Example Response
{
  "created_at": "2025-05-30T23:52:51.074Z",
  "updated_at": "2025-05-30T23:52:51.074Z",
  "name": "string",
  "namespace": "string",
  "description": "string",
  "url": "https://example.com/",
  "deployed": false,
  "status_details": {
    "status": "created",
    "description": "string"
  },
  "models": [
    "string"
  ],
  "async_enabled": false,
  "config": "string",
  "schema_version": "1.0",
  "project": "string",
  "custom_fields": {},
  "ownership": {
    "created_by": "",
    "access_policies": {}
  }
}
For more information about the API response, see the Deployment Management API reference.
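To illustrate how the response fields might be read in a script, here is a minimal sketch that extracts the deployment identity, the deployed flag, and the status from a response shaped like the example. The function name summarize_deployment is hypothetical, and the sketch assumes the response has already been parsed into a Python dict (for example, from the cURL output).

```python
def summarize_deployment(response):
    """Condense a deployment response dict into the fields most useful for monitoring.

    Illustrative helper only; field names follow the example response.
    """
    status_details = response.get("status_details") or {}
    return {
        "id": f'{response["namespace"]}/{response["name"]}',
        "deployed": bool(response.get("deployed", False)),
        "status": status_details.get("status", "unknown"),
    }


# A trimmed copy of the example response, with its placeholder values.
example = {
    "name": "string",
    "namespace": "string",
    "deployed": False,
    "status_details": {"status": "created", "description": "string"},
}
print(summarize_deployment(example))
# prints {'id': 'string/string', 'deployed': False, 'status': 'created'}
```

In the example, "deployed" is false and the status is "created", which suggests that a successful create call means the deployment was accepted, not that it is ready to serve requests.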
Tip
The deployment process may take a few minutes to complete. You can check the deployment status using the Get NIM Deployment Details API.
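Because deployment can take a few minutes, a script may want to poll until the "deployed" field becomes true. The sketch below shows one way to do that using only the Python standard library. It rests on assumptions: the function names are hypothetical, and the GET path /v1/deployment/model-deployments/{namespace}/{name} is inferred from the create endpoint, so verify it against the Deployment Management API reference before relying on it.

```python
import json
import time
import urllib.request


def fetch_deployment(base_url, namespace, name):
    """Fetch deployment details as a dict.

    ASSUMPTION: the details endpoint path below is inferred from the create
    endpoint; check the Deployment Management API reference for the real path.
    """
    url = f"{base_url.rstrip('/')}/v1/deployment/model-deployments/{namespace}/{name}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wait_until_deployed(fetch, timeout_s=600.0, interval_s=10.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call `fetch()` repeatedly until the result reports deployed == True.

    `fetch` is any zero-argument callable returning the deployment details dict.
    Raises TimeoutError if the deployment is not ready within `timeout_s` seconds.
    """
    deadline = clock() + timeout_s
    while True:
        details = fetch()
        if details.get("deployed"):
            return details
        if clock() >= deadline:
            raise TimeoutError(f"deployment not ready after {timeout_s} seconds")
        sleep(interval_s)
```

A typical call would be wait_until_deployed(lambda: fetch_deployment(base_url, "your-namespace", "your-nim-deployment")). Passing the fetch step as a callable keeps the polling loop independent of the transport, so the same loop could wrap an SDK call instead of a raw HTTP request.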