API Reference#
Endpoints Schema#
The following are endpoints for NVIDIA NIM for Cosmos:
/v1/infer
/v1/health/ready
/v1/health/live
/v1/license
/v1/metrics
/v1/metadata
/v1/manifest
API Examples#
Use the examples in this section to get started with the API.
Check Health#
Use the following command to check server health.
cURL Request
curl -X 'GET' 'http://0.0.0.0:8000/v1/health/ready'
Response
{
"description":"Triton readiness check",
"status":"ready"
}
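Because the server may need time before it reports ready, a client can poll the readiness endpoint instead of calling it once. The following is a minimal Python sketch using only the standard library. The helper names `is_ready` and `wait_until_ready`, the timeout, and the polling interval are illustrative assumptions and not part of the NIM API; the host, port, and response shape are taken from the example above.

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "http://0.0.0.0:8000"  # host and port from the cURL example


def is_ready(body: str) -> bool:
    """Return True when a readiness response body reports status "ready"."""
    try:
        return json.loads(body).get("status") == "ready"
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as not ready.
        return False


def wait_until_ready(timeout_s: float = 600.0, interval_s: float = 5.0) -> bool:
    """Poll /v1/health/ready until it reports ready or the timeout expires.

    The timeout and interval defaults are arbitrary illustrative values.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            url = f"{BASE_URL}/v1/health/ready"
            with urllib.request.urlopen(url, timeout=10) as resp:
                if is_ready(resp.read().decode("utf-8")):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet; retry after the interval
        time.sleep(interval_s)
    return False
```

Calling `wait_until_ready()` before sending the first inference request avoids failures while the server is still starting.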
Generate Sample#
Use the following command to generate a video sample. The generation process can take several minutes, depending on the hardware used and the selected profile. For more information on performance characteristics, refer to the Supported Models section.
cURL Request
curl -X 'POST' \
'http://0.0.0.0:8000/v1/infer' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"prompt": "first person view from a camera in a car driving down a two lane neighborhood street, viewed from the dashcam as we drive down the street. The camera faces forward. There are nice houses and sidewalks in this suburban area with green grass front yards and flower gardens and large oak trees. It is a rainy day and there are grey clouds overhead. The road has puddles on it, reflecting the sky overhead. The windshield wipers flash by.",
"negative_prompt": "blurry, low quality, artifacts, people",
"prompt_upsampling": true,
"seed": 4,
"guidance_scale": 7.5,
"steps": 50,
"video_params": {
"height": 704,
"width": 1280,
"frames_count": 121,
"frames_per_sec": 24
}
}'
Response
The server will respond with the following:
{
"b64_video": "<base64EncodedVideoString>",
"upsampled_prompt": "first person view from a camera in a car driving down a two lane neighborhood street, viewed from the dashcam as we drive down the street. The camera faces forward. There are nice houses and sidewalks in this suburban area with green grass front yards and flower gardens and large oak trees. It is a rainy day and there are grey clouds overhead. The road has puddles on it, reflecting the sky overhead. The windshield wipers flash by.",
"seed": 4
}
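The `b64_video` field holds the generated video as a base64 string, so it must be decoded before it can be played. The sketch below sends a request from Python and writes the decoded bytes to disk. The function names `infer` and `save_video` and the long request timeout are assumptions made for illustration; the endpoint, headers, and field names come from the example above. The output container format is not stated in this section, so choose the file extension accordingly.

```python
import base64
import json
import urllib.request

INFER_URL = "http://0.0.0.0:8000/v1/infer"  # same endpoint as the cURL example


def infer(payload: dict, timeout_s: float = 1800.0) -> dict:
    """POST a JSON payload to /v1/infer and return the parsed JSON response.

    Generation can take several minutes, so the default timeout is generous
    (an illustrative value, not a documented limit).
    """
    request = urllib.request.Request(
        INFER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


def save_video(response: dict, path: str) -> int:
    """Decode the b64_video field, write it to path, and return the byte count."""
    video_bytes = base64.b64decode(response["b64_video"])
    with open(path, "wb") as f:
        f.write(video_bytes)
    return len(video_bytes)
```

A typical call would be `save_video(infer(payload), "sample.mp4")`, where `payload` is a dictionary with the same fields as the JSON body shown in the cURL request.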
The following request additionally supplies an input image through the image field.
cURL Request
curl -X POST \
http://0.0.0.0:8000/v1/infer \
-H 'Content-Type: application/json' \
-d '{
"prompt": "The video is a wide shot of a large industrial facility, likely a chemical plant or factory, situated in a rural or semi-industrial area. The scene is set during a partly cloudy day, with the sky showing patches of blue and white clouds. The facility is surrounded by a vast expanse of green fields, indicating its location in a countryside or suburban area. The factory itself is a large, rectangular building with a flat roof, constructed from concrete and metal. It features several large cylindrical tanks and pipes, suggesting the processing of chemicals or liquids. The tanks are arranged in a linear fashion along the side of the building, and there are several smaller structures and equipment scattered around the premises. The camera remains static throughout the video, capturing the entire facility from a distance, allowing viewers to observe the layout and scale of the operations. The lighting is natural, with sunlight casting shadows on the ground, enhancing the details of the industrial setup. There are no visible human activities or movements, indicating that the video might be a documentary or an informational piece about industrial processes.", "negative_prompt": "blurry, low quality, artifacts, people",
"image": "https://assets.ngc.nvidia.com/products/api-catalog/cosmos/industry_01_prompt.jpg",
"seed": 42,
"guidance_scale": 7.5,
"steps": 35,
"video_params": {
"height": 704,
"width": 1280,
"frames_count": 121,
"frames_per_sec": 24
}
}'
The image field should be a URL to the image location or a base64-encoded image. If the NIM_ALLOW_URL_INPUT environment variable is set to 0, the image field does not accept URLs and a base64-encoded image must be provided.
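When URL input is disabled, a local file has to be base64-encoded before it is placed in the request. The sketch below shows one way to do that in Python. It assumes the field accepts the plain base64 string with no data-URI prefix, which this section does not specify; the helper names `encode_file_base64` and `build_image_payload` are hypothetical.

```python
import base64


def encode_file_base64(path: str) -> str:
    """Read a file and return its contents as an ASCII base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def build_image_payload(prompt: str, image_path: str, seed: int = 42) -> dict:
    """Build a request body whose image field carries base64 data.

    Assumption: the server accepts the raw base64 string as-is.
    Other fields (guidance_scale, steps, video_params) can be added
    to the returned dictionary as in the cURL example.
    """
    return {
        "prompt": prompt,
        "image": encode_file_base64(image_path),
        "seed": seed,
    }
```

The same `encode_file_base64` helper applies to any binary input file, since it does not inspect the file contents.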
Response
The server will respond with the following:
{
"b64_video": "<base64EncodedVideoString>",
"seed": 42
}
For Video2World, the video field is required instead of image.
cURL Request
curl -X POST \
http://0.0.0.0:8000/v1/infer \
-H 'Content-Type: application/json' \
-d '{
"prompt": "A first person view from the perspective from a human sized robot as it works in a chemical plant. The robot has many boxes and supplies nearby on the industrial shelves. The camera on moving forward, at a height of 1m above the floor. Photorealistic",
"video": "https://assets.ngc.nvidia.com/products/api-catalog/cosmos/ar_result_default_robot.mp4",
"seed": 42,
"guidance_scale": 7.5,
"steps": 35,
"video_params": {
"height": 704,
"width": 1280,
"frames_count": 121,
"frames_per_sec": 24
}
}'
The video field should be a URL to the video location or a base64-encoded video. If the NIM_ALLOW_URL_INPUT environment variable is set to 0, the video field does not accept URLs and a base64-encoded video must be provided.
Response
The server will respond with the following:
{
"b64_video": "<base64EncodedVideoString>",
"seed": 42
}
Error Handling#
The API returns standard HTTP status codes to indicate success or failure:
200 OK: Request successful
400 Bad Request: Invalid input parameters
500 Internal Server Error: Server-side error
Common error scenarios include the following:
Invalid input dimensions (height/width must be multiples of 8)
Malformed JSON in the request body
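Some of these errors can be caught on the client before a long-running request is sent. The following Python sketch checks the multiple-of-8 rule for height and width and maps the documented status codes to their meanings. The function names are hypothetical, and the additional positive-integer check is an assumption that goes beyond what this section states.

```python
def validate_video_params(video_params: dict) -> list[str]:
    """Return a list of problems found in height and width (empty if none)."""
    problems = []
    for key in ("height", "width"):
        value = video_params.get(key)
        # Assumption: dimensions must be positive integers (bool is excluded
        # because it is a subclass of int in Python).
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be a positive integer, got {value!r}")
        elif value % 8 != 0:
            problems.append(f"{key} must be a multiple of 8, got {value}")
    return problems


# Meanings copied from the status-code list in this section.
STATUS_MEANINGS = {
    200: "Request successful",
    400: "Invalid input parameters",
    500: "Server-side error",
}


def describe_status(code: int) -> str:
    """Return the documented meaning of a status code, if there is one."""
    return STATUS_MEANINGS.get(code, "Undocumented status code")
```

Building the request body with `json.dumps` rather than by hand-writing a string also avoids the malformed-JSON case.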
Tip
Refer to the troubleshooting page for additional steps to debug errors.