ASR HTTP REST API Reference#

Overview#

The ASR NIM exposes an HTTP REST API for offline speech recognition on the port set by NIM_HTTP_API_PORT (default 9000). All inference endpoints accept multipart/form-data requests. Use this API when you need simple curl-based access or language-agnostic client integration, or when you want to avoid a gRPC dependency.

Base URL:

http://<address>:9000

Streaming transcription and low-latency interactive use cases are served by the WebSocket Realtime API.

Endpoints#

`POST /v1/audio/transcriptions`#

Transcribes a complete audio file and returns the transcript in a single response. The entire file is sent in one request and no partial results are returned.

Content-Type: multipart/form-data

Request Parameters#

Parameter	Type	Required	Default	Description
`file`	file	Yes	—	Audio file to transcribe. Supported formats: WAV, OPUS, FLAC.
`language`	string	No	—	BCP-47 language code for the audio (for example, `en-US`, `es-US`, `zh-CN`, `multi`). Either `language` or `model` must be provided; omitting both returns a `400` error.
`model`	string	No	—	Internal model identifier. When provided, takes precedence over `language` for model selection. Use `language` for standard deployments.
`response_format`	string	No	`json`	Format of the response. Accepted values: `json` (returns `{"text": "..."}`) or `text` (returns a plain string).
`temperature`	number	No	—	Decoding temperature. Accepted by the API; has no effect on deterministic CTC models.
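The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the NIM itself: `build_form_fields` is a hypothetical helper name, and it only mirrors the two documented `400` conditions (neither `language` nor `model` supplied, or an unsupported `response_format`).

```python
def build_form_fields(language=None, model=None, response_format="json", temperature=None):
    """Return the non-file form fields for a transcription request.

    Hypothetical client-side helper: it rejects the combinations that the
    parameter table documents as 400 errors, so they fail before any upload.
    """
    if language is None and model is None:
        raise ValueError("either language or model must be provided")
    if response_format not in ("json", "text"):
        raise ValueError(f"unsupported response_format: {response_format!r}")

    fields = {"response_format": response_format}
    if language is not None:
        fields["language"] = language
    if model is not None:
        fields["model"] = model
    if temperature is not None:
        # Form fields are sent as text, so the number is stringified.
        fields["temperature"] = str(temperature)
    return fields
```

Validating locally avoids uploading a large audio file only to have the server reject the request.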

Examples#

Default JSON response:

curl -s http://localhost:9000/v1/audio/transcriptions \
  -F language=en-US \
  -F file="@recording.wav"

Plain-text response:

curl -s http://localhost:9000/v1/audio/transcriptions \
  -F language=en-US \
  -F response_format=text \
  -F file="@recording.wav"
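For clients that cannot shell out to curl, the same request can be sketched with the Python standard library alone. `encode_multipart` and `transcribe` are illustrative names, not part of any NIM client library, and the file part is labelled `application/octet-stream` on the assumption that the server identifies the audio format from the file contents.

```python
import json
import os
import urllib.request
import uuid


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one part named 'file' as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, language="en-US", base_url="http://localhost:9000"):
    """POST an audio file to /v1/audio/transcriptions and return the 'text' field."""
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    body, content_type = encode_multipart(
        {"language": language}, os.path.basename(path), audio
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]
```

With a NIM running locally, `transcribe("recording.wav")` corresponds to the first curl command above.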

Response#

response_format=json (default):

{
  "text": "What is natural language processing?"
}

response_format=text:

What is natural language processing?
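Because the two formats produce different bodies, a client has to know which one it requested before decoding the response. A small sketch follows; `extract_transcript` is a hypothetical helper, and stripping surrounding whitespace from the plain-text body is an assumption rather than documented behavior.

```python
import json


def extract_transcript(body, response_format="json"):
    """Return the transcript string from a response body in either format."""
    text = body.decode("utf-8") if isinstance(body, bytes) else body
    if response_format == "json":
        # JSON responses wrap the transcript in a single "text" field.
        return json.loads(text)["text"]
    if response_format == "text":
        # Plain-text responses are the transcript itself.
        return text.strip()
    raise ValueError(f"unsupported response_format: {response_format!r}")
```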

Status Codes#

Code	Description
`200 OK`	Transcription succeeded.
`400 Bad Request`	A required parameter is missing, the `response_format` value is not supported, or the specified `model` is not recognized. Response body: `{"detail": "<reason>"}`.
`503 Service Unavailable`	The NIM is still loading. Poll `/v1/health/ready` until it returns `200 OK`, then retry the request.

`POST /v1/audio/translations`#

Transcribes an audio file and translates the result into the specified target language, returning the translated text in a single response. Supported only by models that include a translation capability (Canary, Whisper).

Content-Type: multipart/form-data

Request Parameters#

Parameter	Type	Required	Default	Description
`file`	file	Yes	—	Audio file to transcribe and translate. Supported formats: WAV, OPUS, FLAC.
`language`	string	No	—	BCP-47 source language code of the audio (for example, `fr-FR`, `en-US`). Either `language` or `model` must be provided.
`target_language`	string	No	—	BCP-47 target language code for the translation output (for example, `en-US`, `fr-FR`).
`model`	string	No	—	Internal model identifier. Use `language` for standard deployments.
`response_format`	string	No	`json`	Format of the response. Accepted values: `json` or `text`.
`temperature`	number	No	—	Decoding temperature.

Example#

curl -s http://localhost:9000/v1/audio/translations \
  -F language=fr-FR \
  -F target_language=en-US \
  -F file="@french_audio.wav"
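When the call is driven from a script, the curl invocation above can be assembled as an argument list instead of a shell string, which avoids quoting problems in file names. This is only a sketch: `translation_command` and `translate` are hypothetical helper names, and it assumes `curl` is on the `PATH` and that the default `json` response format is used.

```python
import json
import subprocess


def translation_command(audio_path, language, target_language,
                        base_url="http://localhost:9000"):
    """Build the curl argument list for POST /v1/audio/translations."""
    return [
        "curl", "-s", f"{base_url}/v1/audio/translations",
        "-F", f"language={language}",
        "-F", f"target_language={target_language}",
        # The leading @ tells curl to upload the file's contents.
        "-F", f"file=@{audio_path}",
    ]


def translate(audio_path, language, target_language):
    """Run curl and return the translated text from the JSON response."""
    result = subprocess.run(
        translation_command(audio_path, language, target_language),
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)["text"]
```

For example, `translate("french_audio.wav", "fr-FR", "en-US")` issues the same request as the command above.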

Response#

Same structure as `/v1/audio/transcriptions`, with `text` containing the translated output:

{
  "text": "The translated text in the target language."
}

Status Codes#

Code	Description
`200 OK`	Translation succeeded.
`400 Bad Request`	Missing parameter or unsupported language pair. Response body: `{"detail": "<reason>"}`.
`503 Service Unavailable`	The NIM is not ready. Poll `/v1/health/ready` until it returns `200 OK`, then retry the request.

`GET /v1/health/ready`#

Returns the readiness state of the NIM. Use this endpoint to determine when the service has finished loading models and is ready to serve inference requests.

Example#

curl http://localhost:9000/v1/health/ready

Response#

{
  "object": "health.response",
  "message": "ready",
  "status": "ready"
}

Status Codes#

Code	Description
`200 OK`	The NIM is healthy and ready to accept requests.
`503 Service Unavailable`	The NIM is still initializing.
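A startup script can poll this endpoint before sending inference requests. The sketch below uses only the standard library. The attempt count and delay are arbitrary illustrative defaults, and `is_ready` and `wait_until_ready` are hypothetical names.

```python
import time
import urllib.request


def is_ready(base_url="http://localhost:9000", timeout=5.0):
    """Return True when GET /v1/health/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/health/ready", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # urllib raises HTTPError (an OSError subclass) for 503 responses;
        # refused connections and timeouts are OSError subclasses as well.
        return False


def wait_until_ready(probe=is_ready, attempts=60, delay=5.0, sleep=time.sleep):
    """Call probe() until it returns True or the attempts are used up."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The `probe` and `sleep` parameters are injectable so that the loop can be exercised in tests without a running service.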

`GET /v1/health/live`#

Returns the liveness state of the NIM process. Container orchestrators use this endpoint to decide whether the container should be restarted.

Example#

curl http://localhost:9000/v1/health/live

Response#

{
  "object": "health.response",
  "message": "live",
  "status": "live"
}

Status Codes#

Code	Description
`200 OK`	The process is alive.
`503 Service Unavailable`	The process is not responding.

`GET /v1/models`#

Returns the list of available models in OpenAI-compatible format.

Example#

curl -s http://localhost:9000/v1/models | jq .

Response#

{
  "object": "list",
  "data": [
    {
      "id": "unknown",
      "object": "model",
      "created": 0,
      "owned_by": "system"
    }
  ]
}
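A client that only needs the model identifiers can reduce this payload to a list of `id` values. The sketch below works on an already-fetched body, and `model_ids` is a hypothetical helper name.

```python
import json


def model_ids(payload):
    """Return the 'id' of every entry whose object type is 'model'."""
    data = json.loads(payload) if isinstance(payload, (str, bytes)) else payload
    return [
        entry["id"]
        for entry in data.get("data", [])
        if entry.get("object") == "model"
    ]
```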

`GET /v1/version`#

Returns the NIM release and API version.

Example#

curl -s http://localhost:9000/v1/version | jq .

Response#

{
  "release": "1.5.1",
  "api": "3.1.0"
}

`GET /v1/metadata`#

Returns metadata about the deployed model, including the selected profile ID and NGC model URLs.

Example#

curl -s http://localhost:9000/v1/metadata | jq .

Response#

{
  "version": "1.5.1",
  "selectedModelProfileId": "<profile-hash>",
  "modelInfo": [
    {
      "modelUrl": "ngc://nim/nvidia/parakeet-1-1b-ctc-riva:<tag>",
      "shortName": "parakeet-1-1b-ctc-riva:<tag>"
    }
  ],
  "repository_override": "",
  "assetInfo": [],
  "licenseInfo": {}
}
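The `modelInfo` entries can be split into a model name and tag for logging or inventory purposes. The following sketch assumes that `shortName` has the form `name:tag`, as in the example above. `deployed_models` is a hypothetical helper, and the placeholder values are kept as they appear in the sample response.

```python
def deployed_models(metadata):
    """Return name, tag, and URL for each entry in the metadata 'modelInfo' list."""
    models = []
    for info in metadata.get("modelInfo", []):
        short_name = info.get("shortName", "")
        name, separator, tag = short_name.rpartition(":")
        if not separator:
            # No ':' present, so treat the whole string as the name.
            name, tag = short_name, ""
        models.append({"name": name, "tag": tag, "url": info.get("modelUrl", "")})
    return models
```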

`GET /v1/metrics`#

Returns runtime metrics in Prometheus text format.

Example#

curl http://localhost:9000/v1/metrics

Key Metrics#

Metric	Description
`num_requests_asr_total`	Total ASR requests received since startup.
`num_requests_asr_running`	ASR requests currently being processed.
`num_requests_asr_success_total`	Total successful ASR requests.
`request_duration_seconds_asr_total`	Cumulative wall-clock time spent on ASR requests, in seconds.
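These counters can be combined into a rough average request time. The sketch below parses Prometheus text format with a deliberately simple parser: it sums samples per metric name and does not handle label values that contain `}`. It assumes the duration and success counters cover the same set of requests, which this page does not state. The function names are hypothetical.

```python
def parse_metrics(text):
    """Sum sample values per metric name from Prometheus text format."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name, _, rest = line.partition("{")
            rest = rest.rpartition("}")[2]  # drop the label set
        else:
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            value = float(fields[0])  # any trailing timestamp is ignored
        except ValueError:
            continue
        totals[name] = totals.get(name, 0.0) + value
    return totals


def mean_request_seconds(totals):
    """Approximate mean ASR request duration, or None if nothing has completed."""
    completed = totals.get("num_requests_asr_success_total", 0.0)
    if completed == 0:
        return None
    return totals.get("request_duration_seconds_asr_total", 0.0) / completed
```

Because both values are cumulative since startup, a monitoring system would normally compute this ratio over a time window rather than over the whole process lifetime.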

Error Format#

All 4xx errors return a JSON body. The shape of the body depends on the error source:

API-level validation errors (bad parameter value, missing language/model):

{
  "detail": "Bad Request, need model or language"
}

Request parsing errors (missing required file field):

{
  "error": {
    "message": "file: Field required",
    "type": "BadRequestError",
    "code": 400
  }
}
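Since the two error shapes differ, client code benefits from a single function that extracts a readable message from either. The following is a minimal sketch covering only the two shapes shown above, with `error_message` as a hypothetical helper name.

```python
import json


def error_message(body):
    """Return the human-readable message from a 4xx response body, or None."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return None
    if not isinstance(payload, dict):
        return None
    if "detail" in payload:
        # API-level validation errors: {"detail": "<reason>"}
        return str(payload["detail"])
    error = payload.get("error")
    if isinstance(error, dict) and "message" in error:
        # Request parsing errors: {"error": {"message": ..., "type": ..., "code": ...}}
        return str(error["message"])
    return None
```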

Port Configuration#

The HTTP port is configured with the NIM_HTTP_API_PORT environment variable (default: 9000). Avoid port 8000, which is reserved for the internal Triton HTTP endpoint.

docker run ... -e NIM_HTTP_API_PORT=9000 ...

For the complete list of runtime parameters, refer to Runtime Parameters.
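A client launched alongside the container can derive its base URL from the same variable. This sketch assumes the client process can see `NIM_HTTP_API_PORT` in its own environment, which is a deployment choice and not something the NIM guarantees. `base_url` is a hypothetical helper, and rejecting port 8000 outright is a strict reading of the advice above.

```python
import os


def base_url(host="localhost", env=None):
    """Build the HTTP base URL from NIM_HTTP_API_PORT, defaulting to 9000."""
    env = os.environ if env is None else env
    port = int(env.get("NIM_HTTP_API_PORT", "9000"))
    if port == 8000:
        # 8000 is reserved for the internal Triton HTTP endpoint.
        raise ValueError("port 8000 is reserved for the internal Triton HTTP endpoint")
    return f"http://{host}:{port}"
```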
