ASR HTTP REST API Reference#


Overview#

The ASR NIM exposes an HTTP REST API for offline speech recognition on the port set by NIM_HTTP_API_PORT (default 9000). All inference endpoints accept multipart/form-data requests. Use this API when you need simple curl-based access or language-agnostic client integration, or when you want to avoid a gRPC dependency.

Base URL:

http://<address>:9000

Streaming transcription and low-latency interactive use cases are served by the WebSocket Realtime API.


Endpoints#

POST /v1/audio/transcriptions#

Transcribes a complete audio file and returns the transcript in a single response. The entire file is sent in one request and no partial results are returned.

Content-Type: multipart/form-data

Request Parameters#

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | file | Yes | | Audio file to transcribe. Supported formats: WAV, OPUS, FLAC. |
| language | string | No | | BCP-47 language code for the audio (for example, en-US, es-US, zh-CN, multi). Either language or model must be provided; omitting both returns a 400 error. |
| model | string | No | | Internal model identifier. When provided, takes precedence over language for model selection. Use language for standard deployments. |
| response_format | string | No | json | Format of the response. Accepted values: json (returns {"text": "..."}) or text (returns a plain string). |
| temperature | number | No | | Decoding temperature. Accepted by the API; has no effect on deterministic CTC models. |

Examples#

Default JSON response:

curl -s http://localhost:9000/v1/audio/transcriptions \
  -F language=en-US \
  -F file="@recording.wav"

Plain-text response:

curl -s http://localhost:9000/v1/audio/transcriptions \
  -F language=en-US \
  -F response_format=text \
  -F file="@recording.wav"

Response#

response_format=json (default):

{
  "text": "What is natural language processing?"
}

response_format=text:

What is natural language processing?
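
For clients that cannot shell out to curl, the same request can be assembled with the Python standard library. The following is a minimal sketch, not an official client: the helper names build_multipart and transcribe are hypothetical, the base URL is taken from the curl examples, and sending the file part as application/octet-stream is an assumption about what the server tolerates.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(fields, filename, file_bytes):
    """Encode text fields plus one "file" part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    Assumes field values and the filename contain no quotes or newlines.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The audio goes in the form field named "file", as in the curl examples.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, language="en-US", response_format="json",
               base_url="http://localhost:9000"):
    """POST an audio file to /v1/audio/transcriptions and return the transcript."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"language": language, "response_format": response_format},
        filename=os.path.basename(path),
        file_bytes=audio,
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = response.read().decode("utf-8")
    # json responses wrap the transcript in {"text": ...}; text responses are the bare string.
    return json.loads(payload)["text"] if response_format == "json" else payload
```

Calling transcribe("recording.wav") against a running NIM mirrors the first curl example; passing response_format="text" mirrors the second.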

Status Codes#

| Code | Description |
| --- | --- |
| 200 OK | Transcription succeeded. |
| 400 Bad Request | A required parameter is missing, the response_format value is not supported, or the specified model is not recognized. Response body: {"detail": "<reason>"}. |
| 503 Service Unavailable | The NIM is still loading. Retry after polling /v1/health/ready. |


POST /v1/audio/translations#

Transcribes an audio file and translates the result into the specified target language, returning the translated text in a single response. Supported only by models that include a translation capability (Canary, Whisper).

Content-Type: multipart/form-data

Request Parameters#

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | file | Yes | | Audio file to transcribe and translate. Supported formats: WAV, OPUS, FLAC. |
| language | string | No | | BCP-47 source language code of the audio (for example, fr-FR, en-US). Either language or model must be provided. |
| target_language | string | No | | BCP-47 target language code for the translation output (for example, en-US, fr-FR). |
| model | string | No | | Internal model identifier. Use language for standard deployments. |
| response_format | string | No | json | json or text. |
| temperature | number | No | | Decoding temperature. |

Example#

curl -s http://localhost:9000/v1/audio/translations \
  -F language=fr-FR \
  -F target_language=en-US \
  -F file="@french_audio.wav"

Response#

Same structure as /v1/audio/transcriptions, with text containing the translated output:

{
  "text": "The translated text in the target language."
}
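
The parameter rules for this endpoint can also be checked on the client before the audio is uploaded. The sketch below is illustrative only: translation_fields is a hypothetical helper, and it encodes just the two constraints stated in this reference (language or model must be present, and response_format is json or text). The server remains the authority on which language pairs are supported.

```python
def translation_fields(language=None, target_language=None, model=None,
                       response_format="json", temperature=None):
    """Build the non-file form fields for POST /v1/audio/translations.

    Raises ValueError for combinations this reference says are rejected.
    """
    if language is None and model is None:
        raise ValueError("either language or model must be provided")
    if response_format not in ("json", "text"):
        raise ValueError("response_format must be 'json' or 'text'")
    candidates = {
        "language": language,
        "target_language": target_language,
        "model": model,
        "response_format": response_format,
        "temperature": temperature,
    }
    # Omit unset parameters so the server applies its own defaults.
    return {name: str(value) for name, value in candidates.items() if value is not None}
```

For the curl example above, translation_fields(language="fr-FR", target_language="en-US") yields the language, target_language, and response_format fields; the audio itself still travels as the multipart file part.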

Status Codes#

| Code | Description |
| --- | --- |
| 200 OK | Translation succeeded. |
| 400 Bad Request | Missing parameter or unsupported language pair. Response body: {"detail": "<reason>"}. |
| 503 Service Unavailable | The NIM is not ready. |


GET /v1/health/ready#

Returns the readiness state of the NIM. Use this endpoint to determine when the service has finished loading models and is ready to serve inference requests.

Example#

curl http://localhost:9000/v1/health/ready

Response#

{
  "object": "health.response",
  "message": "ready",
  "status": "ready"
}

Status Codes#

| Code | Description |
| --- | --- |
| 200 OK | The NIM is healthy and ready to accept requests. |
| 503 Service Unavailable | The NIM is still initializing. |
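
A startup script can poll this endpoint before sending inference requests. The following sketch assumes the default base URL from the examples; the function names, the 300-second timeout, and the 2-second polling interval are illustrative choices, not values prescribed by the NIM.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url="http://localhost:9000"):
    """Return True when GET /v1/health/ready answers 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/health/ready", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # A 503 raises HTTPError (a URLError subclass); refused connections
        # and timeouts also end up here while the container is starting.
        return False


def wait_until_ready(probe=is_ready, timeout_s=300.0, interval_s=2.0, sleep=time.sleep):
    """Poll probe() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The probe and sleep parameters are injectable so the loop can be exercised without a running server.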


GET /v1/health/live#

Returns the liveness state of the NIM process. Used by container orchestrators to determine whether the container should be restarted.

Example#

curl http://localhost:9000/v1/health/live

Response#

{
  "object": "health.response",
  "message": "live",
  "status": "live"
}

Status Codes#

| Code | Description |
| --- | --- |
| 200 OK | The process is alive. |
| 503 Service Unavailable | The process is not responding. |


GET /v1/models#

Returns the list of available models in OpenAI-compatible format.

Example#

curl -s http://localhost:9000/v1/models | jq .

Response#

{
  "object": "list",
  "data": [
    {
      "id": "unknown",
      "object": "model",
      "created": 0,
      "owned_by": "system"
    }
  ]
}
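
Because the response follows the OpenAI list shape, extracting model identifiers takes only a few lines. This is a sketch with hypothetical helper names (model_ids, list_models) and the default base URL from the examples.

```python
import json
import urllib.request


def model_ids(payload):
    """Return the id of every "model" entry in an OpenAI-style list payload."""
    return [
        entry["id"]
        for entry in payload.get("data", [])
        if entry.get("object") == "model"
    ]


def list_models(base_url="http://localhost:9000"):
    """Fetch GET /v1/models and return the model ids it reports."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        return model_ids(json.load(resp))
```

Applied to the sample response above, model_ids returns a single entry, "unknown".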

GET /v1/version#

Returns the NIM release and API version.

Example#

curl -s http://localhost:9000/v1/version | jq .

Response#

{
  "release": "1.5.1",
  "api": "3.1.0"
}

GET /v1/metadata#

Returns metadata about the deployed model, including the selected profile ID and NGC model URLs.

Example#

curl -s http://localhost:9000/v1/metadata | jq .

Response#

{
  "version": "1.5.1",
  "selectedModelProfileId": "<profile-hash>",
  "modelInfo": [
    {
      "modelUrl": "ngc://nim/nvidia/parakeet-1-1b-ctc-riva:<tag>",
      "shortName": "parakeet-1-1b-ctc-riva:<tag>"
    }
  ],
  "repository_override": "",
  "assetInfo": [],
  "licenseInfo": {}
}

GET /v1/metrics#

Returns runtime metrics in Prometheus text format.

Example#

curl http://localhost:9000/v1/metrics

Key Metrics#

| Metric | Description |
| --- | --- |
| num_requests_asr_total | Total ASR requests received since startup. |
| num_requests_asr_running | ASR requests currently being processed. |
| num_requests_asr_success_total | Total successful ASR requests. |
| request_duration_seconds_asr_total | Cumulative wall-clock time spent on ASR requests, in seconds. |
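
These counters can be combined into a rough mean request duration. The sketch below parses simple Prometheus text lines and divides the cumulative duration by the total request count. It rests on several assumptions: samples carry no trailing timestamps, values for the same metric name across label sets can be summed, and num_requests_asr_total is the right denominator (this reference does not say whether the duration counter includes failed requests). The SAMPLE text is invented for illustration and is not real NIM output.

```python
def parse_metrics(text):
    """Sum sample values per metric name from Prometheus text exposition lines."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token (no timestamps assumed).
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]  # drop any {label="..."} suffix
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def mean_request_seconds(totals):
    """Return cumulative duration divided by request count, or None if no requests."""
    count = totals.get("num_requests_asr_total", 0.0)
    if count == 0:
        return None
    return totals.get("request_duration_seconds_asr_total", 0.0) / count


# Invented sample, for illustration only.
SAMPLE = """\
# TYPE num_requests_asr_total counter
num_requests_asr_total 4.0
request_duration_seconds_asr_total 6.0
"""
```

For production monitoring, a Prometheus server scraping /v1/metrics and computing rates over time is the more usual approach; this helper only suits ad hoc checks.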


Error Format#

All 4xx errors return a JSON body. The field name depends on the error source:

API-level validation errors (bad parameter value, missing language/model):

{
  "detail": "Bad Request, need model or language"
}

Request parsing errors (missing required file field):

{
  "error": {
    "message": "file: Field required",
    "type": "BadRequestError",
    "code": 400
  }
}
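
Since the two shapes differ, a client needs to look in both places to surface a readable reason. The helper below is a minimal sketch (error_message is a hypothetical name); it falls back to the raw body when neither shape matches.

```python
import json


def error_message(body):
    """Return the human-readable reason from either documented 4xx body shape."""
    try:
        data = json.loads(body)
    except ValueError:
        return body  # not JSON; return the raw text unchanged
    if isinstance(data, dict):
        # API-level validation errors: {"detail": "..."}
        if "detail" in data:
            return str(data["detail"])
        # Request parsing errors: {"error": {"message": "...", ...}}
        error = data.get("error")
        if isinstance(error, dict) and "message" in error:
            return str(error["message"])
    return body
```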

Port Configuration#

The HTTP port is configured with the NIM_HTTP_API_PORT environment variable (default: 9000). Avoid port 8000, which is reserved for the internal Triton HTTP endpoint.

docker run ... -e NIM_HTTP_API_PORT=9000 ...

For the complete list of runtime parameters, refer to Runtime Parameters.