ASR HTTP REST API Reference
Overview
The ASR NIM exposes an HTTP REST API for offline speech recognition on the port set by NIM_HTTP_API_PORT (default 9000). All inference endpoints accept multipart/form-data requests. Use this API when you need simple curl-based access or language-agnostic client integration, or when you want to avoid a gRPC dependency.
Base URL:
http://<address>:9000
Streaming transcription and low-latency interactive use cases are served by the WebSocket Realtime API.
Endpoints
POST /v1/audio/transcriptions
Transcribes a complete audio file and returns the transcript in a single response. The entire file is sent in one request and no partial results are returned.
Content-Type: multipart/form-data
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | Yes | — | Audio file to transcribe. Supported formats: WAV, OPUS, FLAC. |
| language | string | No | — | BCP-47 language code for the audio (for example, en-US). |
| model | string | No | — | Internal model identifier. When provided, takes precedence over |
| response_format | string | No | json | Format of the response. Accepted values include json and text. |
| temperature | number | No | — | Decoding temperature. Accepted by the API; has no effect on deterministic CTC models. |
Examples
Default JSON response:
curl -s http://localhost:9000/v1/audio/transcriptions \
  -F language=en-US \
  -F file="@recording.wav"
Plain-text response:
curl -s http://localhost:9000/v1/audio/transcriptions \
  -F language=en-US \
  -F response_format=text \
  -F file="@recording.wav"
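For clients that cannot shell out to curl, the same request can be sketched with the Python standard library alone. This is a minimal illustration, not an official client: the helper names `encode_multipart` and `transcribe` are hypothetical, and sending the file part as `application/octet-stream` is an assumption about what the server accepts.

```python
import io
import json
import os
import urllib.request
import uuid


def encode_multipart(fields, filename, audio_bytes):
    """Encode text fields plus one `file` part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # One part per plain form field, for example language=en-US.
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    # The audio itself goes in the part named "file".
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    # Assumption: a generic binary content type is acceptable for the audio part.
    buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
    buf.write(audio_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def transcribe(path, language="en-US", base_url="http://localhost:9000"):
    """POST an audio file to /v1/audio/transcriptions and return the transcript text."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = encode_multipart(
        {"language": language}, os.path.basename(path), audio
    )
    req = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The default response format is JSON with a single "text" field.
        return json.loads(resp.read())["text"]
```

Calling `transcribe("recording.wav")` against a running NIM would correspond to the first curl example above.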
Response
response_format=json (default):
{
  "text": "What is natural language processing?"
}
response_format=text:
What is natural language processing?
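Because the body shape depends on response_format, a client has to branch when reading it. The sketch below assumes the body is UTF-8 encoded and that trailing whitespace in the plain-text form can be discarded; `extract_transcript` is a hypothetical helper name.

```python
import json


def extract_transcript(body: bytes, response_format: str = "json") -> str:
    """Return the transcript from a response body for the given response_format."""
    text = body.decode("utf-8")  # assumption: responses are UTF-8
    if response_format == "json":
        # JSON form: {"text": "..."}
        return json.loads(text)["text"]
    if response_format == "text":
        # Plain-text form: the body is the transcript itself.
        return text.strip()
    raise ValueError(f"unhandled response_format: {response_format!r}")
```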
Status Codes
| Code | Description |
|---|---|
| 200 | Transcription succeeded. |
| 400 | A required parameter is missing, the |
|  | The NIM is still loading. Retry after polling /v1/health/ready. |
POST /v1/audio/translations
Transcribes an audio file and translates the result into the specified target language, returning the translated text in a single response. Supported only by models that include a translation capability (Canary, Whisper).
Content-Type: multipart/form-data
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | Yes | — | Audio file to transcribe and translate. Supported formats: WAV, OPUS, FLAC. |
| language | string | No | — | BCP-47 source language code of the audio (for example, fr-FR). |
| target_language | string | No | — | BCP-47 target language code for the translation output (for example, en-US). |
| model | string | No | — | Internal model identifier. Use |
| response_format | string | No |  |  |
| temperature | number | No | — | Decoding temperature. |
Example
curl -s http://localhost:9000/v1/audio/translations \
  -F language=fr-FR \
  -F target_language=en-US \
  -F file="@french_audio.wav"
Response
Same structure as /v1/audio/transcriptions, with text containing the translated output:
{
  "text": "The translated text in the target language."
}
Status Codes
| Code | Description |
|---|---|
| 200 | Translation succeeded. |
| 400 | Missing parameter or unsupported language pair. Response body: |
|  | The NIM is not ready. |
GET /v1/health/ready
Returns the readiness state of the NIM. Use this endpoint to determine when the service has finished loading models and is ready to serve inference requests.
Example
curl http://localhost:9000/v1/health/ready
Response
{
  "object": "health.response",
  "message": "ready",
  "status": "ready"
}
Status Codes
| Code | Description |
|---|---|
| 200 | The NIM is healthy and ready to accept requests. |
|  | The NIM is still initializing. |
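A client that starts alongside the container usually needs to wait for readiness before sending audio. One way to do that is a simple polling loop, sketched here. The function names, the attempt count, and the delay are illustrative assumptions; any connection failure, non-2xx status, or unparsable body is treated as "not ready".

```python
import json
import time
import urllib.error
import urllib.request


def probe_ready(base_url="http://localhost:9000", timeout=5.0):
    """Return True if GET /v1/health/ready reports status "ready"."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/health/ready", timeout=timeout) as resp:
            return json.loads(resp.read()).get("status") == "ready"
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON body.
        return False


def wait_until_ready(probe=probe_ready, attempts=60, delay=5.0, sleep=time.sleep):
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Passing `probe` and `sleep` as parameters keeps the loop testable without a running server.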
GET /v1/health/live
Returns the liveness state of the NIM process. Container orchestrators use this endpoint to determine whether the container should be restarted.
Example
curl http://localhost:9000/v1/health/live
Response
{
  "object": "health.response",
  "message": "live",
  "status": "live"
}
Status Codes
| Code | Description |
|---|---|
| 200 | The process is alive. |
|  | The process is not responding. |
GET /v1/models
Returns the list of available models in OpenAI-compatible format.
Example
curl -s http://localhost:9000/v1/models | jq .
Response
{
  "object": "list",
  "data": [
    {
      "id": "unknown",
      "object": "model",
      "created": 0,
      "owned_by": "system"
    }
  ]
}
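Client code typically needs only the model identifiers from this list. A small sketch, assuming the response always follows the shape shown above; `model_ids` is a hypothetical helper name.

```python
def model_ids(payload: dict) -> list[str]:
    """Return the `id` of every entry in a /v1/models list response."""
    if payload.get("object") != "list":
        raise ValueError("expected a response with object == 'list'")
    # Keep only entries that identify themselves as models.
    return [
        entry["id"]
        for entry in payload.get("data", [])
        if entry.get("object") == "model"
    ]
```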
GET /v1/version
Returns the NIM release and API version.
Example
curl -s http://localhost:9000/v1/version | jq .
Response
{
  "release": "1.5.1",
  "api": "3.1.0"
}
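A client can use this response to guard against running on a release older than it expects. The sketch below assumes both fields are dotted numeric strings, as in the sample; pre-release suffixes are not handled. The helper names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted numeric version such as "1.5.1" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(version_payload: dict, min_release: str) -> bool:
    """Return True if the reported release is at least `min_release`."""
    # Tuples compare element by element, so 1.10.0 correctly sorts after 1.5.1,
    # which a plain string comparison would get wrong.
    return parse_version(version_payload["release"]) >= parse_version(min_release)
```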
GET /v1/metadata
Returns metadata about the deployed model, including the selected profile ID and NGC model URLs.
Example
curl -s http://localhost:9000/v1/metadata | jq .
Response
{
  "version": "1.5.1",
  "selectedModelProfileId": "<profile-hash>",
  "modelInfo": [
    {
      "modelUrl": "ngc://nim/nvidia/parakeet-1-1b-ctc-riva:<tag>",
      "shortName": "parakeet-1-1b-ctc-riva:<tag>"
    }
  ],
  "repository_override": "",
  "assetInfo": [],
  "licenseInfo": {}
}
GET /v1/metrics
Returns runtime metrics in Prometheus text format.
Example
curl http://localhost:9000/v1/metrics
Key Metrics
| Metric | Description |
|---|---|
|  | Total ASR requests received since startup. |
|  | ASR requests currently being processed. |
|  | Total successful ASR requests. |
|  | Cumulative wall-clock time spent on ASR requests, in seconds. |
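Outside a Prometheus server, the text format can be read with a few lines of code, for example to derive a rough average time per request from the cumulative counters. This is a simplified sketch: it assumes one sample per line with no trailing timestamp, and the metric names in the usage example are placeholders invented for illustration, not the names the NIM actually exports.

```python
def parse_prometheus(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into {series_name: value}.

    Comment lines (# HELP, # TYPE) are skipped. The series name keeps any
    {label="..."} suffix. Assumes samples carry no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def ratio(samples: dict[str, float], numerator: str, denominator: str):
    """Return samples[numerator] / samples[denominator], or None if undefined."""
    denom = samples.get(denominator, 0.0)
    if denom == 0:
        return None
    return samples[numerator] / denom
```

With placeholder names, `ratio(parse_prometheus(body), "hypothetical_asr_seconds_total", "hypothetical_asr_requests_total")` would give the mean seconds per request since startup.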
Error Format
All 4xx errors return a JSON body. The field name depends on the error source:
API-level validation errors (bad parameter value, missing language/model):
{
  "detail": "Bad Request, need model or language"
}
Request parsing errors (missing required file field):
{
  "error": {
    "message": "file: Field required",
    "type": "BadRequestError",
    "code": 400
  }
}
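Since the two error shapes use different field names, a client may want to normalize them to a single message string. A minimal sketch covering only the two shapes shown above; `error_message` is a hypothetical helper, and falling back to the serialized payload for any other shape is a design choice of this example.

```python
import json


def error_message(payload: dict) -> str:
    """Extract a human-readable message from either 4xx error shape."""
    # API-level validation errors: {"detail": "..."}
    if "detail" in payload:
        return str(payload["detail"])
    # Request parsing errors: {"error": {"message": "...", ...}}
    err = payload.get("error")
    if isinstance(err, dict) and "message" in err:
        return str(err["message"])
    # Unknown shape: return the raw payload so nothing is silently lost.
    return json.dumps(payload)
```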
Port Configuration
The HTTP port is configured with the NIM_HTTP_API_PORT environment variable (default: 9000). Avoid port 8000, which is reserved for the internal Triton HTTP endpoint.
docker run ... -e NIM_HTTP_API_PORT=9000 ...
For the complete list of runtime parameters, refer to Runtime Parameters.
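On the client side, the base URL can be derived from the same variable so that client and container stay in sync. This is a sketch under assumptions: the client process is assumed to see NIM_HTTP_API_PORT in its own environment, which depends on the deployment, and rejecting port 8000 in the client is an illustrative safeguard rather than documented behavior. `base_url` is a hypothetical helper name.

```python
import os


def base_url(host: str = "localhost", environ=os.environ) -> str:
    """Build the HTTP base URL from NIM_HTTP_API_PORT (default 9000)."""
    port = int(environ.get("NIM_HTTP_API_PORT", "9000"))
    if port == 8000:
        # Port 8000 is reserved for the internal Triton HTTP endpoint.
        raise ValueError("port 8000 is reserved; choose another NIM_HTTP_API_PORT")
    return f"http://{host}:{port}"
```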