Run LLM Translation
Use this guide when the backend setting must stay llm and you need to point the nemotron steps run translate/nemo_curator command at an OpenAI-compatible chat-completions endpoint and model.
Prerequisites
Set NVIDIA_API_KEY in your shell when relying on the default server.api_key_env, for example export NVIDIA_API_KEY="<api-key>".
Confirm that server.url matches your deployment. The default.yaml file targets the NVIDIA integrate API.
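
As a pre-flight check, a minimal shell sketch like the following can fail fast when the key variable is missing. The helper name check_api_key is hypothetical and not part of nemotron; it assumes a POSIX-compatible shell and defaults to NVIDIA_API_KEY, the variable named above.

```shell
# Hypothetical helper (not part of nemotron): fail early when the API key
# variable is unset or empty. Pass another variable name as the first
# argument if you changed server.api_key_env.
check_api_key() {
  var_name="${1:-NVIDIA_API_KEY}"
  # Indirect lookup of the variable named by $var_name (POSIX-compatible).
  eval "value=\${$var_name:-}"
  if [ -z "$value" ]; then
    echo "error: $var_name is not set" >&2
    return 1
  fi
  echo "ok: $var_name is set"
}
```

Calling check_api_key before the run surfaces a missing key locally instead of as an authentication error from the endpoint partway through a job.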
Procedure
Start from default.yaml with -c default.
Override the model and languages:
uv run nemotron steps run translate/nemo_curator -c default \
backend=llm \
input_path=/path/to/chat.jsonl \
output_dir=/path/to/out \
source_language=en \
target_language=de \
server.model=YOUR_LLM_MODEL_ID
Adjust max_concurrent_requests upward only after verifying that the endpoint tolerates parallel completions.
Hosted Model Hygiene
Hosted catalogs retire models frequently. Before large batch jobs, pin server.model to an identifier your tenant currently exposes.
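
One way to check an identifier before a large job is to look for it in the endpoint's model listing. The sketch below assumes your deployment serves the OpenAI-style model list (a JSON object with a data array of entries carrying an id field), which is common for OpenAI-compatible servers but not confirmed by this guide. The helper name model_is_listed and the BASE_URL variable are hypothetical.

```shell
# Hypothetical helper: succeed if the given model ID appears as an "id"
# value in an OpenAI-style model listing read from stdin.
# Whitespace is stripped first so compact and pretty-printed JSON both match;
# grep -F treats the ID as a fixed string, so "." and "/" are literal.
model_is_listed() {
  tr -d '[:space:]' | grep -qF "\"id\":\"$1\""
}

# Usage against a live endpoint (assumption: BASE_URL holds the same value
# as server.url and the deployment serves a /models route):
#   curl -s -H "Authorization: Bearer $NVIDIA_API_KEY" "$BASE_URL/models" \
#     | model_is_listed "YOUR_LLM_MODEL_ID" || echo "model not listed" >&2
```

The match is exact on the full identifier, so a retired or renamed model fails the check even when a similarly named one is still listed.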