Troubleshooting — Nemotron

Unknown column type: 'person' or similar ValueError

Cause: The YAML declares a column type that step.py’s build_columns() does not recognise. Currently supported types: category, seed, llm_text, llm_structured, llm_judge.

Solution: Check the spelling. For person and datetime sampler support, step.py must be extended — see the extension reference in Config Schema.
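A pre-flight check along these lines can catch a misspelled or unsupported type before the step runs. This is a minimal sketch, not the code in step.py: it assumes the parsed YAML is a dict whose columns entries each carry name and type keys, and the helper find_unknown_types is hypothetical.

```python
# Types listed as supported in this section. The config layout is an assumption.
SUPPORTED_TYPES = {"category", "seed", "llm_text", "llm_structured", "llm_judge"}


def find_unknown_types(config):
    """Return (column name, type) pairs whose type is not in SUPPORTED_TYPES."""
    unknown = []
    for column in config.get("columns") or []:
        col_type = column.get("type")
        if col_type not in SUPPORTED_TYPES:
            unknown.append((column.get("name"), col_type))
    return unknown


# Example config, as it might look after YAML parsing, with one unsupported type.
example_config = {
    "columns": [
        {"name": "topic", "type": "category"},
        {"name": "author", "type": "person"},
    ]
}
```

Calling find_unknown_types(example_config) reports the author column, so the typo or unsupported sampler shows up before any model call is made.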

config must declare a non-empty columns: list

Cause: The YAML has an empty or missing columns: block.

Solution: Add at least one column spec. A minimal config must include at least one llm_text or llm_structured column that produces output content.
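The two conditions above can be checked on the parsed config before launching a job. The sketch below is illustrative only: check_columns is a hypothetical helper, and the dict layout (a columns list of entries with a type key) is assumed rather than taken from step.py.

```python
# Column types that this section says produce output content.
OUTPUT_TYPES = {"llm_text", "llm_structured"}


def check_columns(config):
    """Return a list of problems with the columns: block (empty if none found)."""
    columns = config.get("columns")
    # A missing key, a null value, and an empty list are all treated the same way.
    if not isinstance(columns, list) or not columns:
        return ["columns: is missing or empty"]
    if not any(column.get("type") in OUTPUT_TYPES for column in columns):
        return ["no llm_text or llm_structured column produces output content"]
    return []
```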

Jinja2 template references an undefined variable

Cause: A prompt uses {{ column_name }}, but column_name is not a declared column, a seed field in seed_dataset.fields, or an earlier column in the list.

Solution: Add the missing column or seed field, or fix the typo. Run with preview=true num_records=2 to catch this cheaply before a full generation job.
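Undefined references can also be spotted statically, without calling a model. The following is a rough sketch under several assumptions: each column dict has a name and an optional prompt key, seed fields are passed in as a plain list, and only simple {{ name }} placeholders are recognised. It uses a regular expression rather than the Jinja2 parser, so it does not understand loop variables, set blocks, or other template-local names. The function find_undefined_references is hypothetical.

```python
import re

# Captures the leading identifier of a placeholder such as {{ topic }},
# {{ topic.name }} or {{ topic | upper }}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def find_undefined_references(seed_fields, columns):
    """Return (column name, referenced name) pairs that resolve to nothing."""
    # Names a prompt may use: seed fields plus every declared column name.
    available = set(seed_fields) | {column["name"] for column in columns}
    problems = []
    for column in columns:
        prompt = column.get("prompt") or ""
        for name in PLACEHOLDER.findall(prompt):
            if name not in available:
                problems.append((column["name"], name))
    return problems


example_columns = [
    {"name": "topic", "type": "category"},
    {"name": "question", "type": "llm_text",
     "prompt": "Ask about {{ topic }} for {{ persona }}."},
    {"name": "answer", "type": "llm_text",
     "prompt": "Answer {{ question }} using {{ source_text }} on {{ topc }}."},
]
```

With source_text as the only seed field, the example flags persona (never declared) and topc (a typo for topic).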

Model health check fails at startup

Cause: Data Designer probes the model endpoint at startup. If the model is not available from the configured provider, or if NVIDIA_API_KEY is not set, the probe fails and the step exits before generating any records.

Solution:

Confirm that NVIDIA_API_KEY is exported in the shell that launches the step (export NVIDIA_API_KEY="...").
Add skip_health_check: true to the model spec to bypass the probe (useful for local or vLLM endpoints that aren’t in the provider catalog).
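To rule out the missing-key case quickly, the environment can be inspected from Python in the same shell that launches the step. This is a small sketch using only the standard library; api_key_problem is a hypothetical helper, and it only checks that the variable is visible to the process, not that the key is valid for the provider.

```python
import os


def api_key_problem(environ=None):
    """Return a description of the problem with NVIDIA_API_KEY, or None if it is set."""
    env = os.environ if environ is None else environ
    value = env.get("NVIDIA_API_KEY")
    if value is None:
        # Typical when the variable was assigned without `export`.
        return "NVIDIA_API_KEY is not set"
    if not value.strip():
        return "NVIDIA_API_KEY is set but empty"
    return None
```

Passing a mapping explicitly (for example api_key_problem({})) makes the check easy to exercise without touching the real environment.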

Output JSONL is empty or has fewer records than num_records

Cause: Data Designer skips or drops records where the structured output doesn’t validate against output_format, or where the LLM returns a refusal.

Solution:

Run with preview=true and inspect a sample for refusals or schema mismatches.
Simplify the output_format if the model consistently fails to match a complex schema.
Raise max_tokens if responses are being cut off mid-JSON.
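When the output looks short, a quick count of parseable records shows how many are missing relative to num_records. The sketch below assumes one JSON object per line in the output file; summarize_jsonl is a hypothetical helper and does not reflect how Data Designer itself validates records.

```python
import json


def summarize_jsonl(lines, expected):
    """Count parseable JSONL records and report the shortfall against `expected`."""
    parsed = 0
    bad_lines = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # Blank lines are not records.
        try:
            json.loads(line)
            parsed += 1
        except json.JSONDecodeError:
            # A line cut off mid-JSON ends up here.
            bad_lines.append(number)
    return {
        "records": parsed,
        "missing": max(expected - parsed, 0),
        "bad_lines": bad_lines,
    }
```

Any iterable of lines works, so an open file handle for the output JSONL can be passed directly, with expected set to the num_records value used for the run.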

