Use Text-to-SQL Models with NVIDIA NIM for LLMs#

Text-to-SQL models translate natural language questions into executable SQL queries over a given database schema. With NVIDIA NIM for LLMs, you can generate queries by providing the model with your table definitions using Data Definition Language (DDL) and instructions on how to generate the SQL.

Text-to-SQL models include the following:

The following example Python request:

  • Provides the relevant DDL so the model understands available tables/columns

  • Instructs the model to return only SQL (or None if the question is out of scope)

  • Uses deterministic settings (for example, temperature=0) for reproducible queries

from openai import OpenAI

client = OpenAI(api_key="my-api-key", base_url="http://0.0.0.0:8000/v1")

prompt = """\
Based on DDL statements, instructions, and the current date, generate a SQL query in sqlite to answer the question.
If the question cannot be answered using the available tables and columns in the DDL (i.e., it is out of scope), return only: None.
Today is 2025-12-31 23:59:00
DDL statements:

DROP TABLE IF EXISTS diagnosis;
CREATE TABLE diagnosis
(
    diagnosisid INT NOT NULL PRIMARY KEY,
    patientunitstayid INT NOT NULL,
    diagnosisname VARCHAR(200) NOT NULL,
    diagnosistime TIMESTAMP(0) NOT NULL,
    icd9code VARCHAR(100)
);

Instructions:
- Respond only with the SQL query in markdown format. If unsure, reply with "None".
question: How many patients have been diagnosed?
"""

response = client.chat.completions.create(
    model="nvidia/llama-3.1-nemotron-nano-8b-healthcare-text2sql-v1.0",
    messages=[
        {"role": "system", "content": "detailed thinking off."},
        {"role": "user", "content": prompt},
    ],
    temperature=0,
    top_p=1,
)

raw_sql_query = response.choices[0].message.content
print(raw_sql_query)

Example output:

SELECT COUNT(DISTINCT diagnosis.patientunitstayid) FROM diagnosis;