# Using Reward Models
NIM LLM supports deploying large language reward models, in addition to chat and completion models. Reward models are often used to score the outputs of another large language model, either to further fine-tune that model or to filter synthetically created datasets.
To send text to a reward model, use the `chat/completions` endpoint as you would with other kinds of models. Include the prompt that was used to generate the text as the first `user` message content, and the response from the model as the `assistant` message content. The reward model scores the provided response, taking into account the query that generated it. For example:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

original_query = "I am going to Paris, what should I see?"
original_response = "Ah, Paris, the City of Light! There are so many amazing things to see and do in this beautiful city ..."

messages = [
    {"role": "user", "content": original_query},
    {"role": "assistant", "content": original_response},
]

response = client.chat.completions.create(
    model="nvidia/nemotron-4-340b-reward",
    messages=messages,
    stream=False,
)
```
The response from NIM will include attribute and score pairs in the message content, where a regular chat completion model would return its generated text. The attributes that a reward model scores responses on are specific to each reward model. Reward models trained on the HelpSteer dataset (like `nemotron-4-340b-reward`) score responses according to the following metrics:
- Helpfulness
- Correctness
- Coherence
- Complexity
- Verbosity
You can use this response in your downstream applications. For example, you may want to parse the scores into a Python dictionary:
```python
response_content = response.choices[0].message.content

# The content is a comma-separated list of "attribute:score" pairs.
reward_pairs = [pair.split(":") for pair in response_content.split(",")]
reward_dict = {attribute: float(score) for attribute, score in reward_pairs}
print(reward_dict)
# Prints:
# {'helpfulness': 1.2578125, 'correctness': 0.43359375, 'coherence': 3.34375, 'complexity': 0.045166015625, 'verbosity': 0.6953125}
```
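A common downstream use of these scores is ranking several candidate responses to the same prompt, for example when filtering a synthetic dataset. The sketch below assumes you have already scored each candidate with the reward model and parsed the scores into dictionaries as shown above; the `best_response` helper and its ranking by `helpfulness` alone are illustrative choices, not part of the NIM API.

```python
def best_response(candidates):
    """Return the candidate response text with the highest reward score.

    `candidates` is a list of (response_text, reward_dict) pairs, where each
    reward_dict maps attribute names to floats as parsed above. This sketch
    ranks by 'helpfulness' alone; a weighted combination of several
    attributes works the same way.
    """
    return max(candidates, key=lambda pair: pair[1]["helpfulness"])[0]

# Hypothetical parsed scores for two candidate answers to the same prompt.
candidates = [
    ("Visit the Eiffel Tower and the Louvre.", {"helpfulness": 1.3, "verbosity": 0.2}),
    ("Paris has landmarks.", {"helpfulness": 0.4, "verbosity": 0.1}),
]
print(best_response(candidates))  # Prints: Visit the Eiffel Tower and the Louvre.
```

The same pattern extends to dataset filtering: keep only the generated samples whose score on the attribute you care about clears a threshold.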