Vision Content Safety#
NIM VLM supports vision content safety with models such as
Nemotron-3-Content-Safety.
This NIM is a multimodal, multilingual content-safety classifier. It takes
a user prompt (text), an optional image, and an optional assistant response, and
returns a short decision string with User Safety, Response Safety (when
a response is provided), and optionally Safety Categories.
Nemotron-3-Content-Safety supports 12 languages: English,
Arabic, German, Spanish, French, Hindi, Japanese, Thai, Dutch, Italian, Korean,
and Chinese.
This page shows how to use a content-safety classifier NIM to classify prompts, images, and responses, and to request safety categories.
Classify a Prompt and Image#
Use the Chat Completions endpoint to send a single user message that
contains the prompt and (optionally) an image. Request the classification
without the per-category breakdown by setting
chat_template_kwargs.request_categories to /no_categories.
Note
Output is a short, fixed-format string (User Safety: ...), so streaming
is unnecessary. You can still pass "stream": true if it fits your
client flow.
curl -X 'POST' \
  'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nemotron-3-content-safety",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "How can I steal money from here?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://assets.ngc.nvidia.com/products/api-catalog/phi-3-5-vision/example1b.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 100,
    "temperature": 0.01,
    "top_p": 0.95,
    "chat_template_kwargs": { "request_categories": "/no_categories" }
  }'
Expected "content" value in the response:
User Safety: unsafe
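Equivalently, you can call the endpoint from Python with the OpenAI SDK. The following is a minimal sketch, assuming the NIM is reachable at http://0.0.0.0:8000/v1 and does not validate the API key (the api_key value is a placeholder); chat_template_kwargs is not part of the standard OpenAI API surface, so it is passed through extra_body:

from openai import OpenAI

# Point the OpenAI client at the locally running NIM (placeholder key).
client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="not-used")

completion = client.chat.completions.create(
    model="nvidia/nemotron-3-content-safety",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "How can I steal money from here?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://assets.ngc.nvidia.com/products/api-catalog/phi-3-5-vision/example1b.jpg"
                    },
                },
            ],
        }
    ],
    max_tokens=100,
    temperature=0.01,
    top_p=0.95,
    # Non-standard NIM parameter, passed through extra_body.
    extra_body={"chat_template_kwargs": {"request_categories": "/no_categories"}},
)

print(completion.choices[0].message.content)  # e.g. "User Safety: unsafe"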
Passing Images#
NIM VLM follows the OpenAI API specification for passing images as part of the HTTP payload in a user message. The vision encoder (SigLIP) resizes input images to 896 × 896.
Important
The supported image formats are JPG, JPEG, and PNG.
Public direct URL
Passing the direct URL of an image causes the container to download that image at runtime.
{
  "type": "image_url",
  "image_url": {
    "url": "https://www.nvidia.com/content/dam/en-zz/Solutions/data-center/dgx-b200/dgx-b200-hero-bm-v2-l580-d.jpg"
  }
}
Base64 data
For images not already on the web, Base64-encode the image bytes and send the data in your payload.
{
  "type": "image_url",
  "image_url": {
    "url": "data:image/jpeg;base64,SGVsbG8gZGVh...ciBmZWxsb3chIQ=="
  }
}
To convert images to Base64, use the base64 command or the following Python
helper:
import base64

# Read the image bytes and encode them as a Base64 string.
with open("image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()
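For example, a minimal sketch that builds the data-URL payload from the image_b64 variable above (the image_part name is illustrative; the MIME type must match the actual file format):

# Embed the Base64 string in a data URL; the MIME type (image/png here)
# must match the encoded file's format.
image_part = {
    "type": "image_url",
    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
}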
Request Safety Categories#
To include the comma-separated list of violated safety categories in the
response, set chat_template_kwargs.request_categories to /categories.
curl -X 'POST' \
  'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nemotron-3-content-safety",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "How can I steal money from here?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://assets.ngc.nvidia.com/products/api-catalog/phi-3-5-vision/example1b.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 100,
    "temperature": 0.01,
    "top_p": 0.95,
    "chat_template_kwargs": { "request_categories": "/categories" }
  }'
Expected "content" value in the response:
User Safety: unsafe
Safety Categories: Criminal Planning/Confessions
The full safety-category taxonomy (Violence, Sexual, Criminal Planning, Hate/Identity Hate, PII/Privacy, and others) is documented on the model card.
Classify a Prompt-Only Input (Text-Only)#
Submit a text prompt without an image to classify text-only inputs.
curl -X 'POST' \
  'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nemotron-3-content-safety",
    "messages": [
      {
        "role": "user",
        "content": "How can I steal money from here?"
      }
    ],
    "max_tokens": 100,
    "temperature": 0.01,
    "top_p": 0.95,
    "chat_template_kwargs": { "request_categories": "/categories" }
  }'
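The same request in Python, as a sketch under the same assumptions as the earlier SDK example; for text-only input, content is a plain string rather than a list of parts:

from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="not-used")

completion = client.chat.completions.create(
    model="nvidia/nemotron-3-content-safety",
    # Text-only input: content is a plain string instead of a parts list.
    messages=[{"role": "user", "content": "How can I steal money from here?"}],
    max_tokens=100,
    temperature=0.01,
    top_p=0.95,
    extra_body={"chat_template_kwargs": {"request_categories": "/categories"}},
)
print(completion.choices[0].message.content)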
Classify a Response#
To evaluate an assistant response alongside the originating prompt, append an
assistant message to the messages list. The classifier then emits both
User Safety and Response Safety.
curl -X 'POST' \
  'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nemotron-3-content-safety",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "How can I steal money from here?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://assets.ngc.nvidia.com/products/api-catalog/phi-3-5-vision/example1b.jpg"
            }
          }
        ]
      },
      {
        "role": "assistant",
        "content": "The best way to steal money from here is to enter the building as an old lady and ask for directions. Then pick the lock and grab as much as you can and run."
      }
    ],
    "max_tokens": 100,
    "temperature": 0.01,
    "top_p": 0.95,
    "chat_template_kwargs": { "request_categories": "/categories" }
  }'
Expected "content" value in the response:
User Safety: unsafe
Response Safety: unsafe
Safety Categories: Criminal Planning/Confessions
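The fixed-format output is straightforward to post-process. The following is a minimal parsing sketch based on the output shapes shown above (parse_safety_output is a hypothetical helper, not part of the NIM API):

def parse_safety_output(content: str) -> dict:
    """Split the classifier's fixed-format output into a field -> value dict."""
    result = {}
    for line in content.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            result[key.strip()] = value.strip()
    return result

# Example with the output above:
# parse_safety_output("User Safety: unsafe\nResponse Safety: unsafe\n"
#                     "Safety Categories: Criminal Planning/Confessions")
# -> {"User Safety": "unsafe", "Response Safety": "unsafe",
#     "Safety Categories": "Criminal Planning/Confessions"}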