Connect to an LLM Service#

Connect to large language model services using the OpenAI API format or NeMo Deploy. The OpenAI API format enables querying models across many platforms beyond OpenAI’s own services.
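Because the chat-completions request format is the same across OpenAI-compatible services, the only things that change per service are the endpoint URL, credential, and model name. A minimal sketch using only the Python standard library; the base URL, API key, and model name below are placeholders, not values from this documentation:

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-format chat-completions HTTP request.

    base_url, api_key, and model are placeholders; substitute the
    values for whichever OpenAI-compatible service you are using.
    """
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_chat_request(
    base_url="https://example.com/v1",       # hypothetical endpoint
    api_key="YOUR_API_KEY",                  # placeholder credential
    model="meta/llama-3.1-8b-instruct",      # any model the service hosts
    messages=[{"role": "user", "content": "Summarize this document."}],
)
# response = urllib.request.urlopen(request)  # send only against a live service
```

Pointing `base_url` at a different provider is all that is needed to switch services; the payload and headers stay the same.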

Choosing a Service#

Use the following feature comparison to select the service type that best matches your deployment requirements and infrastructure preferences:

Table 12 Service Comparison#

| Feature | OpenAI API Compatible Services      | NeMo Deploy                                                |
|---------|-------------------------------------|------------------------------------------------------------|
| Hosting | Externally hosted with rate limits  | Self-hosted with unlimited queries                         |
| Setup   | Minimal setup required              | Requires deployment, but offers more control and performance |
| Models  | Works with any compatible service   | Optimized for NVIDIA models                                |
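Because externally hosted services enforce rate limits, production pipelines usually retry rate-limited requests with exponential backoff. A minimal sketch of that pattern; the error type and delay values are illustrative stand-ins, not part of any specific client library:

```python
import time

def with_retries(send, max_attempts=4, base_delay=1.0):
    """Call `send` and retry on rate-limit errors with exponential backoff.

    RuntimeError stands in for an HTTP 429 error; a real client would
    catch its library's specific rate-limit exception instead.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Simulated endpoint that fails twice before succeeding.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = with_retries(flaky, base_delay=0.0)
# result == "ok" after two retried failures
```

Self-hosted NeMo Deploy endpoints have no such external rate limit, which is one reason the table lists unlimited queries as an advantage.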


Implementation Options#

OpenAI Compatible Services
Connect to hosted model endpoints using the OpenAI API format.

NeMo Deploy
Deploy and connect to your own self-hosted model endpoints.

Reward Models
Query reward models to score conversations and filter datasets.
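A reward-model query typically returns a numeric score per conversation, which can then gate dataset filtering. A minimal sketch of the filtering step, assuming the scores were already obtained from a prior reward-model query; the score values and threshold are hypothetical:

```python
def filter_by_reward(conversations, scores, threshold=0.5):
    """Keep only conversations whose reward score meets the threshold.

    `scores` are assumed to come from a prior reward-model query;
    the threshold is illustrative, not a recommended default.
    """
    return [conv for conv, score in zip(conversations, scores)
            if score >= threshold]

kept = filter_by_reward(
    conversations=["conv_a", "conv_b", "conv_c"],
    scores=[0.9, 0.2, 0.7],  # hypothetical reward-model outputs
    threshold=0.5,
)
# kept == ["conv_a", "conv_c"]
```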