Connect to an LLM Service#

Connect to large language model services using the OpenAI API format or NeMo Deploy. The OpenAI API format enables querying models across many platforms beyond OpenAI’s own services.
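Because the chat-completions request format is the same across OpenAI-compatible services, the only things that change per service are the endpoint URL, credential, and model name. A minimal sketch using only the Python standard library; the base URL, API key, and model name below are placeholders, not values from this documentation:

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-format chat-completions HTTP request.

    base_url, api_key, and model are placeholders; substitute the
    values for whichever OpenAI-compatible service you are using.
    """
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_chat_request(
    base_url="https://example.com/v1",       # hypothetical endpoint
    api_key="YOUR_API_KEY",                  # placeholder credential
    model="meta/llama-3.1-8b-instruct",      # any model the service hosts
    messages=[{"role": "user", "content": "Summarize this document."}],
)
# response = urllib.request.urlopen(request)  # send only against a live service
```

Pointing `base_url` at a different provider is all that is needed to switch services; the payload and headers stay the same.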

Choosing a Service#

Use the following feature comparison to select the service type that best matches your deployment requirements and infrastructure preferences:

Table 12 Service Comparison#

| Feature | OpenAI API Compatible Services      | NeMo Deploy                                                |
|---------|-------------------------------------|------------------------------------------------------------|
| Hosting | Externally hosted with rate limits  | Self-hosted with unlimited queries                         |
| Setup   | Minimal setup required              | Requires deployment, but offers more control and performance |
| Models  | Works with any compatible service   | Optimized for NVIDIA models                                |
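Because externally hosted services enforce rate limits, production pipelines usually retry rate-limited requests with exponential backoff. A minimal sketch of that pattern; the error type and delay values are illustrative stand-ins, not part of any specific client library:

```python
import time

def with_retries(send, max_attempts=4, base_delay=1.0):
    """Call `send` and retry on rate-limit errors with exponential backoff.

    RuntimeError stands in for an HTTP 429 error; a real client would
    catch its library's specific rate-limit exception instead.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Simulated endpoint that fails twice before succeeding.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = with_retries(flaky, base_delay=0.0)
# result == "ok" after two retried failures
```

Self-hosted NeMo Deploy endpoints have no such external rate limit, which is one reason the table lists unlimited queries as an advantage.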


Implementation Options#

OpenAI Compatible Services
Connect to hosted model endpoints using the OpenAI API format.

NeMo Deploy
Deploy and connect to your own self-hosted model endpoints.

Reward Models
Query reward models to score conversations and filter datasets.
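A reward-model query typically returns a numeric score per conversation, which can then gate dataset filtering. A minimal sketch of the filtering step, assuming the scores were already obtained from a prior reward-model query; the score values and threshold are hypothetical:

```python
def filter_by_reward(conversations, scores, threshold=0.5):
    """Keep only conversations whose reward score meets the threshold.

    `scores` are assumed to come from a prior reward-model query;
    the threshold is illustrative, not a recommended default.
    """
    return [conv for conv, score in zip(conversations, scores)
            if score >= threshold]

kept = filter_by_reward(
    conversations=["conv_a", "conv_b", "conv_c"],
    scores=[0.9, 0.2, 0.7],  # hypothetical reward-model outputs
    threshold=0.5,
)
# kept == ["conv_a", "conv_c"]
```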