Platform Concepts

This page helps you understand the core concepts of the NeMo microservices platform.

NeMo Microservices

NVIDIA builds the NeMo microservices on a modular architecture in which each key capability is provided as a dedicated microservice. By deploying and configuring these microservices, you can assemble either an end-to-end or a partial platform on your Kubernetes cluster. Conceptually, the microservices fall into the following categories.

  • Functional Microservices: These are the major building blocks of AI workflows, such as model fine-tuning, model evaluation, and adding safety checks. They include NeMo Customizer, NeMo Evaluator, and NeMo Guardrails.

  • Infrastructure Microservices: These are needed to set up infrastructure for the functional microservices. They help the functional microservices connect with databases and storage, and provide users with tools for accessing objects such as models, datasets, and metadata. These microservices include NeMo Data Store, NeMo Entity Store, NeMo Operator, NeMo Deployment Management, and NIM Proxy.

You can deploy these microservices either individually or together as a platform using the NVIDIA NeMo Microservices Helm Chart.

NVIDIA NeMo Microservices Helm Chart

The NVIDIA NeMo Microservices Helm Chart packages and distributes the NeMo microservices through the NVIDIA NGC Catalog. Cluster administrators can use this Helm chart to set up the NeMo microservices on Kubernetes clusters. To learn more, refer to About Admin Setup.
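As a sketch of what a partial deployment can look like, a values-override file for the Helm chart might enable only the microservices you need. The top-level key names below are illustrative assumptions, not taken from this page; check the chart's default values.yaml for the exact keys in your release.

```yaml
# Hypothetical values-file excerpt: enable only a subset of the
# NeMo microservices. Key names are illustrative; consult the
# chart's default values.yaml for the real ones.
customizer:
  enabled: true
evaluator:
  enabled: false
guardrails:
  enabled: false
data-store:
  enabled: true
entity-store:
  enabled: true
```

You would then pass such a file at install time with the standard Helm flag, for example `helm install <release> <chart> -f values-override.yaml`.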

Diagrams

The NeMo microservices platform architecture encompasses metadata management, data storage, and microservice interactions for model onboarding, customization, and inference. The following diagrams illustrate how NeMo microservices interact with one another.

Entity Store Relationships

The NeMo Entity Store manages metadata and registration for all entities in the platform, while interacting with other microservices.

Figure: Entity-relationship diagram for NeMo Entity Store.
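Clients typically reach Entity Store through its REST API, where entities such as models and datasets are addressed by namespace and name. The sketch below assembles the paths a client would call; the base URL and route layout are assumptions for illustration, not taken from this page.

```python
# NeMo Entity Store organizes entities (models, datasets, and so on)
# under namespaces. This sketch assembles the REST paths a client
# would call; the base URL and route names are illustrative.
ENTITY_STORE_URL = "http://nemo-entity-store:8000"  # placeholder address

def entity_url(entity_type: str, namespace: str, name: str) -> str:
    """Return the assumed REST path for one namespaced entity."""
    return f"{ENTITY_STORE_URL}/v1/{entity_type}/{namespace}/{name}"

# For example, metadata for a model registered under the "default" namespace:
print(entity_url("models", "default", "llama-3.3-70b-instruct"))
```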

Data Store Relationships

NeMo Data Store is accessed by other NeMo microservices, such as NeMo Evaluator and NeMo Guardrails.

Figure: Entity-relationship diagram for NeMo Data Store.

Base Model Onboarding Workflow

The workflow begins when a user selects an LLM (such as llama-3.3-70b-instruct) and uploads it to the platform. After the inference service is initialized, the user can interact with the model through API calls.

Figure: Base model onboarding workflow.
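Once the inference service is running, NIM Proxy exposes an OpenAI-compatible endpoint for the deployed model. The Python sketch below builds such a request; the base URL is a placeholder for your cluster's NIM Proxy address, and actually sending the request is left commented out.

```python
import json

# Placeholder: replace with your cluster's NIM Proxy address.
NIM_PROXY_URL = "http://nemo-nim-proxy:8000"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128):
    """Build an OpenAI-compatible chat completion request for NIM Proxy."""
    url = f"{NIM_PROXY_URL}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return url, payload

url, payload = build_chat_request("meta/llama-3.3-70b-instruct", "Hello!")
print(url)
print(json.dumps(payload, indent=2))

# To send it for real:
# import requests
# response = requests.post(url, json=payload, timeout=60)
```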

Model Customization Workflow

The customization process begins when a user uploads their dataset and initiates model training. Once training is complete, the fine-tuned model becomes available for inference, allowing users to interact with it through API calls.

Figure: Model customization workflow.
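Submitting a training job to NeMo Customizer is an API call built from a training config, a dataset reference, and hyperparameters. The sketch below only constructs such a request; the endpoint path, field names, and dataset reference are illustrative assumptions and should be checked against the Customizer API reference for your release.

```python
import json

# Placeholder: replace with your cluster's NeMo Customizer address.
CUSTOMIZER_URL = "http://nemo-customizer:8000"

def build_customization_job(config: str, dataset: str, epochs: int = 3):
    """Build a fine-tuning job request (endpoint and field names are illustrative)."""
    url = f"{CUSTOMIZER_URL}/v1/customization/jobs"
    payload = {
        "config": config,      # base model / training config name
        "dataset": dataset,    # dataset previously registered in the platform
        "hyperparameters": {"epochs": epochs, "training_type": "sft"},
    }
    return url, payload

url, payload = build_customization_job(
    config="meta/llama-3.3-70b-instruct",
    dataset="default/my-sample-dataset",
)
print(json.dumps(payload, indent=2))
```

Once the job completes, the fine-tuned model is registered and can be served for inference like any other model on the platform.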