Platform Concepts#

This page helps you understand the core concepts of the NeMo microservices platform.

NeMo Microservices#

NVIDIA builds the NeMo microservices on a modular architecture in which each key capability is provided as a dedicated microservice. By deploying and configuring these microservices on your Kubernetes cluster, you can assemble a platform, either end-to-end or partial. Conceptually, the microservices fall into the following two categories.

  • Core Microservices: These are the major building blocks of the AI workflow, covering tasks such as model fine-tuning, model evaluation, and adding safety checks. These microservices include NeMo Customizer, NeMo Evaluator, and NeMo Guardrails.

  • Platform Component Microservices: These are needed to set up the core NeMo microservices. They help the core microservices connect to databases and storage, and provide users with tools for accessing objects such as models, datasets, and metadata. These microservices include NeMo Data Store, NeMo Entity Store, NeMo Operator, DGX Cloud Admission Controller, NeMo Deployment Management, and NIM Proxy.

You can deploy these microservices either individually using the Helm charts, or together using the NeMo Microservices Helm Chart, which is introduced in the next section. While some microservices focus on specific stages of the AI development lifecycle, others provide functionality that spans multiple stages. For more information, see also Installation Scenarios.

NeMo Microservices Helm Charts#

The NeMo microservices are packaged and distributed as Helm charts through the NVIDIA NGC Catalog. Cluster administrators can use these Helm charts to set up the NeMo microservices on Kubernetes clusters.

The Helm charts are available in the following two forms.

  • NVIDIA NeMo Microservices Helm Chart: This chart includes all the microservices as dependencies. Use this chart to install all of the NeMo microservices together as a complete platform. This is also referred to as the NeMo platform Helm chart.

  • NeMo platform component microservices Helm charts: Each microservice also has its own Helm chart. Use the component charts to install the core microservices individually, depending on your use case. Each core microservice has its own dependencies, which can be external services, other component microservices, or both. You can configure these dependencies through the chart's values.yaml file.
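For example, when installing the full platform chart, you can enable or disable individual components through values.yaml. The following is an illustrative sketch only; the top-level key names are assumptions, so consult the values.yaml bundled with the chart on the NVIDIA NGC Catalog for the actual structure.

```yaml
# Illustrative values.yaml fragment -- key names below are assumptions;
# check the chart's bundled values.yaml for the real keys and defaults.
customizer:
  enabled: true      # NeMo Customizer for fine-tuning
evaluator:
  enabled: true      # NeMo Evaluator for benchmarking
guardrails:
  enabled: false     # disable safety checks for this install
data-store:
  enabled: true      # NeMo Data Store for datasets and model files
```

You would then pass this file to Helm with `helm install -f values.yaml`, overriding the chart's defaults only for the components you want to change.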

Diagrams#

The NeMo microservices platform architecture encompasses metadata management, data storage, and microservice interactions for model onboarding, customization, and inference. The following diagrams illustrate how NeMo microservices interact with one another.

Entity Store Relationships#

The NeMo Entity Store manages metadata and registration for all entities in the platform, while interacting with other microservices.

(Figure: Entity Store entity-relationship diagram)

Data Store Relationships#

NeMo Data Store is accessed by other NeMo microservices such as NeMo Evaluator and NeMo Guardrails.

(Figure: Data Store entity-relationship diagram)

Base Model Onboarding Workflow#

The workflow begins when a user selects an LLM (such as llama-3.3-70b-instruct), uploads it to the system, and initializes the inference service. The user can then interact with the model through API calls.
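Once the model is deployed, inference requests are served through an OpenAI-compatible API behind NIM Proxy. The following is a minimal sketch of constructing such a request; the proxy URL is a placeholder, and the endpoint path follows the OpenAI chat-completions convention that NIM services expose.

```python
# Sketch: building an OpenAI-compatible chat completion request for a model
# served behind NIM Proxy. The base URL below is a placeholder -- substitute
# your cluster's NIM Proxy address.
import json
import urllib.request

NIM_PROXY_URL = "http://nim-proxy.nemo.svc.cluster.local:8000"  # placeholder


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Construct (but do not send) an HTTP request for a chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{NIM_PROXY_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("meta/llama-3.3-70b-instruct", "Hello!")
# To send it: urllib.request.urlopen(req)  # requires a live deployment
```

Separating request construction from sending, as above, makes the payload easy to inspect before you point it at a real cluster.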

(Figure: Base model onboarding workflow)

Model Customization Workflow#

The customization process begins when a user uploads their dataset and initiates model training. Once training is complete, the fine-tuned model becomes available for inference, allowing users to interact with it through API calls.

(Figure: Model customization workflow)
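The training step of this workflow is driven by a REST call to NeMo Customizer. The sketch below shows how such a job-creation request might be assembled; the endpoint path, config name, and field names are assumptions based on typical REST patterns, so check the NeMo Customizer API reference for the exact schema.

```python
# Sketch: building a fine-tuning job request for NeMo Customizer.
# Endpoint path, config value, and payload fields are assumptions --
# consult the NeMo Customizer API reference for the actual schema.
import json
import urllib.request

CUSTOMIZER_URL = "http://nemo-customizer.nemo.svc.cluster.local:8000"  # placeholder


def build_customization_job(config: str, dataset: str) -> urllib.request.Request:
    """Construct (but do not send) the job-creation request."""
    payload = {
        "config": config,                 # base model / training recipe
        "dataset": {"name": dataset},     # dataset previously uploaded to Data Store
        "hyperparameters": {"epochs": 2, "training_type": "sft"},
    }
    return urllib.request.Request(
        f"{CUSTOMIZER_URL}/v1/customization/jobs",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


job_req = build_customization_job("meta/llama-3.3-70b-instruct", "my-dataset")
# To submit it: urllib.request.urlopen(job_req)  # requires a running Customizer
```

After the job completes, the fine-tuned model is registered and can be served for inference like any other onboarded model.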