Copyright © 2026, NVIDIA Corporation.

Adding Custom Models

Learn how to integrate custom models into NeMo Curator stages.

The NeMo Curator container includes a robust set of default models, but you can add your own for specialized tasks.

Before You Start

Before you begin adding a custom model, make sure that you have:

  • Reviewed the pipeline concepts and diagrams.
  • Set up a working NeMo Curator development environment.
  • Optionally prepared a container image that includes your model dependencies.
  • Optionally created a custom environment to support your new custom model.

How to Add a Custom Model

Review Model Interface

In NeMo Curator, models inherit from nemo_curator.models.base.ModelInterface and must implement model_id_names and setup:

import abc


class ModelInterface(abc.ABC):
    """Abstract base class for models used inside stages."""

    @property
    @abc.abstractmethod
    def model_id_names(self) -> list[str]:
        """Return a list of model IDs associated with this model (for example, Hugging Face IDs)."""

    @abc.abstractmethod
    def setup(self) -> None:
        """Set up the model (load weights, allocate resources)."""
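
Because both members are abstract, Python refuses to instantiate a subclass that omits either one. The sketch below illustrates this with a local stand-in that mirrors the interface shown above; it does not import nemo_curator, and IncompleteModel and try_instantiate are hypothetical names used only for this illustration.

```python
import abc


class ModelInterface(abc.ABC):
    """Local stand-in mirroring the interface above (not the nemo_curator class)."""

    @property
    @abc.abstractmethod
    def model_id_names(self) -> list[str]:
        """Return the model IDs associated with this model."""

    @abc.abstractmethod
    def setup(self) -> None:
        """Set up the model."""


class IncompleteModel(ModelInterface):
    """Hypothetical subclass that forgets to implement setup()."""

    @property
    def model_id_names(self) -> list[str]:
        return ["example/my-model"]


def try_instantiate() -> str:
    """Report whether the incomplete subclass can be constructed."""
    try:
        IncompleteModel()
    except TypeError as err:
        # abc blocks instantiation while abstract members remain
        return f"rejected: {err}"
    return "instantiated"


print(try_instantiate())
```

The error is raised at construction time, so a missing method shows up as soon as the model object is created rather than when a stage first calls it.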

Create New Model

For this tutorial, we’ll sketch a minimal model for demonstration.

from typing import Optional

import numpy as np
import numpy.typing as npt
import torch

from nemo_curator.models.base import ModelInterface

WEIGHTS_MODEL_ID = "example/my-model"


class MyCore(torch.nn.Module):
    """Core PyTorch network that performs inference."""

    def __init__(self, resolution: int = 224):
        super().__init__()
        self.resolution = resolution
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        # Initialize your network here
        self.net = torch.nn.Identity().to(self.device)

    @torch.no_grad()
    def __call__(self, x: npt.NDArray[np.float32]) -> torch.Tensor:
        # Convert the NumPy input to a float tensor on the target device
        tensor = torch.from_numpy(x).to(self.device).float()
        return self.net(tensor)


class MyModel(ModelInterface):
    """Wrapper that exposes MyCore through the model interface."""

    def __init__(self, model_dir: str, resolution: int = 224) -> None:
        self.model_dir = model_dir
        self.resolution = resolution
        self._model: Optional[MyCore] = None

    @property  # the interface declares model_id_names as a property
    def model_id_names(self) -> list[str]:
        return [WEIGHTS_MODEL_ID]

    def setup(self) -> None:
        # Load weights from self.model_dir/WEIGHTS_MODEL_ID if needed
        self._model = MyCore(self.resolution)
        self._model.eval()

Let’s go through each part of the code piece by piece.

Define the PyTorch Model

WEIGHTS_MODEL_ID = "example/my-model"  # your Hugging Face (or other) model ID


class MyCore(torch.nn.Module):
    def __init__(self, resolution: int = 224):
        super().__init__()
        # Initialize network and load weights from a local path derived from model_dir and WEIGHTS_MODEL_ID

Provide a model ID (for example, a Hugging Face ID) if you plan to cache or fetch weights. If your model class provides a method for downloading weights, the pipeline can use it to download them before setup() runs.

Implement the Model Interface

class MyModel(ModelInterface):
    ...

Your model implements the interface. It defines methods to declare weight identifiers and to initialize the underlying core network.

    def setup(self) -> None:
        self._model = MyCore(self.resolution)
        self._model.eval()

The setup method initializes the underlying MyCore class that performs the model inference.
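
One consequence of deferring initialization to setup is that the core is None until setup has run. The sketch below shows one way to guard against using the model too early. It is only an illustration under stated assumptions: FakeCore is a hypothetical pure-Python stand-in for MyCore so the example runs without PyTorch, and the __call__ method on the wrapper is an assumption, since the tutorial's MyModel does not define an inference method.

```python
from typing import Optional


class FakeCore:
    """Hypothetical stand-in for MyCore; returns its input unchanged."""

    def __call__(self, values: list[float]) -> list[float]:
        return list(values)


class LazyModel:
    """Hypothetical wrapper illustrating the construct-then-setup lifecycle."""

    def __init__(self, resolution: int = 224) -> None:
        # The constructor only stores configuration; nothing heavy happens here
        self.resolution = resolution
        self._model: Optional[FakeCore] = None

    def setup(self) -> None:
        # Heavy initialization (weights, device placement) belongs here
        self._model = FakeCore()

    def __call__(self, values: list[float]) -> list[float]:
        # Assumed inference entry point; fail clearly if setup() was skipped
        if self._model is None:
            raise RuntimeError("call setup() before running inference")
        return self._model(values)


model = LazyModel()
model.setup()
result = model([1.0, 2.0])
```

Keeping the constructor cheap means the object can be created and passed around before any weights are loaded.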

    @property
    def model_id_names(self) -> list[str]:
        return [WEIGHTS_MODEL_ID]

The model_id_names property returns a list of weight IDs. These typically correspond to model repository names but do not have to.
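
The interface declares model_id_names as a property, so code that reads it as an attribute expects a list back. The short sketch below shows why the @property decorator matters; AsProperty and AsMethod are hypothetical classes used only for this comparison and are not part of NeMo Curator.

```python
class AsProperty:
    """Declares model_id_names as a property, matching the interface."""

    @property
    def model_id_names(self) -> list[str]:
        return ["example/my-model"]


class AsMethod:
    """Omits @property, so attribute access yields a bound method."""

    def model_id_names(self) -> list[str]:
        return ["example/my-model"]


ids = AsProperty().model_id_names    # attribute access returns the list
not_ids = AsMethod().model_id_names  # attribute access returns a method object

print(isinstance(ids, list), callable(not_ids))
```

Without the decorator, a caller that iterates over model_id_names directly would receive a method object instead of the IDs.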

If your stage requires a specific environment, configure it through the stage’s resources (for example, gpu_memory_gb or gpus) and container image rather than on the model. In particular, GPU allocation is handled at the stage level through Resources.

Manage Model Weights

Provide your model with a model_dir where weights are stored. Your stage should ensure that any required weights are available at runtime (for example, by mounting them into the container or downloading them prior to execution).
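
As a minimal sketch of checking that weights are present, the helper below resolves the model_dir/WEIGHTS_MODEL_ID layout mentioned in the setup comment earlier and fails early if the directory is missing. The function names resolve_weights_dir and require_weights are hypothetical, and the directory layout is an assumption based on that comment rather than a documented NeMo Curator convention.

```python
import tempfile
from pathlib import Path

WEIGHTS_MODEL_ID = "example/my-model"


def resolve_weights_dir(model_dir: str, model_id: str) -> Path:
    """Hypothetical helper: join model_dir and the model ID into a local path."""
    return Path(model_dir) / model_id


def require_weights(model_dir: str, model_id: str) -> Path:
    """Return the weights directory, or raise if it has not been provided."""
    weights_dir = resolve_weights_dir(model_dir, model_id)
    if not weights_dir.is_dir():
        raise FileNotFoundError(f"weights not found at {weights_dir}")
    return weights_dir


# Usage: simulate a mounted model_dir that already contains the weights
with tempfile.TemporaryDirectory() as model_dir:
    (Path(model_dir) / WEIGHTS_MODEL_ID).mkdir(parents=True)
    found = require_weights(model_dir, WEIGHTS_MODEL_ID)
    print(found.name)
```

Calling a check like this at the start of setup turns a missing mount or skipped download into an immediate, readable error.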

Next Steps

Now that you have created a custom model, you can create a custom stage that uses it.