Create a Model Entity | NVIDIA NeMo Platform

Create a Model Entity that references your FileSet to enable customization jobs.

Prerequisites

You have created a FileSet containing your model checkpoint (refer to Create a Model FileSet).
You have set the NMP_BASE_URL environment variable:

$ export NMP_BASE_URL="https://your-nemo-platform-url"
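A malformed base URL tends to surface later as a confusing connection error, so it can help to validate the variable before constructing the client. The following is a minimal sketch, not part of the NeMo Platform SDK: `resolve_base_url` is a hypothetical helper, and the `http://localhost:8080` fallback is simply the default used in the client example on this page.

```python
import os
from urllib.parse import urlparse


def resolve_base_url(env=None, default="http://localhost:8080"):
    """Return the platform base URL after a basic sanity check.

    Hypothetical helper: reads NMP_BASE_URL from `env` (os.environ by
    default) and falls back to `default` when the variable is unset.
    """
    env = os.environ if env is None else env
    url = env.get("NMP_BASE_URL", default).rstrip("/")
    parsed = urlparse(url)
    # Require an explicit http(s) scheme and a host component.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"NMP_BASE_URL is not a valid HTTP(S) URL: {url!r}")
    return url
```

The returned string can then be passed as `base_url` when creating the client.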

Create the Model Entity

import os
from nemo_platform import NeMoPlatform
from nemo_platform._exceptions import ConflictError

client = NeMoPlatform(
    base_url=os.environ.get("NMP_BASE_URL", "http://localhost:8080"),
    workspace="default",
)

try:
    model = client.models.create(
        workspace="default",
        name="qwen3-1.7b",
        fileset="default/qwen3-1.7b",  # Reference: workspace/fileset-name
        description="Qwen3 1.7B base model for customization",
    )
    print(f"Created Model Entity: {model.name}")
except ConflictError:
    # A Model Entity with this name already exists; reuse it.
    print("Model Entity already exists")
    model = client.models.retrieve(workspace="default", name="qwen3-1.7b")
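The `fileset` argument above uses a `workspace/fileset-name` reference string, and the same `workspace/name` shape appears later for model references. A small sketch of building and validating such strings follows. `EntityRef` and `parse_ref` are hypothetical names for illustration, not SDK types, and the rule that a reference contains exactly one `/` is an assumption based on the examples on this page.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    """Hypothetical container for a 'workspace/name' reference."""

    workspace: str
    name: str

    def __str__(self) -> str:
        return f"{self.workspace}/{self.name}"


def parse_ref(ref: str) -> EntityRef:
    """Split 'workspace/name' into its parts, rejecting malformed input."""
    workspace, sep, name = ref.partition("/")
    # Assumption: exactly one '/' separating two non-empty parts.
    if not sep or not workspace or not name or "/" in name:
        raise ValueError(f"expected 'workspace/name', got {ref!r}")
    return EntityRef(workspace, name)
```

For example, `str(EntityRef("default", "qwen3-1.7b"))` yields the string passed as `fileset` above.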

Example Response

{
  "id": "model-abc123def456",
  "name": "qwen3-1.7b",
  "workspace": "default",
  "description": "Qwen3 1.7B base model for customization",
  "fileset": "default/qwen3-1.7b",
  "spec": null,
  "adapters": [],
  "created_at": "2026-02-09T10:30:00Z",
  "updated_at": "2026-02-09T10:30:00Z"
}

Note: spec is initially null and will be auto-populated by the Models Controller.

Wait for Model Spec Auto-Population

The platform’s Models Controller automatically analyzes your model files and populates the spec field with metadata such as architecture, parameter count, and layer information. The spec must be populated before you can run customization jobs against the Model Entity.

import time

# Poll until spec is populated
print("Waiting for model spec to be auto-populated...")
while True:
    model = client.models.retrieve(workspace="default", name="qwen3-1.7b")
    if model.spec:
        break
    print(" Still waiting...")
    time.sleep(5)

print("\nModel ready for customization!")
print(f" Family: {model.spec.family}")
print(f" Parameters: {model.spec.base_num_parameters:,}")
print(f" Hidden Size: {model.spec.hidden_size}")
print(f" Layers: {model.spec.num_layers}")
print(f" Attention Heads: {model.spec.num_attention_heads}")
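The polling loop above runs indefinitely if the spec is never populated. One possible variant adds a deadline. This is a sketch only: `wait_for_spec` is a hypothetical helper, the 300-second default is an arbitrary choice rather than a documented platform limit, and the `retrieve` callable stands in for a call such as `client.models.retrieve(...)`.

```python
import time


def wait_for_spec(retrieve, timeout_s=300.0, interval_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Call `retrieve()` until the returned object has a truthy `.spec`.

    `retrieve` is any zero-argument callable returning a model-like object.
    Raises TimeoutError if the spec is still empty once `timeout_s` elapses.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        model = retrieve()
        if model.spec:
            return model
        if clock() >= deadline:
            raise TimeoutError(
                f"model spec not populated within {timeout_s} seconds"
            )
        sleep(interval_s)
```

With the client from the earlier example, this could be called as `wait_for_spec(lambda: client.models.retrieve(workspace="default", name="qwen3-1.7b"))`.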

Example Model Spec

{
  "spec": {
    "family": "qwen",
    "base_num_parameters": 1720574976,
    "hidden_size": 2048,
    "num_layers": 28,
    "num_attention_heads": 16,
    "num_kv_heads": 8,
    "vocab_size": 151936,
    "context_size": 40960
  }
}
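As a quick sanity check on a populated spec, the fields can be inspected as a plain dictionary. The sketch below works on the example values shown above. `summarize_spec` and the `EXPECTED_FIELDS` tuple are hypothetical: the tuple is just the set of fields printed earlier on this page, not an official list of required fields.

```python
# Values copied from the example spec above.
example_spec = {
    "family": "qwen",
    "base_num_parameters": 1720574976,
    "hidden_size": 2048,
    "num_layers": 28,
    "num_attention_heads": 16,
    "num_kv_heads": 8,
    "vocab_size": 151936,
    "context_size": 40960,
}

# Assumption: fields this page reads from the spec, not an official schema.
EXPECTED_FIELDS = (
    "family",
    "base_num_parameters",
    "hidden_size",
    "num_layers",
    "num_attention_heads",
)


def summarize_spec(spec):
    """Check that the expected fields are present and derive two summaries."""
    missing = [name for name in EXPECTED_FIELDS if spec.get(name) is None]
    if missing:
        raise ValueError(f"spec is missing fields: {', '.join(missing)}")
    q_heads = spec["num_attention_heads"]
    # If num_kv_heads is absent, treat every query head as having its own KV head.
    kv_heads = spec.get("num_kv_heads") or q_heads
    return {
        "params_billions": round(spec["base_num_parameters"] / 1e9, 2),
        "query_heads_per_kv_head": q_heads // kv_heads,
    }


summary = summarize_spec(example_spec)
```

For the example spec, the 16 attention heads share 8 key/value heads, so two query heads map to each key/value head.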

Verify the Model Entity

Before using the Model Entity in a customization job, verify that it is set up correctly:

model = client.models.retrieve(workspace="default", name="qwen3-1.7b")

# Verification checks
assert model.spec is not None, "Model spec not yet populated - wait longer"
assert model.fileset, "FileSet reference missing"

print("Model Entity is ready for customization.")
print("\nUse in customization jobs as:")
print(f' model: "{model.workspace}/{model.name}"')
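Because Python skips `assert` statements when run with the `-O` flag, scripts that must always enforce these checks may prefer explicit conditionals. A minimal sketch of the same two checks follows. `readiness_problems` is a hypothetical helper that accepts any object exposing `spec` and `fileset` attributes, such as the model returned by `client.models.retrieve`.

```python
def readiness_problems(model):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if getattr(model, "spec", None) is None:
        problems.append("Model spec not yet populated")
    if not getattr(model, "fileset", None):
        problems.append("FileSet reference missing")
    return problems
```

A caller can then raise or log as appropriate, for example `if readiness_problems(model): raise RuntimeError(...)`.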

Using the Model Entity in Customization Jobs

After your Model Entity is ready (has a populated spec), reference it in a customization job. Jobs are submitted to a backend (automodel shown here; unsloth is also available):

from nemo_automodel_plugin.schema import AutomodelJobInput

# Create a customization job using the Model Entity
spec = AutomodelJobInput(
    model="default/qwen3-1.7b",  # Created above
    dataset={"training": "default/my-training-dataset"},
    training={"training_type": "sft", "finetuning_type": "lora"},
    schedule={"epochs": 1},
)
job = client.customization.automodel.jobs.create(spec=spec, workspace="default", name="my-lora-job")

Refer to create-job for complete job creation details.

Post-Training Output

After a customization job completes, the output depends on the fine-tuning regime (training.finetuning_type):

LoRA training (finetuning_type: "lora"): An adapter is attached to this Model Entity. The adapter contains only the trained LoRA weights.
Full-weight training (finetuning_type: "all_weights"): A new Model Entity is created containing the complete fine-tuned model weights. This new entity has a base_model field linking back to the original.
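The two outcomes above can be captured in a small lookup, which may be useful when a script needs to decide where to look for results after a job finishes. This is an illustrative sketch: `expected_output` and its return labels are hypothetical, and only the two `finetuning_type` values named on this page are covered.

```python
def expected_output(finetuning_type):
    """Map a finetuning_type to where the trained weights end up."""
    outputs = {
        # LoRA: weights are attached to the existing Model Entity.
        "lora": "adapter_on_original_model",
        # Full-weight: a separate entity is created, linked via base_model.
        "all_weights": "new_model_entity",
    }
    try:
        return outputs[finetuning_type]
    except KeyError:
        raise ValueError(f"unknown finetuning_type: {finetuning_type!r}") from None
```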

For LoRA jobs, you can list adapters attached to a model:

model = client.models.retrieve(workspace="default", name="qwen3-1.7b")

if model.adapters:
    print(f"Adapters attached to {model.name}:")
    for adapter in model.adapters:
        print(f" - {adapter.name} (enabled: {adapter.enabled})")
else:
    print("No adapters attached yet")
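If only active adapters are of interest, the same `name` and `enabled` attributes can be used to filter the list. The sketch below assumes nothing beyond those two attributes. `enabled_adapter_names` is a hypothetical helper, and the adapter names in the stand-in data are made up for illustration.

```python
from types import SimpleNamespace


def enabled_adapter_names(model):
    """Return the names of adapters on `model` whose `enabled` flag is true."""
    return [a.name for a in (model.adapters or []) if a.enabled]


# Stand-in for a retrieved Model Entity; adapter names are illustrative only.
demo_model = SimpleNamespace(
    name="qwen3-1.7b",
    adapters=[
        SimpleNamespace(name="my-lora-job", enabled=True),
        SimpleNamespace(name="older-experiment", enabled=False),
    ],
)
active = enabled_adapter_names(demo_model)
```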