Create a Model Entity

Create a Model Entity that references your FileSet to enable customization jobs.

Prerequisites

  • You have created a FileSet containing your model checkpoint (refer to Create a Model FileSet).
  • You have set the NMP_BASE_URL environment variable:
export NMP_BASE_URL="https://your-nemo-platform-url"

Create the Model Entity

import os
from nemo_platform import NeMoPlatform
from nemo_platform._exceptions import ConflictError

client = NeMoPlatform(
    base_url=os.environ.get("NMP_BASE_URL", "http://localhost:8080"),
    workspace="default",
)

try:
    model = client.models.create(
        workspace="default",
        name="qwen3-1.7b",
        fileset="default/qwen3-1.7b",  # Reference: workspace/fileset-name
        description="Qwen3 1.7B base model for customization",
    )
    print(f"Created Model Entity: {model.name}")
except ConflictError:
    # An entity with this name already exists, so fetch it instead
    print("Model Entity already exists")
    model = client.models.retrieve(workspace="default", name="qwen3-1.7b")

Example response:

{
  "id": "model-abc123def456",
  "name": "qwen3-1.7b",
  "workspace": "default",
  "description": "Qwen3 1.7B base model for customization",
  "fileset": "default/qwen3-1.7b",
  "spec": null,
  "adapters": [],
  "created_at": "2026-02-09T10:30:00Z",
  "updated_at": "2026-02-09T10:30:00Z"
}

Note: spec is initially null and will be auto-populated by the Models Controller.


Wait for Model Spec Auto-Population

The platform’s Models Controller automatically analyzes your model files and populates the spec field with metadata such as architecture, parameter count, and layer information. The spec must be populated before you can run customization jobs.

import time

# Poll until spec is populated
print("Waiting for model spec to be auto-populated...")
while True:
    model = client.models.retrieve(workspace="default", name="qwen3-1.7b")
    if model.spec:
        break
    print("  Still waiting...")
    time.sleep(5)

print("\nModel ready for customization!")
print(f"  Family: {model.spec.family}")
print(f"  Parameters: {model.spec.base_num_parameters:,}")
print(f"  Hidden Size: {model.spec.hidden_size}")
print(f"  Layers: {model.spec.num_layers}")
print(f"  Attention Heads: {model.spec.num_attention_heads}")

Example populated spec:

{
  "spec": {
    "family": "qwen",
    "base_num_parameters": 1720574976,
    "hidden_size": 2048,
    "num_layers": 28,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "vocab_size": 151936,
    "max_sequence_length": 40960
  }
}
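
The polling loop above has no upper bound, so it would spin forever if the spec is never populated. One possible variant, sketched below, adds a deadline. The helper name wait_until_ready and its parameters are illustrative and not part of the NeMo Platform SDK. The example uses stand-in objects instead of a live client so that it runs on its own.

```python
import time
from types import SimpleNamespace


def wait_until_ready(fetch, is_ready, timeout_s=300.0, interval_s=5.0):
    """Call fetch() repeatedly until is_ready(result) is true.

    Raises TimeoutError if the result is still not ready once
    timeout_s seconds have elapsed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Not ready after {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in for client.models.retrieve(...): the spec appears on the third call.
_responses = iter([
    SimpleNamespace(spec=None),
    SimpleNamespace(spec=None),
    SimpleNamespace(spec=SimpleNamespace(family="qwen")),
])

model = wait_until_ready(
    fetch=lambda: next(_responses),
    is_ready=lambda m: m.spec is not None,
    timeout_s=10.0,
    interval_s=0.01,
)
print(model.spec.family)
```

With a real client, fetch would presumably be something like lambda: client.models.retrieve(workspace="default", name="qwen3-1.7b"), as in the loop above. A suitable timeout depends on model size and platform load, so the 300-second default here is only a placeholder.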

Verify the Model Entity

Before using the Model Entity in a customization job, verify that it is set up correctly:

model = client.models.retrieve(workspace="default", name="qwen3-1.7b")

# Verification checks
assert model.spec is not None, "Model spec not yet populated - wait longer"
assert model.fileset, "FileSet reference missing"

print("Model Entity is ready for customization.")
print("\nUse in customization jobs as:")
print(f'  model: "{model.workspace}/{model.name}"')
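
Both the fileset and model references on this page use the workspace/name form. The sketch below shows one way to build and parse such references with basic validation. The helpers make_ref and split_ref are hypothetical, and the rule that neither part may contain a slash is an assumption for illustration rather than a documented platform constraint.

```python
def make_ref(workspace: str, name: str) -> str:
    """Build a 'workspace/name' reference string."""
    # Assumption: neither part may be empty or contain '/'.
    if not workspace or not name or "/" in workspace or "/" in name:
        raise ValueError(f"Invalid reference parts: {workspace!r}, {name!r}")
    return f"{workspace}/{name}"


def split_ref(ref: str) -> tuple[str, str]:
    """Split a 'workspace/name' reference into its two parts."""
    workspace, sep, name = ref.partition("/")
    if not sep or not workspace or not name or "/" in name:
        raise ValueError(f"Expected 'workspace/name', got {ref!r}")
    return workspace, name


print(make_ref("default", "qwen3-1.7b"))
print(split_ref("default/qwen3-1.7b"))
```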

Using the Model Entity in Customization Jobs

After your Model Entity is ready (has a populated spec), reference it in a customization job. Jobs are submitted to a backend (automodel shown here; unsloth is also available):

from nemo_automodel_plugin.schema import AutomodelJobInput

# Create a customization job using the Model Entity
spec = AutomodelJobInput(
    model="default/qwen3-1.7b",  # Created above
    dataset={"training": "default/my-training-dataset"},
    training={"training_type": "sft", "finetuning_type": "lora"},
    schedule={"epochs": 1},
)
job = client.customization.automodel.jobs.create(
    spec=spec,
    workspace="default",
    name="my-lora-job",
)

Refer to create-job for complete job creation details.


Post-Training Output

After a customization job completes, the output depends on the fine-tuning regime (training.finetuning_type):

  • LoRA training (finetuning_type: "lora"): An adapter is attached to this Model Entity. The adapter contains only the trained LoRA weights.
  • Full-weight training (finetuning_type: "all_weights"): A new Model Entity is created containing the complete fine-tuned model weights. This new entity has a base_model field linking back to the original.
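
The two outcomes above can be summarized as a small lookup. This is only an illustration of the rule described in the list. The function expected_output is hypothetical, and it covers only the two finetuning_type values mentioned on this page.

```python
def expected_output(finetuning_type: str) -> str:
    """Describe where a completed job's results end up."""
    outputs = {
        "lora": "adapter attached to the original Model Entity",
        "all_weights": "new Model Entity with base_model linking to the original",
    }
    try:
        return outputs[finetuning_type]
    except KeyError:
        # Other values may exist; only the two above are described here.
        raise ValueError(f"Unrecognized finetuning_type: {finetuning_type!r}") from None


print(expected_output("lora"))
print(expected_output("all_weights"))
```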

For LoRA jobs, you can list adapters attached to a model:

model = client.models.retrieve(workspace="default", name="qwen3-1.7b")

if model.adapters:
    print(f"Adapters attached to {model.name}:")
    for adapter in model.adapters:
        print(f"  - {adapter.name} (enabled: {adapter.enabled})")
else:
    print("No adapters attached yet")
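
If you only care about adapters that are currently enabled, a small filter over the same name and enabled fields may be enough. The sketch below uses stand-in objects rather than a live client. The helper enabled_adapter_names and the adapter names are hypothetical.

```python
from types import SimpleNamespace


def enabled_adapter_names(model):
    """Return the names of adapters whose enabled flag is true."""
    return [a.name for a in (model.adapters or []) if a.enabled]


# Stand-in for the object returned by client.models.retrieve(...)
model = SimpleNamespace(
    name="qwen3-1.7b",
    adapters=[
        SimpleNamespace(name="my-lora-job", enabled=True),
        SimpleNamespace(name="old-experiment", enabled=False),
    ],
)

print(enabled_adapter_names(model))
```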

Next Steps