NeMo Customizer Microservice Helm Chart

Type: application

For the deployment guide, see Admin Setup in the NeMo Microservices documentation.

Values

Each value is listed below with its key, type, default, and description.

apiImage

object

This object has the following default values for the NeMo Customizer API-only image.

NeMo Customizer API-only image configuration.

apiImage.imagePullPolicy

string

"IfNotPresent"

Image pull policy for the NeMo Customizer API image.

apiImage.registry

string

"nvcr.io"

Registry for the NeMo Customizer API image.

apiImage.repository

string

"nvidia/nemo-microservices/customizer-api"

Repository for the NeMo Customizer API image.

customizerConfig

object

This object has default values for the following fields.

Configuration for the NeMo Customizer microservice.

customizerConfig.entityStoreURL

string

"http://nemo-gateway.nemo-gateway.svc.cluster.local:8000"

URL for the NeMo Entity Store microservice.

customizerConfig.mlflowURL

string

"http://mlflow-tracking.mlflow-system.svc.cluster.local:80"

URL for the MLflow tracking server.

customizerConfig.models

object

This object has default values for the supported models.

Models configuration.

customizerConfig.models."meta/llama-3.1-70b-instruct"

object

This object has the following default values for the Llama 3.1 70B Instruct model.

Llama 3.1 70B Instruct model configuration.

customizerConfig.models."meta/llama-3.1-70b-instruct".enabled

bool

false

Whether to enable the model.

customizerConfig.models."meta/llama-3.1-70b-instruct".max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models."meta/llama-3.1-70b-instruct".micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models."meta/llama-3.1-70b-instruct".model_path

string

"llama-3_1-70b-instruct_0_0_1"

Path where model files are stored.

customizerConfig.models."meta/llama-3.1-70b-instruct".model_uri

string

"ngc://nvidia/nemo/llama-3_1-70b-instruct-nemo:2.0"

NGC model URI for Llama 3.1 70B Instruct model.

customizerConfig.models."meta/llama-3.1-70b-instruct".num_parameters

int

70000000000

Number of model parameters.

customizerConfig.models."meta/llama-3.1-70b-instruct".precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models."meta/llama-3.1-70b-instruct".prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models."meta/llama-3.1-70b-instruct".training_options

list

[{"finetuning_type":"lora","num_gpus":4,"num_nodes":1,"tensor_parallel_size":4,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.models."meta/llama-3.1-8b-instruct"

object

This object has the following default values for the Llama 3.1 8B Instruct model.

Llama 3.1 8B Instruct model configuration.

customizerConfig.models."meta/llama-3.1-8b-instruct".enabled

bool

false

Whether to enable the model.

customizerConfig.models."meta/llama-3.1-8b-instruct".max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models."meta/llama-3.1-8b-instruct".micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models."meta/llama-3.1-8b-instruct".model_path

string

"llama-3_1-8b-instruct_0_0_1"

Path where model files are stored.

customizerConfig.models."meta/llama-3.1-8b-instruct".model_uri

string

"ngc://nvidia/nemo/llama-3_1-8b-instruct-nemo:2.0"

NGC model URI for Llama 3.1 8B Instruct model.

customizerConfig.models."meta/llama-3.1-8b-instruct".num_parameters

int

8000000000

Number of model parameters.

customizerConfig.models."meta/llama-3.1-8b-instruct".precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models."meta/llama-3.1-8b-instruct".prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models."meta/llama-3.1-8b-instruct".training_options

list

[{"finetuning_type":"lora","num_gpus":1,"training_type":"sft"},{"finetuning_type":"all_weights","num_gpus":8,"num_nodes":1,"tensor_parallel_size":4,"training_type":"sft"}]

Training options for different fine-tuning methods.
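
Models are disabled by default. As an illustrative sketch only (not an endorsed configuration), enabling the Llama 3.1 8B Instruct model with its default training options could look like this in values.yaml:

```yaml
customizerConfig:
  models:
    "meta/llama-3.1-8b-instruct":
      enabled: true
      training_options:
        # Defaults from the table above: LoRA SFT on 1 GPU, or
        # full-weight SFT on 8 GPUs with tensor parallel size 4.
        - finetuning_type: lora
          num_gpus: 1
          training_type: sft
        - finetuning_type: all_weights
          num_gpus: 8
          num_nodes: 1
          tensor_parallel_size: 4
          training_type: sft
```

Because Helm replaces list values rather than merging them, restate any default training_options entries you want to keep when overriding the list.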

customizerConfig.models."meta/llama-3.2-1b"

object

This object has the following default values for the Llama 3.2 1B model.

Llama 3.2 1B model configuration.

customizerConfig.models."meta/llama-3.2-1b".enabled

bool

false

Whether to enable the model.

customizerConfig.models."meta/llama-3.2-1b".max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models."meta/llama-3.2-1b".micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models."meta/llama-3.2-1b".model_path

string

"llama32_1b"

Path where model files are stored.

customizerConfig.models."meta/llama-3.2-1b".model_uri

string

"ngc://nvidia/nemo/llama-3_2-1b:2.0"

NGC model URI for Llama 3.2 1B model.

customizerConfig.models."meta/llama-3.2-1b".num_parameters

int

1000000000

Number of model parameters.

customizerConfig.models."meta/llama-3.2-1b".precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models."meta/llama-3.2-1b".prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models."meta/llama-3.2-1b".training_options

list

[{"finetuning_type":"lora","num_gpus":1,"num_nodes":1,"tensor_parallel_size":1,"training_type":"sft"},{"finetuning_type":"all_weights","num_gpus":1,"num_nodes":1,"tensor_parallel_size":1,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.models."meta/llama-3.2-1b-instruct"

object

This object has the following default values for the Llama 3.2 1B Instruct model.

Llama 3.2 1B Instruct model configuration.

customizerConfig.models."meta/llama-3.2-1b-instruct".enabled

bool

false

Whether to enable the model.

customizerConfig.models."meta/llama-3.2-1b-instruct".max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models."meta/llama-3.2-1b-instruct".micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models."meta/llama-3.2-1b-instruct".model_path

string

"llama32_1b-instruct"

Path where model files are stored.

customizerConfig.models."meta/llama-3.2-1b-instruct".model_uri

string

"ngc://nvidia/nemo/llama-3_2-1b-instruct:2.0"

NGC model URI for Llama 3.2 1B Instruct model.

customizerConfig.models."meta/llama-3.2-1b-instruct".num_parameters

int

1000000000

Number of model parameters.

customizerConfig.models."meta/llama-3.2-1b-instruct".precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models."meta/llama-3.2-1b-instruct".prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models."meta/llama-3.2-1b-instruct".training_options

list

[{"finetuning_type":"lora","num_gpus":1,"num_nodes":1,"tensor_parallel_size":1,"training_type":"sft"},{"finetuning_type":"all_weights","num_gpus":1,"num_nodes":1,"tensor_parallel_size":1,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.models."meta/llama-3.2-3b-instruct"

object

This object has the following default values for the Llama 3.2 3B Instruct model.

Llama 3.2 3B Instruct model configuration.

customizerConfig.models."meta/llama-3.2-3b-instruct".enabled

bool

false

Whether to enable the model.

customizerConfig.models."meta/llama-3.2-3b-instruct".max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models."meta/llama-3.2-3b-instruct".micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models."meta/llama-3.2-3b-instruct".model_path

string

"llama32_3b-instruct"

Path where model files are stored.

customizerConfig.models."meta/llama-3.2-3b-instruct".model_uri

string

"ngc://nvidia/nemo/llama-3_2-3b-instruct:2.0"

NGC model URI for Llama 3.2 3B Instruct model.

customizerConfig.models."meta/llama-3.2-3b-instruct".num_parameters

int

3000000000

Number of model parameters.

customizerConfig.models."meta/llama-3.2-3b-instruct".precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models."meta/llama-3.2-3b-instruct".prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models."meta/llama-3.2-3b-instruct".training_options

list

[{"finetuning_type":"lora","num_gpus":1,"num_nodes":1,"tensor_parallel_size":1,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.models."meta/llama-3.3-70b-instruct"

object

This object has the following default values for the Llama 3.3 70B Instruct model.

Llama 3.3 70B Instruct model configuration.

customizerConfig.models."meta/llama-3.3-70b-instruct".enabled

bool

false

Whether to enable the model.

customizerConfig.models."meta/llama-3.3-70b-instruct".max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models."meta/llama-3.3-70b-instruct".micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models."meta/llama-3.3-70b-instruct".model_path

string

"llama-3_3-70b-instruct_0_0_1"

Path where model files are stored.

customizerConfig.models."meta/llama-3.3-70b-instruct".model_uri

string

"ngc://nvidia/nemo/llama-3_3-70b-instruct:1.0"

NGC model URI for Llama 3.3 70B Instruct model.

customizerConfig.models."meta/llama-3.3-70b-instruct".num_parameters

int

70000000000

Number of model parameters.

customizerConfig.models."meta/llama-3.3-70b-instruct".precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models."meta/llama-3.3-70b-instruct".prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models."meta/llama-3.3-70b-instruct".training_options

list

[{"finetuning_type":"lora","num_gpus":4,"num_nodes":1,"tensor_parallel_size":4,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.models.meta/llama3-70b-instruct

object

This object has the following default values for the Llama 3 70B Instruct model.

Llama 3 70B Instruct model configuration.

customizerConfig.models.meta/llama3-70b-instruct.enabled

bool

false

Whether to enable the model.

customizerConfig.models.meta/llama3-70b-instruct.max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models.meta/llama3-70b-instruct.micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models.meta/llama3-70b-instruct.model_path

string

"llama-3-70b-bf16"

Path where model files are stored.

customizerConfig.models.meta/llama3-70b-instruct.model_uri

string

"ngc://nvidia/nemo/llama-3-70b-instruct:2.0"

NGC model URI for Llama 3 70B Instruct model.

customizerConfig.models.meta/llama3-70b-instruct.num_parameters

int

70000000000

Number of model parameters.

customizerConfig.models.meta/llama3-70b-instruct.precision

string

"bf16-mixed"

Model precision format.

customizerConfig.models.meta/llama3-70b-instruct.prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models.meta/llama3-70b-instruct.training_options

list

[{"finetuning_type":"lora","num_gpus":4,"num_nodes":1,"tensor_parallel_size":4,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.models.microsoft/phi-4

object

This object has the following default values for the Phi-4 model.

Phi-4 model configuration.

customizerConfig.models.microsoft/phi-4.enabled

bool

false

Whether to enable the model.

customizerConfig.models.microsoft/phi-4.max_seq_length

int

4096

Maximum sequence length for input tokens.

customizerConfig.models.microsoft/phi-4.micro_batch_size

int

1

Micro batch size for training.

customizerConfig.models.microsoft/phi-4.model_path

string

"phi-4"

Path where model files are stored.

customizerConfig.models.microsoft/phi-4.model_uri

string

"ngc://nvidia/nemo/phi4:2.0"

NGC model URI for Phi-4 model.

customizerConfig.models.microsoft/phi-4.num_parameters

int

14659507200

Number of model parameters.

customizerConfig.models.microsoft/phi-4.precision

string

"bf16"

Model precision format.

customizerConfig.models.microsoft/phi-4.prompt_template

string

"{prompt} {completion}"

Template for formatting prompts.

customizerConfig.models.microsoft/phi-4.training_options

list

[{"finetuning_type":"lora","num_gpus":1,"num_nodes":1,"training_type":"sft"},{"finetuning_type":"all_weights","num_gpus":4,"num_nodes":2,"tensor_parallel_size":8,"training_type":"sft"}]

Training options for different fine-tuning methods.

customizerConfig.nemoDataStoreURL

string

"http://nds-datastore-http.nds-v2.svc.cluster.local:3000"

URL for the NeMo Data Store microservice.
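
The service URLs must resolve to the corresponding microservices in your cluster; the hostnames and namespaces below are the chart defaults and may differ in your deployment:

```yaml
customizerConfig:
  entityStoreURL: "http://nemo-gateway.nemo-gateway.svc.cluster.local:8000"
  mlflowURL: "http://mlflow-tracking.mlflow-system.svc.cluster.local:80"
  nemoDataStoreURL: "http://nds-datastore-http.nds-v2.svc.cluster.local:3000"
```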

customizerConfig.openTelemetry

object

This object has the following default values for the OpenTelemetry settings.

OpenTelemetry settings.

customizerConfig.openTelemetry.enabled

bool

true

Whether to enable OpenTelemetry.

customizerConfig.openTelemetry.exporterOtlpEndpoint

string

""

Endpoint to access a custom OTLP collector listening on port 4317. Example: "http://$(HOST_IP):4317".

customizerConfig.openTelemetry.logsExporter

string

"otlp"

Sets the logs exporter type (otlp, console, none).

customizerConfig.openTelemetry.metricsExporter

string

"otlp"

Sets the metrics exporter type (otlp, console, none).

customizerConfig.openTelemetry.tracesExporter

string

"otlp"

Sets the traces exporter type (otlp, console, none).
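
A minimal sketch of the OpenTelemetry settings, assuming an OTLP collector reachable through a HOST_IP variable injected via the env value documented below:

```yaml
customizerConfig:
  openTelemetry:
    enabled: true
    exporterOtlpEndpoint: "http://$(HOST_IP):4317"  # custom collector endpoint; empty by default
    logsExporter: otlp
    metricsExporter: otlp
    tracesExporter: otlp
```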

customizerConfig.tolerations

list

[]

Tolerations on the customization job pods.
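
For example, to let customization job pods schedule onto tainted GPU nodes; the taint key and effect below are assumptions about your cluster, not chart defaults:

```yaml
customizerConfig:
  tolerations:
    - key: nvidia.com/gpu   # hypothetical taint key on your GPU nodes
      operator: Exists
      effect: NoSchedule
```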

customizerConfig.training

object

This object has the following default values for the training configuration.

Training configuration for customization jobs.

customizerConfig.training.container_defaults

object

{"env":null,"imagePullPolicy":"IfNotPresent"}

Default container configuration for training jobs.

customizerConfig.training.container_defaults.env

string

nil

Environment variables for the training container. You cannot override environment variables reserved by NeMo Customizer.

customizerConfig.training.pvc.size

string

"5Gi"

Size of the training job PVC.

customizerConfig.training.pvc.storageClass

string

"local-nfs"

Storage class for the training job PVC.

customizerConfig.training.pvc.volumeAccessMode

string

"ReadWriteMany"

Volume access mode for the training job PVC.

customizerConfig.training.queue

string

"default"

Queue name used by the underlying scheduler of NemoTrainingJob. Maps to "resourceGroup" in NemoTrainingJob.

customizerConfig.training.workspace_dir

string

"/pvc/workspace"

Directory path for training workspace.
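
Putting the training keys together, a sketch of a values.yaml override; adjust the storage class and size to your cluster:

```yaml
customizerConfig:
  training:
    queue: default
    workspace_dir: /pvc/workspace
    pvc:
      storageClass: local-nfs   # a storage class that supports the configured access mode
      volumeAccessMode: ReadWriteMany
      size: 5Gi
    container_defaults:
      imagePullPolicy: IfNotPresent
      env: null                 # optional extra variables; reserved variables cannot be overridden
```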

customizerConfig.trainingNetworking

list

[{"name":"NCCL_IB_SL","value":0},{"name":"NCCL_IB_TC","value":41},{"name":"NCCL_IB_QPS_PER_CONNECTION","value":4},{"name":"UCX_TLS","value":"TCP"},{"name":"UCX_NET_DEVICES","value":"eth0"},{"name":"HCOLL_ENABLE_MCAST_ALL","value":0},{"name":"NCCL_IB_GID_INDEX","value":3}]

Network configuration for training jobs on Oracle Kubernetes Engine (OKE) on Oracle Cloud Infrastructure (OCI).

customizerConfig.wandb

object

{"entity":null,"project":"nvidia-nemo-customizer"}

Weights and Biases (WandB) Python SDK initialization configuration for logging and monitoring training jobs in WandB.

customizerConfig.wandb.entity

string

nil

WandB entity (username or team name) under which training runs are logged. Reference: https://docs.wandb.ai/ref/python/init/
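
A sketch of the WandB logging settings; the entity is a placeholder for your own WandB team or username:

```yaml
customizerConfig:
  wandb:
    entity: my-wandb-team           # hypothetical WandB entity
    project: nvidia-nemo-customizer
```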

env

object

{}

A map of environment variables to inject into the NeMo Customizer app container; see the example below.
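
Expanded into values.yaml form, the example from the default documentation injects the pod's host IP and points the OTLP exporter at a collector on that host:

```yaml
env:
  HOST_IP:
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://$(HOST_IP):4317"
```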

externalDatabase

object

This object has the following default values for the external PostgreSQL configuration.

External PostgreSQL configuration.

externalDatabase.database

string

"finetuning"

Name of the database to use.

externalDatabase.existingSecret

string

""

Name of an existing secret resource containing the database credentials.

externalDatabase.existingSecretPasswordKey

string

""

Name of the key in the existing secret that contains the database password.

externalDatabase.host

string

"localhost"

External database host address.

externalDatabase.password

string

"nemo"

Password for the non-root user of the NeMo Customizer microservice.

externalDatabase.port

int

5432

External database port number.

externalDatabase.user

string

"nemo"

Non-root username for the NeMo Customizer microservice.
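
A sketch for pointing NeMo Customizer at an external PostgreSQL instance; the host and secret names are placeholders, and disabling the bundled postgresql subchart (documented below) is the typical companion change:

```yaml
externalDatabase:
  host: postgres.example.internal            # hypothetical external host
  port: 5432
  database: finetuning
  user: nemo
  existingSecret: customizer-db-credentials  # hypothetical secret name
  existingSecretPasswordKey: password        # hypothetical key name
postgresql:
  enabled: false   # typically disabled when an external database is used
```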

fullnameOverride

string

""

String to fully override the chart and release name on resulting objects when deployed.

image

object

This object has the following default values for the NeMo Customizer microservice image.

NeMo Customizer image that supports training and standalone mode.

image.imagePullPolicy

string

"IfNotPresent"

Image pull policy for the NeMo Customizer image.

image.registry

string

"nvcr.io"

Registry for the NeMo Customizer image.

image.repository

string

"nvidia/nemo-microservices/customizer"

Repository for the NeMo Customizer image.

imagePullSecrets

list

[{"name":"nvcrimagepullsecret"}]

Image pull secrets configuration.
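
A sketch of the image settings together with the pull secret; the secret name is the chart default and must reference a registry secret that already exists:

```yaml
image:
  registry: nvcr.io
  repository: nvidia/nemo-microservices/customizer
  imagePullPolicy: IfNotPresent
imagePullSecrets:
  - name: nvcrimagepullsecret
```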

ingress

object

This object has the following default values for the Ingress configuration.

Ingress configuration.

ingress.annotations

object

{}

Additional annotations for the Ingress resource.

ingress.className

string

""

Ingress class name.

ingress.enabled

bool

false

Whether to enable the ingress resource.

ingress.hostname

string

""

Hostname for the ingress resource.

ingress.hosts

list

[]

A list of maps, each containing the keys host and paths, for the ingress resource. You must set this list to configure ingress for the microservice.

ingress.tls

list

[]

TLS configuration for the ingress resource.
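
A sketch of an ingress override; the class name, hostname, and path layout are assumptions about your environment rather than chart defaults:

```yaml
ingress:
  enabled: true
  className: nginx                   # hypothetical ingress class
  hosts:
    - host: customizer.example.com   # hypothetical hostname
      paths:
        - path: /
          pathType: Prefix
  tls: []
```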

logging

object

{"logHealthEndpoints":false,"logLevel":"INFO"}

Logging configuration.

logging.logHealthEndpoints

bool

false

Enable logging for health endpoints.

logging.logLevel

string

"INFO"

Log level for the application.
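
For more verbose logging during troubleshooting, a sketch assuming standard log level names such as DEBUG:

```yaml
logging:
  logLevel: DEBUG          # default is INFO
  logHealthEndpoints: true
```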

modelDownloader

object

This object has the following default values for the model downloader.

Configuration for downloading models to the PVC model cache.

modelDownloader.pollIntervalSeconds

int

15

Interval in seconds to poll for model download status.

modelDownloader.securityContext

object

{"fsGroup":1000,"runAsGroup":1000,"runAsNonRoot":true,"runAsUser":1000}

Security context for the model downloader.

modelDownloader.ttlSecondsAfterFinished

int

7200

Time to live in seconds after the job finishes.

modelsStorage

object

{"accessModes":["ReadWriteMany"],"enabled":true,"size":"1Ti","storageClassName":""}

Configuration for the models PVC, which stores the parent/base models.

modelsStorage.accessModes

list

["ReadWriteMany"]

Access modes for the persistent volume.

modelsStorage.enabled

bool

true

Enable persistent volume for model storage.

modelsStorage.size

string

"1Ti"

Size of the persistent volume.

modelsStorage.storageClassName

string

""

Storage class name for the models PVC. Empty string uses the default storage class.
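
A sketch of the model cache storage values; an empty storageClassName selects the cluster's default storage class:

```yaml
modelsStorage:
  enabled: true
  size: 1Ti
  storageClassName: ""   # empty string uses the default storage class
  accessModes:
    - ReadWriteMany
```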

nameOverride

string

""

String to override chart name on resulting objects when deployed.

nemoDataStoreTools

object

This object has the following default values for the NeMo Data Store tools image.

Tools configuration for downloading entities from and uploading entities to NeMo Data Store.

nemoDataStoreTools.imagePullSecret

string

"nvcrimagepullsecret"

Image pull secret for the NeMo Data Store tools image.

nemoDataStoreTools.registry

string

"nvcr.io"

Registry for the NeMo Data Store tools image.

nemoDataStoreTools.repository

string

"nvidia/nemo-microservices/nds-v2-huggingface-cli"

Repository for the NeMo Data Store tools image.

nemoDataStoreTools.tag

string

""

Tag for the NeMo Data Store tools image.

ngcAPISecret

string

"ngc-api"

Secret used for automatically hydrating the model cache from NGC for enabled models.

ngcAPISecretKey

string

"NGC_API_KEY"

Key in the NGC API secret containing the API key.
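
If your NGC API key lives in a secret with a non-default name, a sketch of the override; the secret name is a placeholder and the secret must already exist with the given key:

```yaml
ngcAPISecret: my-ngc-api-secret   # hypothetical pre-created secret
ngcAPISecretKey: NGC_API_KEY      # key inside the secret that holds the NGC API key
```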

opentelemetry-collector

object

This object has the following default values for the OpenTelemetry Collector configuration.

OpenTelemetry Collector configuration.

opentelemetry-collector.config

object

{"exporters":{"debug":{"verbosity":"detailed"}},"extensions":{"health_check":{},"zpages":{"endpoint":"0.0.0.0:55679"}},"processors":{"batch":{}},"receivers":{"otlp":{"protocols":{"grpc":{},"http":{"cors":{"allowed_origins":["*"]}}}}},"service":{"extensions":["zpages","health_check"],"pipelines":{"logs":{"exporters":["debug"],"processors":["batch"],"receivers":["otlp"]},"metrics":{"exporters":["debug"],"processors":["batch"],"receivers":["otlp"]},"traces":{"exporters":["debug"],"processors":["batch"],"receivers":["otlp"]}}}}

Base collector configuration for the OpenTelemetry Collector.

opentelemetry-collector.enabled

bool

true

Switch to enable or disable the OpenTelemetry Collector.

opentelemetry-collector.image.repository

string

"otel/opentelemetry-collector-k8s"

Repository for the OpenTelemetry Collector image.

opentelemetry-collector.image.tag

string

"0.102.1"

Overrides the image tag whose default is the chart appVersion.

opentelemetry-collector.mode

string

"deployment"

Deployment mode for the OpenTelemetry Collector. Valid values are "daemonset", "deployment", and "statefulset".
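
A sketch of overriding the bundled collector's mode and image tag, using only the keys documented above:

```yaml
opentelemetry-collector:
  enabled: true
  mode: daemonset   # valid values: daemonset, deployment, statefulset
  image:
    repository: otel/opentelemetry-collector-k8s
    tag: "0.102.1"
```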

postgresql

object

This object has the following default values for the PostgreSQL configuration.

PostgreSQL configuration for the NeMo Customizer microservice.

postgresql.architecture

string

"standalone"

PostgreSQL architecture (standalone or replication).

postgresql.auth.database

string

"finetuning"

Name for a custom database to create.

postgresql.auth.enablePostgresUser

bool

true

Whether to assign a password to the "postgres" admin user. Otherwise, remote access will be blocked for this user.

postgresql.auth.existingSecret

string

""

Name of existing secret to use for PostgreSQL credentials.

postgresql.auth.password

string

"nemo"

Password for the custom user to create.

postgresql.auth.username

string

"nemo"

Name for a custom user to create.

postgresql.enabled

bool

true

Whether to enable or disable the PostgreSQL helm chart.
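
When keeping the bundled PostgreSQL subchart, credentials can come from an existing secret instead of a literal password; the secret name below is a placeholder:

```yaml
postgresql:
  enabled: true
  architecture: standalone
  auth:
    username: nemo
    database: finetuning
    existingSecret: my-postgresql-credentials   # hypothetical secret name
```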

replicaCount

int

1

Number of replicas to deploy.

service

object

{"internalPort":9009,"port":8000,"type":"ClusterIP"}

Service configuration.

service.internalPort

int

9009

Internal port for the service.

service.port

int

8000

External port for the service.

service.type

string

"ClusterIP"

Type of Kubernetes service to create.

serviceAccount

object

{"annotations":{},"automountServiceAccountToken":true,"create":true,"name":""}

Service account configuration.

serviceAccount.annotations

object

{}

Annotations to add to the service account.

serviceAccount.automountServiceAccountToken

bool

true

Automatically mount a ServiceAccount’s API credentials.

serviceAccount.create

bool

true

Specifies whether a service account should be created.

serviceAccount.name

string

""

The name of the service account to use. If not set and create is true, a name is generated.

useRunAIExecutor

bool

false

Enable or disable RunAI executor.

wandb

object

This object has the following default values for the WandB configuration.

WandB configuration.

wandb.existingSecret

string

""

Name of an existing Kubernetes secret resource for the WandB encryption secret.

wandb.existingSecretKey

string

""

Name of the key in the existing WandB secret containing the secret value. The secret value must be exactly 32 alphanumeric characters: ^[a-zA-Z0-9]{32}$

wandb.secretValue

string

"ec60d96b639764ccf9859bc10d4363d1"

WandB secret value. Must contain exactly 32 alphanumeric characters. Creates a new Kubernetes secret named "wandb-secret" with key-value pair "encryption_key=<wandb.secretValue>". Ignored if wandb.existingSecret is set.
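
To avoid relying on the default secretValue, a sketch that references a pre-created secret; the names are placeholders, and the stored value must match ^[a-zA-Z0-9]{32}$:

```yaml
wandb:
  existingSecret: my-wandb-encryption-secret   # hypothetical secret name; when set, wandb.secretValue is ignored
  existingSecretKey: encryption-key            # hypothetical key name inside that secret
```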