Customization Config Reference
Tip
Looking for a step-by-step guide? Check out Create Customization Config.
For a complete reference of all customization configuration parameters with constraints and types:
CustomizationConfigInput object
A customization configuration template supported by the Customizer.
Properties
name string
The name of the entity. Must be unique inside the namespace. If not specified, it will be the same as the automatically generated id.
Constraints: max length: 255, pattern:
^[\w\-\+.@:]*$
Default:
namespace string
The namespace of the entity. This can be missing for namespace entities or in deployments that don't use namespaces.
Default:
default
description string
The description of the entity.
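The constraints on name and the default for namespace can be checked client-side before a config is submitted. The following is a minimal sketch, not the service's own validation: the helper names are hypothetical, and it assumes the listed pattern is applied to the whole string.

```python
import re

# Pattern and length limit copied from the constraints listed for `name`.
# Note: Python's \w also matches non-ASCII word characters; whether the
# service behaves the same way is not stated in this reference.
NAME_PATTERN = re.compile(r"[\w\-\+.@:]*")
MAX_NAME_LENGTH = 255


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the listed length and pattern constraints."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None


def with_default_namespace(config: dict) -> dict:
    """Return a copy of `config` with `namespace` set to "default" when absent."""
    return {"namespace": "default", **config}


print(is_valid_name("llama-3.1-8b@v1.0.0+A100"))  # True
print(is_valid_name("bad name"))                  # False (space is not allowed)
```

Because the pattern uses `*`, an empty string also passes this check.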
target string | object
The target to perform the customization on
Any of:
Option 1:
string - A reference to CustomizationTarget.
Option 2:
object - Optional model_path
training_options * array
Resource configuration for each training option for the model.
Array items:
item object
Resource configuration for model training.
Specifies the hardware and parallelization settings for training.
Properties
training_type * string
Allowed values:
dpo, sft, distillation
finetuning_type * string
Allowed values:
lora, lora_merged, all_weights
num_gpus * integer
The number of GPUs per node to use for the specified training
num_nodes integer
The number of nodes to use for the specified training
Default:
1
tensor_parallel_size integer
Number of GPUs used to split individual layers for tensor model parallelism (intra-layer).
Default:
1
data_parallel_size integer
Number of model replicas that process different data batches in parallel, with gradient synchronization across GPUs. Only available on HF checkpoint models. data_parallel_size must equal num_gpus * num_nodes and is set to this value automatically if not provided.
pipeline_parallel_size integer
Number of GPUs used to split the model across layers for pipeline model parallelism (inter-layer). Only available on NeMo 2 checkpoint models. pipeline_parallel_size * tensor_parallel_size must equal num_gpus * num_nodes.
Default:
1
expert_model_parallel_size integer
Number of GPUs used to parallelize the expert (MoE) components of the model. This controls how expert computation is distributed across devices for models that use Mixture-of-Experts. If omitted (null), expert parallelism is not enabled. Setting this value for models that do not use MoE can cause failures during training.
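The arithmetic relating the parallelism sizes to the GPU count can be checked before submitting a config. This is a minimal sketch that takes the stated constraints literally; the function names are hypothetical and the service may apply additional rules.

```python
from typing import Optional


def resolve_data_parallel_size(
    num_gpus: int,
    num_nodes: int = 1,
    data_parallel_size: Optional[int] = None,
) -> int:
    """HF checkpoint models: data_parallel_size must equal num_gpus * num_nodes.

    When the value is omitted, it is filled in with that product.
    """
    total = num_gpus * num_nodes
    if data_parallel_size is None:
        return total
    if data_parallel_size != total:
        raise ValueError(
            f"data_parallel_size={data_parallel_size} must equal "
            f"num_gpus * num_nodes = {total}"
        )
    return data_parallel_size


def model_parallel_sizes_match(
    num_gpus: int,
    num_nodes: int = 1,
    tensor_parallel_size: int = 1,
    pipeline_parallel_size: int = 1,
) -> bool:
    """NeMo 2 checkpoint models: pipeline * tensor must equal num_gpus * num_nodes."""
    return pipeline_parallel_size * tensor_parallel_size == num_gpus * num_nodes


print(resolve_data_parallel_size(num_gpus=8, num_nodes=2))  # 16
print(model_parallel_sizes_match(8, 1, tensor_parallel_size=4, pipeline_parallel_size=2))  # True
```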
use_sequence_parallel boolean
If set, sequences are distributed over multiple GPUs
Default:
False
micro_batch_size * integer
The number of examples per data-parallel rank. More details at: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/batching.html
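A training option can be sanity-checked against the required fields (marked with *) and the allowed values listed for training_type and finetuning_type. This sketch assumes that micro_batch_size belongs to the training option item, which the flattened listing does not make explicit; the function name is hypothetical.

```python
from typing import List

# Allowed values as listed for the two enumerated fields.
TRAINING_TYPES = {"dpo", "sft", "distillation"}
FINETUNING_TYPES = {"lora", "lora_merged", "all_weights"}

# Fields marked with * in the listing. Treating micro_batch_size as part of
# the training option item is an assumption about the nesting.
REQUIRED_FIELDS = ("training_type", "finetuning_type", "num_gpus", "micro_batch_size")


def training_option_errors(option: dict) -> List[str]:
    """Return human-readable problems found in one training option."""
    errors = [f"missing required field: {name}" for name in REQUIRED_FIELDS if name not in option]
    if "training_type" in option and option["training_type"] not in TRAINING_TYPES:
        errors.append(f"unknown training_type: {option['training_type']!r}")
    if "finetuning_type" in option and option["finetuning_type"] not in FINETUNING_TYPES:
        errors.append(f"unknown finetuning_type: {option['finetuning_type']!r}")
    return errors


option = {"training_type": "sft", "finetuning_type": "lora", "num_gpus": 1, "micro_batch_size": 8}
print(training_option_errors(option))  # []
```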
training_precision string
The precision to train the model with, defaults to the target's precision
Allowed values:
int8, bf16, fp16, fp32, fp8-mixed, bf16-mixed
max_seq_length * integer
The largest context used for training. Datasets are truncated based on the maximum sequence length.
pod_spec object
Additional parameters to ensure these training jobs get run on the appropriate hardware.
Examples:
{'annotations': {'nmp/job-type': 'customization'}, 'node_affinity': {'preferredDuringSchedulingIgnoredDuringExecution': [{'preference': {'matchExpressions': [{'key': 'nvidia.com/gpu.count', 'operator': 'Gt', 'values': ['4']}]}, 'weight': 100}, {'preference': {'matchExpressions': [{'key': 'topology.kubernetes.io/zone', 'operator': 'In', 'values': ['us-west-2a', 'us-west-2b']}]}, 'weight': 50}], 'requiredDuringSchedulingIgnoredDuringExecution': {'nodeSelectorTerms': [{'matchExpressions': [{'key': 'nvidia.com/gpu.product', 'operator': 'In', 'values': ['NVIDIA-A100-SXM4-80GB', 'NVIDIA-H100-80GB-HBM3']}, {'key': 'node.kubernetes.io/instance-type', 'operator': 'In', 'values': ['p4d.24xlarge', 'p5.48xlarge']}]}]}}, 'node_selectors': {'kubernetes.io/hostname': 'minikube'}, 'tolerations': [{'effect': 'NoSchedule', 'key': 'app', 'operator': 'Equal', 'value': 'customizer'}]}
Properties
node_selectors object
Additional arguments for node selector
Additional properties schema:
[key: string] string
annotations object
Additional arguments for annotations
Additional properties schema:
[key: string] string
tolerations array
Additional arguments for tolerations
Array items:
item object
Properties
key string
Taint key that the toleration applies to
operator string
Operator: "Exists" or "Equal"
Default:
Equal
value string
Value to match
effect string
Taint effect to match: "NoSchedule", "PreferNoSchedule", or "NoExecute"
tolerationSeconds integer
Only for NoExecute; how long the toleration lasts
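The toleration fields above can be normalized and checked locally before they are placed in pod_spec. This is a rough sketch based only on the field descriptions in this section; the function name is hypothetical and the cluster remains the authority on what it accepts.

```python
OPERATORS = {"Exists", "Equal"}
EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


def normalize_toleration(toleration: dict) -> dict:
    """Apply the default operator and reject values outside the listed options."""
    result = {"operator": "Equal", **toleration}  # operator defaults to "Equal"
    if result["operator"] not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    effect = result.get("effect")
    if effect is not None and effect not in EFFECTS:
        raise ValueError(f"effect must be one of {sorted(EFFECTS)}")
    # tolerationSeconds is described as applying only to NoExecute.
    if "tolerationSeconds" in result and effect != "NoExecute":
        raise ValueError("tolerationSeconds is only meaningful with effect NoExecute")
    return result


# Toleration taken from the pod_spec example in this section.
print(normalize_toleration({"effect": "NoSchedule", "key": "app", "value": "customizer"}))
```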
node_affinity object
The Kubernetes node affinity to apply to the training pods.
Properties
requiredDuringSchedulingIgnoredDuringExecution object
Properties
nodeSelectorTerms * array
Array items:
item object
Properties
matchExpressions array
Array items:
item object
Properties
key * string
operator * string
Allowed values:
In, NotIn, Exists, DoesNotExist, Gt, Lt
values array
Array items:
item string
preferredDuringSchedulingIgnoredDuringExecution array
Array items:
item object
Properties
weight * integer
preference * object
Properties
matchExpressions array
Array items:
item object
Properties
key * string
operator * string
Allowed values:
In, NotIn, Exists, DoesNotExist, Gt, Lt
values array
Array items:
item string
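To reason about which nodes a node_affinity block would select, the matchExpressions operators can be evaluated against a node's labels locally. The sketch below follows the usual Kubernetes operator semantics as an assumption; the function names are hypothetical, and the scheduler's behavior is authoritative.

```python
from typing import Dict


def expression_matches(labels: Dict[str, str], expr: dict) -> bool:
    """Evaluate one matchExpressions item against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        return key not in labels or labels[key] not in values
    if op in ("Gt", "Lt"):
        # Gt/Lt compare the label, read as an integer, against a single value.
        try:
            actual, bound = int(labels[key]), int(values[0])
        except (KeyError, IndexError, ValueError):
            return False
        return actual > bound if op == "Gt" else actual < bound
    raise ValueError(f"unsupported operator: {op}")


def term_matches(labels: Dict[str, str], term: dict) -> bool:
    """A term matches when it has expressions and all of them match."""
    expressions = term.get("matchExpressions", [])
    return bool(expressions) and all(expression_matches(labels, e) for e in expressions)


# Hypothetical node labels, checked against the required term from the pod_spec example.
node = {"nvidia.com/gpu.product": "NVIDIA-A100-SXM4-80GB",
        "node.kubernetes.io/instance-type": "p4d.24xlarge",
        "nvidia.com/gpu.count": "8"}
term = {"matchExpressions": [
    {"key": "nvidia.com/gpu.product", "operator": "In",
     "values": ["NVIDIA-A100-SXM4-80GB", "NVIDIA-H100-80GB-HBM3"]},
    {"key": "node.kubernetes.io/instance-type", "operator": "In",
     "values": ["p4d.24xlarge", "p5.48xlarge"]},
]}
print(term_matches(node, term))  # True
```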
prompt_template string
Prompt template used to extract keys from the dataset.
For example, with prompt_template='{input} {output}' and a sample such as '{"input": "Q: 2x2 A:", "output": "4"}', the model sees 'Q: 2x2 A: 4'.
This parameter is only used for the "SFT" and "Distillation" training types on non-embedding models.
Default:
{prompt} {completion}
chat_prompt_template string
Chat prompt template to apply to the model to make it compatible with chat datasets, or to train it on a different template for your use case.
This parameter is only used for the "SFT" and "Distillation" training types on non-embedding models.
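The prompt_template substitution described earlier in this section can be reproduced locally to preview what the model would see for a given sample. This sketch assumes plain brace substitution equivalent to Python's str.format; the Customizer's actual rendering logic is not specified here, and render_prompt is a hypothetical helper.

```python
import json


def render_prompt(template: str, sample: dict) -> str:
    """Fill each {key} placeholder in `template` with the matching sample field."""
    return template.format(**sample)


# Sample record from the prompt_template description.
sample = json.loads('{"input": "Q: 2x2 A:", "output": "4"}')
print(render_prompt("{input} {output}", sample))  # Q: 2x2 A: 4
```

With the default template, {prompt} {completion}, the same helper expects samples that carry prompt and completion keys.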
dataset_schemas array
JSON Schema used for validating datasets that can be used with the configured finetuning jobs.
Array items:
item object
Allows additional properties: Yes
project string
The URN of the project associated with this entity.
custom_fields object
A set of custom fields that the user can define and use for various purposes.
Allows additional properties: Yes
ownership object
Ownership information for the entity
Properties
created_by string
The ID of the user that created this entity.
Default:
updated_by string
The ID of the user that last updated this entity.
access_policies object
A general object for capturing access policies which can be used by an external service to determine ACLs
Default:
{}
Additional properties schema:
[key: string] string