Create Customization Config#

Prerequisites#

Before you can create a customization configuration, make sure that you have:


Options#

You can create a customization configuration in the following ways.

API#

The following example defines a pod_spec with a toleration that permits jobs to run on nodes tainted with app=a100-workload:NoSchedule, and a node selector that restricts scheduling to NVIDIA A100 80GB nodes.

For more information about GPU cluster configurations, see Configure Cluster GPUs.
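As a rough illustration of why the toleration in the example matches the node taint (this sketch is standard Kubernetes matching semantics, not part of the Customizer API):

```python
# Sketch of Kubernetes toleration matching for the "Equal" operator.
# A pod may schedule onto a tainted node only if it tolerates every taint;
# this simplified check covers the single-taint case used in the example.
def tolerates(toleration: dict, taint: dict) -> bool:
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("operator", "Equal") == "Equal"
        and toleration.get("value") == taint["value"]
        # An empty effect in a toleration matches any taint effect.
        and toleration.get("effect") in (None, taint["effect"])
    )

taint = {"key": "app", "value": "a100-workload", "effect": "NoSchedule"}
toleration = {
    "key": "app",
    "operator": "Equal",
    "value": "a100-workload",
    "effect": "NoSchedule",
}
print(tolerates(toleration, taint))  # True
```

Because the taint effect is NoSchedule, pods without this toleration are never scheduled onto the tainted A100 nodes; the node_selectors entry then confines the customization jobs to those nodes.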

  1. Perform a POST request to the /v1/customization/configs endpoint.

    curl -X POST \
      "${CUSTOMIZER_SERVICE_URL}/v1/customization/configs" \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
        "name": "llama-3.1-8b-instruct@v1.0.0+A100",
        "namespace": "default",
        "description": "Configuration for Llama 3.1 8B on A100 GPUs",
        "target": "meta/llama-3.1-8b-instruct@2.0",
        "training_options": [
           {
              "training_type": "sft",
              "finetuning_type": "lora",
              "num_gpus": 2,
              "micro_batch_size": 8,
              "tensor_parallel_size": 1,
              "pipeline_parallel_size": 1,
              "use_sequence_parallel": false
          }
        ],
        "training_precision": "bf16",
        "max_seq_length": 2048,
        "prompt_template": "{input} {output}",
        "pod_spec": {
          "tolerations": [
            {
              "key": "app",
              "operator": "Equal",
              "value": "a100-workload",
              "effect": "NoSchedule"
            }
          ],
          "node_selectors": {
            "nvidia.com/gpu.product": "NVIDIA-A100-80GB"
          },
          "annotations": {
            "sidecar.istio.io/inject": "false"
          }
        }
      }' | jq
    
    import os

    import requests

    # Assumes the service URL is exported in the environment,
    # matching the ${CUSTOMIZER_SERVICE_URL} variable in the curl example.
    CUSTOMIZER_SERVICE_URL = os.environ["CUSTOMIZER_SERVICE_URL"]

    url = f"{CUSTOMIZER_SERVICE_URL}/v1/customization/configs"
    payload = {
        "name": "llama-3.1-8b-instruct@v1.0.0+A100",
        "namespace": "default",
        "description": "Configuration for Llama 3.1 8B on A100 GPUs",
        "target": "meta/llama-3.1-8b-instruct@2.0",
        "training_options": [
            {
                "training_type": "sft",
                "finetuning_type": "lora",
                "num_gpus": 2,
                "micro_batch_size": 8,
                "tensor_parallel_size": 1,
                "pipeline_parallel_size": 1,
                "use_sequence_parallel": False
            }
        ],
        "training_precision": "bf16",
        "max_seq_length": 2048,
        "prompt_template": "{input} {output}",
        "pod_spec": {
            "tolerations": [
                {
                    "key": "app",
                    "operator": "Equal",
                    "value": "a100-workload",
                    "effect": "NoSchedule"
                }
            ],
            "node_selectors": {
                "nvidia.com/gpu.product": "NVIDIA-A100-80GB"
            },
            "annotations": {
                "sidecar.istio.io/inject": "false"
            }
        }
    }
    headers = {
        "accept": "application/json",
        "Content-Type": "application/json"
    }
    response = requests.post(url, json=payload, headers=headers)
    response.raise_for_status()
    print(response.json())
    
  2. Review the response.

    Example Response
    {
        "id": "customization_config-MedVscVbr4pgLhLgKTLbv9",
        "name": "llama-3.1-8b-instruct@v1.0.0+A100",
        "namespace": "default",
        "description": "Configuration for Llama-3.2-1b on A100 GPUs",
        "target": "my-target",
        "training_options": [
            {
                "training_type": "sft",
                "finetuning_type": "lora",
                "num_gpus": 2,
                "micro_batch_size": 8,
                "tensor_parallel_size": 1,
                "pipeline_parallel_size": 1,
                "use_sequence_parallel": false
            }
        ],
        "training_precision": "bf16",
        "max_seq_length": 2048,
        "pod_spec": {},
        "prompt_template": "{input} {output}",
        "chat_prompt_template": null,
        "dataset_schemas": [],
        "custom_fields": {},
        "ownership": {}
     }
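A minimal sketch of how you might validate the response before using the new configuration. The field names follow the example response above; `extract_config` is a hypothetical helper, not part of any Customizer SDK:

```python
# Hypothetical helper: pull the identifiers from a create-config
# response and sanity-check the id prefix seen in the example response.
def extract_config(resp: dict) -> tuple[str, str]:
    config_id = resp["id"]
    if not config_id.startswith("customization_config-"):
        raise ValueError(f"unexpected config id: {config_id}")
    # Assumed "<namespace>/<name>" reference format; check the jobs API
    # for the exact field and format it expects.
    return config_id, f'{resp["namespace"]}/{resp["name"]}'

resp = {
    "id": "customization_config-MedVscVbr4pgLhLgKTLbv9",
    "name": "llama-3.1-8b-instruct@v1.0.0+A100",
    "namespace": "default",
}
config_id, config_ref = extract_config(resp)
print(config_ref)  # default/llama-3.1-8b-instruct@v1.0.0+A100
```

Checking the id prefix early surfaces a malformed or error response at creation time rather than later, when a customization job fails to find the configuration.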