# Customize the Evaluation Loop
In this tutorial, you learn how to fine-tune the Llama 3.1 8B Instruct model on a sample dataset, and how to measure the improvement by running evaluation jobs before and after customization.
Note
The time to complete this tutorial is approximately 45 minutes. In this tutorial, you run a customization job and two evaluation jobs. Customization job duration increases with model parameter count and dataset size. For more information about evaluation job duration, refer to Expected Evaluation Duration.
## Prerequisites
Before you begin, complete the following prerequisites:
## Upload Datasets
To fine-tune a model, you first need a dataset split into three parts: one for training, one for testing, and one for validation. All data must be in JSONL format, where each line is a JSON object with `prompt` and `completion` fields whose values fit the task you are training the model to perform. An example line for a question-answering dataset could be:
{"prompt": "What is the distance from the earth to the sun? A: ", "completion": "93 millions miles"}
The following steps show how to use the Hugging Face APIs integrated into NeMo Data Store to download the demo datasets and upload them to the `default` namespace. The demo dataset is a subset of SQuAD, prepared for question answering.
Set the following environment variables:
```bash
export HF_ENDPOINT="http://data-store.test/v1/hf"
export HF_TOKEN="dummy-unused-value"
```
Create a dataset repository:
```bash
huggingface-cli repo create sample-basic-test --type dataset
```
Create a new folder `~/tmp/sample_test_data`:

```bash
mkdir -p ~/tmp/sample_test_data
```
Download the following sample datasets that you’ll use for fine-tuning and evaluating the Llama 3.1 8B Instruct model. Ensure that the datasets are accessible in the respective directories outlined as follows:
- Save the training dataset (618.61 KiB) in the local `~/tmp/sample_test_data/training` directory.
- Save the validation dataset (75.69 KiB) in the local `~/tmp/sample_test_data/validation` directory.
- Save the test dataset (80.68 KiB) in the local `~/tmp/sample_test_data/testing` directory.
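If you want to create all three directories before saving the downloads, shell brace expansion does it in one command:

```bash
# Create the training/validation/testing layout expected by the services
mkdir -p ~/tmp/sample_test_data/{training,validation,testing}
```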
The following is the folder structure under `~/tmp/sample_test_data`:

```
├── testing
│   └── testing.jsonl
├── training
│   └── training.jsonl
└── validation
    └── validation.jsonl
```
Note

NeMo Customizer and NeMo Evaluator expect testing files to be in the `testing` folder, training files to be in the `training` folder, and validation files to be in the `validation` folder. Make sure that you put the files in the right places.

Upload the datasets:
Note

Make sure you point at folders that only contain the `.jsonl` files you want to use in the dataset. If your dataset folder is large, you may have to upload the files individually.

```bash
huggingface-cli upload --repo-type dataset default/sample-basic-test ~/tmp/sample_test_data
```
Example Output

```
Consider using `hf_transfer` for faster uploads. This solution comes with some limitations. See https://huggingface.co/docs/huggingface_hub/hf_transfer for more details.
Start hashing 3 files.
Finished hashing 3 files.
training.jsonl: 100%|████████████| 618.6k/618.6k [00:00<00:00, 131kB/s]
testing.jsonl: 100%|████████████| 82.6k/82.6k [00:00<00:00, 141kB/s]
validation.jsonl: 100%|████████████| 77.5k/77.5k [00:00<00:00, 126kB/s]
Upload 3 LFS files: 100%|████████████| 3/3 [00:00<00:00, 2.00it/s]
https://nemo-datastore-endpoint/v1/hf/datasets/sample-basic-test/tree/main/.
```
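If you do need to upload files individually, a loop such as the following sketch works; it assumes the `training`/`validation`/`testing` layout shown above:

```bash
# Upload each split's file on its own, preserving the folder layout in the repository
for split in training validation testing; do
  huggingface-cli upload --repo-type dataset default/sample-basic-test \
    ~/tmp/sample_test_data/${split}/${split}.jsonl ${split}/${split}.jsonl
done
```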
To use a dataset for operations such as evaluations and customizations, you need to register it using the `/v1/datasets` endpoint. Registering the dataset enables you to refer to it by its namespace and name afterward.

Register the dataset created in the previous step. Format the `files_url` field as `hf://datasets/{namespace}/{dataset-name}`.

```bash
curl -X POST "http://nemo.test/v1/datasets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "sample-basic-test",
    "namespace": "default",
    "description": "This is an example of a dataset",
    "files_url": "hf://datasets/default/sample-basic-test",
    "project": "sample_project"
  }' | jq
```
Example Output

```json
{
  "schema_version": "1.0",
  "id": "dataset-9C5GLZ6i9ZgW3cZ9RHP9ej",
  "description": "This is an example of a dataset",
  "type_prefix": null,
  "namespace": "default",
  "project": "sample_project",
  "created_at": "2024-12-09T01:01:28.54288",
  "updated_at": "2024-12-09T01:01:28.606851",
  "custom_fields": {},
  "ownership": null,
  "name": "sample-basic-test",
  "version_id": "main",
  "version_tags": [],
  "format": null,
  "files_url": "hf://datasets/default/sample-basic-test"
}
```
## Evaluate the Llama 3.1 8B Instruct Model
Run a custom evaluation job to evaluate a model on custom datasets by comparing the LLM-generated response with a ground-truth response. A custom evaluation with `bleu` and `string-check` metrics is ideal for cases where the LLM generations are not expected to be highly creative.
For more information, refer to Custom Evaluations.
To evaluate the Llama 3.1 8B Instruct model using the dataset that was uploaded in the Upload Datasets section, complete the following:
Create a target by running the following command:
```bash
curl -X POST \
  "http://nemo.test/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "model",
    "model": {
      "api_endpoint": {
        "url": "http://nemo-nim-proxy:8000/v1/completions",
        "model_id": "meta/llama-3.1-8b-instruct"
      }
    }
  }' | jq
```
This returns a target ID. To see a sample response, refer to Create Target Response.
Set an environment variable for the target:
```bash
export EVALUATOR_TARGET=default/<target-id>
```
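If you prefer to script this step, you can create the target and capture its ID in one pass instead of copying it by hand. A sketch, assuming the creation response carries a top-level `id` field as shown in Create Target Response:

```bash
# Create the target and derive EVALUATOR_TARGET from the response's "id" field
TARGET_ID=$(curl -s -X POST "http://nemo.test/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "model",
    "model": {
      "api_endpoint": {
        "url": "http://nemo-nim-proxy:8000/v1/completions",
        "model_id": "meta/llama-3.1-8b-instruct"
      }
    }
  }' | jq -r '.id')
export EVALUATOR_TARGET="default/${TARGET_ID}"
```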
Create a `custom` evaluation config:

```bash
curl -X POST \
  "http://nemo.test/v1/evaluation/configs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "custom",
    "params": { "parallelism": 4 },
    "tasks": {
      "my-custom-task": {
        "type": "completion",
        "params": {
          "template": {
            "prompt": "{{prompt}}",
            "max_tokens": 20,
            "temperature": 0.7,
            "top_p": 0.9
          }
        },
        "dataset": {
          "files_url": "hf://datasets/default/sample-basic-test/testing/testing.jsonl"
        },
        "metrics": {
          "bleu": {
            "type": "bleu",
            "params": { "references": ["{{ideal_response}}"] }
          },
          "string-check": {
            "type": "string-check",
            "params": { "check": ["{{ideal_response | trim}}", "equals", "{{output_text | trim}}"] }
          }
        }
      }
    }
  }' | jq
```
This command returns the configuration ID. To see a sample response, refer to Create Config Response.
Store the configuration information as an environment variable:
```bash
export EVALUATOR_CONFIG=default/<config-id>
```
Submit an evaluation job using the target and the configuration environment variables:
```bash
curl -X POST \
  "http://nemo.test/v1/evaluation/jobs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d "{
    \"target\": \"${EVALUATOR_TARGET}\",
    \"config\": \"${EVALUATOR_CONFIG}\"
  }" | jq
```
To see a sample response, refer to Create Job Response.
Copy and save the evaluation job ID from the response.
```bash
export EVALUATION_JOB_ID=<evaluation-job-id>
```
Check the status of the evaluation job using the following command:
```bash
curl -X GET \
  "http://nemo.test/v1/evaluation/jobs/${EVALUATION_JOB_ID}" \
  -H 'accept: application/json' | jq
```
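Rather than re-running the command manually, you can poll until the job reaches a terminal state. A minimal sketch, assuming the response carries a top-level `status` field and that `failed` is a possible terminal value alongside `completed`:

```bash
# Poll the evaluation job every 30 seconds until it completes or fails
while true; do
  STATUS=$(curl -s "http://nemo.test/v1/evaluation/jobs/${EVALUATION_JOB_ID}" \
    -H 'accept: application/json' | jq -r '.status')
  echo "status: ${STATUS}"
  if [ "${STATUS}" = "completed" ] || [ "${STATUS}" = "failed" ]; then
    break
  fi
  sleep 30
done
```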
After the job completes, you can see the evaluation results by running the following command:
```bash
curl -X GET \
  "http://nemo.test/v1/evaluation/jobs/${EVALUATION_JOB_ID}/results" \
  -H 'accept: application/json' | jq
```
Example Results

```json
{
  "created_at": "2025-03-19T17:21:23.016944",
  "updated_at": "2025-03-19T17:21:23.016946",
  "id": "evaluation_result-4hDyVY4XwLnDDp645yVgu1",
  "job": "eval-EHBprrm8kMYQgjPZX31GSs",
  "tasks": {
    "my-custom-task": {
      "metrics": {
        "bleu": {
          "scores": {
            "sentence": {
              "value": 3.316167505330637,
              "stats": {
                "count": 90,
                "sum": 298.45507547975734,
                "mean": 3.316167505330637
              }
            },
            "corpus": {
              "value": 2.1254915037706192
            }
          }
        },
        "string-check": {
          "scores": {
            "string-check": {
              "value": 0,
              "stats": {
                "count": 90,
                "sum": 0,
                "mean": 0
              }
            }
          }
        }
      }
    }
  },
  "groups": {},
  "namespace": "default",
  "custom_fields": {}
}
```
For more information about custom evaluation jobs, refer to Custom Evaluations.
## Customize the Llama 3.1 8B Instruct Model
Use the following procedure to fine-tune the Llama 3.1 8B Instruct model with the uploaded datasets.
Run the customization command:
```bash
curl -X POST \
  "http://nemo.test/v1/customization/jobs" \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "config": "meta/llama-3.1-8b-instruct",
    "dataset": {
      "name": "sample-basic-test",
      "namespace": "default"
    },
    "hyperparameters": {
      "training_type": "sft",
      "finetuning_type": "lora",
      "epochs": 3,
      "batch_size": 16,
      "learning_rate": 0.0001,
      "lora": { "adapter_dim": 16 }
    },
    "project": "test-project",
    "ownership": {
      "created_by": "me",
      "access_policies": { "arbitrary": "json" }
    },
    "output_model": "default/test-example-model@v1"
  }' | jq
```
Example Output

```json
{
  "id": "cust-JGTaMbJMdqjJU8WbQdN9Q2",
  "created_at": "2024-12-09T04:06:28.542884",
  "updated_at": "2024-12-09T04:06:28.542884",
  "config": {
    "schema_version": "1.0",
    "id": "af783f5b-d985-4e5b-bbb7-f9eec39cc0b1",
    "created_at": "2024-12-09T04:06:28.542657",
    "updated_at": "2024-12-09T04:06:28.569837",
    "custom_fields": {},
    "name": "meta/llama-3_1-8b-instruct",
    "base_model": "meta/llama-3_1-8b-instruct",
    "model_path": "llama-3_1-8b-instruct",
    "training_types": [],
    "finetuning_types": ["lora"],
    "precision": "bf16",
    "num_gpus": 4,
    "num_nodes": 1,
    "micro_batch_size": 1,
    "tensor_parallel_size": 1,
    "max_seq_length": 4096
  },
  "dataset": {
    "schema_version": "1.0",
    "id": "dataset-XU4pvGzr5tvawnbVxeJMTb",
    "created_at": "2024-12-09T04:06:28.542657",
    "updated_at": "2024-12-09T04:06:28.542660",
    "custom_fields": {},
    "name": "default/sample-basic-test",
    "version_id": "main",
    "version_tags": []
  },
  "hyperparameters": {
    "finetuning_type": "lora",
    "training_type": "sft",
    "batch_size": 16,
    "epochs": 10,
    "learning_rate": 0.0001,
    "lora": {
      "adapter_dim": 16
    }
  },
  "output_model": "default/test-example-model@v1",
  "status": "created",
  "project": "test-project",
  "custom_fields": {},
  "ownership": {
    "created_by": "me",
    "access_policies": {
      "arbitrary": "json"
    }
  }
}
```
Store the customization job ID as an environment variable. Replace the placeholder with the `id` value from the JSON response (`cust-JGTaMbJMdqjJU8WbQdN9Q2` in the example above):

```bash
export CUST_ID=<customization-job-id>
```
Check the status of the customization job. Use the following command to verify that the job has completed:
curl "http://nemo.test/v1/customization/jobs/${CUST_ID}/status" | jq
If the `status` field changes to `completed`, the job has finished creating and uploading the `output_model`.

Example Successful Response

```json
{
  "created_at": "2024-12-09T04:06:28.580220",
  "updated_at": "2024-12-09T04:21:19.852832",
  "status": "completed",
  "steps_completed": 1210,
  "epochs_completed": 10,
  "percentage_done": 100.0,
  "best_epoch": 3,
  "train_loss": 1.718016266822815,
  "val_loss": 1.8661999702453613
}
```
Test the fine-tuned model by sending a prompt to the `output_model`:

```bash
curl -X POST "http://nim.test/v1/completions" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "default/test-example-model@v1",
    "prompt": "When is the upcoming GTC event? GTC 2018 attracted over 8,400 attendees. Due to the COVID pandemic of 2020, GTC 2020 was converted to a digital event and drew roughly 59,000 registrants. The 2021 GTC keynote, which was streamed on YouTube on April 12, included a portion that was made with CGI using the Nvidia Omniverse real-time rendering platform. This next GTC will take place in the middle of March, 2023. Answer: ",
    "max_tokens": 128
  }' | jq
```
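The endpoint follows the OpenAI-compatible completions schema that NIM exposes, so you can reduce the response to just the generated text with a `jq` filter. A sketch with a shortened prompt for brevity:

```bash
# Same call as above, printing only the generated completion text
curl -s -X POST "http://nim.test/v1/completions" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "default/test-example-model@v1",
    "prompt": "When is the upcoming GTC event? Answer: ",
    "max_tokens": 128
  }' | jq -r '.choices[0].text'
```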
## Run Another Custom Evaluation Job on the Fine-Tuned Model
Run another evaluation job with the same configuration that you used to evaluate the Llama 3.1 8B Instruct model. Verify that the evaluation metrics have changed after the customization.
To run a custom evaluation job on the fine-tuned model, do the following:
Create a new target by running the following command:
```bash
curl -X POST \
  "http://nemo.test/v1/evaluation/targets" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "model",
    "model": {
      "api_endpoint": {
        "url": "http://nemo-nim-proxy:8000/v1/completions",
        "model_id": "default/test-example-model@v1"
      }
    }
  }' | jq
```
To see a sample response, refer to Create Target Response.
Set up an environment variable for the newly created target from the previous step.
```bash
export CUSTOMIZED_MODEL_TARGET=default/<new-target-id>
```
Submit an evaluation job with the new target and the existing configuration from the section Evaluate the Llama 3.1 8B Instruct Model.
```bash
curl -X POST \
  "http://nemo.test/v1/evaluation/jobs" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d "{
    \"target\": \"${CUSTOMIZED_MODEL_TARGET}\",
    \"config\": \"${EVALUATOR_CONFIG}\"
  }" | jq
```
To see a sample response, refer to Create Job Response.
Create an environment variable for the newly created evaluation job ID.
```bash
export CUSTOMIZED_MODEL_EVALUATION_JOB_ID=<new-evaluation-job-id>
```
Check the status of the evaluation job by using the following command:
```bash
curl -X GET \
  "http://nemo.test/v1/evaluation/jobs/${CUSTOMIZED_MODEL_EVALUATION_JOB_ID}" \
  -H 'accept: application/json' | jq
```
While the job is running, the status is `running`. After the job completes, the status is `completed`.

After the job completes, you can see the results of the evaluation by using the following command:
```bash
curl -X GET \
  "http://nemo.test/v1/evaluation/jobs/${CUSTOMIZED_MODEL_EVALUATION_JOB_ID}/results" \
  -H 'accept: application/json' | jq
```
Example Results

```json
{
  "created_at": "2025-03-19T17:46:34.216345",
  "updated_at": "2025-03-19T17:46:34.216347",
  "id": "evaluation_result-LDRwuXthNK8XtjLwNtL2ZG",
  "job": "eval-RNLGdCpnWc7XYGkXER2sAv",
  "tasks": {
    "my-custom-task": {
      "metrics": {
        "bleu": {
          "scores": {
            "sentence": {
              "value": 58.27715543476614,
              "stats": {
                "count": 90,
                "sum": 5244.943989128952,
                "mean": 58.27715543476614
              }
            },
            "corpus": {
              "value": 30.477136043689168
            }
          }
        },
        "string-check": {
          "scores": {
            "string-check": {
              "value": 0.4666666666666667,
              "stats": {
                "count": 90,
                "sum": 42,
                "mean": 0.4666666666666667
              }
            }
          }
        }
      }
    }
  },
  "groups": {},
  "namespace": "default",
  "custom_fields": {}
}
```
Check for improvement in the scores. For example, the `corpus` score for `bleu` improves from ~2.12 to ~30.47, and the `string-check` score improves from 0 to ~0.47.
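To compare the two runs side by side, you can loop over both job IDs and print the headline scores; the `jq` paths match the results payloads shown above:

```bash
# Print corpus BLEU and string-check accuracy for the base and fine-tuned evaluation jobs
for JOB_ID in "${EVALUATION_JOB_ID}" "${CUSTOMIZED_MODEL_EVALUATION_JOB_ID}"; do
  curl -s "http://nemo.test/v1/evaluation/jobs/${JOB_ID}/results" \
    -H 'accept: application/json' \
    | jq '{job,
           corpus_bleu: .tasks["my-custom-task"].metrics.bleu.scores.corpus.value,
           string_check: .tasks["my-custom-task"].metrics["string-check"].scores["string-check"].value}'
done
```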