Datasets
The Datasets page in NeMo Studio provides a centralized interface for managing your datasets. You can upload, view, search, and organize datasets and use them for tasks such as model fine-tuning and evaluation.
Backend Microservices
On the backend, the UI communicates with NeMo Entity Store and NeMo Data Store to manage dataset files and dataset entities, including their metadata.
Datasets Page UI Overview
The following are the main components and features of the Datasets page.
Dataset Listing
The Datasets page displays your datasets in a table format with the following columns:
Dataset Name: The name of your dataset.
Description: A brief description of the dataset.
Created: Timestamp showing when the dataset was created.
Updated: Timestamp showing when the dataset was last updated.
To sort the table, click a column header.
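As a rough illustration of the table described above, the following Python sketch models one row and sorts rows by a chosen column. The `DatasetRow` class, its field names, and the `sort_rows` helper are hypothetical and only mirror the column names listed here; they are not taken from the NeMo Studio or Entity Store schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DatasetRow:
    """One table row. Field names are illustrative, not a NeMo Studio schema."""
    name: str
    description: str
    created: datetime
    updated: datetime


def sort_rows(rows, column, descending=False):
    """Return a new list ordered by one column, similar to clicking a header."""
    return sorted(rows, key=lambda row: getattr(row, column), reverse=descending)


# Made-up sample rows for demonstration only.
rows = [
    DatasetRow("qa-eval", "Evaluation questions", datetime(2025, 1, 10), datetime(2025, 2, 1)),
    DatasetRow("chat-train", "Fine-tuning chats", datetime(2025, 1, 5), datetime(2025, 3, 1)),
]

most_recently_updated = sort_rows(rows, "updated", descending=True)
print([row.name for row in most_recently_updated])
```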
Dataset Management
You can perform the following actions on the Datasets page:
Create Dataset: Add a new dataset and upload files to it.
Search and Filter: Find datasets by name or apply filters.
To perform the following actions on a dataset, click the three-dot menu icon for that dataset:
View Dataset: Access detailed information about the dataset, including file contents and metadata.
Edit Dataset: Edit the dataset’s name, description, or files.
Delete Dataset: Delete the dataset.
To perform the following actions on a dataset file, click the three-dot menu icon for that file:
View JSON: View the file contents in JSON format.
Split: Split the file into training, evaluation, and validation datasets.
Download: Download the file to your local system.
Rename: Change the file name.
Delete: Remove the file from the dataset.
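To make the Split action more concrete, here is a minimal Python sketch of dividing a list of records into training, evaluation, and validation subsets. This page does not describe how NeMo Studio performs the split, so the `split_records` function, its default ratios, and the shuffling step are all assumptions chosen for illustration.

```python
import random


def split_records(records, train=0.8, evaluation=0.1, seed=0):
    """Shuffle records and cut them into three lists.

    The validation share is whatever remains after the training and
    evaluation shares. The ratios and seed are illustrative defaults.
    """
    if train < 0 or evaluation < 0 or train + evaluation > 1:
        raise ValueError("train and evaluation must be non-negative and sum to at most 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    train_end = int(len(shuffled) * train)
    eval_end = train_end + int(len(shuffled) * evaluation)
    return {
        "training": shuffled[:train_end],
        "evaluation": shuffled[train_end:eval_end],
        "validation": shuffled[eval_end:],
    }


# Made-up records standing in for the rows of a dataset file.
parts = split_records([{"id": i} for i in range(10)])
print({name: len(subset) for name, subset in parts.items()})
```

Fixing the seed keeps the three subsets stable across runs, which is convenient when the same file is split more than once.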
Formatting Dataset Files
Format dataset files according to the requirements for the model type or evaluation flow. Refer to the following guides for more information:
For model fine-tuning datasets, refer to Dataset Format Requirements.
For evaluation datasets, refer to Evaluation Flows.
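Before uploading a file, a quick local check can catch malformed records. The sketch below assumes a JSON Lines file in which every record is an object with `prompt` and `completion` keys. Both the file layout and the key names are assumptions for illustration; the guides referenced above define the actual requirements for each model type and evaluation flow. The `check_jsonl` helper is hypothetical and is not part of any NeMo tool.

```python
import json


def check_jsonl(text, required_keys=("prompt", "completion")):
    """Return (line_number, problem) pairs found in a JSON Lines string.

    required_keys is an assumed schema; adjust it to the format that the
    relevant guide specifies.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        missing = [key for key in required_keys if key not in record]
        if missing:
            problems.append((number, "missing keys: " + ", ".join(missing)))
    return problems


# Made-up sample: line 1 is valid, line 2 lacks a key, line 3 is not JSON.
sample = '{"prompt": "Hi", "completion": "Hello"}\n{"prompt": "Only"}\nnot json\n'
for number, problem in check_jsonl(sample):
    print(f"line {number}: {problem}")
```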