Datasets

The Datasets page in NeMo Studio provides a centralized interface for managing your datasets. You can upload, view, search, and organize datasets and use them for tasks such as model fine-tuning and evaluation.


Backend Microservices

On the backend, the UI communicates with NeMo Entity Store and NeMo Data Store to manage dataset entities, their metadata, and the underlying dataset files.
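As a rough illustration of the kind of backend call involved, the sketch below prepares (but does not send) a request that registers a dataset entity. The base URL, the `/v1/datasets` path, the payload field names, and the `hf://datasets/...` file location format are assumptions made for illustration, not details stated on this page. Check the NeMo Entity Store API reference for your deployment before relying on them.

```python
import json
import urllib.request

# Hypothetical base URL; replace with your deployment's Entity Store address.
ENTITY_STORE_URL = "http://entity-store.example.internal:8000"


def build_create_dataset_request(namespace, name, description):
    """Build (without sending) a request that registers a dataset entity.

    The endpoint path and payload fields are assumptions for illustration.
    """
    payload = {
        "namespace": namespace,
        "name": name,
        "description": description,
        # Assumed convention: the entity points at files held in NeMo Data Store.
        "files_url": f"hf://datasets/{namespace}/{name}",
    }
    return urllib.request.Request(
        url=f"{ENTITY_STORE_URL}/v1/datasets",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_dataset_request("default", "my-dataset", "Sample data")
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

Nothing is transmitted until the request object is passed to `urllib.request.urlopen`, so the sketch is safe to run without a live service.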


Datasets Page UI Overview

The following are the main components and features of the Datasets page.

Dataset Listing

The Datasets page displays your datasets in a table format with the following columns:

  • Dataset Name: The name of your dataset.

  • Description: A brief description of the dataset.

  • Created: The timestamp of when the dataset was created.

  • Updated: The timestamp of when the dataset was last updated.

To sort the table, click a column header.

Dataset Management

You can perform the following actions on the Datasets page:

  • Create Dataset: Add a new dataset and upload files to it.

  • Search and Filter: Find datasets by name or apply filters.

You can perform the following actions on each dataset by clicking its three-dot menu icon:

  • View Dataset: Access detailed information about the dataset, including file contents and metadata.

  • Edit Dataset: Edit the dataset’s name, description, or files.

  • Delete Dataset: Delete the dataset.

You can perform the following actions on each dataset file by clicking its three-dot menu icon:

  • View JSON: View the file contents in JSON format.

  • Split: Split the file into training, evaluation, and validation datasets.

  • Download: Download the file to your local system.

  • Rename: Change the file name.

  • Delete: Remove the file from the dataset.
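To make the Split action more concrete, the sketch below shows one way the records of a JSON Lines file could be divided into three parts. It is not NeMo Studio's implementation: the `split_records` function, the 80/10/10 ratios, the fixed seed, and the sample records are all hypothetical.

```python
import json
import random


def split_records(records, train_frac=0.8, validation_frac=0.1, seed=0):
    """Shuffle records and split them into training, validation, and evaluation sets.

    The fractions and seed are hypothetical defaults. Whatever remains after the
    training and validation slices becomes the evaluation set.
    """
    if train_frac < 0 or validation_frac < 0 or train_frac + validation_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_validation = int(len(shuffled) * validation_frac)
    return {
        "training": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_validation],
        "evaluation": shuffled[n_train + n_validation:],
    }


# Hypothetical JSON Lines content: one JSON object per line.
jsonl_text = "\n".join(
    json.dumps({"id": i, "text": f"example {i}"}) for i in range(10)
)
records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

splits = split_records(records)
for split_name, rows in splits.items():
    print(split_name, len(rows))
```

Shuffling before slicing avoids ordering bias in the source file, and every record lands in exactly one of the three sets.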


Formatting Dataset Files

Format dataset files according to the requirements for the model type or evaluation flow. Refer to the following guides for more information: