Datasets#
Use datasets in your customization and evaluation jobs for a model.
Prerequisites#
Before creating and managing datasets, make sure that you have:
The NeMo Entity Store and NeMo Data Store microservices running on your cluster.
The base URLs for the NeMo Entity Store and NeMo Data Store microservices. These depend on how your cluster administrator configured ingress for the NeMo microservices in your cluster. For more information, see Beginner Tutorial Prerequisites and Ingress Setup for Production Environment.
The Hugging Face CLI or SDK installed.
Dataset Registry Workflow#
Prepare datasets.
Create a repo_id using {namespace}/{dataset_name}.
Register the dataset in NeMo Entity Store.
Upload dataset files to NeMo Data Store using the Hugging Face CLI or SDK with the repo_id.
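The workflow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the documented procedure. The base URLs are placeholders, and the `/v1/datasets` route, the `name`/`namespace`/`files_url` payload fields, the `hf://datasets/` prefix, and the `/v1/hf` endpoint path are all assumptions to verify against the API reference for your deployment. The helper names (`make_repo_id`, `registration_payload`, `register_dataset`, `upload_dataset_file`) are hypothetical. The upload step uses `HfApi` from the `huggingface_hub` SDK.

```python
import json
import urllib.request

# Placeholder base URLs (assumptions); use the values for your cluster's ingress.
ENTITY_STORE_URL = "http://entity-store.example"
DATA_STORE_URL = "http://data-store.example"


def make_repo_id(namespace: str, dataset_name: str) -> str:
    """Build the repo_id in the {namespace}/{dataset_name} form."""
    for part in (namespace, dataset_name):
        if not part or "/" in part:
            raise ValueError("namespace and dataset_name must be non-empty and contain no '/'")
    return f"{namespace}/{dataset_name}"


def registration_payload(namespace: str, dataset_name: str) -> dict:
    """Build the registration body. Field names and the hf:// prefix are assumed."""
    return {
        "name": dataset_name,
        "namespace": namespace,
        "files_url": f"hf://datasets/{make_repo_id(namespace, dataset_name)}",
    }


def register_dataset(namespace: str, dataset_name: str) -> dict:
    """Register the dataset in NeMo Entity Store (the route is an assumption)."""
    request = urllib.request.Request(
        f"{ENTITY_STORE_URL}/v1/datasets",
        data=json.dumps(registration_payload(namespace, dataset_name)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def upload_dataset_file(repo_id: str, local_path: str, path_in_repo: str) -> None:
    """Upload one file to NeMo Data Store with the Hugging Face SDK."""
    # Imported here so the helpers above work without huggingface_hub installed.
    from huggingface_hub import HfApi

    # The /v1/hf path is an assumption; authentication depends on your deployment.
    api = HfApi(endpoint=f"{DATA_STORE_URL}/v1/hf")
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="dataset",
    )
```

With these helpers, a run would call `make_repo_id("my-namespace", "my-dataset")`, then `register_dataset(...)`, then `upload_dataset_file(...)` once per file. Every step uses the same `repo_id`, which keeps the Entity Store record and the Data Store repository pointing at the same dataset.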
Task Guides#
Create a dataset and register it to NeMo Entity Store.
Get a dataset from your datastore.
List datasets available for use in customization jobs.
Update a dataset’s file or metadata.
Delete a dataset.