DatasetCreateParams

class nemo_microservices.types.DatasetCreateParams

Bases: TypedDict

files_url: Required[str]

The location where the artifact files are stored.

This can be a URL pointing to NDS, Hugging Face, S3, or any other accessible resource location.
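As a minimal sketch of what `Required[str]` means here: `files_url` is the only key a caller must supply, and every other key may be omitted. The class below is a local, partial stand-in written for illustration (it is not the SDK class), and the URL is a hypothetical placeholder.

```python
try:
    from typing import Required, TypedDict  # Python 3.11+
except ImportError:  # older interpreters need the backport package
    from typing_extensions import Required, TypedDict


class DatasetCreateParamsSketch(TypedDict, total=False):
    """Illustrative stand-in that mirrors only two of the documented keys."""

    files_url: Required[str]  # must always be present
    description: str          # optional because total=False


# Smallest valid value: only the required key is set.
# The URL is a hypothetical placeholder, not a real dataset location.
params: DatasetCreateParamsSketch = {
    "files_url": "hf://datasets/my-namespace/my-dataset",
}

# TypedDict classes record which keys are required and which are optional.
required_keys = DatasetCreateParamsSketch.__required_keys__
optional_keys = DatasetCreateParamsSketch.__optional_keys__
```

At runtime a TypedDict value is an ordinary `dict`; the required/optional distinction is enforced by static type checkers, not by the interpreter.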

custom_fields: Dict[str, str]

A set of user-defined string key-value pairs that can be used for any purpose.

description: str

The description of the entity.

format: str

The dataset format, meaning the schema of the dataset rather than its file format. Examples include SQuAD and BEIR.

hf_endpoint: str

For Hugging Face URLs, the endpoint that should be used.

By default, this is set to the Data Store URL. For the Hugging Face Hub, this should be set to “https://huggingface.co”.
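The following sketch shows one way to point an existing parameter dictionary at the public Hugging Face Hub instead of the default Data Store endpoint. The helper name `with_hf_hub` and the `files_url` value are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict

HF_HUB_ENDPOINT = "https://huggingface.co"


def with_hf_hub(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with hf_endpoint set to the Hugging Face Hub."""
    updated = dict(params)  # shallow copy so the caller's dict is untouched
    updated["hf_endpoint"] = HF_HUB_ENDPOINT
    return updated


# Hypothetical dataset location; without hf_endpoint the Data Store URL applies.
base_params = {"files_url": "hf://datasets/my-namespace/my-dataset"}
hub_params = with_hf_hub(base_params)
```

Returning a copy keeps the default (Data Store) variant available alongside the Hub variant.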

limit: int

The maximum number of items to be used from the dataset.

name: str

The name of the entity.

Must be unique inside the namespace. If not specified, it will be the same as the automatically generated id.

namespace: str

The namespace of the entity.

This can be missing for namespace entities or in deployments that don’t use namespaces.

ownership: Ownership

Information about ownership of an entity.

If the entity is a namespace, the access_policies will typically apply to all entities inside the namespace.

project: str

The URN of the project associated with this entity.

split: str

The split of the dataset. Examples include train, validation, and test.
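Putting the fields together, the sketch below builds a fuller parameter dictionary and checks each value against the types listed above. All values are hypothetical placeholders, and `type_errors` is an illustrative helper rather than part of the SDK. The `ownership` and `project` keys are left out of the example values because their exact contents are not described here.

```python
from typing import Any, Dict, List

# Documented scalar fields and their annotated types.
SCALAR_TYPES = {
    "files_url": str,
    "description": str,
    "format": str,
    "hf_endpoint": str,
    "limit": int,
    "name": str,
    "namespace": str,
    "project": str,
    "split": str,
}


def type_errors(params: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in params (empty list if none)."""
    errors: List[str] = []
    if "files_url" not in params:
        errors.append("files_url: missing required key")
    for key, expected in SCALAR_TYPES.items():
        if key in params and not isinstance(params[key], expected):
            errors.append(f"{key}: expected {expected.__name__}")
    custom = params.get("custom_fields")
    if custom is not None:
        is_str_map = isinstance(custom, dict) and all(
            isinstance(k, str) and isinstance(v, str) for k, v in custom.items()
        )
        if not is_str_map:
            errors.append("custom_fields: expected Dict[str, str]")
    return errors


# Hypothetical example values for the documented keys.
params = {
    "files_url": "hf://datasets/my-namespace/my-dataset",
    "name": "my-dataset",
    "namespace": "my-namespace",
    "description": "Question answering pairs for evaluation.",
    "format": "SQuAD",
    "split": "train",
    "limit": 1000,
    "custom_fields": {"team": "search"},
}

problems = type_errors(params)
```

A check like this is only a local sanity test; the service may apply further validation, such as name uniqueness within the namespace.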