DatasetCu

class nemo_microservices.types.customization.DatasetCu(*args: Any, **kwargs: Any)

Bases: BaseModel

id: str | None = None

The ID of the entity.

With the exception of namespaces, this is always a semantically-prefixed base58-encoded uuid4 [<prefix>-base58(uuid4())].
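The ID shape described above can be sketched with the standard library alone. This is a minimal illustration, not the service's implementation: the Bitcoin-style base58 alphabet, the helper names `base58_encode` and `make_entity_id`, and the `dataset` prefix are all assumptions made for the example.

```python
import uuid

# Assumption: Bitcoin-style base58 alphabet; the service may use a different one.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58, mapping each leading zero byte to '1'."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def make_entity_id(prefix: str) -> str:
    """Build an ID shaped like <prefix>-base58(uuid4())."""
    return f"{prefix}-{base58_encode(uuid.uuid4().bytes)}"


# "dataset" is a hypothetical prefix chosen for illustration only.
example_id = make_entity_id("dataset")
print(example_id)
```

The output differs on every run because `uuid4()` is random; only the `<prefix>-` part is stable.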

created_at: datetime | None = None

Timestamp for when the entity was created.

custom_fields: Dict[str, str] | None = None

A set of custom fields that the user can define and use for various purposes.

description: str | None = None

The description of the entity.

files_url: str | None = None

The location where the artifact files are stored.

This can be a URL pointing to NDS, Hugging Face, S3, or any other accessible resource location.

format: str | None = None

Specifies the dataset format, referring to the schema of the dataset rather than the file format. Examples include SQuAD, BEIR, etc.

hf_endpoint: str | None = None

For Hugging Face URLs, the endpoint to use.

By default, this is set to the Data Store URL. For Hugging Face Hub, set this to “https://huggingface.co”.
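The defaulting rule above can be expressed as a small helper. This is a sketch under stated assumptions: `resolve_hf_endpoint` and the `data_store_url` value are hypothetical names introduced for illustration and are not part of the SDK.

```python
from typing import Optional

# The Hugging Face Hub endpoint named in the field description.
HF_HUB_ENDPOINT = "https://huggingface.co"


def resolve_hf_endpoint(hf_endpoint: Optional[str], data_store_url: str) -> str:
    """Return the explicit endpoint if given, else fall back to the Data Store URL."""
    return hf_endpoint if hf_endpoint is not None else data_store_url


# Hypothetical Data Store URL, used only for this example.
data_store_url = "http://data-store.example"

default_endpoint = resolve_hf_endpoint(None, data_store_url)
hub_endpoint = resolve_hf_endpoint(HF_HUB_ENDPOINT, data_store_url)
```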

limit: int | None = None

The maximum number of items to be used from the dataset.
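One way to picture `limit` is as a cap on the number of records read. The sketch below assumes the cap keeps the first `limit` items in order; the service may select items differently, and `apply_limit` and the sample records are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def apply_limit(items: Iterable[T], limit: Optional[int]) -> Iterator[T]:
    """Yield at most `limit` items; a limit of None yields everything."""
    # Assumption: the first `limit` items are kept, in their original order.
    return islice(items, limit)


# Made-up records standing in for dataset rows.
records = [{"prompt": f"q{i}"} for i in range(5)]

limited = list(apply_limit(records, 2))
unlimited = list(apply_limit(records, None))
```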

name: str | None = None

The name of the entity.

Must be unique within the namespace. If not specified, the name defaults to the automatically generated ID.

namespace: str | None = None

The namespace of the entity.

This can be missing for namespace entities or in deployments that don’t use namespaces.

ownership: Ownership | None = None

Information about ownership of an entity.

If the entity is a namespace, the access_policies will typically apply to all entities inside the namespace.

project: str | None = None

The URN of the project associated with this entity.

schema_version: str | None = None

The version of the schema for the object. Internal use only.

split: str | None = None

The split of the dataset. Examples include train, validation, test, etc.

type_prefix: str | None = None

The type prefix of the entity ID.

If not specified, it will be inferred from the entity type name, but this will likely result in long prefixes.

updated_at: datetime | None = None

Timestamp for when the entity was last updated.

version_id: str | None = None

A unique, immutable ID for the version, similar to a commit hash.

version_tags: List[VersionTag] | None = None

The list of version tags associated with this entity.