Copyright © 2026, NVIDIA Corporation.

data_designer.config.seed_source

Module Contents

Classes

Name | Description
--- | ---
SeedSource | Base class for seed dataset configurations.
LocalFileSeedSource | Base class for seed dataset configurations.
HuggingFaceSeedSource | Base class for seed dataset configurations.
FileSystemSeedSource | Base class for seed sources backed by a directory of files.
DirectorySeedSource | Base class for seed sources backed by a directory of files.
FileContentsSeedSource | Base class for seed sources backed by a directory of files.
AgentRolloutFormat | String enum of agent rollout formats.
AgentRolloutSeedSource | Base class for seed sources backed by a directory of files.

Functions

Name | Description
--- | ---
_resolve_filesystem_runtime_path | None
_resolve_local_file_runtime_path | None
get_claude_code_default_path | None
get_codex_default_path | None
get_hermes_agent_default_path | None
get_pi_coding_agent_default_path | None
_validate_filesystem_seed_source_path | None
_validate_filesystem_seed_source_file_pattern | None
get_agent_rollout_format_defaults | None

API

class data_designer.config.seed_source.SeedSource(
    /,
    **data: typing.Any
)

Bases: pydantic.BaseModel, abc.ABC

Base class for seed dataset configurations.

All subclasses must define a seed_type field with a Literal value. That field acts as the discriminator of the discriminated union over seed source types.

Parameters:

seed_type

Discriminator field that identifies the specific seed source type. Subclasses must override this field with a Literal value.

Attributes:

seed_type

Discriminator field that identifies the specific seed source type. Subclasses must override this field with a Literal value.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

seed_type: str
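
The discriminator idea described above can be illustrated with a small standard-library sketch. This is not the pydantic mechanism that SeedSource relies on; it only shows how a literal seed_type tag can select the concrete class when a raw config is parsed. The class names, the registry, and the example path are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, ClassVar


@dataclass
class SketchSeedSource:
    """Hypothetical stand-in for a seed source base class."""

    seed_type: ClassVar[str]  # each subclass overrides this with a fixed tag
    path: str


@dataclass
class SketchLocal(SketchSeedSource):
    seed_type: ClassVar[str] = "local"


@dataclass
class SketchHub(SketchSeedSource):
    seed_type: ClassVar[str] = "hf"


# Map each tag to the class that owns it (hypothetical registry).
REGISTRY = {cls.seed_type: cls for cls in (SketchLocal, SketchHub)}


def parse_seed_source(raw: dict[str, Any]) -> SketchSeedSource:
    """Pick the concrete class from the seed_type tag, then build it."""
    fields = dict(raw)
    tag = fields.pop("seed_type")
    try:
        cls = REGISTRY[tag]
    except KeyError:
        raise ValueError(f"unknown seed_type: {tag!r}") from None
    return cls(**fields)


# The path below is a made-up placeholder.
parsed = parse_seed_source({"seed_type": "hf", "path": "datasets/example/data.parquet"})
```

Because every subclass carries a distinct tag, a serialized config can round-trip to the right type without the caller naming the class.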
class data_designer.config.seed_source.LocalFileSeedSource(
    /,
    **data: typing.Any
)

Bases: data_designer.config.seed_source.SeedSource

seed_type: typing.Literal['local'] = 'local'
_runtime_path: str | None = PrivateAttr(...)
path: str = Field(...)
validate_path(v: str) -> str
model_post_init(__context: typing.Any) -> None
runtime_path: str
from_dataframe(
    df: pandas.DataFrame,
    path: str
) -> typing_extensions.Self

class data_designer.config.seed_source.HuggingFaceSeedSource(
    /,
    **data: typing.Any
)

Bases: data_designer.config.seed_source.SeedSource

seed_type: typing.Literal['hf'] = 'hf'
path: str = Field(...)
token: str | None
endpoint: str = 'https://huggingface.co'

class data_designer.config.seed_source.FileSystemSeedSource(
    /,
    **data: typing.Any
)

Bases: data_designer.config.seed_source.SeedSource, abc.ABC

Base class for seed sources backed by a directory of files.

Use this base when a seed reader needs to enumerate files under a directory on disk and turn each (or groups of them) into seed rows. Concrete plugin configs declare a Literal seed_type and pair with a FileSystemSeedReader implementation.

Parameters:

path

Directory containing seed artifacts. Relative paths are resolved from the current working directory when the config is loaded, not from the config file location.

file_pattern

Case-sensitive filename pattern used to match files under the provided directory. Patterns match basenames only, not relative paths. Defaults to '*'.

recursive

Whether to search nested subdirectories under the provided directory for matching files. Defaults to True.

Attributes:

path

Directory containing seed artifacts. Relative paths are resolved from the current working directory when the config is loaded, not from the config file location.

file_pattern

Case-sensitive filename pattern used to match files under the provided directory. Patterns match basenames only, not relative paths. Defaults to '*'.

recursive

Whether to search nested subdirectories under the provided directory for matching files. Defaults to True.
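
The path, file_pattern, and recursive semantics described above can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the library's reader implementation: match_seed_files is a hypothetical helper, fnmatch.fnmatchcase stands in for the case-sensitive basename match, and Path.resolve() mimics resolving a relative path against the current working directory.

```python
import fnmatch
import tempfile
from pathlib import Path


def match_seed_files(directory: str, file_pattern: str = "*", recursive: bool = True) -> list[Path]:
    """Hypothetical helper: list files whose basename matches file_pattern."""
    # A relative directory resolves against the current working directory.
    root = Path(directory).resolve()
    candidates = root.rglob("*") if recursive else root.iterdir()
    # Only the basename (p.name) is tested, and the comparison is case-sensitive.
    return sorted(
        p for p in candidates
        if p.is_file() and fnmatch.fnmatchcase(p.name, file_pattern)
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "a.jsonl").write_text("{}")
    (root / "B.JSONL").write_text("{}")
    (root / "nested" / "c.jsonl").write_text("{}")

    recursive_names = [p.name for p in match_seed_files(tmp, "*.jsonl")]
    flat_names = [p.name for p in match_seed_files(tmp, "*.jsonl", recursive=False)]
```

In this sketch B.JSONL is skipped because the match is case-sensitive, and nested/c.jsonl is found only when recursive is True.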

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

_runtime_path: str | None = PrivateAttr(...)
path: str = Field(...)
file_pattern: str = Field(...)
recursive: bool = Field(...)
validate_path(value: str | None) -> str | None
model_post_init(__context: typing.Any) -> None
runtime_path: str
validate_file_pattern(value: str | None) -> str | None

class data_designer.config.seed_source.DirectorySeedSource(
    /,
    **data: typing.Any
)

Bases: data_designer.config.seed_source.FileSystemSeedSource

seed_type: typing.Literal['directory'] = 'directory'

class data_designer.config.seed_source.FileContentsSeedSource(
    /,
    **data: typing.Any
)

Bases: data_designer.config.seed_source.FileSystemSeedSource

seed_type: typing.Literal['file_contents'] = 'file_contents'
encoding: str = Field(...)
validate_encoding(value: str) -> str

data_designer.config.seed_source._resolve_filesystem_runtime_path(path: str) -> str
data_designer.config.seed_source._resolve_local_file_runtime_path(path: str) -> str
data_designer.config.seed_source.get_claude_code_default_path() -> str
data_designer.config.seed_source.get_codex_default_path() -> str
data_designer.config.seed_source.get_hermes_agent_default_path() -> str
data_designer.config.seed_source.get_pi_coding_agent_default_path() -> str
data_designer.config.seed_source._validate_filesystem_seed_source_path(value: str | None) -> str | None
data_designer.config.seed_source._validate_filesystem_seed_source_file_pattern(value: str | None) -> str | None

class data_designer.config.seed_source.AgentRolloutFormat

Bases: data_designer.config.utils.type_helpers.StrEnum

ATIF = 'atif'
CLAUDE_CODE = 'claude_code'
CODEX = 'codex'
HERMES_AGENT = 'hermes_agent'
PI_CODING_AGENT = 'pi_coding_agent'
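
A string-valued enum with the members listed above behaves roughly as in the sketch below. It uses the standard `str` plus `Enum` mixin as a stand-in; the project's own StrEnum helper in data_designer.config.utils.type_helpers may differ in details, and RolloutFormatSketch is a hypothetical name.

```python
from enum import Enum


class RolloutFormatSketch(str, Enum):
    """Hypothetical mirror of the members listed above."""

    ATIF = "atif"
    CLAUDE_CODE = "claude_code"
    CODEX = "codex"
    HERMES_AGENT = "hermes_agent"
    PI_CODING_AGENT = "pi_coding_agent"


# Look a member up by its serialized value, as a config loader might.
fmt = RolloutFormatSketch("claude_code")

# Members compare equal to their plain string values.
same_as_string = fmt == "claude_code"
```

Looking a member up by value is what lets a plain string in a config file map onto an enum member.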
data_designer.config.seed_source.get_agent_rollout_format_defaults(fmt: data_designer.config.seed_source.AgentRolloutFormat) -> tuple[str | None, str]

class data_designer.config.seed_source.AgentRolloutSeedSource(
    /,
    **data: typing.Any
)

Bases: data_designer.config.seed_source.FileSystemSeedSource

seed_type: typing.Literal['agent_rollout'] = 'agent_rollout'
format: data_designer.config.seed_source.AgentRolloutFormat = Field(...)
path: str | None = Field(...)
file_pattern: str | None = Field(...)
validate_runtime_path_source() -> typing_extensions.Self
runtime_path: str
resolved_file_pattern: str
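
The signatures above suggest, but do not state, that an omitted path or file_pattern falls back to a per-format default. The sketch below illustrates that reading only: it assumes the tuple returned by get_agent_rollout_format_defaults is (default path, default file pattern), and every value in SKETCH_DEFAULTS as well as the helper resolve_rollout_location is a made-up placeholder, not the library's data or API.

```python
from typing import Optional

# Hypothetical defaults: format name -> (default path or None, default file pattern).
# These values are placeholders, not the library's real defaults.
SKETCH_DEFAULTS: dict[str, tuple[Optional[str], str]] = {
    "atif": (None, "*.json"),
    "claude_code": ("/hypothetical/claude/rollouts", "*.jsonl"),
}


def resolve_rollout_location(
    fmt: str,
    path: Optional[str] = None,
    file_pattern: Optional[str] = None,
) -> tuple[str, str]:
    """Assumed rule: an explicit value wins, otherwise use the format default."""
    default_path, default_pattern = SKETCH_DEFAULTS[fmt]
    resolved_path = path if path is not None else default_path
    if resolved_path is None:
        # No explicit path and no default for this format.
        raise ValueError(f"format {fmt!r} has no default path; pass path explicitly")
    resolved_pattern = file_pattern if file_pattern is not None else default_pattern
    return resolved_path, resolved_pattern


from_defaults = resolve_rollout_location("claude_code")
explicit_path = resolve_rollout_location("atif", path="runs")
```

Under this assumption, a format without a default path would require the caller to supply one, which is one plausible role for a validator such as validate_runtime_path_source.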