# nemo_curator.stages.text.io.writer.base

## Module Contents

### Classes

| Name                                                                | Description                       |
| ------------------------------------------------------------------- | --------------------------------- |
| [`BaseWriter`](#nemo_curator-stages-text-io-writer-base-BaseWriter) | Base class for all writer stages. |

### API

<Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    class nemo_curator.stages.text.io.writer.base.BaseWriter(
        path: str,
        file_extension: str,
        write_kwargs: dict[str, typing.Any] = dict(),
        fields: list[str] | None = None,
        name: str = 'BaseWriter',
        mode: typing.Literal['ignore', 'overwrite', 'append', 'error'] = 'ignore',
        append_mode_implemented: bool = False
    )
    ```
  </CodeBlock>
</Anchor>

<Indent>
  <Badge>
    Dataclass
  </Badge>

  <Badge>
    Abstract
  </Badge>

  **Bases:** [ProcessingStage\[DocumentBatch, FileGroupTask\]](/nemo-curator/nemo_curator/stages/base#nemo_curator-stages-base-ProcessingStage)

  Base class for all writer stages.

  This abstract base class implements the functionality shared by every stage that writes
  `DocumentBatch` tasks to files: file naming, metadata handling, and filesystem operations.
  Subclasses supply the format-specific serialization by implementing `write_data`.

  <ParamField path="append_mode_implemented" type="bool = False" />

  <ParamField path="fields" type="list[str] | None = None" />

  <ParamField path="file_extension" type="str" />

  <ParamField path="mode" type="Literal['ignore', 'overwrite', 'append', 'error'] = 'ignore'" />

  <ParamField path="name" type="str = 'BaseWriter'" />

  <ParamField path="path" type="str" />

  <ParamField path="write_kwargs" type="dict[str, Any] = field(default_factory=dict)" />

  <Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter-__post_init__">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.stages.text.io.writer.base.BaseWriter.__post_init__()
      ```
    </CodeBlock>
  </Anchor>

  <Indent />

  <Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter-get_file_extension">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.stages.text.io.writer.base.BaseWriter.get_file_extension() -> str
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Return the file extension for this writer format.
  </Indent>

  <Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter-inputs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.stages.text.io.writer.base.BaseWriter.inputs() -> tuple[list[str], list[str]]
      ```
    </CodeBlock>
  </Anchor>

  <Indent />

  <Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter-outputs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.stages.text.io.writer.base.BaseWriter.outputs() -> tuple[list[str], list[str]]
      ```
    </CodeBlock>
  </Anchor>

  <Indent />

  <Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter-process">
    <CodeBlock links={{"nemo_curator.tasks.DocumentBatch":"/nemo-curator/nemo_curator/tasks/document#nemo_curator-tasks-document-DocumentBatch","nemo_curator.tasks.FileGroupTask":"/nemo-curator/nemo_curator/tasks/file_group#nemo_curator-tasks-file_group-FileGroupTask"}} showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.stages.text.io.writer.base.BaseWriter.process(
          task: nemo_curator.tasks.DocumentBatch
      ) -> nemo_curator.tasks.FileGroupTask
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Process a `DocumentBatch` and write its data to files.

    **Parameters:**

    <ParamField path="task" type="DocumentBatch">
      `DocumentBatch` containing the data to write.
    </ParamField>

    **Returns:** `FileGroupTask`

    Task containing the paths to the written files.
  </Indent>

  <Anchor id="nemo_curator-stages-text-io-writer-base-BaseWriter-write_data">
    <CodeBlock links={{"nemo_curator.tasks.DocumentBatch":"/nemo-curator/nemo_curator/tasks/document#nemo_curator-tasks-document-DocumentBatch"}} showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.stages.text.io.writer.base.BaseWriter.write_data(
          task: nemo_curator.tasks.DocumentBatch,
          file_path: str
      ) -> None
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    <Badge>
      Abstract
    </Badge>

    Write the task's data to `file_path` using a format-specific implementation. Concrete writer subclasses must implement this method.
  </Indent>
</Indent>
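
The split between `process` and the abstract `write_data` can be illustrated with a small standalone sketch. It does not import NeMo Curator and is not the library's implementation: `SketchBatch`, `SketchWriter`, and `SketchJsonlWriter` are hypothetical stand-ins, and the file-naming scheme, the list-of-paths return value, and the way `fields` and `write_kwargs` are applied are assumptions made for illustration only.

```python
from __future__ import annotations

import json
import os
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchBatch:
    """Hypothetical stand-in for DocumentBatch: a task id plus a list of records."""

    task_id: str
    records: list[dict[str, Any]]


@dataclass
class SketchWriter(ABC):
    """Hypothetical stand-in for BaseWriter (assumed behavior, not the real class)."""

    path: str
    file_extension: str
    write_kwargs: dict[str, Any] = field(default_factory=dict)
    fields: list[str] | None = None

    def get_file_extension(self) -> str:
        return self.file_extension

    def process(self, task: SketchBatch) -> list[str]:
        # Shared logic lives in the base class: create the output directory,
        # derive a file name (assumed scheme), then delegate serialization.
        os.makedirs(self.path, exist_ok=True)
        file_name = f"{task.task_id}.{self.get_file_extension()}"
        file_path = os.path.join(self.path, file_name)
        self.write_data(task, file_path)
        return [file_path]

    @abstractmethod
    def write_data(self, task: SketchBatch, file_path: str) -> None:
        """Format-specific serialization, supplied by each subclass."""


@dataclass
class SketchJsonlWriter(SketchWriter):
    """Hypothetical concrete writer that emits one JSON object per line."""

    file_extension: str = "jsonl"

    def write_data(self, task: SketchBatch, file_path: str) -> None:
        with open(file_path, "w", encoding="utf-8") as f:
            for record in task.records:
                # Assumption: `fields` restricts which keys are written.
                if self.fields is not None:
                    record = {key: record[key] for key in self.fields}
                # Assumption: `write_kwargs` is forwarded to the serializer.
                f.write(json.dumps(record, **self.write_kwargs) + "\n")


# Usage: write one batch into a temporary directory.
out_dir = tempfile.mkdtemp()
writer = SketchJsonlWriter(path=out_dir, fields=["text"])
batch = SketchBatch(
    task_id="batch_0",
    records=[{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}],
)
written = writer.process(batch)
```

In this sketch the base class cannot be instantiated directly because `write_data` is abstract; only subclasses that implement it can be used as writers, which mirrors the `Abstract` badges in the reference above.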