nemo_curator.stages.text.io.writer.utils

View as Markdown

Module Contents

Functions

NameDescription
batchedBatch an iterable into lists of size n.
get_deterministic_hashCreate a deterministic hash from inputs.

API

nemo_curator.stages.text.io.writer.utils.batched(
iterable: collections.abc.Iterable[typing.Any],
n: int
) -> collections.abc.Iterator[tuple[typing.Any, ...]]

Batch an iterable into lists of size n.

Parameters:

iterable
Iterable[Any]

The iterable to batch

n
int

The size of the batch

Returns: Iterator[tuple[Any, ...]]

Iterator[tuple[…]]: An iterator of tuples, each containing n elements from the iterable

nemo_curator.stages.text.io.writer.utils.get_deterministic_hash(
inputs: list[str],
seed: str = ''
) -> str

Create a deterministic hash from inputs.