nemo_curator.stages.text.experimental.translation.stages.format_translation_output

View as Markdown

Stage for shaping translation output columns.

Module Contents

Classes

NameDescription
FormatTranslationOutputStageApply the requested translation output format.

Data

__all__

API

class nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage(
name: str = 'FormatTranslationOutputStage',
target_lang: str,
output_mode: str = 'replaced',
output_field: str = 'translated_text',
reconstruct_messages: bool = False,
messages_field: str = 'messages',
messages_content_field: str = 'content'
)
Dataclass

Bases: ProcessingStage[DocumentBatch, DocumentBatch]

Apply the requested translation output format.

messages_content_field
str = 'content'
messages_field
str = 'messages'
name
str = 'FormatTranslationOutputStage'
output_field
str = 'translated_text'
output_mode
str = 'replaced'
reconstruct_messages
bool = False
target_lang
str
nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage.__post_init__() -> None
nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage._build_metadata_column(
df: pandas.DataFrame
) -> None

Construct the translation_metadata JSON column.

nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage._build_translated_messages(
df: pandas.DataFrame
) -> None

Construct the translated_messages column from original messages.

nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage._parse_optional_json_object(
value: object
) -> dict[str, object] | None
staticmethod

Parse helper JSON emitted by ReassemblyStage when present.

nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage.inputs() -> tuple[list[str], list[str]]
nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage.outputs() -> tuple[list[str], list[str]]
nemo_curator.stages.text.experimental.translation.stages.format_translation_output.FormatTranslationOutputStage.process(
batch: nemo_curator.tasks.DocumentBatch
) -> nemo_curator.tasks.DocumentBatch

Apply output formatting to the batch.

nemo_curator.stages.text.experimental.translation.stages.format_translation_output.__all__ = ['FormatTranslationOutputStage']