nemo_curator.stages.text.experimental.translation.stages.reassembly
Reassemble translated segments back into document rows.
Module Contents
Classes
Data
API
Bases: ProcessingStage[DocumentBatch, DocumentBatch]
Collapse segment rows back into one row per document.
Average FAITH scores across segments, ignoring zero-valued dimensions.
Create the common output row fields for one document group.
Build one output document row from its segment group.
Build [{src, tgt}, ...] pairs for one field entry.
Build passthrough output for rows marked as skipped.
Reassemble translated text and return metadata helper maps.
Collect per-field reassembled text and helper maps.
Compute faith_avg as the mean of non-zero FAITH dimensions.
Count the translatable segments expected by one field entry.
Return the metadata key for field_path.
Reconstruct a document from coarse-mode metadata.
Reconstruct a document from fine-mode metadata.
Reassemble one or more translated field paths.
Reassemble a single translated field.
Log when multi-field metadata did not consume all translated segments.
Aggregate segment-level FAITH scores into one document-level record.
Write reassembled multi-field output payload into out_row.
Write a nested or wildcard field payload.
Write one reassembled field and return its output payload value.
Reassemble translated segments into full documents.
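The summaries above describe two ideas that are easier to see in code: collapsing segment rows into one row per document, and computing faith_avg as the mean of the non-zero FAITH dimensions. The following is a minimal sketch of those ideas, not the module's implementation. The row keys (doc_id, seg_idx, translated, scores), the helper names, the dimension names x and y, the single-space join, and the reading of "ignoring zero-valued dimensions" as "skip zero values when averaging" are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import fmean


def mean_nonzero(scores):
    """Mean of the non-zero values in a {dimension: score} mapping (0.0 if none)."""
    nonzero = [value for value in scores.values() if value != 0]
    return fmean(nonzero) if nonzero else 0.0


def average_segment_scores(segment_scores):
    """Average each dimension across segments, skipping zero-valued entries."""
    buckets = defaultdict(list)
    dims = set()
    for scores in segment_scores:
        for dim, value in scores.items():
            dims.add(dim)
            if value != 0:
                buckets[dim].append(value)
    # A dimension that was zero in every segment stays at 0.0.
    return {dim: (fmean(buckets[dim]) if buckets[dim] else 0.0) for dim in sorted(dims)}


def reassemble(segment_rows):
    """Collapse segment rows into one output row per document (hypothetical schema)."""
    groups = defaultdict(list)
    for row in segment_rows:
        groups[row["doc_id"]].append(row)

    documents = []
    for doc_id, rows in groups.items():
        # Restore the original segment order before joining the text.
        ordered = sorted(rows, key=lambda r: r["seg_idx"])
        doc_scores = average_segment_scores([r["scores"] for r in ordered])
        documents.append(
            {
                "doc_id": doc_id,
                # Joining with a single space is an assumption of this sketch.
                "translated": " ".join(r["translated"] for r in ordered),
                "scores": doc_scores,
                "faith_avg": mean_nonzero(doc_scores),
            }
        )
    return documents


segments = [
    {"doc_id": "a", "seg_idx": 1, "translated": "world", "scores": {"x": 4.0, "y": 0.0}},
    {"doc_id": "a", "seg_idx": 0, "translated": "Hello", "scores": {"x": 2.0, "y": 0.0}},
    {"doc_id": "b", "seg_idx": 0, "translated": "Hi", "scores": {"x": 5.0, "y": 3.0}},
]
docs = reassemble(segments)
```

In this sketch a dimension that is zero in every segment is kept at 0.0 in the document-level record but excluded from faith_avg, so an unscored dimension does not pull the average down.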