nemo_curator.stages.text.experimental.translation.backends.aws


AWS Translate backend for NeMo Curator.

Uses Amazon Translate for translation. The boto3 client is synchronous, so each call is dispatched through asyncio.get_running_loop().run_in_executor() to provide async support.
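The executor pattern described above can be sketched as follows. This is a minimal illustration, not the backend's actual code: translate_blocking is a hypothetical stand-in for a blocking boto3 call, and the helper names are assumptions made for the example.

```python
import asyncio
import time


def translate_blocking(text: str) -> str:
    """Hypothetical stand-in for a blocking (synchronous) client call."""
    time.sleep(0.01)  # simulate network latency
    return text.upper()


async def translate_async(text: str) -> str:
    # Hand the blocking call to the default thread pool so the
    # event loop stays free to schedule other coroutines.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, translate_blocking, text)


async def translate_many(texts: list[str]) -> list[str]:
    # gather() preserves input order in its result list.
    return await asyncio.gather(*(translate_async(t) for t in texts))


if __name__ == "__main__":
    print(asyncio.run(translate_many(["hello", "world"])))
```

Passing None as the executor uses the event loop's default thread pool; the real backend may configure its own executor.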

Module Contents

Classes

Name | Description
AWSTranslationBackend | AWS Translate backend.

Data

AWS_MAX_BYTES_PER_REQUEST

API

class nemo_curator.stages.text.experimental.translation.backends.aws.AWSTranslationBackend(
region: str | None = None,
max_concurrent_requests: int = 32
)

Bases: ExecutorTranslationBackend

AWS Translate backend.

Parameters:

region
str | None. Defaults to None

AWS region. Resolved in order: explicit value -> AWS_REGION env var -> AWS_DEFAULT_REGION env var -> "us-east-2" fallback.
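The resolution order above could be expressed roughly as below. This is a sketch under assumptions: resolve_region and its env parameter are hypothetical names, and treating empty strings as unset is a choice made for this example rather than documented behavior.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_region(region: str | None = None,
                   env: Mapping[str, str] | None = None) -> str:
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if env is None else env
    # Explicit value, then AWS_REGION, then AWS_DEFAULT_REGION, then fallback.
    # Assumption: empty strings are treated the same as missing values.
    return (
        region
        or env.get("AWS_REGION")
        or env.get("AWS_DEFAULT_REGION")
        or "us-east-2"
    )
```

The env argument exists only so the lookup can be exercised without touching the real process environment.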

max_concurrent_requests
int. Defaults to 32

Semaphore size for async concurrency.
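To illustrate how a semaphore of this size bounds concurrency, here is a small self-contained sketch. The run_limited function and its counters are assumptions for demonstration; the backend's internal use of the semaphore is not shown in this reference.

```python
import asyncio


async def run_limited(n_tasks: int, max_concurrent_requests: int) -> int:
    """Run n_tasks fake requests and return the peak number in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_requests)
    in_flight = 0
    peak = 0

    async def one_request() -> None:
        nonlocal in_flight, peak
        async with semaphore:  # at most max_concurrent_requests enter at once
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the real request
            in_flight -= 1

    await asyncio.gather(*(one_request() for _ in range(n_tasks)))
    return peak


if __name__ == "__main__":
    print(asyncio.run(run_limited(n_tasks=100, max_concurrent_requests=32)))
```

However many tasks are submitted, the number running inside the guarded section never exceeds the semaphore size.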

_region
backend_name = 'AWS Translate'

nemo_curator.stages.text.experimental.translation.backends.aws.AWSTranslationBackend._non_retryable_exceptions() -> tuple[type[BaseException], ...]

Treat client-side size validation as a hard failure.
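The effect of a non-retryable exception tuple can be sketched with a generic retry loop. This is an assumption about how such a tuple is typically consumed; the retry logic of ExecutorTranslationBackend is not documented here, and call_with_retries is a hypothetical helper.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    non_retryable: tuple[type[BaseException], ...] = (ValueError,),
    max_attempts: int = 3,
) -> T:
    """Hypothetical retry loop: hard failures propagate immediately."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except non_retryable:
            # Retrying cannot help (e.g. the input is too large), so re-raise.
            raise
        except Exception:
            # Possibly transient error: retry unless attempts are exhausted.
            if attempt == max_attempts:
                raise
    raise RuntimeError("max_attempts must be at least 1")
```

An oversized input fails the same way on every attempt, which is why a size-validation error is a natural candidate for the non-retryable set.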

nemo_curator.stages.text.experimental.translation.backends.aws.AWSTranslationBackend._translate_single_sync(
text: str,
source_lang: str,
target_lang: str
) -> str

Synchronous single-text translation (called via executor).

Raises:

  • ValueError: If the UTF-8 encoded text exceeds 10,000 bytes (AWS_MAX_BYTES_PER_REQUEST).

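The size check is on encoded bytes, not characters, so non-ASCII text reaches the limit sooner. A minimal sketch of such a check follows; check_request_size is a hypothetical helper, and only the 10000 constant comes from this module.

```python
AWS_MAX_BYTES_PER_REQUEST = 10000  # value documented for this module


def check_request_size(text: str, limit: int = AWS_MAX_BYTES_PER_REQUEST) -> int:
    """Hypothetical validator: return the UTF-8 size or raise ValueError."""
    n_bytes = len(text.encode("utf-8"))
    if n_bytes > limit:
        raise ValueError(
            f"Text is {n_bytes} bytes in UTF-8; the limit is {limit} bytes."
        )
    return n_bytes
```

For example, 10,000 ASCII characters fit exactly, while 5,001 copies of "é" (2 bytes each in UTF-8) do not.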
nemo_curator.stages.text.experimental.translation.backends.aws.AWSTranslationBackend.close() -> None

Release client resources.

nemo_curator.stages.text.experimental.translation.backends.aws.AWSTranslationBackend.setup() -> None

Initialize the boto3 Translate client.

Raises:

  • ImportError: If boto3 is not installed.
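
Deferring the import until setup is a common way to keep an optional dependency out of module import time. The sketch below shows the general pattern only; load_optional is a hypothetical helper and the error message is an assumption, not the text raised by this backend.

```python
import importlib
from types import ModuleType


def load_optional(module_name: str) -> ModuleType:
    """Hypothetical helper: import lazily and fail with a clearer message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this backend but is not installed."
        ) from exc
```

In this pattern a setup method would call load_optional("boto3") before creating the client, so a missing package surfaces at setup rather than at import.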
nemo_curator.stages.text.experimental.translation.backends.aws.AWS_MAX_BYTES_PER_REQUEST = 10000