safe_synthesizer.job_builder#

Module Contents#

Classes#

SafeSynthesizerJobBuilder

Builder for Safe Synthesizer jobs run on the NeMo Microservices Platform.

Data#

API#

class safe_synthesizer.job_builder.SafeSynthesizerJobBuilder(
client: nemo_platform.NeMoPlatform,
workspace: str = 'default',
)#

Builder for Safe Synthesizer jobs run on the NeMo Microservices Platform.

This class provides a fluent interface for building Safe Synthesizer job configurations. It lets you configure all the parameters needed to create and run a job, and each method returns the builder instance so that calls can be chained.

Examples

>>> from nemo_platform import NeMoPlatform
>>> from nemo_platform.beta.safe_synthesizer.job_builder import SafeSynthesizerJobBuilder
>>> client = NeMoPlatform(base_url=..., inference_base_url=...)
>>> builder = (
...     SafeSynthesizerJobBuilder(client)
...     .with_data_source(your_dataframe)
...     .with_replace_pii()
...     .synthesize()
...     .with_train(learning_rate=0.0001)
...     .with_generate(num_records=10000)
...     .with_evaluate(enable=False)
... )
>>> job = builder.create_job()
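The chaining in the example above works because every configuration method returns the builder itself. The following is a minimal, library-free sketch of that pattern. `ChainSketch` and its `sections` attribute are hypothetical stand-ins for illustration only; they do not reflect how SafeSynthesizerJobBuilder stores its configuration internally.

```python
from __future__ import annotations

from typing import Any


class ChainSketch:
    """Hypothetical stand-in that shows the fluent pattern only."""

    def __init__(self) -> None:
        # Assumed storage layout: one dict of parameters per section.
        self.sections: dict[str, dict[str, Any]] = {}

    def with_train(self, **kwargs: Any) -> ChainSketch:
        self.sections.setdefault("train", {}).update(kwargs)
        return self  # returning self is what allows chaining

    def with_generate(self, **kwargs: Any) -> ChainSketch:
        self.sections.setdefault("generate", {}).update(kwargs)
        return self


sketch = ChainSketch().with_train(learning_rate=0.0001).with_generate(num_records=10000)
print(sketch.sections)
# {'train': {'learning_rate': 0.0001}, 'generate': {'num_records': 10000}}
```

Because each call returns the same object, the chained form and a sequence of separate statements on one builder variable are equivalent.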


create_job(**kwargs) safe_synthesizer.job.SafeSynthesizerJob#

Upload the dataset and submit the job.

Parameters:

**kwargs – Additional job creation parameters passed to the API.

Returns:

A SafeSynthesizerJob for monitoring and retrieving results.

resolve_job_config() typing_extensions.Self#

Resolve and validate the final job configuration without submitting.

Returns:

The builder instance for method chaining.

synthesize() typing_extensions.Self#

Enable data synthesis for the job run.

with_classify_model_provider(
provider_name: str,
) typing_extensions.Self#

Configure column classification using an Inference Gateway model provider.

The model provider should be configured to serve an LLM suitable for column classification tasks.

Parameters:

provider_name – Name of the model provider. Provide just the name to use a provider in the current workspace (e.g. "my-classify-llm"), or a fully-qualified workspace/provider_name reference to use a provider from a different workspace.

Returns:

The builder instance for method chaining.
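To make the two accepted forms of provider_name concrete, here is a small sketch of how a bare name and a workspace/provider_name reference could be told apart. The helper `split_provider_ref` and the workspace name "team-a" are hypothetical; the platform's actual resolution logic is not documented here and may differ.

```python
from __future__ import annotations


def split_provider_ref(ref: str, current_workspace: str = "default") -> tuple[str, str]:
    """Return (workspace, provider_name) for a bare or qualified reference.

    Hypothetical helper for illustration; not part of the library.
    """
    workspace, sep, name = ref.partition("/")
    if not sep:
        # No slash: the provider is looked up in the current workspace.
        return current_workspace, ref
    if not workspace or not name:
        raise ValueError(f"malformed provider reference: {ref!r}")
    return workspace, name


print(split_provider_ref("my-classify-llm"))         # ('default', 'my-classify-llm')
print(split_provider_ref("team-a/my-classify-llm"))  # ('team-a', 'my-classify-llm')
```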

with_data(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure data parameters.

with_data_source(
df_source: pandas.DataFrame | str,
) typing_extensions.Self#

Set the data source for synthetic data generation.

Parameters:

df_source – Training dataset as a pandas DataFrame or a fetchable URL.

Returns:

The builder instance for method chaining.

with_differential_privacy(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure differential privacy parameters.

with_evaluate(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure evaluation parameters.

with_generate(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure generation parameters.

Calling this method also enables synthesis.

with_hf_token_secret(secret_name: str) typing_extensions.Self#

Configure Hugging Face authentication using a platform secret.

The secret must exist in the same workspace as the job and should contain a valid Hugging Face token.

Parameters:

secret_name – Name of the platform secret containing the Hugging Face token.

Returns:

The builder instance for method chaining.

with_replace_pii(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure PII replacement.

Calling this method enables PII replacement for the job. If no config is provided, the service uses its own defaults.

Parameters:
  • config – PII replacement config as a dict or Pydantic model, or None to use service-side defaults.

  • **kwargs – Individual PII replacement parameters to override.

Returns:

The builder instance for method chaining.
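The relationship between config and **kwargs described above can be pictured as a base mapping with keyword overrides layered on top. The sketch below assumes that behavior and uses plain dicts only. `merge_config` and the keys `option_a` and `option_b` are hypothetical placeholders, not real PII replacement parameters; a Pydantic v2 model could be converted to a dict with its `model_dump()` method before being passed to a helper like this.

```python
from __future__ import annotations

from typing import Any, Mapping, Optional


def merge_config(
    config: Optional[Mapping[str, Any]] = None, **kwargs: Any
) -> Optional[dict[str, Any]]:
    """Layer keyword overrides on top of an optional base config.

    Hypothetical helper; the builder's real merge rules are an assumption.
    """
    if config is None and not kwargs:
        # Nothing specified: signal that service-side defaults should apply.
        return None
    merged = dict(config or {})  # copy so the caller's mapping is not mutated
    merged.update(kwargs)        # keyword arguments win over config values
    return merged


print(merge_config())                                         # None
print(merge_config({"option_a": 1, "option_b": 2}, option_b=3))
# {'option_a': 1, 'option_b': 3}
```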

with_time_series(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure time-series parameters.

with_train(
config: safe_synthesizer.job_builder._ConfigInput = None,
**kwargs,
) typing_extensions.Self#

Configure training hyperparameters.

Calling this method also enables synthesis.
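Since with_train and with_generate both enable synthesis, an explicit synthesize() call is only needed when neither is used. A minimal sketch of that side effect follows. `FlagSketch` and its `synthesis_enabled` attribute are hypothetical names chosen for illustration, not the builder's actual internals.

```python
from __future__ import annotations

from typing import Any


class FlagSketch:
    """Hypothetical stand-in showing how a setter can imply synthesis."""

    def __init__(self) -> None:
        self.synthesis_enabled = False
        self.train: dict[str, Any] = {}

    def synthesize(self) -> FlagSketch:
        self.synthesis_enabled = True
        return self

    def with_train(self, **kwargs: Any) -> FlagSketch:
        self.train.update(kwargs)
        # Configuring training implies the job should synthesize data.
        return self.synthesize()


print(FlagSketch().synthesis_enabled)                                   # False
print(FlagSketch().with_train(learning_rate=0.0001).synthesis_enabled)  # True
```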

safe_synthesizer.job_builder.logger#

‘getLogger(…)’