List and Retrieve NeMo Safe Synthesizer Jobs#

List all NeMo Safe Synthesizer jobs and retrieve details for specific jobs.

Prerequisites#

Before you can list and retrieve NeMo Safe Synthesizer jobs, make sure that you have:

  • Obtained the base URL of your NeMo Safe Synthesizer service

  • Set the SAFE_SYN_BASE_URL environment variable to your NeMo Safe Synthesizer service endpoint

export SAFE_SYN_BASE_URL="https://your-safe-synthesizer-service-url"
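
A script can check that the variable is set and well-formed before it creates a client. The following is a minimal sketch using only the Python standard library; the helper name `get_base_url` is hypothetical and is not part of the NeMo Microservices SDK.

```python
import os
from urllib.parse import urlparse


def get_base_url(env=None):
    """Return the Safe Synthesizer base URL from the environment, or raise.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    url = env.get("SAFE_SYN_BASE_URL", "").strip()
    if not url:
        raise RuntimeError("SAFE_SYN_BASE_URL is not set")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise RuntimeError(f"SAFE_SYN_BASE_URL is not a valid http(s) URL: {url!r}")
    # Drop a trailing slash so paths can be appended without doubling it
    return url.rstrip("/")
```

The returned value can then be passed as `base_url` when initializing the client.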

To List NeMo Safe Synthesizer Jobs#

Use either the Python SDK or cURL to list all NeMo Safe Synthesizer jobs.

import os
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['SAFE_SYN_BASE_URL']
)

# List all jobs
jobs = client.beta.safe_synthesizer.jobs.list()

print(f"Found {len(jobs)} jobs")
for job in jobs:
    print(f"Job {job.id}: {job.name} - {job.status}")
    print(f"  Created: {job.created_at}")
    print(f"  Project: {job.project}")

# List all jobs (cURL)
curl -X GET \
  "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs" \
  -H 'Accept: application/json' \
  | jq

Example List Response
[
  {
    "id": "job-abc123def456",
    "name": "pii-redaction-example",
    "project": "default",
    "status": "completed",
    "created_at": "2024-01-15T10:30:00.000Z",
    "updated_at": "2024-01-15T10:45:00.000Z"
  },
  {
    "id": "job-def456ghi789",
    "name": "full-pipeline-demo",
    "project": "default",
    "status": "active",
    "created_at": "2024-01-15T11:00:00.000Z",
    "updated_at": "2024-01-15T11:15:00.000Z"
  }
]
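
If you call the REST endpoint directly, the response body is a JSON array like the example above. As a rough sketch of working with that shape, the snippet below parses a trimmed copy of the example payload and counts jobs per status. It runs offline and makes no assumptions about the SDK's return types.

```python
import json
from collections import Counter

# Payload copied from the example list response (trimmed to the fields used here)
response_text = """
[
  {"id": "job-abc123def456", "name": "pii-redaction-example", "status": "completed"},
  {"id": "job-def456ghi789", "name": "full-pipeline-demo", "status": "active"}
]
"""

jobs = json.loads(response_text)

# Count how many jobs are in each status
status_counts = Counter(job["status"] for job in jobs)
for status, count in sorted(status_counts.items()):
    print(f"{status}: {count}")
```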

To Retrieve a Specific NeMo Safe Synthesizer Job#

Use either the Python SDK or cURL to get details for a specific NeMo Safe Synthesizer job.

# Retrieve specific job details (reuses the client initialized in the list example)
job_id = "job-abc123def456"
job = client.beta.safe_synthesizer.jobs.retrieve(job_id)

print(f"Job ID: {job.id}")
print(f"Name: {job.name}")
print(f"Status: {job.status}")
print(f"Project: {job.project}")
print(f"Created: {job.created_at}")
print(f"Updated: {job.updated_at}")

# Access job configuration
if hasattr(job, 'spec') and job.spec:
    print(f"Data source: {job.spec.data_source}")
    if hasattr(job.spec, 'config') and job.spec.config:
        print(f"Configuration keys: {list(job.spec.config.keys())}")

JOB_ID="job-abc123def456"

# Get specific job details
curl -X GET \
  "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs/${JOB_ID}" \
  -H 'Accept: application/json' \
  | jq

Example Retrieve Response
{
  "id": "job-abc123def456",
  "name": "pii-redaction-example",
  "project": "default",
  "status": "completed",
  "created_at": "2024-01-15T10:30:00.000Z",
  "updated_at": "2024-01-15T10:45:00.000Z",
  "spec": {
    "data_source": "dataset-xyz789",
    "enable_replace_pii": true,
    "enable_synthesis": false,
    "replace_pii": {
      "globals": {"locales": ["en_US"]},
      "steps": [
        {
          "rows": {
            "update": [
              {"entity": ["email", "phone_number"], "value": "column.entity | fake"}
            ]
          }
        }
      ]
    },
    "config": {}
  }
}
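
To illustrate how the nested `spec` in the example response can be inspected, here is a small sketch that collects the entity names targeted by the `replace_pii` steps. It operates on the JSON shape shown in the example only; the helper name `replaced_entities` is hypothetical, and real jobs may use other step types that this sketch ignores.

```python
import json

# Trimmed copy of the example retrieve response
job = json.loads("""
{
  "id": "job-abc123def456",
  "spec": {
    "data_source": "dataset-xyz789",
    "enable_replace_pii": true,
    "enable_synthesis": false,
    "replace_pii": {
      "globals": {"locales": ["en_US"]},
      "steps": [
        {"rows": {"update": [
          {"entity": ["email", "phone_number"], "value": "column.entity | fake"}
        ]}}
      ]
    },
    "config": {}
  }
}
""")


def replaced_entities(job):
    """Collect entity names from every rows.update rule in spec.replace_pii.steps."""
    entities = []
    spec = job.get("spec") or {}
    steps = (spec.get("replace_pii") or {}).get("steps") or []
    for step in steps:
        for rule in (step.get("rows") or {}).get("update") or []:
            for entity in rule.get("entity", []):
                if entity not in entities:  # keep first-seen order, no duplicates
                    entities.append(entity)
    return entities


print(replaced_entities(job))
```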

Job Properties#

When listing or retrieving jobs, you see these key properties:

  • id: Unique identifier for the job

  • name: Human-readable name assigned to the job

  • project: Project that the job belongs to (typically “default”)

  • status: Current job status (created, pending, active, cancelling, cancelled, error, completed)

  • created_at: Timestamp of when the job was created

  • updated_at: Timestamp of when the job was last updated

  • spec: Job specification containing data source and configuration details
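
The `status` property is what a script would typically poll while waiting for a job to finish. The sketch below shows one way to do that. It assumes that `completed`, `cancelled`, and `error` are the final statuses, which the list above does not state explicitly, and `wait_for_job` is a hypothetical helper rather than an SDK function.

```python
import time

# Assumption: these three statuses are final. The documentation lists the
# possible values but does not say which ones are terminal.
TERMINAL_STATUSES = {"completed", "cancelled", "error"}


def wait_for_job(fetch_status, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

With the client from the earlier examples, the callable could be `lambda: client.beta.safe_synthesizer.jobs.retrieve(job_id).status`. Passing the fetch function in keeps the polling logic independent of the SDK.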

Filtering and Finding Jobs#

To find specific jobs, you can filter the results in your application code:

# List all jobs
jobs = client.beta.safe_synthesizer.jobs.list()

# Filter by status
completed_jobs = [job for job in jobs if job.status == "completed"]
print(f"Found {len(completed_jobs)} completed jobs")

# Filter by name pattern
pii_jobs = [job for job in jobs if "pii" in (job.name or "").lower()]
print(f"Found {len(pii_jobs)} PII-related jobs")

# Find most recent job
if jobs:
    latest_job = max(jobs, key=lambda j: j.created_at)
    print(f"Latest job: {latest_job.name} ({latest_job.id})")
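
Filtering by creation time works the same way. The sketch below operates on dictionaries shaped like the REST example responses, where `created_at` is an ISO 8601 string ending in `Z`. Whether the SDK returns strings or `datetime` objects for this field is not stated here, so treat the parsing step as an assumption; `parse_timestamp` and `jobs_created_since` are hypothetical helper names.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2024-01-15T10:30:00.000Z' as UTC."""
    # Replace the trailing 'Z' so fromisoformat also works on Python < 3.11
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def jobs_created_since(jobs, cutoff):
    """Return the jobs whose created_at is at or after the cutoff datetime."""
    return [job for job in jobs if parse_timestamp(job["created_at"]) >= cutoff]


# Sample data taken from the example list response
sample_jobs = [
    {"id": "job-abc123def456", "created_at": "2024-01-15T10:30:00.000Z"},
    {"id": "job-def456ghi789", "created_at": "2024-01-15T11:00:00.000Z"},
]

cutoff = datetime(2024, 1, 15, 10, 45, tzinfo=timezone.utc)
recent = jobs_created_since(sample_jobs, cutoff)
print([job["id"] for job in recent])
```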