Download Result

Download result artifacts from completed NeMo Safe Synthesizer jobs to access synthetic datasets, evaluation reports, and privacy analysis.

Prerequisites

Before you download job results, make sure that you have:

The base URL of your NeMo Safe Synthesizer service
The SAFE_SYN_BASE_URL environment variable set to that service endpoint
A completed job with available results
The result_name of the artifact you want to download

export SAFE_SYN_BASE_URL="https://your-safe-synthesizer-service-url"

Download a Specific Result

Download the file content of a single result artifact. The download returns binary data that you can save to disk or process in memory.

Python SDK

import os
from pathlib import Path
from nemo_microservices import NeMoMicroservices

# Initialize the client
client = NeMoMicroservices(
    base_url=os.environ['SAFE_SYN_BASE_URL']
)

# Download a specific result
job_id = "job-abc123def456"
result_name = "synthetic_data.csv"

try:
    # Download the result
    binary_response = client.beta.safe_synthesizer.jobs.results.download(
        result_name, job_id=job_id
    )
    
    # Save to file
    output_path = Path(f"downloads/{result_name}")
    output_path.parent.mkdir(parents=True, exist_ok=True)
    
    with open(output_path, 'wb') as f:
        f.write(binary_response.content)
    
    print(f"✓ Downloaded {result_name} to {output_path}")
    print(f"File size: {len(binary_response.content):,} bytes")
    
    # For text-based results, you can also access content directly
    if result_name.endswith(('.csv', '.json', '.txt')):
        text_content = binary_response.content.decode('utf-8')
        print("Preview (first 200 chars):")
        preview = text_content[:200] + "..." if len(text_content) > 200 else text_content
        print(preview)
        
except Exception as e:
    print(f"Error downloading result: {e}")

REST API

# Download a result to a local file
curl -L -X GET \
  "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs/<JOB_ID>/results/<RESULT_NAME>/download" \
  -o "downloaded_result.bin"

# Name the output file so its extension matches the result type
curl -L -X GET \
  "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs/<JOB_ID>/results/synthetic_data.csv/download" \
  -o "synthetic_data.csv"

curl -L -X GET \
  "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs/<JOB_ID>/results/evaluation_report.json/download" \
  -o "evaluation_report.json"

Response: Binary file content with appropriate MIME type headers.
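
If you call the REST endpoint from your own code, the URL has to be assembled from the base URL, job ID, and result name. The sketch below mirrors the path used in the curl examples above; `build_download_url` is a hypothetical helper name, and percent-encoding the path segments is an assumption about how the service expects names that contain spaces or other special characters.

```python
from urllib.parse import quote


def build_download_url(base_url: str, job_id: str, result_name: str) -> str:
    """Build the download URL for one result (hypothetical helper).

    The path layout is copied from the curl examples on this page.
    """
    # safe="" also encodes "/" so a name cannot introduce extra path segments.
    job = quote(job_id, safe="")
    name = quote(result_name, safe="")
    return (
        f"{base_url.rstrip('/')}/v1beta1/safe-synthesizer/jobs/"
        f"{job}/results/{name}/download"
    )


# Example usage (values are placeholders):
# url = build_download_url(os.environ["SAFE_SYN_BASE_URL"],
#                          "job-abc123def456", "synthetic_data.csv")
```

Stripping a trailing slash from the base URL avoids a double slash when SAFE_SYN_BASE_URL is set with one.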

Working with Different Result Types

How you handle a downloaded result depends on its file type:

CSV/Tabular Data

import pandas as pd
from io import StringIO

# Download and load CSV data
binary_response = client.beta.safe_synthesizer.jobs.results.download(
    "synthetic_data.csv", job_id=job_id
)

# Convert to pandas DataFrame
csv_content = binary_response.content.decode('utf-8')
df = pd.read_csv(StringIO(csv_content))

print(f"Dataset shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print("\nFirst few rows:")
print(df.head())

JSON Reports

import json

# Download and parse JSON report
binary_response = client.beta.safe_synthesizer.jobs.results.download(
    "evaluation_report.json", job_id=job_id
)

# Parse JSON content
json_content = binary_response.content.decode('utf-8')
report_data = json.loads(json_content)

print("Evaluation Report Summary:")
print(f"- Quality Score: {report_data.get('quality_score', 'N/A')}")
print(f"- Privacy Score: {report_data.get('privacy_score', 'N/A')}")
print(f"- Utility Metrics: {report_data.get('utility_metrics', {})}")

Binary Files

# Download binary files (PDFs, compressed archives, etc.)
binary_response = client.beta.safe_synthesizer.jobs.results.download(
    "privacy_analysis.pdf", job_id=job_id
)

# Save binary content directly
with open("privacy_analysis.pdf", "wb") as f:
    f.write(binary_response.content)

print(f"Downloaded binary file: {len(binary_response.content):,} bytes")

Batch Download Results

Download every result for a job by listing its results and then downloading each one in turn:

Python SDK

from pathlib import Path

# Assumes `client` and `job_id` are defined as in the earlier example

# List all results first
results = client.beta.safe_synthesizer.jobs.results.list(job_id)

# Create download directory
download_dir = Path(f"job_results_{job_id}")
download_dir.mkdir(exist_ok=True)

print(f"Downloading {len(results)} results to {download_dir}/")

for result in results:
    try:
        # Download each result
        binary_response = client.beta.safe_synthesizer.jobs.results.download(
            result.result_name, job_id=job_id
        )
        
        # Save to appropriately named file
        output_path = download_dir / result.result_name
        with open(output_path, 'wb') as f:
            f.write(binary_response.content)
            
        print(f"✓ Downloaded {result.result_name} ({len(binary_response.content):,} bytes)")
        
    except Exception as e:
        print(f"✗ Failed to download {result.result_name}: {e}")

print(f"\nDownload complete. Files saved to {download_dir}/")
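
The loop above joins `result.result_name` directly onto the download directory. This page does not say whether result names can contain path separators, so the following is only a defensive sketch: `safe_output_path` is a hypothetical helper that rejects any name that would resolve outside the download directory and creates intermediate folders for nested names.

```python
from pathlib import Path


def safe_output_path(download_dir: Path, result_name: str) -> Path:
    """Return a path for result_name that stays inside download_dir.

    Hypothetical helper; raises ValueError for names such as "../x"
    or absolute paths that would escape the download directory.
    """
    base = download_dir.resolve()
    candidate = (base / result_name).resolve()
    if candidate == base or base not in candidate.parents:
        raise ValueError(f"Refusing to write outside {base}: {result_name!r}")
    # Create the download directory and any nested folders in the name.
    candidate.parent.mkdir(parents=True, exist_ok=True)
    return candidate


# Example usage inside the batch loop:
# output_path = safe_output_path(download_dir, result.result_name)
```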

Shell Script

#!/bin/bash

JOB_ID="job-abc123def456"
DOWNLOAD_DIR="job_results_${JOB_ID}"

# Create download directory
mkdir -p "${DOWNLOAD_DIR}"

# Get list of results (requires jq for JSON parsing)
# -f makes curl exit non-zero on HTTP errors instead of returning the error body
RESULTS=$(curl -fsS "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs/${JOB_ID}/results")

# Download each result and report success only if curl succeeded
echo "${RESULTS}" | jq -r '.[].result_name' | while read -r RESULT_NAME; do
    echo "Downloading ${RESULT_NAME}..."
    if curl -fsSL \
        "${SAFE_SYN_BASE_URL}/v1beta1/safe-synthesizer/jobs/${JOB_ID}/results/${RESULT_NAME}/download" \
        -o "${DOWNLOAD_DIR}/${RESULT_NAME}"; then
        echo "✓ Downloaded ${RESULT_NAME}"
    else
        echo "✗ Failed to download ${RESULT_NAME}" >&2
    fi
done

echo "Finished. Results saved to ${DOWNLOAD_DIR}/"

Error Handling

Common errors when downloading results:

Result Not Found

{
  "detail": "Result not found: invalid_result_name"
}

Solution: List available results first to confirm the correct result_name.
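
One way to apply this check before downloading is sketched below. `resolve_result_name` is a hypothetical helper, not part of the SDK; it takes the names returned by the list call used in the batch example and suggests close matches when the requested name is missing.

```python
import difflib


def resolve_result_name(requested: str, available: list[str]) -> str:
    """Return `requested` if it is available, else raise with suggestions.

    Hypothetical helper for catching typos before calling download().
    """
    if requested in available:
        return requested
    suggestions = difflib.get_close_matches(requested, available, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Result not found: {requested!r}.{hint}")


# Example usage, assuming `client` and `job_id` from the earlier examples:
# available = [r.result_name
#              for r in client.beta.safe_synthesizer.jobs.results.list(job_id)]
# result_name = resolve_result_name("synthetic_data.csv", available)
```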

Result Not Ready

{
  "detail": "Result artifact not yet available"
}

Solution: Wait for the job to complete fully. Check job status before downloading.
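
Waiting can be automated with a small polling loop. This page does not show how to read a job's status, so the sketch below takes a caller-supplied `get_status` function, and the state names ("completed", "error", "cancelled") are placeholders chosen for illustration; replace them with the values your service actually reports.

```python
import time
from typing import Callable


def wait_for_completion(
    get_status: Callable[[], str],
    done_states: frozenset = frozenset({"completed"}),           # assumed state name
    failed_states: frozenset = frozenset({"error", "cancelled"}),  # assumed state names
    interval: float = 5.0,
    timeout: float = 600.0,
) -> str:
    """Poll get_status() until the job finishes, fails, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done_states:
            return status
        if status in failed_states:
            raise RuntimeError(f"Job ended in state {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status!r} after {timeout} seconds")
        time.sleep(interval)
```

Call the download only after `wait_for_completion` returns, so that a failed or cancelled job surfaces as an exception rather than as a "not yet available" response.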

Storage Access Error

{
  "detail": "Unable to access result storage"
}

Solution: This is typically a temporary issue. Retry after a moment.

Large File Timeout

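Error: Connection timeout during download

Solution: For large files, use retry logic or download in chunks. The following example retries a failed download up to three times, pausing five seconds between attempts: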

import time

def download_with_retry(client, result_name, job_id, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.beta.safe_synthesizer.jobs.results.download(
                result_name, job_id=job_id
            )
        except Exception:
            if attempt < max_retries - 1:
                print(f"Attempt {attempt + 1} failed, retrying in 5 seconds...")
                time.sleep(5)
            else:
                # Out of retries: re-raise the last error with its original traceback
                raise
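
For the chunked approach, this page does not show whether the SDK exposes a streaming response, so the sketch below works on any binary file-like object instead, for example the response returned by `urllib.request.urlopen` against the REST download URL. `save_stream` is a hypothetical helper; it copies the stream to disk in fixed-size chunks so the whole file never has to fit in memory.

```python
import shutil
from pathlib import Path
from typing import BinaryIO


def save_stream(stream: BinaryIO, output_path: Path, chunk_size: int = 1024 * 1024) -> int:
    """Copy a binary stream to output_path in chunks; return bytes written."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary sibling first so an interrupted download
    # never leaves a truncated file under the final name.
    tmp_path = output_path.with_name(output_path.name + ".part")
    with open(tmp_path, "wb") as f:
        shutil.copyfileobj(stream, f, chunk_size)
        written = f.tell()
    tmp_path.replace(output_path)
    return written


# Example usage with the standard library (URL is a placeholder):
# import urllib.request
# with urllib.request.urlopen(download_url, timeout=60) as resp:
#     save_stream(resp, Path("downloads/synthetic_data.csv"))
```

Renaming the `.part` file only after the copy finishes means a retry can start cleanly without mistaking a partial file for a complete one.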

Next Steps

After downloading results, you can:

Process synthetic data with your preferred data analysis tools
Review evaluation reports to understand data quality and privacy metrics
Share results with team members while following your organization’s data governance policies
Create new jobs based on insights from the evaluation reports

Important

Downloaded synthetic data maintains the privacy protections configured in your original job. Follow your organization’s data handling policies when storing, sharing, or processing downloaded results.