Getting Started with NeMo Safe Synthesizer
Use NeMo Safe Synthesizer to generate private synthetic versions of sensitive tabular datasets.
Prerequisites
Before using NeMo Safe Synthesizer, complete the NeMo Platform Quickstart to install the CLI/SDK and deploy the platform.
NeMo Safe Synthesizer has the following additional requirements:
An NVIDIA GPU on the host machine with 80GB+ VRAM (check with nvidia-smi). This is separate from any GPU inside a NIM container: Safe Synthesizer training runs directly on the host.
Sufficient disk space for generated datasets (50GB+ recommended)
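The two host requirements above can be checked from a shell. The following is a minimal sketch, not an official tool. The function names vram_ok and free_disk_gib are hypothetical, the 80000 MiB floor is an assumed stand-in for "80GB+" (nvidia-smi reports total memory in MiB and the exact figure varies by card), and the disk check measures the current directory, which may not be where your datasets are written.

```shell
# Assumed thresholds derived from the prerequisites; adjust for your setup.
MIN_VRAM_MIB=80000
MIN_DISK_GIB=50

# Succeeds if any line on stdin (one MiB figure per GPU) is >= $1.
vram_ok() {
  awk -v min="$1" '($1 + 0) >= min { ok = 1 } END { exit ok ? 0 : 1 }'
}

# Prints free space, in whole GiB, on the filesystem that contains path $1.
free_disk_gib() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# GPU check: query total memory per GPU as bare numbers in MiB.
if command -v nvidia-smi >/dev/null 2>&1; then
  if nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
      | vram_ok "$MIN_VRAM_MIB"; then
    echo "GPU: found a device with at least ${MIN_VRAM_MIB} MiB"
  else
    echo "GPU: no device with at least ${MIN_VRAM_MIB} MiB"
  fi
else
  echo "GPU: nvidia-smi not found on PATH"
fi

# Disk check: compare free space in the current directory to the recommendation.
free_gib="$(free_disk_gib .)"
if [ "${free_gib:-0}" -ge "$MIN_DISK_GIB" ]; then
  echo "Disk: ${free_gib} GiB free"
else
  echo "Disk: only ${free_gib:-0} GiB free (${MIN_DISK_GIB}+ recommended)"
fi
```

The script only reports what it finds and does not exit with an error, so it can be run before or after deploying the platform.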
For general platform troubleshooting (port conflicts, health checks, and so on), refer to the main quickstart guide.
Note
The platform pre-configures a system/nvidia-build model provider during startup.
This provider routes inference requests to models hosted on build.nvidia.com using the API base URL https://integrate.api.nvidia.com
and the NGC API key with Public API Endpoints permissions provided during deployment (automatically saved as the built-in system/ngc-api-key secret).
You can verify this provider exists by running nmp inference providers list --workspace system.
The tutorials in these docs use this provider for inference, but you can create and use your own provider instead.
Using the CLI
Interact with NeMo Safe Synthesizer using the nmp CLI:
# List jobs
nmp safe-synthesizer jobs list
# Create a job from a config file
nmp safe-synthesizer jobs create --input-file config.json
# Create a job with inline JSON
nmp safe-synthesizer jobs create --input-data '{"spec": {...}}'
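When creating a job from a file, it can help to confirm that the file parses as JSON before submitting it. The sketch below wraps the jobs create command shown above. The helper names is_valid_json and submit_config are hypothetical, and the check assumes python3 is available on the host. It validates JSON syntax only, not the job spec itself.

```shell
# Succeeds if the file at $1 parses as JSON (uses Python's standard json.tool module).
is_valid_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Submits the config file at $1 only if it parses as JSON.
submit_config() {
  if is_valid_json "$1"; then
    nmp safe-synthesizer jobs create --input-file "$1"
  else
    echo "error: $1 is not valid JSON; nothing was submitted" >&2
    return 1
  fi
}
```

After defining these functions in your shell, run submit_config config.json in place of calling jobs create directly.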
Next Steps
Run one of the tutorials to create your first synthetic dataset:
Safe Synthesizer 101 Tutorial - A beginner-friendly introduction
Differential Privacy Tutorial - Generate differentially-private synthetic data