NeMo Data Designer Microservice Deployment Guide
The NeMo Data Designer microservice provides synthetic data generation for the NeMo platform. NeMo Data Designer is in Early Access and supports deployment in “Quickstart” mode using Docker Compose. The Docker Compose deployment includes all the components needed to run Data Designer, and the Data Designer API is exposed on your localhost.
Data Designer generates realistic synthetic datasets with configurable column types and constraints, using large language models to produce the data.
Key Features
Synthetic Data Generation: Create realistic datasets with various data types
Column-based Configuration: Define custom column types, constraints, and relationships
LLM Integration: Leverage language models for intelligent data generation
Batch Processing: Support for large-scale dataset generation via batch jobs
API-driven: RESTful API for programmatic access and integration
Prerequisites
Before deploying Data Designer, ensure you have:
Docker and Docker Compose installed
NGC API key for accessing the NVIDIA container registry
Access to LLM endpoints (local NIM or NVIDIA API)
Sufficient storage for generated artifacts
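The prerequisites above can be checked before starting a deployment. The following is a minimal sketch, not part of the Data Designer tooling. The function names, the `NGC_API_KEY` environment variable name, and the 10 GB free-space threshold are assumptions chosen for illustration; adjust them to match your environment. It does not verify access to LLM endpoints.

```python
import os
import shutil
import subprocess


def compose_available():
    """Return True if `docker compose version` runs successfully."""
    try:
        result = subprocess.run(
            ["docker", "compose", "version"],
            capture_output=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_prerequisites(
    env=None,
    which=shutil.which,
    compose_ok=compose_available,
    free_bytes=None,
    min_free_gb=10.0,           # assumed threshold, not an official requirement
    api_key_var="NGC_API_KEY",  # assumed variable name
):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    # Docker must be on PATH; Compose is only checked if Docker is present.
    if which("docker") is None:
        problems.append("docker not found on PATH")
    elif not compose_ok():
        problems.append("docker compose is not available")

    # The NGC API key is expected in an environment variable.
    if not env.get(api_key_var):
        problems.append(f"{api_key_var} is not set")

    # Free space is measured on the current working directory's filesystem.
    if free_bytes is None:
        free_bytes = shutil.disk_usage(".").free
    if free_bytes < min_free_gb * 1024**3:
        problems.append(f"less than {min_free_gb} GB of free disk space")

    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current machine and returns a list of messages describing anything that is missing. The checks are passed in as parameters so that the function can be exercised without Docker installed.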
Deployment Options
Choose one of the following deployment options based on your use case:
Deploy the Data Designer microservice using Docker Compose for local development and testing.
Configuration and Management
Customize and manage your Data Designer deployment:
Resolve common issues and debug deployment problems.