Synthetic Data Generation with StyleGAN-XL#
Synthetic data generation is a powerful approach to augment training datasets, especially when real-world data is limited, expensive to collect, or sensitive in nature. One of the most advanced tools for generating high-quality synthetic images is StyleGAN-XL, a large-scale generative adversarial network designed for high-resolution image synthesis.
StyleGAN-XL extends the capabilities of the original StyleGAN architecture by improving training stability and scalability, making it suitable for generating diverse and realistic samples across a wide range of categories. It is particularly effective for generating class-conditional data and scaling to complex domains with high fidelity.
This technique is commonly used to:
- Balance class distribution in imbalanced datasets
- Generate rare or underrepresented examples
- Augment training sets for privacy-sensitive applications
- Improve robustness and generalization of downstream models
StyleGAN-XL is included in TAO. It supports the following tasks:
- dataset_convert
- train
- evaluate
- inference
- export
These tasks can be invoked from the TAO Launcher using the following convention on the command line:
tao model stylegan_xl <sub_task> <args_per_subtask>
where args_per_subtask refers to the command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.
Data Input for StyleGAN-XL#
StyleGAN-XL requires the dataset to be organized in a folder structure where each class is represented by a subfolder containing its corresponding images. See the Data Annotation Format page for more information.
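For example, a conditional dataset with two classes (the class and file names below are hypothetical) would be laid out as follows:

/path/to/dataset_root/train/
├── class_a/
│   ├── image_0001.jpg
│   └── image_0002.jpg
└── class_b/
    ├── image_0003.jpg
    └── image_0004.jpg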
Creating a Dataset Convert Spec File#
We provide a dataset tool entry point, dataset_convert, to convert the dataset folder described above into a zipped file. This tool serves two main purposes:
Dataset Portability and Performance: Zipping datasets simplifies transferring them between file servers and clusters and may improve training performance when using network file systems.
Image Preprocessing: StyleGAN-XL requires square, fixed-resolution images for training. The tool can crop and/or resize images to meet the resolution requirements of StyleGAN-XL’s progressive training workflow. This training process involves starting with lower resolutions (e.g., 16x16) and progressively increasing to higher resolutions (e.g., 256x256). Consequently, we need multiple versions of the dataset, such as 16x16, 32x32, 64x64, 128x128, and 256x256.
Here is an example spec file for converting the images in a train folder into a zipped file, with the images resized to 16x16 resolution:
source: /path/to/dataset_root/train
results_dir: /path/to/experiment_results
dest_file_name: train_16.zip
resolution: [16, 16]
transform: null
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| source | string | Source dataset, which follows the image_classification_format | /path/to/dataset_root/train | | | | |
| results_dir | string | Result directory | /path/to/experiment_results | | | | |
| dest_file_name | string | Destination zipped file name generated from the source dataset | train_16.zip | | | | |
| resolution | list | The resolution of the resized image | [128, 128] | | | | FALSE |
| transform | categorical | Transformation applied before resizing, such as 'center-crop', which can avoid distortion | center-crop | | | | |
Converting the Dataset#
Use the following command to run StyleGAN-XL dataset conversion:
tao model stylegan_xl dataset_convert [-h] -e <dataset_convert_spec>
[results_dir=<global_results_dir>]
[source=<source_image_folder>]
[dest_file_name=<destination_zipped_file_name>]
[resolution=<resized_resolution>]
[transform=<transformation_applied>]
Required Arguments#
The following arguments are required.
-e, --experiment_spec_file: The path to the experiment spec file.
Optional Arguments#
You can set optional arguments to override the option values in the experiment spec file.
-h, --help: Show this help message and exit.
results_dir: Overrides the results_dir value in the spec file.
source: Overrides the source value in the spec file.
dest_file_name: Overrides the dest_file_name value in the spec file.
resolution: Overrides the resolution value in the spec file.
transform: Overrides the transform value in the spec file.
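Because the progressive training workflow consumes the dataset at several resolutions, the conversion is typically run once per resolution. Below is a minimal sketch that assumes the override syntax shown above; the spec file and dataset paths are placeholders, and the resolution list should match your training plan:

for RES in 16 32 64 128 256; do
  tao model stylegan_xl dataset_convert \
    -e /path/to/dataset_convert_spec.yaml \
    source=/path/to/dataset_root/train \
    "dest_file_name=train_${RES}.zip" \
    "resolution=[${RES},${RES}]"
done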
Creating an Experiment Spec File#
The training experiment spec file for StyleGAN-XL includes model, train, and dataset parameters.
Use the following command to create an experiment spec file for StyleGAN-XL:
SPECS=$(tao-client stylegan_xl get-spec --action train --job_type experiment --id $EXPERIMENT_ID)
See also
For information on how to create an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.
Here is an example spec file for training a StyleGAN-XL model at 16x16 resolution:
results_dir: /path/to/experiment_results
encryption_key: tlt_encode
task: stylegan
train:
  resume_training_checkpoint_path: null
  pretrained_model_path: null
  num_epochs: 3000
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  deterministic_all: True
  validation_interval: 1
  checkpoint_interval: 1
  stylegan:
    gan_seed_offset: 0 # Try changing this when encountering GAN mode collapse
    optim_generator:
      lr: 0.0025
      optim: "Adam"
      betas: [0, 0.99]
      eps: 1e-08
    optim_discriminator:
      lr: 0.002
      optim: "Adam"
      betas: [0, 0.99]
      eps: 1e-08
  results_dir: "${results_dir}/train"
model:
  input_embeddings_path: /tao-pt/nvidia_tao_pytorch/sdg/stylegan_xl/pretrained_modules/tf_efficientnet_lite0_embed.pth
  generator:
    backbone: stylegan3-t
    superres: False
    added_head_superres: # Ignore this sub-section when superres == False
      head_layers: [4, 4, 4, 4, 4]
      up_factor: [2, 2, 2, 2, 2]
      pretrained_stem_path: /path/to/the/stem.pth
      reinit_stem_anyway: False
    stem:
      fp32: False
      cbase: 16384
      cmax: 256
      syn_layers: 7
      resolution: 16
  stylegan:
    loss:
      cls_weight: 0.0
    discriminator:
      backbones: ["deit_base_distilled_patch16_224", "tf_efficientnet_lite0"]
    metrics:
      inception_fid_path: /tao-pt/nvidia_tao_pytorch/sdg/stylegan_xl/pretrained_modules/InceptionV3.pth
dataset:
  common:
    cond: True
    num_classes: 6 # Must be 0 when cond == False
    img_channels: 3
    img_resolution: 16 # 512 if the datasets below use 512x512 images
  stylegan:
    train_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    validation_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    test_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    infer_dataset:
      start_seed: 0
      end_seed: 50
    mirror: True
    batch_size: 16
    workers: 3
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| | string | Name of model if invoking task via | | | | | |
| encryption_key | string | Key for encrypting model checkpoints | | | | | |
| results_dir | string | Path to where all the assets generated from a task are stored | /results | | | | |
| | collection | | | | | | FALSE |
| task | categorical | The task to be performed | stylegan | | | stylegan,bigdatasetgan | |
| model | collection | Configuration parameters for the model | | | | | FALSE |
| dataset | collection | Configuration parameters for the dataset | | | | | FALSE |
| train | collection | Configuration parameters for the training | | | | | FALSE |
| evaluate | collection | Configuration parameters for the evaluation | | | | | FALSE |
| inference | collection | Configuration parameters for the inference | | | | | FALSE |
| export | collection | Configuration parameters for the export | | | | | FALSE |
| | collection | Configuration parameters for the TRT engine | | | | | FALSE |
model#
The model parameter contains the hyperparameters for configuring the model. Here is the model section of the training experiment spec file:
model:
  input_embeddings_path: /tao-pt/nvidia_tao_pytorch/sdg/stylegan_xl/pretrained_modules/tf_efficientnet_lite0_embed.pth
  generator:
    backbone: stylegan3-t
    superres: False
    added_head_superres: # Ignore this sub-section when superres == False
      head_layers: [4, 4, 4, 4, 4]
      up_factor: [2, 2, 2, 2, 2]
      pretrained_stem_path: /path/to/the/stem.pth
      reinit_stem_anyway: False
    stem:
      fp32: False
      cbase: 16384
      cmax: 256
      syn_layers: 7
      resolution: 16
  stylegan:
    loss:
      cls_weight: 0.0
    discriminator:
      backbones: ["deit_base_distilled_patch16_224", "tf_efficientnet_lite0"]
    metrics:
      inception_fid_path: /tao-pt/nvidia_tao_pytorch/sdg/stylegan_xl/pretrained_modules/InceptionV3.pth
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| loss | collection | Configuration parameters for the loss function | | | | | FALSE |
| generator | collection | Configuration parameters for the generator | | | | | FALSE |
| input_embeddings_path | string | The path to the pretrained input embeddings | | | | | |
| stylegan | collection | Configuration parameters for the StyleGAN model | | | | | FALSE |
| bigdatasetgan | collection | Configuration parameters for the BigDatasetGAN model | | | | | FALSE |
loss#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| cls_weight | float | The weight for the classification loss | 0 | | | | |
generator#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| backbone | categorical | The backbone architecture to be used for the generator | stylegan3-r | | | stylegan3-t,stylegan3-r,stylegan2,fastgan | |
| superres | bool | Whether to use a super-resolution generator backbone | False | | | | |
| added_head_superres | collection | Configuration parameters for the super-resolution generator backbone | | | | | FALSE |
| stem | collection | Configuration parameters for the stem generator backbone | | | | | FALSE |
added_head_superres#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| head_layers | list | The layers list to be added to the super-resolution generator backbone | [7] | | | | FALSE |
| up_factor | list | The up-factor list for the super-resolution generator backbone | [2] | | | | FALSE |
| pretrained_stem_path | string | The path to the pretrained stem generator backbone | | | | | |
| reinit_stem_anyway | bool | Whether to reinitialize the stem generator backbone forcefully | True | | | | |
| | bool | Whether to train the head only | True | | | | |
stem#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| fp32 | bool | Whether to use fp32 for the stem generator backbone | False | | | | |
| cbase | int | The base channel for the stem generator backbone | 32768 | | | | |
| cmax | int | The max channel for the stem generator backbone | 512 | | | | |
| syn_layers | int | The number of syn layers for the stem generator backbone | 10 | | | | |
| resolution | int | The resolution for the stem generator backbone | 128 | | | | |
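When growing from the low-resolution stem to a higher target resolution, the generator is typically switched to super-resolution mode and pointed at the stem checkpoint trained at the previous stage. The snippet below is a sketch only, assuming a hypothetical checkpoint path and a single 2x upsampling head (16x16 stem to 32x32 output); the number of entries in head_layers and up_factor depends on how many upsampling stages you add, as in the [4, 4, 4, 4, 4] / [2, 2, 2, 2, 2] example above:

model:
  generator:
    backbone: stylegan3-t
    superres: True
    added_head_superres:
      head_layers: [4]   # one added head for a single upsampling stage
      up_factor: [2]     # 16x16 stem -> 32x32 output
      pretrained_stem_path: /path/to/stylegan_xl_16x16_checkpoint.pth   # hypothetical path to the 16x16 stage checkpoint
      reinit_stem_anyway: False
    stem:
      resolution: 16     # resolution the stem was trained at

The dataset section would also need to point at the matching higher-resolution zipped dataset and img_resolution.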
train#
The train parameter defines the hyperparameters of the training process.
train:
  resume_training_checkpoint_path: null
  pretrained_model_path: null
  num_epochs: 3000
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  deterministic_all: True
  validation_interval: 1
  checkpoint_interval: 1
  stylegan:
    gan_seed_offset: 0 # Try changing this when encountering GAN mode collapse
    optim_generator:
      lr: 0.0025
      optim: "Adam"
      betas: [0, 0.99]
      eps: 1e-08
    optim_discriminator:
      lr: 0.002
      optim: "Adam"
      betas: [0, 0.99]
      eps: 1e-08
  results_dir: "${results_dir}/train"
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| num_gpus | int | The number of GPUs to run the train job | 1 | 1 | | | |
| gpu_ids | list | List of GPU IDs to run the training on. The length of this list must be equal to the number of GPUs in train.num_gpus | [0] | | | | FALSE |
| num_nodes | int | Number of nodes to run the training on. If > 1, then training runs on multiple nodes | 1 | 1 | | | |
| | int | The seed for the initializer in PyTorch. If < 0, then it disables a fixed seed | 1234 | -1 | inf | | |
| | collection | | | | | | FALSE |
| num_epochs | int | Number of epochs to run the training | 10 | 1 | inf | | |
| checkpoint_interval | int | The interval (in epochs) at which a checkpoint is to be saved. Helps resume training | 1 | 1 | | | |
| validation_interval | int | The interval (in epochs) at which an evaluation is triggered on the validation dataset | 1 | 1 | | | |
| resume_training_checkpoint_path | string | Path to the checkpoint from which to resume training | | | | | |
| results_dir | string | Path to the place where all the assets generated from a task are stored | | | | | |
| deterministic_all | bool | Whether to use deterministic training in order to reproduce the results | False | | | | |
| pretrained_model_path | string | The path to the pretrained model | | | | | |
| stylegan | collection | Configuration parameters for the StyleGAN trainer | | | | | FALSE |
| bigdatasetgan | collection | Configuration parameters for the BigDatasetGAN trainer | | | | | FALSE |
| | collection | Configuration parameters for the tensorboard logger | | | | | FALSE |
stylegan#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| gan_seed_offset | int | The seed offset for the GAN for randomness control | 0 | | | | |
| optim_generator | collection | Configuration parameters for the generator optimizer | | | | | FALSE |
| optim_discriminator | collection | Configuration parameters for the discriminator optimizer | | | | | FALSE |
optim_generator#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| optim | string | Type of optimizer used to train the generator | Adam | | | | |
| lr | float | The learning rate for training the generator | 0.0025 | 0 | inf | | TRUE |
| eps | float | The epsilon for the Adam optimizer | 1e-08 | | | | |
| betas | list | The betas for the Adam optimizer | [0, 0.99] | | | | FALSE |
optim_discriminator#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| optim | string | Type of optimizer used to train the discriminator | Adam | | | | |
| lr | float | The learning rate for training the discriminator | 0.002 | 0 | inf | | TRUE |
| eps | float | The epsilon for the Adam optimizer | 1e-08 | | | | |
| betas | list | The betas for the Adam optimizer | [0, 0.99] | | | | FALSE |
dataset#
The dataset parameter defines the dataset paths and the hyperparameters for the dataloader.
dataset:
  common:
    cond: True
    num_classes: 6 # Must be 0 when cond == False
    img_channels: 3
    img_resolution: 16 # 512 if the datasets below use 512x512 images
  stylegan:
    train_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    validation_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    test_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    infer_dataset:
      start_seed: 0
      end_seed: 50
    mirror: True
    batch_size: 16
    workers: 3
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| stylegan | collection | Configuration parameters for the StyleGAN dataset | | | | | FALSE |
| bigdatasetgan | collection | Configuration parameters for the BigDatasetGAN dataset | | | | | FALSE |
| common | collection | Configuration parameters for the common dataset | | | | | FALSE |
| batch_size | int | The batch size for the dataset | 64 | 1 | inf | | |
| pin_memory | bool | Whether to pin the memory for the dataset | True | | | | |
| prefetch_factor | int | The prefetch factor for the dataset | 2 | 1 | inf | | |
| workers | int | The number of workers for the dataset | 3 | 1 | inf | | |
stylegan#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| train_dataset | collection | Configuration parameters for the training dataset | | | | | FALSE |
| validation_dataset | collection | Configuration parameters for the validation dataset | | | | | FALSE |
| test_dataset | collection | Configuration parameters for the test dataset | | | | | FALSE |
| infer_dataset | collection | Configuration parameters for the inference dataset | | | | | FALSE |
| batch_size | int | The fixed batch size for a single GPU in order to achieve gradient accumulation | 16 | 1 | inf | | |
| mirror | bool | Whether to mirror the images as augmentation in training | True | | | | |
train_dataset/validation_dataset/test_dataset#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| images_dir | string | The path to the zipped file of images or directory of images | ??? | | | | |
infer_dataset#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| start_seed | int | The start seed for the seed dataset | 0 | 0 | inf | | |
| end_seed | int | The end seed for the seed dataset | 100 | 0 | inf | | |
common#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| cond | bool | Whether to use conditional training | False | | | | |
| img_resolution | int | The resolution of the images | 128 | | | | |
| img_channels | int | The number of channels in the images | 3 | | | | |
| num_classes | int | The number of classes in the dataset | 0 | | | | |
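For unconditional training, class conditioning is disabled and the class count is set to zero, as noted in the comments of the example spec:

dataset:
  common:
    cond: False
    num_classes: 0   # must be 0 when cond is False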
Training the Model#
Use the following command to run StyleGAN-XL training:
TRAIN_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action train --id $EXPERIMENT_ID --specs "$SPECS")
See also
For information on how to create an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.
tao model stylegan_xl train [-h] -e <experiment_spec>
[results_dir=<global_results_dir>]
[model.<model_option>=<model_option_value>]
[dataset.<dataset_option>=<dataset_option_value>]
[train.<train_option>=<train_option_value>]
[train.gpu_ids=<gpu indices>]
[train.num_gpus=<number of gpus>]
Required Arguments#
-e, --experiment_spec_file: The path to the experiment spec file.
Optional Arguments#
You can set optional arguments to override the option values in the experiment spec file.
-h, --help: Show this help message and exit.
model.<model_option>: The model options.
dataset.<dataset_option>: The dataset options.
train.<train_option>: The train options.
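For example, to launch training on two GPUs while keeping all other values from the spec file (the spec path below is a placeholder):

tao model stylegan_xl train -e /path/to/train_spec.yaml \
    train.num_gpus=2 \
    train.gpu_ids=[0,1]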
Evaluating the Model#
evaluate#
The evaluate parameter defines the hyperparameters of the evaluation process.
evaluate:
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  checkpoint: /path/to/model.pth
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| num_gpus | int | The number of GPUs to run the evaluation job | 1 | 1 | | | |
| gpu_ids | list | List of GPU IDs to run the evaluation on | [0] | | | | FALSE |
| num_nodes | int | Number of nodes to run the evaluation on. If > 1, then multi-node is enabled | 1 | 1 | | | |
| checkpoint | string | Path to the checkpoint used for evaluation | ??? | | | | |
| | string | The path to the TRT engine | ??? | | | | |
| results_dir | string | Path to where all the assets generated from a task are stored | | | | | |
| | int | The number of batches to visualize the results | 16 | 1 | inf | | |
To run evaluation with a StyleGAN-XL model, use this command:
EVAL_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action evaluate --id $EXPERIMENT_ID --parent_job_id $TRAIN_JOB_ID --specs "$SPECS")
See also
For information on how to create an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.
tao model stylegan_xl evaluate [-h] -e <experiment_spec>
evaluate.checkpoint=<model to be evaluated>
[evaluate.<evaluate_option>=<evaluate_option_value>]
[evaluate.gpu_ids=<gpu indices>]
[evaluate.num_gpus=<number of gpus>]
Required Arguments#
The following arguments are required.
-e, --experiment_spec: The experiment spec file to set up the evaluation experiment.
evaluate.checkpoint: The .pth model to be evaluated.
Optional Arguments#
The following arguments are optional to run the command.
evaluate.<evaluate_option>: The evaluate options.
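For example (the paths below are placeholders):

tao model stylegan_xl evaluate -e /path/to/train_spec.yaml \
    evaluate.checkpoint=/path/to/experiment_results/train/model.pth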
Inferencing the Model#
inference#
The inference parameter defines the hyperparameters of the inference process.
inference:
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  checkpoint: /path/to/model.pth
  truncation_psi: 1.0
  translate: [0.0, 0.0]
  rotate: 0.0
  class_idx: 0
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| num_gpus | int | The number of GPUs to run the inference job | 1 | 1 | | | |
| gpu_ids | list | List of GPU IDs to run the inference on. The length of this list must be equal to the number of GPUs in inference.num_gpus | [0] | | | | FALSE |
| num_nodes | int | Number of nodes to run the inference on. If > 1, then multi-node is enabled | 1 | 1 | | | |
| checkpoint | string | Path to the checkpoint used for inference | ??? | | | | |
| | string | The path to the TRT engine | ??? | | | | |
| results_dir | string | Path to the place where all the assets generated from a task are stored | | | | | |
| | int | The number of batches to visualize the results | 1 | 1 | inf | | |
| truncation_psi | float | The truncation psi for the image generation | 1.0 | 0 | 1.0 | | |
| translate | list | The translation for the image generation | [0.0, 0.0] | | | | FALSE |
| rotate | float | The rotation for the image generation | 0 | 0.0 | 360.0 | | |
| | string | The path to the centroids | | | | | |
| class_idx | int | The class index for the image generation | 0 | 0 | inf | | |
The inference tool for StyleGAN-XL can be used to generate synthetic images based on the random seeds specified in the infer_dataset section of the spec file:
INFERENCE_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action inference --id $EXPERIMENT_ID --parent_job_id $TRAIN_JOB_ID --specs "$SPECS")
See also
For information on how to create an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.
tao model stylegan_xl inference [-h] -e <experiment spec file>
inference.checkpoint=<model to be inferenced>
[inference.<inference_option>=<inference_option_value>]
[inference.gpu_ids=<gpu indices>]
[inference.num_gpus=<number of gpus>]
Required Arguments#
The following arguments are required to run the command.
-e, --experiment_spec: The experiment spec file to set up the inference experiment.
inference.checkpoint: The .pth model to run inference on.
Optional Arguments#
The following arguments are optional to run the command.
inference.<inference_option>: The inference options.
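For example, to generate samples of a single class with a lower truncation value, which typically trades diversity for fidelity (the paths and values below are placeholders):

tao model stylegan_xl inference -e /path/to/train_spec.yaml \
    inference.checkpoint=/path/to/experiment_results/train/model.pth \
    inference.truncation_psi=0.7 \
    inference.class_idx=2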
Exporting the Model#
export#
The export parameter defines the hyperparameters for the export process.
export:
  gpu_id: 0
  checkpoint: /path/to/model.pth
  on_cpu: False
  onnx_file: "${results_dir}/stylegan/styleganxl.onnx"
  batch_size: -1
  opset_version: 17
  onnxruntime:
    test_onnxruntime: True
    sample_result_dir: "${results_dir}/stylegan"
    runtime_seed: 0
    runtime_batch_size: 2
    runtime_class_dix: 2
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| results_dir | hidden | The path to the results directory | | | | | |
| gpu_id | int | The GPU ID | 0 | | | | |
| checkpoint | string | The absolute path to the checkpoint | ??? | | | | |
| onnx_file | string | The absolute path to the onnx file | ??? | | | | |
| on_cpu | bool | Whether to run the export on the CPU | False | | | | |
| opset_version | int | The ONNX opset version | 12 | | | | |
| batch_size | int | The batch size for the export. -1 means the dynamic batch size | -1 | | | | |
| | bool | Whether to print the verbose output | False | | | | |
| onnxruntime | collection | Configuration parameters for the ONNX runtime | | | | | FALSE |
onnxruntime#
| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
| --- | --- | --- | --- | --- | --- | --- | --- |
| test_onnxruntime | bool | Whether to test the ONNX runtime | True | | | | |
| sample_result_dir | hidden | The path to the sample result directory | | | | | |
| runtime_seed | int | The seed for the runtime | 0 | | | | |
| runtime_batch_size | int | The batch size for the runtime | 1 | | | | |
| runtime_class_dix | int | The class index for the runtime | 0 | | | | |
Use the following command to export the model:
EXPORT_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action export --id $EXPERIMENT_ID --parent_job_id $TRAIN_JOB_ID --specs "$SPECS")
See also
For information on how to create an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.
tao model stylegan_xl export [-h] -e <experiment spec file>
export.checkpoint=<model to export>
export.onnx_file=<onnx path>
[export.<export_option>=<export_option_value>]
Required Arguments#
The following arguments are required to run the command.
-e, --experiment_spec: The path to an experiment spec file.
export.checkpoint: The .pth model to export.
export.onnx_file: The path where the .etlt or .onnx model is saved.
Optional Arguments#
The following arguments are optional to run the command.
export.<export_option>: The export options.
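For example (the paths below are placeholders):

tao model stylegan_xl export -e /path/to/train_spec.yaml \
    export.checkpoint=/path/to/experiment_results/train/model.pth \
    export.onnx_file=/path/to/experiment_results/export/styleganxl.onnx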