Synthetic Data Generation with StyleGAN-XL#

Synthetic data generation is a powerful approach to augment training datasets, especially when real-world data is limited, expensive to collect, or sensitive in nature. One of the most advanced tools for generating high-quality synthetic images is StyleGAN-XL, a large-scale generative adversarial network designed for high-resolution image synthesis.

StyleGAN-XL extends the capabilities of the original StyleGAN architecture by improving training stability and scalability, making it suitable for generating diverse and realistic samples across a wide range of categories. It is particularly effective for generating class-conditional data and scaling to complex domains with high fidelity.

This technique is commonly used to:

  • Balance class distribution in imbalanced datasets

  • Generate rare or underrepresented examples

  • Augment training sets for privacy-sensitive applications

  • Improve robustness and generalization of downstream models

StyleGAN-XL is included in TAO. It supports the following tasks:

  • dataset_convert

  • train

  • evaluate

  • inference

  • export

These tasks can be invoked from the TAO Launcher using the following convention on the command line:

tao model stylegan_xl <sub_task> <args_per_subtask>

where args_per_subtask are the command-line arguments required for a given subtask. Each subtask is explained in detail in the following sections.

Data Input for StyleGAN-XL#

StyleGAN-XL requires the dataset to be organized in a folder structure where each class is represented by a subfolder containing its corresponding images. See the Data Annotation Format page for more information.
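Before converting a dataset, it can help to confirm that the folder really follows the one-subfolder-per-class layout. The following is a minimal sketch, not part of TAO: the function name `count_images_per_class` and the set of accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(dataset_root):
    """Return {class_name: image_count} for each subfolder of dataset_root."""
    counts = {}
    for class_dir in sorted(Path(dataset_root).iterdir()):
        if not class_dir.is_dir():
            continue  # files at the top level do not define a class
        counts[class_dir.name] = sum(
            1
            for path in class_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

A class that reports zero images, or a class count that differs from the num_classes value you plan to use, is worth investigating before training.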

Creating a Dataset Convert Spec File#

We provide a dataset tool entry point, dataset_convert, to convert the dataset folder described above into a zipped file. This tool serves two main purposes:

  1. Dataset Portability and Performance: Zipping datasets simplifies transferring them between file servers and clusters and may improve training performance when using network file systems.

  2. Image Preprocessing: StyleGAN-XL requires square, fixed-resolution images for training. The tool can crop and/or resize images to meet the resolution requirements of StyleGAN-XL’s progressive training workflow. This training process involves starting with lower resolutions (e.g., 16x16) and progressively increasing to higher resolutions (e.g., 256x256). Consequently, we need multiple versions of the dataset, such as 16x16, 32x32, 64x64, 128x128, and 256x256.
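The progressive workflow described above needs one converted dataset per resolution. As a rough sketch, the list of resolutions can be derived by doubling from the starting size. The helper name `resolution_ladder` and the `train_<res>.zip` naming pattern are assumptions for illustration only.

```python
def resolution_ladder(start=16, end=256):
    """Return resolutions from start up to end, doubling at each step."""
    if start <= 0 or end < start:
        raise ValueError("expected 0 < start <= end")
    resolutions = []
    res = start
    while res <= end:
        resolutions.append(res)
        res *= 2
    return resolutions


# One zipped dataset per resolution; the file naming is a hypothetical pattern.
for res in resolution_ladder():
    print(f"train_{res}.zip -> resolution: [{res}, {res}]")
```

Each printed pair corresponds to one dataset_convert run with its own dest_file_name and resolution values.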

Here is an example spec file for converting images of a train folder into a zipped file with images resized to 16x16 resolution.

source: /path/to/dataset_root/train
results_dir: /path/to/experiment_results
dest_file_name: train_16.zip
resolution: [16, 16]
transform: null

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| source | string | Source dataset, which follows image_classification_format | /path/to/dataset_root/train | | | | |
| results_dir | string | Result directory | /path/to/experiment_results | | | | |
| dest_file_name | string | Destination zipped file name generated from source dataset | train_16.zip | | | | |
| resolution | list | The resolution of the resized image | [128, 128] | | | | FALSE |
| transform | categorical | Transformation such as 'center-crop' before resizing can avoid distortion | center-crop | | | | |
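To see why a center crop before resizing avoids distortion, consider the crop box for a non-square image: the largest centered square is kept, so the later resize scales both axes by the same factor. The sketch below only illustrates the arithmetic and is not TAO's implementation. The function name `center_crop_box` is hypothetical, and the (left, top, right, bottom) ordering follows the box convention used by Pillow's `Image.crop`.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# A 640x480 image keeps its full height and loses 80 pixels on each side.
print(center_crop_box(640, 480))
```

Without the crop, resizing 640x480 directly to a square resolution would squeeze the image horizontally.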

Converting the Dataset#

Use the following command to run StyleGAN-XL dataset conversion:

tao model stylegan_xl dataset_convert [-h] -e <dataset_convert_spec>
                           [results_dir=<global_results_dir>]
                           [source=<source_image_folder>]
                           [dest_file_name=<destination_zipped_file_name>]
                           [resolution=<resized_resolution>]
                           [transform=<transformation_applied>]

Required Arguments#

The following arguments are required.

  • -e, --experiment_spec_file: The path to the experiment spec file.

Optional Arguments#

You can set optional arguments to override the option values in the experiment spec file.
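As an illustration of how command-line overrides combine with a spec file, the following sketch assembles the argument list for one dataset_convert call. It is not part of TAO: the helper name `dataset_convert_command` and the spec path are hypothetical, and the `[32,32]` spelling of a list value on the command line is an assumption about the override syntax that you should verify against your TAO version.

```python
import shlex


def dataset_convert_command(spec_path, **overrides):
    """Build the argv list for one dataset_convert call with key=value overrides."""
    argv = ["tao", "model", "stylegan_xl", "dataset_convert", "-e", spec_path]
    for key, value in overrides.items():
        argv.append(f"{key}={value}")
    return argv


# Reuse one spec file and override only the fields that change per resolution.
argv = dataset_convert_command(
    "/path/to/convert_spec.yaml",  # hypothetical spec path
    dest_file_name="train_32.zip",
    resolution="[32,32]",  # assumed list syntax for overrides
)
print(shlex.join(argv))
```

Building the list first and quoting it with `shlex.join` keeps the brackets from being interpreted by the shell.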

Creating an Experiment Spec File#

The training experiment spec file for StyleGAN-XL includes model, train, and dataset parameters.

Use the following command to create an experiment spec file for StyleGAN-XL:

SPECS=$(tao-client stylegan_xl get-spec --action train --job_type experiment --id $EXPERIMENT_ID)

See also

For information on how to create an experiment using the remote client, refer to the Creating an experiment section in the Remote Client documentation.

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| model_name | string | Name of model if invoking task via model_agnostic | | | | | |
| encryption_key | string | Key for encrypting model checkpoints | | | | | |
| results_dir | string | Path to where all the assets generated from a task are stored. | /results | | | | |
| wandb | collection | | | | | | FALSE |
| task | categorical | The task to be performed. | stylegan | | | stylegan,bigdatasetgan | |
| model | collection | Configuration parameters for the model | | | | | FALSE |
| dataset | collection | Configuration parameters for the dataset | | | | | FALSE |
| train | collection | Configuration parameters for the training | | | | | FALSE |
| evaluate | collection | Configuration parameters for the evaluation | | | | | FALSE |
| inference | collection | Configuration parameters for the inference | | | | | FALSE |
| export | collection | Configuration parameters for the export | | | | | FALSE |
| gen_trt_engine | collection | Configuration parameters for the TRT engine | | | | | FALSE |

model#

The model parameter contains the hyperparameters for configuring the model. Here is the model section of the training experiment spec file:

model:
  input_embeddings_path: /tao-pt/nvidia_tao_pytorch/sdg/stylegan_xl/pretrained_modules/tf_efficientnet_lite0_embed.pth
  generator:
    backbone: stylegan3-t
    superres: False
    added_head_superres: # Ignore this sub section when the superres == False
      head_layers: [4, 4, 4, 4, 4]
      up_factor: [2, 2, 2, 2, 2]
      pretrained_stem_path: /path/to/the/stem.pth
      reinit_stem_anyway: False
    stem:
      fp32: False
      cbase: 16384
      cmax: 256
      syn_layers: 7
      resolution: 16
  stylegan:
    loss:
      cls_weight: 0.0
    discriminator:
      backbones: ["deit_base_distilled_patch16_224", "tf_efficientnet_lite0"]
    metrics:
      inception_fid_path: /tao-pt/nvidia_tao_pytorch/sdg/stylegan_xl/pretrained_modules/InceptionV3.pth

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| loss | collection | Configuration parameters for the loss function | | | | | FALSE |
| generator | collection | Configuration parameters for the generator | | | | | FALSE |
| input_embeddings_path | string | The path to the pretrained input embeddings | | | | | |
| stylegan | collection | Configuration parameters for the StyleGAN model | | | | | FALSE |
| bigdatasetgan | collection | Configuration parameters for the BigDatasetGAN model | | | | | FALSE |

loss#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| cls_weight | float | The weight for the classification loss. | 0 | | | | |

generator#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| backbone | categorical | The backbone architecture to be used for the generator | stylegan3-r | | | stylegan3-t,stylegan3-r,stylegan2,fastgan | |
| superres | bool | Whether to use super-resolution generator backbone | False | | | | |
| added_head_superres | collection | Configuration parameters for super-resolution generator backbone | | | | | FALSE |
| stem | collection | Configuration parameters for stem generator backbone | | | | | FALSE |

added_head_superres#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| head_layers | list | The layers list to be added to the super-resolution generator backbone | [7] | | | | FALSE |
| up_factor | list | The up-factor list for the super-resolution generator backbone | [2] | | | | FALSE |
| pretrained_stem_path | string | The path to the pretrained stem generator backbone | | | | | |
| reinit_stem_anyway | bool | Whether to reinitialize the stem generator backbone forcefully | True | | | | |
| train_head_only | bool | Whether to train the head only | True | | | | |

stem#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| fp32 | bool | Whether to use fp32 for the stem generator backbone | False | | | | |
| cbase | int | The base channel for the stem generator backbone | 32768 | | | | |
| cmax | int | The max channel for the stem generator backbone | 512 | | | | |
| syn_layers | int | The number of syn layers for the stem generator backbone | 10 | | | | |
| resolution | int | The resolution for the stem generator backbone | 128 | | | | |

train#

The train parameter defines the hyperparameters of the training process.

train:
  resume_training_checkpoint_path: null
  pretrained_model_path: null
  num_epochs: 3000
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  deterministic_all: True
  validation_interval: 1
  checkpoint_interval: 1
  stylegan:
    gan_seed_offset: 0  # Try a different offset if GAN mode collapse occurs
    optim_generator:
      lr: 0.0025
      optim: "Adam"
      betas: [0, 0.99]
      eps: 1e-08
    optim_discriminator:
      lr: 0.002
      optim: "Adam"
      betas: [0, 0.99]
      eps: 1e-08
  results_dir: "${results_dir}/train"

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| num_gpus | int | The number of GPUs to run the train job | 1 | 1 | | | |
| gpu_ids | list | List of GPU IDs to run the training on. The length of this list must be equal to the number of gpus in train.num_gpus | [0] | | | | FALSE |
| num_nodes | int | Number of nodes to run the training on. If > 1, then training runs on multiple nodes | 1 | 1 | | | |
| seed | int | The seed for the initializer in PyTorch. If < 0, then it disables a fixed seed | 1234 | -1 | inf | | |
| cudnn | collection | | | | | | FALSE |
| num_epochs | int | Number of epochs to run the training | 10 | 1 | inf | | |
| checkpoint_interval | int | The interval (in epochs) at which a checkpoint is to be saved. Helps resume training | 1 | 1 | | | |
| validation_interval | int | The interval (in epochs) at which an evaluation will be triggered on the validation dataset | 1 | 1 | | | |
| resume_training_checkpoint_path | string | Path to the checkpoint from which to resume training. | | | | | |
| results_dir | string | Path to the place where all the assets generated from a task are stored | | | | | |
| deterministic_all | bool | Whether to use deterministic training in order to reproduce the results | False | | | | |
| pretrained_model_path | string | The path to the pretrained model | | | | | |
| stylegan | collection | Configuration parameters for the StyleGAN trainer | | | | | FALSE |
| bigdatasetgan | collection | Configuration parameters for the BigDatasetGAN trainer | | | | | FALSE |
| tensorboard | collection | Configuration parameters for the tensorboard logger | | | | | FALSE |
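The constraints on num_gpus and gpu_ids listed above can be checked before a job is submitted. The following is a minimal sketch, assuming the relevant spec section has already been loaded into a plain dictionary. The function name `check_gpu_settings` is hypothetical, and the fallback values mirror the defaults in the table.

```python
def check_gpu_settings(section, name="train"):
    """Return a list of problems found in a train/evaluate/inference section."""
    problems = []
    num_gpus = section.get("num_gpus", 1)
    gpu_ids = section.get("gpu_ids", [0])
    if num_gpus < 1:
        problems.append(f"{name}.num_gpus must be >= 1, got {num_gpus}")
    if len(gpu_ids) != num_gpus:
        problems.append(
            f"{name}.gpu_ids has {len(gpu_ids)} entries "
            f"but {name}.num_gpus is {num_gpus}"
        )
    return problems


# Two GPUs requested but only one ID listed: one problem is reported.
print(check_gpu_settings({"num_gpus": 2, "gpu_ids": [0]}))
```

The same check applies to the evaluate and inference sections, which document the same relationship between the two fields.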

stylegan#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| gan_seed_offset | int | The seed offset for the GAN for randomness control | 0 | | | | |
| optim_generator | collection | Configuration parameters for the generator optimizer | | | | | FALSE |
| optim_discriminator | collection | Configuration parameters for the discriminator optimizer | | | | | FALSE |

optim_generator#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| optim | string | Type of optimizer used to train the generator | Adam | | | | |
| lr | float | The learning rate for training the generator | 0.0025 | 0 | inf | | TRUE |
| eps | float | The epsilon for the Adam optimizer | 1e-08 | | | | |
| betas | list | The betas for the Adam optimizer | [0, 0.99] | | | | FALSE |

optim_discriminator#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| optim | string | Type of optimizer used to train the discriminator | Adam | | | | |
| lr | float | The learning rate for training the discriminator | 0.002 | 0 | inf | | TRUE |
| eps | float | The epsilon for the Adam optimizer | 1e-08 | | | | |
| betas | list | The betas for the Adam optimizer | [0, 0.99] | | | | FALSE |

dataset#

The dataset parameter defines the dataset paths and the hyperparameters for the dataloader.

dataset:
  common:
    cond: True
    num_classes: 6 # Must be 0 when cond == False
    img_channels: 3
    img_resolution: 16 # Use 512 if the datasets below contain 512x512 images
  stylegan:
    train_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    validation_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    test_dataset:
      images_dir: /dataset/hyperkvasir_16/hyperkvasir_16_class.zip
    infer_dataset:
      start_seed: 0
      end_seed: 50
    mirror: True
  batch_size: 16
  workers: 3

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| stylegan | collection | Configuration parameters for the StyleGAN dataset | | | | | FALSE |
| bigdatasetgan | collection | Configuration parameters for the BigDatasetGAN dataset | | | | | FALSE |
| common | collection | Configuration parameters for the common dataset | | | | | FALSE |
| batch_size | int | The batch size for the dataset | 64 | 1 | inf | | |
| pin_memory | bool | Whether to pin the memory for the dataset | True | | | | |
| prefetch_factor | int | The prefetch factor for the dataset | 2 | 1 | inf | | |
| workers | int | The number of workers for the dataset | 3 | 1 | inf | | |

stylegan#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| train_dataset | collection | Configuration parameters for the training dataset | | | | | FALSE |
| validation_dataset | collection | Configuration parameters for the validation dataset | | | | | FALSE |
| test_dataset | collection | Configuration parameters for the test dataset | | | | | FALSE |
| infer_dataset | collection | Configuration parameters for the inference dataset | | | | | FALSE |
| batch_gpu_size | int | The fixed batch size for a single GPU in order to achieve gradient accumulation | 16 | 1 | inf | | |
| mirror | bool | Whether to mirror the images as augmentation in training | True | | | | |
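The batch_gpu_size field exists to enable gradient accumulation. As a hedged illustration of the arithmetic, the sketch below assumes that dataset.batch_size is the total batch per optimizer step across all GPUs and that it must divide evenly, which is how the upstream StyleGAN3 training code treats its batch settings. Whether TAO applies exactly the same rule is an assumption, and the function name `accumulation_rounds` is hypothetical.

```python
def accumulation_rounds(batch_size, batch_gpu_size, num_gpus=1):
    """Return how many forward/backward passes make up one optimizer step.

    Assumes batch_size is the total batch across all GPUs (an assumption).
    """
    per_round = batch_gpu_size * num_gpus
    if batch_size % per_round != 0:
        raise ValueError(
            "batch_size must be a multiple of batch_gpu_size * num_gpus"
        )
    return batch_size // per_round


# With the table defaults (batch_size 64, batch_gpu_size 16) on one GPU.
print(accumulation_rounds(64, 16, num_gpus=1))
```

Under this assumption, lowering batch_gpu_size reduces per-GPU memory use while the effective batch size stays the same.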

train_dataset/validation_dataset/test_dataset#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| images_dir | string | The path to the zipped file of images or directory of images | ??? | | | | |

infer_dataset#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| start_seed | int | The start seed for the seed dataset | 0 | 0 | inf | | |
| end_seed | int | The end seed for the seed dataset | 100 | 0 | inf | | |

common#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| cond | bool | Whether to use conditional training | False | | | | |
| img_resolution | int | The resolution of the images | 128 | | | | |
| img_channels | int | The number of channels in the images | 3 | | | | |
| num_classes | int | The number of classes in the dataset | 0 | | | | |
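The cond and num_classes fields depend on each other: the example spec notes that num_classes is 0 when cond is False. The sketch below checks that relationship on a dictionary holding the dataset.common section. The function name `check_common` is hypothetical, and the rule that conditional training needs at least one class is an inference from the field descriptions rather than a documented constraint.

```python
def check_common(common):
    """Return a list of inconsistencies between cond and num_classes."""
    problems = []
    cond = common.get("cond", False)
    num_classes = common.get("num_classes", 0)
    if cond and num_classes < 1:
        # Inferred rule: class-conditional training needs at least one class.
        problems.append("cond is True, so num_classes should be at least 1")
    if not cond and num_classes != 0:
        problems.append("cond is False, so num_classes should be 0")
    return problems


# The example spec (cond: True, num_classes: 6) passes the check.
print(check_common({"cond": True, "num_classes": 6}))
```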

Training the Model#

Use the following command to run StyleGAN-XL training:

TRAIN_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action train --id $EXPERIMENT_ID --specs "$SPECS")


Evaluating the Model#

evaluate#

The evaluate parameter defines the hyperparameters of the evaluation process.

evaluate:
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  checkpoint: /path/to/model.pth

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| num_gpus | int | The number of GPUs to run the evaluation job. | 1 | 1 | | | |
| gpu_ids | list | List of GPU IDs to run the evaluation on. The length of this list must be equal to the number of gpus in evaluate.num_gpus. | [0] | | | | FALSE |
| num_nodes | int | Number of nodes to run the evaluation on. If > 1, then multi-node is enabled. | 1 | 1 | | | |
| checkpoint | string | Path to the checkpoint used for evaluation. | ??? | | | | |
| trt_engine | string | The path to the TRT engine. | ??? | | | | |
| results_dir | string | Path to where all the assets generated from a task are stored. | | | | | |
| vis_after_n_batches | int | The number of batches to visualize the results. | 16 | 1 | inf | | |

To run evaluation with a StyleGAN-XL model, use this command:

EVAL_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action evaluate --id $EXPERIMENT_ID --parent_job_id $TRAIN_JOB_ID --specs "$SPECS")


Inferencing the Model#

inference#

The inference parameter defines the hyperparameters of the inference process.

inference:
  num_nodes: 1
  num_gpus: 1
  gpu_ids: [0]
  checkpoint: /path/to/model.pth
  truncation_psi: 1.0
  translate: [0.0, 0.0]
  rotate: 0.0
  class_idx: 0

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| num_gpus | int | The number of GPUs to run the inference job. | 1 | 1 | | | |
| gpu_ids | list | List of GPU IDs to run the inference on. The length of this list must be equal to the number of gpus in inference.num_gpus. | [0] | | | | FALSE |
| num_nodes | int | Number of nodes to run the inference on. If > 1, then multi-node is enabled. | 1 | 1 | | | |
| checkpoint | string | Path to the checkpoint used for inference. | ??? | | | | |
| trt_engine | string | The path to the TRT engine. | ??? | | | | |
| results_dir | string | Path to the place where all the assets generated from a task are stored. | | | | | |
| vis_after_n_batches | int | The number of batches to visualize the results. | 1 | 1 | inf | | |
| truncation_psi | float | The truncation psi for the image generation. | 1.0 | 0 | 1.0 | | |
| translate | list | The translation for the image generation. | [0.0, 0.0] | | | | FALSE |
| rotate | float | The rotation for the image generation. | 0 | 0.0 | 360.0 | | |
| centroids_path | string | The path to the centroids. | | | | | |
| class_idx | int | The class index for the image generation. | 0 | 0 | inf | | |
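In StyleGAN-family generators, the truncation psi usually refers to the truncation trick: each intermediate latent w is pulled toward an average latent, w' = w_avg + psi * (w - w_avg), trading diversity for fidelity. The sketch below illustrates that general formula on plain Python lists. It assumes TAO follows the same convention and is not taken from the TAO code; the function name `truncate` and the toy vectors are hypothetical.

```python
def truncate(w, w_avg, psi):
    """Interpolate latent w toward w_avg: w_avg + psi * (w - w_avg)."""
    if not 0.0 <= psi <= 1.0:
        raise ValueError("psi is expected in the range [0, 1]")
    return [avg + psi * (value - avg) for value, avg in zip(w, w_avg)]


w = [2.0, 4.0]      # toy latent vector
w_avg = [0.0, 0.0]  # toy average latent
print(truncate(w, w_avg, 0.5))  # halfway between w_avg and w
```

With psi at 1.0 the latent is unchanged, and with psi at 0 every sample collapses to the average latent.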

The inference tool for StyleGAN-XL generates synthetic images from the random seeds specified in infer_dataset. Use the following command to run inference:

INFERENCE_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action inference --id $EXPERIMENT_ID --parent_job_id $TRAIN_JOB_ID --specs "$SPECS")


Exporting the Model#

export#

The export parameter defines the hyperparameters for the export process.

export:
  gpu_id: 0
  checkpoint: /path/to/model.pth
  on_cpu: False
  onnx_file: "${results_dir}/stylegan/styleganxl.onnx"
  batch_size: -1
  opset_version: 17
  onnxruntime:
    test_onnxruntime: True
    sample_result_dir: "${results_dir}/stylegan"
    runtime_seed: 0
    runtime_batch_size: 2
    runtime_class_dix: 2

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| results_dir | hidden | The path to the results directory. | | | | | |
| gpu_id | int | The GPU ID. | 0 | | | | |
| checkpoint | string | The absolute path to the checkpoint. | ??? | | | | |
| onnx_file | string | The absolute path to the onnx file. | ??? | | | | |
| on_cpu | bool | Whether to run the export on the CPU. | False | | | | |
| opset_version | int | The ONNX opset version. | 12 | | | | |
| batch_size | int | The batch size for the export. -1 means the dynamic batch size. | -1 | | | | |
| verbose | bool | Whether to print the verbose output. | False | | | | |
| onnxruntime | collection | Configuration parameters for the ONNX runtime. | | | | | FALSE |

onnxruntime#

| Field | value_type | description | default_value | valid_min | valid_max | valid_options | automl_enabled |
|---|---|---|---|---|---|---|---|
| test_onnxruntime | bool | Whether to test the ONNX runtime. | True | | | | |
| sample_result_dir | hidden | The path to the sample result directory. | | | | | |
| runtime_seed | int | The seed for the runtime. | 0 | | | | |
| runtime_batch_size | int | The batch size for the runtime. | 1 | | | | |
| runtime_class_dix | int | The class index for the runtime. | 0 | | | | |

Use the following command to export the model:

EXPORT_JOB_ID=$(tao-client stylegan_xl experiment-run-action --action export --id $EXPERIMENT_ID --parent_job_id $TRAIN_JOB_ID --specs "$SPECS")
