Reconstruct an AV Scene with NuRec

This guide walks you through training the NuRec reconstruction model to generate a 3D scene from real-world sensor data. Once you have a trained scene, you can render from novel camera positions — see Generate Novel Views with NuRec.

Before you begin, make sure you have the right hardware and software set up to run NuRec, and that you have prepared your data.

Download the NuRec Container

docker pull nvcr.io/nvidia/nre/nre-ga:26.04

All NuRec operations use the same Docker invocation structure: docker run with two volume mounts (your dataset directory and an output directory), followed by NuRec-specific arguments. The examples in this guide use those mounts consistently.
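
The shared invocation structure can be captured in a small helper. The following is a minimal Python sketch, not part of NuRec: the function name nurec_docker_command and its extra_mounts parameter are illustrative assumptions, and only flags that appear in this guide are used. It passes -e NGC_API_KEY without a value, which asks Docker to forward the variable from the calling environment.

```python
import shlex

# Image tag used throughout this guide.
IMAGE = "nvcr.io/nvidia/nre/nre-ga:26.04"


def nurec_docker_command(dataset_dir, output_dir, nurec_args, extra_mounts=()):
    """Assemble the `docker run` argv for a NuRec invocation (hypothetical helper).

    dataset_dir / output_dir are host paths mounted at the container paths
    used in this guide. extra_mounts is an iterable of
    (host_path, container_path) pairs for any additional volumes.
    """
    cmd = [
        "docker", "run", "--shm-size=64g", "--rm", "--gpus", "all",
        "-e", "NGC_API_KEY",  # forward NGC_API_KEY from the calling environment
        "--volume", f"{dataset_dir}:/workdir/dataset",
        "--volume", f"{output_dir}:/workdir/output",
    ]
    for host_path, container_path in extra_mounts:
        cmd += ["--volume", f"{host_path}:{container_path}"]
    # NuRec-specific arguments always follow the image name.
    return cmd + [IMAGE, *nurec_args]


# Print a shell-ready command line for inspection before running it.
example = nurec_docker_command(
    "/path/to/dataset", "/path/to/output",
    ["mode=train", "out_dir=/workdir/output"],
)
print(shlex.join(example))
```

The returned list can be handed to subprocess.run directly, which avoids shell quoting problems with bracketed overrides such as dataset.camera_ids.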

Tip

Prerequisites:

You need an NGC API key (NGC_API_KEY) to pull and run NuRec containers. If you haven’t set this up yet, see Set Up Your Software Environment.
Make sure you have converted your dataset to the NCore format and generated the required auxiliary data using the nre-tools container before continuing. See Prepare Your Data for details.

Launch Reconstruction Model Training

Choose a training path below. Instant NuRec is recommended. It seeds NuRec with a pre-trained 3D Gaussian representation generated from your NCore data, significantly reducing required training iterations without sacrificing reconstruction quality.

Instant NuRec (Recommended)

Step 1: Install the package

Clone the Instant NuRec repo, then run the setup script to create a virtual environment and install dependencies:

./setup.sh
source .venv/bin/activate

Step 2: Run Instant NuRec

Run inference with --merge to produce a single merged PLY file suitable for NuRec initialization:

python run_inference.py \
  --ncore-path /path/to/clip.json \
  --output-dir /path/to/instant_nurec_output \
  --merge

This writes a single PLY to your output directory named after the NCore sequence (e.g. clip.ply for --ncore-path /path/to/clip.json).

Note

Omit --merge to write per-chunk PLY files instead. Voxelization is bundled with merging and runs only when --merge is set. Use --n-gaussians to set the target Gaussian count after voxelization. The default, 2,000,000, matches the NuRec initialization default, so most runs can leave it unchanged. The voxel size is chosen by a bracketed binary search so that the final count falls within [0.9 × n-gaussians, n-gaussians].
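
To make the bracketed binary search concrete, here is a toy Python sketch of the idea. It is an illustration under stated assumptions, not Instant NuRec's implementation: the function names, the occupied-voxel counting rule, and the fallback behavior are all hypothetical, and real Gaussians carry more than a position.

```python
import math


def count_voxels(points, voxel_size):
    """Number of distinct voxels occupied by `points` at this edge length."""
    return len({tuple(math.floor(c / voxel_size) for c in p) for p in points})


def find_voxel_size(points, n_target, lo, hi, max_iters=60):
    """Bisect the voxel size until the occupied-voxel count lands in
    [0.9 * n_target, n_target].

    `lo` must be small enough to leave more than n_target voxels and `hi`
    large enough to leave fewer than 0.9 * n_target; together they form
    the bracket that the search narrows.
    """
    floor_count = 0.9 * n_target
    if count_voxels(points, lo) <= n_target or count_voxels(points, hi) >= floor_count:
        raise ValueError("lo/hi do not bracket the target range")
    for _ in range(max_iters):
        mid = 0.5 * (lo + hi)
        n = count_voxels(points, mid)
        if floor_count <= n <= n_target:
            return mid, n
        if n > n_target:
            lo = mid  # too many voxels survive: use larger voxels
        else:
            hi = mid  # too few voxels survive: use smaller voxels
    # Band not reached within max_iters: fall back to the coarse end of the
    # bracket, which still keeps the count below the target.
    return hi, count_voxels(points, hi)
```

Larger voxels merge more points, so the count falls as the voxel size grows. That rough monotonicity is what lets a bracket on the voxel size converge on a count inside the target band.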

Using the pre-trained model

When you use Instant NuRec, you don't need to download the model manually. On the first inference run, the package automatically fetches the pre-trained instant_nurec.pt checkpoint from the nvidia/instant-nurec Hugging Face repo and caches it at ~/.cache/huggingface/nvidia/instant_nurec/. As a result, the first inference run takes longer than later runs, which reuse the cached model.

To use a local checkpoint instead, set the INSTANT_NUREC_FULL_PT environment variable to the path of your local checkpoint file:

export INSTANT_NUREC_FULL_PT=/path/to/local_checkpoint_copy.pt
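
Before a long run, it can help to confirm which checkpoint will be picked up. The sketch below mirrors the behavior described above and is only a pre-flight check, not the package's own resolution logic. The function name checkpoint_source is hypothetical, and the file name inside the cache directory is an assumption based on the checkpoint name mentioned in this guide.

```python
import os
from pathlib import Path

# Cache directory quoted in this guide; the file name inside it is an assumption.
DEFAULT_CACHE = Path("~/.cache/huggingface/nvidia/instant_nurec/instant_nurec.pt")


def checkpoint_source(env=os.environ):
    """Describe which checkpoint an inference run is expected to use."""
    override = env.get("INSTANT_NUREC_FULL_PT")
    if override:
        path = Path(override).expanduser()
        if not path.is_file():
            raise FileNotFoundError(
                f"INSTANT_NUREC_FULL_PT points to a missing file: {path}"
            )
        return "local override", path
    cached = DEFAULT_CACHE.expanduser()
    if cached.is_file():
        return "cached download", cached
    return "will be downloaded on first run", cached
```

Calling checkpoint_source() with no arguments inspects the current environment. Passing a plain dictionary instead makes the check easy to exercise in isolation.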

Working Example Using Hugging Face Sample Data

The clip lives in a gated Hugging Face dataset. Accept the terms at nvidia/PhysicalAI-Autonomous-Vehicles-NCore while logged in to Hugging Face, then run hf auth login locally. The same authentication covers the nvidia/instant-nurec model auto-download on first run.

# Download the clip (~2 GB)
huggingface-cli download \
    nvidia/PhysicalAI-Autonomous-Vehicles-NCore --repo-type dataset \
    --include "clips/000da9de-0ee5-465a-9a2d-e7e91d3016bb/*" \
    --local-dir ./demo_clip

# Reconstruct it
python run_inference.py \
    --ncore-path ./demo_clip/clips/000da9de-0ee5-465a-9a2d-e7e91d3016bb/pai_000da9de-0ee5-465a-9a2d-e7e91d3016bb.json \
    --output-dir ./demo_output \
    --merge

A successful run writes a single PLY to your output directory. For this clip, expect approximately 1.88 M Gaussians, kl-optimal voxelized from 2.87 M merged (3.18 M pre-merge across 2 chunks). Omit --merge to write per-chunk PLYs instead.
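
One way to check the Gaussian count without loading the whole file is to read the PLY header, which declares the number of elements in plain text. The sketch below assumes the Gaussians are stored as the file's vertex elements, as is common for Gaussian-splat PLYs. That layout is an assumption and is not confirmed by this guide. The function name ply_vertex_count is illustrative.

```python
def ply_vertex_count(path):
    """Return the vertex count declared in a PLY file's header."""
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                # Header line has the form: element vertex <count>
                return int(line.split()[2])
            if line == b"end_header":
                break
    raise ValueError("no 'element vertex' line found in the PLY header")
```

For the sample clip, a value near 1.88 million from ply_vertex_count on the merged file would match the count quoted above.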

Step 3: Pass the Gaussians to NuRec

Important

Before running NuRec training on PAI dataset clips, you must generate auxiliary data. See Generate NuRec Auxiliary Data for instructions. The auxiliary files must be present in the same directory as the .json manifest and .zarr.itar files before training.

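Run NuRec training with one additional volume mount for the Instant NuRec output, plus the initialization config flags. Point model.layers.background.initialization.path at the PLY written in Step 2. Step 2 names that file after the NCore sequence, so if your output directory does not contain merged.ply, substitute the actual filename in the command.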

docker run --shm-size=64g --rm --gpus all \
  -e NGC_API_KEY=${NGC_API_KEY} \
  --volume <DATASET_DIR>:/workdir/dataset \
  --volume <OUTPUT_DIR>:/workdir/output \
  --volume /path/to/instant_nurec_output:/workdir/instant_nurec \
  nvcr.io/nvidia/nre/nre-ga:26.04 \
  mode=train \
  out_dir=/workdir/output \
  --config-name=configs/apps/prod/Hyperion-8.1/car2sim_6cam.yaml \
  dataset.path=/workdir/dataset/pai_<CLIP_ID>.json \
  dataset.camera_ids=[<CAM1>,<CAM2>,<CAM3>] \
  dataset.lidar_ids=[lidar_top_360fov] \
  dataset.aux_data=True \
  model/gaussians/initialization@model.layers.background.initialization=nrm_ply \
  model.layers.background.initialization.path=/workdir/instant_nurec/merged.ply \
  model.layers.background.initialization.num_point_cloud_points=2000000 \
  dataset.n_samples_per_epoch=30000 \
  model.layers.background.optimizers.0.params.positions.args.lr=5.06e-5 \
  model.layers.dynamic_rigids.optimizers.0.params.positions.args.lr=4e-5

Key parameters:

--config-name — The training configuration file. The example above uses a multi-camera AV configuration. See Customize NuRec Configuration for available configs and overrides.
dataset.path — Path to the NCore .json manifest file inside the container, under the dataset volume mount (/workdir/dataset).
dataset.camera_ids — Bracketed, comma-separated list of camera IDs to use for training (no spaces).
dataset.aux_data=True — Enables loading of auxiliary data generated by the nre-tools container.

Standard Training

Run the following command to begin training without an initialization seed.

docker run --shm-size=64g --rm --gpus all \
  -e NGC_API_KEY=${NGC_API_KEY} \
  --volume <DATASET_DIR>:/workdir/dataset \
  --volume <OUTPUT_DIR>:/workdir/output \
  nvcr.io/nvidia/nre/nre-ga:26.04 \
  mode=train \
  out_dir=/workdir/output \
  --config-name=configs/apps/prod/Hyperion-8.1/car2sim_6cam.yaml \
  dataset.path=/workdir/dataset/pai_<CLIP_ID>.json \
  dataset.camera_ids=[<CAM1>,<CAM2>,<CAM3>] \
  dataset.lidar_ids=[lidar_top_360fov] \
  dataset.aux_data=True

Replace <DATASET_DIR> with the path to the directory containing your NCore .json manifest, .zarr.itar files, and auxiliary .aux.*.zarr files. Replace <OUTPUT_DIR> with the path where you want NuRec to write output.
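
A quick pre-flight check can catch a missing manifest or missing auxiliary files before the container starts. This is a minimal sketch that uses the file patterns named above. The function name check_dataset_dir is hypothetical, and the check only confirms that at least one file of each kind exists, not that the files belong to the same clip.

```python
from pathlib import Path

# File patterns taken from the dataset layout described in this guide.
REQUIRED_PATTERNS = {
    "NCore manifest (*.json)": "*.json",
    "sensor data (*.zarr.itar)": "*.zarr.itar",
    "auxiliary data (*.aux.*.zarr)": "*.aux.*.zarr",
}


def check_dataset_dir(dataset_dir):
    """Return a list of problems found in a NuRec dataset directory."""
    root = Path(dataset_dir)
    return [
        f"missing {label}"
        for label, pattern in REQUIRED_PATTERNS.items()
        if not any(root.glob(pattern))
    ]
```

An empty list means all three kinds of input are present in the directory you plan to mount at /workdir/dataset.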

Key parameters:

--config-name — The training configuration file. The example above uses a multi-camera AV configuration. See Customize NuRec Configuration for available configs and overrides.
dataset.path — Path to the NCore .json manifest file inside the container, under the dataset volume mount (/workdir/dataset).
dataset.camera_ids — Bracketed, comma-separated list of camera IDs to use for training (no spaces).
dataset.aux_data=True — Enables loading of auxiliary data generated by the nre-tools container.

Working Example Using Hugging Face Sample Data

The following example uses a clip from the NVIDIA NCore-converted Physical AI dataset. After downloading the clip and generating auxiliary data, your dataset directory should contain the .json manifest, .zarr.itar files, and .aux.*.zarr files together.

docker run --shm-size=64g --rm --gpus all \
  -e NGC_API_KEY=${NGC_API_KEY} \
  --volume /home/user/nurec/data:/workdir/dataset \
  --volume /home/user/nurec/output:/workdir/output \
  nvcr.io/nvidia/nre/nre-ga:26.04 \
  mode=train \
  out_dir=/workdir/output \
  --config-name=configs/apps/prod/Hyperion-8.1/car2sim_6cam.yaml \
  dataset.path=/workdir/dataset/pai_100ae358-f548-49b8-af4d-c0afdbcfe9ed.json \
  dataset.camera_ids=[camera_front_wide_120fov,camera_cross_left_120fov,camera_cross_right_120fov,camera_rear_left_70fov,camera_rear_right_70fov] \
  dataset.lidar_ids=[lidar_top_360fov] \
  dataset.aux_data=True

Note

Copy the auxiliary .aux.*.zarr files (generated by nre-tools) into the same directory as the .zarr.itar and .json files before training. This clip uses five cameras; the car2sim_6cam.yaml config supports up to six. Adjust dataset.camera_ids to match the cameras available in your dataset.

Run on Multi-GPU Systems

By default, NuRec uses only the first visible GPU. To run on multiple GPUs, append the following flags to the launch command — set world_size to the total number of GPUs and num_nodes to the number of nodes (one GPU per task):

trainer.world_size=<NUM_GPUS> trainer.num_nodes=<NUM_NODES>

For example, to use 8 GPUs across 2 nodes (4 GPUs per node):

trainer.world_size=8 trainer.num_nodes=2
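
The relationship between the two flags is simple arithmetic: world_size is the node count multiplied by the GPUs per node. A minimal Python sketch, assuming every node contributes the same number of GPUs (the helper name multi_gpu_overrides is illustrative):

```python
def multi_gpu_overrides(num_nodes, gpus_per_node):
    """Build the trainer overrides for a homogeneous multi-GPU run."""
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must both be at least 1")
    world_size = num_nodes * gpus_per_node  # total GPUs across all nodes
    return [
        f"trainer.world_size={world_size}",
        f"trainer.num_nodes={num_nodes}",
    ]


# 2 nodes with 4 GPUs each reproduces the example above.
print(" ".join(multi_gpu_overrides(2, 4)))
```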

To limit which GPUs are visible, set the CUDA_VISIBLE_DEVICES environment variable before running the container.

Note

Setting both trainer.world_size and trainer.num_nodes to 0 enables SLURM mode, which uses all visible GPUs and compute nodes.

Note

Using Harmonizer at training time is not supported with multi-GPU training. Use a single GPU to enable training-time Harmonizer, or apply Harmonizer at inference time via the gRPC server.

Configuration Options

Override the number of training epochs (default: 30):

trainer.max_epochs=N

Reconstruct using a subset of cameras:

dataset.camera_ids="['<ID1>','<ID2>','<ID3>']"

Note

Do not include spaces between IDs and commas.
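
Because a stray space is easy to introduce by hand, the override can be generated instead. The sketch below emits the unquoted bracket form used in the training commands in this guide. The helper name camera_ids_override is hypothetical, and the validation rules are an assumption about what a well-formed ID looks like.

```python
def camera_ids_override(camera_ids):
    """Format camera IDs as a dataset.camera_ids override with no spaces."""
    ids = [cam.strip() for cam in camera_ids]
    if not ids:
        raise ValueError("at least one camera ID is required")
    for cam in ids:
        # Reject empty IDs and IDs that would break the list syntax.
        if not cam or " " in cam or "," in cam:
            raise ValueError(f"invalid camera ID: {cam!r}")
    return "dataset.camera_ids=[" + ",".join(ids) + "]"


print(camera_ids_override(["camera_front_wide_120fov", "camera_rear_left_70fov"]))
```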

Set the log level:

+log_level=N

Log level options: 0 (FATAL only), 1 (ERROR+), 2 (WARNING+), 3 (INFO+), 4 (DEBUG, all logs).
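
If you script your launches, a name-to-number lookup keeps the override readable. This small Python sketch encodes only the mapping listed above. The helper name log_level_override is illustrative.

```python
# Mapping taken from the log level options listed in this guide.
LOG_LEVELS = {"FATAL": 0, "ERROR": 1, "WARNING": 2, "INFO": 3, "DEBUG": 4}


def log_level_override(name):
    """Translate a level name into the +log_level=N override."""
    try:
        return f"+log_level={LOG_LEVELS[name.upper()]}"
    except KeyError:
        raise ValueError(f"unknown log level: {name!r}") from None
```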

For training, validation, and combined run modes, see Run NuRec Validation.

Training Artifacts

The training pipeline produces the following output in <OUTPUT_DIR>/<RUN-ID>/, where <RUN-ID> is an identifier auto-generated by NuRec for each training run.

usd-out/last.usdz — The reconstructed 3D scene, packaged as a USDZ file. This is the primary artifact you’ll use in your simulation platform. It contains:
- An XODR drivable map
- USDA scene definition files (mapping, domelight, sequence tracks, rig trajectories)
- The AI-trained reconstruction checkpoint (Gaussian splat positions and auxiliary data)
- JSON representations of sequence tracks and rig trajectories
config/parsed.yaml — The full configuration used for training (required for validation and rendering).
checkpoints/last.ckpt — The final trained checkpoint (required for validation and novel view rendering).
Log files from the training pipeline.
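
After a run finishes, a short script can confirm that the artifacts listed above are in place and show what the USDZ package contains. This is a sketch, not a NuRec utility. The function names are hypothetical, and it relies on the fact that a USDZ file is a zip archive, so the standard zipfile module can list its members.

```python
import zipfile
from pathlib import Path

# Relative paths of the artifacts listed in this guide.
EXPECTED_ARTIFACTS = (
    "usd-out/last.usdz",
    "config/parsed.yaml",
    "checkpoints/last.ckpt",
)


def missing_artifacts(run_dir):
    """Return the expected artifacts that are absent from a run directory."""
    run_dir = Path(run_dir)
    return [rel for rel in EXPECTED_ARTIFACTS if not (run_dir / rel).is_file()]


def usdz_contents(usdz_path):
    """List the files packaged inside a USDZ archive."""
    with zipfile.ZipFile(usdz_path) as archive:
        return archive.namelist()
```

Point missing_artifacts at the run directory under your output path. An empty result means the scene, config, and checkpoint are all present.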

Next Steps

Before using your scene in production, run validation to confirm reconstruction quality — see Run NuRec Validation.

To render from camera positions beyond the original trajectory, continue to Generate Novel Views with NuRec.

To integrate your scene directly with a simulation platform, launch the NuRec gRPC API server pointing at your last.usdz file.

To view all available arguments for training, validation, and export utilities, run the following:

docker run --shm-size=64g --rm --gpus all \
  -e NGC_API_KEY=${NGC_API_KEY} \
  nvcr.io/nvidia/nre/nre-ga:26.04 --help

To get help for a specific export utility:

docker run --shm-size=64g --rm --gpus all \
  -e NGC_API_KEY=${NGC_API_KEY} \
  nvcr.io/nvidia/nre/nre-ga:26.04 \
  <EXPORT_UTILITY_NAME> \
  --help