Use NuRec for Autonomous Vehicles#
To use NuRec to reconstruct real-world scenes as 3D simulations, follow the steps in this guide.
Before you begin, make sure you have the right hardware and software setup to run NuRec and that you have prepared your data.
Download the NuRec Container#
To download the NuRec container, run the following command:
docker pull nvcr.io/nvidia/nre/nre-ga:latest
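If the pull fails with an authentication error, log in to nvcr.io first. The following sketch assumes the standard NGC convention of using the literal username $oauthtoken with your NGC API key as the password:
echo "${NGC_API_KEY}" | docker login nvcr.io --username '$oauthtoken' --password-stdin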
Tip: Make sure you have converted your dataset to the NCore format and generated the required auxiliary data (using the nre-tools container). See Prepare Your Data for more details.
Run the Docker Base Command#
All NuRec operations use the same Docker base command. The only differences between training, validation, and export are the NuRec-specific arguments appended after the container name.
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
--volume <DATASET_DIR>:/workdir/dataset \
--volume <OUTPUT_DIR>:/workdir/output \
nvcr.io/nvidia/nre/nre-ga:latest \
<NUREC ARGUMENTS>
Replace <DATASET_DIR> with the path to the directory containing your NCore .json manifest, .zarr.itar files, and auxiliary .aux.*.zarr files. Replace <OUTPUT_DIR> with the path where you want NuRec to write output.
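For example, a minimal setup sketch before running the base command (the paths below are hypothetical; substitute your own):
export NGC_API_KEY=<your NGC API key>
export DATASET_DIR=/home/user/nurec/data     # contains the .json manifest, .zarr.itar, and .aux.*.zarr files
export OUTPUT_DIR=/home/user/nurec/output
mkdir -p "${OUTPUT_DIR}"
With these variables set, you can use the base command verbatim by substituting ${DATASET_DIR} for <DATASET_DIR> and ${OUTPUT_DIR} for <OUTPUT_DIR>.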
Launch the Reconstruction Model Training#
To begin training, append the following NuRec arguments to the Docker base command:
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
--volume <DATASET_DIR>:/workdir/dataset \
--volume <OUTPUT_DIR>:/workdir/output \
nvcr.io/nvidia/nre/nre-ga:latest \
mode=train \
out_dir=/workdir/output \
--config-name=configs/apps/prod/Hyperion-8.1/car2sim_6cam.yaml \
dataset.path=/workdir/dataset/pai_<clip_id>.json \
dataset.camera_ids=[<CAM1>,<CAM2>,<CAM3>] \
dataset.lidar_ids=[lidar_top_360fov] \
dataset.aux_data=True
The key parameters are:
- --config-name: The training configuration file. The example above uses a multi-camera AV configuration. See Customize NuRec Configuration for available configs and overrides.
- dataset.path: Path to the NCore .json manifest file, relative to the dataset volume mount.
- dataset.camera_ids: Comma-separated list of camera IDs to use for training (no spaces).
- dataset.aux_data=True: Enables loading of auxiliary data generated by the nre-tools container.
Working Example Using HuggingFace Sample Data#
The following example uses a clip from the NVIDIA NCore-converted Physical AI dataset. After downloading the clip and generating auxiliary data, your dataset directory should contain the .json manifest, .zarr.itar files, and .aux.*.zarr files together.
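A quick way to confirm the layout is to list the directory; the patterns below are illustrative (only the manifest name is known from this example):
ls /home/user/nurec/data
# pai_100ae358-f548-49b8-af4d-c0afdbcfe9ed.json   <- NCore manifest
# *.zarr.itar                                     <- sensor data archives
# *.aux.*.zarr                                    <- auxiliary data from nre-tools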
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
--volume /home/user/nurec/data:/workdir/dataset \
--volume /home/user/nurec/output:/workdir/output \
nvcr.io/nvidia/nre/nre-ga:latest \
mode=train \
out_dir=/workdir/output \
--config-name=configs/apps/prod/Hyperion-8.1/car2sim_6cam.yaml \
dataset.path=/workdir/dataset/pai_100ae358-f548-49b8-af4d-c0afdbcfe9ed.json \
dataset.camera_ids=[camera_front_wide_120fov,camera_cross_left_120fov,camera_cross_right_120fov,camera_rear_left_70fov,camera_rear_right_70fov] \
dataset.lidar_ids=[lidar_top_360fov] \
dataset.aux_data=True
Notes:
- Copy the auxiliary .aux.*.zarr files (generated by nre-tools) into the same directory as the .zarr.itar and .json files before training.
- The --config-name flag should point to the configuration file for the model you want to train. You can find more configuration files in the configs/apps/prod/ folder. For more information, see Customize NuRec Configuration.
Run NuRec on Multi-GPU Systems#
To run NuRec on multi-GPU systems, append the following flags to the launch command:
trainer.world_size=<NUM_GPUS> trainer.num_nodes=<NUM_NODES>
Notes:
- <NUM_GPUS> is the number of GPUs to use for training.
- <NUM_NODES> is the number of nodes to use for training.
- Each node runs four tasks, with one GPU per task; for example, with 8 GPUs you can use 2 nodes.
- By default, NuRec uses only the first visible GPU. When you set trainer.world_size=0 and trainer.num_nodes=0, NuRec tries to utilize all the visible GPUs and compute nodes (operating as a SLURM environment). To limit the visible GPUs, use the CUDA_VISIBLE_DEVICES environment variable and identify the GPUs by ID.
- If you specify the available GPUs, the trainer.world_size flag pulls the GPUs in the order they are specified in the CUDA_VISIBLE_DEVICES environment variable. For example, if you have 6 GPUs and you specify CUDA_VISIBLE_DEVICES=1,2,3,4,5,0 and pass trainer.world_size=4, NuRec uses the GPUs that correspond to the IDs 1, 2, 3, and 4 (see the launch sketch after this list).
- Using Fixer at training time is not supported with multi-GPU training. Run NuRec on a single GPU to enable training-time Fixer, or use Fixer at inference time.
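As a concrete sketch (placeholder dataset path and camera IDs, as in the base command above), the following single-node launch trains on four of eight GPUs, skipping GPU 0:
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
-e CUDA_VISIBLE_DEVICES=1,2,3,4 \
--volume <DATASET_DIR>:/workdir/dataset \
--volume <OUTPUT_DIR>:/workdir/output \
nvcr.io/nvidia/nre/nre-ga:latest \
mode=train \
out_dir=/workdir/output \
--config-name=configs/apps/prod/Hyperion-8.1/car2sim_6cam.yaml \
dataset.path=/workdir/dataset/pai_<clip_id>.json \
dataset.camera_ids=[<CAM1>,<CAM2>,<CAM3>] \
dataset.lidar_ids=[lidar_top_360fov] \
dataset.aux_data=True \
trainer.world_size=4 trainer.num_nodes=1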
Optional Configuration#
- There are multiple modes available for the reconstruction: train, val, and trainval.
  - train: Runs training steps for reconstruction
  - val: Runs validation using a trained checkpoint
  - trainval: Runs both training and validation steps
- The number of training epochs is set to 30 by default. Append the following option to override the epoch value: trainer.max_epochs=N
- For reconstruction on a subset of cameras, append the following option at the end of the launch command: dataset.camera_ids="['<ID1>','<ID2>','<ID3>']" Note: There should not be any space between the IDs and commas.
- Increase the debug log level by appending +log_level=N at the end of the launch command (a combined override example follows this list), where N can be any of the following:
  - 0: FATAL errors only
  - 1: ERROR messages and fatal errors
  - 2: WARNING messages (and errors)
  - 3: INFO messages (and lower verbosity levels)
  - 4: DEBUG (all logs)
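For example, to run training and validation together for 50 epochs with full debug logging (the values are illustrative), append the following to the launch command:
mode=trainval trainer.max_epochs=50 +log_level=4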
Details about the artifacts from the training#
The training pipeline creates two types of artifacts:
- The full configuration used for training, named parsed.yaml
- Checkpoints from the training epochs
The validation pipeline creates four types of artifacts:
- A metrics.yaml file containing per-frame metrics and other details
- A depth map per frame and its corresponding video file
- An opacity map per frame and its corresponding video file
- A segmentation map per frame and its corresponding video file
There is also a folder with log files from both the training and validation pipelines. A sketch of the resulting output layout follows.
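As a rough sketch, the output directory might look like the following after training and validation (the config/, checkpoints/, and val/ paths are taken from the commands later in this guide; the log folder name may vary):
<OUTPUT_DIR>/
  config/parsed.yaml      # full training configuration
  checkpoints/last.ckpt   # checkpoints from the training epochs
  val/metrics.yaml        # validation metrics (plus maps and videos)
  <logs>/                 # log files from training and validation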
Reconstruct Novel Views via Validation Pipeline#
You can also generate novel views using X-, Y-, and Z-axis shifts, as shown in the following figure.
To run validation on the trained model in your Docker container, use the following command:
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
--volume /path/to/dataset:/workdir/dataset \
--volume /path/to/output/:/workdir/output \
nvcr.io/nvidia/nre/nre-ga:latest \
--config-name=/workdir/output/config/parsed.yaml \
mode=val \
resume=/workdir/output/checkpoints/last.ckpt \
out_dir=/workdir/output \
dataset.val_sensor_transl_delta_m="[0,2,0]"
Note:
- Here the /path/to/output is different from the training step. It should point to the top-level directory containing the artifacts from the training.
- This step may ask you to configure wandb. You may select option (3) to skip the configuration step.
For novel view synthesis:
- You can use dataset.val_sensor_transl_delta_m="[x,y,z]" to provide shifts in translation. It accepts values in meters. There should not be any space between the values.
- You can use dataset.val_sensor_rot_delta_deg="[degree1,degree2,degree3]" to provide roll-pitch-yaw Euler angles relative to the car, as shown in the figure above. There should not be any space between the values. (See the example after this list.)
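For example, to shift the sensors 2 meters along the Y axis and add a 10-degree yaw (illustrative values), append:
dataset.val_sensor_transl_delta_m="[0,2,0]" \
dataset.val_sensor_rot_delta_deg="[0,0,10]"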
Details about the artifacts from the validation#
The validation pipeline creates the following artifacts:
- Validation creates a file named metrics.yaml in the /path/to/output/val directory to store metrics.
- You can check the PSNR versus the training views in metrics.yaml. It should be under the test/psnr field in the YAML file (a quick lookup sketch follows this list).
- MP4 files showing the base reconstruction, the depth map, and the opacity.
- Individual per-frame images showing the base reconstruction, the depth map, and the opacity.
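A quick, hypothetical way to check the PSNR from the host (this assumes the test/psnr field named above appears literally in the file):
grep "test/psnr" /path/to/output/val/metrics.yaml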
Help#
You can run the help command to get more information about the training, validation, and export utilities.
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
nvcr.io/nvidia/nre/nre-ga:latest --help
docker run --shm-size=64g --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
nvcr.io/nvidia/nre/nre-ga:latest \
<export-utility-name> \
--help