Generate NuRec Auxiliary Data
NuRec requires an additional dataset to reconstruct scenes when you convert and input your own real-world data. This dataset, called NuRec Auxiliary Data, includes the following required and optional data types:
Semantic Segmentation Data (conditionally required)
Depth Estimation Data (optional)
DINOv2 Feature Extraction (optional)
LiDAR Segmentation and Visibility (optional, recommended)
Metadata and Configuration (required)
To learn more about each data type, see the Learn About NuRec Auxiliary Data Types section.
Generate the Data
You generate NuRec Auxiliary Data in its own container, available on NGC. To generate auxiliary data, follow the steps in this section.
Download the NuRec Auxiliary Data container by running the following command:
docker pull nvcr.io/nvidia/nre/nre-tools-ga:26.04
To start generating the data in the Docker container, edit the following command to reflect the correct paths and then run it:
docker run --shm-size=2g -it --rm --gpus all \
-e NGC_API_KEY=${NGC_API_KEY} \
--volume /path/to/dataset:/workdir/dataset \
--volume /path/to/output:/workdir/output \
nvcr.io/nvidia/nre/nre-tools-ga:26.04 \
--dataset-path=/workdir/dataset/<DATASET_NAME>.zarr.itar \
--output-dir=/workdir/output \
--camera-id=<ID1> --camera-id=<ID2> --camera-id=<ID3> \
--store-meta \
--no-seg-logits \
--lidar-seg-camvis
Notes:
Point the dataset volume mount to the directory where the .zarr.itar and .json files from the previous section are saved.
Replace <DATASET_NAME>.zarr.itar with the name of the file that has the .zarr.itar extension.
Pass the --camera-id flag once per camera, for example --camera-id=<ID1> --camera-id=<ID2>. If you don't pass any camera IDs, auxiliary data is generated for all cameras. For example, if camera_main is your camera of choice, pass the --camera-id=camera_main flag. You can find the camera IDs in the JSON file generated by NCore.
You can use the --numba-num-threads <N> flag to increase the number of CPU threads and generate the auxiliary data faster.
Files generated by the above command have .aux added to the filename: <DATASET_NAME>.aux.<>.zarr.itar.
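As a small illustration of the repeated --camera-id flag, the following shell sketch builds one flag per camera from a space-separated list. The names camera_left and camera_right and the CAMERA_IDS and CAMERA_FLAGS variables are placeholders chosen for this example and are not part of the tool. Take the real IDs from the JSON file generated by NCore.

```shell
# Hypothetical camera IDs, for illustration only.
CAMERA_IDS="camera_main camera_left camera_right"

# Append one --camera-id=<ID> flag per entry in the list.
CAMERA_FLAGS=""
for id in $CAMERA_IDS; do
  CAMERA_FLAGS="$CAMERA_FLAGS --camera-id=$id"
done
CAMERA_FLAGS="${CAMERA_FLAGS# }"   # drop the leading space

# Print the flags so you can review them before running the container.
echo "$CAMERA_FLAGS"
```

You can paste the printed flags into the docker run command in place of the --camera-id line.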
Copy the generated .aux.<>.zarr files to the same directory as your NCore data, then follow the steps in Reconstruct an AV Scene for training.
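The copy step could be scripted roughly as follows. This is a sketch under assumptions: copy_aux_files is a helper invented for this example, the glob *.aux.*.zarr* is derived from the filename template described above, and both directory arguments are placeholders for your own paths.

```shell
# Copy every auxiliary output from one directory to another.
#   $1: directory that holds the generated auxiliary files
#   $2: directory that holds your NCore data
# Prints the number of entries copied.
copy_aux_files() {
  src_dir=$1
  dst_dir=$2
  copied=0
  for f in "$src_dir"/*.aux.*.zarr*; do
    [ -e "$f" ] || continue          # the glob matched nothing
    cp -R "$f" "$dst_dir"/ || return 1
    copied=$((copied + 1))
  done
  echo "$copied"
}

# Example call with placeholder paths:
# copy_aux_files /path/to/output /path/to/dataset
```

The function only copies and leaves the originals in the output directory, so you can rerun it safely.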
Options for Generating the Data
The following table outlines all the options you can pass when you run the command to generate the supplemental auxiliary data.
Note: Either --shard-file-pattern or --dataset-path is required, in addition to --output-dir. Use --dataset-path to point to the NCore .json manifest file. Use --shard-file-pattern for monolithic shard files. If you're using the NVIDIA NCore-converted Physical AI dataset or the NVIDIA Physical AI raw dataset, use --dataset-path.
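If you wrap the container call in your own script, a pre-flight check for the rule in the note above might look like the sketch below. The function check_required_options and its argument order are assumptions made for this example. The tool performs its own validation, which may differ.

```shell
# Verify the required inputs before starting the container.
#   $1: value intended for --dataset-path (empty if unused)
#   $2: value intended for --shard-file-pattern (empty if unused)
#   $3: value intended for --output-dir
check_required_options() {
  if [ -z "$1" ] && [ -z "$2" ]; then
    echo "error: pass --dataset-path or --shard-file-pattern" >&2
    return 1
  fi
  if [ -z "$3" ]; then
    echo "error: pass --output-dir" >&2
    return 1
  fi
  return 0
}

# Example call with placeholder values:
# check_required_options "/workdir/dataset/<DATASET_NAME>.zarr.itar" "" "/workdir/output"
```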
| Option | Description |
|---|---|
| --shard-file-pattern | Data shard pattern to load (supports range expansion) |
| --dataset-path | Path to NCore |
| --output-dir | Path to the output folder |
| --camera-id | Cameras to be used (multiple value option, all if not specified) |
| | Lidars to be used (multiple value option, all if not specified) |
| --segmentation-backend | Perform segmentation, please choose a backend. |
| --seg-logits / --no-seg-logits | Perform semantic segmentation and save logits |
| | Enable running TRT optimized models |
| --dinov2-backend | DINOv2 backbone to be used for feature extraction |
| | PCA dimension for the features to be extracted ( |
| | DINOv2 feature width (default 256) |
| --lidar-seg-camvis / --no-lidar-seg-camvis | Perform lidar segmentation and point-in-cameras visibility determination |
| | Whether to use CUDA-based ensemble function for lidar segmentation |
| --depth-backend | Perform depth estimation, please choose a backend. |
| | Estimate the relative depth (as opposed to metric) |
| | The maximum metric depth predicted by the metric depth estimation network. |
| | The resolution of the inputs to the depth estimation network |
| | Store depth in a quantized form (as PNG) |
| | Perform automatic ego-mask estimation |
| | Number of frames to sample for ego-mask estimation per second (default: 0.2). |
| | Aggregation method for ego-mask estimation when using multiple samples. |
| | Zarr store type to store the aux data in |
| | Open shards consolidated meta-data |
| --numba-num-threads | Number of numba threads to use (use |
| | Enable debug logging outputs |
| | Enable outputting visualization results |
| --store-meta | Store meta-file per shard with CLI arguments and maglev runtime logging (if available) |
| | Show all the command-line options and exit. |
Learn About NuRec Auxiliary Data Types
Semantic Segmentation Data (conditionally required):
Method: Uses Mask2Former with DINOv2 backbone
Outputs:
Semantic segmentation masks (stored as PNG images)
Semantic segmentation logits (optional, for training/fine-tuning)
Default: --no-seg-logits (disabled by default)
Usage: Only stored when the --seg-logits flag is used
Purpose: Used for advanced training techniques but not core functionality
Per-pixel class labels for scene understanding
Default: --segmentation-backend="mask2former" (enabled by default)
Can be disabled: --segmentation-backend="none"
Dependency: Required if LiDAR segmentation is enabled
Usage: Core for multi-modal understanding but can be skipped for pure image-based training
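The segmentation settings above reduce to three flag combinations. The sketch below maps a mode name to the matching flags. The function seg_flags and the mode names masks, logits, and off are invented for this example. Only the flags themselves come from this page.

```shell
# Print the segmentation flags for a given mode.
#   masks  : segmentation masks only (the documented default)
#   logits : masks plus stored logits
#   off    : no semantic segmentation
seg_flags() {
  case "$1" in
    masks)  echo "--segmentation-backend=mask2former --no-seg-logits" ;;
    logits) echo "--segmentation-backend=mask2former --seg-logits" ;;
    off)    echo "--segmentation-backend=none" ;;
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

seg_flags masks
```

If you choose off while LiDAR segmentation stays enabled, semantic segmentation must already exist from an earlier run.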
Depth Estimation Data (optional):
Method: Uses DepthAnythingV2 models
Types:
Relative depth: Normalized depth values [0,1]
Metric depth: Absolute depth values in meters (default max 80 m for outdoor scenes)
Storage: Can be stored as quantized PNG or raw float16 values
Resolution: Configurable input resolution (default 1036 px)
Default: --depth-backend="none" (disabled by default)
Usage: Only generated when explicitly enabled with --depth-backend=depthanythingv2
Purpose: Provides geometric constraints but not essential for basic NeRF training
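Because depth estimation is off by default, you have to add the backend flag yourself. The following sketch assembles the tool arguments with depth enabled and prints them for review. The DEPTH_BACKEND and TOOL_ARGS variables are placeholders for this example, and <DATASET_NAME> stays a placeholder exactly as in the command earlier on this page.

```shell
# Set to "none" to keep depth estimation disabled (the default).
DEPTH_BACKEND="depthanythingv2"

# Arguments taken from the docker run command earlier on this page.
TOOL_ARGS="--dataset-path=/workdir/dataset/<DATASET_NAME>.zarr.itar --output-dir=/workdir/output --store-meta"

# Add the depth backend only when it is actually enabled.
if [ "$DEPTH_BACKEND" != "none" ]; then
  TOOL_ARGS="$TOOL_ARGS --depth-backend=$DEPTH_BACKEND"
fi

# Dry run: print the arguments instead of starting the container.
echo "$TOOL_ARGS"
```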
DINOv2 Feature Extraction (optional):
Purpose: Dense visual features for neural rendering
Models: Various DINOv2 variants (ViT-S/B/L/G with 14x14 patches)
Processing:
Optional PCA dimensionality reduction
Color transformation for visualization
Feature-to-color mapping for neural field training
Output: High-dimensional feature vectors per pixel patch
Default: --dinov2-backend="none" (disabled by default)
Usage: Only used for advanced feature-based rendering when explicitly enabled
Purpose: Enhances semantic consistency and novel view synthesis quality, but not required
LiDAR Segmentation and Visibility (optional, recommended):
Method: Projects camera semantic segmentation onto LiDAR point clouds
Outputs:
Per-point semantic labels for LiDAR data
Point-in-camera visibility information
Ensemble-based label fusion from multiple camera views
Uses: CUDA-accelerated ensemble methods for performance
Default: --lidar-seg-camvis (enabled by default)
Can be disabled: --no-lidar-seg-camvis
Dependency: Requires semantic segmentation to be available (either generated or pre-existing)
Purpose: Essential for multi-modal NeRF training with LiDAR data
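The dependency described above can be expressed as a small check that you run before launching the container. This is a sketch only: lidar_seg_ok and its yes/no arguments are invented for this example and do not correspond to any option of the tool.

```shell
# Return 0 when the LiDAR segmentation dependency is satisfied.
#   $1: segmentation backend ("mask2former" or "none")
#   $2: "yes" if --lidar-seg-camvis is enabled, "no" for --no-lidar-seg-camvis
#   $3: "yes" if semantic segmentation already exists from an earlier run
lidar_seg_ok() {
  [ "$2" = "no" ] && return 0        # LiDAR segmentation is off
  [ "$1" != "none" ] && return 0     # segmentation is generated in this run
  [ "$3" = "yes" ] && return 0       # segmentation is pre-existing
  echo "LiDAR segmentation needs semantic segmentation (generated or pre-existing)" >&2
  return 1
}

# Example: segmentation disabled, LiDAR segmentation on, nothing pre-existing.
lidar_seg_ok none yes no || echo "adjust the flags before running"
```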
Metadata and Configuration (required):
Camera calibration and sensor metadata
Processing parameters and CLI arguments
Model metadata (versions, configurations, etc.)
Runtime information and workflow logging