Important
NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Please refer to the NeMo 2.0 overview for information on getting started.
Datasets
Note
It is the responsibility of each user to check the content of the dataset, review the applicable licenses, and determine if it is suitable for their intended use. Users should review any applicable links associated with the dataset before placing the data on their machine.
Rays Dataset
Ray datasets are specialized data structures designed for applications in computer graphics, notably in 3D reconstruction, neural rendering, and ray tracing.
Ray datasets are characterized by their detailed representation of rays, each defined by an origin point (rays_o) and a direction vector (rays_d). These datasets are closely tied to specific image dimensions, including height and width, which dictate the resolution and aspect ratio of the target images. Alongside the core ray data, these datasets typically include additional metadata such as camera parameters, depth values, and color information. The diversity and complexity of the dataset, encompassing a range of viewpoints and lighting conditions, play a crucial role in capturing the nuances of real-world light behavior.
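As a concrete illustration, the following is a minimal NumPy sketch of how per-pixel ray origins (rays_o) and directions (rays_d) can be derived from a camera-to-world pose and pinhole intrinsics. It is not the NeMo implementation; the function name and the intrinsic parameters (fx, fy, cx, cy) are illustrative assumptions.

import numpy as np

def generate_rays(c2w: np.ndarray, height: int, width: int,
                  fx: float, fy: float, cx: float, cy: float):
    """Compute per-pixel ray origins and directions for a pinhole camera.

    c2w is a 4x4 camera-to-world matrix; fx, fy, cx, cy are intrinsics.
    Returns rays_o and rays_d, each of shape (height, width, 3).
    """
    # Pixel grid over the target image.
    j, i = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    # Per-pixel directions in camera space (sign conventions vary between renderers).
    dirs = np.stack([(i - cx) / fx,
                     (j - cy) / fy,
                     np.ones((height, width))], axis=-1)
    # Rotate directions into world space and normalize.
    rays_d = dirs @ c2w[:3, :3].T
    rays_d /= np.linalg.norm(rays_d, axis=-1, keepdims=True)
    # Every ray shares the camera center (the pose translation) as its origin.
    rays_o = np.broadcast_to(c2w[:3, 3], rays_d.shape)
    return rays_o, rays_d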
Random Poses Dataset
The Random Poses Dataset randomly generates camera poses, each translating to a unique set of rays characterized by their origins and directions. This randomization is key to covering a wide range of potential viewpoints and angles, mimicking a comprehensive exploration of a 3D scene. This diverse sampling is essential for training robust NeRF models capable of accurately reconstructing and rendering 3D environments from previously unseen angles.
The dataset inherently accounts for the necessary parameters of ray generation, such as the height and width of the target images, ensuring that the rays are compatible with the specific requirements of the rendering or reconstruction algorithms. In addition to the ray origins and directions, the dataset may also include other relevant metadata like camera intrinsic and extrinsic parameters, contributing to a more detailed and versatile training process.
An example of RandomPosesDataset usage as a training dataset is shown below:
model:
  data:
    train_batch_size: 1
    train_shuffle: false
    train_dataset:
      _target_: nemo.collections.multimodal.data.nerf.random_poses.RandomPosesDataset
      internal_batch_size: 100
      width: 512
      height: 512
      radius_range: [3.0, 3.5]
      theta_range: [45, 105]
      phi_range: [-180, 180]
      fovx_range: [10, 30]
      fovy_range: [10, 30]
      jitter: False
      jitter_center: 0.2
      jitter_target: 0.2
      jitter_up: 0.02
      uniform_sphere_rate: 0
      angle_overhead: 30
      angle_front: 60
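Conceptually, each sample drawn by this dataset corresponds to a camera placed at a random radius, polar angle (theta), and azimuth (phi) within the configured ranges, looking at the scene origin. The sketch below illustrates that sampling logic in NumPy under those assumptions; it omits details such as jitter and uniform-sphere sampling and does not reproduce the exact RandomPosesDataset implementation.

import numpy as np

def sample_random_pose(radius_range=(3.0, 3.5), theta_range=(45, 105),
                       phi_range=(-180, 180)):
    """Sample one camera-to-world pose looking at the origin.

    theta is the polar angle from the +y axis and phi the azimuth, both in
    degrees, mirroring the ranges in the configuration above.
    """
    radius = np.random.uniform(*radius_range)
    theta = np.deg2rad(np.random.uniform(*theta_range))
    phi = np.deg2rad(np.random.uniform(*phi_range))

    # Spherical -> Cartesian camera center (y up).
    center = radius * np.array([np.sin(theta) * np.sin(phi),
                                np.cos(theta),
                                np.sin(theta) * np.cos(phi)])

    # Look-at rotation: forward points from the camera toward the origin.
    forward = -center / np.linalg.norm(center)
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)

    # OpenGL-style convention: the camera looks down its -z axis.
    c2w = np.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, -forward, center
    return c2w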
Circle Poses Dataset
The Circle Poses Dataset is a specialized ray dataset designed to generate ray samples in a circular pattern. The key feature of this dataset is its ability to simulate camera positions arranged along a circular path, focusing on a central point. This arrangement is particularly useful for capturing scenes from multiple, evenly spaced angles, ensuring a comprehensive view around a central axis.
The defining parameter of the Circle Poses Dataset is its size, which dictates the number of samples or camera poses around the circle. A larger size results in more camera positions being generated, offering finer granularity and coverage of the circle. Each camera pose corresponds to a unique set of rays, with origins and directions calculated based on the position around the circle and the focus on the central point.
The Circle Poses Dataset is particularly valuable during validation and testing to generate a holistic view of the reconstructed scene.
An example of CirclePosesDataset usage as a validation dataset is shown below:
model:
  data:
    val_batch_size: 1
    val_shuffle: false
    val_dataset:
      _target_: nemo.collections.multimodal.data.nerf.circle_poses.CirclePosesDataset
      size: 5
      width: 512
      height: 512
      angle_overhead: 30
      angle_front: 60
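For intuition, the circular arrangement amounts to placing size camera centers at evenly spaced azimuth angles on a ring of fixed radius and elevation, each paired with a look-at rotation toward the scene center (as in the random-pose sketch above). The snippet below is a minimal NumPy illustration of that idea; the radius and elevation values are illustrative assumptions, not CirclePosesDataset defaults.

import numpy as np

def circle_camera_centers(size=5, radius=3.2, elevation_deg=60.0):
    """Place `size` camera centers evenly around a circle (y up).

    Each center sits at the given radius and polar angle, with azimuth
    angles evenly spaced over a full revolution.
    """
    theta = np.deg2rad(elevation_deg)                             # polar angle from +y
    phis = np.deg2rad(np.linspace(0, 360, size, endpoint=False))  # evenly spaced azimuths
    centers = radius * np.stack([np.sin(theta) * np.sin(phis),
                                 np.full_like(phis, np.cos(theta)),
                                 np.sin(theta) * np.cos(phis)], axis=-1)
    return centers  # shape (size, 3); pair each with a look-at rotation toward the origin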