Overview

TAO is an application for pre-training, finetuning, and optimizing computer vision deep neural networks (DNNs). The finetuning pipelines in TAO are implemented with the PyTorch deep learning framework. The subsequent sections cover these pipelines in detail, including their hyperparameters and features.

The source code for these networks is hosted on GitHub.

Self-Supervised Learning

Pretrain backbones from unlabeled data for downstream tasks.

Image Classification

Classify images into categories using transformer and CNN backbones.

Object Detection

Detect and localize objects in images with bounding boxes.

Segmentation

Semantic and instance segmentation of scenes and objects.

Visual ChangeNet

Detect changes between image pairs for classification and segmentation.

Depth Estimation - Monocular and Stereo

Estimate per-pixel depth and disparity from images.

3D Perception

Detect, reconstruct, and estimate pose from 3D and LiDAR data.

Recognition & Re-Identification

Recognize and re-identify objects, people, characters, and poses.

Optical Inspection

Detect defects and anomalies for automated optical inspection.
