Requirements and Installation¶
TLT is designed to run on x86 systems with an NVIDIA GPU, such as a GPU-powered workstation or a DGX system, or in any cloud with an NVIDIA GPU. For inference, models can be deployed on edge devices such as an embedded Jetson platform, or in a data center with GPUs such as the T4 or A100. This page lists the recommended system requirements for installing and using TLT.
Hardware Requirements¶
The following system configuration is recommended to achieve reasonable training performance with the TLT and supported models provided:
32 GB system RAM
32 GB of GPU RAM
8-core CPU
1 NVIDIA GPU
100 GB of SSD space
TLT is supported on A100, V100 and RTX 30x0 GPUs.
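As a rough pre-flight check, the recommendations above can be compared against the local machine. The sketch below is not part of TLT. It assumes a Linux host with /proc/meminfo, nproc and df, and it only prints GPU details when nvidia-smi is on the PATH. The helper names (meets_min, report, check_host) are made up for this example.

```shell
#!/bin/sh
# Hypothetical pre-flight check against the recommended configuration above.
# Assumes Linux (/proc/meminfo, nproc, df); nvidia-smi is optional.

# Succeeds if the measured value reaches the recommended minimum (integers).
meets_min() {
    [ "$1" -ge "$2" ]
}

# Prints one OK/LOW line for a named resource.
report() {
    # $1 = label, $2 = measured value, $3 = recommended minimum
    if meets_min "$2" "$3"; then
        echo "OK   $1: $2 (recommended >= $3)"
    else
        echo "LOW  $1: $2 (recommended >= $3)"
    fi
}

check_host() {
    # MemTotal is reported in kB; convert to whole GB (rounded down).
    ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    cores=$(nproc)
    # Free space in GB on the filesystem holding the current directory.
    disk_gb=$(df -Pk . | awk 'NR == 2 {print int($4 / 1024 / 1024)}')

    report "system RAM (GB)" "$ram_gb" 32
    report "CPU cores" "$cores" 8
    report "free disk (GB)" "$disk_gb" 100

    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per GPU: name and total memory, for manual comparison.
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
    else
        echo "nvidia-smi not found; cannot inspect the GPU"
    fi
}
```

Calling check_host prints one line per resource. The kernel reports slightly less memory than the nominal amount, so a machine sold with 32 GB of RAM may show 31 here. The disk line measures free space only and cannot tell whether the drive is an SSD.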
Software Requirements¶
In addition to the TLT package, the following software is required to take advantage of all the tutorials, examples and supported models within the containers provided:
Ubuntu 18.04 LTS
NVIDIA GPU Cloud account and API key
Note
DeepStream 5.0 - NVIDIA SDK for IVA inference is recommended.
Installation Prerequisites¶
Perform the following prerequisite steps before installing TLT:
Install Docker.
Install NVIDIA GPU driver v455.xx or above.
Install nvidia-docker2.
Get an NGC account and API key:
Go to NGC and click the Transfer Learning Toolkit container in the Catalog tab. This message is displayed: “Sign in to access the PULL feature of this repository”.
Enter your Email address and click Next, or click Create an Account.
Choose your organization when prompted for Organization/Team.
Click Sign In.
Execute
docker login nvcr.io
from the command line and enter these login credentials:
Username: “$oauthtoken”
Password: “YOUR_NGC_API_KEY”
Note
If you followed the default installation instructions for docker-ce, you may need sudo access to run docker commands. To avoid this, follow these post-installation steps so that docker commands can be run without sudo.
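The prerequisites above can be spot-checked from a shell before installing TLT. The following is a minimal sketch, not an official tool. The function names are hypothetical, and NGC_API_KEY is an environment variable name chosen for this example to hold the API key obtained from NGC.

```shell
#!/bin/sh
# Hypothetical helpers for verifying the prerequisites listed above.

# Succeeds if a driver version string such as "455.23.05" has a major
# version of at least 455 (the minimum stated in the prerequisites).
driver_ok() {
    major=${1%%.*}
    [ "$major" -ge 455 ] 2>/dev/null
}

check_prereqs() {
    # Is the Docker CLI present?
    if command -v docker >/dev/null 2>&1; then
        docker --version
    else
        echo "docker: not installed"
    fi

    # Driver version as reported by nvidia-smi (first GPU only).
    if command -v nvidia-smi >/dev/null 2>&1; then
        ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
        if driver_ok "$ver"; then
            echo "driver $ver: OK"
        else
            echo "driver $ver: older than 455"
        fi
    else
        echo "nvidia-smi: not found"
    fi
}

# Non-interactive login to nvcr.io. The username is the literal string
# $oauthtoken (single-quoted so the shell does not expand it); the API key
# is read from NGC_API_KEY, a variable name assumed for this sketch.
ngc_docker_login() {
    printf '%s' "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Run check_prereqs first, then ngc_docker_login once NGC_API_KEY is exported. Piping the key through --password-stdin keeps it out of the shell history and the process list.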
Installation¶
The Transfer Learning Toolkit (TLT) is a Python pip package that is available to download from the NVIDIA DevZone. The package uses the docker CLI internally to interact with the NGC Docker registry to download and instantiate the underlying docker containers. You must have an NGC account and an API key associated with your account. See the Installation Prerequisites section for details on creating an NGC account and obtaining an API key.
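Since TLT is distributed as a pip package, an install into an isolated virtual environment might look like the sketch below. The package names nvidia-pyindex and nvidia-tlt and the environment path are assumptions that do not appear on this page, so confirm them against the download location before running anything.

```shell
#!/bin/sh
# Sketch of a pip-based install into an isolated virtual environment.
# ASSUMPTIONS: the package names nvidia-pyindex and nvidia-tlt, and the
# environment path below, are not stated on this page; verify before use.

TLT_VENV=${TLT_VENV:-"$HOME/.venvs/tlt"}

install_tlt() {
    # Create the environment, register NVIDIA's package index, then
    # install the launcher. Each step runs only if the previous one worked.
    python3 -m venv "$TLT_VENV" &&
    "$TLT_VENV/bin/pip" install nvidia-pyindex &&
    "$TLT_VENV/bin/pip" install nvidia-tlt
}
```

Calling install_tlt performs the three steps in order. A virtual environment keeps the launcher's dependencies separate from system Python packages, and it can be removed by deleting the directory.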
Running the Transfer Learning Toolkit¶
The procedure to install and run the Transfer Learning Toolkit is detailed in this section.
Use the examples¶
Example Jupyter notebooks for all the tasks that are supported in TLT are available in NGC resources. TLT provides sample workflows for Computer Vision and Conversational AI.
Computer Vision
All the samples for the supported computer vision tasks are hosted on NGC under the TLT Computer Vision Samples. To run the available examples, download this sample resource using the following commands.
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tlt_cv_samples/versions/v1.0.2/zip -O tlt_cv_samples_v1.0.2.zip
unzip -u tlt_cv_samples_v1.0.2.zip -d ./tlt_cv_samples_v1.0.2 && rm -rf tlt_cv_samples_v1.0.2.zip && cd ./tlt_cv_samples_v1.0.2
Conversational AI
The TLT Conversational AI package provides several end-to-end sample workflows for training conversational AI models with TLT and subsequently deploying them to Jarvis. You can find these samples at:
Conversational AI Task | Jupyter Notebooks |
---|---|
Speech to Text | |
Question Answering | |
Text Classification | |
Token Classification | |
Punctuation and Capitalization | |
Intent and Slot Classification | |
You can download these resources using the NGC CLI command available on the NGC resource page. Once you have downloaded the respective tutorial resource, you can start the Jupyter notebook server.
jupyter notebook --ip 0.0.0.0 --allow-root --port 8888
Copy and paste the link produced by this command into your browser to access the notebook. The /workspace/examples folder will contain a demo notebook. If port 8888 is unavailable, use any other free port to host the notebook.
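When port 8888 is already taken, a free one can be picked automatically. The sketch below is only an illustration. It relies on bash's /dev/tcp pseudo-device, it checks listeners on 127.0.0.1 only, and the helper names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first port in a range that nothing on
# localhost is listening on. Uses bash's /dev/tcp pseudo-device.

port_in_use() {
    # The connection attempt succeeds only if something accepts it.
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

first_free_port() {
    # $1 = first port to try, $2 = last port to try
    port=$1
    while [ "$port" -le "$2" ]; do
        if ! port_in_use "$port"; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    # Every port in the range is taken (or the range is empty).
    return 1
}
```

For example, `jupyter notebook --ip 0.0.0.0 --allow-root --port "$(first_free_port 8888 8899)"` starts the server on the first free port in that range. Another process could still claim the port between the check and the server start, so treat the result as a best guess.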
Downloading the Models¶
The Transfer Learning Toolkit Docker gives you access to a repository of pretrained models that can serve as a starting point when training deep neural networks. These models are hosted on the NGC. To download the models, please download the NGC CLI and install it. More information about the NGC Catalog CLI is available here. Once you have installed the CLI, you may follow the instructions below to configure the NGC CLI and download the models.
Configure the NGC API key¶
Using the NGC API key obtained in Installation Prerequisites, configure the NGC CLI by executing this command and following the prompts:
ngc config set
Get a list of models¶
Use this command to get a list of models that are hosted in the NGC model registry:
ngc registry model list <model_glob_string>
For the computer vision models, here is an example of using this command:
ngc registry model list nvidia/tlt_pretrained_*
Note
All our classification models have names based on this template:
nvidia/tlt_pretrained_classification:<template>
To view all the conversational AI models, use the following command:
ngc registry model list nvidia/tlt-jarvis/*
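The listing commands above take a glob pattern, which an interactive shell may expand against local files before the NGC CLI sees it. The following sketch wraps the command so the pattern is always passed quoted, and builds a classification model name from the template in the note above. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers around `ngc registry model list`.

# Builds a classification model name from the naming template,
# e.g. resnet18 -> nvidia/tlt_pretrained_classification:resnet18
classification_model() {
    echo "nvidia/tlt_pretrained_classification:$1"
}

# Lists models matching a glob. The pattern is passed quoted so that the
# local shell does not expand * against files in the current directory.
list_models() {
    ngc registry model list "$1"
}
```

For example, `list_models "nvidia/tlt_pretrained_*"` lists the computer vision models and `list_models "nvidia/tlt-jarvis/*"` lists the conversational AI models.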
Download a model¶
Use this command to download the model you have chosen from the NGC model registry:
ngc registry model download-version <ORG/model_name:version> --dest <path_to_download_dir>
For example, use this command to download the ResNet-18 classification model to the $USER_EXPERIMENT_DIR directory:
ngc registry model download-version nvidia/tlt_pretrained_classification:resnet18 --dest $USER_EXPERIMENT_DIR/pretrained_resnet18
Downloaded 82.41 MB in 9s, Download speed: 9.14 MB/s
----------------------------------------------------
Transfer id: tlt_iva_classification_resnet18_v1 Download status: Completed.
Downloaded local path: /workspace/tlt-experiments/pretrained_resnet18/tlt_resnet18_classification_v1
Total files downloaded: 2
Total downloaded size: 82.41 MB
Started at: 2019-07-16 01:29:53.028400
Completed at: 2019-07-16 01:30:02.053016
Duration taken: 9s seconds
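The download step can be wrapped in a small helper that prepares the destination directory first. This is a sketch under one assumption: that the destination should exist before the CLI is invoked. Creating it up front is harmless either way. The function name download_model is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the download command shown above.

download_model() {
    # $1 = <ORG/model_name:version>, $2 = destination directory
    # Create the destination first (assumption: the CLI expects it to exist).
    mkdir -p "$2" || return 1
    if ! command -v ngc >/dev/null 2>&1; then
        echo "ngc CLI not found on PATH" >&2
        return 1
    fi
    ngc registry model download-version "$1" --dest "$2"
}
```

For example, `download_model nvidia/tlt_pretrained_classification:resnet18 "$USER_EXPERIMENT_DIR/pretrained_resnet18"` reproduces the download shown above. Quoting the destination keeps the command intact if the path contains spaces.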