Running TLT on Google Cloud Platform
====================================

.. _running_tlt_on_gcp:

Google Cloud Platform provides Compute Engine, a computing and hosting service
that lets you create and run virtual machines (VMs) on Google infrastructure. Compute Engine
offers both Linux and Windows VMs. To run Transfer Learning Toolkit (TLT), you need to
set up a Linux VM.

Setting up a Linux VM Instance
------------------------------

1. Review the instructions for setting up a VM in the `official
   Compute Engine instructions <https://cloud.google.com/compute/docs/quickstart-linux>`_.

2. In the console, open Compute Engine and select the **VM Instances** option.

3. Create a new instance using the **Create Instance** tab.

4. Set the machine family of the instance to :code:`GPU`.

5. Set the boot image to Ubuntu, with the following options:

   * :code:`Boot disk type`: Balanced persistent disk
   * :code:`Size (GB)`: greater than 200

6. Select your default network.

7. Spin up the VM by clicking **Create**.

.. Note::
    NVIDIA recommends using the A2 series of VM instances, which are powered by NVIDIA Tesla A100 GPUs,
    for best training performance.
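
The console steps above can also be approximated from the command line. The following is a
minimal sketch, assuming the :code:`gcloud` CLI is installed and authenticated. The instance
name, zone, machine type, image family, and disk size are illustrative assumptions, not values
prescribed by this guide. The helper only prints the command so that you can review it before
running it.

.. code:: bash

    # Sketch: print a gcloud command that mirrors the console steps above.
    # All default values below are assumptions; adjust them for your project.

    build_create_cmd() {
        local name="${1:-tlt-vm}"          # hypothetical instance name
        local zone="${2:-us-central1-a}"   # assumed zone; pick one that offers A2 VMs
        printf '%s ' \
            gcloud compute instances create "$name" \
            "--zone=$zone" \
            --machine-type=a2-highgpu-1g \
            --image-family=ubuntu-2004-lts \
            --image-project=ubuntu-os-cloud \
            --boot-disk-type=pd-balanced \
            --boot-disk-size=250GB \
            --maintenance-policy=TERMINATE
        printf '\n'
    }

    # Print the command for review; nothing is created at this point.
    build_create_cmd "$@"

Once the printed command looks right for your project, paste it into the terminal to create the instance.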

Using the VM
------------

Once you have set up the instance, note the external IP address of the VM
shown in the console.

1. Set up SSH access

   a. Generate an SSH key from the terminal you intend to use to log in to the created VM.
      You can do so by running the command below and following the prompts:

      .. code::

         ssh-keygen -t rsa -b 4096

   b. Copy the contents of the :code:`~/.ssh/id_rsa.pub` file and
      add it to the SSH keys of the instance.

   c. Use the login ID in the public key to log in to the public
      IP address of the instance.
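
The login step can be sketched as a pair of shell helpers. This is a minimal illustration,
assuming that the login ID is the user portion of the key comment (the third field of the
public key line, in the form :code:`user@host`). The function names and the example IP address
are hypothetical.

.. code:: bash

    # Sketch: derive the login ID from a public key and build the ssh command.
    # Assumes the key comment (third field) has the form user@host.

    login_id_from_pubkey() {
        local pubkey="${1:-$HOME/.ssh/id_rsa.pub}"
        # Field 3 of an OpenSSH public key line is the comment, e.g. alice@laptop.
        awk 'NR == 1 { split($3, parts, "@"); print parts[1] }' "$pubkey"
    }

    ssh_command() {
        # $2 is the IP address of the VM noted from the console.
        local pubkey="$1" external_ip="$2"
        local private_key="${pubkey%.pub}"
        printf 'ssh -i %s %s@%s\n' "$private_key" \
            "$(login_id_from_pubkey "$pubkey")" "$external_ip"
    }

For a key whose comment is :code:`alice@laptop`, running
:code:`ssh_command ~/.ssh/id_rsa.pub 203.0.113.10` prints an :code:`ssh` command that targets
:code:`alice@203.0.113.10`.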

Setting up the VM and Enabling GPUs
-----------------------------------

1. Prepare the OS dependencies and check the GPUs:

    .. code::

        sudo apt-get update
        sudo apt-get -y upgrade
        sudo apt-get install -y pciutils

        lspci | grep -i nvidia

2. Install the NVIDIA GPU driver, Docker, and supporting packages:

    .. code::

        sudo apt-get -y install nvidia-driver-460
        sudo apt-get -y install docker.io
        sudo apt-get -y install python3-pip unzip

3. Install nvidia-docker2:

    .. code::

        distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
        curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
        curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
        sudo apt-get update
        sudo apt-get install -y nvidia-docker2
        sudo systemctl restart docker
        sudo usermod -a -G docker $USER

   Log out and back in so that the :code:`docker` group membership takes effect. You can then
   verify the Docker installation and GPU access, as shown below:

    .. code::
    
        docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

        +-----------------------------------------------------------------------------+
        | NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
        |-------------------------------+----------------------+----------------------+
        | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        |-------------------------------+----------------------+----------------------+

4. Log in to the docker registry :code:`nvcr.io` by running the command below:

    .. code::
    
        docker login nvcr.io
 
   The username here is :code:`$oauthtoken` and the password is your :code:`NGC API KEY`. You can generate this API
   key on the `NGC website`_.

    .. _NGC website: https://ngc.nvidia.com/setup
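
Before moving on, it can help to confirm that the tools installed in the steps above are
available in the current session. The following is a minimal sketch. The function names are
hypothetical, and the list of tools checked is an assumption based on the packages installed
above.

.. code:: bash

    # Sketch: report whether the tools installed in the steps above are available.

    have_cmd() {
        command -v "$1" >/dev/null 2>&1
    }

    in_group() {
        # True if the current user belongs to the named group.
        id -nG | tr ' ' '\n' | grep -qx "$1"
    }

    preflight() {
        local status=0 tool
        for tool in nvidia-smi docker pip3 unzip; do
            if have_cmd "$tool"; then
                echo "ok:      $tool"
            else
                echo "missing: $tool"
                status=1
            fi
        done
        if in_group docker; then
            echo "ok:      docker group membership"
        else
            echo "missing: docker group membership (log out and back in after usermod)"
            status=1
        fi
        return "$status"
    }

Running :code:`preflight` prints one line per check and returns a non-zero status if anything is missing.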

Installing the Prerequisites for TLT
------------------------------------

1. Upgrade :code:`pip` to the latest version:

    .. code::

        pip3 install --upgrade pip

2. Install the virtualenv wrapper:

    .. code:: bash

        pip3 install virtualenvwrapper

3. Configure the virtualenv wrapper:

    .. code:: bash

        export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
        export WORKON_HOME=$HOME/.virtualenvs
        export PATH=$HOME/.local/bin:$PATH
        source $HOME/.local/bin/virtualenvwrapper.sh

    .. Note::
        You may also add these commands to the :code:`~/.bashrc` of the VM to retain them for multiple
        sessions.

4. Create a virtualenv for the launcher using the following command:

    .. code:: bash

        mkvirtualenv -p /usr/bin/python3 launcher

    .. Note::
        You only need to create a virtualenv once in the instance. When you restart the instance, simply run the commands in step 3
        and invoke the same virtualenv using the command below:

        .. code:: bash

            workon launcher

5. Install JupyterLab in the virtualenv using the command below:

    .. code:: bash

        pip3 install jupyterlab
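
The note in step 3 suggests adding the virtualenvwrapper settings to :code:`~/.bashrc`. One way
to do that without duplicating lines on repeated runs is sketched below. The helper names are
hypothetical, and the sketch uses :code:`$HOME` for the home directory, which assumes that
:code:`pip3` installed virtualenvwrapper under :code:`$HOME/.local/bin`.

.. code:: bash

    # Sketch: append a line to a file only if it is not already present.

    append_once() {
        local file="$1" line="$2"
        touch "$file"
        grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
    }

    persist_virtualenvwrapper() {
        local rc="${1:-$HOME/.bashrc}"
        # Single quotes keep $HOME and $PATH unexpanded until the file is sourced.
        append_once "$rc" 'export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3'
        append_once "$rc" 'export WORKON_HOME=$HOME/.virtualenvs'
        append_once "$rc" 'export PATH=$HOME/.local/bin:$PATH'
        append_once "$rc" 'source $HOME/.local/bin/virtualenvwrapper.sh'
    }

Calling :code:`persist_virtualenvwrapper` more than once leaves a single copy of each line in the file.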

Downloading and Running Test Samples
------------------------------------

Now that you have created a virtualenv and installed the dependencies, you are ready to download and run
the TLT samples in a notebook. The instructions below assume that you are running the `TLT Computer
Vision <https://ngc.nvidia.com/catalog/resources/nvidia:tlt_cv_samples>`_ samples. For Conversational AI samples,
refer to the sample notebooks in :ref:`this section <conv_ai_samples>`.

1. Download and unzip the notebooks from NGC using the commands below:

    .. code:: bash

        wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tlt_cv_samples/versions/v1.1.0/zip -O tlt_cv_samples_v1.1.0.zip
        unzip -u tlt_cv_samples_v1.1.0.zip -d ./tlt_cv_samples_v1.1.0 && cd ./tlt_cv_samples_v1.1.0

2. Launch the jupyter notebook using the command below:

    .. code:: bash

        jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root --NotebookApp.token=<notebook_token>

    This starts the Jupyter notebook server on the VM. To access the server, navigate to :code:`http://<dns_name>:8888/`
    and enter the :code:`<notebook_token>` used to start the notebook server when prompted. Here, :code:`dns_name` is the
    external IP address of the VM that you noted earlier. Make sure that a firewall rule on the VM's network allows
    inbound TCP traffic on port 8888.
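
The :code:`<notebook_token>` placeholder in the launch command can be any string you choose. The
following is a minimal sketch of one way to generate a random token and build the URL to open in
a browser. The helper names are hypothetical, and reading from :code:`/dev/urandom` is an
assumption about the VM's operating system.

.. code:: bash

    # Sketch: generate a notebook token and build the URL to open in a browser.

    new_token() {
        # 16 random bytes rendered as 32 hexadecimal characters.
        head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
    }

    notebook_url() {
        local host="$1" port="${2:-8888}"
        printf 'http://%s:%s/\n' "$host" "$port"
    }

    # Example (commented out so that sourcing this file has no side effects):
    #   token="$(new_token)"
    #   jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root --NotebookApp.token="$token"
    #   notebook_url <dns_name> 8888

Keep the generated token private, because anyone who can reach the port and knows the token can use the notebook server.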