Installation
Using Clara requires the following:
Operating system
Clara Train requires Linux. It was designed on Ubuntu, and Windows is not a supported platform.
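As a quick sanity check before installing, the host platform can be inspected from a terminal. The following is a minimal sketch, not part of Clara Train: `is_supported_kernel` is a hypothetical helper name, and the check only looks at the kernel name reported by `uname -s`.

```shell
# Hypothetical helper: classify a kernel name as reported by `uname -s`.
is_supported_kernel() {
  case "$1" in
    Linux) echo "supported" ;;
    *)     echo "unsupported" ;;
  esac
}

kernel="$(uname -s)"
echo "Kernel: $kernel ($(is_supported_kernel "$kernel"))"

# Most Linux distributions describe themselves in /etc/os-release.
# Source it in a subshell so its variables do not leak into this shell.
if [ -r /etc/os-release ]; then
  ( . /etc/os-release; echo "Distribution: ${PRETTY_NAME:-unknown}" )
fi
```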
Driver requirements
Clara 4.1 is based on the NVIDIA container for PyTorch, release 21.10: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel_21-10.html#rel_21-10.
Details about the contents of the base container, along with its GPU and driver requirements, can be found at the link above.
If you are using a DGX system, you can follow the container preparation guide: https://docs.nvidia.com/deeplearning/frameworks/preparing-containers/index.html.
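To compare a machine against the driver requirements in the release notes, it can help to print what is currently installed. The sketch below assumes `docker` and `nvidia-smi` are the tools of interest; `have_cmd` is a hypothetical helper, and the script only reports what it finds without judging whether the version is sufficient.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

for tool in docker nvidia-smi; do
  if have_cmd "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: not found"
  fi
done

# Print GPU model and driver version so they can be checked by hand
# against the requirements listed in the base container release notes.
if have_cmd nvidia-smi; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
    || echo "nvidia-smi is installed but could not query the driver"
fi
```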
Download the Docker image using these commands:
export dockerImage=nvcr.io/nvidia/clara-train-sdk:v4.1
docker pull $dockerImage
Once the image is downloaded, start a container using this command:
docker run -it --rm --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 --ipc=host --net=host --mount type=bind,source=/your/dataset/location,target=/workspace/data $dockerImage /bin/bash
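The command above is long, so the sketch below spells out the same options one per line with a note on each. `clara_run_cmd` is a hypothetical helper that only prints the command rather than running it, and it assumes the dataset path contains no spaces.

```shell
# Image tag from the pull step; falls back to the tag used in this guide.
dockerImage="${dockerImage:-nvcr.io/nvidia/clara-train-sdk:v4.1}"

# Hypothetical helper: print the docker run command for a dataset directory.
#   -it --rm                  interactive terminal; remove container on exit
#   --shm-size=1G             size of /dev/shm inside the container
#   --ulimit memlock=-1       no limit on locked memory
#   --ulimit stack=67108864   64 MiB stack size limit
#   --ipc=host, --net=host    share the host IPC namespace and network
#   --mount type=bind,...     expose the host dataset at /workspace/data
clara_run_cmd() {
  data_dir="$1"
  printf '%s ' \
    "docker run" \
    "-it --rm" \
    "--shm-size=1G" \
    "--ulimit memlock=-1" \
    "--ulimit stack=67108864" \
    "--ipc=host" \
    "--net=host" \
    "--mount type=bind,source=${data_dir},target=/workspace/data" \
    "$dockerImage" "/bin/bash"
  printf '\n'
}

# Preview the command for the placeholder path used in this guide.
clara_run_cmd /your/dataset/location
```

To start the container, copy the printed line into a terminal after replacing the placeholder path.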
By default, the container starts in the /opt/nvidia folder. To access local directories from within the container, mount them when starting it.
To mount a directory, use the -v <source_dir>:<mount_dir> option. Here is an example:
docker run --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 -it --rm -v /home/<username>/clara-experiments:/workspace/clara-experiments $dockerImage /bin/bash
This mounts the /home/<username>/clara-experiments directory on your disk to /workspace/clara-experiments in the container.
More information about mounting directories can be found in the Docker documentation.
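One detail worth guarding against: when the source path given to -v does not exist on the host, Docker creates it as an empty directory instead of reporting an error, so a typo can silently mount nothing. The sketch below checks the path first. `mount_flag` is a hypothetical helper, and `exp_dir` is an assumed variable that defaults to the current directory for illustration.

```shell
# Hypothetical helper: print a -v option only if the host directory exists.
mount_flag() {
  src="$1"   # directory on the host
  dst="$2"   # mount point inside the container
  if [ ! -d "$src" ]; then
    echo "error: $src is not a directory" >&2
    return 1
  fi
  printf '%s %s:%s\n' -v "$src" "$dst"
}

# Assumed example location; replace with your own experiments directory.
exp_dir="${exp_dir:-$PWD}"
mount_flag "$exp_dir" /workspace/clara-experiments
```

The printed option can be pasted into the docker run command in place of the -v argument shown earlier.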
If you are on a network that uses a proxy server to connect to the Internet, you can provide proxy server details when launching the container.
docker run -it --rm -e HTTPS_PROXY=https_proxy_server_ip:https_proxy_server_port -e HTTP_PROXY=http_proxy_server_ip:http_proxy_server_port --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 $dockerImage /bin/bash
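If the proxy settings are already exported in the host shell, they can be forwarded rather than typed again. This is a minimal sketch under that assumption: `proxy_flags` is a hypothetical helper that emits a -e option only for variables that are actually set, and it prints the resulting command instead of running it.

```shell
# Hypothetical helper: build -e options from proxy variables set on the host.
proxy_flags() {
  flags=""
  if [ -n "${HTTPS_PROXY:-}" ]; then
    flags="$flags -e HTTPS_PROXY=$HTTPS_PROXY"
  fi
  if [ -n "${HTTP_PROXY:-}" ]; then
    flags="$flags -e HTTP_PROXY=$HTTP_PROXY"
  fi
  # Drop the leading space left by the concatenation above.
  echo "${flags# }"
}

dockerImage="${dockerImage:-nvcr.io/nvidia/clara-train-sdk:v4.1}"

# Preview the command; nothing is executed here.
echo "docker run -it --rm $(proxy_flags) --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 $dockerImage /bin/bash"
```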
For GPU isolation in the container, use the --gpus= option with the latest Docker release:
docker run -it --rm --gpus=1 --shm-size=1G --ulimit memlock=-1 --ulimit stack=67108864 $dockerImage /bin/bash
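Note that a bare number such as --gpus=1 requests a count of GPUs, not the GPU with index 1. Docker also accepts `all` and an explicit device list. The sketch below prints the three forms. `gpus_flag` is a hypothetical helper, and the device indices are example values.

```shell
# Hypothetical helper: print a --gpus option for a given selection mode.
#   gpus_flag all            every GPU on the host
#   gpus_flag count 2        any two GPUs
#   gpus_flag devices 0,2    the GPUs with indices 0 and 2
gpus_flag() {
  case "$1" in
    all)     echo "--gpus all" ;;
    count)   echo "--gpus $2" ;;
    # A device list needs the nested quotes so the comma survives parsing.
    devices) printf '%s\n' "--gpus '\"device=$2\"'" ;;
    *)       echo "usage: gpus_flag all|count N|devices LIST" >&2; return 1 ;;
  esac
}

gpus_flag all
gpus_flag count 2
gpus_flag devices 0,2
```

Inside the container, running `nvidia-smi -L` lists the GPUs that are visible, which is a quick way to confirm the selection took effect.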