Abstract

This guide helps you resolve usage issues or bugs that you might encounter while using DIGITS.

1. Overview Of DIGITS

DIGITS (the Deep Learning GPU Training System) is a web application for training deep learning models. The currently supported frameworks are Caffe and TensorFlow. DIGITS puts the power of deep learning into the hands of engineers and data scientists.

DIGITS is not a framework. It is a wrapper for Caffe and TensorFlow that provides a graphical web interface to those frameworks, so you do not have to work with them directly on the command line.

DIGITS can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation, object detection tasks, and more. DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best-performing model from the results browser for deployment. DIGITS is completely interactive so that data scientists can focus on designing and training networks rather than programming and debugging.

DIGITS is available through multiple channels such as:
  • GitHub download
  • NVIDIA’s Docker repository, nvcr.io

2. Configuration

DIGITS uses environment variables for configuration.

Note: Prior to #1091 (up to DIGITS 4.0), DIGITS used configuration files instead of environment variables.
Note: DIGITS is not designed to be run as an exposed external web service.

2.1. Environment variables

This table defines the environment variables DIGITS uses for configuration.

Variable | Example value | Description
DIGITS_JOBS_DIR | ~/digits-jobs | Location where job files are stored. Default is $DIGITS_ROOT/digits/jobs.
CAFFE_ROOT | ~/caffe | Path to your local Caffe build. Should contain build/tools/caffe and python/caffe/. If unset, DIGITS looks for caffe in PATH and PYTHONPATH.
TORCH_ROOT | ~/torch | Path to your local Torch build. Should contain install/bin/th. If unset, DIGITS looks for th in PATH.
DIGITS_LOGFILE_FILENAME | ~/digits.log | File for saving log messages. Default is $DIGITS_ROOT/digits/digits.log.
DIGITS_LOGFILE_LEVEL | DEBUG | Minimum log message level to be saved (DEBUG/INFO/WARNING/ERROR/CRITICAL). Default is INFO.
DIGITS_SERVER_NAME | The Big One | The name of the server (accessible in the UI under "Info"). Default is the system hostname.
DIGITS_MODEL_STORE_URL | http://localhost/modelstore | A comma-separated list of URLs. Default is the official NVIDIA store.
DIGITS_URL_PREFIX | /custom-prefix | A path to prepend to every URL. Sets the home page to be at "http://localhost/custom-prefix" instead of "http://localhost/".
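
As a minimal sketch of how these variables might be set for a manually started server, the following snippet exports a few of them using the example values from the table. The launch command in the final comment (./digits-devserver) is an assumption about a source checkout and is not described in this guide; adapt it to however you start DIGITS.

```shell
#!/bin/sh
# Example values come from the table above; adjust them for your system.
# "$HOME" is used instead of "~" because "~" is not expanded inside quotes.
export DIGITS_JOBS_DIR="$HOME/digits-jobs"          # where job files are stored
export DIGITS_LOGFILE_FILENAME="$HOME/digits.log"   # log file location
export DIGITS_LOGFILE_LEVEL="DEBUG"                 # DEBUG/INFO/WARNING/ERROR/CRITICAL
export DIGITS_SERVER_NAME="The Big One"             # shown in the UI under "Info"
export DIGITS_URL_PREFIX="/custom-prefix"           # home page moves under this prefix

# Show what a process started from this shell would inherit.
env | grep '^DIGITS_' | sort

# Assumption: a source checkout is then started from this same shell, e.g.:
#   ./digits-devserver
```

Variables exported this way only affect processes started from the same shell. A server installed as a system service reads its environment from the service definition instead.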

3. Installation and Usage Issues

For questions involving the installation or usage of DIGITS, see the following sections.

3.1. Configuration

If you have another server running on port 80 already, you may need to reconfigure DIGITS to use a different port.

sudo dpkg-reconfigure digits
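
Before reconfiguring, it can help to confirm that something else really is listening on port 80. The sketch below assumes the ss utility (from iproute2) is installed; listening_on is a hypothetical helper name introduced here for illustration and is not part of DIGITS.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds (exit 0)
# if any listening TCP socket uses the given local port.
listening_on() {
    awk -v port="$1" '
        NR > 1 {                        # skip the header row
            n = split($4, parts, ":")   # column 4 is "Local Address:Port"
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage (assumes `ss` is available):
#   if ss -ltn | listening_on 80; then
#       echo "Port 80 is already in use; consider moving DIGITS to another port."
#   fi
```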

All other configuration is done with environment variables. See Configuration.md for detailed information about which variables you can change.
  • Ubuntu 14.04
    • Edit /etc/init/digits.conf
    • Add/remove/edit lines that start with env
    • Restart with sudo service digits restart
  • Ubuntu 16.04
    • Edit /lib/systemd/system/digits.service
    • Add/remove/edit lines that start with Environment= in the [Service] section
    • Restart with sudo systemctl daemon-reload && sudo systemctl restart digits
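
To review what is currently configured before editing, a short sketch like the following prints the environment entries from a service definition. show_env_lines is a hypothetical helper introduced here; the file paths in the usage comments are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the environment entries of a service definition.
# Matches "Environment=" lines (systemd unit) and "env " lines (Upstart job).
show_env_lines() {
    grep -E '^[[:space:]]*(Environment=|env[[:space:]])' "$1"
}

# Usage:
#   show_env_lines /lib/systemd/system/digits.service   # Ubuntu 16.04
#   show_env_lines /etc/init/digits.conf                # Ubuntu 14.04
```

On Ubuntu 16.04, for example, an entry raising the log level would take the form Environment=DIGITS_LOGFILE_LEVEL=DEBUG in the [Service] section; the Upstart equivalent on Ubuntu 14.04 is env DIGITS_LOGFILE_LEVEL=DEBUG.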

3.2. Driver installations

If you try to install a new driver while the DIGITS server is running, you'll get an error about CUDA being in use. Shut down the server before installing a driver, and then restart it afterwards.

  • Ubuntu 14.04:
    sudo service digits stop
    # (install driver)
    sudo service digits start
    
  • Ubuntu 16.04:
    sudo systemctl stop digits
    # (install driver)
    sudo systemctl start digits
    

3.3. Permissions

The DIGITS server runs as www-data, so keep in mind that prebuilt LMDB datasets used for generic models need to be readable by the www-data user. In particular, the entire chain of directories from / to your data must be readable by www-data.
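
One way to check the whole chain is to list every directory from / down to the dataset and test each one as the www-data user. The following is a sketch under stated assumptions: path_chain is a hypothetical helper, /data/lmdb/train is a placeholder path, and the usage comment assumes sudo is available.

```shell
#!/bin/sh
# Hypothetical helper: print every directory from / down to an absolute path,
# one per line, so each link in the chain can be checked in turn.
path_chain() {
    case $1 in
        /*) ;;
        *) echo "path_chain: need an absolute path" >&2; return 1 ;;
    esac
    _p=$1
    _chain=$_p
    while [ "$_p" != "/" ]; do
        _p=$(dirname "$_p")
        # Prepend the parent so the output runs from / downwards.
        _chain="$_p
$_chain"
    done
    printf '%s\n' "$_chain"
}

# Usage (placeholder path; assumes sudo is available):
#   path_chain /data/lmdb/train | while IFS= read -r p; do
#       sudo -u www-data test -r "$p" </dev/null || echo "www-data cannot read: $p"
#   done
```

Traversing a directory also requires the execute (search) permission, so adding a test -x check for the directory components may catch cases that test -r alone misses.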

3.4. Torch and cusparse

At least one Torch package is missing a required dependency on cusparse. The symptom is this error:

/usr/share/lua/5.1/cunn/THCUNN.lua:7: libcusparse.so.7.5: cannot open shared object file: No such file or directory

The simplest fix is to manually install the missing library:

sudo apt-get install cuda-cusparse-7-5
sudo ldconfig

3.5. Torch and HDF5

At least one Torch package is missing a required dependency on libhdf5-dev. The symptom is this error:

ERROR: /usr/share/lua/5.1/trepl/init.lua:384: /usr/share/lua/5.1/trepl/init.lua:384: /usr/share/lua/5.1/hdf5/ffi.lua:29: libhdf5.so: cannot open shared object file: No such file or directory

The simplest fix is to manually install the missing library:

sudo apt-get install libhdf5-dev
sudo ldconfig
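
For either of the missing-library errors in sections 3.4 and 3.5, the dynamic linker cache can be inspected to confirm whether the library is visible before and after the fix. This is a minimal sketch: lib_in_cache is a hypothetical helper, and the library names are taken from the error messages above. It compares the exact library name, so an installed libhdf5.so.10 does not count as libhdf5.so.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldconfig -p` output on stdin and succeeds if the
# exact library name appears as the first field of a cache entry.
lib_in_cache() {
    awk -v lib="$1" '$1 == lib { found = 1 } END { exit !found }'
}

# Usage:
#   ldconfig -p | lib_in_cache libcusparse.so.7.5 || echo "libcusparse.so.7.5 not in cache"
#   ldconfig -p | lib_in_cache libhdf5.so         || echo "libhdf5.so not in cache"
```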

3.6. Other

If you run into an issue not addressed here, try searching through the GitHub issues and/or the user group.

4. Bug and feature requests

To file a new issue, go to: https://github.com/NVIDIA/DIGITS/issues/new

To contribute by opening a pull request, see: https://help.github.com/articles/about-pull-requests/

Note: You will need to send a signed copy of the Contributor License Agreement to digits@nvidia.com before your change can be accepted.

5. Support

For the latest Release Notes, see the DIGITS Release Notes Documentation website (http://docs.nvidia.com/deeplearning/digits/digits-release-notes/index.html).

For more information about DIGITS, see:
Note: There may be slight variations between the NVIDIA-docker images and this image.

Notices

Notice

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, DGX, DGX-1, DGX-2, and DGX Station are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.