Authenticating Users

To make the DGX useful, users must be added to the system so that they can be authenticated. This is generally referred to as user authentication. There are several ways to accomplish this, and each method has its own pros and cons.

Local

The first way is to create users directly on the DGX system with the useradd command. For example, to add a user named dgxuser, run the following command.
$ sudo useradd -m -s /bin/bash dgxuser
Here, -s sets the default shell for the user and -m creates the user’s home directory. After creating the user, add them to the docker group on the DGX.
$ sudo usermod -aG docker dgxuser

This adds the user dgxuser to the group docker. Any user that runs Docker containers has to be a member of this group.
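The two commands above can be wrapped in a small helper when several accounts need to be created. The following is only a sketch: the function names run and add_dgx_user and the DRY_RUN variable are illustrative and not part of any DGX tooling. By default it prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: create a local user and add it to the docker group.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# add_dgx_user is a hypothetical helper name, not an NVIDIA-provided command.
add_dgx_user() {
    local user="$1"
    run sudo useradd -m -s /bin/bash "$user"
    run sudo usermod -aG docker "$user"
}

add_dgx_user dgxuser
```

After running it with DRY_RUN=0, the group membership can be checked with id -nG dgxuser, which should list docker among the user's groups.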

Local authentication on the DGX is simple but not without its issues. First, an OS upgrade on the DGX has occasionally required reformatting all the drives in the appliance. If this happens, you must first make sure all user data is copied somewhere off the DGX before the upgrade. Second, after the upgrade you will have to recreate the users, add them to the docker group, and copy their home data back to the DGX. This adds work and time to upgrading the system.
Important: While the two 960 GB NVMe SSDs on the DGX-2, meant for the OS partition, are in a RAID-1 configuration, the OS drive on the DGX-1 and DGX Station has no RAID-1. Hence, if the OS drive fails on the DGX-1 or the DGX Station, you will lose all the users and everything in the /home directories. It is therefore highly recommended that you back up the pertinent files on the DGX system as well as /home for the users.
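One possible way to take such a backup is a dated tar archive written to storage outside the DGX. This is a minimal sketch under stated assumptions: the function name backup_tree and the destination path in the usage comment are hypothetical, and the destination is assumed to be a mounted location that is not on the OS drive.

```shell
#!/usr/bin/env bash
# Sketch: archive a directory tree (for example /home) into a dated tar.gz file.
# backup_tree is a hypothetical helper name.
backup_tree() {
    local src="$1" dest_dir="$2"
    local name
    name="$(basename "$src")-$(date +%Y%m%d).tar.gz"
    mkdir -p "$dest_dir"
    # -C makes the paths inside the archive relative to the parent of $src.
    tar -czf "$dest_dir/$name" -C "$(dirname "$src")" "$(basename "$src")"
    # Print the archive path so the caller can log or verify it.
    echo "$dest_dir/$name"
}

# Example (run as root so every home directory is readable;
# /mnt/backup is an assumed mount point on external storage):
#   backup_tree /home /mnt/backup
```

The contents of the resulting archive can be listed with tar -tzf before an upgrade to confirm that the expected home directories are present.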

NIS or NIS+

Another authentication option is to use NIS or NIS+. In this case, the DGX would be a client in the NIS/NIS+ configuration. As with local authentication, there is the possibility that the OS drive in the DGX could be overwritten during an upgrade (not all upgrades reformat the drives, but it’s possible). This means that the administrator may have to redo the NIS/NIS+ configuration on the DGX.

Also, remember that the DGX-1 and DGX Station have a single OS drive. If this drive fails, the administrator will have to redo the NIS/NIS+ configuration. Backups are therefore encouraged, even for DGX-2 systems, which have two OS drives in a RAID-1 configuration.
Note: In the unlikely event that technical support for the DGX is needed, the NVIDIA engineers may require the administrator to disconnect the DGX from the NIS/NIS+ server.

LDAP

A third option for authentication is LDAP (Lightweight Directory Access Protocol). It has become very popular in the clustering world, particularly for Linux. You can configure LDAP on the DGX for user information and authentication from an LDAP server. However, as with NIS, there are possible repercussions.
CAUTION:
  • The first is that the OS drive is a single drive on the DGX-1 and DGX Station. If the drive fails, you will have to rebuild the LDAP configuration (backups are highly recommended).
  • The second is that, as previously mentioned, in the unlikely event that you need technical support, you may be asked to disconnect the DGX system from the LDAP server so that the system can be triaged.
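Because the DGX may have to be disconnected from the directory server for triage, it can help to know which accounts exist locally and which are resolved only through the directory service. The sketch below is one way to check this. The function name account_source is hypothetical. It relies on getent, which queries the name service sources configured in /etc/nsswitch.conf, so "directory" here simply means "resolved by some source other than the local passwd file".

```shell
#!/usr/bin/env bash
# Sketch: report whether an account is defined locally or only via a directory service.
# account_source is a hypothetical helper name.
account_source() {
    local user="$1" passwd_file="${2:-/etc/passwd}"
    # Exact match on the first colon-separated field of the local passwd file.
    if awk -F: -v u="$user" '$1 == u { found = 1 } END { exit !found }' "$passwd_file"; then
        echo "local"
    # Not in the local file, but resolvable through NSS (for example NIS, LDAP, or AD).
    elif getent passwd "$user" >/dev/null 2>&1; then
        echo "directory"
    else
        echo "missing"
    fi
}

# Example:
#   account_source dgxuser
```

Accounts reported as "directory" will not be able to log in while the system is disconnected from the server, so a local administrative account is worth keeping available.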

Active Directory

One other option for user authentication is connecting the DGX system to an Active Directory (AD) server. This may require the system administrator to install some extra tools on the DGX. The two cautions mentioned earlier therefore apply here as well: the single OS drive may be reformatted for an upgrade, or it may fail (again, backups are highly recommended). In addition, in the unlikely case that you need to involve NVIDIA technical support, you may be asked to take the system off the AD network and remove any added software.