Installation and Setup

Registering Your DGX-1

Be sure to register your DGX-1 with NVIDIA as soon as you receive your purchase confirmation e-mail. Registration enables your hardware warranty and allows you to set up an NVIDIA DGX Container Registry account.

To register your DGX-1, you will need information provided in your purchase confirmation e-mail. If you do not have the information, send an e-mail to NVIDIA Enterprise Support at enterprisesupport@nvidia.com.

  1. From a browser, go to the NVIDIA DGX Product Registration page.
  2. Enter all required information and then click SUBMIT to complete the registration process and receive all warranty entitlements and, if applicable, DGX-1 support services entitlements.

Obtaining Software and Software Updates

You must register your DGX-1 to receive software updates. Once registered, you will receive an e-mail notification whenever a new software update is available. You can access software update instructions as well as software downloads through the Enterprise Support site as follows:

  • From your browser, go to NVIDIA Enterprise Services, and log in.
  • Click the Announcements tab, which contains download links and supplemental documentation.
  • Refer to the DGX OS Server Software Release Notes for instructions on how to perform a software update.

Choosing a Setup Location / Site Preparation

Decide on a suitable location for setting up and operating the DGX-1. The location should be clean, dust-free, and well ventilated.

General Conditions

  • Prepare a sufficiently wide aisle to accommodate the unboxed chassis (chassis dimensions: 5.16"H x 17.5"W x 34.1"D).
  • The rack must accommodate a 134 lb, 3U rack mount system (chassis dimensions: 5.16"H x 17.5"W x 34.1"D).
  • The rack must have square mounting holes.
  • Leave 36" (91.4 cm) of clearance in front of the rack so that you can install the unit into the rack.
  • Leave approximately 30" (76.2 cm) of clearance in the back of the rack to allow for sufficient airflow and ease in servicing.
  • Always make sure the rack is secured and stable before adding or removing the appliance or any other component.
  • Prepare adequate sound-proofing: The equipment fans can generate 72-100 dBA.

Environmental Conditions

  • Operating environment
    • Temperature: 5 °C to 35 °C (41 °F to 95 °F)
    • Relative humidity: 20% to 85% noncondensing
  • Air flow
    • The chassis fans can produce a maximum of 340 CFM of air flow.
    • Do not block the ventilation areas at the front and rear of the chassis.
    • Minimize any restrictions on air flow around the chassis.

Connections

  • Power:
    • The DGX-1 is powered through four 1600W power supply units, each rated at 200-240VAC, 8A, 50/60 Hz. Total system power: 3200W
    • C13/C14 cables provided for each power supply to connect to a compatible PDU.
  • Network: Dual 10GBASE-T RJ45 connection
  • IPMI: 10/100BASE-T RJ45 connection
  • InfiniBand: Qty 4 - QSFP28 ports, InfiniBand and Ethernet compliant
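
As a rough planning aid for PDU sizing, the current drawn by a power supply can be estimated from the ratings above as I = P / V. The following sketch is an illustration only: it ignores power supply efficiency and power factor, the amps helper is a hypothetical name, and the equal sharing of the 3200W system load across four supplies is an assumption rather than documented behavior.

```shell
# amps WATTS VOLTS
# Print the current in amperes, to two decimals, using I = P / V.
amps() {
    awk -v w="$1" -v v="$2" 'BEGIN { printf "%.2f\n", w / v }'
}

amps 1600 200   # one supply at its full 1600W rating, 200VAC; prints 8.00
amps 1600 240   # the same supply at 240VAC; prints 6.67
amps 800 200    # ASSUMPTION: 3200W shared equally by four supplies; prints 4.00
```

The first result agrees with the 8A rating listed for each power supply at the low end of the input voltage range.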

Preparing for Network Access

  • The IPMI port and Ethernet ports can be connected to your local LAN.

    These ports are configured for DHCP by default.

    • To use DHCP, connect the port to a local DHCP server which should provide an IP address and assign a DNS configuration to the DGX-1.
    • If DHCP is not available, then you will need to set up a static IP for each Ethernet port.
  • NVIDIA recommends that customers follow best security practices for BMC management (IPMI port). These include, but are not limited to, such measures as:
    • Restricting the DGX-1 IPMI port to an isolated, dedicated, management network
    • Using a separate, firewalled subnet
    • Configuring a separate VLAN for BMC traffic if a dedicated network is not available
  • If you will be operating the DGX-1 in cloud-managed mode, then
    • Make sure that DNS is enabled
    • Make sure that the ports listed in the following table are open and available on your firewall to the DGX-1:
    Port (Protocol)   Direction          Use
    53 (UDP)          Outbound           DNS
    80 (TCP)          Outbound           HTTP, package updates
    123 (UDP)         Outbound/Inbound   NTP client
    443 (TCP)         Inbound/Outbound   For internet (HTTP/HTTPS) connection to DGX-1 Cloud Services. If port 443 is proxied through a corporate firewall, then WebSocket protocol traffic must be supported.
    2376 (TCP)        Inbound            For interacting with running containers using attach/exec commands
  • If you will be using the DGX-1 in Base OS mode, make sure your network can connect to the following:

    If access to those URLs requires use of a proxy, refer to Setting Up a System Proxy for setup instructions.
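
Before operating in cloud-managed mode, it can help to confirm from a machine on the same network segment that the outbound TCP ports in the table above are not blocked. The following is a minimal sketch, not an NVIDIA-provided tool. The function names tcp_open and report_ports are hypothetical, the host to probe is left for you to supply, and the sketch relies on the bash /dev/tcp feature and the coreutils timeout command. UDP ports 53 and 123 and inbound port 2376 are not covered, because they cannot be verified with a simple outbound TCP connection.

```shell
# Outbound TCP ports taken from the cloud-managed mode port table.
OUTBOUND_TCP_PORTS="80 443"

# tcp_open HOST PORT [TIMEOUT_SECONDS]
# Succeed if a TCP connection to HOST:PORT can be opened within the timeout.
# The connection is attempted in a child bash process through /dev/tcp.
tcp_open() {
    local host=$1 port=$2 secs=${3:-5}
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# report_ports HOST
# Print one line per port: "<port> open" or "<port> blocked".
report_ports() {
    local host=$1 port
    for port in $OUTBOUND_TCP_PORTS; do
        if tcp_open "$host" "$port"; then
            echo "$port open"
        else
            echo "$port blocked"
        fi
    done
}
```

To use the sketch, call report_ports with a host name that you expect to be reachable through your firewall. A "blocked" result can also mean that the chosen host does not listen on that port, so treat it as a prompt to investigate rather than as proof of a firewall problem.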

Unpacking the DGX-1

  1. Remove the shrinkwrap.
  2. Collapse the yellow "Do not stack" cone, if included.
  3. Open the main DGX-1 box, then remove the accessory and rail kit boxes.

    CAUTION: At least four people, or a mechanical assist, are required to remove the DGX-1 from the box. To reduce the risk of personal injury or damage to the equipment, always observe local occupational health and safety requirements and guidelines for material handling.

    DO NOT use the handles at the front of the DGX-1 to lift the unit. The handles are designed for sliding the unit out of a rack, and not for carrying the full weight of the DGX-1.

  4. Remove the protective plastic sheet from the top of the DGX-1.
  5. Preserve and retain packaging.
  6. Be sure to inspect each piece of equipment shipped in the packing box. If anything is missing or damaged, contact your supplier.

What's In the Box

The NVIDIA DGX-1 shipping box includes the following:

  • NVIDIA DGX-1
  • Bezel
  • Rail hardware kit
  • Accessory Box
    • AC Power Cables (qty 4 – IEC 60320 C13/14, compatible with data center PDUs)
    • Hard disk bay screws
    • Toxic Substance Notice & Safety Instructions
    • Quick Start Guide
    • DVD containing source files for open source software
Note: The four power cables included in the box are not optional. All power cables are necessary and must be plugged into individual 10 A capable sockets for optimal DGX-1 operation. Failure to do so can result in a reduction in power redundancy, a reduction in performance, or a complete system failure.

Installing the DGX-1 Into a Rack

CAUTION: To prevent bodily injury when mounting or servicing the DGX-1 in a rack, you must take special precautions to ensure that the system remains stable. The following guidelines are provided to ensure your safety.

• The DGX-1 should be mounted at the bottom of the rack if it is the only unit in the rack.

• When mounting the DGX-1 in a partially filled rack, load the rack from the bottom to the top with the heaviest component at the bottom of the rack.

• If the rack is provided with stabilizing devices, install the stabilizers before mounting or servicing the DGX-1 in the rack.

• The DGX-1 weighs approximately 134 lbs, so an equipment lift is required to safely lift the unit and then accurately align the chassis rails with the rack rails.

DO NOT use the handles at the front of the DGX-1 to lift the unit. The handles are designed for sliding the unit out of a rack, and not for carrying the full weight of the DGX-1.

Installing the Rails

Note: The rail assemblies shipped with the appliance fit into a standard 19" rack between 26 and 33.5 inches deep (66 cm to 85 cm). The outer rail is adjustable from approximately 23.5" to 34" (59.7 cm to 86.4 cm).
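
If you are checking several racks against the depth range in the note above, the comparison and the unit conversion can be scripted. This is a small sketch under stated assumptions: rack_depth_ok and in_to_cm are hypothetical helper names, and the depth is the value you measure for your own rack in inches.

```shell
# rack_depth_ok DEPTH_INCHES
# Succeed if the rack depth is within the 26 to 33.5 inch range of the rails.
rack_depth_ok() {
    awk -v d="$1" 'BEGIN { exit !(d >= 26 && d <= 33.5) }'
}

# in_to_cm INCHES
# Convert inches to centimeters (1 inch = 2.54 cm), to one decimal.
in_to_cm() {
    awk -v i="$1" 'BEGIN { printf "%.1f\n", i * 2.54 }'
}

in_to_cm 26     # prints 66.0
in_to_cm 33.5   # prints 85.1
```

For example, rack_depth_ok 30 succeeds and rack_depth_ok 36 fails.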

Refer to the instructions in the rail packaging for details on installing the rails onto the rack and chassis.

The following are supplemental instructions:

  1. Use a Phillips screwdriver to assist in mounting the rails to the rack.
  2. If necessary, detach the inner rails from the outer slide rails.
  3. Follow any designations on the inner rail (or its outer rail mate) to determine the proper orientation and positioning to connect to the chassis, then secure to the chassis.

    IMPORTANT: Make sure that the reinforced hole at the front end of the rail is positioned on the bottom side of the rail, and that it aligns with the thumbscrew on the front of the DGX-1. If the hole is positioned on the top side, then the rail is on the wrong side of the DGX-1 and the DGX-1 will not fit properly in the rack.
  4. Follow any designations on the outer slide rail to determine front/back and left-side/right-side positioning against the rack.
  5. Secure the back of one of the slide rails to the rack, then extend the rail until it fits securely to the front of the rack.
  6. Secure the slide rail to the front of the rack.
  7. Repeat steps 4-6 for the other slide rail.

Mounting the DGX-1

CAUTION: Stability hazard. The rack stabilizing mechanism must be in place, or the rack must be bolted to the floor, before you slide the DGX-1 out for servicing. Failure to stabilize the rack can cause the rack to tip over.
  1. Confirm that the DGX-1 has the inner rails attached and that you have already mounted the outer rails into the rack.
  2. With the front of the unit facing away from the rack, use an equipment lift to assist in sliding the unit into the rack as follows:

    CAUTION: The DGX-1 weighs approximately 134 lbs, so an equipment lift is required to safely lift the unit and then accurately align the chassis rails with the rack rails.

    1. Align the inner chassis rails with the front of the outer rack rails.
    2. Slide the inner rails into the outer rails, keeping the pressure even on both sides (you may have to depress the locking tabs when inserting).

      When the DGX-1 has been pushed completely into the rack, you should hear the locking tabs "click" into the locked position.



  3. Lock the unit in place using the thumb screws located on the front of the unit.





Attaching the Bezel

The bezel is designed to attach easily to the front of the DGX-1.

  1. Prepare the DGX-1 by making sure that the power supply handles (located at the power supply fans) are flipped up.



  2. Move any other obstructions, such as cable ties, away from the outer edge of the DGX-1.
  3. With the bezel positioned so that the NVIDIA logo is visible from the front and is on the left hand side, line up the pins near the corners of the DGX-1 with the holes in back of the bezel, then gently press the bezel against the DGX-1.
    CAUTION: Be careful not to accidentally press the power button that is on the right edge of the DGX-1 when removing or installing the bezel.

    The bezel is held in place magnetically.

Connecting the Power Cables

  1. Open the accessory box and remove the four C13/C14 power cables.
  2. Use the cables to connect each of the four plugs at the right-rear of the DGX-1 to a PDU.



    1. Secure each cable to the DGX-1, using the power cable retention clips attached to the power plugs.
    2. Connect each cable to the PDU. Ensure that the cables are distributed over at least two circuits and, if using 3-phase PDUs, they are balanced across all phases as much as possible. Ideally, each cable should connect to a different PDU.
    3. Verify that each cable is firmly inserted into the PDU. There is usually a click to indicate full insertion.

Connecting the Network Cables

  1. Using an Ethernet cable, connect one of the dual Ethernet ports (em1 or em2) to your LAN for internet access to the NVIDIA Cloud Portal, remote access to launched application containers on the DGX-1, or to connect to the DGX-1 using SSH.



    The left-side/right-side ethernet port designation depends on the Base OS software version installed on the DGX-1 as listed in the table below.
    Ethernet Port Position Port Designation: Base OS Software 2.x and earlier Port Designation: Base OS Software 3.x and later
    Right Side em1 enp1s0f0
    Left Side em2 enp1s0f1
    Note: NVIDIA recommends connecting only one of the Ethernet ports to your LAN. If you are connecting both Ethernet ports, they must each be connected to separate networks. The DGX-1 is not configured from the factory to have multiple Ethernet interfaces on the same network.
  2. Using an Ethernet cable, connect the IPMI (BMC) port to your LAN for remote access to the baseboard management controller (BMC). Verify that all network cables are firmly inserted into the DGX-1 and the associated network switch.
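
The port designation table in step 1 can be expressed as a small lookup, which is convenient in scripts that must work across Base OS software versions. The following sketch assumes that the version string begins with a numeric major version, such as 2.1.1 or 3.1.1. The function name dgx_eth_names is hypothetical.

```shell
# dgx_eth_names VERSION
# Print the Ethernet interface names as "<right-side port> <left-side port>"
# for the given Base OS software version, following the table in step 1.
dgx_eth_names() {
    local major=${1%%.*}    # keep only the text before the first dot
    if [ "$major" -le 2 ]; then
        echo "em1 em2"                # Base OS software 2.x and earlier
    else
        echo "enp1s0f0 enp1s0f1"      # Base OS software 3.x and later
    fi
}

dgx_eth_names 2.1.1   # prints: em1 em2
dgx_eth_names 3.1.1   # prints: enp1s0f0 enp1s0f1
```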

Setting Up the DGX-1

These instructions describe the setup process that occurs the first time the DGX-1 is powered on after delivery. Be prepared to accept all EULAs and to set up your username and password.

  1. Connect a display to the VGA connector, and a keyboard to any of the USB ports.



    For best display results, use a monitor with a native resolution of 1024x768 or lower.
  2. Power on the DGX-1.



    The system will take a few minutes to boot.

    You may be presented with end user license agreements (EULAs) for the NVIDIA software at this point in the setup, depending on the DGX-1 software version. Accept all EULAs to proceed with the installation.

    You are prompted to configure the DGX-1 software.

  3. Perform the steps to configure the DGX-1 software.
    • Select your time zone and keyboard layout.
    • Create a user account with your name, username, and password.

      You will need these credentials to log in to the DGX-1 as well as to log in to the BMC remotely. When logging in to the BMC for the first time, enter your username as both the User ID and the password. Be sure to create a unique BMC password at the first opportunity.

      Note: The BMC software will not accept "sysadmin" for a user name. If you create this user name for the system log in, "sysadmin" will not be available for logging in to the BMC.
    • Choose a primary network interface for the DGX-1.
      Note: After you select the primary network interface, the system attempts to configure the interface for DHCP and then asks you to enter a hostname for the system. If DHCP is not available, you will have the option to configure the network manually. If you need to configure a static IP address on a network interface connected to a DHCP network, select Cancel at the Network configuration – Please enter the hostname for the system screen. The system will then present a screen with the option to configure the network manually.
    • Choose a host name for the DGX-1.
    • Choose to install predefined software.

      Press the space bar to select or deselect the software to install.

      Note: By default, the DGX-1 installs only minimal software packages necessary to ensure system functionality. You can deselect the OpenSSH package; however, NVIDIA recommends that you keep this package selected, and uninstall it only if required by your IT security policy.
  4. Select OK to continue. You may be presented with end user license agreements (EULAs) for the NVIDIA software at this point in the setup, depending on the DGX-1 software version. Accept all EULAs to complete the installation. The system completes the installation, reboots, then presents the system login prompt:
    <hostname> login:
    Password:
  5. Log in.
Refer to the DGX OS Server release notes for information on available over-the-network software updates.

Post Setup Instructions for DGX OS Server Software Version 2.x and Earlier

These instructions apply if your DGX-1 is installed with software version 2.x or earlier.

To determine the DGX OS Server software version on your system, enter the following command.
$ grep VERSION /etc/dgx-release
DGX_SWBUILD_VERSION="3.1.1"
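
If you want to make this decision in a script, the version check can be wrapped in two small helpers. This is a sketch only: the function names dgx_version and needs_post_setup are hypothetical, and the parsing assumes that the DGX_SWBUILD_VERSION line is quoted exactly as in the sample output above.

```shell
# dgx_version [FILE]
# Print the value of DGX_SWBUILD_VERSION from the release file.
# FILE defaults to /etc/dgx-release.
dgx_version() {
    grep '^DGX_SWBUILD_VERSION=' "${1:-/etc/dgx-release}" | cut -d'"' -f2
}

# needs_post_setup VERSION
# Succeed if the major version is 2 or lower, that is, if the
# post-setup instructions in this section apply.
needs_post_setup() {
    [ "${1%%.*}" -le 2 ]
}
```

For example, needs_post_setup "$(dgx_version)" fails on a system that reports 3.1.1, as in the sample output, which indicates that the steps in this section are not required on that system.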
  1. If your network is configured for DHCP, then make sure that dynamic DNS updates are enabled. Check whether /etc/resolv.conf is a link to /run/resolvconf/resolv.conf.
    $ ls -l /etc/resolv.conf
    Expected output:
    lrwxrwxrwx 1 root root 29 Dec  1 21:19 /etc/resolv.conf -> ../run/resolvconf/resolv.conf
    • If the expected output appears, then skip to step 2.
    • If this does not appear, then enable dynamic DNS updates as follows:
    1. Launch the Resolvconf Reconfigure package.
      $ sudo dpkg-reconfigure resolvconf
      The Configuring resolvconf screen appears.
    2. Select <Yes> when asked whether to prepare /etc/resolv.conf for dynamic updates.
    3. Select <No> when asked whether to append original file to dynamic file.
    4. Select <OK> at the Reboot recommended screen. You do not need to reboot. You are returned to the command line.
    5. Bring down the interface, where <network interface> is em1 or em2, whichever you have set up as your primary network interface.
      $ sudo ifdown <network interface>
      Expected output:
      ifdown: interface <network interface> not configured
    6. Bring up the interface, where <network interface> is em1 or em2, whichever you have set up as your primary network interface.
      $ sudo ifup <network interface>
      Expected output (last line):
      …
      bound to <IP address> -- renewal in …
    7. Repeat step 1 to confirm that /etc/resolv.conf is a link to /run/resolvconf/resolv.conf.
  2. Make sure that the nvidia-peer-memory module is installed.
    $ lsmod | grep nv_peer_mem
    If the following output appears, then your DGX-1 setup is complete and you do not need to perform the next steps.
    nv_peer_mem            16384  0
    nvidia              11911168  30 nv_peer_mem,nvidia_modeset,nvidia_uvm
    ib_core               143360  13 rdma_cm,ib_cm,ib_sa,iw_cm,nv_peer_mem,mlx4_ib,mlx5_ib,ib_mad,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
  3. If there is no output to the lsmod command, then build and install the nvidia-peer-memory module.
    1. Get and install the module.
       $ sudo apt-get update
       $ sudo apt-get install --reinstall mlnx-ofed-kernel-dkms nvidia-peer-memory-dkms
      
      Expected output:
       DKMS: install completed.
       Processing triggers for initramfs-tools (0.103ubuntu4.2) ...
       update-initramfs: Generating /boot/initrd.img-4.4.0-45-generic
      
    2. Add the module to the Linux kernel.
       $ sudo modprobe nv_peer_mem
      
      There is no expected output for this command.
    3. Repeat step 2 to confirm that the nvidia-peer-memory module has been added.
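
The two checks in steps 1 and 2 above can also be scripted, for example to verify several systems in the same way. The following is a minimal sketch, not an NVIDIA-provided utility. The function names resolv_is_dynamic and module_listed are hypothetical, and module_listed expects text in the format printed by lsmod on its standard input.

```shell
# resolv_is_dynamic [PATH]
# Succeed if PATH (default /etc/resolv.conf) is a symbolic link whose
# target ends in /run/resolvconf/resolv.conf, as in the expected output
# of step 1.
resolv_is_dynamic() {
    local link=${1:-/etc/resolv.conf}
    [ -L "$link" ] || return 1
    case "$(readlink "$link")" in
        */run/resolvconf/resolv.conf) return 0 ;;
        *) return 1 ;;
    esac
}

# module_listed NAME
# Read lsmod-style output on standard input and succeed if NAME appears
# in the first column, that is, if the module is loaded.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

On the DGX-1, you might run resolv_is_dynamic with no argument, and run lsmod | module_listed nv_peer_mem to confirm that the nvidia-peer-memory module is loaded. Both functions report their result only through the exit status.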