Setting up NVIDIA SHARP Environment
The NVIDIA SHARP binary distribution is available as part of the HPC-X, MLNX_OFED, and UFM packages (of the SHARP binaries, UFM includes only Aggregation Manager (AM)).
Prior to installing and using NVIDIA SHARP, make sure the following requirements are met.
Run Aggregation Manager as the root user, since it is a trusted entity.
Make sure the onboard Subnet Manager is disabled in the managed switches. (Aggregation Manager is a central entity running on a dedicated server with a master Subnet Manager. This dedicated server cannot serve as a compute node.)
Configure TCP/IP before running NVIDIA SHARP, because NVIDIA SHARP and Aggregation Manager communicate over TCP/IP.
Run NVIDIA Switch-IB 2/NVIDIA Quantum/NVIDIA Quantum-2 switches with the supported firmware versions as specified in the Prerequisites section in the Release Notes (use ibdiagnet utility to check the installed firmware version on the switches).
Enable the IPoIB interface on the compute servers so that UD multicast can be used for result distribution in SHARP.
Make sure the subnets are configured with an SM that uses one of the following routing engines, which SHARP Aggregation Manager supports out of the box:
Tree based topologies: updn, ar_updn, ftree, ar_ftree
DragonFly+ topology: dfp
Hypercube topologies: dor routing engine with dor_hyper_cube_mode enabled
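The routing-engine requirement above can be expressed as a small lookup. The following is a minimal sketch and not part of the SHARP or Subnet Manager tooling: the function name is_supported_routing, the topology labels, and the boolean dor_hyper_cube_mode argument are illustrative assumptions. Only the engine names are taken from the list above.

```python
# Illustrative helper (hypothetical); not part of NVIDIA SHARP or the SM.
# Engine names are copied from the list of supported routing engines.
SUPPORTED_ENGINES = {
    "tree": {"updn", "ar_updn", "ftree", "ar_ftree"},
    "dragonfly+": {"dfp"},
    "hypercube": {"dor"},
}


def is_supported_routing(topology, engine, dor_hyper_cube_mode=False):
    """Return True if the engine is listed for the given topology.

    For hypercube topologies, dor is accepted only when
    dor_hyper_cube_mode is enabled.
    """
    key = topology.lower()
    engines = SUPPORTED_ENGINES.get(key)
    if engines is None or engine not in engines:
        return False
    if key == "hypercube":
        return bool(dor_hyper_cube_mode)
    return True


if __name__ == "__main__":
    print(is_supported_routing("tree", "ar_updn"))
    print(is_supported_routing("hypercube", "dor"))
    print(is_supported_routing("hypercube", "dor", dor_hyper_cube_mode=True))
```

How the active routing engine is read from the SM configuration is left out here, since it depends on the SM deployment in use.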
When using the HPC-X package, refer to the HPC-X User Manual for installation and configuration procedures.
The examples in this deployment guide use the environment variables HPCX_SHARP_DIR and OMPI_HOME, and assume that HPC-X is installed in a shared folder accessible from all compute nodes.
To download the HPC-X packages, go here.
When using the MLNX_OFED distribution, set the HPCX_SHARP_DIR environment variable to point to the SHARP installation directory (default location: /opt/mellanox/sharp), and set the OMPI_HOME environment variable to the MPI installation directory.
To download MLNX_OFED packages, go here.
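Before running the examples in this guide, it can help to confirm that HPCX_SHARP_DIR and OMPI_HOME are set and point to existing directories. The following is a minimal sketch under that assumption. The function name check_sharp_env is hypothetical and is not provided by HPC-X or MLNX_OFED.

```python
import os


def check_sharp_env(environ=None):
    """Return a list of problems found with HPCX_SHARP_DIR and OMPI_HOME.

    An empty list means both variables are set and point to existing
    directories. Pass a dict as `environ` to check something other than
    the current process environment.
    """
    env = os.environ if environ is None else environ
    problems = []
    for name in ("HPCX_SHARP_DIR", "OMPI_HOME"):
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
    return problems


if __name__ == "__main__":
    # Print one line per problem; print nothing if the environment looks fine.
    for problem in check_sharp_env():
        print(problem)
```

The check only verifies that the directories exist. It does not validate their contents or the installed SHARP version.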
When using Aggregation Manager from UFM, NVIDIA SHARP support has to be enabled in UFM. For further information, refer to the UFM User Manual.
UFM package includes only SHARP Aggregation Manager. Other NVIDIA SHARP components are not available through UFM and should be installed from either HPC-X or MLNX_OFED packages.
Device | Capabilities and limitations
NVIDIA Quantum | Note: The number of SHARP streaming aggregation operations is limited to one active tree per switch
NVIDIA Quantum-2 | Note: Multiple SHARP streaming aggregation operations can be operated in parallel by a single Quantum-2 switch. The limit is one active tree per port
ConnectX-5 | Supports SHARP low latency operation only
ConnectX-6 and above | Supports both SHARP low latency and streaming aggregation operations
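The adapter rows of the table can be encoded as a simple lookup when deciding which SHARP operation types to request. The following is a minimal sketch. The function name connectx_operations and the operation labels are illustrative assumptions, and adapters older than ConnectX-5 return None because the table does not cover them.

```python
import re


def connectx_operations(device):
    """Return the SHARP operations the table lists for a ConnectX adapter.

    Returns None for generations the table does not cover.
    """
    match = re.match(r"ConnectX-(\d+)", device.strip())
    if match is None:
        raise ValueError(f"not a ConnectX device name: {device!r}")
    generation = int(match.group(1))
    if generation < 5:
        return None  # not covered by the table
    if generation == 5:
        return {"low_latency"}
    # ConnectX-6 and above
    return {"low_latency", "streaming_aggregation"}


# Streaming aggregation limits for switches, as stated in the table notes.
SWITCH_STREAMING_LIMIT = {
    "NVIDIA Quantum": "one active tree per switch",
    "NVIDIA Quantum-2": "one active tree per port",
}

if __name__ == "__main__":
    print(sorted(connectx_operations("ConnectX-5")))
    print(sorted(connectx_operations("ConnectX-7")))
```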