Running NVIDIA SHARP Daemons
The NVIDIA SHARP daemon (sharpd) should run on every compute node, and the Aggregation Manager daemon (sharp_am) should run on a dedicated server along with the Subnet Manager.
This section describes how to install the Aggregation Manager and the SHARP daemon as services on the management and compute nodes in the fabric, using the daemon setup script provided with NVIDIA SHARP.
Installing Aggregation Manager as a service is required when it is used from the HPC-X or MLNX_OFED packages.
Installing the NVIDIA SHARP daemon as a service is required when it is used from the HPC-X or MLNX_OFED packages on systems that do not support SystemD socket-based activation.
To install or remove NVIDIA SHARP daemons on servers, use the sharp_daemons_setup.sh script provided with the NVIDIA SHARP package. For example:
$HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh
Usage: sharp_daemons_setup.sh (-s | -r) [-p SHARP location dir] -d <sharpd | sharp_am> [-m]
-s - Setup SHARP daemon
-r - Remove SHARP daemon
-p - Path to alternative SHARP location dir
-d - Daemon name (sharpd or sharp_am)
-b - Enable socket based activation of the service
Socket-based activation is only valid on systems with SystemD support.
Run the following as root:
# $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharp_am
Daemon's log location is: /var/log/sharp_am.log
Set the "run level".
Start sharp_am as root.
# service sharp_am start
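After starting the service, one way to confirm that the daemon came up is to check that the log file reported by the setup script has been written. The following is a minimal sketch, assuming sharp_am writes to its log shortly after it starts; wait_for_log is a hypothetical helper and is not part of the NVIDIA SHARP package.

```shell
#!/bin/sh
# Hypothetical helper (not part of SHARP): wait until a log file exists and is
# non-empty, polling once per second for up to <timeout> seconds (default 10).
wait_for_log() {
    log=$1
    timeout=${2:-10}
    while [ "$timeout" -gt 0 ]; do
        [ -s "$log" ] && return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    # Final check after the last sleep; its status is the function's status.
    [ -s "$log" ]
}

# Example (run as root on the Aggregation Manager host):
#   wait_for_log /var/log/sharp_am.log 10 || echo "sharp_am log not written" >&2
```

A non-empty log only indicates that the daemon produced output; the log contents should still be inspected for errors.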
The following registration procedure requires the pdsh package to be installed. If the package is absent, use another parallel execution tool and treat the commands below as examples.
Run the following as root.
# pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharpd
Daemon's log location: /var/log/sharpd.log
Set the "run level".
Start sharpd daemons as root.
# pdsh -w <hostlist> service sharpd start
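Where pdsh is not available, the same commands can be issued host by host over ssh. The following is a minimal sketch of such a substitute, assuming passwordless root ssh to each compute node and the same SHARP installation path on every node; run_on_hosts and the SSH_CMD override are hypothetical names introduced for illustration. Unlike pdsh, it runs the hosts sequentially.

```shell
#!/bin/sh
# Hypothetical pdsh substitute: run one command on each host in turn over ssh.
# SSH_CMD is an illustrative override (for example SSH_CMD=echo for a dry run).
run_on_hosts() {
    hosts=$1          # space-separated list of host names
    shift
    rc=0
    for host in $hosts; do
        # Remember a failure on any host but keep going through the list.
        ${SSH_CMD:-ssh} "$host" "$@" || rc=1
    done
    return "$rc"
}

# Example (run as root):
#   run_on_hosts "node01 node02" "$HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh" -s -d sharpd
#   run_on_hosts "node01 node02" service sharpd start
```

As with the pdsh commands above, $HPCX_SHARP_DIR is expanded on the local host before the command is sent.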
Socket-based activation installs sharpd as a daemon that is automatically activated when an application tries to communicate with sharpd.
sharpd from MLNX_OFED is automatically installed with socket-based activation on systems with SystemD.
The following registration procedure requires the pdsh package to be installed. If the package is absent, use another parallel execution tool and treat the command below as an example.
Run the following as root:
# pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharpd -b
Daemon's log location: /var/log/sharpd.log
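Because socket-based activation is valid only on systems with SystemD, a wrapper around the setup script might add -b only when systemd is detected. The following is a minimal sketch, assuming the conventional test for the /run/systemd/system directory (the check that sd_booted() performs); has_systemd, sharpd_setup_cmd, and the SYSTEMD_RUNTIME_DIR override are hypothetical names, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the host was booted with systemd.
# SYSTEMD_RUNTIME_DIR is an illustrative override for testing, not a SHARP setting.
has_systemd() {
    [ -d "${SYSTEMD_RUNTIME_DIR:-/run/systemd/system}" ]
}

# Print (do not run) the sharpd setup command, adding -b only under systemd.
sharpd_setup_cmd() {
    cmd="$HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharpd"
    if has_systemd; then
        cmd="$cmd -b"
    fi
    printf '%s\n' "$cmd"
}

# Example: review the command, then run it as root on the compute node.
#   sharpd_setup_cmd
```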
To remove sharp_am, run the following on the AM host:
# $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -r -d sharp_am
To remove sharpd, run:
# pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -r -d sharpd
Upgrading the daemons requires removing them and then re-registering them as instructed in this section.
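The remove and re-register sequence for an upgrade can be sketched as a dry run that prints the two commands in order. This is an illustration only: upgrade_cmds is a hypothetical helper, and it assumes $HPCX_SHARP_DIR points at the SHARP installation to use for each step.

```shell
#!/bin/sh
# Hypothetical helper: print the removal and re-registration commands
# for one daemon (sharpd or sharp_am) without executing them.
upgrade_cmds() {
    daemon=$1
    case "$daemon" in
        sharpd|sharp_am) ;;
        *) echo "unknown daemon: $daemon" >&2; return 1 ;;
    esac
    setup="$HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh"
    printf '%s -r -d %s\n' "$setup" "$daemon"   # step 1: remove
    printf '%s -s -d %s\n' "$setup" "$daemon"   # step 2: re-register
}

# Example: review the commands for the Aggregation Manager host.
#   upgrade_cmds sharp_am
```

For sharpd, each printed command would be run on all compute nodes, for example through pdsh as shown earlier in this section.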