Running NVIDIA SHARP Daemons

The NVIDIA SHARP daemon (sharpd) should run on every compute node, and the Aggregation Manager daemon (sharp_am) should run on a dedicated server along with the Subnet Manager.

This section describes how to install the Aggregation Manager and the SHARP daemon as services on the management and compute nodes in the fabric, using the daemon setup script provided with NVIDIA SHARP.

Installing the Aggregation Manager as a service is required when it is used from the HPC-X or MLNX_OFED packages.

Installing the NVIDIA SHARP daemon as a service is required when it is used from the HPC-X or MLNX_OFED packages on systems that do not support SystemD socket-based activation.

To install or remove the NVIDIA SHARP daemons on servers, use the sharp_daemons_setup.sh script provided with the NVIDIA SHARP package. For example:


$HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh

Usage: sharp_daemons_setup.sh (-s | -r) [-p SHARP location dir] -d <sharpd | sharp_am> [-m]
    -s - Setup SHARP daemon
    -r - Remove SHARP daemon
    -p - Path to alternative SHARP location dir
    -d - Daemon name (sharpd or sharp_am)
    -b - Enable socket based activation of the service

Warning

Socket-based activation is only valid on systems with SystemD support.
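Because the -b option only makes sense on SystemD systems, it can help to test for SystemD before choosing the setup flags. The following is a minimal sketch, not part of the NVIDIA SHARP package: the function name sharpd_activation_flag is hypothetical, and it assumes that a system booted with SystemD has the directory /run/systemd/system (the same test that sd_booted(3) documents).

```shell
#!/bin/sh
# Hypothetical helper, not shipped with NVIDIA SHARP.
# Prints "-b" when the host appears to be booted with SystemD, nothing otherwise.
# Assumption: SystemD is running if /run/systemd/system exists.
# The optional argument overrides the directory, which is only useful for
# trying the function out on a machine without SystemD.
sharpd_activation_flag() {
    systemd_dir=${1:-/run/systemd/system}
    if [ -d "$systemd_dir" ]; then
        echo "-b"
    fi
}
```

The result can then be appended to the setup command for sharpd, for example by storing it with flag=$(sharpd_activation_flag) and passing $flag after "-s -d sharpd".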

  1. Run the following as root on the Aggregation Manager host to register sharp_am as a service:


    # $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharp_am

    Daemon's log location: /var/log/sharp_am.log

  2. Set the "run level".

  3. Start sharp_am as root.


    # service sharp_am start
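Before running the steps above, it may be worth confirming that the shell is running as root and that the setup script is where it is expected. The sketch below is illustrative only: sharp_preflight is a hypothetical name, and the optional uid argument exists solely so the check can be exercised without root privileges.

```shell
#!/bin/sh
# Hypothetical preflight check, not part of the NVIDIA SHARP package.
# Usage: sharp_preflight <SHARP location dir> [uid]
# The uid defaults to the id of the current user.
sharp_preflight() {
    sharp_dir=$1
    uid=${2:-$(id -u)}
    script="$sharp_dir/sbin/sharp_daemons_setup.sh"
    # The setup and start commands in this section are run as root.
    if [ "$uid" -ne 0 ]; then
        echo "must be run as root" >&2
        return 1
    fi
    # The script is invoked directly, so it has to exist and be executable.
    if [ ! -x "$script" ]; then
        echo "setup script not found or not executable: $script" >&2
        return 1
    fi
    echo "ok: $script"
}
```

A typical call would be sharp_preflight "$HPCX_SHARP_DIR", followed by the setup command only if the check succeeds.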

Warning

The following registration procedure requires the pdsh package. If pdsh is not installed, use another parallel execution tool and treat the commands below as examples.

  1. Run the following as root to register sharpd as a service on the compute nodes:


    # pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharpd

    Daemon's log location: /var/log/sharpd.log

  2. Set the "run level".

  3. Start sharpd daemons as root.


    # pdsh -w <hostlist> service sharpd start
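Where pdsh is not available, the same commands can be fanned out with plain ssh. The following is a rough sketch rather than a replacement for pdsh: run_on_hosts and the REMOTE_SHELL variable are hypothetical names, the host list is assumed to be a simple comma-separated list (pdsh range syntax such as node[01-04] is not handled), and root is assumed to reach every host over ssh without a password prompt.

```shell
#!/bin/sh
# Hypothetical stand-in for "pdsh -w <hostlist> <command>".
# Usage: run_on_hosts host1,host2,... command [args...]
# REMOTE_SHELL is an illustrative knob that defaults to ssh.
run_on_hosts() {
    roh_hosts=$(printf '%s' "$1" | tr ',' ' ')
    shift
    roh_pids=""
    roh_status=0
    for roh_host in $roh_hosts; do
        # Start one remote command per host in the background.
        # REMOTE_SHELL is left unquoted so it may carry options.
        ${REMOTE_SHELL:-ssh} "$roh_host" "$@" </dev/null &
        roh_pids="$roh_pids $!"
    done
    # Wait for every host and remember whether any of them failed.
    for roh_pid in $roh_pids; do
        wait "$roh_pid" || roh_status=1
    done
    return "$roh_status"
}
```

With this helper, the registration step could be written as run_on_hosts node01,node02 $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharpd, where node01 and node02 are placeholder host names. Unlike pdsh, the sketch does not prefix output lines with the host name.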

Socket-based activation installs sharpd as a daemon that is automatically activated when an application tries to communicate with sharpd.

When installed from MLNX_OFED, sharpd is automatically set up with socket-based activation on systems with SystemD.

Warning

The following registration procedure requires the pdsh package. If pdsh is not installed, use another parallel execution tool and treat the command below as an example.

To register sharpd with socket-based activation, run the following as root:


# pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -s -d sharpd -b

Daemon's log location: /var/log/sharpd.log

To remove sharp_am, run the following on the AM host:


# $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -r -d sharp_am

To remove sharpd, run the following as root:


# pdsh -w <hostlist> $HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh -r -d sharpd

Upgrading the daemons requires removing them and registering them again, as described in this section.
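The removal and re-registration pair can be written down once and reviewed before it is run. The sketch below only prints the two commands for a given daemon and does not execute anything: sharp_upgrade_commands is a hypothetical name, and the point at which the new package is installed between the two commands is not covered here.

```shell
#!/bin/sh
# Hypothetical helper, not part of the NVIDIA SHARP package.
# Prints the removal and re-registration commands for one daemon.
# Usage: sharp_upgrade_commands <sharpd | sharp_am>
sharp_upgrade_commands() {
    daemon=$1
    case $daemon in
        sharpd|sharp_am) ;;
        *) echo "unknown daemon: $daemon" >&2; return 1 ;;
    esac
    if [ -z "${HPCX_SHARP_DIR:-}" ]; then
        echo "HPCX_SHARP_DIR is not set" >&2
        return 1
    fi
    setup="$HPCX_SHARP_DIR/sbin/sharp_daemons_setup.sh"
    echo "$setup -r -d $daemon"   # remove the registered daemon
    echo "$setup -s -d $daemon"   # register it again
}
```

For sharp_am the printed commands are run as root on the AM host. For sharpd each command would be run on all compute nodes, for example through pdsh -w <hostlist>, and -b would be appended to the second command when socket-based activation is wanted.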

© Copyright 2023, NVIDIA. Last updated on May 23, 2023.