Testing NVIDIA SHARP Setup
Run the ibdiagnet utility with the SHARP diagnostics option.
$ ibdiagnet --sharp --fabric_summary
Check the fabric summary table in the ibdiagnet output for the number of identified aggregation nodes. For example:
Fabric Summary
Total Nodes : 24
IB Switches : 4
IB Channel Adapters : 16
IB Aggregation Nodes : 4
IB Routers : 0
Total number of links : 24
Links at 4x50 : 24
Master SM: Port=1 LID=1 GUID=0x248a070300a28c4d devid=4119 Priority:0 Node_Type=CA Node_Description=pnemo HCA-2
Standby SM : No Standby SM
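The aggregation-node check can also be scripted. The following is a minimal sketch, assuming the fabric summary keeps the "IB Aggregation Nodes : N" layout shown in the example; the function name sharp_an_count is hypothetical and not part of the SHARP distribution.

```shell
# Hypothetical helper (not shipped with SHARP): read ibdiagnet output on
# stdin and print the "IB Aggregation Nodes" count, or 0 if the row is absent.
# Assumes the "label : value" layout of the Fabric Summary example.
sharp_an_count() {
    awk -F':' '
        /IB Aggregation Nodes/ {
            gsub(/[^0-9]/, "", $2)   # keep only the digits of the value field
            print $2
            found = 1
            exit
        }
        END { if (!found) print 0 }
    '
}
```

A possible use is `ibdiagnet --sharp --fabric_summary | sharp_an_count`, treating a result of 0 as a sign that no aggregation nodes were discovered.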
Check the summary table in the ibdiagnet output for errors in the SHARP diagnostics stage. For example:
Summary
-I- Stage                     Warnings   Errors     Comment
-I- Discovery                 0          0
-I- Lids Check                0          0
-I- Links Check               0          0
-I- Subnet Manager            0          0
-I- Port Counters             0          0
-I- Nodes Information         0          0
-I- Speed / Width checks      0          0
-I- Alias GUIDs               0          0
-I- Virtualization            0          0
-I- Partition Keys            0          0
-I- Temperature Sensing       0          0
-I- SHARP                     0          0
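The SHARP row of the summary can be checked automatically. This is a sketch only, assuming each stage row carries its warning and error counts on the same line, in the form "-I- SHARP <warnings> <errors>"; the helper name sharp_stage_ok is hypothetical.

```shell
# Hypothetical helper: read ibdiagnet output on stdin and succeed only if a
# "-I- SHARP <warnings> <errors>" summary row exists and its error count is 0.
sharp_stage_ok() {
    awk '
        $1 == "-I-" && $2 == "SHARP" && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            seen = 1
            if ($4 + 0 != 0) bad = 1   # field 4 is the Errors column
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}
```

For instance, `ibdiagnet --sharp --fabric_summary | sharp_stage_ok || echo "SHARP stage reported errors"` would flag a failing run. A missing SHARP row is treated as a failure as well.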
In the SHARP diagnostics output file (/var/tmp/ibdiagnet2/ibdiagnet2.sharp), verify that SHARP aggregation trees are configured in the subnet.
For example, count the aggregation trees constructed by Aggregation Manager using the grep command:
$ cat /var/tmp/ibdiagnet2/ibdiagnet2.sharp | grep -c TreeID
126
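The same count can be wrapped in a small function that also reports a missing file. This is a sketch under the assumption that every configured tree contributes one line containing "TreeID", as the grep example implies; sharp_tree_count is a hypothetical name.

```shell
# Hypothetical helper: print the number of lines containing "TreeID" in the
# SHARP diagnostics file (default path taken from the text above).
# Returns 2 if the file cannot be read.
sharp_tree_count() {
    file=${1:-/var/tmp/ibdiagnet2/ibdiagnet2.sharp}
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # grep -c prints 0 and exits non-zero when nothing matches
    grep -c TreeID "$file"
}
```

Because grep exits with a non-zero status when the count is 0, `sharp_tree_count || echo "no aggregation trees found"` is one way to flag an unconfigured subnet.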
The NVIDIA SHARP distribution provides the sharp_hello test utility for testing SHARP's end-to-end functionality on a compute node. It creates a single SHARP job and sends a barrier request to the SHARP Aggregation node.
Help
$ sharp_hello -h
usage: sharp_hello <-d | --ib_dev> <device> [OPTIONS]
OPTIONS:
    [-d | --ib_dev] - HCA to use
    [-v | --verbose] - libsharp coll verbosity level(default:2)
        Levels: (0-fatal 1-err 2-warn 3-info 4-debug 5-trace)
    [-V | --version] - print program version
    [-h | --help] - show this usage
Example #1
$ sharp_hello -d mlx5_0:1 -v 3
[thor001:0:15042 - context.c:581] INFO job (ID: 12159720107860141553) resource request quota: ( osts:0 user_data_per_ost:0 max_groups:0 max_qps:1 max_group_channels:1, num_trees:1)
[thor001:0:15042 - context.c:751] INFO tree_info: type:LLT tree idx:0 treeID:0x0 caps:0x6 quota: ( osts:167 user_data_per_ost:1024 max_groups:167 max_qps:1 max_group_channels:1)
[thor001:0:15042 - comm.c:393] INFO [group#:0] group id:a tree idx:0 tree_type:LLT rail_idx:0 group size:1 quota: (osts:2 user_data_per_ost:1024) mgid: (subnet prefix:0xff12a01bfe800000 interface id:0x3f020000000a) mlid:c007
Test Passed.
Example #2
$ SHARP_COLL_ENABLE_SAT=1 sharp_hello -d mlx5_0:1 -v 3
[swx-dgx01:0:59023 - context.c:581] INFO job (ID: 15134963379905498623) resource request quota: ( osts:0 user_data_per_ost:0 max_groups:0 max_qps:1 max_group_channels:1, num_trees:1)
[swx-dgx01:0:59023 - context.c:751] INFO tree_info: type:LLT tree idx:0 treeID:0x0 caps:0x6 quota: ( osts:167 user_data_per_ost:1024 max_groups:167 max_qps:1 max_group_channels:1)
[swx-dgx01:0:59023 - context.c:755] INFO tree_info: type:SAT tree idx:1 treeID:0x3f caps:0x16
[swx-dgx01:0:59023 - comm.c:393] INFO [group#:0] group id:3c tree idx:0 tree_type:LLT rail_idx:0 group size:1 quota: (osts:2 user_data_per_ost:1024) mgid: (subnet prefix:0xff12a01bfe800000 interface id:0xd6060000003c) mlid:c004
[swx-dgx01:0:59023 - comm.c:393] INFO [group#:1] group id:3c tree idx:1 tree_type:SAT rail_idx:0 group size:1 quota: (osts:64 user_data_per_ost:0) mgid: (subnet prefix:0x0 interface id:0x0) mlid:0
Test Passed
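When sharp_hello is run from a script, its output can be inspected instead of read by eye. The sketch below assumes the output format of the two examples (a "Test Passed" line on success and "tree_type:<TYPE>" fields in the group lines at verbosity level 3); both function names are hypothetical, and grep -o is assumed to be available (GNU and BSD grep provide it).

```shell
# Hypothetical helper: succeed if sharp_hello output on stdin contains the
# "Test Passed" line.
sharp_hello_passed() {
    grep -q 'Test Passed'
}

# Hypothetical helper: list the distinct tree types (for example LLT, SAT)
# that appear in "tree_type:" fields of sharp_hello output on stdin.
sharp_hello_tree_types() {
    grep -o 'tree_type:[A-Z]*' | sed 's/^tree_type://' | sort -u
}
```

One possible use is to capture the output once, for example `out=$(SHARP_COLL_ENABLE_SAT=1 sharp_hello -d mlx5_0:1 -v 3 2>&1)`, and then pipe `$out` through both helpers to confirm that the test passed and that a SAT group was created alongside the LLT group.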
The NVIDIA SHARP distribution also provides the source code of a benchmark for testing native SHARP low-level performance of allreduce and barrier operations.
Source code:
$module load hpcx
$HPCX_SHARP_DIR/share/sharp/examples/mpi/coll/
Build and run instructions:
$module load hpcx
$HPCX_SHARP_DIR/opt/Mellanox/sharp/share/sharp/examples/mpi/coll/README
NVIDIA SHARP Benchmark Script
The NVIDIA SHARP distribution provides a test script that runs the OSU allreduce and barrier benchmarks with and without NVIDIA SHARP. To run the NVIDIA SHARP benchmark script, the following packages must be installed:
ssh
pdsh
environment-modules.x86_64
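Before launching the script, the presence of these tools can be verified from the shell. This is a minimal sketch; missing_commands is a hypothetical helper, and it assumes that environment-modules exposes a `module` command or shell function in the current shell.

```shell
# Hypothetical helper: print each command from the argument list that cannot
# be resolved by the current shell; return non-zero if any is missing.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}
```

For example, `missing_commands ssh pdsh module` prints nothing and returns 0 when all three are available.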
You can find this script at $HPCX_SHARP_DIR/sbin/sharp_benchmark.sh after loading the HPC-X module. Launch it from a host running SM and Aggregation Manager. The script takes the list of compute nodes from the SLURM allocation or from the “hostlist” environment variable. “hostlist” is a comma-separated list, and using it requires the hca environment variables to be supplied as well. The script runs the OSU allreduce and barrier benchmarks with and without NVIDIA SHARP.
Help
This script includes OSU benchmarks for MPI_Allreduce and MPI_Barrier blocking collective operations.
Both benchmarks run with and without using SHARP technology.
Usage: sharp_benchmark.sh [-t] [-d] [-h] [-f]
    -t - tests list (e.g. sharp:barrier)
    -d - dry run
    -h - display this help and exit
    -f - supress error in prerequsites checking
Configuration:
Runtime:
    sharp_ppn - number of processes per compute node (default 1)
    sharp_ib_dev - Infiniband device used for communication. Format <device_name>:<port_number>.
        For example: sharp_ib_dev="mlx5_0:1"
        This is a mandatory parameter. If it's absent, sharp_benchmark.sh tries to use the first active device on local machine
    sharp_groups_num - number of groups per communicator. (default is the number of devices in sharp_ib_dev)
    sharp_num_trees - number of trees to request. (default num tress based on the #rails and #channels)
    sharp_job_members_type - type of sharp job members list. (default is SHARP_MEMBER_LIST_PROCESSES_DATA)
    sharp_hostlist - hostnames of compute nodes used in the benchmark. The list may include normal host names,
        a range of hosts in hostlist format. Under SLURM allocation, SLURM_NODELIST is used as a default
    sharp_test_iters - number of test iterations (default 10000)
    sharp_test_skip_iters - number of test iterations (default 1000)
    sharp_test_max_data - max data size used for testing (default and maximum 4096)
Environment:
    SHARP_INI_FILE - takes configuration from given file instead of /labhome/danielk/.sharp_benchmark.ini
    SHARP_TMP_DIR - store temporary files here instead of /tmp
    HCOLL_INSTALL - use specified hcoll install instead from hpcx
Examples:
    sharp_ib_dev="mlx5_0:1" sharp_benchmark.sh # run using "mlx5_0:1" IB port. Rest parameters are loaded from /labhome/danielk/.sharp_benchmark.ini or default
    SHARP_INI_FILE=~/benchmark.ini sharp_benchmark.sh # Override default configuration file
    SHARP_INI_FILE=~/benchmark.ini sharp_hostlist=ajna0[2-3] sharp_ib_dev="mlx5_0:1" sharp_benchmark.sh # Use specific host list
    sharp_ppn=1 sharp_hostlist=ajna0[1-8] sharp_ib_dev="mlx5_0:1" sharp_benchmark.sh -d # Print commands without actual run
Dependencies:
    This script uses "python-hostlist" package. Visit https://www.nsc.liu.se/~kent/python-hostlist/ for details