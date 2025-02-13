To profile the MPI API:

$ export IPM_KEYFILE=$HPCX_IPM_DIR/etc/ipm_key_mpi
$ export IPM_LOG=FULL
$ export LD_PRELOAD=$HPCX_IPM_DIR/lib/libipm.so
$ mpirun -x LD_PRELOAD <...>
$ $HPCX_IPM_DIR/bin/ipm_parse -html outfile.xml

For further details on profiling the MPI API, refer to http://ipm-hpc.org/

The NVIDIA®-supplied version of IPM contains an additional feature, Barrier before Collective, that is not found in the standard package. It lets end users easily determine the extent of application imbalance in applications that use collectives. The feature instruments each collective so that it calls MPI_Barrier() before calling the collective operation itself. Time spent in this MPI_Barrier() is not counted as communication time, so comparing a run with the Barrier before Collective feature enabled against a run without it shows how much application imbalance contributes to performance.

The instrumentation can be applied on a per-collective basis, and is controlled by the following environment variables:

$ export IPM_ADD_BARRIER_TO_REDUCE=1
$ export IPM_ADD_BARRIER_TO_ALLREDUCE=1
$ export IPM_ADD_BARRIER_TO_GATHER=1
$ export IPM_ADD_BARRIER_TO_ALL_GATHER=1
$ export IPM_ADD_BARRIER_TO_ALLTOALL=1
$ export IPM_ADD_BARRIER_TO_ALLTOALLV=1
$ export IPM_ADD_BARRIER_TO_BROADCAST=1
$ export IPM_ADD_BARRIER_TO_SCATTER=1
$ export IPM_ADD_BARRIER_TO_SCATTERV=1
$ export IPM_ADD_BARRIER_TO_GATHERV=1
$ export IPM_ADD_BARRIER_TO_ALLGATHERV=1
$ export IPM_ADD_BARRIER_TO_REDUCE_SCATTER=1

By default, all values are set to '0'.
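
Because the with/without comparison means toggling all twelve variables together, a small shell helper can save typing. The following is a minimal sketch, not part of HPC-X or IPM: the function name ipm_set_barriers is hypothetical, and the variable names are taken as listed above. The mpirun line is left commented out with its arguments elided, as in the original command; whether the barrier variables also need to be forwarded to remote ranks (for example with mpirun -x) depends on your launcher setup and is an assumption to verify.

```shell
# Hypothetical helper (not shipped with IPM): set every per-collective
# "Barrier before Collective" switch to the same value (0 or 1).
ipm_set_barriers() {
    value="$1"
    for coll in REDUCE ALLREDUCE GATHER ALL_GATHER ALLTOALL ALLTOALLV \
                BROADCAST SCATTER SCATTERV GATHERV ALLGATHERV REDUCE_SCATTER; do
        # Builds e.g. IPM_ADD_BARRIER_TO_REDUCE=1 and exports it.
        export "IPM_ADD_BARRIER_TO_${coll}=${value}"
    done
}

# Enable the barrier for all collectives before the instrumented run.
ipm_set_barriers 1
# mpirun -x LD_PRELOAD <...>

# For the baseline run, switch the feature off again:
# ipm_set_barriers 0
# mpirun -x LD_PRELOAD <...>
```

Comparing the communication time reported by ipm_parse for the two runs then gives an indication of how much time is attributable to imbalance rather than to the collectives themselves.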