Operating NVIDIA SHARP in Dynamic Trees Allocation Mode

NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) Rev 3.7.0

A SHARP tree defines a set of switches and their connected links to be used by one or more SHARP jobs.

  • A single tree can be used by multiple jobs, as long as they are using different areas of the tree.

  • A single job can also use multiple trees: when a job operates on multiple rails, each rail can use a different tree.

In SHARP v3.3 and earlier, sharp_am operated in "Static trees" mode by default. In this mode, SHARP trees were created during the sharp_am initialization phase. When a new SHARP job started, it was assigned to one of the existing SHARP trees that was available to serve the job.

As of SHARP v3.4, sharp_am's default operation mode is "Dynamic trees" mode. This is the recommended mode.

When sharp_am operates in Dynamic trees mode, trees are not created in the initialization phase. Instead, they are created per job, immediately assigned to the job that requires them, and are deleted once the job ends.
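
The per-job lifecycle described above can be pictured with a small toy model. This is only an illustrative sketch, not sharp_am's actual logic: the class name, method names, and job identifiers are hypothetical, and each job is given a single tree for simplicity (a multi-rail job may use several).

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class DynamicTreeManager:
    """Toy model of the per-job tree lifecycle (hypothetical, not sharp_am code)."""

    active_trees: dict = field(default_factory=dict)  # job_id -> tree_id
    _next_tree_id: count = field(default_factory=count)

    def start_job(self, job_id):
        # No trees exist after initialization; one is created on demand
        # and assigned to the requesting job right away.
        tree_id = next(self._next_tree_id)
        self.active_trees[job_id] = tree_id
        return tree_id

    def end_job(self, job_id):
        # The tree is deleted as soon as its job ends.
        self.active_trees.pop(job_id, None)


manager = DynamicTreeManager()
print(manager.active_trees)   # empty: nothing is built at initialization
manager.start_job("job-a")
manager.start_job("job-b")
manager.end_job("job-a")
print(manager.active_trees)   # only job-b's tree remains
```

In Static trees mode, by contrast, the set of trees would be populated once at initialization and jobs would be matched against that fixed set.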

The Dynamic trees mode of operation has some benefits over the Static trees mode: it defines the SHARP configuration on the switches only when necessary, and it enables better utilization of the fabric resources. There are various scenarios in which the Static mode of operation may respond with "No resources" to a SHARP job request, while in Dynamic mode the same request would be fulfilled.

sharp_am takes multiple factors into consideration when deciding on the trees to create for each job. First, the allocated trees must meet the job's requirements. Beyond that, sharp_am also aims to allocate trees in a manner that preserves available links and switch resources for future jobs.

The combinations of trees that can be created in a regular Fat Tree differ significantly from those that can be created in a Quasi Fat Tree. Consequently, sharp_am offers two distinct algorithms to determine how trees should be created for each job. One algorithm is optimized for SuperPOD fabrics, while the other is tailored for Quasi Fat Trees (QFTs).

Note the following:

  • Only one algorithm can be used at a given time

  • sharp_am should be restarted when switching algorithms

  • Under specific circumstances, when running multiple jobs simultaneously with one algorithm, sharp_am might declare "No resources" for a particular job request, even though the other algorithm would have distributed the resources differently and fulfilled all job requests.

No algorithm is definitively right or wrong for a given topology, as each algorithm comes with its own advantages and limitations. Additionally, certain features are exclusive to specific algorithms.

It is recommended to consult with NVIDIA experts regarding the suitable algorithm for your system. You can contact us through either of the following methods:

E-mail: Enterprisesupport@nvidia.com

Enterprise Support page: https://www.nvidia.com/en-us/support/enterprise

Note

sharp_am's default operation mode is Dynamic trees mode. Follow the instructions below if the mode is set to Static trees or if you need to change the algorithm.

To operate in Dynamic trees mode, make sure the dynamic_tree_allocation parameter is set to TRUE.

By default, the SuperPOD-oriented algorithm is used. To switch to the QFT-oriented algorithm, use the dynamic_tree_algorithm parameter.

If the fabric has more than 126 root switches and the SuperPOD-oriented algorithm is in use, it is recommended to set max_trees_to_build to the number of root switches.

Note that sharp_am restart is required for the configuration to take effect.
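
As a rough aid, the configuration rules above can be expressed as a small check. The sketch below is hypothetical and not part of SHARP: it assumes the settings are available as plain "name value" lines with "#" comments (an assumption about the file format), it treats a missing dynamic_tree_allocation as TRUE (the default described above for v3.4 and later), and it leaves the choice of algorithm to the caller because the accepted values of dynamic_tree_algorithm are not listed here.

```python
def parse_params(text):
    """Parse 'name value' lines, skipping blank lines and '#' comments.

    The 'name value' layout is an assumption made for this sketch.
    """
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            params[parts[0]] = parts[1].strip()
    return params


def check_dynamic_trees(params, root_switch_count, superpod_algorithm=True):
    """Return a list of suggested changes based on the rules in this section."""
    findings = []
    # A missing parameter is treated as TRUE, the default since SHARP v3.4.
    if params.get("dynamic_tree_allocation", "TRUE").upper() != "TRUE":
        findings.append("set dynamic_tree_allocation to TRUE")
    # The 126 root switch threshold applies to the SuperPOD-oriented algorithm.
    if superpod_algorithm and root_switch_count > 126:
        if params.get("max_trees_to_build") != str(root_switch_count):
            findings.append(f"set max_trees_to_build to {root_switch_count}")
    if findings:
        findings.append("restart sharp_am for the changes to take effect")
    return findings


# Hypothetical configuration excerpt for a fabric with 200 root switches.
sample = """
# illustrative values only
dynamic_tree_allocation FALSE
"""
for finding in check_dynamic_trees(parse_params(sample), root_switch_count=200):
    print(finding)
```

The helper only reports suggestions; applying them and restarting sharp_am remains a manual step.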

  • Dynamic trees allocation mode is currently available for Fat Tree and Quasi Fat Tree (QFT) topologies only, and is not supported for Dragonfly or hypercube topologies. If sharp_am is configured to operate in Dynamic mode and the topology is not one of the supported types, sharp_am will automatically operate in Static mode.

  • When operating in Dynamic trees mode, ibdiagnet may print warning messages about the existence of multiple distinct trees with the same tree ID. In Dynamic trees mode, this is a valid situation and these warnings should be ignored.

    Warning example: -W- <> - In Node <> found root tree (parent qpn <>) which is already exists for treeID: <>

    Note: You can avoid this warning by adding the following parameters to the ibdiagnet command line: --sharp_opt ad_hoc

  • Dynamic trees creation does not support a case in which all root switches go down and are restarted. If such a scenario takes place, sharp_am should be restarted once the root switches are up and running.
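
When ibdiagnet output has already been captured, the duplicate tree ID warning mentioned in the notes above can also be filtered out after the fact. The following is a minimal sketch: the regular expression is derived only from the warning example quoted in this section, and the function name and sample lines (including all field values) are hypothetical.

```python
import re

# Matches the duplicate tree ID warning; the <> fields vary from run to run.
DUPLICATE_TREE_WARNING = re.compile(
    r"^\s*-W-.*found root tree \(parent qpn .*\) "
    r"which is already exists for treeID:"
)


def drop_duplicate_tree_warnings(lines):
    """Return the lines that are not the duplicate tree ID warning."""
    return [line for line in lines if not DUPLICATE_TREE_WARNING.search(line)]


# Hypothetical log lines; real ibdiagnet output will differ in its details.
captured = [
    "-W- sw-a - In Node sw-a found root tree (parent qpn 0x10) "
    "which is already exists for treeID: 3",
    "-W- some unrelated warning that should be kept",
]
for line in drop_duplicate_tree_warnings(captured):
    print(line)
```

Only lines matching the quoted warning text are dropped, so other warnings remain visible.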

© Copyright 2024, NVIDIA. Last updated on May 6, 2024.