Appendix A: Using SHARP with Quasi Fat Trees and DFP Topologies
sharp_am considers multiple factors when determining how to allocate trees for each job. While the trees must satisfy the job's requirements, sharp_am also aims to preserve fabric resources - such as links and switches - for future jobs.
Topology plays a key role in how trees are allocated. The available combinations differ significantly between standard Fat-Tree, Quasi-Fat Tree (QFT), and Dragonfly (DFP) topologies. Accordingly, sharp_am supports two distinct allocation algorithms:
Fat-Tree Algorithm (default) – Optimized for standard Fat-Tree fabrics.
QFT/Dragonfly Algorithm – Tailored for Quasi-Fat Tree and Dragonfly topologies.
To use the QFT/Dragonfly algorithm, update the configuration file conf/sharp/sharp_am.cfg and set:
dynamic_tree_algorithm = 1
A restart of sharp_am is required for the change to take effect.
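As an illustration of the configuration change above, the following minimal Python sketch sets the option in the text of a configuration file. It assumes the file uses plain `key = value` lines, which this appendix does not confirm beyond the single line shown. The helper name `set_option` is hypothetical and is not part of SHARP. The sketch only edits text; restarting sharp_am is still a separate step.

```python
import re


def set_option(cfg_text: str, key: str, value: str) -> str:
    """Return cfg_text with `key = value` set.

    Assumes a plain `key = value` line format (an assumption, not
    confirmed by the SHARP documentation). Replaces the first existing
    assignment of `key`, or appends a new line if none is found.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    line = f"{key} = {value}"
    if pattern.search(cfg_text):
        # Replace only the first matching assignment.
        return pattern.sub(lambda m: line, cfg_text, count=1)
    # No existing assignment: append one, adding a newline if needed.
    sep = "" if cfg_text.endswith("\n") or not cfg_text else "\n"
    return f"{cfg_text}{sep}{line}\n"


# Usage on an in-memory example (the starting value 0 is assumed here).
before = "dynamic_tree_algorithm = 0\n"
after = set_option(before, "dynamic_tree_algorithm", "1")
print(after, end="")
```

To apply this to a real deployment, read conf/sharp/sharp_am.cfg, pass its contents through the helper, write the result back, and then restart sharp_am.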
Notes:
Only one algorithm can be active at a time.
Restart sharp_am when switching algorithms.
In some scenarios, using one algorithm with concurrent jobs may result in a “No resources” error for a specific job. Switching to the alternative algorithm may resolve the issue by redistributing resources differently.
Fabric-Specific Behavior
When operating in a DFP fabric, use of the QFT/Dragonfly algorithm is mandatory. sharp_am automatically enforces this mode, regardless of the configured value.
In Fat-Tree or Quasi-Fat Tree topologies, using the corresponding optimized algorithm is optional but recommended. In certain cases, NVIDIA may advise using the non-default algorithm based on specific workload or topology characteristics.
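The selection rules in this section can be summarized in a short Python sketch. It is a reading of the text above, not sharp_am's actual logic. It assumes that the mandatory case refers to Dragonfly (DFP) fabrics and that the default Fat-Tree algorithm corresponds to the value 0; only the value 1 is stated in this appendix. The function name and the topology labels are hypothetical.

```python
FAT_TREE_ALGORITHM = 0       # assumed value for the default algorithm
QFT_DRAGONFLY_ALGORITHM = 1  # value given in this appendix


def effective_algorithm(topology: str, configured: int) -> int:
    """Return the algorithm that would be in effect.

    `topology` is one of the hypothetical labels "fat-tree", "qft",
    or "dragonfly". On a Dragonfly fabric the QFT/Dragonfly algorithm
    is enforced regardless of the configured value; otherwise the
    configured value is used as-is.
    """
    if topology == "dragonfly":
        return QFT_DRAGONFLY_ALGORITHM
    return configured


# A Dragonfly fabric overrides a configured Fat-Tree setting.
print(effective_algorithm("dragonfly", FAT_TREE_ALGORITHM))
# On a Quasi-Fat Tree fabric the configured choice is kept.
print(effective_algorithm("qft", QFT_DRAGONFLY_ALGORITHM))
```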
Algorithm Selection Support
Selecting the appropriate algorithm can significantly impact SHARP efficiency and job success. For guidance tailored to your system, contact NVIDIA Enterprise Support.