ClusterKit
ClusterKit is a multipurpose node assessment tool for high-performance clusters. It runs the following tests:
- General Assessments: Latency, bandwidth, effective bandwidth, memory bandwidth, ordered ring bandwidth, and random ring bandwidth 
- GPU Communication Tests: Memory bandwidth, GPU-GPU latency and bandwidth, GPU-Host latency and bandwidth, and NCCL bandwidth and latency 
- Collective Evaluations: Barrier, allreduce, broadcast, alltoall, and NCCL 
- Bisectional Bandwidth 
- CPU/GPU Stress 

Installation notes:
- Install ClusterKit in a shared directory where possible.
- If no shared directory exists, make sure that all scripts are available on every host at exactly the same path.
- The hosts must be reachable through SLURM or passwordless ssh.
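
When SLURM is not used, the passwordless ssh requirement can be checked before a run. The following is a minimal sketch and is not part of ClusterKit: the function names and the host names are hypothetical. It relies only on the standard OpenSSH client options `BatchMode` and `ConnectTimeout`, so a host that would prompt for a password is reported as a failure instead of hanging.

```python
import subprocess


def ssh_check_command(host, timeout_s=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never ask for a password or passphrase
        "-o", f"ConnectTimeout={timeout_s}",  # give up on unreachable hosts
        host,
        "true",                               # trivial remote command; only the exit code matters
    ]


def unreachable_hosts(hosts, run=subprocess.run):
    """Return the hosts for which the non-interactive ssh check did not exit with 0."""
    failed = []
    for host in hosts:
        result = run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failed.append(host)
    return failed
```

For example, `unreachable_hosts(["node01", "node02"])` (placeholder host names) returns an empty list when every host accepts key-based login. The `run` parameter exists only so the check can be exercised without a real cluster.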