4. Real-World Use Cases
Use Performance Explorer to gain actionable insights and streamline AI deployments:
Make Data-Driven Decisions About Cluster Scale
Use case.
Evaluate AI workload performance at various GPU counts to balance training time and cost.
Key actions.
Examine the tradeoffs between Total Cost to Train and Total Time to Train across cluster scales to determine the optimal GPU count for your workload.
Example scenario.
Examine the performance scaling characteristics of the NVIDIA DGX Cloud H100 Reference Architecture for your workload to quantify how adding GPUs impacts total cost.
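The tradeoff described above comes down to simple arithmetic, sketched below in Python. This is an illustration only, not Performance Explorer's own calculation: the throughput figures, token budget, and GPU-hour price are hypothetical placeholders, and the function names are invented for this example. Substitute the values measured for your workload.

```python
from dataclasses import dataclass


@dataclass
class ScalePoint:
    gpus: int
    tokens_per_sec: float  # aggregate training throughput at this scale (hypothetical)


def time_to_train_hours(total_tokens: float, tokens_per_sec: float) -> float:
    """Hours needed to process total_tokens at a constant throughput."""
    return total_tokens / tokens_per_sec / 3600.0


def cost_to_train(hours: float, gpus: int, price_per_gpu_hour: float) -> float:
    """Total cost = wall-clock hours x GPU count x price per GPU-hour."""
    return hours * gpus * price_per_gpu_hour


def summarize(points, total_tokens, price_per_gpu_hour):
    """Return (gpus, hours, cost) for each cluster scale."""
    rows = []
    for p in points:
        hours = time_to_train_hours(total_tokens, p.tokens_per_sec)
        rows.append((p.gpus, hours, cost_to_train(hours, p.gpus, price_per_gpu_hour)))
    return rows


# All values below are assumptions chosen for illustration.
TOTAL_TOKENS = 1e12
PRICE_PER_GPU_HOUR = 2.0
POINTS = [
    ScalePoint(gpus=64, tokens_per_sec=500_000.0),
    ScalePoint(gpus=128, tokens_per_sec=950_000.0),    # slightly sublinear scaling
    ScalePoint(gpus=256, tokens_per_sec=1_700_000.0),
]

for gpus, hours, cost in summarize(POINTS, TOTAL_TOKENS, PRICE_PER_GPU_HOUR):
    print(f"{gpus:>4} GPUs: {hours:8.1f} h, ${cost:,.0f}")
```

With these placeholder numbers, throughput scales slightly sublinearly, so each doubling of GPU count shortens training time while raising total cost. Whether that trade is worthwhile depends on how much a shorter time to train is worth to your project.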
Navigate the Precision Switch From BF16 to FP8
Use case.
Decide whether to switch from BF16 to FP8 precision.
Key actions.
Compare how BF16 and FP8 precision affect training throughput and project costs.
Example scenario.
Use DGX Cloud Benchmarking to compare FP8 and BF16 performance for your workload.
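As a rough sketch of how a throughput comparison translates into projected cost, the Python snippet below takes two measured throughputs and derives the speedup and the fractional cost saving. The throughput values, token budget, GPU count, and price are hypothetical, and the function names are invented for this example. It also assumes both precisions need the same number of tokens to reach the target quality, which you should verify for your own workload.

```python
def projected_cost(total_tokens: float, tokens_per_sec: float,
                   gpus: int, price_per_gpu_hour: float) -> float:
    """Cost to process total_tokens at a constant aggregate throughput."""
    hours = total_tokens / tokens_per_sec / 3600.0
    return hours * gpus * price_per_gpu_hour


def compare_precisions(bf16_tps: float, fp8_tps: float, total_tokens: float,
                       gpus: int, price_per_gpu_hour: float) -> dict:
    """Compare two throughput measurements taken on the same cluster size."""
    bf16_cost = projected_cost(total_tokens, bf16_tps, gpus, price_per_gpu_hour)
    fp8_cost = projected_cost(total_tokens, fp8_tps, gpus, price_per_gpu_hour)
    return {
        "speedup": fp8_tps / bf16_tps,
        "bf16_cost": bf16_cost,
        "fp8_cost": fp8_cost,
        "savings_fraction": 1.0 - fp8_cost / bf16_cost,
    }


# Hypothetical measurements; replace with your own benchmark results.
result = compare_precisions(
    bf16_tps=1_000_000.0,
    fp8_tps=1_400_000.0,
    total_tokens=1e12,
    gpus=128,
    price_per_gpu_hour=2.0,
)
print(f"speedup: {result['speedup']:.2f}x, "
      f"projected savings: {result['savings_fraction']:.1%}")
```

At a fixed GPU count and price, the fractional saving depends only on the throughput ratio, since it equals 1 - 1/speedup.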
Assess Performance Gains From the Latest NeMo Framework
Use case.
Determine if upgrading to the latest NVIDIA NeMo Framework version improves your AI workload performance.
Key actions.
Compare performance metrics from your current framework version against those from the latest version.
Example scenario.
Evaluate how recent NeMo Framework updates affect your workload’s training throughput and potential cost savings.