4. Real-World Use Cases

Use Performance Explorer to gain actionable insights and streamline AI deployments:

  • Make Data-Driven Decisions About Cluster Scale

    • Use case.

      Evaluate AI workload performance at various GPU counts to balance training time and cost.

    • Key actions.

      Examine the tradeoffs between Total Cost to Train and Total Time to Train across cluster scales to determine the optimal GPU count for your workload.

    • Example scenario.

      Examine the performance scaling characteristics of the NVIDIA DGX Cloud H100 Reference Architecture for your workload to quantify how adding GPUs affects total cost and total time to train.

  • Navigate the Precision Switch From BF16 to FP8

    • Use case.

      Decide whether to switch from BF16 to FP8 precision.

    • Key actions.

      Compare how BF16 and FP8 precision affect training throughput and project costs.

    • Example scenario.

      Use DGX Cloud Benchmarking to compare FP8 and BF16 performance for your workload.

  • Assess Performance Gains From the Latest NeMo Framework

    • Use case.

      Determine if upgrading to the latest NVIDIA NeMo Framework version improves your AI workload performance.

    • Key actions.

      Compare performance metrics from your current framework version with those from the latest version.

    • Example scenario.

      Evaluate how recent NeMo Framework updates affect your workload’s training throughput and potential cost savings.
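All three use cases come down to the same comparison: given a measured throughput for each candidate configuration (a GPU count, a precision, or a framework version), estimate the time and cost to train and pick the option that fits your constraints. The following Python sketch illustrates that arithmetic under a simplified model. It assumes time to train equals total tokens divided by aggregate throughput, and cost equals GPU-hours multiplied by an hourly price. The `Config` class, the function names, and every number are hypothetical placeholders for illustration. They are not Performance Explorer output, and this is not its API or its exact cost formula.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Config:
    """One candidate setup. All fields are illustrative, user-supplied values."""
    label: str                     # e.g. "BF16 @ 64 GPUs" (hypothetical)
    num_gpus: int
    tokens_per_sec_per_gpu: float  # measured per-GPU training throughput


def time_to_train_hours(cfg: Config, total_tokens: float) -> float:
    """Estimated wall-clock hours, assuming throughput stays constant."""
    cluster_tokens_per_sec = cfg.tokens_per_sec_per_gpu * cfg.num_gpus
    return total_tokens / cluster_tokens_per_sec / 3600.0


def cost_to_train(cfg: Config, total_tokens: float,
                  price_per_gpu_hour: float) -> float:
    """Estimated cost: GPU-hours consumed times an assumed hourly GPU price."""
    gpu_hours = time_to_train_hours(cfg, total_tokens) * cfg.num_gpus
    return gpu_hours * price_per_gpu_hour


def cheapest_within_deadline(configs: Iterable[Config], total_tokens: float,
                             price_per_gpu_hour: float,
                             max_hours: float) -> Optional[Config]:
    """Return the lowest-cost config that finishes within max_hours, if any."""
    feasible = [c for c in configs
                if time_to_train_hours(c, total_tokens) <= max_hours]
    if not feasible:
        return None
    return min(feasible,
               key=lambda c: cost_to_train(c, total_tokens, price_per_gpu_hour))


if __name__ == "__main__":
    # Placeholder numbers only; substitute your own measurements and pricing.
    candidates = [
        Config("BF16 @ 64 GPUs", 64, 1000.0),
        Config("BF16 @ 128 GPUs", 128, 900.0),   # imperfect scaling assumed
        Config("FP8 @ 64 GPUs", 64, 1300.0),
    ]
    total_tokens = 1e11
    price = 2.0  # assumed price per GPU-hour
    for c in candidates:
        hours = time_to_train_hours(c, total_tokens)
        cost = cost_to_train(c, total_tokens, price)
        print(f"{c.label}: {hours:.1f} h, cost {cost:.0f}")
    best = cheapest_within_deadline(candidates, total_tokens, price, 400.0)
    print("Cheapest within deadline:", best.label if best else "none")
```

Because per-GPU throughput usually drops somewhat as the cluster grows, adding GPUs tends to shorten time to train while raising total cost. Filtering by a deadline first and then minimizing cost is one reasonable way to pick a point on that tradeoff; other objectives, such as minimizing time under a budget, follow the same pattern.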