Frequently Asked Questions
Q: What is MAX-P?
MAX-P (Maximum Performance) is a power profile configuration that prioritizes maximum performance. It allows the system to consume more power to achieve higher throughput, faster inference times, and potentially better-quality results in AI workloads.
Q: What is MAX-Q?
MAX-Q (Maximum Efficiency) is a power profile configuration that prioritizes power efficiency. It optimizes the system to achieve a balance between performance and power consumption, typically resulting in lower power usage while maintaining acceptable performance levels.
Q: Why would a customer use MAX-Q if they had enough power for MAX-P?
While MAX-P offers maximum performance, MAX-Q can still be the better choice, particularly when Total Cost of Ownership (TCO) matters. Reasons include:
Energy Efficiency and Cost Savings: MAX-Q setups can significantly reduce data center electricity costs.
Sustainability Goals: MAX-Q supports efforts to minimize carbon footprint.
Power Constraints and Infrastructure Costs: MAX-Q helps maximize existing infrastructure, possibly delaying costly expansions.
Workload Optimization: MAX-Q suits batch jobs that don’t require peak performance, which saves energy.
Thermal Management and Hardware Lifespan: Reduced power lowers heat, improves hardware longevity, and cuts cooling costs.
Regulatory Compliance: Helps meet energy regulations in various regions.
Q: How are conflicting profiles handled?
An arbitration engine manages conflicts and priorities. For example, if both MAX-P and MAX-Q are requested, only MAX-P will be enforced because it has higher priority. For non-conflicting profiles, the system automatically chooses the best tuning knob values across the selected profiles for the specific workload.
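The priority rule described above can be illustrated with a minimal sketch. This is not the product's arbitration engine: the `Profile` class, the `arbitrate` function, the numeric priorities, the `conflict_group` field, and the `inference` profile are all hypothetical names introduced for illustration. The only behavior taken from the text is that MAX-P wins over MAX-Q when both are requested. The sketch does not attempt to model how tuning knob values are chosen for non-conflicting profiles.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile record; field names are assumptions."""
    name: str
    priority: int                         # higher value wins (illustrative numbers)
    conflict_group: Optional[str] = None  # profiles sharing a group are mutually exclusive


def arbitrate(requested: List[Profile]) -> List[Profile]:
    """Keep every non-conflicting profile, plus the single
    highest-priority profile from each conflict group."""
    winners = {}
    enforced = []
    for profile in requested:
        if profile.conflict_group is None:
            enforced.append(profile)
            continue
        current = winners.get(profile.conflict_group)
        if current is None or profile.priority > current.priority:
            winners[profile.conflict_group] = profile
    enforced.extend(winners.values())
    return sorted(enforced, key=lambda p: p.name)


# Assumed setup: MAX-P outranks MAX-Q within the same conflict group.
MAX_P = Profile("MAX-P", priority=2, conflict_group="power-mode")
MAX_Q = Profile("MAX-Q", priority=1, conflict_group="power-mode")
INFERENCE = Profile("inference", priority=1)  # hypothetical non-conflicting profile

result = [p.name for p in arbitrate([MAX_Q, INFERENCE, MAX_P])]
print(result)
```

In this sketch, requesting MAX-Q, the hypothetical `inference` profile, and MAX-P together leaves only MAX-P and `inference` enforced, which mirrors the behavior described in the answer above.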