NVIDIA Tegra
NVIDIA Jetson Linux Developer Guide
32.4.3 Release

 

Power Management for Jetson Nano and Jetson TX1 Devices

 
Interacting Features
Kernel Space Power Saving Features
Chipset Power States
Clock and Voltage Management
Regulator Framework
CPU Power Management
Frequency Management with cpufreq
Idle Management with cpuidle
Memory Power Management
EMC Frequency Scaling Policy
WiFi Power Management
Supported Modes and Power Efficiency
Thermal Management
Linux Thermal Framework
Thermal Zone
Thermal Management in Linux
Thermal Sensors
Thermal Cooling
Thermal Specifications
Software-Based Power Consumption Modeling
Power Monitor Information
Carrier Board Information (Jetson TX1 only)
Examples
Under Voltage and Over Current Protection
Related Tools and Techniques
GPU 3D Frequency Scaling
Getting and Setting Frequencies
Maximizing Jetson Nano or Jetson TX1 Performance
Using CPU Hotplugging
nvpmodel GUI
The NVIDIA® Jetson Nano™ and NVIDIA® Jetson™ TX1 modules and NVIDIA® Jetson™ Board Support Package (BSP) provide many features related to power management, thermal management, and electrical management. These features deliver the best user experience possible given the constraints of a particular platform. The target user experience ensures the perception that the device provides:
Uniformly high performance
Excellent battery life
Perfect stability
A device that is comfortable and cool to the touch
This topic describes the power, thermal, and electrical management features visible to software, as well as some tools and related techniques.

Interacting Features

Power, thermal, and electrical management features place dynamic constraints on many operational settings (“knobs”), such as:
Clock gate settings
Clock frequencies
Power gate (or regulator enable) settings
Voltages
Processor power state (i.e., which idle state is selected for the CPU)
Peripheral power state (i.e., which idle state is selected for an I/O controller)
Chipset power state
Availability of CPU cores to the OS
Some of these knobs are constrained by more than one feature. For example, cpufreq implements load-based scaling: it measures how busy the CPU is and adjusts the CPU frequency accordingly. CPU thermal management, however, can override the target frequency chosen by cpufreq. Consequently, before you attempt to debug power, performance, thermal, or electrical problems, familiarize yourself with all of the power, thermal, and electrical management features in the BSP.

Kernel Space Power Saving Features

This section describes BSP features that save power and extend battery life. Many of these features are implemented by the Linux kernel, with support from firmware and hardware, and without significant involvement from the user space.

Chipset Power States

The supported power states are listed in order of increasing flexibility or configurability:
Off: There is only one way for a system to be off.
Deep Sleep (SC7): Offers a small amount of configurability. For example, prior to entering Deep Sleep, software can select which of the many hardware wake events can wake the chip from Deep Sleep.
Active: Extraordinarily flexible in terms of power and performance. It encompasses activity levels from low power audio playback through peak performance. Power consumption in Active state can range from tens of milliwatts to several watts.
Supported Power States
The supported power states are:
| Power State | Functionality | Characteristics |
|---|---|---|
| Off | Power rails | None of the power rails supplying the SoC and DRAM are powered. |
| | State | No state is maintained in the SoC or DRAM. |
| | Exiting | Into Active state via cold boot. |
| Deep Sleep (SC7) | Power rails | VDD_RTC, VDDIO_DDR, VDDIO_SYS, and DRAM power rails are powered on. VDD_CORE and VDD_CPU are powered off. |
| | State | The SoC maintains a small amount of state information in the PMC block. DRAM maintains state. |
| | Exiting | Into Active state via a pre-defined set of wake events. |
| Active | Power rails | VDD_RTC, VDDIO_DDR, VDDIO_SYS, VDD_CORE, and DRAM rails are powered on. Other power rails, including VDD_CPU and VDD_GPU, may be powered on. |
| | State | Software actively manages the power states of the devices that make up the SoC. |
| | Exiting | Software can initiate a transition from Active to any other power state. |
Power State Mapping to Linux
BSP maps hardware power states to Linux power states as follows.
| Chipset Power State | Linux Power State | Comments |
|---|---|---|
| Off | Off | |
| Deep Sleep (SC7) | Suspend to RAM | Software can choose whether to enter Deep Sleep before the OS enters Suspend. |
| Active | Running/Idle (display on or off) | Many SoC devices may be idle or disabled under driver control. For example, VDD_GPU may be powered off and the companion GPU may be power-gated. |
 
Note:
For Jetson Nano and Jetson TX1 the name of the chipset power state is SCy instead of LPx.
Deep Sleep (SC7)
You can initiate deep sleep from the user space with this command if the systemd init system is in use:
$ sudo systemctl suspend
Alternatively, you can use this:
$ sudo bash -c "echo mem > /sys/power/state"
The first method of entering deep sleep is preferred because it cooperates better with systemd, which maintains the Linux runlevel. Use the second method if your system is not running systemd.
The system can be awakened from deep sleep by common wake sources available on Jetson platforms:
Wake Source
Usage
Power button
Press and release the power button on the Jetson device. If the power button is not available, connect then disconnect the power button pin and ground.
RTC alarm
Before entering low power state, program the RTC alarm with the command:
$ sudo bash -c "echo `date '+%s' -d '+ 10 seconds'` > /sys/class/rtc/rtc0/wakealarm"
Micro USB cable hotplug
Connect or disconnect a micro-USB cable at the USB micro-B port that is used for flashing the device.
USB remote wakeup
Press any key on a USB keyboard connected to the device. Note that Linux does not support a USB mouse as a wake source.
Wake on LAN
On another machine on the same LAN, enter:
$ sudo etherwake -i <interface> <MAC_address_of_target>
SD card detection
Insert or remove SD card.
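The RTC alarm and suspend commands above can be combined into one helper. The following is a minimal sketch, not part of the BSP: the function name suspend_with_rtc_wake and its two optional path arguments are assumptions introduced here so that the function can be tried against scratch files before it is used on a device.

```shell
#!/bin/sh
# Arm an RTC wake alarm, then request suspend-to-RAM (SC7).
# Usage: suspend_with_rtc_wake <seconds> [rtc_dir] [power_state_file]
# The optional arguments are hypothetical conveniences for dry runs;
# on a real device leave them at their defaults and run as root.
suspend_with_rtc_wake() {
    delay=$1
    rtc_dir=${2:-/sys/class/rtc/rtc0}
    state_file=${3:-/sys/power/state}

    # Accept only a whole number of seconds.
    case $delay in
        ''|*[!0-9]*)
            echo "delay must be a whole number of seconds" >&2
            return 1
            ;;
    esac

    # Writing 0 first is intended to clear an alarm that is already armed
    # (assumed behavior of the RTC sysfs interface).
    echo 0 > "$rtc_dir/wakealarm" || return 1

    # Arm the alarm as an absolute time in seconds since the epoch.
    echo $(( $(date +%s) + delay )) > "$rtc_dir/wakealarm" || return 1

    # Request suspend-to-RAM.
    echo mem > "$state_file"
}
```

On a device, the function must run with root privileges, for example from a root shell: `suspend_with_rtc_wake 10`.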

Clock and Voltage Management

Because the maximum stable clock frequency of a circuit depends on its supply voltage, dynamic voltage scaling is closely related to frequency scaling. Higher frequencies require higher voltages, and lower frequencies allow lower voltages.
Most clock register manipulation on Jetson Nano and Jetson TX1 is handled by the Linux kernel. The Linux kernel driver on the CPU exposes a simplified view of the physical clock tree to software on the main CPU via the Linux Common Clock Framework.

Regulator Framework

The Linux regulator framework provides an abstraction that allows regulator consumer drivers to dynamically adjust voltage or current regulators at runtime, without knowledge of the underlying hardware power tree.
The framework provides a mechanism that platform initialization code can use to declare a power tree topology and assign a driver that provides regulators for each node in the hardware power tree. Such a driver is called a regulator provider driver.
BSP configures the platform power tree appropriately for Jetson Nano and Jetson TX1. Additionally, drivers within BSP act as regulator consumers where appropriate.
When you port BSP to a new platform, you must ensure that:
The platform power tree is configured to match the underlying hardware.
All drivers for peripheral devices use the regulator consumer APIs correctly.
The device tree and board configuration file information for your new platform avoid conflicts between functions using the same I/O pads. BSP drivers registering as regulator consumers can cause I/O pads on the chip to be unavailable for other functions.

CPU Power Management

The CPU power management strategy uses dynamic frequency scaling with dynamic voltage scaling, idle power states, and core management tuned for the Jetson Nano/Jetson TX1 architecture.

Frequency Management with cpufreq

BSP implements CPU Dynamic Frequency Scaling (DFS) with the Linux cpufreq subsystem. The cpufreq subsystem comprises:
Platform drivers to implement the clock adjustment mechanism
Governors to implement frequency scaling policies
A core framework to connect governors to platform drivers
The policy for frequency scaling depends on which cpufreq governor is selected at runtime.
For details, see the information at:
<top>/kernel/kernel-4.9/Documentation/cpu-freq/
For each Jetson hardware reference design, NVIDIA selects a cpufreq governor and tunes it to achieve a balance between power and performance.
When a governor requests a CPU frequency change, the cpufreq platform driver reconciles that request with the constraints imposed by thermal and electrical limits, and updates the CPU clock speed.
Jetson Nano and Jetson TX1 use a DFLL to clock each CPU. Hardware (with the assistance of the Linux kernel) ensures that the CPU voltage is appropriate for the DFLL to deliver requested CPU frequencies.
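To see which governor is active and what frequency limits currently apply, you can read the standard Linux cpufreq sysfs nodes. The sketch below assumes the usual scaling_governor, scaling_cur_freq, scaling_min_freq, and scaling_max_freq nodes are present; the function name show_cpufreq and its optional root argument are illustrative assumptions, not BSP tools.

```shell
#!/bin/sh
# Print the governor and the current/min/max frequency (in kHz) of every
# CPU that exposes a cpufreq directory.
# Usage: show_cpufreq [cpu_root]
# cpu_root is a hypothetical override for testing; it defaults to sysfs.
show_cpufreq() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for dir in "$cpu_root"/cpu[0-9]*/cpufreq; do
        # Offline cores may not expose a cpufreq directory.
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        printf '%s governor=%s cur=%s min=%s max=%s\n' \
            "$cpu" \
            "$(cat "$dir/scaling_governor")" \
            "$(cat "$dir/scaling_cur_freq")" \
            "$(cat "$dir/scaling_min_freq")" \
            "$(cat "$dir/scaling_max_freq")"
    done
}
```

If the reported maximum is lower than expected, a thermal or power mode cap may be in effect, as described in Interacting Features.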

Idle Management with cpuidle

The Linux cpuidle infrastructure supports the implementation of SoC-specific idle states for each CPU core. cpuidle lacks direct support for idle states applicable to an entire CPU cluster and for idle states extending beyond a CPU cluster.
For more information about the Linux cpuidle infrastructure, see:
<top>/kernel/kernel-4.9/Documentation/cpuidle/
NVIDIA provides an SoC-specific cpuidle driver that plugs into the cpuidle framework to enable CPU idle power management.
CPU Idle
For each core there is an idle task which is scheduled when no other runnable tasks are left in the run queue for that core. This task places the core in a low-power state selected by the cpuidle governor. The core stays in that state until an interrupt wakes it up to process more work.
When the last active core in a CPU cluster goes into an idle or offline state, the idle task puts the entire CPU cluster in a low-power state.
CCplex Idle States
The table below summarizes the CPU core, cluster, and CCplex (CPU Complex) idle states available on Jetson Nano and Jetson TX1, and the BSP software support for them.
| Type of State | State | Meaning | Software support |
|---|---|---|---|
| Core state | C1 | Clock gating (also known as WFI). | Supported |
| | C7 | Power gating | Supported |
| Cluster state | CC1 | Auto clock gating | Not supported |
| | CC4 | Cluster retention | Not supported |
| | CC6 | Non-CPU power gating | Not supported |
| | CC7 | Rail gating | Not supported |
Core states are denoted as Cx states, and cluster states are denoted as CCx states.
To enable CPU idle
To enable CPU idle you must enable the appropriate kernel configuration option and the appropriate device tree node. (Enabling either one alone is not effective.)
To enable CPU idle in the configuration file, set this option:
CONFIG_CPU_IDLE=y
To enable CPU idle in the device tree, enable the device tree node cpuidle:
cpuidle {
    compatible = "nvidia,tegra210-cpuidle";
    status = "okay";
};
To disable cpuidle at boot time
Disable the device tree node cpuidle.
To display CPU idle status
To determine whether CPU idle is enabled, read this sysfs node:
$ cat /sys/devices/system/cpu/cpuidle/current_driver
If CPU idle is enabled, the command displays:
arm_idle
To disable a core/cluster power state at boot time
Remove or disable the appropriate core/cluster state node in tegra210-soc-base.dtsi, or modify that node by setting the property min-residency-us to a high value, e.g., 0xffffffff.
For example, to disable power state C7 by raising min-residency-us:
C7: c7 {
    compatible = "arm,idle-state";
    arm,psci-suspend-param = <0x40000007>;
    wakeup-latency-us = <130>;
    min-residency-us = <0xffffffff>; /* high value so that C7 is never selected */
    idle-state-name = "c7-cpu-powergated";
    status = "okay";
};
To get and set a CPU’s core power state
The pathnames of the nodes that represent core power states are:
/sys/devices/system/cpu/cpu<x>/cpuidle/state<y>
Where:
<x> is a core ID.
<y> is the index of core power state: 0 for C1 (WFI), or 1 for C7.
Note:
A core power state’s status is 1 if the state is disabled, and 0 if it is enabled. This is the reverse of the usual Boolean sense of 0 and 1.
To get the status of core power state <y> on core <x>, read the appropriate node. To set the status, write an ASCII 0 or 1 to the node.
Following are several useful commands for getting and setting the core power state.
To display the name of the core power state with index <y>, enter the command:
$ cat /sys/devices/system/cpu/cpu<x>/cpuidle/state<y>/name
For example, this command displays WFI:
$ cat /sys/devices/system/cpu/cpu<x>/cpuidle/state0/name
This command displays C7:
$ cat /sys/devices/system/cpu/cpu<x>/cpuidle/state1/name
Note:
Because Jetson Nano and Jetson TX1 use the upstream cpuidle-arm driver, the name of state0 displays as WFI rather than C1.
To get the status of core power state <y> on core <x>:
$ cat /sys/devices/system/cpu/cpu<x>/cpuidle/state<y>/disable
To change the status of core power state <y> on CPU core <x>, where <b> is 0 or 1:
$ sudo bash -c "echo <b> > /sys/devices/system/cpu/cpu<x>/cpuidle/state<y>/disable"
Note:
Remember that a status of 1 disables the core power state, and 0 enables it.
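To change a core power state on every core at once, the per-core write shown above can be wrapped in a loop. This is a minimal sketch under stated assumptions: the function name set_idle_state and the optional cpu_root argument are hypothetical and exist only for illustration and dry runs.

```shell
#!/bin/sh
# Enable or disable one cpuidle state on every core.
# Usage: set_idle_state <state_index> <0|1> [cpu_root]
#   1 disables the state, 0 enables it (the inverted sense noted above).
# cpu_root is a hypothetical override for testing; it defaults to sysfs.
set_idle_state() {
    state=$1
    flag=$2
    cpu_root=${3:-/sys/devices/system/cpu}

    case $flag in
        0|1) ;;
        *)
            echo "flag must be 0 (enable) or 1 (disable)" >&2
            return 1
            ;;
    esac

    for node in "$cpu_root"/cpu[0-9]*/cpuidle/state"$state"/disable; do
        # Skip cores that do not expose this state.
        [ -e "$node" ] || continue
        echo "$flag" > "$node" || return 1
    done
}
```

For example, running `set_idle_state 1 1` as root disables C7 on all cores until the next boot, and `set_idle_state 1 0` enables it again.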
To get cluster status
To get the status of the cluster states that are enabled for cluster A57 (the only cluster defined on Jetson Nano), enter the command:
$ cat /sys/kernel/debug/cpuidle_t210/fast_cluster_states_enable
The value returned is a list of core and cluster idle states that are enabled by BSP:
To get per-core state usage statistics
To get the number of times the kernel requested a specified core to enter a specified state, read this node:
$ cat /sys/devices/system/cpu/cpu<x>/cpuidle/state<y>/usage
To get the total time in microseconds that a specified core has spent in a specified state since boot, read this node:
$ cat /sys/devices/system/cpu/cpu<x>/cpuidle/state<y>/time
For example, to get the number of times that core 2 of cluster A57 has entered state C7, enter the command:
$ cat /sys/devices/system/cpu/cpu2/cpuidle/state1/usage
To get the total time in microseconds that core 2 of cluster A57 has spent in state C7:
$ cat /sys/devices/system/cpu/cpu2/cpuidle/state1/time
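The usage and time nodes can be combined to estimate how long a core stays in a state each time it enters it. The following sketch walks every core and state and prints the entry count, the total residency, and their quotient; the function name idle_stats and the optional cpu_root argument are assumptions made for illustration.

```shell
#!/bin/sh
# Print name, entry count, total time, and average residency per entry
# for every cpuidle state of every core.
# Usage: idle_stats [cpu_root]
# cpu_root is a hypothetical override for testing; it defaults to sysfs.
idle_stats() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for dir in "$cpu_root"/cpu[0-9]*/cpuidle/state[0-9]*; do
        [ -d "$dir" ] || continue
        cpu=${dir%/cpuidle/*}
        cpu=${cpu##*/}
        usage=$(cat "$dir/usage")
        time_us=$(cat "$dir/time")
        # Integer average in microseconds; avoid dividing by zero.
        if [ "$usage" -gt 0 ]; then
            avg=$((time_us / usage))
        else
            avg=0
        fi
        printf '%s %s entries=%s time_us=%s avg_us=%s\n' \
            "$cpu" "$(cat "$dir/name")" "$usage" "$time_us" "$avg"
    done
}
```

An average residency well below a state's min-residency-us suggests that the state is being entered for periods too short to save power.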

Memory Power Management

NVIDIA SoC chipsets include power saving features whose operation is largely invisible to software at runtime. Most of those features are statically enabled at boot, according to settings in the boot configuration table (BCT).
Additionally, BSP implements EMC frequency scaling, which is dynamic frequency scaling for the memory controller (EMC/MC) and DRAM. This is a critical power saving feature that requires tuning and characterization for each new printed circuit board design.
The calibration results include a BCT and an EMC DVFS table specific to the board design. The EMC DVFS table must be included in the platform BPMP device tree file.

EMC Frequency Scaling Policy

The following factors affect EMC frequency scaling policy at runtime:
The entries in the EMC DVFS table
The average memory bandwidth used (as measured by the EMC activity monitor)
Requests made by various device drivers (cpufreq, graphics drivers, USB, HDMI™, and display)
Any limits dynamically imposed by thermal throttling

WiFi Power Management

If you face WiFi performance issues, you may try disabling WiFi power management to see if that resolves them. Note that disabling WiFi power management increases power consumption.
Documentation about WiFi configuration options, including the power management option wifi.powersave, is available in the ubuntu.com documentation page Edit Connections.
To disable WiFi power management
1. Open the file /etc/NetworkManager/conf.d/default-wifi-powersave-on.conf.
2. Set the option wifi.powersave to 2. (This option’s default value is 3.)
3. Reboot the device or restart NetworkManager.
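The edit in step 2 can also be scripted. The sketch below rewrites only the wifi.powersave line of the file; the function name set_wifi_powersave and the optional conf_file argument are assumptions introduced here, and the sketch assumes that the file already contains a wifi.powersave line.

```shell
#!/bin/sh
# Set wifi.powersave in a NetworkManager configuration file.
# Usage: set_wifi_powersave <2|3> [conf_file]
#   2 disables WiFi power management, 3 enables it.
# conf_file is a hypothetical override for testing.
set_wifi_powersave() {
    value=$1
    conf=${2:-/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf}

    case $value in
        2|3) ;;
        *)
            echo "value must be 2 (disable) or 3 (enable)" >&2
            return 1
            ;;
    esac

    # Refuse to continue if there is no line to rewrite.
    grep -q '^[[:space:]]*wifi\.powersave' "$conf" || {
        echo "no wifi.powersave line in $conf" >&2
        return 1
    }

    # Rewrite through a temporary file so the original keeps its permissions.
    new=$(mktemp) || return 1
    sed "s/^[[:space:]]*wifi\.powersave[[:space:]]*=.*/wifi.powersave = $value/" \
        "$conf" > "$new" && cat "$new" > "$conf"
    status=$?
    rm -f "$new"
    return $status
}
```

Run the function as root, then restart NetworkManager (for example with `sudo systemctl restart NetworkManager`) or reboot for the change to take effect.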

Supported Modes and Power Efficiency

Jetson Nano is designed with a high-efficiency Power Management Integrated Circuit (PMIC), voltage regulators, and power tree to optimize power efficiency. It supports two power modes: 5W (5 watts) and MAXN (10 watts). Each mode allows several configurations with various CPU frequencies and numbers of cores online. Jetson TX1 supports only MAXN mode.
You restrict the module to a predefined configuration by capping the memory, CPU, and GPU frequencies and number of cores online at pre-qualified values.
The following table shows the power modes predefined by NVIDIA and the associated caps on use of the module’s resources.
NVPModel clock configuration
| Property | Jetson Nano, MAXN * | Jetson Nano, 5W | Jetson TX1, UCM1 profile | Jetson TX1, UCM2 profile |
|---|---|---|---|---|
| Power Budget | 10 watts | 5 watts | n/a | n/a |
| Mode ID | 0 | 1 | 0 | 0 |
| Online CPU | 4 | 2 | 4 | 4 |
| CPU Maximal Frequency (MHz) | 1479 | 918 | 1734 | 1632 |
| GPU TPC | 1 | 1 | 1 | 1 |
| GPU Maximal Frequency (MHz) | 921.6 | 640 | 994.4 | 998.4 |
| Memory Maximal Frequency (MHz) | 1600 | 1600 | 1600 | 1600 |
SOC clocks maximal frequency (MHz), all modes:
adsp 844.8
ape 499.2
host1x 408
isp 793.6
display 665.6
csi 750
nvdec 716.8
nvenc 716.8
nvjpg 627.2
pcie 500
se 627.2
tsec 408
tsecb 627.2
vi 793.6
vic03 627.2
* The default mode is MAXN (power budget 10 watts, mode ID 0).
To change the power mode
Enter the command:
$ sudo /usr/sbin/nvpmodel -m <x>
Where <x> is the power mode ID, e.g. 0 or 1.
Alternatively, use the nvpmodel GUI front end. For more information, see To use the nvpmodel GUI, later in this topic.
Once you set a power mode, the module stays in that mode until you change it. The mode persists across power cycles and SC7.
To display the current power mode
Enter the command:
$ sudo /usr/sbin/nvpmodel -q
Alternatively, see the mode displayed to the right of the NVIDIA icon in the nvpmodel window’s menu bar. For more information, see To use the nvpmodel GUI, later in this topic.
To learn about other options
Enter the command:
$ /usr/sbin/nvpmodel -h
To define a custom power mode
Add a mode definition to the file /etc/nvpmodel.conf.
This is an example entry for mode 1:
< POWER_MODEL ID=1 NAME=5W >
CPU_ONLINE CORE_0 1
CPU_ONLINE CORE_1 1
CPU_ONLINE CORE_2 0
CPU_ONLINE CORE_3 0
CPU_A57 MIN_FREQ 0
CPU_A57 MAX_FREQ 918000
GPU_POWER_CONTROL_ENABLE GPU_PWR_CNTL_EN on
GPU MIN_FREQ 0
GPU MAX_FREQ 640000000
GPU_POWER_CONTROL_DISABLE GPU_PWR_CNTL_DIS auto
EMC MAX_FREQ 1600000000
The unit of measure for CPU frequency is kilohertz. The unit for GPU and EMC frequency is hertz. You must assign each custom mode a unique number in the ID field.
Test your custom mode to determine:
How many active cores to use
The frequency for each CPU cluster, and the GPU and EMC frequencies
The frequencies you select are subject to the MAXN limit defined in mode 0.
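Because each custom mode needs a unique ID, it can help to list the modes a configuration file defines before adding one. The sketch below assumes that every mode starts with a header line in the form shown in the example entry above (< POWER_MODEL ID=... NAME=... >); the function name list_power_modes is hypothetical.

```shell
#!/bin/sh
# List the power modes defined in an nvpmodel configuration file as
# "<id> <name>" lines, and fail if any mode ID is used more than once.
# Usage: list_power_modes [conf_file]
list_power_modes() {
    conf=${1:-/etc/nvpmodel.conf}
    awk '
        # Match only mode header lines, e.g. "< POWER_MODEL ID=1 NAME=5W >".
        /^[[:space:]]*<[[:space:]]*POWER_MODEL/ {
            id = ""; name = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^ID=/)   id = substr($i, 4)
                if ($i ~ /^NAME=/) name = substr($i, 6)
            }
            # Tolerate a closing bracket attached to the last field.
            sub(/>$/, "", id)
            sub(/>$/, "", name)
            printf "%s %s\n", id, name
            if (seen[id]++) dup = 1
        }
        END {
            if (dup) {
                print "duplicate mode ID found" > "/dev/stderr"
                exit 1
            }
        }
    ' "$conf"
}
```

After adding a mode, you can select it with the nvpmodel -m command described earlier and confirm it with nvpmodel -q.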

Thermal Management

Thermal management is essential for system stability and quality of user experience. Jetson Nano and Jetson TX1 thermal management provides the following capabilities:
Sensing for on-board and on-die thermal sensor temperature reporting
Cooldown for removing heat via the fan and for controlling heat via software clock throttling
Slowdown for hardware clock throttling
Shutdown for orderly software shutdown and hardware thermal shutdown
Thermal management is performed by software on the main CPU.
The following table identifies each thermal management action and the associated module for the SoC.
| Thermal Action | Linux Kernel Driver | Associated Module |
|---|---|---|
| Sensing | soctherm.c | Kernel software |
| | aotag.c | Kernel software |
| Cooldown for software throttling | tegra_throttle.c | Kernel software |
| | pwm_fan.c | Kernel software |
| Slowdown for hardware throttling | soctherm.c | Kernel software |
| Software shutdown | thermal_core.c | Kernel software |
| Hardware shutdown | soctherm.c | Kernel software |
| | aotag.c | Kernel software |

Linux Thermal Framework

The Linux thermal framework provides generic user space and kernel space interfaces for working with devices that measure or control temperature. The central component of the framework is the thermal zone.
For more information about the Linux thermal framework, see:
<top>/kernel/kernel-4.9/Documentation/thermal/sysfs-api.txt

Thermal Zone

A thermal zone is a virtual object that represents an area on the die whose temperature is monitored and controlled. A thermal zone acts as an object with the following components:
Temperature sensor
Cooling device
Trip points
Governor
BSP includes drivers that provide interfaces to these components.
This topic introduces the components and demonstrates how they form a thermal zone on an NVIDIA SoC.
Configuring a Thermal Zone Using the Device Tree
A thermal zone provides knobs to tune the thermal response of the zone. BSP provides several thermal zones tuned to provide optimum thermal performance. You can modify the provided thermal zones by editing the entries in the device tree. You can define sensors that monitor temperature limits and perform cooling actions based on those limits. If a device becomes too hot, you can resolve the problem in most cases by tuning the thermal zone.
The following code snippet provides an example of a thermal zone definition for the Jetson Nano and Jetson TX1 platforms. This thermal zone monitors the temperature of the THERMAL_ZONE_CPU sensor. Clock throttling is performed using the CPU-balanced cooling device when the passive trip point cpu_throttle is crossed at 97° C.
CPU-therm {
    thermal-zone-params {
        governor-name = "step_wise";
    };
    trips {
        cpu_critical {
            temperature = <102000>;
            hysteresis = <0>;
            type = "critical";
            writable;
        };
        cpu_heavy {
            temperature = <100500>;
            hysteresis = <0>;
            type = "hot";
            writable;
        };
        cpu_throttle {
            temperature = <97000>;
            hysteresis = <0>;
            type = "passive";
            writable;
        };
    };
    cooling-maps {
        map0 {
            trip = <&{/thermal-zones/CPU-therm/trips/cpu_critical}>;
            cdev-type = "tegra-shutdown";
            cooling-device = <&{/soctherm@0x700E2000/throttle@critical}
                THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
        };
        map1 {
            trip = <&{/thermal-zones/CPU-therm/trips/cpu_heavy}>;
            cdev-type = "tegra-heavy";
            cooling-device = <&throttle_heavy 1 1>;
        };
        map2 {
            trip = <&{/thermal-zones/CPU-therm/trips/cpu_throttle}>;
            cdev-type = "cpu-balanced";
            cooling-device = <&{/bthrot_cdev/cpu_balanced}
                THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
        };
    };
};
For more information about thermal knobs, see:
<top>/kernel/kernel-4.9/Documentation/devicetree/bindings/thermal.txt
Temperature Sensors
A temperature sensor in a thermal zone is responsible for reporting the temperature. Temperature is reported in units of 0.001° C. Jetson modules have several types of temperature sensors on the die and board.
For more information, see the section Thermal Management in Linux.
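Because temperatures are reported in units of 0.001° C, a raw reading must be divided by 1000 to obtain degrees Celsius. The sketch below reads the standard Linux thermal zone nodes (type and temp) and prints each zone's temperature in degrees; the function name show_thermal_zones and its optional root argument are assumptions made for illustration.

```shell
#!/bin/sh
# Print each thermal zone's type and its temperature in degrees Celsius.
# Usage: show_thermal_zones [thermal_root]
# thermal_root is a hypothetical override for testing; it defaults to sysfs.
show_thermal_zones() {
    thermal_root=${1:-/sys/class/thermal}
    for zone in "$thermal_root"/thermal_zone[0-9]*; do
        [ -r "$zone/temp" ] || continue
        # A sensor may fail to read; skip that zone instead of aborting.
        milli=$(cat "$zone/temp" 2>/dev/null) || continue
        name=$(cat "$zone/type")
        # Convert millidegrees to degrees with one decimal place.
        awk -v t="$milli" -v n="$name" 'BEGIN { printf "%s %.1f\n", n, t / 1000 }'
    done
}
```

For example, a raw reading of 45500 for the CPU-therm zone is printed as 45.5.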
Trip Points
Thermal management uses trip points to communicate with thermal zones. A trip point identifies the temperature at which to perform a thermal action.
Trip points are classified as active or passive, based on the type of cooling they trigger. A trip point is classified as critical if it triggers a thermal shutdown. A cooling map specifies how a cooling device is associated with certain trip points. Jetson BSP supports fan control and clock throttling as cooling actions.
Cooling Devices
A cooling device reduces the temperature of a power dissipating device. There are essentially two types of cooling devices:
An active cooling device, such as a fan, reduces the temperature of a power dissipating device by removing heat.
A passive cooling device, such as software or hardware clock throttling, reduces temperature by reducing device performance, and so reducing heat dissipation.
For more information, see Thermal Cooling.
Governors
Thermal management requires some form of feedback control system that keeps the device within a safe operating temperature. A governor implements this feedback control loop. While the Linux thermal framework provides many different governors, BSP provides a simple Proportional Integral Derivative (PID) controller for all passive throttling needs.
BSP-Specific Thermal Zones
BSP defines platform-specific thermal zones. The zones are tuned to provide the best performance within the thermal constraints of the device. Each thermal zone uses a temperature sensor that is controlled by the Linux kernel as described in the following table.
| Thermal Zone | Thermal Sensor | Associated Module |
|---|---|---|
| CPU-therm | THERMAL_ZONE_CPU | Linux kernel |
| GPU-therm | THERMAL_ZONE_GPU | Linux kernel |
| PLL-therm | THERMAL_ZONE_PLLX | Linux kernel |
| AO-therm | THERMAL_ZONE_AO | Linux kernel |
| PMIC-die | PMIC | Linux kernel |
| Tboard_tegra | Tmp451 local sensor | Linux kernel |
| Tdiode_tegra | Tmp451 remote sensor | Linux kernel |
| thermal-fan-est | Weighted average of CPU and GPU | Linux kernel |
Gains achieved by tuning are limited by the Thermal Design Power (TDP) of the system. Tuning cannot remedy a faulty TDP. Removing all the thermal zones does not guarantee maximum performance, and can cause irreversible damage to the device.

Thermal Management in Linux

The Linux kernel provided by BSP includes several drivers for on-board and on-die temperature sensing.

Thermal Sensors

Jetson Nano and Jetson TX1 have several types of sensors to support hardware and software cooling strategies.
On-die thermal sensors
BSP includes drivers for on-die soctherm and aotag thermal sensors as follows:
| Thermal Sensor Block | Thermal Sensor | Sensed Location |
|---|---|---|
| AOTAG | AOTAG | Co-located with TDIODE in the pad ring |
| SOC_THERM | PLLX | Placed adjacent to PLLX |
| | MEM (x2) | Placed between the data bricks of the pad blocks of memory channels 0 and 1, respectively |
| | CPU (x4) | Centrally located in the CPU |
| | GPU | Within the GPU |
NCT Sensors
Jetson BSP includes a driver for on-board sensor devices such as:
NCT1008
NCT72
TMP451
These devices can sense their own temperature as well as the temperature of a remote diode. On Jetson platforms these sensors are set up as follows:
| Thermal Zone | Thermal Sensor | Sensed Location |
|---|---|---|
| Tdiode_tegra | Remote sensor | Temperature on die near GPU |
| Tboard_tegra | Local sensor | Temperature of the board |
Jetson BSP configures these sensors to operate in an extended mode to increase the temperature range to −64° C to 191° C.
Operation During SC7
On Jetson Nano and Jetson TX1, the voltage rail that powers the sensor is gated when Jetson enters state SC7. Consequently, the sensor is stopped when Jetson enters SC7 and is turned back on when Jetson exits SC7.
Thermal Capabilities
The NCT sensors generate thermal events for:
Thermal zone trip points
Hardware thermal shutdown
Correction Offset
The NCT sensors allow software to program a static offset temperature for remote sensors. This accounts for any inaccuracy that may arise in the sensor hardware. Jetson BSP reads the offset from the device tree and programs it into the offset register on boot. The offset is calculated and validated via oil bath experiments.

Thermal Cooling

BSP provides thermal management using fan control and throttling of various clocks in the system.
Fan Management
BSP provides active cooling by fan management through the cooling device pwm-fan, which provides:
Fan speed control by programming the PWM controller
Ramp-up and ramp-down control to change the speed of the fan smoothly
Fan control during various power states
The PWM-RPM mapping, and the various ramp rates, are stored as part of the device tree binary. The pwm-fan cooling device maps these PWM values to a cooling state. The fan cooling device can be attached to monitor the temperature of any of the BSP sensors. As the temperature increases, the governor picks a progressively deeper cooling state for the fan. This results in a higher RPM for the fan, which produces more cooling.
SoC thermal management uses the fan as the first line of defense to delay clock throttling until a much higher temperature is reached.
Software Clock Throttling
BSP provides thermal cooling by throttling various clocks in the system. When a thermal sensor’s temperature rises above a throttling trip point, clock throttling employs the DVFS capabilities of the clocks to reduce their operating frequencies, and thereby the voltages of the rails that power the clocks. This reduction in frequency and voltage reduces power consumption, which helps to control the temperature.
Because BSP provides cooling by reducing the clock frequency, it directly impacts performance and the user experience. If a device feels warm and seems sluggish, it may be due to thermal throttling on the clocks. You can remedy this by tuning the thermal zones provided in the following BSP balanced cooling devices:
gpu_balanced
cpu_balanced
emergency_balanced
Each of these balanced cooling devices provides several cooling states, each of which translates to a maximum allowable operating frequency for the CPU, GPU, and EMC clocks. These frequencies are optimized to provide the best possible performance at a given temperature. The frequency tables for these clocks are part of the device tree binary.
The governor uses the current temperature of a thermal zone as an input to the feedback control loop, and uses the output of the control loop to set a new cooling state for the thermal zone’s cooling device. As the device heats up, the governor picks progressively higher cooling states, which result in lower frequency caps for all of the clocks, and potentially greater cooling. BSP performs this thermal throttling of the clocks to maintain the junction temperature of the die within recommended safe limits. For software throttling trip temperatures, see the table in Thermal Specifications.
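To check whether software throttling is currently active, you can inspect the cooling state that the governor has selected for each cooling device. The sketch below reads the standard Linux thermal framework nodes type, cur_state, and max_state; the function name show_cooling_devices and its optional root argument are illustrative assumptions.

```shell
#!/bin/sh
# Print each cooling device's type with its current and maximum cooling state.
# Usage: show_cooling_devices [thermal_root]
# thermal_root is a hypothetical override for testing; it defaults to sysfs.
show_cooling_devices() {
    thermal_root=${1:-/sys/class/thermal}
    for dev in "$thermal_root"/cooling_device[0-9]*; do
        [ -d "$dev" ] || continue
        printf '%s state %s of %s\n' \
            "$(cat "$dev/type")" \
            "$(cat "$dev/cur_state")" \
            "$(cat "$dev/max_state")"
    done
}
```

A balanced cooling device that reports a state above 0 indicates that a frequency cap is being applied.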
Hardware Throttling
Each element in a power delivery system includes limitations such as:
The amount of current a battery can supply without shutting down
The amount of current a regulator can provide before it fails to maintain its output voltage
The amount of ripple current an inductor in a switching regulator can tolerate without overheating
These limitations can result in fast transient electrical and thermal events such as:
Overcurrent at the battery
Voltage drop at the PMIC
Temperature spikes
The Linux kernel refers to these events as OC alarms and triggers hardware throttling of the clocks to handle them.
Impact
Like software throttling, hardware throttling may reduce performance. Because the triggering events are rare and transient in nature, though, the user experience is minimally impacted.
The host OS is not notified of these events, but you can detect the resulting drop in clock rates by using a performance measuring tool that samples the CPU cycle counters. While thermal management in the host OS seeks to control temperature on an ongoing basis, hardware throttling clamps down the clocks to handle events.
Throttle Points and Vector Configuration
The BPMP device tree binary holds the various throttle points and the throttle settings that govern when and how throttling is performed. The soctherm driver in the firmware programs the hardware and handles any interrupts resulting from these events. You can change the throttle points by changing the kernel device tree.
This table shows the hardware throttling levels:
| Hardware Throttling | Clock Throttled Percentage |
|---|---|
| Heavy | 87.5 |
| Medium | 75 |
| Light | 50 |
Throttle vectors are optimized for limiting peak current consumption while maximizing performance. To manage peak current consumption, the Linux kernel supports capping the CPU and GPU clocks at three levels (light, medium, and heavy) as described in the device tree bindings. Clock capping prevents the CPU and GPU from drawing more current than their voltage regulators can supply. For hardware throttling trip temperature, see the table in Thermal Specifications.
Design Considerations
Designing failsafe measures into Power Management Integrated Circuits (PMICs), or using the battery controller to shut down the device when the events described here occur, results in a bad user experience. Similarly, designing power delivery hardware for worst-case loads results in large and costly components.
Consequently, NVIDIA SoCs are designed for use with power delivery systems that are adequate for common loads. NVIDIA SoCs actively manage their components to avoid exceeding their design limits. When events are transient, the advantage of this approach to power management becomes more compelling.
Thermal Shutdown
Thermal zones also define a special type of trip point called a critical trip point which triggers a software shutdown. A critical trip point allows the operating system to save its state and perform an orderly shutdown before overheating causes a hardware shutdown. BSP defines a critical trip point for each thermal zone. You can set a lower temperature limit for the software shutdown.
A hardware shutdown (a thermtrip) occurs after all of the other cooling strategies have failed, and in particular, after software shutdown has failed to occur when it should. The SoC performs a hardware shutdown by asserting the reset pin on the PMIC. This is intended to be a rare event.
For hardware shutdown limits, see the table in Thermal Specifications.

Thermal Specifications

This table lists the trip temperature of each cooling action for each thermal zone.
Thermal Zone     Thermal Sensor                   Cooling Action  Jetson TX1  Jetson Nano
CPU-therm        THERMAL_ZONE_CPU                 SW throttling   97.0° C     97.0° C
CPU-therm        THERMAL_ZONE_CPU                 HW throttling   100.5° C    100.5° C
CPU-therm        THERMAL_ZONE_CPU                 SW shutdown     102.0° C    102.0° C
CPU-therm        THERMAL_ZONE_CPU                 HW shutdown     102.5° C    102.5° C
GPU-therm        THERMAL_ZONE_AUX                 SW throttling   97.5° C     97.5° C
GPU-therm        THERMAL_ZONE_AUX                 HW throttling   101.0° C    101.0° C
GPU-therm        THERMAL_ZONE_AUX                 SW shutdown     102.5° C    102.5° C
GPU-therm        THERMAL_ZONE_AUX                 HW shutdown     103.0° C    103.0° C
PLL-therm        THERMAL_ZONE_PLLX                DRAM cooling    -           70° C
Tboard_tegra *   tmp451 local sensor              DRAM cooling    70° C       n/a
Tboard_tegra *   tmp451 local sensor              HW shutdown     120.0° C    n/a
Tdiode_tegra *   tmp451 remote sensor             HW shutdown     105.0° C    n/a
PMIC-Die         PMIC                             HW shutdown     120.0° C    120.0° C
thermal-fan-est  Weighted average of CPU and GPU  Fan ON          51.0° C     51.0° C
* The tmp451 on-board thermal sensor is not present on the Jetson Nano module.
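To compare live temperatures against the trip temperatures above, the standard Linux thermal sysfs nodes can be read directly. The following is a minimal sketch, not part of the BSP. It assumes that the zones appear under /sys/class/thermal, that each zone's type node matches a zone name in the table, and that temperatures are reported in millidegrees Celsius. The function name print_thermal_zones is hypothetical.

```shell
#!/bin/bash
# Print "<zone type> <temperature in deg C>" for every thermal zone.
# $1: thermal sysfs root (assumed default: the standard Linux location).
print_thermal_zones() {
    local root="${1:-/sys/class/thermal}"
    local zone type millideg
    for zone in "$root"/thermal_zone*; do
        # Skip zones whose nodes are missing or unreadable.
        [ -r "$zone/type" ] && [ -r "$zone/temp" ] || continue
        type=$(cat "$zone/type")
        # Reading temp can fail for a zone whose sensor is not active.
        millideg=$(cat "$zone/temp" 2>/dev/null) || continue
        # Convert millidegrees Celsius to degrees with one decimal place.
        awk -v t="$type" -v m="$millideg" \
            'BEGIN { printf "%s %.1f\n", t, m / 1000 }'
    done
}
```

Calling print_thermal_zones with no argument reads the live system. Passing a directory as the first argument makes it read from that directory instead, which is convenient for testing against saved values.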

Software-Based Power Consumption Modeling

The Jetson Nano and Jetson TX1 modules have a three-channel INA3221 power monitor at I2C address 0x40.

Power Monitor Information

The information from the INA3221 power monitor can be read using sysfs nodes. The naming convention for sysfs nodes is:
Node                     Description
rail_name_<N>            Exports the rail name.
in_current<N>_input      Exports rail current in milliamperes.
in_voltage<N>_input      Exports rail voltage in millivolts.
in_power<N>_input        Exports rail power in milliwatts.
crit_current_limit_<N>   Exports rail critical current limit in milliamperes.
Where <N> is a channel number 0-2.
 
Note:
The INA driver may also present other nodes. Do not modify any INA sysfs node value. Modifying these values can result in damage to the device.
The sysfs nodes to read for rail names, voltage, current, power, and critical current limit are at:
Jetson TX1: /sys/bus/i2c/drivers/ina3221x/1-0040/iio:device0
Jetson Nano: /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/
The rail names for I2C address 0x40 are:
Rail Name            Description
Channel 0: VDD_IN    Main module power input.
Channel 1: VDD_GPU   GPU power rail.
Channel 2: VDD_CPU   CPU power rail.

Carrier Board Information (Jetson TX1 only)

The Jetson TX1 Developer Kit carrier board has three-channel INA3221 power monitors at I2C addresses 0x42 and 0x43. The sysfs nodes to read rail name, voltage, current, power, and critical current limit are at:
/sys/bus/i2c/drivers/ina3221x/1-0042/iio:device2
/sys/bus/i2c/drivers/ina3221x/1-0043/iio:device3
The rail names for I2C address 0x42 are:
Rail Name                  Description
Channel 0: VDD_MUX         Carrier board power input.
Channel 1: VDD_5V_IO_SYS   Carrier board 5 V supply.
Channel 2: VDD_3V3_SYS     Carrier board 3.3 V supply.
The rail names for I2C address 0x43 are:
Rail Name                                              Description
Channel 0: VDD_3V3_IO_SLP                              Carrier board 3.3 V sleep supply.
Channel 1: VDD_1V8_IO (Name on schematic is VDD_1V8)   Carrier board 1.8 V supply.
Channel 2: VDD_3V3_SYS_M2                              3.3 V supply for M.2 Key E connector.

Examples

To read the channel 0 rail name (i.e., VDD_IN) from the INA3221 at 0x40 on Jetson Nano, enter the command:
$ cat /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/rail_name_0
To read VDD_IN current, voltage, and power, enter the commands:
$ cat /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/in_current0_input
$ cat /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/in_voltage0_input
$ cat /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/in_power0_input
To read VDD_IN critical current limit, enter the command:
$ cat /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/crit_current_limit_0
 
To set VDD_IN critical current limit, enter the command:
$ echo <current> > /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0/crit_current_limit_0
 
Where <current> is a critical current limit to be set for VDD_IN rail in milliamperes.
Note:
With regard to accuracy, assume a 5% guard band for INA measurements greater than 200 milliwatts. Below that, accuracy may deviate by as much as 15%.
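The per-node reads above can be combined into one loop over the three channels. The following is a minimal sketch under the node naming convention described in Power Monitor Information. It only reads nodes and never writes them. The function name print_ina3221_rails is hypothetical.

```shell
#!/bin/bash
# Print one line per INA3221 channel: rail name, current, voltage, power.
# $1: directory holding the sysfs nodes, for example the Jetson Nano path
#     /sys/bus/i2c/drivers/ina3221x/6-0040/iio:device0
print_ina3221_rails() {
    local dir="$1"
    local n name milliamps millivolts milliwatts
    for n in 0 1 2; do
        # Stop with an error if any node for this channel is unreadable.
        name=$(cat "$dir/rail_name_$n") || return 1
        milliamps=$(cat "$dir/in_current${n}_input") || return 1
        millivolts=$(cat "$dir/in_voltage${n}_input") || return 1
        milliwatts=$(cat "$dir/in_power${n}_input") || return 1
        printf '%s: %s mA, %s mV, %s mW\n' \
            "$name" "$milliamps" "$millivolts" "$milliwatts"
    done
}
```

The same function can be pointed at the Jetson TX1 module path or at either of the carrier board monitor paths by passing a different directory.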

Under Voltage and Over Current Protection

The Jetson Nano and Jetson TX1 modules have built-in under voltage and over current protection mechanisms. Over current protection is provided by an on-board INA3221 power monitor, which can trigger hardware clock throttling via soctherm-OC to reduce power consumption when the module input current exceeds a software-defined critical current limit. Similarly, under voltage protection is provided by an on-board voltage comparator circuit, which can trigger hardware clock throttling via soctherm-OC when the module input voltage drops below a hardware-defined low voltage threshold.
When clocks are throttled due to an under voltage or over current event:
1) The kernel prints a “soctherm: OC ALARM” message on the debug console.
2) The nvpmodel GUI panel indicator icon changes from green to red and pops up the message “System is now being throttled.”
3) The carrier board power LED begins blinking (Jetson Nano only).
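On a headless system, where the GUI indicator and the LED may not be visible, the kernel message is the most practical signal. The following is a minimal sketch that counts such messages in log text supplied on standard input. It assumes only the message prefix quoted above. The function name count_oc_alarms is hypothetical.

```shell
#!/bin/bash
# Count log lines that contain the soctherm over current alarm message.
# Reads log text on stdin, so it works with dmesg output or a saved log.
count_oc_alarms() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so the status is forced to success to keep "0" a normal result.
    grep -c 'soctherm: OC ALARM' || true
}
```

For example, piping the output of dmesg into count_oc_alarms prints the number of alarms logged since boot. Reading the kernel log may require root privileges, depending on system configuration.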

Related Tools and Techniques

This section describes tools and techniques for managing power.

GPU 3D Frequency Scaling

GPU 3D frequency scaling is enabled by default.
To disable 3D frequency scaling
Enter the command:
$ echo 0 > /sys/devices/57000000.gpu/enable_3d_scaling
To enable 3D frequency scaling
Enter the command:
$ echo 1 > /sys/devices/57000000.gpu/enable_3d_scaling

Getting and Setting Frequencies

Use the following procedures to set frequencies and report current frequency settings.
Note:
In all of these procedures, <x> is a CPU core number. For example, to apply a command to CPU core 1, replace cpu<x> with cpu1.
To get system clock information
Enter the command:
$ cat /sys/kernel/debug/clk/clk_summary
To print the CPU lower boundary, upper boundary, and current frequency
Enter the commands:
$ cat /sys/devices/system/cpu/cpu<x>/cpufreq/cpuinfo_min_freq
$ cat /sys/devices/system/cpu/cpu<x>/cpufreq/cpuinfo_max_freq
$ cat /sys/devices/system/cpu/cpu<x>/cpufreq/cpuinfo_cur_freq
To change the CPU upper boundary
Enter the command:
$ echo <cpu_freq> > /sys/devices/system/cpu/cpu<x>/cpufreq/scaling_max_freq
To change the CPU lower boundary
Enter the command:
$ echo <cpu_freq> > /sys/devices/system/cpu/cpu<x>/cpufreq/scaling_min_freq
To set the static CPU frequency
Enter the commands:
$ echo <cpu_freq> > /sys/devices/system/cpu/cpu<x>/cpufreq/scaling_min_freq
$ echo <cpu_freq> > /sys/devices/system/cpu/cpu<x>/cpufreq/scaling_max_freq
Where <cpu_freq> is one of the frequency values listed in:
/sys/devices/system/cpu/cpu<x>/cpufreq/scaling_available_frequencies
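The static CPU frequency procedure can be wrapped in a function that first checks the requested value against scaling_available_frequencies. The following is a minimal sketch, assuming it runs with permission to write the cpufreq nodes. It writes the two limits in whichever order keeps the minimum from exceeding the maximum at every step. The function name pin_cpu_freq is hypothetical.

```shell
#!/bin/bash
# Pin one CPU's cpufreq limits to a single frequency.
# $1: cpufreq directory, e.g. /sys/devices/system/cpu/cpu0/cpufreq
# $2: target frequency, in the units used by the sysfs nodes (kHz)
pin_cpu_freq() {
    local dir="$1" target="$2"
    local f found=0 cur_max
    # Accept only a value that the driver lists as available.
    for f in $(cat "$dir/scaling_available_frequencies"); do
        [ "$f" = "$target" ] && found=1
    done
    if [ "$found" -ne 1 ]; then
        echo "$target is not in scaling_available_frequencies" >&2
        return 1
    fi
    cur_max=$(cat "$dir/scaling_max_freq") || return 1
    if [ "$target" -gt "$cur_max" ]; then
        # Target is above the current ceiling: raise the ceiling first.
        echo "$target" > "$dir/scaling_max_freq" || return 1
        echo "$target" > "$dir/scaling_min_freq" || return 1
    else
        # Target is within the current ceiling: move the floor first.
        echo "$target" > "$dir/scaling_min_freq" || return 1
        echo "$target" > "$dir/scaling_max_freq" || return 1
    fi
}
```

A request for a value that is not listed returns a non-zero status and leaves both limits unchanged.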
To print the GPU lower boundary, upper boundary, and current frequency
Enter the commands:
$ cat /sys/devices/57000000.gpu/devfreq/57000000.gpu/min_freq
$ cat /sys/devices/57000000.gpu/devfreq/57000000.gpu/max_freq
$ cat /sys/devices/57000000.gpu/devfreq/57000000.gpu/cur_freq
To change the GPU upper boundary
Enter the command:
$ echo <gpu_freq> > /sys/devices/57000000.gpu/devfreq/57000000.gpu/max_freq
To change the GPU lower boundary
Enter the command:
$ echo <gpu_freq> > /sys/devices/57000000.gpu/devfreq/57000000.gpu/min_freq
To set the static GPU frequency
Enter the commands:
$ echo <gpu_freq> > /sys/devices/57000000.gpu/devfreq/57000000.gpu/min_freq
$ echo <gpu_freq> > /sys/devices/57000000.gpu/devfreq/57000000.gpu/max_freq
Where <gpu_freq> is one of the frequency values listed in:
/sys/devices/57000000.gpu/devfreq/57000000.gpu/available_frequencies
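The three GPU read commands can be combined into a single summary. The following is a minimal sketch. It assumes that the devfreq nodes report frequencies in Hz, which is the convention of the Linux devfreq interface, so verify the units on your release before relying on the conversion. The function name print_gpu_freqs_mhz is hypothetical.

```shell
#!/bin/bash
# Print the GPU minimum, maximum, and current frequency in MHz.
# $1: devfreq directory (defaults to the GPU path used in this section).
print_gpu_freqs_mhz() {
    local dir="${1:-/sys/devices/57000000.gpu/devfreq/57000000.gpu}"
    local node hz
    for node in min_freq max_freq cur_freq; do
        hz=$(cat "$dir/$node") || return 1
        # Assumes the node value is in Hz; convert to MHz, one decimal.
        awk -v n="$node" -v hz="$hz" \
            'BEGIN { printf "%s: %.1f MHz\n", n, hz / 1000000 }'
    done
}
```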
To print the EMC lower boundary, upper boundary, and current frequency
Enter the commands:
$ cat /sys/kernel/debug/tegra_bwmgr/emc_min_rate
$ cat /sys/kernel/debug/tegra_bwmgr/emc_max_rate
$ cat /sys/kernel/debug/tegra_bwmgr/emc_rate
To set static EMC frequency
Enter the commands:
$ echo <emc_freq> > /sys/kernel/debug/clk/override.emc/clk_update_rate
$ echo 1 > /sys/kernel/debug/clk/override.emc/clk_state
Where <emc_freq> is a frequency value between the EMC minimum and maximum frequencies.

Maximizing Jetson Nano or Jetson TX1 Performance

The BSP provides the jetson_clocks script to maximize Jetson Nano or Jetson TX1 performance by setting the CPU, GPU, and EMC clocks to their maximum frequencies statically. You can also use the script to show current clock settings, store current clock settings into a file, and restore clock settings from a file.
The script is available at:
/usr/bin/jetson_clocks
To run the script, enter:
$ jetson_clocks [options]
Option               Description
--show               Displays the current settings.
--store [<file>]     Stores the current settings to a file. The default file is l4t_dfs.conf.
--restore [<file>]   Restores the saved settings from a file. The default file is l4t_dfs.conf.
--fan                Sets the PWM fan speed to maximum.
To show the current settings
Enter the command:
$ sudo /usr/bin/jetson_clocks --show
To store the current settings
Enter the command:
$ sudo /usr/bin/jetson_clocks --store
To maximize platform performance
Enter the command:
$ sudo /usr/bin/jetson_clocks
To maximize platform performance and fan speed
Enter the command:
$ sudo /usr/bin/jetson_clocks --fan
 
Note:
Starting with Release 32.4, jetson_clocks no longer sets maximum fan speed by default. If you prefer the old behavior, use the --fan option.
To restore the previous settings
Enter the command:
$ sudo /usr/bin/jetson_clocks --restore
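The store, maximize, and restore steps are often used together around a single workload. The following is a minimal sketch that uses only the options listed above and assumes it runs as root. The function name run_at_max_clocks and the JETSON_CLOCKS override variable are hypothetical; the variable exists only so that the sketch can be exercised without the real script.

```shell
#!/bin/bash
# Run a command at maximum clocks, then restore the previous settings.
# Usage: run_at_max_clocks <command> [<args>...]
run_at_max_clocks() {
    local jc="${JETSON_CLOCKS:-/usr/bin/jetson_clocks}"
    local tmpdir conf status=0
    # Store into a file that does not exist yet, inside a fresh directory.
    tmpdir=$(mktemp -d) || return 1
    conf="$tmpdir/l4t_dfs.conf"
    if ! "$jc" --store "$conf"; then
        rm -rf "$tmpdir"
        return 1
    fi
    "$jc"                      # maximize CPU, GPU, and EMC clocks
    "$@" || status=$?          # run the workload, remembering its status
    "$jc" --restore "$conf"    # restore the settings saved above
    rm -rf "$tmpdir"
    return "$status"
}
```

The function returns the exit status of the workload, so it can be used in scripts that check whether the workload succeeded.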

Using CPU Hotplugging

You can manage CPU hotplugging as follows.
To turn a slave CPU on or off manually
Enter the command:
$ echo <b> > /sys/devices/system/cpu/cpu<x>/online
Where:
<b> is 1 to turn the CPU on, or 0 to turn it off
<x> is the CPU core number
To check a CPU’s state
Enter the command:
$ cat /sys/devices/system/cpu/cpu<x>/online
Where <x> is the CPU core number.
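To check the state of every core at once, the online nodes can be read in a loop. The following is a minimal sketch. It assumes that a core with no online node cannot be hotplugged and is therefore running; on some kernels this applies to cpu0. The function name print_cpu_online_states is hypothetical.

```shell
#!/bin/bash
# Print "<cpu name> <state>" for every CPU core: 1 is online, 0 is offline.
# $1: CPU sysfs root (defaults to the path used in this section).
print_cpu_online_states() {
    local root="${1:-/sys/devices/system/cpu}"
    local cpu state
    # cpu[0-9]* matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    for cpu in "$root"/cpu[0-9]*; do
        [ -d "$cpu" ] || continue
        if [ -r "$cpu/online" ]; then
            state=$(cat "$cpu/online")
        else
            # Assumption: no online node means the core is always online.
            state=1
        fi
        printf '%s %s\n' "${cpu##*/}" "$state"
    done
}
```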

nvpmodel GUI

The nvpmodel GUI is a GUI front end for the nvpmodel command line tool. It is an easy way to access power-related functionality and information.
To use the nvpmodel GUI
The nvpmodel GUI is represented by an NVIDIA icon on the right side of the Ubuntu desktop’s top bar:
The current power mode is displayed next to the NVIDIA icon. In the illustration above, the current mode is MAXN.
To switch the current power mode, click the NVIDIA icon to open a dropdown menu from the icon. Click “Power mode” to open a submenu of power modes.
Click the power mode you want to set.
To run tegrastats, click the NVIDIA icon to open the dropdown menu.
Click “Run tegrastats” to spawn a terminal window and run tegrastats.
If system input voltage drops below a safe level, the nvpmodel GUI displays a desktop notification to warn you that the system is being throttled back to avoid a shutdown due to insufficient power.
The tegrastats display provides power-related information such as CPU, GPU, and EMC frequencies and the temperatures of thermal zones registered to the system.