NVIDIA Tegra
NVIDIA DRIVE OS 5.1 Linux

Developer Guide
5.1.0.2 Release


 
Watchdog Timer
 
If an application terminates or hangs, a hardware watchdog timer eventually expires, triggering a CPU reset, and enabling the system to recover without user intervention.
Note:
For information about the available Tegra Watchdog Timers and configurations, see the “Watchdog Timers (WDTs)” section of the Tegra Technical Reference Manual (TRM) for your chip.
When the watchdog hardware is turned on, its timer starts counting down. The default timeout value is 120 seconds. When the timeout occurs, the WDT1 hardware sends a reset signal to the CPU, which causes the CPU to reset.
Programming Counter Expirations at 1/3 Timeout
NVIDIA® Tegra® devices have three counter expirations. To ensure that the final watchdog reset timeout matches the timeout requested by the application, you must program the timer period for each counter expiration to be 1/3 of the timeout. For example, for a timeout period of 120 seconds, the timer period for each counter expiration is 120/3 = 40 seconds: the first expiry happens at 40 seconds and the third expiry (the WDT reset) happens at 40 x 3 = 120 seconds.
If a CPU lockup happens just before the first expiry, the reboot can happen in about 80 seconds. In all other cases, the reboot happens in 80-120 seconds.
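The arithmetic above can be sketched in C. This is only an illustration: the function names are hypothetical, and the 80-second lower bound is modeled on the assumption that a lockup just before the first expiry leaves two full periods before the reset.

```c
/* Number of counter expirations before the watchdog resets the CPU. */
#define WDT_EXPIRY_COUNT 3u

/* Timer period to program for each counter expiration, in seconds.
 * Integer division: timeouts that are not multiples of 3 are rounded down. */
unsigned int wdt_expiry_period(unsigned int timeout_s)
{
    return timeout_s / WDT_EXPIRY_COUNT;
}

/* Shortest delay from lockup to reset (assumed model): the lockup happens
 * just before the first expiry, so two full periods remain. */
unsigned int wdt_min_reset_delay(unsigned int timeout_s)
{
    return (WDT_EXPIRY_COUNT - 1u) * wdt_expiry_period(timeout_s);
}

/* Longest delay from lockup to reset: all three periods elapse. */
unsigned int wdt_max_reset_delay(unsigned int timeout_s)
{
    return WDT_EXPIRY_COUNT * wdt_expiry_period(timeout_s);
}
```

For the default 120-second timeout, these helpers give a 40-second period and a reset window of 80 to 120 seconds, matching the figures above.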
To modify the WDT expiration
1. Locate the Linux device tree file at:
<top>/hardware/nvidia/soc/t19x/kernel-dts/tegra194-soc/tegra194-soc-base.dtsi
2. Modify the value of the nvidia,heartbeat-init property under the watchdog@30d0000 node. This property defaults to 120 seconds.
watchdog@30d0000 {
…..
nvidia,heartbeat-init = <120>;
….
};
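After the modified device tree is in use, the property value can be read back on the target. The following is a minimal sketch, assuming the kernel exposes the live device tree as files (typically under /proc/device-tree) and that the property holds a single 32-bit cell. Device tree cells are stored big-endian, so the bytes must be reassembled. The helper names dt_cell_to_u32 and dt_read_u32 are hypothetical, and the full path to the watchdog@30d0000 node depends on where the node sits in the tree, so it is left to the caller.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode one big-endian 32-bit device tree cell. */
uint32_t dt_cell_to_u32(const unsigned char cell[4])
{
    return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
           ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/* Read a single-cell property file such as nvidia,heartbeat-init.
 * Returns 0 on success and stores the value; returns -1 on failure. */
int dt_read_u32(const char *path, uint32_t *value)
{
    unsigned char cell[4];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    n = fread(cell, 1, sizeof(cell), f);
    fclose(f);
    if (n != sizeof(cell))
        return -1;
    *value = dt_cell_to_u32(cell);
    return 0;
}
```

With the default configuration shown above, reading the nvidia,heartbeat-init property file this way should yield 120.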
 
Enabling WDT0 from Kernel or User Space
You can enable WDT0 from the kernel or from user space. If WDT0 is enabled in the kernel, the kernel loads the WDT0 driver during boot and then starts resetting (or kicking) WDT0. This prevents the device from restarting under normal operation.
If you have already enabled the default WDT0 driver in the Linux kernel, your user-space applications do not need to kick WDT0.
Alternatively, applications can enable WDT0 manually from user space using standard Linux system calls and then kick the watchdog periodically. (For more information, see the sample code in the second procedure below.)
Typically, enabling WDT0 is sufficient for system monitoring. If you must enable a watchdog on other CPUs, you must modify the WDT driver.
To enable WDT0 from the Linux kernel
1. Locate and modify the kernel configuration file:
<top>/drive-oss-src/kernel/arch/arm64/configs/tegra_t186ref_gnu_linux_defconfig
2. Add the following two lines under CONFIG_WATCHDOG_NOWAYOUT=y:
CONFIG_TEGRA<ver>_WATCHDOG=y
CONFIG_TEGRA_WATCHDOG_ENABLE_ON_PROBE=y
Where <ver> is the shorthand nnx for the chip family, such as 18X for the t18x family of processors.
To enable WDT0 from user space
1. Locate and modify the kernel configuration file:
<top>/drive-oss-src/kernel/arch/arm64/configs/tegra_t186ref_gnu_linux_defconfig
2. Add the following line under CONFIG_WATCHDOG_NOWAYOUT=y:
CONFIG_TEGRA<ver>_WATCHDOG=y
The WDT0 device node is /dev/watchdog0. The following user-space sample code shows how to open and enable the watchdog timer, get and set the timeout value, and kick the timer.
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Illustrative wrapper: returns the open WDT0 descriptor, or -1 on failure */
int wdt0_start(void)
{
    int fd, ret;
    int timeout = 0;

    /* Open the WDT0 device (WDT0 enables itself automatically) */
    fd = open("/dev/watchdog0", O_RDWR);
    if (fd < 0) {
        fprintf(stderr, "Open watchdog device failed!\n");
        return -1;
    }

    /* WDT0 is counting now; check the default timeout value */
    ret = ioctl(fd, WDIOC_GETTIMEOUT, &timeout);
    if (ret) {
        fprintf(stderr, "Get watchdog timeout value failed!\n");
        close(fd);
        return -1;
    }
    fprintf(stdout, "Watchdog timeout value: %d\n", timeout);

    /* Set a new timeout value of 60 s */
    /* Note: the value should be within [5, 1000] */
    timeout = 60;
    ret = ioctl(fd, WDIOC_SETTIMEOUT, &timeout);
    if (ret) {
        fprintf(stderr, "Set watchdog timeout value failed!\n");
        close(fd);
        return -1;
    }
    fprintf(stdout, "New watchdog timeout value: %d\n", timeout);

    /* Kick WDT0; this should be repeated periodically */
    ret = ioctl(fd, WDIOC_KEEPALIVE, NULL);
    if (ret) {
        fprintf(stderr, "Kick watchdog failed!\n");
        close(fd);
        return -1;
    }

    /* Keep the descriptor open so that WDT0 can be kicked again */
    return fd;
}
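The sample above kicks WDT0 only once, while a real application must repeat the kick before the timeout elapses. The following is a minimal sketch of such a loop, assuming fd is a descriptor obtained by opening /dev/watchdog0 as in the sample. The helper names are hypothetical, and kicking at one third of the timeout is an assumed safety margin, not an NVIDIA recommendation.

```c
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Kick interval in seconds: one third of the timeout (assumed margin),
 * but never less than 1 second. */
unsigned int wdt_kick_interval(int timeout_s)
{
    int interval = timeout_s / 3;

    return interval < 1 ? 1u : (unsigned int)interval;
}

/* Kick the watchdog once. Returns 0 on success, -1 on failure. */
int wdt_kick(int fd)
{
    return ioctl(fd, WDIOC_KEEPALIVE, NULL);
}

/* Kick the watchdog `count` times, sleeping one interval after each kick.
 * Stops and returns -1 on the first failed kick; returns 0 otherwise. */
int wdt_kick_loop(int fd, int timeout_s, unsigned int count)
{
    unsigned int interval = wdt_kick_interval(timeout_s);
    unsigned int i;

    for (i = 0; i < count; i++) {
        if (wdt_kick(fd) != 0)
            return -1;
        sleep(interval);
    }
    return 0;
}
```

A long-running service would typically call wdt_kick from its main loop or a dedicated thread instead of using a fixed count. The bounded loop here only keeps the sketch easy to exercise.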
 
Note:
RT tasks with a priority higher than that of the watchdog timer are not supported. The priority of the watchdog timer is -2 when running under the Hypervisor and -51 on native Linux.