TensorFlow For Jetson Platform

This document describes the key features, software enhancements and improvements, and known issues for TensorFlow 1.15.3 and 2.3.0 on the Jetson platform.

Key Features and Enhancements

This release includes the following key features and enhancements.
  • As of the 20.02 release, the TensorFlow package has been renamed from tensorflow-gpu to tensorflow. This change affects only the package's installation and its name within the Python package manager, not how TensorFlow is imported or accessed once installed on the system. To upgrade properly, you may need to uninstall the tensorflow-gpu package before installing tensorflow, as having both installed simultaneously leads to undefined behavior.

  • TensorFlow 2.0 is now available for installation. See TensorFlow's Effective TensorFlow 2 guide for details about the update. Note that TensorFlow 2.x is not fully compatible with TensorFlow 1.x releases; code written for the older framework may not work with the newer package.

  • Bug fixes and improvements for TF-TRT. For more information, see the TensorFlow-TensorRT (TF-TRT) User Guide and the TensorFlow Container Release Notes.

Compatibility

Table 1. TensorFlow compatibility with NVIDIA containers and JetPack

TensorFlow Version    NVIDIA TensorFlow Container    JetPack Version
2.3.0                 20.09                          4.4
2.2.0                 20.08, 20.07, 20.06            4.4
2.1.0                 20.04                          4.4
2.1.0                 20.03, 20.02                   4.3
1.15.3                20.09, 20.08, 20.07            4.4
1.15.2                20.06, 20.04                   4.4
1.15.2                20.03, 20.02                   4.3

Note: the older packages below are installed as tensorflow-gpu; the more recent releases above are installed as tensorflow.

2.0.0                 20.01, 19.12                   4.3
2.0.0                 19.11                          4.2.x
1.15.0                20.01, 19.12                   4.3
1.15.0                19.11                          4.2.x
1.14.0                19.10, 19.09, 19.07            4.2.x
1.14.0                19.09                          3.3.1
1.13.1                19.05, 19.04, 19.03            4.2.x
Python Package Dependencies: The current TensorFlow package was built in a Python 3 environment with the following pip packages installed at the specified versions:
numpy==1.16.1, future==0.17.1, mock==3.0.5, h5py==2.9.0, gast==0.2.2, keras_preprocessing==1.0.5, keras_applications==1.0.8, scipy==1.4.1
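As a quick sanity check, the installed versions of these dependencies can be compared against the pinned list. The sketch below uses pkg_resources (which ships with setuptools); the dictionary simply repeats the versions listed above.

```python
# Sanity-check installed dependency versions against the pinned list above.
# pkg_resources ships with setuptools, so no extra install is needed.
import pkg_resources

pinned = {
    "numpy": "1.16.1",
    "future": "0.17.1",
    "mock": "3.0.5",
    "h5py": "2.9.0",
    "gast": "0.2.2",
    "scipy": "1.4.1",
}

for name, wanted in pinned.items():
    try:
        installed = pkg_resources.get_distribution(name).version
        status = "OK" if installed == wanted else "differs ({} installed)".format(installed)
    except pkg_resources.DistributionNotFound:
        status = "not installed"
    print("{}=={}: {}".format(name, wanted, status))
```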

Using TensorFlow With The Jetson Platform

Memory

If you observe out-of-memory problems in TensorFlow, you can use a custom configuration to limit the amount of memory TensorFlow tries to allocate: either allow the GPU memory allocation to grow as needed, or set a hard limit on the amount of memory the allocator will attempt to use. Depending on which version of the framework you are using, see either the TensorFlow 2 guide or the archived TensorFlow 1.x documentation for details.
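On TensorFlow 2.x, the two approaches look roughly like this (a sketch; the 2048 MB limit is an arbitrary example value, and the TensorFlow 1.x equivalents live on tf.ConfigProto's gpu_options):

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Option 1: let the GPU memory allocation grow on demand instead of
    # reserving all device memory up front.
    tf.config.experimental.set_memory_growth(gpus[0], True)

    # Option 2 (use instead of option 1, not together with it): set a hard
    # limit on the allocator (here 2048 MB, an arbitrary example value).
    # tf.config.experimental.set_virtual_device_configuration(
    #     gpus[0],
    #     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])
```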

Storage

If you need more storage, we recommend connecting an external SSD via SATA on TX2 or Xavier devices, or USB on Jetson Nano.

Operators

In TensorFlow 1.x, if you want to see which device each operator of your graph is placed on, use tf.ConfigProto(log_device_placement=True) to log all device placements.
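A minimal sketch of this, written against tf.compat.v1 so the same code also runs on a 2.x install (on a plain 1.x install, import tensorflow as tf works directly):

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # no-op in 1.x graph mode; required on 2.x

# log_device_placement=True makes TensorFlow log, for every operator,
# which device (CPU or GPU) it was assigned to.
config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    a = tf.constant([1.0, 2.0], name='a')
    b = tf.constant([3.0, 4.0], name='b')
    result = sess.run(a + b)  # placements are printed to stderr

print(result)
```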

Known Issues

  • A previous version of the TensorFlow 2.1.0 package corresponding to the 20.04 release was not compiled for the correct version of TensorFlow and has been replaced by a fixed version. The problematic package can be identified by running tensorflow.__version__ after importing TensorFlow: if no error is thrown, the installed version is correct; otherwise, upgrade to the fixed package or to a newer TensorFlow release altogether.
  • In TensorFlow 1.14, when using NumPy 1.17 there is a known issue where importing TensorFlow will print out a number of FutureWarning messages. This does not affect performance and can be avoided by installing NumPy version 1.16 instead of 1.17.
  • Certain tasks with large memory requirements may fail due to the memory limitations of embedded systems. Creating a swap partition will effectively enable your device to make use of more memory at the cost of speed and storage space, and may solve the problem.
  • When accelerating inference in TensorFlow with TensorRT (TF-TRT), you may experience problems with tf.estimator and the standard allocator (the BFC allocator), which can cause crashes and nondeterministic accuracy. If you run into such issues, switch to cuda_malloc as the allocator (export TF_GPU_ALLOCATOR="cuda_malloc").
  • Convolutions with a large kernel size using native TensorFlow are slow because the SwapDimension0And2InTensor3Simple kernel has no tiled implementation. This will be resolved in a future release; in the meantime, use TF-TRT to speed up such convolutions.
  • The TF-TRT conversion sometimes crashes without any error message. This usually happens when the OS runs out of swap memory and kills processes to reclaim it. If you encounter this issue, add more swap memory.
  • Some NVIDIA tools, for example, nvidia-smi, are not supported on the Jetson platform. For more information, see the JetPack documentation.
  • Jetson AGX Xavier has 16GB of memory which is shared between the CPU and GPU. As a result, some applications that work on a GPU with 16GB of memory may not work on Jetson AGX Xavier because not all of the 16GB is available to the GPU.
  • The following TensorFlow unit tests will fail on Jetson AGX Xavier. Some of these failures are due to bugs in the test scripts, which affect only the AArch64 platform.
    • //tensorflow/core/kernels:sparse_matmul_op_test_gpu
    • //tensorflow/core/kernels:requantize_op_test
    • //tensorflow/core/kernels:quantized_bias_add_op_test
    • //tensorflow/core:util_tensor_slice_set_test
    • //tensorflow/core:platform_stacktrace_handler_test
    • //tensorflow/core:platform_profile_utils_cpu_utils_test
    • //tensorflow/core:common_runtime_gpu_gpu_device_test
    • //tensorflow/core:gpu_device_unified_memory_test_gpu
    • //tensorflow/core:common_runtime_direct_session_with_debug_test
    • //tensorflow/core:device_tracer_test
    • //tensorflow/core:framework_unique_tensor_references_test
    • //tensorflow/python:basic_session_run_hooks_test
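The TF_GPU_ALLOCATOR workaround mentioned above can also be applied from within Python, as long as the variable is set before TensorFlow initializes its GPU allocator (a sketch):

```python
import os

# Must be set before TensorFlow creates its GPU allocator, so do this
# before importing TensorFlow (or at least before any GPU work).
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc"

import tensorflow as tf  # noqa: E402  (import after the env var on purpose)
```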