Code Analysis Tools#
Compiler Sanitizers#
Google sanitizers are a set of code analysis tools.
Issues with dlopen and Address Sanitizer#
There is a known issue with sanitizers, which is documented here. When using dlopen on TensorRT under a sanitizer, there will be reports of memory leaks unless one of two solutions is adopted:
Do not call dlclose when running under the sanitizers.
Pass the flag RTLD_NODELETE to dlopen when running under sanitizers.
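The second option can be sketched as follows. This is a minimal illustration, not TensorRT code: the names EXAMPLE_UNDER_ASAN, chooseDlopenFlags, and openLibrary are hypothetical, and the compile-time check only covers AddressSanitizer (GCC's __SANITIZE_ADDRESS__ macro and Clang's __has_feature(address_sanitizer)); other sanitizers would need their own checks.

```cpp
#include <dlfcn.h>

// Detect AddressSanitizer at compile time. GCC defines __SANITIZE_ADDRESS__;
// Clang exposes __has_feature(address_sanitizer).
#if defined(__SANITIZE_ADDRESS__)
#define EXAMPLE_UNDER_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define EXAMPLE_UNDER_ASAN 1
#endif
#endif
#ifndef EXAMPLE_UNDER_ASAN
#define EXAMPLE_UNDER_ASAN 0
#endif

// Hypothetical helper: pick dlopen flags, adding RTLD_NODELETE when the
// caller says the process runs under a sanitizer.
inline int chooseDlopenFlags(bool underSanitizer)
{
    int flags = RTLD_NOW | RTLD_LOCAL;
    if (underSanitizer)
    {
        // Keep the library mapped even after dlclose, so the sanitizer can
        // still resolve its allocations at exit.
        flags |= RTLD_NODELETE;
    }
    return flags;
}

// Hypothetical wrapper: open a shared library (the path is supplied by the
// caller) with flags chosen for the current build.
inline void* openLibrary(char const* path)
{
    return dlopen(path, chooseDlopenFlags(EXAMPLE_UNDER_ASAN != 0));
}
```

Keeping the flag choice in a separate function means a regular build behaves exactly as before, and only sanitizer builds change how the library is loaded.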
Issues with dlopen and Thread Sanitizer#
The thread sanitizer can list errors when using dlopen from multiple threads. To suppress this warning, create a file called tsan.supp and add the following to the file:
race::dlopen
When running applications under thread sanitizer, set the environment variable using:
export TSAN_OPTIONS="suppressions=tsan.supp"
Issues with CUDA and Address Sanitizer#
The address sanitizer has a known issue with CUDA applications, which is documented here. To successfully run CUDA libraries such as TensorRT under the address sanitizer, add the option protect_shadow_gap=0 to the ASAN_OPTIONS environment variable.
Issues with Undefined Behavior Sanitizer#
UndefinedBehaviorSanitizer (UBSan) reports false positives with the -fvisibility=hidden option, as documented here. Add the -fno-sanitize=vptr option to avoid UBSan reporting such false positives.
Valgrind#
Valgrind is a framework for dynamic analysis tools that can automatically detect memory management and threading bugs in applications.
Some versions of Valgrind and glibc are affected by a bug that causes false memory leaks to be reported when dlopen is used. This can generate spurious errors when running a TensorRT application under Valgrind’s memcheck tool. To work around this, add the following to a Valgrind suppressions file, as documented here:
{
   Memory leak errors with dlopen
   Memcheck:Leak
   match-leak-kinds: definite
   ...
   fun:*dlopen*
   ...
}
Compute Sanitizer#
When running a TensorRT application under compute-sanitizer, cuGetProcAddress can fail with error code 500 due to missing functions. This error can be ignored, or suppressed with the --report-api-errors no option. It occurs because CUDA backward compatibility checks whether a function is usable on the current CUDA toolkit/driver combination. The functions in question were introduced in a later CUDA version and are unavailable on the current platform.
Understanding Formats Printed in Logs#
In logs from TensorRT, formats are printed as a type followed by stride and vectorization information. For example:
Half(60,1:8,12,3)
Where:
Half indicates that the element type is DataType::kHALF, a 16-bit floating point.
:8 indicates that the format packs eight elements per vector and that vectorization is along the second axis.
The rest of the numbers are strides in units of vectors. For this tensor, the mapping of a coordinate (n,c,h,w) to an address is:
((half*)base_address) + (60*n + 1*floor(c/8) + 12*h + 3*w) * 8 + (c mod 8)
The 1: is common to NHWC formats. For comparison, here is an example of an NCHW format:
Int8(105,15:4,3,1)
Int8 indicates that the element type is DataType::kINT8, and :4 indicates a vector size of 4. For this tensor, the mapping of a coordinate (n,c,h,w) to an address is:
(int8_t*)base_address + (105*n + 15*floor(c/4) + 3*h + w) * 4 + (c mod 4)
Scalar formats have a vector size of 1. For brevity, printing omits the :1.
In general, the coordinate-to-address mapping has the following form:
(type*)base_address + (vec_coordinate · strides) * vec_size + vec_mod
Where:
the dot denotes an inner product.
strides are the printed strides, that is, strides in units of vectors.
vec_size is the number of elements per vector.
vec_coordinate is the original coordinate with the coordinate along the vectorized axis divided by vec_size.
vec_mod is the original coordinate along the vectorized axis modulo vec_size.
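The general mapping can be written as a small helper for four-dimensional coordinates. This is an illustrative sketch only: PrintedFormat and elementOffset are hypothetical names, not part of the TensorRT API, and the function returns an offset in elements from base_address rather than a pointer. Coordinates are assumed to be non-negative, so integer division acts as floor.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a printed format such as Half(60,1:8,12,3).
struct PrintedFormat
{
    std::array<std::int64_t, 4> strides; // printed strides, in units of vectors
    std::size_t vecAxis;                 // index of the vectorized axis
    std::int64_t vecSize;                // elements per vector (1 for scalar formats)
};

// Offset, in elements, of coordinate `coord` relative to base_address:
// (vec_coordinate . strides) * vec_size + vec_mod
inline std::int64_t elementOffset(PrintedFormat const& fmt, std::array<std::int64_t, 4> coord)
{
    // vec_mod: position of the element inside its vector.
    std::int64_t const vecMod = coord[fmt.vecAxis] % fmt.vecSize;
    // vec_coordinate: the vectorized axis is counted in whole vectors.
    coord[fmt.vecAxis] /= fmt.vecSize;

    std::int64_t dot = 0;
    for (std::size_t i = 0; i < coord.size(); ++i)
    {
        dot += coord[i] * fmt.strides[i];
    }
    return dot * fmt.vecSize + vecMod;
}
```

For Half(60,1:8,12,3) and the coordinate (n,c,h,w) = (1,9,2,3), the inner product is 60*1 + 1*1 + 12*2 + 3*3 = 94, so the offset is 94*8 + 1 = 753 elements. For Int8(105,15:4,3,1) and (1,5,2,3), it is (105 + 15 + 6 + 3)*4 + 1 = 517 elements.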