Code Analysis Tools#

Compiler Sanitizers#

Google sanitizers are a set of dynamic code analysis tools supported by compilers such as GCC and Clang.

Issues with dlopen and Address Sanitizer#

There is a known issue with sanitizers, which is documented here. When TensorRT is loaded with dlopen under a sanitizer, memory leaks are reported unless one of the following two workarounds is adopted:

  1. Do not call dlclose when running under the sanitizers.

  2. Pass the flag RTLD_NODELETE to dlopen when running under sanitizers.
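
The second workaround can be sketched as follows. This is a minimal illustration, not a TensorRT API: the names `SKETCH_UNDER_ASAN`, `sanitizerAwareFlags`, and `loadLibrary` are hypothetical, and the compile-time detection assumes GCC (`__SANITIZE_ADDRESS__`) or Clang (`__has_feature(address_sanitizer)`).

```cpp
#include <dlfcn.h>

#include <cstdio>

// Detect AddressSanitizer at compile time (assumption: GCC or Clang).
#if defined(__SANITIZE_ADDRESS__)
#define SKETCH_UNDER_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define SKETCH_UNDER_ASAN 1
#endif
#endif
#ifndef SKETCH_UNDER_ASAN
#define SKETCH_UNDER_ASAN 0
#endif

// Hypothetical helper: adds RTLD_NODELETE only when running under a sanitizer,
// so the library stays mapped even if dlclose is called later.
int sanitizerAwareFlags(int baseFlags, bool underSanitizer)
{
    return underSanitizer ? (baseFlags | RTLD_NODELETE) : baseFlags;
}

// Hypothetical helper: loads a shared library with sanitizer-aware flags.
// Returns nullptr and prints the dlerror() message on failure.
void* loadLibrary(const char* path)
{
    const int flags = sanitizerAwareFlags(RTLD_LAZY, SKETCH_UNDER_ASAN != 0);
    void* handle = dlopen(path, flags);
    if (handle == nullptr)
    {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    }
    return handle;
}
```

Keeping the flag selection in a separate function leaves the non-sanitizer build unchanged, where unloading the library on dlclose is usually still desirable.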

Issues with dlopen and Thread Sanitizer#

The thread sanitizer can report data races when dlopen is called from multiple threads. To suppress these reports, create a file called tsan.supp and add the following line to it:

race:dlopen

When running applications under the thread sanitizer, point the TSAN_OPTIONS environment variable at the suppressions file:

export TSAN_OPTIONS="suppressions=tsan.supp"

Issues with CUDA and Address Sanitizer#

The address sanitizer has a known issue with CUDA applications, which is documented here. To successfully run CUDA libraries such as TensorRT under the address sanitizer, add the option protect_shadow_gap=0 to the ASAN_OPTIONS environment variable.

Issues with Undefined Behavior Sanitizer#

UndefinedBehaviorSanitizer (UBSan) reports false positives with the -fvisibility=hidden option, as documented here. Add the -fno-sanitize=vptr option to avoid UBSan reporting such false positives.

Valgrind#

Valgrind is a framework for dynamic analysis tools that can automatically detect memory management and threading bugs in applications.

Some versions of Valgrind and glibc are affected by a bug that causes false memory leaks to be reported when dlopen is used. This can generate spurious errors when running a TensorRT application under Valgrind’s memcheck tool. To work around this, add the following to a Valgrind suppressions file, as documented here:

{
    Memory leak errors with dlopen
    Memcheck:Leak
    match-leak-kinds: definite
    ...
    fun:*dlopen*
    ...
}

Compute Sanitizer#

When running a TensorRT application under compute-sanitizer, cuGetProcAddress can fail with error code 500 due to missing functions. This error can be ignored, or suppressed with the --report-api-errors no option. The failure comes from CUDA backward-compatibility checks that probe whether a function is usable on the current CUDA toolkit/driver combination: the functions in question were introduced in later CUDA versions and are unavailable on the current platform.

Understanding Formats Printed in Logs#

In logs from TensorRT, formats are printed as a type followed by stride and vectorization information. For example:

Half(60,1:8,12,3)

Where:

  • Half indicates that the element type is DataType::kHALF, a 16-bit floating-point type.

  • :8 indicates the format packs eight elements per vector and that vectorization is along the second axis.

The rest of the numbers are strides in units of vectors. For this tensor, the mapping of a coordinate (n,c,h,w) to an address is:

((half*)base_address) + (60*n + 1*floor(c/8) + 12*h + 3*w) * 8 + (c mod 8)

The 1: (a stride of 1 along the vectorized axis) is common to NHWC formats. Here is another example, this time of an NCHW format:

Int8(105,15:4,3,1)

Int8 indicates that the element type is DataType::kINT8, and the :4 indicates a vector size of 4. For this tensor, the mapping of a coordinate (n,c,h,w) to an address is:

(int8_t*)base_address + (105*n + 15*floor(c/4) + 3*h + w) * 4 + (c mod 4)

Scalar formats have a vector size of 1. For brevity, printing omits the :1.

In general, the coordinate-to-address mapping has the following form:

(type*)base_address + (vec_coordinate · strides) * vec_size + vec_mod

Where:

  • The dot denotes an inner product.

  • strides are the printed strides, that is, strides in units of vectors.

  • vec_size is the number of elements per vector.

  • vec_coordinate is the original coordinate with the coordinate along the vectorized axis divided by vec_size (rounded down).

  • vec_mod is the original coordinate along the vectorized axis modulo vec_size.
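
The general mapping can be sketched in code as follows. This is only an illustration for 4-D tensors with non-negative coordinates: the `VectorizedFormat` struct and the `elementOffset` function are hypothetical names, not part of the TensorRT API, and the result is the offset in elements to add to `(type*)base_address`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a printed format such as Half(60,1:8,12,3).
struct VectorizedFormat
{
    std::array<std::int64_t, 4> strides; // printed strides, in units of vectors
    std::size_t vecAxis;                 // index of the vectorized axis
    std::int64_t vecSize;                // elements per vector (1 for scalar formats)
};

// Returns (vec_coordinate . strides) * vec_size + vec_mod, in elements.
std::int64_t elementOffset(const VectorizedFormat& fmt, const std::array<std::int64_t, 4>& coord)
{
    std::int64_t dot = 0;
    for (std::size_t axis = 0; axis < coord.size(); ++axis)
    {
        // Along the vectorized axis, use floor(coordinate / vec_size).
        const std::int64_t c = (axis == fmt.vecAxis) ? coord[axis] / fmt.vecSize : coord[axis];
        dot += c * fmt.strides[axis];
    }
    // Position of the element inside its vector.
    const std::int64_t vecMod = coord[fmt.vecAxis] % fmt.vecSize;
    return dot * fmt.vecSize + vecMod;
}
```

For Half(60,1:8,12,3), described as `VectorizedFormat{{60, 1, 12, 3}, 1, 8}`, the coordinate (1,9,2,3) maps to (60 + 1 + 24 + 9) * 8 + 1 = 753. For Int8(105,15:4,3,1), the coordinate (1,6,2,1) maps to (105 + 15 + 6 + 1) * 4 + 2 = 510. A scalar format uses a `vecSize` of 1, in which case the choice of `vecAxis` does not affect the result.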