Known Issues and Limitations#

  • nav.Module moves original torch.nn.Module to the CPU, in case of weight sharing that might result in unexpected behavior

  • For data dependent dynamic control flow (multiple computation graphs) nav.Module might copy the weights for each separate graph

  • Source model running in Python can cause OOM issue when GPU memory is larger than CPU RAM memory

  • Verify command could potentially experience CUDA OOM errors while trying to run inference on two models at the same time.

  • Dependencies between modules in optimized pipelines may lead to unexpected behavior and failure in Inplace Optimize

  • TensorRT might require manual installation of correct version of nvidia-cudnn-cu12 package

  • ONNXRuntime 1.17.x does not support ONNX IR 10 (onnx ver 1.16.0)

  • ONNXRuntime 1.17.x requires cuDNN 8.x

  • DistillBERT ONNX dynamo export does not support dynamic shapes