Known Issues and Limitations

nav.Module moves the original torch.nn.Module to the CPU; when weights are shared, this might result in unexpected behavior
For data-dependent dynamic control flow (multiple computation graphs), nav.Module might copy the weights for each separate graph
Running the source model in Python can cause an OOM error when GPU memory is larger than CPU RAM
The verify command might run into CUDA OOM errors while running inference on two models at the same time
Dependencies between modules in optimized pipelines may lead to unexpected behavior and failures in Inplace Optimize
TensorRT might require manual installation of the correct version of the nvidia-cudnn-cu12 package
ONNXRuntime 1.17.x does not support ONNX IR version 10 (onnx version 1.16.0)
ONNXRuntime 1.17.x requires cuDNN 8.x
DistilBERT ONNX dynamo export does not support dynamic shapes
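
The two ONNXRuntime 1.17.x items above can be checked before an optimization run. The following is a minimal sketch, not part of the library: the helper names (`parse_major_minor`, `find_known_conflicts`, `installed_version`) are hypothetical, and it assumes that any onnx release from 1.16.0 onward may write models with IR version 10 or newer.

```python
from importlib import metadata


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.17.3'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def find_known_conflicts(ort_version, onnx_version):
    """Return warnings for the ONNXRuntime 1.17.x issues listed above.

    Either argument may be None when the package is not installed.
    Hypothetical helper: the rules below only restate the list items.
    """
    warnings = []
    if ort_version is None or parse_major_minor(ort_version) != (1, 17):
        return warnings
    warnings.append("ONNXRuntime 1.17.x requires cuDNN 8.x")
    # Assumption: onnx 1.16.0 and later may emit IR version 10 or newer.
    if onnx_version is not None and parse_major_minor(onnx_version) >= (1, 16):
        warnings.append(
            "ONNXRuntime 1.17.x does not support ONNX IR 10 (onnx 1.16.0)"
        )
    return warnings


def installed_version(*package_names):
    """Return the version of the first installed package, or None."""
    for name in package_names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    ort = installed_version("onnxruntime-gpu", "onnxruntime")
    onnx = installed_version("onnx")
    for message in find_known_conflicts(ort, onnx):
        print("WARNING:", message)
```

Run as a script, it prints one warning per matching issue and prints nothing when ONNXRuntime is missing or is not a 1.17.x release. The check reads only installed package metadata, so it does not import onnx or onnxruntime and does not verify which cuDNN version is actually present.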