Optimize model and use zero-copy runners
This example shows how to use zero-copy runners for inference, which avoid unnecessary copying of data from CPU to GPU and back.
We recommend running this example in the NVIDIA NGC PyTorch container. To run it, execute the optimize.py script:
./optimize.py
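
To make the term "zero-copy" used above more concrete, here is a rough CPU-only analogy in plain Python. It does not use the Model Navigator API or any runner from optimize.py; the function names `pass_with_copy` and `pass_zero_copy` are hypothetical and exist only for this illustration. The sketch contrasts handing data over by duplicating it (comparable to a device-to-host round trip) with handing over a view onto the same memory.

```python
# CPU-only analogy for a zero-copy hand-off.
# Assumption: this is an illustration of the general idea only; it is NOT
# the Model Navigator API and does not touch the GPU.


def pass_with_copy(buffer: bytearray) -> bytes:
    """Hand the data over by duplicating it (like a CPU <-> GPU round trip)."""
    return bytes(buffer)


def pass_zero_copy(buffer: bytearray) -> memoryview:
    """Hand over a view onto the same memory; no bytes are duplicated."""
    return memoryview(buffer)


if __name__ == "__main__":
    data = bytearray(8)  # eight zero bytes standing in for a tensor
    copied = pass_with_copy(data)
    view = pass_zero_copy(data)

    data[0] = 255  # mutate the original buffer after the hand-off

    print(copied[0])  # the copy is unaffected by the later write
    print(view[0])    # the view observes the write because it shares memory
```

The copy costs time and memory proportional to the buffer size, while the view costs neither. Zero-copy runners apply the same principle to model inputs and outputs, presumably by keeping them on the GPU between calls.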