Model Export to TensorRT-LLM

To enable the export stage for a ViT model, configure the configuration files as follows:

  1. In the defaults section of conf/config.yaml, update the export field to point to the desired ViT configuration file. For example, if you want to use the vit/export_vit configuration, change the export field to vit/export_vit.

    defaults:
      - export: vit/export_vit
      ...

  2. In the stages field of conf/config.yaml, make sure the export stage is included. For example,

    stages:
      - export
      ...

  3. Set infer.max_batch_size in the conf/export/vit/export_vit.yaml file to the maximum batch size to use for the ONNX and NVIDIA TensorRT models.

  4. Set the resolution of the model with max_dim in the infer field; this value is used when generating the ONNX and NVIDIA TensorRT formats. A combined sketch of these settings follows this list.
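
Putting steps 3 and 4 together, the infer section of conf/export/vit/export_vit.yaml might look like the sketch below. The concrete values (a batch size of 8 and a 224-pixel resolution) are illustrative assumptions, not defaults taken from the file:

    infer:
      max_batch_size: 8  # maximum batch size for the exported ONNX / TensorRT engines
      max_dim: 224       # input resolution used to generate the ONNX and TensorRT formats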

Remarks:

  1. To load a pretrained checkpoint for inference, set the restore_from_path field in the model section of conf/export/vit/export_vit.yaml to the path of a pretrained checkpoint in .nemo format. By default, this field points to the .nemo checkpoint saved in the ImageNet 1K fine-tuning checkpoints folder.
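
For example, pointing the export stage at your own fine-tuned checkpoint might look like the following sketch; the path shown is a hypothetical placeholder, not a path from the default configuration:

    model:
      restore_from_path: /path/to/your_vit_checkpoint.nemo  # hypothetical path to the pretrained .nemo checkpoint to export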
