Introduction

Important

NVIDIA TensorRT-Cloud is provided as a developer preview in Early Access (EA). Access is restricted and granted upon request (refer to Getting TensorRT-Cloud Access).

TensorRT-Cloud (TRTC) builds TensorRT engines across a range of NVIDIA GPUs, operating systems, and library dependencies. Its goal is to let developers build optimized TensorRT engines from a command line interface (CLI) for the wide variety of installed NVIDIA GPUs that their applications need to support. It does this through on-demand engine building. Combined with the weight refit capabilities of NVIDIA TensorRT 10.0, on-demand building lets you integrate TensorRT-accelerated inference into your applications without bloating your application binaries. TensorRT-Cloud also provides access to pre-built, optimized engines for popular community models.

The TensorRT-Cloud CLI is the interface through which you interact with TensorRT-Cloud. It provides a single point of access to all TensorRT-Cloud features: building engines on demand, downloading pre-built engines, and refitting weightless engines.