PyTriton Torch linear model deployment

This example shows how to optimize a simple linear model and deploy it to PyTriton.

Requirements

The example requires the torch package. It can be installed in your current environment using pip:

pip install torch

Or you can use the NVIDIA Torch container:

docker run -it --gpus 1 --shm-size 8gb -v ${PWD}:${PWD} -w ${PWD} nvcr.io/nvidia/pytorch:23.01-py3 bash

If you choose to use the container, we recommend installing the NVIDIA Container Toolkit.

Install the Model Navigator

Install the Triton Model Navigator following the installation guide for Torch:

pip install --extra-index-url https://pypi.ngc.nvidia.com .[torch]

Note: run this command from the root directory of the repository.

Run model optimization

In this step, the optimization process is performed for the model.

python examples/triton/optimize.py

Once the process is done, the linear.nav package is created in the current working directory.

Start PyTriton server

This step starts the PyTriton server with the package generated in the previous step.

./serve.py

Example PyTriton client

Use the client to test the model deployment on PyTriton:

./client.py