Step #2: Start the Triton Inference Server

As noted previously, the Triton Inference Server Kubernetes deployment object has already been created on the cluster as part of the Helm chart installation. Now that we have saved the model in Step #1 of the lab (the Jupyter notebook), let's start the Triton Inference Server pod to deploy the model. Also, since we only have a single GPU to work with in this environment, we first need to scale down the training Jupyter notebook pod.

Using the System Console link on the left-hand navigation pane, open the System Console. You will use it to start the Triton Inference Server pod.

[Figure: image-classification-nav.png — System Console link in the left-hand navigation pane]

Using the commands below, scale down the Jupyter pod.

oc scale deployments image-classification --replicas=0 -n classification-namespace
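
Before continuing, you can optionally confirm that the notebook pod has terminated by listing the pods in the namespace; this is just a sanity check and not required by the lab.

oc get pods -n classification-namespace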

Wait a few seconds, then scale up the Triton Inference Server pod using the following command.

oc scale deployments image-classification-tritonserver-image-classification --replicas=1 -n classification-namespace

Keep checking the status of the Triton Inference Server pod using the command below. Only proceed to the next step once the pod is in a Running state. It might take a few minutes to pull the Triton Inference Server container from NGC.

oc get pods -A | grep triton
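
Alternatively, instead of re-running the command manually, you can watch the pods in the namespace until the Triton Inference Server pod reaches the Running state (press Ctrl+C to stop watching).

oc get pods -n classification-namespace -w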

Once the pod is in a Running state, you can check the logs by running the command below.

oc logs <name_of_the_triton_pod_from_previous_command> -n classification-namespace

Within the console output, notice that the Triton model repository contains the mobilenet_classifier model saved in Step #1 (the Jupyter notebook) and that its status is Ready.
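
If the log output is long, one way to locate the relevant line is to filter it for the model name, for example:

oc logs <name_of_the_triton_pod_from_previous_command> -n classification-namespace | grep mobilenet_classifier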
