Step #2: Start the Triton Inference Server
As noted previously, the Triton Inference Server Kubernetes deployment object was already created on the cluster as part of the Helm chart installation. Now that we have saved the model in Step #1 of the lab (the Jupyter notebook), let's start the Triton Inference Server pod to deploy the model. Also, since this environment has only a single GPU, we first need to scale down the training Jupyter notebook pod.
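If you would like to confirm the exact deployment names before scaling, you can list the deployments in the namespace. This is an optional check; the namespace below is the one used in this lab and may differ in other environments.

oc get deployments -n classification-namespace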
Using the System Console link on the left-hand navigation pane, open the System Console. You will use it to start the Triton Inference Server pod.
![image-classification-nav.png](https://docscontent.nvidia.com/dims4/default/77b0427/2147483647/strip/true/crop/299x135+0+0/resize/299x135!/quality/90/?url=https%3A%2F%2Fk3-prod-nvidia-docs.s3.us-west-2.amazonaws.com%2Fbrightspot%2Fsphinx%2F00000186-1555-dd43-ad9e-bf5ff0cd0000%2Flaunchpad%2Fai%2Fclassification-openshift%2Flatest%2F_images%2Fimage-classification-nav.png)
Using the command below, scale down the Jupyter pod.
oc scale deployments image-classification --replicas=0 -n classification-namespace
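Optionally, you can confirm that the Jupyter pod is terminating (or gone) before starting the inference server by listing the pods in the lab's namespace:

oc get pods -n classification-namespace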
Wait for a few seconds and scale up the Triton Inference Server pod using the following command.
oc scale deployments image-classification-tritonserver-image-classification --replicas=1 -n classification-namespace
Keep checking the status of the Triton Inference Server pod using the command below. Only proceed to the next step once the pod is in a Running state. It might take a few minutes to pull the Triton Inference Server container from NGC.
oc get pods -A | grep triton
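As a convenience, you can also watch the pod status update in place instead of re-running the command (press Ctrl+C to stop watching):

oc get pods -A -w | grep triton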
Once the pod is in a Running state, you can check the logs by running the command below.
oc logs name_of_the_triton_pod_from_previous_command
Within the console output, notice that the Triton model repository contains the mobilenet_classifier model saved from the Jupyter notebook and that its status is Ready.
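If you want an additional readiness check beyond the logs, Triton exposes an HTTP health endpoint on port 8000. The following is a minimal sketch, assuming the pod name from the previous command and that port 8000 is free on your console session; it port-forwards to the pod and queries Triton's readiness endpoint, which returns HTTP 200 once the server and its models are ready.

oc port-forward name_of_the_triton_pod_from_previous_command 8000:8000 -n classification-namespace &
curl -v localhost:8000/v2/health/ready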