Deploying on Kubernetes with Helm Chart
You can deploy PaddleOCR NIM on Kubernetes with a Helm chart, which simplifies deployment and supports optional cluster, GPU, and storage configurations.
The Helm chart downloads the model and starts the service to begin running inferences.
NIMs are designed to be run on a system with NVIDIA GPUs, with the type and number of GPUs depending on the model. To use the Helm chart, you must have a Kubernetes cluster with appropriate GPU nodes and GPU Operator installed.
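You can confirm that your nodes advertise GPUs with a quick check; this assumes the GPU Operator's device plugin exposes the standard nvidia.com/gpu resource.
# List each node and its number of allocatable NVIDIA GPUs
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'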
Benefits of Helm Chart Deployment
Using a Helm chart to deploy on Kubernetes has the following benefits compared to manual deployment:
Enables using Kubernetes Nodes and horizontally scaling the service
Encapsulates the complexity of running Docker commands directly
Enables monitoring metrics from the NIM
Setting Up the Environment
If you haven’t set up your NGC API key, or you don’t know exactly which NIM you want to download and deploy, refer to the User Guide.
The Helm chart requires two secrets containing your NGC API key: one configured for downloading private images, and one, named ngc-api in the following sections, that the service uses at runtime. The secrets hold the same key but use different formats (dockerconfigjson vs. opaque). Refer to the following Creating Secrets section for details.
These instructions require that you have exported your NGC_API_KEY to the environment. Use the following command to export your key.
export NGC_API_KEY="<YOUR NGC API KEY>"
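As a quick sanity check, you can confirm the variable is set without printing the key itself.
# Prints a confirmation message rather than the key value
[ -n "$NGC_API_KEY" ] && echo "NGC_API_KEY is set" || echo "NGC_API_KEY is empty"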
Fetching the Helm Chart
You can download the Helm chart from NGC by executing the following command:
helm fetch https://helm.ngc.nvidia.com/ohlfw0olaadg/ea-participants/charts/paddleocr-nim-0.2.0.tgz --username='$oauthtoken' --password=$NGC_API_KEY
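You can verify the download and inspect the chart’s configurable defaults before deploying.
# List the downloaded chart archive and print its default values
ls -l paddleocr-nim-0.2.0.tgz
helm show values paddleocr-nim-0.2.0.tgz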
Namespace
You can choose to deploy to whichever namespace is appropriate, but this document uses the namespace paddleocr-nim. Use the following command to create that namespace.
kubectl create namespace paddleocr-nim
Creating Secrets
Use the following script to create the required secrets for the Helm chart.
DOCKER_CONFIG='{"auths":{"nvcr.io":{"username":"$oauthtoken", "password":"'${NGC_API_KEY}'" }}}'
# [Linux] Encode nvcr registry config as base64
NGC_REGISTRY_PASSWORD=$(echo -n $DOCKER_CONFIG | base64 -w0)
# [MacOS] Encode nvcr registry config as base64
NGC_REGISTRY_PASSWORD=$(echo -n $DOCKER_CONFIG | base64 -b0)
# Create image pull secret
cat <<EOF > imagepull.yaml
apiVersion: v1
kind: Secret
metadata:
  name: nvcrimagepullsecret
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: ${NGC_REGISTRY_PASSWORD}
EOF
kubectl apply -n paddleocr-nim -f imagepull.yaml
kubectl create -n paddleocr-nim secret generic ngc-api \
  --from-literal=NGC_API_KEY=${NGC_API_KEY} \
  --from-literal=NGC_CLI_API_KEY=${NGC_API_KEY}
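To verify that both secrets were created, list them; the first should have type kubernetes.io/dockerconfigjson and the second Opaque.
# Both secrets should be listed in the paddleocr-nim namespace
kubectl get secrets -n paddleocr-nim nvcrimagepullsecret ngc-api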
Configuration Considerations
By default, the following deployment commands create a single deployment with one replica using the paddleocr model. Use the following options to modify how the model behaves; a sketch of a values override using these options follows the list. Refer to Parameters for information about each parameter.
image.repository – The container (PaddleOCR NIM) to deploy
image.tag – The version of that container (PaddleOCR NIM)
Storage options, based on the environment and cluster in use
resources – Use this option when a model requires more than the default of one GPU. Refer to the support matrix and resource requirements.
env – An array of environment variables presented to the container, if advanced configuration is needed
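The following sketch shows a values override file built from the options above. The repository path and environment variable are placeholders, so substitute the values for your NIM and cluster.
# custom-values.yaml -- illustrative only; key names follow the options above
cat <<EOF > custom-values.yaml
image:
  repository: nvcr.io/ohlfw0olaadg/ea-participants/paddleocr   # placeholder repository path
  tag: 0.2.0
resources:
  limits:
    nvidia.com/gpu: 1
env:
  - name: NIM_LOG_LEVEL   # placeholder variable, shown for illustration
    value: INFO
EOF
# Pass the file to the deployment commands below with: -f custom-values.yaml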
Storage
This NIM uses persistent storage for storing downloaded models, and the sample commands in this guide require the local-nfs storage class. Use the following commands to install the local-nfs storage class and provisioner in your Kubernetes cluster.
helm repo add nfs-ganesha-server-and-external-provisioner https://kubernetes-sigs.github.io/nfs-ganesha-server-and-external-provisioner/
helm install nfs-server nfs-ganesha-server-and-external-provisioner/nfs-server-provisioner --set storageClass.name=local-nfs
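After the install completes, confirm that the storage class exists.
# The local-nfs storage class should be listed with the nfs provisioner
kubectl get storageclass local-nfs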
Advanced Storage Configuration
Storage is a particular concern when setting up NIMs. Models can be quite large, and you can fill a disk downloading models to emptyDir volumes. We recommend that you mount persistent storage of some kind on your pod.
This chart supports two general categories:
Persistent Volume Claims (enabled with persistence.enabled)
hostPath (enabled with persistence.hostPath)
By default, the chart uses the standard storage class and creates a PersistentVolume and a PersistentVolumeClaim.
If you do not have a Storage Class Provisioner that creates PersistentVolumes automatically, set the value persistence.createPV=true. This is also necessary when you use persistence.hostPath on minikube.
If you have an existing PersistentVolumeClaim where you’d like the models to be stored, pass that value in at persistence.existingClaimName.
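For example, a deployment that reuses a pre-created claim might look like the following sketch; my-models-pvc is a placeholder name, and the parameter is described in Parameters.
# Reuse an existing PVC instead of letting the chart create one
helm upgrade --install \
  --namespace paddleocr-nim \
  paddleocr-nim \
  --set persistence.existingClaimName=my-models-pvc \
  paddleocr-nim-0.2.0.tgz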
Refer to the Helm options in Parameters.
Deploying
Basic deployment
helm upgrade --install \
  --namespace paddleocr-nim \
  paddleocr-nim \
  --set persistence.class="local-nfs" \
  paddleocr-nim-0.2.0.tgz
You can also change the version of the paddleocr model in use by adding the following line after the --namespace line:
--set image.tag=0.2.0 \
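Putting it together, a deployment that pins the image tag looks like this:
helm upgrade --install \
  --namespace paddleocr-nim \
  --set image.tag=0.2.0 \
  paddleocr-nim \
  --set persistence.class="local-nfs" \
  paddleocr-nim-0.2.0.tgz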
After deploying, use the following command to check whether the pod is running, as the initial image pull and model download can take upwards of 15 minutes.
kubectl get pods -n paddleocr-nim
This command should eventually return something similar to the following when the pod is running.
NAME              READY   STATUS    RESTARTS   AGE
paddleocr-nim-0   1/1     Running   0          8m44s
You can use the following command to check events for failures:
kubectl get events -n paddleocr-nim --sort-by='.lastTimestamp'
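If the pod stays in a Pending or failed state, describing it usually reveals the cause, such as an unschedulable GPU request or an image pull failure.
kubectl describe pod -n paddleocr-nim paddleocr-nim-0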
Recommended Configuration for Minikube
Minikube creates a hostPath-based PV and PVC by default with this Helm chart. You should add the following setting to your Helm commands.
--set persistence.class=standard
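Putting this together, a minikube deployment might look like the following; minikube’s built-in provisioner backs the standard storage class.
helm upgrade --install \
  --namespace paddleocr-nim \
  paddleocr-nim \
  --set persistence.class=standard \
  paddleocr-nim-0.2.0.tgz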
Running Inference
In the previous example, the API endpoint is exposed on port 8000 through a Kubernetes service of the default type (ClusterIP), with no ingress, since authentication is not handled by the NIM itself. The following commands require that the nvidia/paddleocr model has been deployed.
If required, change the “model” value in the request JSON body to use a different model.
Use the following command to port-forward the service to your local machine to test inference.
kubectl port-forward -n paddleocr-nim service/paddleocr-nim 8000:8000
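Before sending requests, you can optionally confirm the service is ready; this assumes the NIM exposes the standard health endpoint, so adjust the path if your version differs.
# Returns a success response once the model is loaded
curl http://localhost:8000/v1/health/ready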
Create a directory data/structured-imgs and copy in some .png images so that the directory looks like the following:
$ mkdir -p data/structured-imgs
$ ls -l data/structured-imgs
sample1.png
sample2.png
sample3.png
sample4.png
Send an inference request by running the following commands and Python 3.11 script.
# Create a virtual env (venv) for this test to isolate the dependencies
python3 -m venv paddleocr_venv
source paddleocr_venv/bin/activate
# Install pillow and requests libraries into your python 3 environment
pip3 install requests pillow
# paddleocr_inference_test.py
import base64
import json
import time
from io import BytesIO
from pathlib import Path

import requests
from PIL import Image

# Encode each image in data/structured-imgs as a base64 data URL
images = []
image_paths = list(Path("data/structured-imgs").glob("*"))
for image_path in image_paths:
    image = Image.open(image_path)
    buffered = BytesIO()
    image.save(buffered, format="PNG")
    base64_image = base64.b64encode(buffered.getvalue()).decode("utf-8")
    image_url = f"data:image/png;base64,{base64_image}"
    image = {"type": "image_url", "image_url": {"url": image_url}}
    images.append(image)

# Build the request payload and send it to the port-forwarded NIM endpoint
message = {"content": images}
payload = {"messages": [message]}

start = time.time()
print(json.dumps(requests.post("http://localhost:8000/v1/infer", json=payload).json()))
print(f"{len(image_paths)} images completed in {time.time() - start} seconds")
# Run inference on the .png files in the data/structured-imgs directory
python3 paddleocr_inference_test.py
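Alternatively, you can send a single-image request with curl. This sketch builds the same JSON body as the Python script; very large images can exceed shell argument limits, in which case use the script instead.
# Encode one image as base64 (on macOS use: base64 -b0 -i <file>)
IMG_B64=$(base64 -w0 data/structured-imgs/sample1.png)
# Post it to the inference endpoint with the same payload shape as the script
curl -s http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"content":[{"type":"image_url","image_url":{"url":"data:image/png;base64,'"$IMG_B64"'"}}]}]}'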
Logging
Use the following command to view the container log messages.
kubectl logs --selector=app.kubernetes.io/name=paddleocr-nim -n paddleocr-nim
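To stream logs continuously while you test inference, add the -f flag.
kubectl logs -f --selector=app.kubernetes.io/name=paddleocr-nim -n paddleocr-nim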