Step #1: Set Up The Model Repository

Before we install TMS, we need to configure the cluster resources that give TMS access to our model repository. This lab comes with several AI models pre-loaded in the model_repository directory, organized under the following folders:

sid

This folder contains NLP models for sensitive information detection (SID), used with the NVIDIA Morpheus framework. These models identify potentially sensitive information in network packet data as quickly as possible, so that exposure can be limited and corrective action taken. Sensitive information is a broad term, but it can be generalized to any data that should be guarded from unauthorized access: credit card numbers, passwords, authorization keys, and emails are all examples.

text-embeddings

This folder contains a Triton ensemble model made up of three components, which together form a deployable Hugging Face transformer model for generating text embeddings.

image-rec

This folder contains a sample image recognition model from the Triton Inference Server example model repository.
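If you want to see exactly what is on disk before wiring it into the cluster, you can list the repository from the SSH Console. A quick sketch of the expected layout (the contents inside each model folder may differ):

$ ls /home/nvidia/model_repository
image-rec  sid  text-embeddings

Each folder typically follows the standard Triton model repository layout, with a config.pbtxt and one or more numbered version directories per model.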

TMS supports several types of model repositories, including NFS servers, AWS S3 storage, and more; see the TMS documentation for the full list of model repository options. For this lab, we will store our models in a persistent volume backed by the disk of the local lab machine.

Note

This model repository setup is not recommended for multi-node production environments: because the persistent volume is backed by a hostPath, the models would have to be stored separately on every node's disk for the volume to be accessible wherever a pod is scheduled.
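For comparison, a multi-node-friendly setup would back the persistent volume with shared storage rather than a hostPath. A minimal sketch using NFS, where the server address and export path are placeholders and not part of this lab:

cat << 'EOF' > pv-nfs.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
spec:
  storageClassName: nfs
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.10                 # placeholder NFS server address
    path: /exports/model_repository   # placeholder export path
EOF

Because every node mounts the same NFS export, the models no longer need to be copied onto each node's disk.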

Open the SSH Console from the left pane, and create the manifests for the persistent volume and persistent volume claim for our model repository:


cat << 'EOF' >> pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-local
  labels:
    type: local
spec:
  storageClassName: local
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/home/nvidia/model_repository"
EOF


cat << 'EOF' >> pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hostpath-pvc
spec:
  storageClassName: local
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-local
  selector:
    matchLabels:
      type: local
EOF

Deploy the persistent volume and persistent volume claim to the cluster:


kubectl create -f pv.yaml && kubectl create -f pvc.yaml
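If you are scripting this step, kubectl can block until the claim is actually bound before you continue. This relies on the jsonpath form of --for, available in kubectl 1.23 and later:

kubectl wait --for=jsonpath='{.status.phase}'=Bound pvc/hostpath-pvc --timeout=60s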

Confirm that our persistent volume and persistent volume claim have both been created:


$ kubectl get pv
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
pv-local   5Gi        RWX            Retain           Bound    default/hostpath-pvc   local                   19s
tms-docs   3Gi        RWO            Recycle          Bound    tms-lp-lab/tms-docs    manual                  75m


$ kubectl get pvc
NAME           STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
hostpath-pvc   Bound    pv-local   5Gi        RWX            local          46s
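TMS will reference this claim by name when it is installed. If you want to double-check that the models are visible from inside the cluster, any pod can mount the claim; the pod below is a hypothetical throwaway used only for verification:

cat << 'EOF' > repo-check.yaml
apiVersion: v1
kind: Pod
metadata:
  name: repo-check            # hypothetical verification pod
spec:
  restartPolicy: Never
  containers:
    - name: ls
      image: busybox
      command: ["ls", "-R", "/models"]
      volumeMounts:
        - name: model-repo
          mountPath: /models
  volumes:
    - name: model-repo
      persistentVolumeClaim:
        claimName: hostpath-pvc   # the claim created above
EOF
kubectl create -f repo-check.yaml

Once the pod completes, kubectl logs repo-check prints the repository contents, and kubectl delete -f repo-check.yaml cleans it up.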
