Production Deployment

Refer back to the app overview if needed.

Deployment Using Helm Charts

Order of Deployment

Follow this order of deployment:

Deploy FSL Application

helm install fsl https://helm.ngc.nvidia.com/nfgnkvuikvjm/mdx-v1_0/charts/fsl-app-0.2.2.tgz -f fsl-app-values.yaml --username='$oauthtoken' --password=<YOUR API KEY>

Note

Initialization of some services might take up to 30 minutes.
For troubleshooting issues, refer to the Reference Applications section.
Sample override values can be found at fsl-app-values.yaml in the “application-helm-configs” tar file.
The command above uses an override file to change the application's default values. To deploy the application with the default values, run the same command without the -f fsl-app-values.yaml option.
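
As a minimal sketch of the two variants described in the note above, the install command can be wrapped in a shell function that adds the -f option only when a values file is given. The function name install_fsl and the NGC_API_KEY environment variable are assumptions made for this illustration; they are not part of the chart or of the NGC tooling.

```shell
# Chart URL copied from the install command above.
CHART_URL='https://helm.ngc.nvidia.com/nfgnkvuikvjm/mdx-v1_0/charts/fsl-app-0.2.2.tgz'

# Install the FSL chart. Pass a values file as the first argument to override
# the chart defaults, or pass nothing to install with the defaults.
# NGC_API_KEY is a hypothetical environment variable holding your API key.
install_fsl() {
  values_file="${1:-}"
  : "${NGC_API_KEY:?set NGC_API_KEY to your API key first}"
  if [ -n "$values_file" ]; then
    helm install fsl "$CHART_URL" -f "$values_file" \
      --username='$oauthtoken' --password="$NGC_API_KEY"
  else
    helm install fsl "$CHART_URL" \
      --username='$oauthtoken' --password="$NGC_API_KEY"
  fi
}
```

With this in place, install_fsl fsl-app-values.yaml deploys with the overrides and install_fsl alone deploys with the chart defaults.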

Verify Deployment Installation

kubectl get pods -owide
kubectl get pods -owide | grep -v 'Compl\|Runn'   # list pods that are neither Running nor Completed

Note

If any pods have failed, debug them using the troubleshooting steps described below.
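
The grep filter above matches text anywhere in the line. A slightly stricter variant, sketched here, compares the STATUS column directly. It assumes the default kubectl get pods column layout (NAME, READY, STATUS, ...) in a single namespace, and the helper name unhealthy_pods is hypothetical.

```shell
# Read `kubectl get pods` output on stdin and print the name and status of
# every pod whose STATUS column is neither Running nor Completed.
# Assumes the header row is present and STATUS is the third column.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

For example: kubectl get pods -owide | unhealthy_pods. Like the grep version, this does not flag a pod that is Running but not yet Ready.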

Once all services have been deployed, you will see the following components, which make up the FSL app:

VST

DeepStream

Similarity Search

Recognition Evaluator (Model Assessor)

Embedding Generation

Redis Message Broker

ELK Stack

MongoDB

Troubleshoot Pod/Deployment Failures

Steps to follow:

Check the events for the failed or crashed pod: kubectl describe pod <Failed_Pod_Name>.

View the logs of the failed pod to find the error: kubectl logs -f <Failed_Pod_Name>.

View the logs of a specific container inside a pod: kubectl logs -f <Failed_Pod_Name> -c <failed_pod_container_name>. The container names can be obtained from kubectl describe pod <Failed_Pod_Name>, which lists all containers running in the pod.

If a pod is not running because of a Kubernetes scheduling problem, the events will show the failure. If a pod is crashing, the pod or container logs will show why it failed to start.
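
The steps above can be combined into one helper. This is only a sketch: the function name debug_pod is hypothetical, it covers regular containers only (not init containers), and it prints the last 50 log lines instead of following the log with -f so that it returns on its own.

```shell
# Print the events and recent container logs for one pod.
debug_pod() {
  pod="$1"
  # Scheduling problems show up in the Events section of the description.
  kubectl describe pod "$pod"
  # Collect the container names, then print recent log lines for each one.
  containers=$(kubectl get pod "$pod" -o jsonpath='{.spec.containers[*].name}')
  for c in $containers; do
    echo "=== logs: $pod / $c ==="
    kubectl logs "$pod" -c "$c" --tail=50
  done
}
```

Call it with the name of a failed pod, for example debug_pod <Failed_Pod_Name>.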

Remove App Deployment

Delete a specific installed Helm chart
helm delete <chart_name>
Note
- To find the name of the chart to delete, run helm ls.
Delete all the installed Helm charts
for helm_chart in $(helm ls -q); do helm delete "$helm_chart"; done
Note
- By default, helm delete does not clean up the Kafka CRDs, so the Kafka cluster keeps running. It is cleaned up in the following steps.

Clean up old PVCs and PVs to remove data

kubectl delete pvc --all --force && kubectl delete pv --all --force

Verify the cleanup
helm ls
kubectl get pods,pvc,pv
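
The same check can be scripted. The following sketch treats the cleanup as complete when helm ls -q and kubectl get pods,pvc,pv --no-headers both print nothing. The function name cleanup_is_complete is hypothetical, and the check only covers the current namespace plus the cluster-wide PVs.

```shell
# Succeed (exit status 0) only if no Helm releases, pods, PVCs, or PVs remain.
cleanup_is_complete() {
  charts=$(helm ls -q)
  # kubectl reports "No resources found" on stderr, so stdout stays empty.
  resources=$(kubectl get pods,pvc,pv --no-headers 2>/dev/null)
  [ -z "$charts" ] && [ -z "$resources" ]
}
```

It can be used in a condition, for example: if cleanup_is_complete; then echo "cleanup complete"; else echo "resources remain"; fi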

Clean up old PVC data from the machine's filesystem

sudo rm -rf /opt/mdx-localpath /opt/mdx-local-nfs-path /opt/hostpath