Production Deployment
Refer to the app overview if you need to revisit it.
Deployment Using Helm Charts
Order of Deployment
Follow this order of deployment:
Deploy FSL Application
helm install fsl https://helm.ngc.nvidia.com/nfgnkvuikvjm/mdx-v1_0/charts/fsl-app-0.2.2.tgz -f fsl-app-values.yaml --username='$oauthtoken' --password=<YOUR API KEY>
Note
Initialization of some services might take up to 30 minutes.
For troubleshooting issues, refer to the Reference Applications section.
Sample override values can be found in fsl-app-values.yaml in the “application-helm-configs” tar file. The command above uses this override file to change the default values of the application. To deploy the application with the default values, run the same command without the -f fsl-app-values.yaml option.
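Because initialization can take up to 30 minutes, it may help to poll the pod status instead of checking by hand. The following is a minimal sketch, not part of the application: the function name wait_for_pods and its two parameters are illustrative, and it assumes kubectl already points at the cluster and namespace where the chart was installed.

```shell
# Sketch: poll until every pod is Running or Completed, or give up at a timeout.
# wait_for_pods, timeout_s and interval_s are illustrative names.
wait_for_pods() {
  timeout_s="${1:-1800}"   # default of 30 minutes, matching the note above
  interval_s="${2:-30}"
  elapsed=0
  while :; do
    pods=$(kubectl get pods --no-headers 2>/dev/null) || pods=""
    # STATUS is the third column of `kubectl get pods` output.
    pending=$(printf '%s\n' "$pods" | awk 'NF && $3 != "Running" && $3 != "Completed"')
    if [ -n "$pods" ] && [ -z "$pending" ]; then
      echo "All pods are Running or Completed."
      return 0
    fi
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "Timed out after ${elapsed}s. Pods not yet healthy:" >&2
      printf '%s\n' "$pending" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}
```

Calling wait_for_pods with no arguments waits up to 1800 seconds and checks every 30 seconds. An empty pod list is treated as "not ready yet", since pods may not exist immediately after helm install returns.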
Verify Deployment Installation
kubectl get pods -owide
kubectl get pods -owide | grep -v 'Compl\|Runn'   # lists any pod that is not running or is failing
Note
If any pods have failed, debug the issue using the troubleshooting steps described below.
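The grep filter above matches on the status text only, so a pod that reports Running while some of its containers are not ready will pass unnoticed. As a hedged sketch, the helper below (unhealthy_pods is an illustrative name) also compares the two halves of the READY column. It assumes the standard kubectl get pods column order of NAME, READY, STATUS.

```shell
# Sketch: print pods that are unhealthy by STATUS or not fully ready by READY.
# Reads `kubectl get pods --no-headers` output on stdin.
unhealthy_pods() {
  awk '{
    split($2, r, "/")              # READY looks like "<ready>/<total>"
    if ($3 == "Completed") next    # finished jobs are expected to be 0/N
    if ($3 != "Running" || r[1] != r[2]) print $1, $2, $3
  }'
}

# Usage (requires cluster access):
#   kubectl get pods --no-headers | unhealthy_pods
```

An empty result means every pod is either Completed or Running with all of its containers ready.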
Once all services have been deployed, you will see the following components that comprise the FSL app:
Troubleshoot Pod/Deployment Failures
Steps to follow:
Check the events for the failed or crashed pods: kubectl describe pod <Failed_Pod_Name>.
View the logs of a failed pod to find the error: kubectl logs -f <Failed_Pod_Name>.
View the logs of a specific container inside a pod: kubectl logs -f <Failed_Pod_Name> -c <failed_pod_container_name>. (To obtain the container name, run kubectl describe pod <Failed_Pod_Name>, which lists all the containers in the pod.)
If a pod is not running because of a Kubernetes scheduling problem, the events will show the failure. If a pod is crashing, the logs of the pod or container will show why it failed to start.
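The steps above can be gathered into one pass per pod. This is a sketch under stated assumptions rather than an official tool: diagnose_pods is an illustrative name, the 50-line tail is an arbitrary choice, and the --all-containers and --previous options of kubectl logs are used so that multi-container pods and restarted containers are covered.

```shell
# Sketch: dump events and recent logs for each pod named on the command line.
diagnose_pods() {
  for pod in "$@"; do
    echo "===== $pod: describe (check the Events section) ====="
    kubectl describe pod "$pod"
    echo "===== $pod: current logs, all containers ====="
    kubectl logs "$pod" --all-containers=true --tail=50
    echo "===== $pod: logs from the previous container instance ====="
    # --previous fails when a container has never restarted; ignore that case.
    kubectl logs "$pod" --all-containers=true --previous --tail=50 2>/dev/null || true
  done
}

# Usage (requires cluster access):
#   diagnose_pods <Failed_Pod_Name>
```

The previous-instance logs are often the most useful for a crashing pod, because the current container may have restarted before writing anything.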
Remove App Deployment
Delete a specific installed Helm chart
helm delete <chart_name>
Note
To find the name of the chart to delete, run the helm ls command.
Delete all the installed Helm charts
for helm_chart in `helm ls -q` ; do helm delete $helm_chart ; done
Note
By default, helm delete does not clean up the Kafka CRDs, so the Kafka cluster keeps running. The Kafka cluster is cleaned up in the next steps.
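To see whether any Kafka CRDs are still registered after the Helm charts are deleted, the CRD list can be filtered by name. This is only a sketch: the source does not name the CRDs, so matching on the substring "kafka" is an assumption, and leftover_kafka_crds is an illustrative function name.

```shell
# Sketch: list CRDs whose name mentions Kafka; prints nothing if none remain.
# Assumption: the Kafka CRDs contain "kafka" in their names.
leftover_kafka_crds() {
  kubectl get crd -o name | grep -i kafka || true
}

# Usage (requires cluster access):
#   leftover_kafka_crds
```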
Clean up old PVCs and PVs to remove data
kubectl delete pvc --all --force && kubectl delete pv --all --force
Verify the cleanup
helm ls
kubectl get pods,pvc,pv
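The same verification can be expressed as a pass/fail check, which is convenient in a teardown script. The sketch below assumes the current namespace is dedicated to this application, since any unrelated pod or volume would also be reported. The function name cleanup_is_complete is illustrative.

```shell
# Sketch: return 0 only if no Helm releases, pods, PVCs, or PVs remain.
cleanup_is_complete() {
  releases=$(helm ls -q)
  # -o name prints one "kind/name" line per resource and nothing when empty.
  resources=$(kubectl get pods,pvc,pv -o name 2>/dev/null)
  if [ -n "$releases" ] || [ -n "$resources" ]; then
    echo "Leftovers found:" >&2
    printf '%s\n' "$releases" "$resources" >&2
    return 1
  fi
  echo "Cleanup verified: nothing left."
}

# Usage (requires cluster access):
#   cleanup_is_complete
```

Pods can stay in the Terminating state for a short while after helm delete, so a failed check immediately after deletion may simply need to be retried.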
Clean up old PVCs data from machine filesystem
sudo rm -rf /opt/mdx-localpath /opt/mdx-local-nfs-path /opt/hostpath
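Because rm -rf is irreversible, it can be worth confirming which of these directories actually exist on the machine before deleting them. A minimal sketch follows; existing_dirs is an illustrative helper name, not part of the application.

```shell
# Sketch: print only the arguments that are existing directories,
# so they can be reviewed before running the rm -rf command.
existing_dirs() {
  for dir in "$@"; do
    [ -d "$dir" ] && printf '%s\n' "$dir"
  done
  return 0
}

# Usage:
#   existing_dirs /opt/mdx-localpath /opt/mdx-local-nfs-path /opt/hostpath
```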