Step 4: Run The Workflow

Route Optimization

Once the previous helm install command has finished running, you should see several services you can access. Make a note of the provided URLs for these services; you’ll use them later in the guide.

services.png

Use the Jupyter notebook client URL from the available services to open up Jupyter in your browser. This should be in the format shown below.


https://client-<NAMESPACE>.your-cluster.your-domain.com


Once Jupyter is open, open the route-opt-workflow.ipynb notebook from the left pane. You can then follow and execute the steps in the notebook to send an example request to cuOpt.

Note

You can use the Shift + Enter key combination to execute a cell in the notebook.

jupyter.png

After completing the notebook, let’s look at the Grafana dashboard. Navigate to the https://dashboards.your-cluster.your-domain.com link that was provided in the output of the workflow Helm chart installation. This leads to the Grafana sign-in page. The username is admin, and the password is obtained by running the command in the code block below on your Kubernetes cluster.

Grafana:

  • User: admin

  • Password: <see code block below>


kubectl get secret grafana-admin-credentials -n nvidia-monitoring -o json | jq -r '.data.GF_SECURITY_ADMIN_PASSWORD' | base64 -d
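To make the pipeline above less opaque, here is a minimal Python sketch of what the jq and base64 stages do: Kubernetes stores Secret values base64-encoded under the data key, so the field has to be extracted and then decoded. The helper name decode_secret_field and the example payload are made up for illustration; a real payload would come from the kubectl command.

```python
import base64
import json


def decode_secret_field(secret_json: str, key: str) -> str:
    """Return the decoded value of one field from a Kubernetes Secret's JSON."""
    secret = json.loads(secret_json)
    encoded = secret["data"][key]  # Secret values are stored base64-encoded
    return base64.b64decode(encoded).decode("utf-8")


# Made-up Secret payload for illustration only; the real one is the output of
# `kubectl get secret ... -o json`.
example = json.dumps(
    {
        "data": {
            "GF_SECURITY_ADMIN_PASSWORD": base64.b64encode(
                b"example-password"
            ).decode("ascii")
        }
    }
)

print(decode_secret_field(example, "GF_SECURITY_ADMIN_PASSWORD"))
```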


Why Monitoring

In production, every microservice needs observability. Metrics allow a data scientist or machine learning engineer to make informed decisions about the service’s scale and health. Capturing metrics such as average queue time and latency shows how the service behaves over time. If the queue time keeps increasing, the server is receiving more requests than it can process. Once the queue time reaches the allowable threshold, the service needs to be scaled by adding replicas so that it can process more requests.
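The scaling reasoning above can be sketched as a small decision function. This is an illustration only, not how the workflow itself scales: the function name, the max_replicas cap, and the proportional rule (similar in spirit to the ratio used by the Kubernetes Horizontal Pod Autoscaler) are assumptions chosen for the example.

```python
import math
from statistics import mean


def replicas_needed(queue_times_s, threshold_s, current_replicas, max_replicas=10):
    """Suggest a replica count from recent queue-time samples (in seconds).

    Hypothetical helper: keeps the current count while the average queue time
    is at or below the threshold, otherwise scales up proportionally.
    """
    if not queue_times_s:
        return current_replicas  # no data, so no change
    avg = mean(queue_times_s)
    if avg <= threshold_s:
        return current_replicas  # still within the allowable threshold
    # Scale in proportion to how far the average exceeds the threshold,
    # capped at max_replicas.
    wanted = math.ceil(current_replicas * avg / threshold_s)
    return min(wanted, max_replicas)


print(replicas_needed([0.2, 0.3], threshold_s=0.5, current_replicas=2))
print(replicas_needed([1.0, 1.0], threshold_s=0.5, current_replicas=2))
```

The first call stays at the current replica count because the average queue time is under the threshold; the second suggests more replicas because the average is twice the threshold.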

Monitoring in a Cloud Native Environment

In Kubernetes, metrics monitoring is commonly handled with Prometheus and Grafana. Prometheus is an open-source monitoring and alerting tool. It “pulls” metrics (measurements) from microservices by sending HTTP requests and stores the results in a time-series database. Prometheus uses ServiceMonitor objects to discover the Kubernetes service endpoints it should scrape as targets.
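To give a feel for what a “pull” returns, the sketch below parses a few lines in the Prometheus text exposition format, which is what a scraped endpoint serves over HTTP. It is a simplified parser for illustration (it ignores timestamps and metric types), and the metric names in the sample are invented, not metrics exposed by this workflow.

```python
import re

# name, optional {labels}, then the sample value
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# one label pair: key="value" (value may contain escaped characters)
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Invented sample payload, shaped like the body of a scrape response.
scraped = """
# HELP request_queue_seconds Time a request waited in the queue.
# TYPE request_queue_seconds gauge
request_queue_seconds{service="solver"} 0.42
requests_total 17
"""

for sample in parse_metrics(scraped):
    print(sample)
```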

Monitoring a Service

In this case, we’re showing how to monitor the performance of the GPU Operator, which is used by cuOpt to accelerate solver requests. This is done by scraping the DCGM Exporter component of the operator using a ServiceMonitor. For more information on the metrics available, see the DCGM documentation.

You can also explore the source code and modify the workflow to suit your own needs. The source code can be found in the NGC Collection for this workflow.

© Copyright 2022-2023, NVIDIA. Last updated on May 23, 2023.