NVIDIA Self-Hosted Server Overview

NVIDIA cuOpt is a containerized server that can be deployed locally or with cloud service providers (CSPs).

_images/genms_cuopt.png

The example server uses HTTP POST requests on port 5000 to accept optimization input data. cuOpt works on the given data with options/constraints provided, and returns optimized route plans in response.

_images/cuOpt-self-hosted.png

Quickstart Guide

Step 1: Get an NVIDIA AI Enterprise Account

  1. Get a subscription for NVIDIA AI Enterprise (NVAIE) to get the cuOpt container to host in your cloud.

Step 2: Access NGC

  1. Log into NGC using the invite and choose the appropriate NGC org.

  2. Generate an NGC API key from settings. If you have not generated an API key before, go to the Setup option in your profile and choose Get API Key. Store the key somewhere safe; otherwise you will need to generate a new one next time. More information can be found here.

Step 3: Pull a cuOpt Container

  1. Prerequisites for running the container:

    • nvidia-docker2 needs to be installed

    • A single NVIDIA GPU of Ampere or Hopper architecture.

    • Multi GPU is not supported. cuOpt uses only one GPU per cuOpt instance.

    • CPU - x86-64 >= 8 core (Recommended)

    • Memory >= 16 GB (Recommended)

    • Minimum Storage: 20 GB (8 GB container size)

    • CUDA 11.8+

    • Compute Capability >= 8.x

    • Minimum NVIDIA Driver Version: 450.36.06

  2. Go to the container section for cuOpt and copy the pull tag for the latest image.

    1. Within the Select a tag dropdown, locate the container image release that you want to run.

    2. Click the Copy Image Path button to copy the container image path.

  3. Log into the nvcr.io container registry in your cluster setup, using the NGC API key as shown below.

    docker login nvcr.io
    Username: $oauthtoken
    Password: <my-api-key>
    

    Note

    The username is $oauthtoken and the password is your API key.

  4. Pull the cuOpt container.

    The container for cuOpt can be found in the Containers tab in NGC. Copy the tag from the cuOpt container page and use it as follows:

    # Save the image tag in a variable for use below
    export CUOPT_IMAGE="CONTAINER_TAG_COPIED_FROM_NGC"
    
    docker pull $CUOPT_IMAGE
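
The driver prerequisite listed in this step can be checked from a script before the container is started. The following is a minimal sketch and not part of cuOpt: the helper names parse_version, driver_ok, and installed_driver_version are hypothetical, and it assumes nvidia-smi is available on the PATH. It compares a driver version string against the minimum of 450.36.06 given in the prerequisites.

```python
import subprocess

MIN_DRIVER = "450.36.06"  # minimum driver version from the prerequisites list


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '450.36.06' into (450, 36, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(installed: str, minimum: str = MIN_DRIVER) -> bool:
    """Return True when the installed driver is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def installed_driver_version() -> str:
    """Ask nvidia-smi for the driver version of the first GPU it reports.

    Assumption: nvidia-smi is installed and at least one GPU is visible.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()[0].strip()
```

On a GPU host, driver_ok(installed_driver_version()) evaluates to True when the driver meets the stated minimum. The other prerequisites (GPU architecture, CUDA version, storage) are not covered by this sketch.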
    

Step 4: Running cuOpt

Note

The commands below run the cuOpt container in detached mode. To stop a detached container, use the docker stop command. To instead run the container in interactive mode, remove the -d option.

  1. If you have Docker 19.03 or later:

    • A typical command to launch the container is:

    docker run -it -d --gpus=1 --rm -p 5000:5000 $CUOPT_IMAGE
    
  2. If you have Docker 19.02 or earlier:

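    • A typical command to launch the container is shown below. The --gpus option was introduced in Docker 19.03, so older versions select the GPU through the NVIDIA_VISIBLE_DEVICES environment variable instead:

    nvidia-docker run -it -d -e NVIDIA_VISIBLE_DEVICES=0 --rm -p 5000:5000 $CUOPT_IMAGE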
    
  3. By default the server listens on port 5000; this can be changed with the CUOPT_SERVER_PORT environment variable:

    docker run -it -d --gpus=1 -e CUOPT_SERVER_PORT=8080 --rm -p 8080:8080 $CUOPT_IMAGE
    
  4. These commands launch the container, after which the cuOpt API endpoints are available for testing.

  5. If you have multiple GPUs and would like to choose a particular GPU, please use --gpus device=<GPU_ID>. GPU_ID can be found using the command nvidia-smi.

  6. The server's log level and other options can be configured using the following environment variables:

    • CUOPT_SERVER_PORT : For server port (default 5000).

    • CUOPT_SERVER_IP : For server IP (default 0.0.0.0).

    • CUOPT_SERVER_LOG_LEVEL : Options are critical, error, warning, info, debug (default is info).

    • CUOPT_DATA_DIR : A shared mount path used to optionally pass cuOpt problem data files to the routes endpoint, instead of sending data over the network (default is None).

    • CUOPT_RESULT_DIR : A shared mount path used to optionally pass cuOpt result files from the routes endpoint, instead of sending data over the network (default is None).

    • CUOPT_MAX_RESULT : Maximum size (in kilobytes) of a result returned over HTTP from the routes endpoint when CUOPT_RESULT_DIR is set. Set to 0 to have all results written to CUOPT_RESULT_DIR (default is 250).

    • CUOPT_SSL_CERTFILE : File path in the container for the SSL certificate. You may need to mount a directory containing this file with read and write access.

    • CUOPT_SSL_KEYFILE : File path in the container for the key file. You may need to mount a directory containing this file with read and write access. A self-signed certificate can be generated as follows:

      openssl genrsa -out ca.key 2048
      openssl req -new -x509 -days 365 -key ca.key -subj "/C=CN/ST=GD/L=SZ/O=Acme, Inc./CN=Acme Root CA" -out ca.crt
      
      openssl req -newkey rsa:2048 -nodes -keyout server.key -subj "/C=CN/ST=GD/L=SZ/O=Acme, Inc./CN=*.example.com" -out server.csr
      openssl x509 -req -extfile <(printf "subjectAltName=DNS:example.com,DNS:www.example.com") -days 365 -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt
      

      server.crt and server.key are meant for the server; ca.crt is meant for the client.

  7. Instead of sending data over the network, input data can be read from, and results written to, shared mount paths set with the CUOPT_DATA_DIR and CUOPT_RESULT_DIR options:

    Create directories to share with the container and run the container.

    Note

    The data and results directories mounted on the cuOpt container need to be readable and writable by the container user, and also have the execute permission set. If they are not, the container will print an error message and exit. Be careful to set permissions correctly on those directories before running the cuOpt server.

    mkdir data
    mkdir results
    
    CID=$(docker run --rm --gpus=all --network=host \
    -v `pwd`/data:/cuopt_data  \
    -v `pwd`/results:/cuopt_results \
    -e CUOPT_DATA_DIR=/cuopt_data \
    -e CUOPT_RESULT_DIR=/cuopt_results \
    -e CUOPT_MAX_RESULT=0 \
    -e CUOPT_SERVER_PORT=8081 \
    -p 8081:8081 \
    -it -d $CUOPT_IMAGE)
    

    Create a data file in the data directory and send a POST request to cuOpt with the name of the data file.

    echo "{
     \"cost_matrix_data\": {\"data\": {\"0\": [[0, 1], [1, 0]]}},
     \"task_data\": {\"task_locations\": [1], \"demand\": [[1]], \"task_time_windows\": [[0, 10]], \"service_times\": [1]},
     \"fleet_data\": {\"vehicle_locations\":[[0, 0]], \"capacities\": [[2]], \"vehicle_time_windows\":[[0, 20]] },
     \"solver_config\": {\"time_limit\": 2}
    }" > data/data.json
    
    curl --location 'http://<SERVER_IP>:<SERVER_PORT>/cuopt/routes' \
    --header 'Content-Type: application/json' \
    --header "CLIENT-VERSION: custom" \
    --header "CUOPT-DATA-FILE: data.json" \
    -d {}
    

    The result is then available in the results directory:

    cat results/*
    

    NOTE: Stop the running Docker container once you have completed testing, for example with docker stop $CID.
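
The shared-directory workflow in this step can also be driven from Python. The following is a minimal sketch using only the standard library, not the cuOpt thin client: the helper names write_problem and data_file_request are hypothetical, and the server address passed to them is an assumption about your deployment. It writes the problem file into the mounted data directory and builds the same POST request as the curl command, which names the file in the CUOPT-DATA-FILE header and sends an empty JSON body.

```python
import json
import pathlib
import urllib.request


def write_problem(data_dir: str, file_name: str, problem: dict) -> pathlib.Path:
    """Write the problem as JSON into the shared data directory.

    An existing file with the same name is overwritten, not appended to.
    """
    path = pathlib.Path(data_dir) / file_name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(problem))
    return path


def data_file_request(server_ip: str, port: int, file_name: str) -> urllib.request.Request:
    """Build the POST request that names the data file instead of carrying the data."""
    return urllib.request.Request(
        f"http://{server_ip}:{port}/cuopt/routes",
        data=b"{}",  # empty JSON body, as in the curl command
        headers={
            "Content-Type": "application/json",
            "CLIENT-VERSION": "custom",
            "CUOPT-DATA-FILE": file_name,  # file name inside CUOPT_DATA_DIR
        },
        method="POST",
    )
```

With the container from this step running, urllib.request.urlopen(data_file_request("<SERVER_IP>", 8081, "data.json")) sends the request, and the result is written to the mounted results directory because CUOPT_MAX_RESULT is set to 0.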

Step 5: Testing container

  1. curl can be used instead of the thin client to communicate with the server, for example:

    curl --location 'http://<SERVER_IP>:<SERVER_PORT>/cuopt/routes' \
    --header 'Content-Type: application/json' \
    --header "CLIENT-VERSION: custom" \
    -d '{
        "cost_matrix_data": {"data": {"0": [[0, 1], [1, 0]]}},
        "task_data": {"task_locations": [1], "demand": [[1]], "task_time_windows": [[0, 10]], "service_times": [1]},
        "fleet_data": {"vehicle_locations":[[0, 0]], "capacities": [[2]], "vehicle_time_windows":[[0, 20]] },
        "solver_config": {"time_limit": 2}
        }'
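
The same request can be sent from Python without the thin client. The following is a minimal sketch using only the standard library: the names PROBLEM, routes_request, and solve are hypothetical, and it assumes the server replies with a JSON body. The payload and headers mirror the curl command above.

```python
import json
import urllib.request

# The same small problem as in the curl example.
PROBLEM = {
    "cost_matrix_data": {"data": {"0": [[0, 1], [1, 0]]}},
    "task_data": {
        "task_locations": [1],
        "demand": [[1]],
        "task_time_windows": [[0, 10]],
        "service_times": [1],
    },
    "fleet_data": {
        "vehicle_locations": [[0, 0]],
        "capacities": [[2]],
        "vehicle_time_windows": [[0, 20]],
    },
    "solver_config": {"time_limit": 2},
}


def routes_request(server_ip: str, port: int, problem: dict) -> urllib.request.Request:
    """Build a POST request for the /cuopt/routes endpoint."""
    return urllib.request.Request(
        f"http://{server_ip}:{port}/cuopt/routes",
        data=json.dumps(problem).encode("utf-8"),
        headers={"Content-Type": "application/json", "CLIENT-VERSION": "custom"},
        method="POST",
    )


def solve(server_ip: str, port: int, problem: dict, timeout: float = 30.0) -> dict:
    """Send the problem and return the decoded response.

    Assumption: the response body is JSON.
    """
    request = routes_request(server_ip, port, problem)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling solve("<SERVER_IP>", 5000, PROBLEM) against a running container returns the parsed response; the default port 5000 applies unless CUOPT_SERVER_PORT was changed.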
    

Step 6: Installing Thin Client Using Pip Index

Note

The self-hosted thin client requires Python 3.10.

  1. The thin client enables users to test quickly; users can also design their own clients based on it.

  2. Whenever the thin client needs to be installed or updated, you can install it directly from the NVIDIA pip index.

  3. Requirements:

    • Python == 3.10.X

    pip install --upgrade --extra-index-url https://pypi.nvidia.com cuopt-sh-client
    
  4. Users can also write their own thin client; refer to Build Your Own Thin Client.

  5. For more information, see Self-Hosted Thin Client.

Step 7: Installing cuOpt Using a Helm Chart

  1. Create a namespace in Kubernetes.

    kubectl create namespace <some name>
    export NAMESPACE="<some name>"
    
  2. Configure the NGC API key as a secret.

    kubectl create secret docker-registry ngc-docker-reg-secret \
    -n $NAMESPACE --docker-server=nvcr.io --docker-username='$oauthtoken' \
    --docker-password=$NGC_CLI_API_KEY
    
  3. Fetch the Helm Chart.

    The Helm chart for cuOpt can be found in the Helm Charts tab in NGC. Copy the fetch tag from the cuOpt Helm page and use it as follows:

    helm fetch <FETCH-TAG-COPIED-FROM-NGC> --username='$oauthtoken' --password=<YOUR API KEY> --untar
    
  4. Run the cuOpt server.

    helm install --namespace $NAMESPACE nvidia-cuopt-chart cuopt --values cuopt/values.yaml
    
  5. Downloading the container might take some time. During this period the pod status is shown as PodInitializing; it changes to Running once the server is up. Use the following command to verify:

    kubectl -n $NAMESPACE get all
    
    NAME                                          READY   STATUS    RESTARTS   AGE
    pod/cuopt-cuopt-deployment-595656b9d6-dbqcb   1/1     Running   0          21s
    
    NAME                                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
    service/cuopt-cuopt-deployment-cuopt-service   ClusterIP   X.X.X.X          <none>        5000/TCP,8888/TCP   21s
    
    NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/cuopt-cuopt-deployment   1/1     1            1           21s
    
    NAME                                                DESIRED   CURRENT   READY   AGE
    replicaset.apps/cuopt-cuopt-deployment-595656b9d6   1         1         1       21s
    
  6. Uninstall the NVIDIA cuOpt server:

    helm uninstall -n $NAMESPACE nvidia-cuopt-chart
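
The verification in this step can also be scripted instead of read from the kubectl table. The following is a minimal sketch, assuming kubectl is configured for the cluster: the helper names pods_running and get_pods are hypothetical. It relies on the phase field that Kubernetes reports for each pod, which is Pending while a pod is still initializing and Running once its containers have started.

```python
import json
import subprocess


def pods_running(pod_list: dict) -> bool:
    """True when the list is non-empty and every pod reports phase 'Running'."""
    items = pod_list.get("items", [])
    return bool(items) and all(
        pod.get("status", {}).get("phase") == "Running" for pod in items
    )


def get_pods(namespace: str) -> dict:
    """Fetch the pods of a namespace as parsed JSON via kubectl.

    Assumption: kubectl is on the PATH and can reach the cluster.
    """
    out = subprocess.run(
        ["kubectl", "-n", namespace, "get", "pods", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

After helm install, pods_running(get_pods("<some name>")) can be polled until it returns True. It checks only the pod phase, not whether the cuOpt endpoints already respond.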