Deployment Guide#

This guide covers deploying CuOpt using the NIM Operator.

Automated Deployment#

The easiest way to deploy CuOpt is to run the provided deployment script.

Using the Deploy Script#

deploy.sh (showing first 50 lines)#
#!/bin/bash
# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

#
# CuOpt NIM Operator Deployment Script
#
# This script automates the deployment of NVIDIA cuOpt using the NIM Operator.
#
# Usage:
#   ./deploy.sh                           # Deploy with defaults
#   ./deploy.sh --namespace my-ns         # Custom namespace
#   ./deploy.sh --uninstall               # Remove deployment
#   ./deploy.sh --help                    # Show help
#

set -e

# Default values
NAMESPACE="nim-service"
CUOPT_IMAGE_TAG="25.12.0-cuda12.9-py3.13"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
UNINSTALL=false
SKIP_PREREQUISITES=false
WAIT_TIMEOUT=300

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

print_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

print_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

print_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

usage() {
    cat << EOF
CuOpt NIM Operator Deployment Script

Usage: $(basename "$0") [OPTIONS]

  1. Set your NGC API Key:

    export NGC_API_KEY=<your-ngc-api-key>

  2. Run the deployment script:

    ./deploy.sh

  3. With custom options:

    ./deploy.sh --namespace my-cuopt --wait 600
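
Because step 1 requires NGC_API_KEY to be exported, a wrapper can fail fast when the variable is missing. The following is a minimal sketch; `require_ngc_api_key` is a hypothetical helper and not part of deploy.sh, which may already perform its own check.

```shell
# Hypothetical pre-flight helper (not part of deploy.sh): succeeds only when
# NGC_API_KEY is set to a non-empty value.
require_ngc_api_key() {
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set; export it before running deploy.sh" >&2
        return 1
    fi
    return 0
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   require_ngc_api_key && ./deploy.sh --namespace my-cuopt --wait 600
```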
    

Script Options#

Usage: deploy.sh [OPTIONS]

Options:
    -n, --namespace NAME      Kubernetes namespace (default: nim-service)
    -t, --tag TAG             CuOpt image tag
    -u, --uninstall           Uninstall CuOpt deployment
    -s, --skip-prerequisites  Skip prerequisite checks
    -w, --wait SECONDS        Timeout for waiting on resources (default: 300)
    -h, --help                Show help message

Manual Deployment#

If you prefer to deploy manually, follow these steps.

Step 1: Create Namespace#

namespace.yaml

# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Namespace
metadata:
  name: nim-service
  labels:
    app.kubernetes.io/name: cuopt
    app.kubernetes.io/component: nim-service

Apply the namespace:

kubectl apply -f namespace.yaml

Step 2: Create Secrets#

Create the image pull secret:

kubectl create secret -n nim-service docker-registry ngc-secret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password="${NGC_API_KEY}"

Create the NGC API key secret:

kubectl create secret -n nim-service generic ngc-api-secret \
    --from-literal=NGC_API_KEY="${NGC_API_KEY}"
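
`kubectl create secret` fails if the secret already exists, which gets in the way when rotating a key or re-running the step. One common workaround is to render the secret with `--dry-run=client -o yaml` and pipe it to `kubectl apply`. The sketch below wraps both secrets this way; the function name `apply_ngc_secrets` is hypothetical.

```shell
# Sketch: create or update both NGC secrets idempotently.
# apply_ngc_secrets is a hypothetical helper name; the secret names and
# registry match the commands in this step.
apply_ngc_secrets() {
    local ns="${1:-nim-service}"
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set" >&2
        return 1
    fi
    # Render the manifest client-side, then apply it so reruns update in place.
    kubectl create secret docker-registry ngc-secret -n "$ns" \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password="$NGC_API_KEY" \
        --dry-run=client -o yaml | kubectl apply -f - || return 1
    kubectl create secret generic ngc-api-secret -n "$ns" \
        --from-literal=NGC_API_KEY="$NGC_API_KEY" \
        --dry-run=client -o yaml | kubectl apply -f -
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   apply_ngc_secrets nim-service
```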

Step 3: Deploy CuOpt NIMService#

cuopt-nimservice.yaml

# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: cuopt-service
  namespace: nim-service
spec:
  image:
    repository: nvcr.io/nvidia/cuopt/cuopt
    tag: "25.12.0-cuda12.9-py3.13"
    pullPolicy: IfNotPresent
    pullSecrets:
      - ngc-secret
  authSecret: ngc-api-secret
  env:
    - name: CUOPT_DATA_DIR
      value: /model-store
    - name: CUOPT_SERVER_LOG_LEVEL
      value: info
    - name: CUOPT_SERVER_PORT
      value: "8000"
  storage:
    pvc:
      create: true
      size: 10Gi
      storageClass: ""
      volumeAccessMode: "ReadWriteOnce"
  resources:
    limits:
      nvidia.com/gpu: 1
  expose:
    service:
      type: ClusterIP
      port: 8000
    ingress:
      enabled: false
  metrics:
    enabled: true
    serviceMonitor:
      additionalLabels:
        release: kube-prometheus-stack
  scale:
    enabled: false
  startupProbe:
    enabled: false
  livenessProbe:
    enabled: true
    probe:
      failureThreshold: 3
      httpGet:
        path: /v2/health/live
        port: api
      initialDelaySeconds: 15
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
  readinessProbe:
    enabled: true
    probe:
      failureThreshold: 30
      httpGet:
        path: /v2/health/ready
        port: api
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1

Apply the NIMService:

kubectl apply -f cuopt-nimservice.yaml
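
If the apply is rejected because the cluster does not recognize the NIMService kind, the NIM Operator's custom resource definition is probably missing. The sketch below checks for the CRD before applying. The CRD name `nimservices.apps.nvidia.com` is an assumption derived from the manifest's apiVersion and kind, and both function names are hypothetical.

```shell
# Assumption: the NIM Operator registers the NIMService kind under a CRD named
# nimservices.apps.nvidia.com (conventional plural of the kind plus API group).
nimservice_crd_installed() {
    kubectl get crd nimservices.apps.nvidia.com >/dev/null 2>&1
}

# Hypothetical wrapper: apply the manifest only when the CRD is present.
apply_nimservice() {
    local manifest="${1:-cuopt-nimservice.yaml}"
    if ! nimservice_crd_installed; then
        echo "NIMService CRD not found; install the NIM Operator first" >&2
        return 1
    fi
    kubectl apply -f "$manifest"
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   apply_nimservice cuopt-nimservice.yaml
```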

Verifying the Deployment#

Check NIMService Status#

kubectl get nimservice -n nim-service

Expected output:

NAME            STATE   AGE
cuopt-service   Ready   5m
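
The service can take a while to reach Ready, so a small polling loop saves re-running the command by hand. This sketch assumes the STATE column is backed by the `.status.state` field of the NIMService resource; `wait_for_nimservice` is a hypothetical helper, and its 300-second default mirrors the deploy script's default wait timeout.

```shell
# Poll a NIMService until its state is "Ready" or the timeout expires.
# Assumption: the STATE column shown above maps to .status.state.
# Arguments: name, namespace, timeout in seconds, poll interval in seconds.
wait_for_nimservice() {
    local name="${1:-cuopt-service}"
    local ns="${2:-nim-service}"
    local timeout="${3:-300}"
    local interval="${4:-5}"
    local elapsed=0
    local state=""
    while :; do
        # Treat a failed lookup as an empty state and keep polling.
        state="$(kubectl get nimservice "$name" -n "$ns" \
            -o jsonpath='{.status.state}' 2>/dev/null)" || state=""
        if [ "$state" = "Ready" ]; then
            echo "NIMService $name is Ready"
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Timed out after ${timeout}s; last state: ${state:-unknown}" >&2
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   wait_for_nimservice cuopt-service nim-service 600 10
```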

Check Pods#

kubectl get pods -n nim-service

Check Service#

kubectl get service -n nim-service

Expected output:

NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
cuopt-service   ClusterIP   10.101.219.185   <none>        8000/TCP   5m

Check Logs#

kubectl logs -f deployment/cuopt-service -n nim-service

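You should see output similar to the following, indicating that the server is running. The version and port in your logs depend on the image tag and the CUOPT_SERVER_PORT setting; with the manifest from Step 3, expect port 8000 rather than the 5000 shown in this sample: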

2025-11-19 20:52:50.655 INFO cuopt server version 25.10.01
2025-11-19 20:52:50.767 INFO Application startup complete.
2025-11-19 20:52:50.767 INFO Uvicorn running on http://0.0.0.0:5000

Testing the Deployment#

Port Forward for Local Testing#

kubectl port-forward svc/cuopt-service -n nim-service 8000:8000

Then access the service at http://localhost:8000.
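
`kubectl port-forward` blocks the terminal, so scripted tests usually run it in the background and stop it afterwards. The sketch below runs an arbitrary command while the forward is active. `with_port_forward` and the `PORT_FORWARD_SETTLE_SECONDS` variable are hypothetical names, and the fixed settle delay is a simplification rather than a true readiness check.

```shell
# Run a command while a background port-forward to cuopt-service is active,
# then stop the port-forward and return the command's exit status.
with_port_forward() {
    local settle="${PORT_FORWARD_SETTLE_SECONDS:-2}"
    local status=0
    kubectl port-forward svc/cuopt-service -n nim-service 8000:8000 \
        >/dev/null 2>&1 &
    local pf_pid=$!
    # Give the tunnel a moment to come up (simplification: fixed delay).
    sleep "$settle"
    "$@" || status=$?
    # Stop the port-forward; ignore errors if it has already exited.
    kill "$pf_pid" 2>/dev/null || true
    wait "$pf_pid" 2>/dev/null || true
    return "$status"
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   with_port_forward curl -s http://localhost:8000/v2/health/ready
```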

Test with cuopt_sh CLI#

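# Get the service ClusterIP (reachable only from inside the cluster;
# with the port-forward above, use -i localhost instead)
CUOPT_IP=$(kubectl get svc cuopt-service -n nim-service -o jsonpath='{.spec.clusterIP}')

# Run a test; -p selects the service port configured in the manifest
cuopt_sh -i "$CUOPT_IP" -p 8000 -t LP /path/to/your/problem.mps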

Example output:

2025-11-19 21:29:22.391 cuopt_sh_client.cuopt_self_host_client INFO Optimal
{'response': {'solver_response': {'status': 'Optimal', ...}}}

Health Endpoints#

  • Liveness: GET /v2/health/live

  • Readiness: GET /v2/health/ready

curl http://localhost:8000/v2/health/ready
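
To check both endpoints in one pass, the probe paths can be wrapped in a small script that reports the HTTP status code of each. This is a sketch that assumes a healthy endpoint answers with HTTP 200; `health_status` and `check_health` are hypothetical helper names.

```shell
# Print the HTTP status code returned by a health endpoint
# (curl prints "000" when the connection fails).
health_status() {
    local base_url="$1"
    local endpoint="$2"
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
        "${base_url}/v2/health/${endpoint}" || true
}

# Check the liveness and readiness endpoints; return non-zero if either one
# does not answer with HTTP 200 (assumed to mean "healthy").
check_health() {
    local base_url="${1:-http://localhost:8000}"
    local endpoint=""
    local code=""
    local rc=0
    for endpoint in live ready; do
        code="$(health_status "$base_url" "$endpoint")"
        echo "${endpoint}: HTTP ${code}"
        [ "$code" = "200" ] || rc=1
    done
    return "$rc"
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   check_health http://localhost:8000
```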

Cleanup#

Using the Script#

./deploy.sh --uninstall

Manual Cleanup#

kubectl delete -f cuopt-nimservice.yaml
kubectl delete secret ngc-secret ngc-api-secret -n nim-service
kubectl delete namespace nim-service
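
When some of these resources have already been removed, plain `kubectl delete` exits with an error. Adding `--ignore-not-found` makes the cleanup safe to repeat. The sketch below deletes the NIMService by name rather than by manifest file; `cleanup_cuopt` is a hypothetical helper name.

```shell
# Repeatable cleanup: each delete succeeds even if the resource is already
# gone. Resource names match the manual deployment steps in this guide.
cleanup_cuopt() {
    local ns="${1:-nim-service}"
    kubectl delete nimservice cuopt-service -n "$ns" --ignore-not-found
    kubectl delete secret ngc-secret ngc-api-secret -n "$ns" --ignore-not-found
    kubectl delete namespace "$ns" --ignore-not-found
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   cleanup_cuopt nim-service
```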

Troubleshooting#

Pod Not Starting#

Check pod events:

kubectl describe pod -l app=cuopt-service -n nim-service

Check GPU operator status:

kubectl get pods -n gpu-operator
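
A pod that requests `nvidia.com/gpu: 1` stays Pending when no node advertises an allocatable GPU. The sketch below sums the allocatable `nvidia.com/gpu` resource across all nodes; `count_allocatable_gpus` is a hypothetical helper name, and a result of 0 suggests the GPU Operator or device plugin is not yet exposing GPUs.

```shell
# Sum the allocatable nvidia.com/gpu resource across all nodes.
# Nodes without GPUs contribute an empty line, which awk counts as 0.
count_allocatable_gpus() {
    kubectl get nodes \
        -o jsonpath='{range .items[*]}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}' \
        | awk '{ total += $1 } END { print total + 0 }'
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   echo "Allocatable GPUs: $(count_allocatable_gpus)"
```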

Image Pull Errors#

Verify your NGC credentials:

kubectl get secret ngc-secret -n nim-service -o yaml
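
The YAML output shows the credentials only in base64-encoded form. As a rough sketch, the helper below decodes the `.dockerconfigjson` payload and checks that it contains an entry for the expected registry, without printing the credentials. `pull_secret_has_registry` is a hypothetical name.

```shell
# Return success if the ngc-secret image pull secret contains an entry for
# the given registry (default: nvcr.io). The decoded credentials are not printed.
pull_secret_has_registry() {
    local ns="${1:-nim-service}"
    local registry="${2:-nvcr.io}"
    kubectl get secret ngc-secret -n "$ns" \
        -o jsonpath='{.data.\.dockerconfigjson}' \
        | base64 --decode \
        | grep -F -q "\"${registry}\""
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   pull_secret_has_registry nim-service nvcr.io && echo "registry entry found"
```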

Health Check Failures#

Check the readiness and liveness probe logs:

kubectl logs -f deployment/cuopt-service -n nim-service | grep -i health
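
Probe failures are also recorded by the kubelet as Kubernetes events with the reason `Unhealthy`, which may be easier to find than matching lines in the application log. A minimal sketch, with `probe_failure_events` as a hypothetical helper name:

```shell
# List probe-failure events in the namespace, oldest first.
probe_failure_events() {
    local ns="${1:-nim-service}"
    kubectl get events -n "$ns" \
        --field-selector reason=Unhealthy \
        --sort-by=.lastTimestamp
}

# Usage (commented out so that sourcing this snippet has no side effects):
#   probe_failure_events nim-service
```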

Next Steps#

See the Configuration page for advanced configuration options.