Upgrade#

How to upgrade Nsight Operator.

Upgrading Nsight Operator#

Nsight Operator is upgraded with helm upgrade. Because CRD versions and Helm chart values can evolve between releases, follow the steps below.

Compatibility#

  • Always install the Helm chart that ships with the product release you want (currently 26.2.1); mixing a chart from one release with images from another is not supported.

  • The nsight-systems-cli image used for injection is pinned by the chart; do not pin a different version unless advised by NVIDIA support.

  • CRDs are installed by the chart as standard Helm templates. helm upgrade does not automatically apply non-additive CRD schema changes, so manual handling may be required on major version bumps (see the CRD schema pre-flight check below).
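
Before upgrading, it can help to confirm which chart version is currently installed. The following is a minimal sketch, not an official tool: it assumes the release is named nsight-operator in the nsight-operator namespace (as in the commands in this guide) and that jq is available on the client machine. The function names are hypothetical.

```shell
# Sketch: report the chart version of the installed release.
# Assumes release "nsight-operator" in namespace "nsight-operator" and jq on PATH.

# Extract the version from a CHART value such as "nsight-operator-26.2.1".
chart_version() {
    printf '%s\n' "${1#nsight-operator-}"
}

# Print the chart version of the installed release, or fail if it is missing.
installed_chart_version() {
    chart=$(helm list -n nsight-operator --filter '^nsight-operator$' -o json |
        jq -r '.[0].chart // empty')
    [ -n "$chart" ] || { echo "release not found" >&2; return 1; }
    chart_version "$chart"
}
```

Calling installed_chart_version and comparing the result with the target release (26.2.1 in this guide) shows whether an upgrade is pending.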

Upgrade Procedure#

Pre-flight Check: CRD Schema Changes#

When a new release changes a CRD schema in a non-additive way, helm upgrade will not apply the change automatically. Before running the helm upgrade step below, apply the updated CRDs from the new chart:

helm pull https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz --untar --untardir /tmp
kubectl apply -f /tmp/nsight-operator/crds/

This preserves existing CRs and their data.
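
As an extra precaution, the existing CRs can be snapshotted and the CRD changes previewed before they are applied. This is a hedged sketch rather than a required step: the function names and the backup file name are hypothetical, the resource kinds are the ones listed in the verification step of this guide, and the CRD path is the one produced by the helm pull command above.

```shell
# Sketch: snapshot existing CRs, then preview CRD changes before applying them.
CRD_DIR=${CRD_DIR:-/tmp/nsight-operator/crds/}

# Write all Nsight CRs in every namespace to a YAML file.
backup_crs() {
    # $1: output file for the snapshot
    kubectl get nsightcoordinator,nsightgateway,nsightanalysis -A -o yaml > "$1"
}

# Show what "kubectl apply" would change without changing anything.
preview_crd_changes() {
    rc=0
    kubectl diff -f "$CRD_DIR" || rc=$?
    # kubectl diff exits 0 for no differences, 1 for differences, >1 on error.
    case $rc in
        0) echo "no CRD changes" ;;
        1) echo "CRD changes pending" ;;
        *) echo "kubectl diff failed" >&2; return 1 ;;
    esac
}
```

For example, run backup_crs nsight-crs-backup.yaml followed by preview_crd_changes, and only then apply the CRDs.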

  1. End the current profiling session:

    python3 nsight_operator.py session-end
    
  2. Apply the upgrade, reusing the values from the installed release. --wait blocks until the controller is ready:

    helm upgrade --wait --reuse-values -n nsight-operator nsight-operator \
        https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz

    Note

    --reuse-values keeps your existing overrides but does not apply new default values introduced by the new chart, so any newly required value is left unset. For major upgrades, review the release notes and either pass any new required values explicitly (--set key=value) or manually merge your current values with the new chart defaults:

    # Save the overrides from the installed release
    # (-o yaml omits the "USER-SUPPLIED VALUES:" header line):
    helm get values nsight-operator -n nsight-operator -o yaml > old-values.yaml
    # Fetch the new chart defaults:
    helm show values https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz > new-defaults.yaml
    # Merge old-values.yaml into new-defaults.yaml by hand (or with yq)
    # to produce merged-values.yaml, then:
    
    helm upgrade --wait -n nsight-operator -f merged-values.yaml nsight-operator \
        https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz

  3. Verify CRs reconcile. The operator controller reconciles any CR schema changes automatically:

    kubectl get nsightcoordinator,nsightgateway,nsightanalysis -A
    kubectl get pods -n nsight-operator
    
  4. Restart profiled workloads so they pick up any changes to injected binaries, init containers, or environment:

    kubectl rollout restart deployment,statefulset -n <target-ns>
    
  5. Re-run autoconfigure on any client machine that has cached the old CLI configuration:

    python3 nsight_operator.py autoconfigure -n <namespace>
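
The values merge described in the note under step 2 can be sketched with yq. This assumes mikefarah yq v4 is installed; the function name merge_values is hypothetical, and the file names are the ones used in that note. Maps are merged recursively and values from the later file win, so the saved overrides take precedence over the new chart defaults.

```shell
# Sketch: overlay the overrides saved from the installed release on top of the
# new chart defaults. Requires mikefarah yq v4 (an assumption, not a stated
# dependency of Nsight Operator).
merge_values() {
    # $1: new chart defaults, $2: old overrides, $3: output file
    yq eval-all '. as $item ireduce ({}; . * $item)' "$1" "$2" > "$3"
}

# Example:
# merge_values new-defaults.yaml old-values.yaml merged-values.yaml
```

Review merged-values.yaml against the release notes before passing it to helm upgrade with -f.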
    

Multi-Tenant Upgrades#

In multi-tenant mode, the operator controller and webhook are upgraded cluster-wide, while tenant-scoped resources (Coordinator, Gateway, Storage, etc.) continue to reconcile automatically. Coordinate with tenants to:

  • Announce brief profiler pauses.

  • Restart their workloads after the upgrade so they pick up any injector changes.
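
When one administrator restarts workloads on behalf of several tenants, the restart command from the upgrade procedure can be looped over the tenant namespaces. A minimal sketch, assuming the caller has permission in each namespace; the function name and the namespace names in the example are hypothetical.

```shell
# Sketch: restart Deployments and StatefulSets in each tenant namespace,
# stopping at the first namespace where the restart fails.
restart_tenant_workloads() {
    for ns in "$@"; do
        echo "restarting workloads in $ns"
        kubectl rollout restart deployment,statefulset -n "$ns" || return 1
    done
}

# Example:
# restart_tenant_workloads team-a team-b
```

Stopping at the first failure keeps the output easy to read; tenants that prefer to control timing can run the same command in their own namespace instead.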