The Switch OS Upgrade workflow upgrades a single Cumulus Linux switch to a new firmware version. It computes the version delta against the device’s intended firmware in Nautobot, pauses for a human to approve the upgrade, backs up the current configuration, updates device-context state, drives the image install, reboot, and ZTP cycle, and validates that the device came back on the intended version. Several short-circuit paths exit safely when no work is required.
For NVLink switches, use NVLink Switch Firmware Upgrade instead.
Before running the workflow, complete the pre-work that drives the upgrade. The workflow makes no changes until the firmware target changes.
location-firmware-targets schema attached at the location level.
Also confirm:
After submission, a status page appears showing the four stages. The workflow blocks at the approval stage until a human acts.
The workflow runs four stages in order. Only approve_upgrade requires approval — and even that gate can short-circuit when no upgrade is needed.
approve_upgrade — Compute the delta and request approval.
Reads the device’s current firmware from the running configuration and compares it against the intended firmware in Nautobot. Three outcomes:
requires_approval to false at runtime and short-circuits; downstream stages are marked UNREACHABLE.
PENDING_APPROVAL and waits indefinitely on workflow.wait_condition until a reviewer approves or rejects. Rejection marks perform_backup, update_device_configuration, and perform_upgrade as UNREACHABLE; no change is made.
perform_backup — Capture a pre-upgrade backup.
Starts the Configuration Backup workflow as a child workflow with trigger=WORKFLOW. After it completes, the workflow runs check_recorded_config_drift: if the just-recorded backup differs from what was last known to Config Manager, the upgrade halts with the downstream stages marked UNREACHABLE. Drift means the device has been configured out of band; resolve the drift before re-running.
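One way to picture the drift check is as a digest comparison between the just-recorded backup and the last-known baseline. The sketch below is illustrative only: `config_digest`, `has_config_drift`, and the whitespace normalization are assumptions, not the actual check_recorded_config_drift implementation in Config Manager.

```python
import hashlib


def config_digest(config_text: str) -> str:
    """Return a SHA-256 digest of a config, ignoring trailing whitespace and blank lines."""
    lines = [line.rstrip() for line in config_text.splitlines()]
    normalized = "\n".join(line for line in lines if line)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_config_drift(recorded_backup: str, last_known: str) -> bool:
    """Hypothetical stand-in for the drift check: True when the two configs differ."""
    return config_digest(recorded_backup) != config_digest(last_known)
```

Normalizing before hashing keeps cosmetic whitespace differences from halting an upgrade; whether the real check normalizes at all is not stated in this document.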
update_device_configuration — Refresh device context for the upgrade.
Calls validate_rendered_image_change (6-minute start-to-close timeout, 2-minute heartbeat) to ensure the rendered configuration is consistent with the upgrade target, and updates Config Manager’s internal state to track the new firmware.
perform_upgrade — Install the image and wait for the device to come back.
Pushes the image to the device via the ZTP service and reboots. Two long-poll activities then wait for the device to converge:
poll_image — 35-minute timeout (30 + 5 buffer), 3-minute heartbeat. If the post-reboot image does not match the intended firmware, raises ApplicationError.
poll_ztp_status — 15-minute timeout (10 + 5 buffer). If ZTP does not complete in that window, raises ApplicationError.
Retry policy: 3 attempts, with FirmwareUpgradeException non-retryable.
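The two polling budgets above can be sketched as plain data plus a generic polling loop. This is a minimal illustration, not the workflow's code: `PollBudget`, `poll_until`, the 30-second interval, and `PollTimeoutError` (standing in for the ApplicationError the activities raise) are all hypothetical names and values. No heartbeat is stated for poll_ztp_status, so it is left unset here. The retry policy is not modeled.

```python
import time
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Optional


class PollTimeoutError(Exception):
    """Hypothetical stand-in for the ApplicationError raised on timeout."""


@dataclass(frozen=True)
class PollBudget:
    expected: timedelta                    # how long convergence should take
    buffer: timedelta                      # extra slack added on top
    heartbeat: Optional[timedelta] = None  # None when the document states no heartbeat

    @property
    def timeout(self) -> timedelta:
        return self.expected + self.buffer


# Values taken from the stage description: 30 + 5 and 10 + 5 minutes.
POLL_IMAGE = PollBudget(timedelta(minutes=30), timedelta(minutes=5), timedelta(minutes=3))
POLL_ZTP_STATUS = PollBudget(timedelta(minutes=10), timedelta(minutes=5))


def poll_until(
    check: Callable[[], bool],
    budget: PollBudget,
    interval: timedelta = timedelta(seconds=30),  # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call check() until it returns True or the budget's timeout elapses."""
    deadline = clock() + budget.timeout.total_seconds()
    while True:
        if check():
            return
        if clock() >= deadline:
            raise PollTimeoutError(f"condition not met within {budget.timeout}")
        sleep(interval.total_seconds())
```

Injecting `clock` and `sleep` lets the loop be exercised in tests without waiting for real time to pass.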
The workflow returns True on full success, False on early-exit branches (no-upgrade-needed, rejected approval, drift).
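The short-circuit behavior across the four stages can be sketched as a simple ordered loop. This is a simplified model under stated assumptions: `run_stages`, the handler signature, and the PENDING and COMPLETED status names are hypothetical, and only UNREACHABLE and the stage names come from the description above. The status a halting stage itself ends in is also an assumption.

```python
from enum import Enum
from typing import Callable, Dict, Tuple


class StageStatus(Enum):
    PENDING = "PENDING"          # assumed name
    COMPLETED = "COMPLETED"      # assumed name
    UNREACHABLE = "UNREACHABLE"  # named in the workflow description


STAGES = [
    "approve_upgrade",
    "perform_backup",
    "update_device_configuration",
    "perform_upgrade",
]


def run_stages(
    handlers: Dict[str, Callable[[], bool]],
) -> Tuple[bool, Dict[str, StageStatus]]:
    """Run stages in order; a handler returning False short-circuits the rest.

    Returns (True, statuses) on full success and (False, statuses) on an
    early exit such as no-upgrade-needed, rejected approval, or drift.
    """
    statuses = {name: StageStatus.PENDING for name in STAGES}
    for index, name in enumerate(STAGES):
        proceed = handlers[name]()
        statuses[name] = StageStatus.COMPLETED
        if not proceed:
            # Everything after the stage that stopped is never attempted.
            for later in STAGES[index + 1:]:
                statuses[later] = StageStatus.UNREACHABLE
            return False, statuses
    return True, statuses
```

For example, a handler map in which perform_backup returns False (drift detected) leaves update_device_configuration and perform_upgrade as UNREACHABLE and yields False overall.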
After the workflow reports success, confirm:
approve_upgrade green and downstream UNREACHABLE for the short-circuit paths).
nv show platform on the device reports the intended firmware version.
/etc/os-release, or your environment’s equivalent).
Cumulus Linux does not have a native one-shot OS rollback. The recovery options are:
Stage is stuck on “Waiting for approval”.
approve_upgrade blocks indefinitely. Open the workflow page and approve or reject.
perform_backup reports configuration drift and the workflow halts.
The running config diverges from what Config Manager last recorded. Investigate the cause: typically someone applied a change out of band, or an automated agent on the device is modifying state. Once the drift is reconciled, either by running Configuration Backup and accepting the new baseline or by reverting the out-of-band change, re-run the upgrade.
poll_image times out or reports a version mismatch.
The device rebooted but did not come up on the intended firmware. This is most often because the wrong image was uploaded to the ZTP service for the target version, or because the device has hardware-specific image quirks. Confirm the right image is on the ZTP service and that it matches the platform; if needed, use Device Reprovision to recover.
poll_ztp_status times out.
The image install succeeded but ZTP did not complete within the poll_ztp_status window after reboot (10 minutes plus a 5-minute buffer). Use the Monitoring DHCP and ZTP section of New Site Bringup to investigate.
FirmwareUpgradeException is raised.
A non-retryable failure in the upgrade activity. Read the error message — the activity surfaces it directly. Common causes: image download failure, image checksum mismatch, install-time error reported by the device. Resolve and re-run.