NVIDIA UFM High-Availability User Guide v6.1.1

UFM High-Level Architecture

The figure below illustrates the UFM high-level architecture.

[Figure: UFM high-level architecture]

UFM supports an active-standby HA approach. It is not designed to run as multiple concurrent instances (active-active mode) because of the following constraints:

  1. Single SM

  2. Single SharpAM

  3. Single UFM Telemetry

  4. UFM is stateful and manages its internal state (cluster topology model) in RAM
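The constraints above can be illustrated with a minimal sketch of an active-standby pair. This is not UFM code: `Node`, `ActiveStandbyPair`, and `check_and_failover` are hypothetical names, and the health flag stands in for whatever failure detection a real HA framework provides. The sketch shows only two ideas from the list: at most one node is active at any time, and state held in RAM does not survive a failover.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a host that can run the UFM stack."""
    name: str
    healthy: bool = True
    # State held only in RAM (for example, a topology model); it is lost
    # when the node fails.
    ram_state: Dict[str, str] = field(default_factory=dict)


class ActiveStandbyPair:
    """Keeps at most one node active; the other waits idle as standby."""

    def __init__(self, primary: Node, secondary: Node) -> None:
        self.nodes: List[Node] = [primary, secondary]
        self.active: Optional[Node] = primary

    def standby(self) -> Optional[Node]:
        """Return a healthy node that is not currently active, if any."""
        for node in self.nodes:
            if node is not self.active and node.healthy:
                return node
        return None

    def check_and_failover(self) -> Optional[Node]:
        """Promote the standby if the active node is unhealthy."""
        if self.active is not None and self.active.healthy:
            return self.active
        if self.active is not None:
            # Model the loss of in-RAM state on the failed node.
            self.active.ram_state.clear()
        self.active = self.standby()
        return self.active


if __name__ == "__main__":
    a, b = Node("ufm-a"), Node("ufm-b")
    pair = ActiveStandbyPair(a, b)
    a.ram_state["topology"] = "discovered"
    a.healthy = False  # simulate failure of the active node
    new_active = pair.check_and_failover()
    print(new_active.name, new_active.ram_state)  # prints: ufm-b {}
```

In this sketch the promoted node starts with an empty in-RAM state, which is why anything that must survive a failover has to live outside the process memory.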

Persistent storage is required for the following:

  1. Configuration files (UFM, SM, SharpAM, UFM Telemetry, Apache)

  2. DB (SQLite) – historical telemetry, configuration, and application state

  3. Operation history – logs, events, alarms
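As a rough illustration of why the items above belong on persistent storage, the following sketch writes a value to an SQLite file and reads it back through a fresh connection, as a newly promoted standby node might do. The table name `app_state`, the key `example_setting`, and the file name `ufm_example.db` are assumptions made for this example and do not reflect the actual UFM schema or file layout.

```python
import os
import sqlite3
import tempfile
from contextlib import closing
from typing import Optional


def open_db(db_path: str) -> sqlite3.Connection:
    """Open the database file and make sure the example table exists."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits the schema change on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS app_state ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
    return conn


def save_state(db_path: str, key: str, value: str) -> None:
    """Store or overwrite one key/value pair on persistent storage."""
    with closing(open_db(db_path)) as conn:
        with conn:  # wraps the write in a committed transaction
            conn.execute(
                "INSERT OR REPLACE INTO app_state (key, value) VALUES (?, ?)",
                (key, value),
            )


def load_state(db_path: str, key: str) -> Optional[str]:
    """Read a value back; returns None if the key was never stored."""
    with closing(open_db(db_path)) as conn:
        row = conn.execute(
            "SELECT value FROM app_state WHERE key = ?", (key,)
        ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shared_dir:
        # shared_dir stands in for storage that both HA nodes can reach.
        db_path = os.path.join(shared_dir, "ufm_example.db")
        save_state(db_path, "example_setting", "enabled")  # written by the active node
        print(load_state(db_path, "example_setting"))  # read after failover
```

Each call opens and closes its own connection, so the read does not depend on anything the writer kept in memory.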

FR#1

Develop “ufm operator” examples, refer to:

FR#2

1. KVS DB (etcd), Config Maps

2. Third-party cache/DB with built-in load-balancing HA (Redis, MongoDB, etc.)

© Copyright 2025, NVIDIA. Last updated on Nov 10, 2025