UFM Infra
The UFM Infra feature introduces a structured architecture where services are divided into two categories, each deployed differently based on functionality:
UFM Infra: A set of persistent infrastructure services that run on all nodes. These services support system-level operations and ensure distributed availability.
UFM Enterprise: Services that run exclusively on the master node, responsible for management, orchestration, and user-facing functionality.
Faster API Availability after Failover: By limiting service transitions during node failures, recovery times are significantly reduced.
Improved Modularity: Separating core infrastructure from enterprise logic simplifies maintenance and troubleshooting.
Enhanced Scalability: Services can be scaled and managed independently across nodes.
Users can enable or disable the UFM Infra feature without reinstalling UFM. For more information, refer to Enabling or Disabling UFM Infra.
Installation instructions are available at UFM Infra Installation.
The Redis image must be loaded, or the is_external_redis flag must be enabled in gv.cfg.
ufm-infra.service
Manages the following infrastructure components:
Component | Description |
Redis Server | Inter-node communication and topology storage |
Apache Web Server | HTTP/HTTPS web server for UFM API and UI |
Authentication Server | User authentication and session management |
UFM Health (Infra) | Infrastructure health monitoring |
Infra Plugins | Plugins running in infra context (e.g., Fast API) |
UTM Telemetry | Telemetry services (when UTM mode enabled) |
ufm-enterprise.service
Manages the following enterprise components:
Component | Description |
OpenSM | Subnet Manager for InfiniBand fabric |
UFM Main Process | Core UFM fabric management engine |
Enterprise Plugins | Plugins running in enterprise context |
Topology Publishing | Publishes fabric topology to Redis (Infra mode) |
Shared Resources
In Infra mode, the following resources are shared between services:
Docker Volume (ufm-shared-data): Shared Apache configuration between containers
Shared Configuration Files: /opt/ufm/files/ mounted to both containers
Redis: Used for topology publishing and inter-service communication
Key | Type | Default Value | Description |
| boolean | false | Enable or disable UFM Infra mode |
| string | localhost | Redis server hostname or IP address |
| integer | 6379 | Redis server port number |
| integer | 5 | Redis connection timeout in seconds |
| boolean | false | Use external Redis server instead of internal |
| boolean | false | Enable TLS encryption for Redis connections |
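As a rough illustration of how the Redis defaults in the table fit together, the following Python sketch collects them in one place and derives a connection URL. The class and field names are hypothetical and are not the actual gv.cfg key names; only the default values are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedisSettings:
    """Hypothetical container; field names are illustrative, not gv.cfg keys."""

    host: str = "localhost"      # default hostname from the table
    port: int = 6379             # default port from the table
    timeout_seconds: int = 5     # default connection timeout from the table
    external: bool = False       # use an external Redis server instead of internal
    tls: bool = False            # encrypt Redis connections with TLS

    def url(self) -> str:
        # "rediss://" is the conventional URL scheme for Redis over TLS.
        scheme = "rediss" if self.tls else "redis"
        return f"{scheme}://{self.host}:{self.port}"


defaults = RedisSettings()
# Assumed example of an external, TLS-protected server (hostname is made up).
secure = RedisSettings(host="redis.example.com", external=True, tls=True)
```

With the defaults, `defaults.url()` yields a plain `redis://` URL on the local host. Enabling TLS only switches the scheme in this sketch, and certificate handling is left out.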
Fast API Configuration
The following parameters can be modified within the Fast API configuration file:
Section | Default Value | Description |
| | Default time-to-live (TTL) for SM-related transactions before expiration (in seconds) |
| | Default time-to-live (TTL) for SHARP-related transactions before expiration (in seconds) |
UFM Infra mode can be enabled or disabled after installation using the ufm_infra_feature_flag.py script.
Script Location
/opt/ufm/files/scripts/ufm_infra_feature_flag.py
Command Line Options
Usage:
ufm_infra_feature_flag.py [-h] (-e | -d) [--rootless]
    [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
    [--timeout-seconds TIMEOUT_SECONDS] [--ufm-user UFM_USER]
    [--force] [--skip-ha-validation]
    [--infra-plugins-dir <path>]

Control UFM Infra feature flags
Flag | Description |
-e, --enable | Enable the Infra feature |
-d, --disable | Disable the Infra feature |
--rootless | Use rootless Podman mode (default: root Docker mode) |
--log-level | Set logging level (default: INFO) |
--timeout-seconds | Timeout, in seconds, for waiting for containers to stop (default: 120) |
--ufm-user | User for rootless Podman commands (default: ufmadm) |
--force | Automatically stop/start UFM services |
--skip-ha-validation | Skip HA configuration validation |
--infra-plugins-dir | Directory containing plugin images to load and install |
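To make the option semantics concrete, here is a minimal Python sketch of an argparse parser that accepts the same command line as the usage text above. It is an illustration reconstructed from the usage and the flag table, not the script's actual source. The function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the documented command line."""
    parser = argparse.ArgumentParser(
        prog="ufm_infra_feature_flag.py",
        description="Control UFM Infra feature flags",
    )
    # Exactly one of -e / -d must be given, matching "(-e | -d)" in the usage.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-e", "--enable", action="store_true",
                      help="Enable the Infra feature")
    mode.add_argument("-d", "--disable", action="store_true",
                      help="Disable the Infra feature")
    parser.add_argument("--rootless", action="store_true",
                        help="Use rootless Podman mode (default: root Docker mode)")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
                        help="Set logging level")
    parser.add_argument("--timeout-seconds", type=int, default=120,
                        help="Timeout for waiting for containers to stop")
    parser.add_argument("--ufm-user", default="ufmadm",
                        help="User for rootless Podman commands")
    parser.add_argument("--force", action="store_true",
                        help="Automatically stop/start UFM services")
    parser.add_argument("--skip-ha-validation", action="store_true",
                        help="Skip HA configuration validation")
    parser.add_argument("--infra-plugins-dir", metavar="<path>",
                        help="Directory containing plugin images to load and install")
    return parser


# Example: the rootless disable command shown later in this section.
args = build_parser().parse_args(["--disable", "--rootless", "--force"])
```

Because `-e` and `-d` are in a required mutually exclusive group, omitting both or passing both is rejected before any action is taken.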
Enabling Infra Mode
Standalone Mode (Docker)
Without Automatic Service Management
Stop UFM services manually:
systemctl stop ufm-enterprise
systemctl stop ufm-infra
Enable Infra mode:
cd /opt/ufm/files/scripts/
./ufm_infra_feature_flag.py --enable
Start UFM services manually:
systemctl start ufm-infra
systemctl start ufm-enterprise
Note: The script automatically detects whether the system is running in HA mode and manages cluster resources accordingly.
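The manual sequence above (stop services, run the script without --force, start services) can be sketched as a small Python wrapper. This is an illustrative helper, not part of UFM: the function names are hypothetical, and the sketch only prints the commands unless `dry_run=False` is passed on a real UFM host.

```python
import subprocess

SCRIPT_DIR = "/opt/ufm/files/scripts"  # script location given in this section


def manual_enable_steps():
    """Return (argv, cwd) pairs in the documented order. Hypothetical helper."""
    return [
        (["systemctl", "stop", "ufm-enterprise"], None),
        (["systemctl", "stop", "ufm-infra"], None),
        # Run from the script directory, mirroring the "cd" in the manual steps.
        (["./ufm_infra_feature_flag.py", "--enable"], SCRIPT_DIR),
        (["systemctl", "start", "ufm-infra"], None),
        (["systemctl", "start", "ufm-enterprise"], None),
    ]


def run_steps(steps, dry_run=True):
    for argv, cwd in steps:
        if dry_run:
            # Only show what would be executed.
            prefix = f"(in {cwd}) " if cwd else ""
            print(prefix + " ".join(argv))
        else:
            # check=True aborts the sequence on the first failing command.
            subprocess.run(argv, cwd=cwd, check=True)


steps = manual_enable_steps()
run_steps(steps)  # dry run: prints the five commands in order
```

Note the ordering: ufm-enterprise is stopped before ufm-infra, and ufm-infra is started before ufm-enterprise, as in the manual steps.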
Disabling Infra Mode
Standalone Mode (Docker)
cd /opt/ufm/files/scripts/
./ufm_infra_feature_flag.py --disable --force
Standalone Mode (Rootless Podman)
cd /opt/ufm/files/scripts/
./ufm_infra_feature_flag.py --disable --rootless --force
High Availability (HA) Mode
cd /opt/ufm/files/scripts/
./ufm_infra_feature_flag.py --disable --force
Script Behavior
When Enabling Infra Mode
The script performs the following actions:
Stops UFM services (standalone) or the HA cluster
Waits for all UFM containers to stop
Updates gv.cfg to set: [UFMInfra] enabled = true
Updates the Redis trigger file to enabled
Validates HA resources (if running in HA mode)
Loads and installs Infra plugins if --infra-plugins-dir is specified
Restarts UFM services or the HA cluster
When Disabling Infra Mode
The script performs the following actions:
Stops UFM services (standalone) or the HA cluster
Waits for all UFM containers to stop
Updates gv.cfg to set: [UFMInfra] enabled = false
Updates the Redis trigger file to disabled
Restarts UFM services or the HA cluster
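The gv.cfg update in both sequences amounts to setting one key in one INI section. The following Python sketch shows that single step with the standard configparser module, operating on a text sample rather than the real file. The function name is hypothetical, and this is not the script's actual implementation.

```python
import configparser
import io


def set_infra_enabled(cfg_text: str, enabled: bool) -> str:
    """Return cfg_text with [UFMInfra] enabled set to true/false. Hypothetical helper."""
    # interpolation=None so literal "%" characters in other values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(cfg_text)
    if not parser.has_section("UFMInfra"):
        parser.add_section("UFMInfra")
    parser.set("UFMInfra", "enabled", "true" if enabled else "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Assumed minimal sample; a real gv.cfg contains many more sections.
sample = "[UFMInfra]\nenabled = false\n"
updated = set_infra_enabled(sample, True)
```

One caveat of this approach is that configparser discards comments when it rewrites the text, so a tool that must preserve the file's comments would need a different editing strategy.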
As part of the updated architecture, a Fast API plugin can be deployed as an Infra Plugin. A Redis server is required for inter-service communication. Redis can be configured in two ways:
As an internal service (installed with UFM)
As an external Redis instance, depending on deployment needs.
The following sequence describes how communication is handled between Fast API, Redis, and SM/SHARP components:
Request Submission via Fast API
Users send REST API requests (e.g., for PKey creation or SHARP reservation actions) to the Fast API. These requests are placed into Redis queues, and a Transaction ID (TID) is returned to the user for tracking purposes.
Processing by Communicators
The SM Communicator or SHARP Communicator monitors Redis queues for new requests.
Upon receiving a request, the communicator forwards it to the relevant component (SM or SHARP) for execution.
After processing, the communicator captures the response and status.
Status Updates
The communicators update the status of each request back into Redis. Users can query the status of their transaction using the TID provided during request submission.
Configuration Storage and Retrieval
Communicators store the configuration in Redis.
This allows the Fast API to retrieve the configuration data and expose it via REST APIs, so users can inspect cluster-level settings.
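The request flow described above (enqueue a request, return a TID, let a communicator process it and write the status back) can be sketched in Python as follows. To keep the example runnable without a server, a small in-memory class stands in for Redis. All class, function, queue, and status names here are hypothetical and do not reflect the actual Fast API or communicator code.

```python
import uuid
from collections import deque


class FakeRedis:
    """In-memory stand-in for Redis so the sketch runs without a server."""

    def __init__(self):
        self.queues = {}    # queue name -> deque of pending requests
        self.statuses = {}  # TID -> status fields

    def push(self, queue, item):
        self.queues.setdefault(queue, deque()).append(item)

    def pop(self, queue):
        pending = self.queues.get(queue)
        return pending.popleft() if pending else None

    def set_status(self, tid, **fields):
        self.statuses.setdefault(tid, {}).update(fields)

    def get_status(self, tid):
        return dict(self.statuses.get(tid, {}))


def submit_request(store, queue, payload):
    """API side: enqueue the request and return a Transaction ID (TID)."""
    tid = uuid.uuid4().hex
    store.set_status(tid, state="queued")
    store.push(queue, {"tid": tid, "payload": payload})
    return tid


def communicator_step(store, queue, execute):
    """Communicator side: process one queued request, if any, and record its status."""
    request = store.pop(queue)
    if request is None:
        return False
    try:
        result = execute(request["payload"])
        store.set_status(request["tid"], state="completed", result=result)
    except Exception as exc:
        store.set_status(request["tid"], state="failed", error=str(exc))
    return True


store = FakeRedis()
tid = submit_request(store, "sm_requests", {"action": "create_pkey"})
before = store.get_status(tid)   # status right after submission
communicator_step(store, "sm_requests", lambda p: f"applied {p['action']}")
after = store.get_status(tid)    # status after the communicator ran
```

The caller only ever holds the TID. The result reaches the caller through the status record, which is why the API can return immediately while the SM or SHARP operation is still pending.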