This page is the Day-1 configuration guide for InfiniBand partitions in NICo. It describes how an operator points NICo at UFM, how partitions are allocated and assigned to tenant instances, and how to verify that a host has ended up in the partitions it should. Tenant isolation is a property of how partitions are assigned — for the cross-fabric isolation picture, see Network Isolation.
The InfiniBand fabric itself — UFM installation, gv.cfg / opensm.conf
tuning, M_Key and SA_Key configuration, and static topology files — is
covered separately in the InfiniBand Setup
runbook. That runbook is a prerequisite for
this page: NICo’s partition guarantees rest on a properly hardened UFM and
subnet manager.
InfiniBand partitioning in NICo is built on the InfiniBand-native P_Key mechanism enforced by the subnet manager. The operator-facing chain, read top to bottom, is:
- An instance's InfiniBand interface configuration references an IbPartition by ID. This is the NICo object the tenant manipulates.
- Each IbPartition corresponds to a single P_Key on UFM. NICo either allocates the P_Key value from a configured range or honours an explicit P_Key the operator supplied.

Two instances in different P_Keys cannot send any IB traffic to each other; two instances in the same P_Key can. There is no operator-visible “peering” concept for InfiniBand the way there is for Ethernet VPCs: sharing requires placing both instances’ interfaces in the same partition.
NICo treats UFM as the authoritative source for observed fabric state. It does not cache UFM partition membership separately from what the monitor last read. This means that direct out-of-band changes to UFM (an operator editing partitions in the UFM UI, for example) will be detected on the next monitor iteration and reconciled back to NICo’s intended state.
InfiniBand splits cleanly between operator site setup and tenant partition management. See Network Isolation → Who configures what, and how for the role and interface model.
The UFM-facing setup (the first two rows) is the operator’s responsibility and
is described in Configuring NICo to Talk to UFM.
Everything a tenant does — creating partitions and attaching interfaces — goes
through the REST API or nicocli; the gRPC nico-admin-cli rows are operator
triage and break-glass paths that the REST API does not expose.
Two TOML blocks are involved. Both live in the API server’s config file.
[ib_fabrics.<name>]: Endpoints and P_Key Pool
Each named entry defines one InfiniBand fabric that NICo manages.
Fields:
P_Key ranges may be extended in future restarts but never shrunk: removing or narrowing a range under live tenants would orphan allocated partitions. Plan the pool with headroom.
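The extend-but-never-shrink rule can be checked before a restart. The following is a minimal sketch, assuming each pool range can be represented as an inclusive (start, end) pair of integers; that representation and the function names are hypothetical and are not NICo's actual configuration schema or API.

```python
from typing import List, Tuple

# Hypothetical representation: an inclusive (start, end) pair of P_Key values.
Range = Tuple[int, int]


def covered(value_range: Range, ranges: List[Range]) -> bool:
    """Return True if every value in value_range lies inside some range in `ranges`."""
    start, end = value_range
    cursor = start  # lowest value not yet known to be covered
    for lo, hi in sorted(ranges):
        if lo > cursor:
            break  # gap: `cursor` is not covered by any range
        if hi >= cursor:
            cursor = hi + 1
            if cursor > end:
                return True
    return cursor > end


def is_safe_pool_change(old: List[Range], new: List[Range]) -> bool:
    """A change is safe if the new pool still covers every previously configured range."""
    return all(covered(r, new) for r in old)


old_pool = [(0x100, 0x1FF)]
print(is_safe_pool_change(old_pool, [(0x100, 0x2FF)]))  # extended
print(is_safe_pool_change(old_pool, [(0x100, 0x17F)]))  # narrowed
```

The check is deliberately conservative: it rejects any narrowing, even of values that happen to be unallocated, which matches the rule as stated above.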
[ib_config]: Fabric Toggles
UFM API credentials are not stored in TOML. They are read from the configured secrets backend (Vault under standard deployments) by the UFM client during initialisation. Rotate them at the secrets backend; NICo picks up the new value on its next client re-initialisation.
When a tenant creates an InfiniBand partition (REST
POST …/nico/infiniband-partition, or nicocli infiniband-partition create),
the request may include or omit a desired P_Key:
- pkey omitted. NICo allocates a free P_Key from the configured
  pool ranges and returns it on the response. This is the normal tenant
  flow.
- pkey specified (hex, for example "0x76b"). NICo accepts the
  request only if the requested value falls inside a pool range that is
  not marked as auto-assigned, and is otherwise free. Otherwise the
  request is rejected.

The model’s IbPartition object retains the allocated P_Key for the
lifetime of the partition. There is no “renumber” operation; to change a
P_Key, delete the partition and create a new one (which the tenant flow
handles via instance reconfiguration).
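The two request shapes can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not NICo's allocator: the PkeyRange and PkeyPool types and the auto_assigned flag are hypothetical names, and the sketch assumes that automatic allocation draws only from ranges marked as auto-assigned, which the text above does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class PkeyRange:
    start: int
    end: int             # inclusive
    auto_assigned: bool  # hypothetical flag: range reserved for automatic allocation


@dataclass
class PkeyPool:
    ranges: List[PkeyRange]
    in_use: Set[int] = field(default_factory=set)

    def allocate(self, requested: Optional[str] = None) -> int:
        """Allocate a P_Key; `requested` is an optional hex string such as "0x76b"."""
        if requested is None:
            # pkey omitted: take the first free value from an auto-assigned range
            # (assumption, see the lead-in).
            for r in self.ranges:
                if not r.auto_assigned:
                    continue
                for value in range(r.start, r.end + 1):
                    if value not in self.in_use:
                        self.in_use.add(value)
                        return value
            raise ValueError("P_Key pool exhausted")

        # pkey specified: must be free and inside a range not marked auto-assigned.
        value = int(requested, 16)
        in_explicit_range = any(
            r.start <= value <= r.end and not r.auto_assigned for r in self.ranges
        )
        if not in_explicit_range or value in self.in_use:
            raise ValueError(f"P_Key {requested} rejected")
        self.in_use.add(value)
        return value


pool = PkeyPool([PkeyRange(0x100, 0x1FF, True), PkeyRange(0x700, 0x7FF, False)])
print(hex(pool.allocate()))         # automatic allocation
print(hex(pool.allocate("0x76b")))  # explicit request
```

A second request for "0x76b", or an explicit request for a value inside the auto-assigned range, raises ValueError in this model, mirroring the rejection described above.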
Updating a partition (nicocli infiniband-partition update) is restricted to
fields other than P_Key (name, MTU, rate limit, and so on). Deleting one
(nicocli infiniband-partition delete) requires that no instance still
references the partition.
UFM distinguishes full and limited P_Key membership. Full members can communicate with both full and limited members of the same P_Key; limited members can only talk to full members.
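The full/limited rule can be written down as a small predicate. The sketch below relies on the standard InfiniBand encoding, in which a P_Key is 16 bits wide, the low 15 bits identify the partition, and the top bit marks full membership. The function names are illustrative and are not part of NICo or UFM.

```python
FULL_MEMBERSHIP_BIT = 0x8000  # top bit of the 16-bit P_Key: 1 = full, 0 = limited
PKEY_BASE_MASK = 0x7FFF       # low 15 bits identify the partition


def pkey_base(pkey: int) -> int:
    """Partition identifier with the membership bit stripped."""
    return pkey & PKEY_BASE_MASK


def is_full_member(pkey: int) -> bool:
    return bool(pkey & FULL_MEMBERSHIP_BIT)


def can_communicate(pkey_a: int, pkey_b: int) -> bool:
    """Two ports can talk if they share a partition and at least one is a full member."""
    if pkey_base(pkey_a) != pkey_base(pkey_b):
        return False
    return is_full_member(pkey_a) or is_full_member(pkey_b)


print(can_communicate(0x876B, 0x076B))  # full with limited, same partition
print(can_communicate(0x076B, 0x076B))  # limited with limited, same partition
```

In this model two limited members of the same partition cannot reach each other, and ports in different partitions cannot communicate regardless of membership type.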
NICo’s posture:
- The IbPortMembership enum is read-only from NICo’s perspective.
  The monitor records whatever UFM reports for each port-in-partition
  binding.
- Whatever UFM assigns by default (see the default_membership = limited
  hardening in the IB runbook) is what appears on a NICo-managed binding.

IbFabricMonitor
IbFabricMonitor is the background reconciler inside the API server.
Every iteration it:
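- Reads the desired partition membership from each instance’s
  InstanceInfinibandConfig.
- For each host GUID that is missing from its expected P_Key, calls
  bind_ib_ports() to add it.
- For each host GUID in a NICo-managed P_Key that no longer corresponds to a
  live instance, calls unbind_ib_ports() to remove it.
- Updates the configs_synced.infiniband field on InstanceStatus.

Cadence is set by fabric_monitor_run_interval (default 60 seconds). After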
applying any UFM changes, the monitor accelerates the next iteration to
~1 second so that convergence shows up quickly in observed state. Once a
steady iteration completes with no changes, the monitor returns to the
configured interval.
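One way to picture an iteration is as a set difference between desired and observed membership, followed by a choice of the next sleep interval. The sketch below is illustrative only: the Membership type and both function names are hypothetical, and the real monitor's data structures are not described on this page.

```python
from typing import Dict, Set, Tuple

# Hypothetical shape: P_Key -> set of host port GUIDs.
# `observed` is assumed to contain only NICo-managed P_Keys.
Membership = Dict[int, Set[str]]


def plan_changes(desired: Membership, observed: Membership) -> Tuple[Membership, Membership]:
    """Return (to_bind, to_unbind): GUIDs to add to and remove from each P_Key."""
    to_bind: Membership = {}
    to_unbind: Membership = {}
    for pkey in desired.keys() | observed.keys():
        want = desired.get(pkey, set())
        have = observed.get(pkey, set())
        if want - have:
            to_bind[pkey] = want - have    # candidates for bind_ib_ports()
        if have - want:
            to_unbind[pkey] = have - want  # candidates for unbind_ib_ports()
    return to_bind, to_unbind


def next_interval(changes_applied: bool, configured_s: float = 60.0, fast_s: float = 1.0) -> float:
    """Accelerate to roughly 1 second after applying changes, else use the configured interval."""
    return fast_s if changes_applied else configured_s


desired = {0x101: {"guid-a", "guid-b"}}
observed = {0x101: {"guid-b", "guid-c"}}
to_bind, to_unbind = plan_changes(desired, observed)
print(to_bind, to_unbind, next_interval(bool(to_bind or to_unbind)))
```

Because the plan is recomputed from observed UFM state on every pass, an out-of-band edit in UFM simply shows up as a non-empty plan on the next iteration.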
The monitor exposes metrics under the nico_ib_monitor_* namespace:
For a tenant instance with an IB interface attached to a partition:
1. The tenant updates the instance (REST PATCH …/nico/instance, or
   nicocli instance update) with an InfiniBand interface configuration
   that references the desired partition ID for each IB port.
2. The next IbFabricMonitor iteration observes the new desired state
   and issues bind_ib_ports() for each host GUID that is not already a
   member of the expected P_Key.
3. InstanceStatus::infiniband::configs_synced flips to true once
   observed UFM state matches desired state. The aggregate
   configs_synced, and therefore the instance’s Ready state, follow.

Tenants observe the in-flight state as Configuring and the
InstanceStatus machine remains in WaitingForNetworkConfig until the
monitor reports convergence.
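A tenant-side script that waits for convergence might poll the instance status until the InfiniBand sync flag is true. The sketch below assumes a caller-supplied fetch_status function and a status mapping shaped like {"infiniband": {"configs_synced": bool}}. Both are hypothetical stand-ins for whatever the REST API or nicocli actually returns.

```python
import time
from typing import Any, Callable, Mapping


def wait_for_ib_sync(
    fetch_status: Callable[[], Mapping[str, Any]],  # hypothetical: returns the instance status
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the InfiniBand config reports synced; return False on timeout."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        # Assumed shape, see the lead-in.
        if status.get("infiniband", {}).get("configs_synced") is True:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Usage with canned responses standing in for real API calls.
responses = iter([
    {"infiniband": {"configs_synced": False}},
    {"infiniband": {"configs_synced": True}},
])
print(wait_for_ib_sync(lambda: next(responses), sleep=lambda _s: None))
```

Given the monitor's default 60-second interval, a poll period of a few seconds and a timeout of several minutes is a reasonable starting point for such a loop.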
When an instance is released or its host is force-deleted, NICo clears
the IB interfaces from the instance config. The reconciler then sees host
GUIDs in NICo-managed P_Keys that no longer correspond to any live
instance and removes them via unbind_ib_ports().
NICo tracks the cleanup with an IbCleanupPending health alert on the
machine. The alert is set when cleanup is required and cleared once the
monitor confirms that every GUID has been removed from UFM-side
partitions. A machine with an outstanding IbCleanupPending alert is
ineligible for reuse by another tenant: this is the IB equivalent of the
Ethernet termination guard described in
Default Isolation.
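The cleanup guard amounts to two checks: cleanup is complete once none of the machine's GUIDs remain in any UFM-side partition, and a machine is reusable only when the alert is absent. The following sketch models that logic with plain sets. The function names and data shapes are hypothetical, and only the alert name IbCleanupPending comes from the text above.

```python
from typing import Dict, Iterable, Set

CLEANUP_ALERT = "IbCleanupPending"


def cleanup_complete(machine_guids: Set[str], ufm_membership: Dict[int, Set[str]]) -> bool:
    """True once none of the machine's GUIDs remain in any UFM-side partition."""
    return not any(machine_guids & members for members in ufm_membership.values())


def eligible_for_reuse(alerts: Iterable[str]) -> bool:
    """A machine with an outstanding cleanup alert must not be handed to another tenant."""
    return CLEANUP_ALERT not in set(alerts)


guids = {"guid-a", "guid-b"}
print(cleanup_complete(guids, {0x101: {"guid-a"}, 0x102: set()}))  # a GUID is still bound
print(cleanup_complete(guids, {0x101: set(), 0x102: {"guid-z"}}))  # all GUIDs removed
print(eligible_for_reuse(["IbCleanupPending"]))
```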
There is no dedicated “force-delete IB partition” operation. Partitions
persist in the NICo database independent of instance churn; their membership
is what is reconciled against UFM. To remove a partition entirely, every
instance referencing it must release first, then the tenant’s
nicocli infiniband-partition delete (REST DELETE …/nico/infiniband-partition/{id})
will succeed.
1. Complete the InfiniBand Setup runbook: default_membership = limited,
   M_Key / SA_Key hardening, and any required static topology configuration.
2. Set [ib_config].enabled = true and any fabric-wide MTU / rate /
   service-level defaults.
3. Add [ib_fabrics.<name>] with the UFM endpoint(s) and one or more pkeys
   ranges. Size the ranges with room for future growth.
4. Once the API server has loaded the configuration, IbFabricMonitor begins
   its periodic reconciliation on the next tick.

All tenant steps use the REST API or nicocli; none require TOML or
nico-admin-cli.

1. Run nicocli infiniband-partition create (REST POST …/nico/infiniband-partition)
   for each isolation domain the tenant needs (typically one per workload).
2. Run nicocli instance update (REST PATCH …/nico/instance) to attach each
   IB interface to the appropriate partition.
3. Wait for the instance to converge to configs_synced.infiniband = true.

NICo does not ship a single “is IB healthy” command. Verification is a short, repeatable checklist.
1. Confirm that nico_ib_monitor_iteration_latency is being recorded (the
   monitor is running) and that UFM error counters are flat.
2. List the partitions with nicocli infiniband-partition list
   (REST GET …/nico/infiniband-partition) and confirm each is in a converged
   state. For deeper internal state during triage (the state-machine outcome
   field that surfaces UFM sync failures), an operator can use
   nico-admin-cli ib_partition show (--id, --tenant-org-id, or --name),
   which the REST API does not expose.
3. Inspect infiniband_status_observation via the machine debug tooling and
   confirm each managed port reports membership in the intended P_Key.
   Cross-check with the live UFM partition table for the same partition.
4. nico_ib_monitor_machines_with_missing_pkeys_count and
   nico_ib_monitor_machines_with_unexpected_pkeys_count should both
   be 0 in steady state. A non-zero value for either indicates a divergence
   between intent and UFM state and warrants investigation.
5. Confirm that no machine carries an outstanding IbCleanupPending alert.

The IB integration is in production but with the following gaps:
- Only a single entry in endpoints is used at runtime. UFM HA is handled
  by UFM itself (virtual IP / HA pair); the multi-endpoint config field
  exists for forward compatibility.
- There is no ping-style health command.
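In the absence of a single health command, the metric checks from the verification checklist can be scripted. The sketch below takes a mapping of metric name to current value and returns a list of findings. How the values are scraped is left open, and the function name and input shape are hypothetical; only the three metric names come from this page.

```python
from typing import List, Mapping

LATENCY = "nico_ib_monitor_iteration_latency"
MISSING = "nico_ib_monitor_machines_with_missing_pkeys_count"
UNEXPECTED = "nico_ib_monitor_machines_with_unexpected_pkeys_count"


def ib_findings(metrics: Mapping[str, float]) -> List[str]:
    """Return human-readable findings; an empty list means the metric checks pass."""
    findings: List[str] = []
    if LATENCY not in metrics:
        findings.append(f"{LATENCY} is not being recorded; is the monitor running?")
    for name in (MISSING, UNEXPECTED):
        value = metrics.get(name)
        if value is None:
            findings.append(f"{name} is not reported")
        elif value != 0:
            findings.append(f"{name} = {value}, expected 0 in steady state")
    return findings


sample = {LATENCY: 0.2, MISSING: 2, UNEXPECTED: 0}
for finding in ib_findings(sample):
    print(finding)
```

This covers only the metric-based items; the partition listing, the per-port observation, and the IbCleanupPending check still need to be performed through their respective tools.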