Topograph enriches Kubernetes nodes with labels and annotations that describe their physical network topology. This reference covers every label and annotation key written by Topograph, how values are derived, and how to configure them.
Labels are set by the Kubernetes engine (engine: k8s) and the Slinky engine (engine: slinky). They are intended for use by workload schedulers (e.g. KAI Scheduler, gang-scheduling plugins, topology-aware bin-packers) and observability tools to reason about network locality.
Labels are additive: a node that belongs to both a block topology (NVLink domain) and a tree topology (switch fabric) carries both accelerator and leaf/spine/core simultaneously.
Not all providers produce both topology types:
Relationship to nvidia.com/gpu.clique: The GPU Operator device plugin sets nvidia.com/gpu.clique on nodes with Multi-Node NVLink (MNNVL) GPUs. The infiniband-bm and infiniband-k8s providers derive their accelerator value from the same ClusterUUID.CliqueId hardware identifiers, so the values are directly comparable. The netq provider uses a DomainUUID from the NMX management API — a different identifier that refers to the same physical domain but cannot be compared as a string.
NVIDIA Fabric Manager runs at node init on MNNVL-capable hardware, discovers the NVLink fabric across GPUs, and registers each GPU with NVML (NVIDIA Management Library — a C API that exposes per-GPU state). The GPU Operator’s IMEX labeler writes nvidia.com/gpu.clique only once NVML reports the node’s fabric state as GPU_FABRIC_STATE_COMPLETED — meaning Fabric Manager finished initialization successfully and the node is part of an NVLink domain.
On non-MNNVL systems (e.g., DGX B200, B300), the GPU fabric never reaches GPU_FABRIC_STATE_COMPLETED, so nvidia.com/gpu.clique is not set at all. On these systems, Topograph with an InfiniBand provider is the only source of network topology for scheduling decisions.
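A consumer that wants one NVLink-locality key across both hardware classes could fall back from one label to the other. The following is a minimal sketch, not Topograph or scheduler code: the helpers `locality_key` and `group_by_locality` and the in-memory label dictionaries are hypothetical, and it assumes node labels have already been read from the Kubernetes API.

```python
from collections import defaultdict

CLIQUE_LABEL = "nvidia.com/gpu.clique"
ACCELERATOR_LABEL = "network.topology.nvidia.com/accelerator"


def locality_key(labels):
    """Return (label_key, value) for the first locality label present, or None.

    gpu.clique is tried first; the Topograph accelerator label is the fallback
    for nodes where gpu.clique is not set.
    """
    for key in (CLIQUE_LABEL, ACCELERATOR_LABEL):
        value = labels.get(key)
        if value:
            return key, value
    return None


def group_by_locality(nodes):
    """Group node names by locality key; nodes with neither label are skipped."""
    groups = defaultdict(list)
    for name, labels in nodes.items():
        key = locality_key(labels)
        if key is not None:
            groups[key].append(name)
    return dict(groups)
```

Keeping the label key in the group identifier avoids accidentally merging a gpu.clique value with an accelerator value that happens to be the same string but has a different meaning.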
accelerator and nvidia.com/gpu.clique for scheduling

Workload schedulers consuming topology labels may need to choose between Topograph’s network.topology.nvidia.com/accelerator and the NVIDIA GPU Operator’s nvidia.com/gpu.clique. The right choice depends on the provider and the desired granularity:
nvidia.com/gpu.clique. On the AWS provider this is finer granularity than accelerator (which carries the CapacityBlockId, i.e., the NVL Domain). On DRA, InfiniBand, and Lambda AI providers the two labels carry the same value.
nvidia.com/gpu.clique is absent. Use network.topology.nvidia.com/accelerator.
topology.conf directly.

Caveats when preferring nvidia.com/gpu.clique:
The label encodes node identity within MNNVL domains, not fabric proximity between them. NVL Partition is encoded as the full <ClusterUUID>.<CliqueID> value; NVL Domain is encoded as the ClusterUUID prefix. A scheduler can therefore distinguish racks — two nodes with different ClusterUUID are in different NVL Domains — and act on that distinction (same-Domain affinity to pack a job onto a single rack, cross-Domain anti-affinity to spread independent jobs across racks). What the label does not encode is the physical proximity between Domains: ClusterUUIDs are opaque identifiers, so the label cannot tell a scheduler which racks share a top-of-rack switch, an aggregation tier, or a core. For cross-rack proximity-aware placement, Topograph populates the following labels from the InfiniBand or NetQ providers regardless of whether gpu.clique is present:
leaf label.
spine label.
core label.

These labels are also relevant for mixed-workload fragmentation avoidance (see docs/engines/k8s.md § Mixed Workload Considerations).
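The distinction between NVL Domain identity and switch-tier proximity can be sketched as follows. This is an illustrative example only: the helper names are hypothetical, the full keys for leaf, spine, and core assume the default network.topology.nvidia.com/ prefix, and the parser assumes the CliqueID itself contains no dot.

```python
PREFIX = "network.topology.nvidia.com/"  # assumed default prefix
TIERS = ("leaf", "spine", "core")        # closest tier first


def split_clique(value):
    """Split '<ClusterUUID>.<CliqueID>' into (cluster_uuid, clique_id)."""
    cluster_uuid, sep, clique_id = value.rpartition(".")
    if not sep or not cluster_uuid or not clique_id:
        raise ValueError(f"unexpected gpu.clique value: {value!r}")
    return cluster_uuid, clique_id


def same_nvl_domain(clique_a, clique_b):
    """True when both values share the ClusterUUID prefix (same NVL Domain)."""
    return split_clique(clique_a)[0] == split_clique(clique_b)[0]


def closest_shared_tier(labels_a, labels_b):
    """Return the lowest switch tier whose label value both nodes share, or None."""
    for tier in TIERS:
        a = labels_a.get(PREFIX + tier)
        b = labels_b.get(PREFIX + tier)
        if a is not None and a == b:
            return tier
    return None
```

`same_nvl_domain` answers only whether two nodes sit in the same Domain. `closest_shared_tier` uses the switch-hierarchy labels to say how far apart two nodes in different Domains are.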
The label is refreshed by GPU Feature Discovery at its configured interval (the k8s-device-plugin default is 60s) rather than propagated instantly. Fabric-state changes in the window between refreshes are not yet reflected in the label.
Persistence of ClusterUUID / CliqueID across node reboots is administratively controlled via Fabric Manager’s FABRIC_MODE_RESTART configuration (default: preserve partition configurations). Deployments that disable preservation may see identifiers change across restarts, which can invalidate scheduler state cached on those values.
Label values are used as-is when they are 63 characters or shorter (the Kubernetes label value limit). Values longer than 63 characters are replaced with their FNV-64a hash rendered as an x-prefixed lowercase hex string (e.g., x3e4f1a2b3c4d5e6f) to stay within the limit. This means two nodes with the same long switch identifier will carry the same hash value — locality is preserved, but the original identifier is not recoverable from the label alone.
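The value rule above can be sketched in a few lines. This is a minimal illustration rather than Topograph's implementation: the function names are hypothetical, and UTF-8 encoding of the identifier and zero-padding of the hex string to 16 digits are assumptions not stated in this reference.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MAX_LABEL_VALUE_LEN = 63  # Kubernetes label value limit


def fnv1a_64(data: bytes) -> int:
    """Compute the 64-bit FNV-1a hash of data."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF  # keep the result in 64 bits
    return h


def label_value(identifier: str) -> str:
    """Return identifier unchanged if it fits, else 'x' plus its hash in hex."""
    if len(identifier) <= MAX_LABEL_VALUE_LEN:
        return identifier
    # Assumption: UTF-8 bytes are hashed and the hex string is zero-padded.
    return "x" + format(fnv1a_64(identifier.encode("utf-8")), "016x")
```

Because the hash is deterministic, two nodes attached to the same long switch identifier still compare equal on the label, which is all a locality-aware scheduler needs.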
The default network.topology.nvidia.com/ prefix is configurable via the Helm topologyNodeLabels value. If you need to map Topograph’s topology layers to a custom label schema, override the keys at deploy time. The label values (topology identifiers) are always derived from the provider’s topology discovery and cannot be configured.
An active Kubernetes Enhancement Proposal (KEP), KEP-4962: Standardizing the Representation of Cluster Network Topology (draft in PR #4965), advocates reserved label keys under the topology.kubernetes.io/ namespace for a standardized representation of cluster network topology. The KEP is pre-GA and still under upstream review. Topograph’s current network.topology.nvidia.com/* keys predate any potential upstream standard and are presently vendor-scoped — the KEP’s framing allows vendor prefixes and standard labels to coexist rather than replace one another. If KEP-4962 reaches GA with stable keys, Topograph will evaluate aligning or providing both; for now, the network.topology.nvidia.com/* keys remain authoritative for Topograph-deployed clusters.
When Topograph is not deployed, the labels commonly available for topology-aware scheduling are:
These labels are set by cloud provider integrations and the NVIDIA GPU Operator’s GPU Feature Discovery (GFD) component — not by Topograph.
Topograph sets the following annotations on nodes as internal bookkeeping metadata. These are not intended for scheduler use but may be useful for debugging and observability.
Additional annotations are set on topology ConfigMaps (used by the Slinky engine):
NVSentinel’s Metadata Augmentor enriches health events with node labels from a configurable allowedLabels list. As of NVSentinel #1226 (merged 2026-04-23; shipping in the next NVSentinel release), the four network.topology.nvidia.com/* labels are included in the default allowedLabels — so on clusters where Topograph is deployed, NVSentinel propagates topology into health event metadata automatically, with no operator configuration required. Downstream consumers — fault-quarantine CEL rules, remediation custom resources, dashboards, blast-radius analysis — can then reason about topological locality at NVL Partition, NVL Domain, or switch-hierarchy level.
NVSentinel’s Metadata Augmentor skips labels that aren’t present on a node, so nodes without Topograph (or MNNVL-only labels on non-MNNVL hardware) behave cleanly — no configuration conditionals needed.
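The skip-if-absent behavior can be pictured as a simple filter. The sketch below is not NVSentinel's code: the `augment` function is hypothetical, and the four full label keys assume the default network.topology.nvidia.com/ prefix.

```python
TOPOGRAPH_LABELS = [
    "network.topology.nvidia.com/accelerator",
    "network.topology.nvidia.com/leaf",
    "network.topology.nvidia.com/spine",
    "network.topology.nvidia.com/core",
]


def augment(node_labels, allowed_labels):
    """Copy only the allowed labels that are actually present on the node."""
    return {key: node_labels[key] for key in allowed_labels if key in node_labels}
```

A node carrying only leaf and spine labels yields a two-entry result, and a node without Topograph labels yields an empty one, so no per-cluster conditionals are required.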
Operators on earlier NVSentinel versions, or operators running a customized allowedLabels list, can add the Topograph labels explicitly in distros/kubernetes/nvsentinel/values.yaml:
See NVSentinel’s docs/INTEGRATIONS.md § Topology Awareness (Topograph).