Copyright © 2026, NVIDIA Corporation.

Highly available control plane

Table of Contents

  • Highly Available Control Plane
    • Bootstrap
    • Election
    • Non-electable gateways
    • Metasync
  • Data Plane Availability
  • References

Highly Available Control Plane

An AIStore cluster survives the loss of any storage target and any gateway, including the primary gateway (leader). New gateways and targets can join at any time, including while a new leader is being elected. Each node that joins a running cluster is updated with the most current cluster-level metadata. Failover, that is, the election of a new leader, is carried out automatically when the current leader fails. Failback, that is, the administrative selection of the leading (likely the originally designated) gateway, is done manually via the AIStore API.

It is therefore recommended that an AIStore cluster be deployed with multiple proxies, also known as gateways (the two terms are used interchangeably throughout the source code and this README).

When there are multiple proxies, only one of them acts as the primary, and all the rest are non-primaries. The primary proxy’s main responsibility is serializing updates to the cluster-level metadata (which is also versioned and immutable).

Further:

  • Each proxy/gateway stores a local copy of the cluster map (Smap)
  • Each Smap instance is immutable and versioned; the versioning is monotonic (increasing)
  • Only the current primary (leader) proxy distributes Smap updates to all other clustered nodes
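The three properties above can be illustrated with a minimal Go sketch. It is not the AIStore implementation: the `Smap` struct here is a heavily simplified stand-in, and `clone`, `withTarget`, and `accept` are hypothetical helper names chosen for this example. The point is only that an update produces a new instance with a higher version, and that a node ignores anything that is not strictly newer than what it already holds.

```go
package main

import "fmt"

// Smap is a simplified, hypothetical cluster map; the real AIStore type
// carries far more state. Published instances are never modified in place.
type Smap struct {
	Version int64
	Primary string            // ID of the primary proxy
	Proxies map[string]string // proxy ID -> URL
	Targets map[string]string // target ID -> URL
}

// clone returns a deep copy with the version bumped by one.
func (m *Smap) clone() *Smap {
	c := &Smap{
		Version: m.Version + 1,
		Primary: m.Primary,
		Proxies: make(map[string]string, len(m.Proxies)),
		Targets: make(map[string]string, len(m.Targets)),
	}
	for id, url := range m.Proxies {
		c.Proxies[id] = url
	}
	for id, url := range m.Targets {
		c.Targets[id] = url
	}
	return c
}

// withTarget models a serialized update: copy, modify the copy, publish it.
func (m *Smap) withTarget(id, url string) *Smap {
	c := m.clone()
	c.Targets[id] = url
	return c
}

// accept reports whether a node holding cur should replace it with next.
// Versions only increase, so anything not strictly newer is ignored.
func accept(cur, next *Smap) bool {
	return cur == nil || next.Version > cur.Version
}

func main() {
	v1 := &Smap{
		Version: 1,
		Primary: "p1",
		Proxies: map[string]string{"p1": "http://p1:8080"},
		Targets: map[string]string{},
	}
	v2 := v1.withTarget("t1", "http://t1:8081")

	fmt.Println(v1.Version, len(v1.Targets))    // 1 0
	fmt.Println(v2.Version, len(v2.Targets))    // 2 1
	fmt.Println(accept(v1, v2), accept(v2, v1)) // true false
}
```

Copy-on-update keeps readers lock-free in this sketch: anyone still holding `v1` sees a consistent snapshot while `v2` is being distributed.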

Bootstrap

The proxy’s bootstrap sequence begins with the following three main steps:

  • step 1: load a local copy of the cluster map (Smap) and try to use it for the discovery of the current one;
  • step 2: use the local configuration and the local Smap to perform the discovery of the cluster-level metadata;
  • step 3: use all of the above and, optionally, AIS_PRIMARY_EP to determine whether this proxy must keep starting up as a primary;
    • otherwise, join as a non-primary (a.k.a. secondary).

The rules that determine whether a given starting-up proxy is the primary one in the cluster are simple. In fact, it’s a single switch statement in the namesake function:

  • determineRole.
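To give a feel for what a single switch statement over these inputs might look like, here is a hedged Go sketch. It is not a copy of determineRole, and the actual rules in the AIStore source may differ. The `bootInputs` struct, the `sketchRole` function, the `configuredPrimary` flag, and the precedence order (AIS_PRIMARY_EP first, then the local Smap, then local configuration) are all assumptions made for illustration.

```go
package main

import "fmt"

type role int

const (
	nonPrimary role = iota
	primary
)

// bootInputs gathers what the bootstrap steps produced (all fields hypothetical).
type bootInputs struct {
	selfID            string
	selfURL           string
	primaryEP         string // value of AIS_PRIMARY_EP, empty if unset
	smapPrimaryID     string // primary recorded in the local Smap, empty if none was loaded
	configuredPrimary bool   // assumed local-config hint, used only as a last resort
}

// sketchRole decides the startup role with one switch statement.
// The precedence shown here is an assumption, not the documented rule set.
func sketchRole(in bootInputs) role {
	switch {
	case in.primaryEP != "":
		// An explicit endpoint wins: primary only if it points at this proxy.
		if in.primaryEP == in.selfURL {
			return primary
		}
		return nonPrimary
	case in.smapPrimaryID != "":
		// Otherwise trust the locally stored cluster map.
		if in.smapPrimaryID == in.selfID {
			return primary
		}
		return nonPrimary
	case in.configuredPrimary:
		return primary
	default:
		return nonPrimary
	}
}

func main() {
	self := bootInputs{selfID: "p1", selfURL: "http://p1:8080"}

	byEnv := self
	byEnv.primaryEP = "http://p2:8080"
	byEnv.smapPrimaryID = "p1" // overridden by the explicit endpoint
	fmt.Println(sketchRole(byEnv) == primary) // false

	bySmap := self
	bySmap.smapPrimaryID = "p1"
	fmt.Println(sketchRole(bySmap) == primary) // true

	fmt.Println(sketchRole(self) == primary) // false
}
```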

Further, the (potentially) primary proxy executes more steps:

  • (i) initialize empty Smap;
  • (ii) wait a configured time for other nodes to join;
  • (iii) merge the Smap containing newly joined nodes with the Smap that was previously discovered;
  • (iv) and use the latter to rediscover cluster-wide metadata and resolve remaining conflicts, if any.

If, during any of these steps, the proxy finds out that it must join as a non-primary, it simply does so.

Election

The primary proxy election process is as follows:

  • A candidate to replace the current (failed) primary is selected;
  • The candidate is notified that an election is commencing;
  • After the candidate (proxy) confirms that the current primary proxy is down, it broadcasts vote requests to all other nodes;
  • Each recipient node confirms whether the current primary is down and whether, according to its local Smap, the candidate proxy has the highest random weight (HRW);
  • If confirmed, the node responds with Yes, otherwise it’s a No;
  • If and when the candidate receives a majority of affirmative responses it performs the commit phase of this two-phase process by distributing an updated cluster map to all nodes.
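The voting logic above can be sketched in Go as follows. This is an illustration under stated assumptions, not AIStore's election code: the `proxy` struct and the functions `weight`, `hrwCandidate`, `vote`, and `elected` are hypothetical, FNV-1a from the standard library stands in for whatever hash the real HRW computation uses, and "majority" is taken to mean more than half of the nodes that were asked to vote.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// proxy is a simplified Smap entry (hypothetical).
type proxy struct {
	ID           string
	NonElectable bool
}

// weight is a stand-in random weight: FNV-1a over the node ID.
func weight(id string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(id))
	return h.Sum64()
}

// hrwCandidate returns the electable proxy with the highest weight,
// skipping the failed primary. ok is false if no proxy qualifies.
func hrwCandidate(proxies []proxy, failedPrimary string) (id string, ok bool) {
	var best uint64
	for _, p := range proxies {
		if p.ID == failedPrimary || p.NonElectable {
			continue
		}
		if w := weight(p.ID); !ok || w > best {
			id, best, ok = p.ID, w, true
		}
	}
	return id, ok
}

// vote is the recipient's check: Yes only if it also sees the primary as down
// and its local Smap agrees that the candidate has the highest weight.
func vote(localSmap []proxy, failedPrimary, candidate string, primaryDown bool) bool {
	if !primaryDown {
		return false
	}
	want, ok := hrwCandidate(localSmap, failedPrimary)
	return ok && want == candidate
}

// elected reports whether yes votes form a strict majority of all voters.
func elected(yes, voters int) bool {
	return yes > voters/2
}

func main() {
	smap := []proxy{{ID: "p1"}, {ID: "p2"}, {ID: "p3"}, {ID: "p4", NonElectable: true}}
	failed := "p1"

	cand, ok := hrwCandidate(smap, failed)
	if !ok {
		fmt.Println("no electable proxy left")
		return
	}

	// Five voters with identical Smaps; one of them still sees the primary as up.
	primaryDownSeenBy := []bool{true, true, true, false, true}
	yes := 0
	for _, down := range primaryDownSeenBy {
		if vote(smap, failed, cand, down) {
			yes++
		}
	}
	fmt.Println("candidate:", cand, "yes:", yes, "elected:", elected(yes, len(primaryDownSeenBy)))
}
```

Because every node derives the candidate from its own copy of the Smap, a node with a diverging map votes No, which is what keeps a stale candidate from committing.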

Non-electable gateways

An AIStore cluster can be stretched to collocate its redundant gateways with the compute nodes. Those non-electable local gateways (see AIStore configuration) serve only as access points and never take on the responsibility of leading the cluster.

Metasync

By design, AIStore has no centralized, shared cluster-level metadata that would be a single point of failure (SPOF). The metadata consists of versioned objects: cluster map, buckets (names and properties), authentication tokens. In AIStore, these objects are consistently replicated across the entire cluster; the component responsible for this is called metasync. Metasync keeps cluster-level metadata in sync at all times.
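As a rough mental model of versioned replication, consider the following Go sketch. It is not the metasync implementation: the `rev` and `node` types, the `receive` and `syncAll` functions, and the kind strings are hypothetical, and the sketch ignores networking, retries, and failure handling entirely. It only shows each node keeping the latest revision per metadata kind and rejecting stale ones.

```go
package main

import "fmt"

// rev is one versioned metadata object (hypothetical shape).
type rev struct {
	Kind    string // e.g. "smap", "buckets", "tokens"
	Version int64
	Payload string
}

// node keeps the latest revision it has seen for each kind.
type node struct {
	ID     string
	latest map[string]rev
}

func newNode(id string) *node {
	return &node{ID: id, latest: make(map[string]rev)}
}

// receive applies r only if it is strictly newer than the stored revision
// of the same kind; it reports whether the revision was applied.
func (n *node) receive(r rev) bool {
	if cur, ok := n.latest[r.Kind]; ok && r.Version <= cur.Version {
		return false
	}
	n.latest[r.Kind] = r
	return true
}

// syncAll pushes every revision to every node and counts how many were applied.
func syncAll(nodes []*node, revs ...rev) int {
	applied := 0
	for _, n := range nodes {
		for _, r := range revs {
			if n.receive(r) {
				applied++
			}
		}
	}
	return applied
}

func main() {
	nodes := []*node{newNode("p2"), newNode("t1")}

	// Two kinds, two nodes: four revisions applied.
	fmt.Println(syncAll(nodes,
		rev{Kind: "smap", Version: 2, Payload: "cluster map v2"},
		rev{Kind: "buckets", Version: 5, Payload: "bucket metadata v5"},
	)) // 4

	// A stale cluster map is ignored everywhere.
	fmt.Println(syncAll(nodes, rev{Kind: "smap", Version: 1, Payload: "old"})) // 0

	fmt.Println(nodes[1].latest["smap"].Version) // 2
}
```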

Data Plane Availability

While the control plane handles node membership and consistency of the cluster-level metadata, the data plane has its own resilience mechanisms:

  • Filesystem Health Checker - detects and isolates faulty storage
  • Data Protection - data protection across nodes
  • Global Rebalancing - automatic data redistribution upon node lifecycle events

References

  • Node lifecycle: maintenance mode, rebalance/rebuild, shutdown, decommission
  • Blog: Split-brain is Inevitable
  • Filesystem Health Checker