ais cluster command

The ais cluster command is the main tool for monitoring and managing an AIS (AIStore) cluster. It provides functionalities to

  • add or remove nodes
  • change the primary gateway
  • join (or merge) two AIS clusters, and
  • perform a variety of administrative operations.

The command has the following subcommands:

$ ais cluster <TAB-TAB>

show            rebalance       decommission      reload-backend-creds
dashboard       set-primary     add-remove-nodes
remote-attach   download-logs   reset-stats
remote-detach   shutdown        drop-lcache

Important: with the single exception of add-remove-nodes, all the commands listed above operate at the level of the entire cluster. Node-level operations (e.g., shutting down a selected node) can be found under add-remove-nodes.

Alternatively, use --help to show subcommands with brief descriptions:

$ ais cluster --help
NAME:
   ais cluster - Monitor and manage AIS cluster: add/remove nodes, change primary gateway, etc.

USAGE:
   ais cluster command [arguments...] [command options]

COMMANDS:
   show                  Main dashboard: show cluster at-a-glance (nodes, software versions, utilization, capacity, memory and more)
   remote-attach         Attach remote ais cluster
   remote-detach         Detach remote ais cluster
   rebalance             Administratively start and stop global rebalance; show global rebalance
   set-primary           Select a new primary proxy/gateway
   download-logs         Download all log archives from all clustered nodes (one TAR.GZ per node), e.g.:
                         - 'download-logs /tmp/www' - save log archives to /tmp/www directory
                         - 'download-logs --severity w' - errors and warnings to /tmp directory
                         (see related: 'ais log show', 'ais log get')
   shutdown              Shut down entire cluster
   decommission          Decommission entire cluster
   add-remove-nodes      Manage cluster membership (add/remove nodes, temporarily or permanently)
   reset-stats           Reset cluster or node stats (all cumulative metrics or only errors)
   drop-lcache           Drop (discard) in-memory object metadata cache
   reload-backend-creds  Reload (updated) backend credentials

OPTIONS:
   --help, -h  Show help

As always, each subcommand will have its own help and usage examples, the latter possibly spread across multiple documents.

Note: for any keyword or text of any kind, you can easily look up examples and descriptions via a simple find or git grep, for instance:

$ find . -type f -name "*.md" | xargs grep "ais.*mountpath"
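The same lookup can be wrapped in a small helper that also copes with file names containing spaces. This is a minimal sketch: the docs_grep name is hypothetical, and it relies on the -H option that GNU and BSD grep both accept.

```sh
# docs_grep: search Markdown files under DIR ($2, default ".") for the
# regular expression PATTERN ($1), printing file names and line numbers.
# Using "-exec ... {} +" instead of xargs keeps names with spaces intact.
docs_grep() {
    find "${2:-.}" -type f -name '*.md' -exec grep -n -H -e "$1" {} +
}

# Usage:
#   docs_grep 'ais.*mountpath'
# Inside a git checkout, a comparable search is:
#   git grep -n -e 'ais.*mountpath' -- '*.md'
```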

Note that there is a single CLI command to grow a cluster, and multiple commands to scale it down.

Scaling down can be done gracefully or forcefully, and also temporarily or permanently.

For background, usage examples, and details, please see Node lifecycle: maintenance, shutdown, decommission.

Adding/removing nodes

The corresponding functionality can be found under the subcommand called add-remove-nodes:

$ ais cluster add-remove-nodes --help
NAME:
   ais cluster add-remove-nodes - manage cluster membership (add/remove nodes, temporarily or permanently)

USAGE:
   ais cluster add-remove-nodes command [arguments...] [command options]

COMMANDS:
   join               add a node to the cluster
   start-maintenance  put node in maintenance mode, temporarily suspend its operation
   stop-maintenance   activate node by taking it back from "maintenance"
   decommission       safely and permanently remove node from the cluster

   shutdown           shutdown a node, gracefully or immediately;
                      note: upon shutdown the node won't be decommissioned - it'll remain in the cluster map
                      and can be manually restarted to rejoin the cluster at any later time;
                      see also: 'ais advanced remove-from-smap'

Cluster Dashboard

ais show dashboard (alias: ais cluster dashboard) provides an at-a-glance view of cluster health, performance, and configuration. The dashboard consolidates the most important operational metrics into a single command, making it ideal for quick status checks and continuous monitoring.

Command Overview

$ ais show dashboard --help
NAME:
   ais show dashboard - Show cluster at-a-glance dashboard: node counts, capacity, performance, health, software version, and more

USAGE:
   ais show dashboard [NODE_ID] [command options]

OPTIONS:
   --refresh value   interval for continuous monitoring;
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --count value     used together with '--refresh' to limit the number of generated reports (default: 0)
   --verbose, -v     verbose output
   --json, -j        json input/output
   --no-headers, -H  display tables without headers
   --help, -h        show help

Output Sections

The dashboard displays two main sections:

Performance and Health:

| Metric | Description |
| --- | --- |
| State | Overall cluster operational status (Operational, Critical, Maintenance, etc.) |
| Throughput | Current read/write throughput rates (shown only when active) |
| I/O Errors | Total disk I/O errors across all nodes |
| Load Avg | 1-minute load average (avg, min, max across all nodes) |
| Disk Usage | Average, minimum, and maximum disk usage percentages |
| Network | Network health status |
| Storage | Total mountpaths and their health status |
| Filesystems | Types and counts of filesystems in use |
| Running Jobs | Currently active job types (if any) |

Cluster:

| Metric | Description |
| --- | --- |
| Endpoint | Cluster endpoint URL |
| Proxies | Number of proxy nodes and electability status |
| Targets | Number of target nodes and total disks |
| Capacity | Used and available storage with percentages |
| Cluster Map | Version, UUID, and primary node information |
| Software | Version and build information |
| Backend | Configured backend provider(s) |
| Deployment | Deployment type (K8s, standalone, etc.) |
| Status | Online node count |
| Rebalance | Current rebalance status |
| Authentication | Whether AuthN is enabled |
| Version/Build | Software version and build timestamp |

Examples

Basic dashboard view:

$ ais show dashboard

Performance and Health:
   State:          Operational
   Throughput:     Read 9.5GiB/s, Write 0B/s (1s avg)
   I/O Errors:     0
   Load Avg:       avg 2.1, min 1.6, max 2.7 (1m)
   Disk Usage:     avg 19.3%, min 18.8%, max 20.1%
   Network:        healthy
   Storage:        192 mountpaths (all healthy)
   Filesystems:    xfs(192)
   Running Jobs:   None

Cluster:
   Endpoint:       https://asr.aistore.nvidia.com:51080
   Proxies:        16 (all electable)
   Targets:        16 (total disks: 192)
   Capacity:       used 591.40TiB (49%), available 595.89TiB
   Cluster Map:    version 103, UUID cwV4IkK3k, primary p[Euc2iyom3zhi6]
   Software:       3.31.a210cc0 (build: 2025-07-25T22:44:30+0000)
   Backend:        AWS
   Deployment:     K8s
   Status:         32 online
   Rebalance:      n/a
   Authentication: disabled
   Version:        3.31.a210cc0
   Build:          2025-07-25T22:44:30+0000

Continuous (Throughput) monitoring:

# Compute cluster throughput numbers over 30s intervals; refresh every 30 seconds
$ ais show dashboard --refresh 30

# Same as above but run 10 times (5m total)
$ ais show dashboard --refresh 30 --count 10

JSON output:

$ ais show dashboard --json

Verbose mode (shows detailed issue breakdown when problems detected):

$ ais show dashboard --verbose

Performance and Health:
   State: Multiple issues (6 node(s) affected: 2 maintenance, 4 rebalancing)
   ...

CLUSTER HEALTH DETAILS:
Maintenance (2/6): t[FFIt8090], t[zHut8091]
Rebalancing (4/6): t[ZHHt8087], t[atEt8086], t[UTat8088], t[xgAt8089]

Cluster and Node status

The ais show cluster command has a rather long(ish) short description and multiple subcommands:

$ ais show cluster --help

NAME:
   ais show cluster - main dashboard: cluster at-a-glance (nodes, software versions, utilization, capacity, memory and more)

USAGE:
   ais show cluster command [NODE_ID] | [target [NODE_ID]] | [proxy [NODE_ID]] | [command options]
   [smap [NODE_ID]] | [bmd [NODE_ID]] | [config [NODE_ID]] | [stats [NODE_ID]]

COMMANDS:
   smap    show cluster map (Smap)
   bmd     show bucket metadata (BMD)
   config  show cluster and node configuration
   stats   (alias for "ais show performance") show performance counters, throughput, latency and more (press <TAB-TAB> to select specific view)

OPTIONS:
   --refresh value   interval for continuous monitoring;
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --count value     used together with '--refresh' to limit the number of generated reports (default: 0)
   --json, -j        json input/output
   --no-headers, -H  display tables without headers
   --help, -h        show help

To quickly exemplify, let’s assume the cluster has a (target) node called t[xyz]. Then:

Main CLI dashboard: all storage nodes and gateways, deployed version, capacity, memory, and runtime stats:

$ ais show cluster

same as above, with only targets selected

$ ais show cluster target

show specific target

$ ais show cluster target t[xyz]

ask specific target to show its cluster map

$ ais show cluster smap t[xyz]

and so on and so forth.

Notes

The last example (above) may make sense when troubleshooting. Otherwise, by design and implementation, cluster map (Smap), bucket metadata (BMD), and all other cluster-level metadata exist in identical protected and versioned replicas on all nodes at any given point in time.

Still, to display cluster map in its (JSON) fullness, run:

$ ais show cluster smap --json

The --json option is supported almost universally across the CLI.

Similar to all other show commands, ais cluster show is an alias for ais show cluster. Both can be used interchangeably.

Options

$ ais show cluster smap --help

NAME:
   ais show cluster smap - Show cluster map (Smap)

USAGE:
   ais show cluster smap [NODE_ID] [command options]

OPTIONS:
   --count value     Used together with '--refresh' to limit the number of generated reports, e.g.:
                     '--refresh 10 --count 5' - run 5 times with 10s interval (default: 0)
   --json, -j        JSON input/output
   --no-headers, -H  Display tables without headers
   --refresh value   Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --help, -h        Show help

Examples

$ ais show cluster
PROXY          MEM USED %   MEM AVAIL   UPTIME
pufGp8080[P]   0.28%        15.43GiB    17m
ETURp8083      0.26%        15.43GiB    17m
sgahp8082      0.26%        15.43GiB    17m
WEQRp8084      0.27%        15.43GiB    17m
Watdp8081      0.26%        15.43GiB    17m

TARGET      MEM USED %   MEM AVAIL   CAP USED %   CAP AVAIL   CPU USED %   REBALANCE   UPTIME
iPbHt8088   0.28%        15.43GiB    14.00%       1.178TiB    0.13%        -           17m
Zgmlt8085   0.28%        15.43GiB    14.00%       1.178TiB    0.13%        -           17m
oQZCt8089   0.28%        15.43GiB    14.00%       1.178TiB    0.14%        -           17m
dIzMt8086   0.28%        15.43GiB    14.00%       1.178TiB    0.13%        -           17m
YodGt8087   0.28%        15.43GiB    14.00%       1.178TiB    0.14%        -           17m

Summary:
   Proxies:        5 (0 - unelectable)
   Targets:        5
   Primary Proxy:  pufGp8080
   Smap Version:   14
   Deployment:     dev
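
When scripting around this output, the primary proxy can be picked out by its [P] suffix. The following is a minimal sketch that assumes the table layout shown in the example; the primary_proxy helper name is hypothetical, and the layout may differ between AIS versions.

```sh
# primary_proxy: read `ais show cluster` output on stdin and print the ID
# of the proxy whose first column ends with the "[P]" marker.
# (Hypothetical helper; assumes the table layout from the example.)
primary_proxy() {
    awk '$1 ~ /\[P\]$/ { sub(/\[P\]$/, "", $1); print $1; exit }'
}

# Usage (requires a reachable cluster):
#   ais show cluster | primary_proxy
```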

Show cluster map

ais show cluster smap [NODE_ID]

Show a copy of the cluster map (Smap) stored on NODE_ID.

If NODE_ID is not given, show cluster map from (primary or secondary) proxy “pointed to” by your local CLI configuration (ais config cli) or AIS_ENDPOINT environment.

Note that cluster map (Smap), bucket metadata (BMD), and all other cluster-level metadata exist in identical protected and versioned replicas on all nodes at any given point in time.

Useful variations include ais show cluster smap --json (to see the unabridged version), and also:

$ ais show cluster smap --refresh 5

The latter will periodically (until Ctrl-C) show cluster map in 5-second intervals - might be useful in presence of any kind of membership changes (e.g., cluster startup).

Examples

Show smap from a given node

Ask a specific node for its cluster map (Smap) replica:

$ ais show cluster smap <TAB-TAB>
... p[ETURp8083] ...

$ ais show cluster smap p[ETURp8083]
NODE           TYPE     PUBLIC URL
ETURp8083      proxy    http://127.0.0.1:8083
WEQRp8084      proxy    http://127.0.0.1:8084
Watdp8081      proxy    http://127.0.0.1:8081
pufGp8080[P]   proxy    http://127.0.0.1:8080
sgahp8082      proxy    http://127.0.0.1:8082

NODE           TYPE     PUBLIC URL
YodGt8087      target   http://127.0.0.1:8087
Zgmlt8085      target   http://127.0.0.1:8085
dIzMt8086      target   http://127.0.0.1:8086
iPbHt8088      target   http://127.0.0.1:8088
oQZCt8089      target   http://127.0.0.1:8089

Non-Electable:

Primary Proxy: pufGp8080
Proxies: 5     Targets: 5     Smap Version: 14
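
To cross-check the summary line, the node rows can be counted by type. This is a minimal sketch that assumes the second column holds the node type, as in the example above; the smap_counts helper name is hypothetical.

```sh
# smap_counts: read `ais show cluster smap` output on stdin and print
# "proxies=N targets=M", counting rows whose second column is the node type.
# (Hypothetical helper; assumes the table layout from the example.)
smap_counts() {
    awk '
        $2 == "proxy"  { p++ }
        $2 == "target" { t++ }
        END { printf "proxies=%d targets=%d\n", p, t }
    '
}

# Usage (requires a reachable cluster):
#   ais show cluster smap | smap_counts
```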

Show cluster stats

ais show cluster stats is an alias for ais show performance.

The latter is the primary implementation, and the preferred way to investigate cluster performance, while ais show cluster stats is retained in part for convenience and in part for backward compatibility.

$ ais show cluster stats <TAB-TAB>

counters    throughput    latency    capacity    disk

$ ais show cluster stats --help

NAME:
   ais show cluster stats - (alias for "ais show performance") Show performance counters, throughput, latency, disks, used/available capacities (press <TAB-TAB> to select specific view)

USAGE:
   ais show cluster stats command [TARGET_ID] [command options]

COMMANDS:
   counters    Show (GET, PUT, DELETE, RENAME, EVICT, APPEND) object counts, as well as:
               - numbers of list-objects requests;
               - (GET, PUT, etc.) cumulative and average sizes;
               - associated error counters, if any, and more.
   throughput  Show GET and PUT throughput, associated (cumulative, average) sizes and counters
   latency     Show GET, PUT, and APPEND latencies and average sizes
   capacity    Show target mountpaths, disks, and used/available capacity
   disk        Show disk utilization and read/write statistics

OPTIONS:
   --average-size       Show average GET, PUT, etc. request size
   --count value        Used together with '--refresh' to limit the number of generated reports, e.g.:
                        '--refresh 10 --count 5' - run 5 times with 10s interval (default: 0)
   --no-headers, -H     Display tables without headers
   --non-verbose, --nv  Non-verbose (quiet) output, minimized reporting, fewer warnings
   --refresh value      Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                        valid time units: ns, us (or µs), ms, s (default), m, h
   --regex value        Regular expression to select table columns (case-insensitive), e.g.:
                        --regex "put|err" - show PUT (count), PUT (total size), and all supported error counters;
                        --regex "Put|ERR" - same as above;
                        --regex "[a-z]" - show all supported metrics, including those that have zero values across all nodes;
                        --regex "(AWS-GET$|VERSION-CHANGE$)" - show the number object version changes (updates) and cold GETs from AWS
                        --regex "(gcp-get$|version-change$)" - same as above for Google Cloud ('gs://')
   --units value        Show statistics and/or parse command-line specified sizes using one of the following units of measurement:
                        iec - IEC format, e.g.: KiB, MiB, GiB (default)
                        si  - SI (metric) format, e.g.: KB, MB, GB
                        raw - do not convert to (or from) human-readable format
   --verbose, -v        Verbose output
   --help, -h           Show help

See also:

Show disk stats

ais show storage disk [TARGET_ID] - show disk utilization and read/write statistics

$ ais show storage disk --help
NAME:
   ais show storage disk - show disk utilization and read/write statistics

USAGE:
   ais show storage disk [TARGET_ID] [command options]

OPTIONS:
   --refresh value   interval for continuous monitoring;
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --count value     used together with '--refresh' to limit the number of generated reports (default: 0)
   --no-headers, -H  display tables without headers
   --units value     show statistics and/or parse command-line specified sizes using one of the following _units of measurement_:
                     iec - IEC format, e.g.: KiB, MiB, GiB (default)
                     si  - SI (metric) format, e.g.: KB, MB, GB
                     raw - do not convert to (or from) human-readable format
   --regex value     regular expression to select table columns (case-insensitive), e.g.: --regex "put|err"
   --summary         tally up target disks to show per-target read/write summary stats and average utilizations
   --help, -h        show help

When TARGET_ID is not given, disk stats for all targets will be shown and aggregated.

Options

$ ais show storage disk --help

NAME:
   ais show storage disk - Show disk utilization and read/write statistics

USAGE:
   ais show storage disk [TARGET_ID] [command options]

OPTIONS:
   --count value     Used together with '--refresh' to limit the number of generated reports, e.g.:
                     '--refresh 10 --count 5' - run 5 times with 10s interval (default: 0)
   --no-headers, -H  Display tables without headers
   --refresh value   Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --regex value     Regular expression to select table columns (case-insensitive), e.g.:
                     --regex "put|err" - show PUT (count), PUT (total size), and all supported error counters;
                     --regex "Put|ERR" - same as above;
                     --regex "[a-z]" - show all supported metrics, including those that have zero values across all nodes;
                     --regex "(AWS-GET$|VERSION-CHANGE$)" - show the number object version changes (updates) and cold GETs from AWS
                     --regex "(gcp-get$|version-change$)" - same as above for Google Cloud ('gs://')
   --summary         Tally up target disks to show per-target read/write summary stats and average utilizations
   --units value     Show statistics and/or parse command-line specified sizes using one of the following units of measurement:
                     iec - IEC format, e.g.: KiB, MiB, GiB (default)
                     si  - SI (metric) format, e.g.: KB, MB, GB
                     raw - do not convert to (or from) human-readable format
   --help, -h        Show help

Examples

Display disk reports stats N times every M seconds

Display 2 reports of all targets’ disk statistics, with a 10s interval between reports.

$ ais show storage disk --count 2 --refresh 10s
Target        Disk   Read        Write         %Util
163171t8088   sda    6.00KiB/s   171.00KiB/s   49
948212t8089   sda    6.00KiB/s   171.00KiB/s   49
41981t8085    sda    6.00KiB/s   171.00KiB/s   49
490062t8086   sda    6.00KiB/s   171.00KiB/s   49
164472t8087   sda    6.00KiB/s   171.00KiB/s   49

Target        Disk   Read        Write         %Util
163171t8088   sda    1.00KiB/s   4.26MiB/s     96
41981t8085    sda    1.00KiB/s   4.26MiB/s     96
948212t8089   sda    1.00KiB/s   4.26MiB/s     96
490062t8086   sda    1.00KiB/s   4.29MiB/s     96
164472t8087   sda    1.00KiB/s   4.26MiB/s     96
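
A report like the one above can be filtered for heavily utilized disks. The following is a minimal sketch, assuming the five-column layout shown in the example with %Util as the last column; the busy_disks helper name and the default threshold of 90 are illustrative choices, not part of the CLI.

```sh
# busy_disks: read `ais show storage disk` output on stdin and print
# "TARGET DISK UTIL" for every row whose %Util (last column) is at or
# above the threshold given as $1 (default: 90).
# (Hypothetical helper; assumes the table layout from the example.)
busy_disks() {
    awk -v min="${1:-90}" '
        $1 == "Target" || NF < 5 { next }   # skip header rows and blank lines
        $NF + 0 >= min + 0 { print $1, $2, $NF }
    '
}

# Usage (requires a reachable cluster):
#   ais show storage disk | busy_disks 90
```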

Managing cluster membership

The ais cluster add-remove-nodes command supports adding, removing, and maintaining nodes within the cluster. It allows administrators to dynamically adjust the cluster’s composition, handle maintenance operations, and ensure availability and correctness during transitions when nodes are added or removed.

$ ais cluster add-remove-nodes --help

NAME:
   ais cluster add-remove-nodes - Manage cluster membership (add/remove nodes, temporarily or permanently)

USAGE:
   ais cluster add-remove-nodes command [arguments...] [command options]

COMMANDS:
   join               Add a node to the cluster
   start-maintenance  Put node in maintenance mode, temporarily suspend its operation
   stop-maintenance   Take node out of maintenance mode - activate
   decommission       Safely and permanently remove node from the cluster
   shutdown           Shutdown a node, gracefully or immediately;
                      note: upon shutdown the node won't be decommissioned - it'll remain in the cluster map
                      and can be manually restarted to rejoin the cluster at any later time;
                      see also: 'ais advanced remove-from-smap'

OPTIONS:
   --help, -h  Show help

Join a node

$ ais cluster add-remove-nodes join --help
NAME:
   ais cluster add-remove-nodes join - add a node to the cluster

USAGE:
   ais cluster add-remove-nodes join IP:PORT [command options]

OPTIONS:
   --role value     role of this AIS daemon: proxy or target
   --non-electable  this proxy must not be elected as primary (advanced use)
   --help, -h       show help

AIStore has two kinds of nodes: proxies (gateways) and targets (storage nodes). That’s why --role is a mandatory option that must have one of the two values:

  • --role=proxy or
  • --role=target
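
Because --role accepts only these two values, a wrapper script can validate it before invoking the CLI. This is a minimal sketch: the join_cmd helper name is hypothetical, and it only prints the command so that it can be reviewed before it is run.

```sh
# join_cmd: print the join command for ROLE ($1) and IP:PORT ($2);
# return non-zero if ROLE is not one of the two accepted values.
# (Hypothetical helper; it prints the command and does not execute it.)
join_cmd() {
    case "$1" in
        proxy|target) ;;
        *) echo "role must be 'proxy' or 'target', got: '$1'" >&2; return 1 ;;
    esac
    echo "ais cluster add-remove-nodes join --role=$1 $2"
}

# Usage:
#   join_cmd target 192.168.0.185:8086
```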

Note: aisnode will try to join the cluster using its persistent ID. If you need to specify an ID, you can do so via the aisnode executable's command line.

Example: join a proxy node

$ ais cluster add-remove-nodes join --role=proxy 192.168.0.185:8086
Proxy with ID "23kfa10f" successfully joined the cluster.

Any proxy can be potentially elected as primary; to mark certain proxies as non-electable, run (e.g.):

$ ais cluster add-remove-nodes join 192.168.0.185:8086 --role=proxy --non-electable

Remove a node

Temporarily remove an existing node from the cluster

ais cluster add-remove-nodes start-maintenance NODE_ID

ais cluster add-remove-nodes stop-maintenance NODE_ID

Starting maintenance puts the node in maintenance mode, and the cluster gradually transitions to operating without the specified node (which is labeled maintenance in the cluster map). Stopping maintenance will revert this.
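
A maintenance window built from these two commands can be sketched as follows. This is a minimal, hedged example: the with_maintenance helper name is hypothetical, and it only prints the command sequence (with the node ID quoted, since IDs such as t[...] contain shell glob characters) so that the plan can be reviewed before anything is run.

```sh
# with_maintenance: print the command sequence for a maintenance window on
# NODE_ID ($1). Commands are printed rather than executed so the plan can
# be reviewed first. (Hypothetical helper, not part of the ais CLI.)
with_maintenance() {
    node=$1
    echo "ais cluster add-remove-nodes start-maintenance '$node'"
    echo "# ... perform the maintenance work on $node here ..."
    echo "ais cluster add-remove-nodes stop-maintenance '$node'"
    echo "ais show cluster"
}

# Usage:
#   with_maintenance 't[147665t8084]'
```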

ais cluster add-remove-nodes shutdown NODE_ID

Shutting down a node will put the node in maintenance mode first, and then shut down the aisnode process on the node.

Permanently remove an existing node from the cluster

ais cluster add-remove-nodes decommission NODE_ID

Decommissioning a node will safely remove a node from the cluster by triggering a cluster-wide rebalance first. This can be avoided by specifying --no-rebalance.

Options

$ ais cluster add-remove-nodes decommission --help

NAME:
   ais cluster add-remove-nodes decommission - Safely and permanently remove node from the cluster

USAGE:
   ais cluster add-remove-nodes decommission NODE_ID [command options]

OPTIONS:
   --keep-initial-config  Keep the original plain-text configuration the node was deployed with
                          (the option can be used to restart aisnode from scratch)
   --no-rebalance         Do _not_ run global rebalance after putting node in maintenance (caution: advanced usage only)
   --no-shutdown          Do not shutdown node upon decommissioning it from the cluster
   --rm-user-data         Remove all user data when decommissioning node from the cluster
   --yes, -y              Assume 'yes' to all questions
   --help, -h             Show help

Examples

Decommission node

Permanently remove proxy p[omWp8083] from the cluster:

$ ais cluster add-remove-nodes decommission <TAB-TAB>
p[cFOp8082] p[Hqhp8085] p[omWp8083] t[bFat8087] t[Icjt8089] t[ofPt8091]
p[dpKp8084] p[NGVp8081] p[Uerp8080] t[erbt8086] t[IDDt8090] t[TKSt8088]

$ ais cluster add-remove-nodes decommission p[omWp8083]

Node "omWp8083" has been successfully removed from the cluster.

To terminate aisnode on a given machine, use the shutdown command, e.g.:

$ ais cluster add-remove-nodes shutdown t[23kfa10f]

Similar to the maintenance option, shutdown triggers global rebalancing and then shuts down the corresponding aisnode process (target t[23kfa10f] in the example above).

Temporarily put node in maintenance

$ ais show cluster
PROXY            MEM USED %   MEM AVAIL   UPTIME
202446p8082      0.09%        31.28GiB    70s
279128p8080[P]   0.11%        31.28GiB    80s

TARGET        MEM USED %   MEM AVAIL   CAP USED %   CAP AVAIL   CPU USED %   REBALANCE   UPTIME
147665t8084   0.10%        31.28GiB    16%          2.458TiB    0.12%        -           70s
165274t8087   0.10%        31.28GiB    16%          2.458TiB    0.12%        -           70s

$ ais cluster add-remove-nodes start-maintenance 147665t8084
$ ais show cluster
PROXY            MEM USED %   MEM AVAIL   UPTIME
202446p8082      0.09%        31.28GiB    70s
279128p8080[P]   0.11%        31.28GiB    80s

TARGET        MEM USED %   MEM AVAIL   CAP USED %   CAP AVAIL   CPU USED %   REBALANCE   UPTIME   STATUS
147665t8084   0.10%        31.28GiB    16%          2.458TiB    0.12%        -           71s      maintenance
165274t8087   0.10%        31.28GiB    16%          2.458TiB    0.12%        -           71s      online

Take a node out of maintenance

$ ais cluster add-remove-nodes stop-maintenance t[147665t8084]
$ ais show cluster
PROXY            MEM USED %   MEM AVAIL   UPTIME
202446p8082      0.09%        31.28GiB    80s
279128p8080[P]   0.11%        31.28GiB    90s

TARGET        MEM USED %   MEM AVAIL   CAP USED %   CAP AVAIL   CPU USED %   REBALANCE   UPTIME
147665t8084   0.10%        31.28GiB    16%          2.458TiB    0.12%        -           80s
165274t8087   0.10%        31.28GiB    16%          2.458TiB    0.12%        -           80s

Remote AIS cluster

Given an arbitrary pair of AIS clusters A and B, cluster B can be attached to cluster A, thus providing (to A) a fully-accessible (list-able, readable, writeable) backend.

For background, terminology, and definitions, and for many more usage examples, please see:

Attach remote cluster

ais cluster remote-attach UUID=URL [UUID=URL...]

or

ais cluster remote-attach ALIAS=URL [ALIAS=URL...]

Attach a remote AIS cluster to a local one via the remote cluster public URL. Alias (a user-defined name) can be used instead of cluster UUID for convenience. For more details and background on remote clustering, please refer to this document.

Examples

Attach two remote clusters, the first - by its UUID, the second one - via user-friendly alias (two).

$ ais cluster remote-attach a345e890=http://one.remote:51080 two=http://two.remote:51080

Detach remote cluster

ais cluster remote-detach UUID|ALIAS

Detach a remote cluster using its alias or UUID.

Examples

Example below assumes that the remote has user-given alias two:

$ ais cluster remote-detach two

Show remote clusters

ais show remote-cluster

Show details about attached remote clusters.

Examples

The following two commands attach and then show the remote cluster at the address my.remote.ais:51080:

$ ais cluster remote-attach alias111=http://my.remote.ais:51080
Remote cluster (alias111=http://my.remote.ais:51080) successfully attached
$ ais show remote-cluster
UUID       URL                   Alias      Primary          Smap   Targets   Online
eKyvPyHr   my.remote.ais:51080   alias111   p[80381p11080]   v27    10        yes

Notice that:

  • user can assign an arbitrary name (aka alias) to a given remote cluster
  • the remote cluster does not have to be online at attachment time; offline or currently unreachable clusters are shown as follows:
$ ais show remote-cluster
UUID         URL                        Alias      Primary       Smap   Targets   Online
eKyvPyHr     my.remote.ais:51080        alias111   p[primary1]   v27    10        no
<alias222>   <other.remote.ais:51080>              n/a           n/a    n/a       no

Notice the difference between the first and the second lines in the printout above: while both clusters appear to be currently offline (see the rightmost column), the first one was accessible at some earlier time and therefore we show that it has (in this example) 10 storage nodes and other details.
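
For monitoring scripts, the offline entries can be extracted from this listing. This is a minimal sketch that assumes Online is the last column, as in the printout above; the offline_remotes helper name is hypothetical.

```sh
# offline_remotes: read `ais show remote-cluster` output on stdin and print
# the first two columns (UUID and URL) of every row whose last column
# (Online) is "no". (Hypothetical helper; assumes the layout shown above.)
offline_remotes() {
    awk '$NF == "no" { print $1, $2 }'
}

# Usage (requires a reachable cluster):
#   ais show remote-cluster | offline_remotes
```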

To detach any of the previously configured associations, simply run:

$ ais cluster remote-detach alias111
$ ais show remote-cluster
UUID         URL                        Alias      Primary       Smap   Targets   Online
<alias222>   <other.remote.ais:51080>              n/a           n/a    n/a       no

Reset (i.e., zero out) stats counters and other metrics

ais cluster reset-stats

Example and usage

$ ais cluster reset-stats --help

NAME:
   ais cluster reset-stats - reset cluster or node stats (all cumulative metrics or only errors)

USAGE:
   ais cluster reset-stats [NODE_ID] [command options]

OPTIONS:
   --errors-only  reset only error counters
   --help, -h     show help

Let’s go ahead and reset all error counters:

$ ais cluster reset-stats --errors-only

Cluster error metrics successfully reset

Reload backend credentials

The ais cluster reload-backend-creds command provides for adding new or updating existing backend credentials at runtime.

This improvement addresses a common scenario we encountered prior to version 3.26:

  • A new potential user requests access to a given AIS cluster.
  • The user already has an S3 bucket (or it could be a GCP, Azure, or OCI bucket).
  • We verify that the cluster has network access to the user’s bucket.
  • We need to add the user’s credentials to allow AIS nodes to access the bucket.

Before version 3.26, the final step required a cluster restart. But now, with reload-backend-creds, you can seamlessly add or update credentials without any downtime.

$ ais cluster reload-backend-creds --help
NAME:
   ais cluster reload-backend-creds - Reload (updated) backend credentials

USAGE:
   ais cluster reload-backend-creds [PROVIDER] [command options]

OPTIONS:
   --help, -h  Show help

Download log archive

The command is ais cluster download-logs or, equivalently, ais log get cluster.

NAME:
   ais cluster download-logs - Download log archives from all clustered nodes (one TAR.GZ per node),
   e.g.:
   - 'ais download-logs /tmp/www' - save log archives to /tmp/www directory
   - 'ais download-logs --severity w' - errors and warnings to /tmp directory
   see related:
   - 'ais log get --help'

USAGE:
   ais cluster download-logs [OUT_DIR] [command options]

OPTIONS:
   --severity  Log severity is either 'i' or 'info' (default, can be omitted), or 'error', whereby error logs contain
               only errors and warnings, e.g.: '--severity info', '--severity error', '--severity e'
   --help, -h  Show help