AIStore Observability: CLI
The CLI is the fastest way to interrogate an AIS cluster from a terminal. This page is a jump-table to the handful of commands every SRE or developer uses when triaging performance or capacity issues. For full syntax, append <kbd>--help</kbd> to any command or see the separate CLI reference.
Table of Contents
- Installation
- Cluster Status
- Node Alerts
- Live Performance Monitoring
- Log Management
- Common Command Examples
- Best Practices
- Troubleshooting Common Issues
- CLI Resources
- Related Documentation
Installation
There are several ways to install the AIS CLI:
- Use the installation script (recommended). This script installs aisloader and CLI from the latest or previous GitHub release and enables CLI auto-completions.
- Follow the quick-start instructions.
- For a detailed introduction (including installation) and usage, see the CLI Overview.
After installation, configure your AIS endpoint via the ais config cli command or environment variables:
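A minimal sketch of the environment-variable route. AIS_ENDPOINT is the variable the CLI reads for the cluster URL; the address shown is a placeholder for a local deployment and is an assumption, not a value taken from this page.

```shell
# Placeholder URL (assumption): replace with the address of your AIS proxy/gateway.
export AIS_ENDPOINT="http://localhost:8080"

# Confirm what the CLI will pick up from the environment.
echo "AIS endpoint: ${AIS_ENDPOINT}"
```

Setting the variable in the shell profile keeps it in effect across sessions; a value set this way takes effect only for processes started from that shell.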
Cluster Status
Example: Node-level Alerts
Node Alerts
AIStore node states are categorized into three severity levels:
- Red Alerts - Critical issues requiring immediate attention:
  - OOS - Out of space condition
  - OOM - Out of memory condition
  - OOCPU - Out of CPU resources
  - DiskFault - Disk failures detected
  - NoMountpaths - No available mountpaths
  - NumGoroutines - Excessive number of goroutines
  - CertificateExpired - TLS certificate has expired
  - CertificateInvalid - TLS certificate is invalid
- Warning Alerts - Potential issues that may require attention:
  - Rebalancing - Rebalance operation in progress
  - RebalanceInterrupted - Rebalance was interrupted
  - Resilvering - Resilvering operation in progress
  - ResilverInterrupted - Resilver was interrupted
  - NodeRestarted - Node was restarted (powercycle, crash)
  - MaintenanceMode - Node is in maintenance mode
  - LowCapacity - Low storage capacity (OOS possible soon)
  - LowMemory - Low memory condition (OOM possible soon)
  - LowCPU - Low CPU availability
  - CertWillSoonExpire - TLS certificate will expire soon
  - KeepAliveErrors - Recent keep-alive errors detected
- Information States - Normal operational states:
  - ClusterStarted - Cluster has started (primary) or node has joined cluster
  - NodeStarted - Node has started (may not have joined cluster yet)
  - VoteInProgress - Voting process is in progress
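When scripting around these states, it can help to map a flag name to its severity level. The sketch below encodes the three groups listed above; alert_severity is a hypothetical helper written for illustration and is not part of the ais CLI.

```shell
# Map a node state flag (names taken from the list above) to its severity.
# alert_severity is a hypothetical helper, not an ais command.
alert_severity() {
  case "$1" in
    OOS|OOM|OOCPU|DiskFault|NoMountpaths|NumGoroutines|CertificateExpired|CertificateInvalid)
      echo "red" ;;
    Rebalancing|RebalanceInterrupted|Resilvering|ResilverInterrupted|NodeRestarted|MaintenanceMode|LowCapacity|LowMemory|LowCPU|CertWillSoonExpire|KeepAliveErrors)
      echo "warning" ;;
    ClusterStarted|NodeStarted|VoteInProgress)
      echo "info" ;;
    *)
      # Anything not in the documented list is reported as unknown.
      echo "unknown" ;;
  esac
}
```

For example, alert_severity LowCapacity prints warning, which a wrapper script could use to decide whether to page someone or only log the event.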
Node state flags are also exposed via Prometheus metrics - for details, see:
Live Performance Monitoring
ais performance (alias ais show performance) exposes five sub‑commands. The two most used are throughput and latency.
Key Flags
See cli-performance.md for sub-command specifics.
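A small wrapper can make repeated sampling of the two most used sub-commands less error-prone. This is a sketch only: perf_watch is a hypothetical function name, the defaults are arbitrary, and it assumes the --refresh and --count flags are accepted by the sub-command in question (confirm with --help).

```shell
# Hypothetical wrapper: sample one "ais performance" sub-command at a fixed
# interval for a fixed number of iterations.
#   $1 = sub-command (throughput or latency; default: throughput)
#   $2 = refresh interval (default: 10s)
#   $3 = number of iterations (default: 6)
perf_watch() {
  ais performance "${1:-throughput}" --refresh "${2:-10s}" --count "${3:-6}"
}
```

Calling perf_watch latency 5s 3 would run the latency view three times at five-second intervals, assuming those flags behave as described.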
Log Management
For more details on log configuration and analysis, see Observability: Logs.
Common Command Examples
Here are some frequently used command combinations for everyday operations:
Flags such as --refresh <duration>, --count <n>, --regex <re>, --no-headers, and --units are accepted by most monitoring commands; see --help for the definitive list.
Best Practices
- Regular Health Checks: Run ais show cluster and ais storage summary daily to ensure cluster health and capacity
- Performance Baselines: Establish baseline performance with ais performance show after initial deployment
- Monitoring Script: Create a shell script with key monitoring commands for daily checks
- Alert Integration: Pipe CLI output to monitoring systems for automated alerting
- Log Collection: To collect logs, integrate with a Kubernetes monitoring stack or (at least) use ais cluster download-logs
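The health-check and monitoring-script practices above could be combined along these lines. The function name daily_check, the default directory, and the file naming are all illustrative assumptions; only the two ais commands come from this page.

```shell
# Hypothetical daily check: run the two health commands named above and keep
# each output in a dated file.
#   $1 = output directory (default: ./ais-health)
daily_check() {
  out_dir="${1:-./ais-health}"
  stamp="$(date +%Y-%m-%d)"
  mkdir -p "$out_dir" || return 1
  # Stop at the first failing command so a partial snapshot is not mistaken
  # for a complete one.
  ais show cluster    > "$out_dir/cluster-$stamp.txt" || return 1
  ais storage summary > "$out_dir/storage-$stamp.txt" || return 1
  echo "$out_dir"
}
```

Run from cron or a Kubernetes CronJob, the dated files give a simple history to compare against when capacity or node state changes.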
Troubleshooting Common Issues
CLI Resources
- ais help
- Reference guide
- Monitoring
- Cluster and node management
- Mountpath (disk) management
- Attach, detach, and monitor remote clusters
- Start, stop, and monitor downloads
- Distributed shuffle
- User account and access management
- Jobs
- AIS CLI Reference