AIStore & Amazon S3 Compatibility
AIStore (AIS) is a lightweight, distributed object storage system designed to scale linearly with each added storage node. It provides a uniform API across various storage backends while maintaining high performance for AI/ML and data analytics workloads.
AIS integrates with Amazon S3 on three fronts:
- Backend storage – via the backend provider abstraction, AIStore can be utilized to access (and cache or reliably store in-cluster) a remote cloud bucket such as s3://my-bucket (aws:// is accepted as an alias). This provides seamless access to existing S3 data.
- Front-end compatibility – every gateway speaks the S3 REST API. The default endpoint is http(s)://gw-host:port/s3, but you can enable the S3-API-via-Root feature flag to serve requests at the cluster root (http(s)://gw-host:port/). The same API works uniformly across all bucket types – native ais://, cloud-backed s3://, gs://, and more.
- Presigned request offload – AIS can receive a presigned S3 URL, execute it, and store the resulting object in the cluster. This lets you leverage S3's authentication while using AIS for storage.
Which interface should I use?
AIS exposes a pure S3 surface for seamless compatibility and a native API for advanced, cluster‑aware workloads. The table below helps decide which path fits your scenario:
Table of Contents
- Quick Start
- Configuring Clients
- Using s3cmd with AIS
- Supported Operations
- Use Native Bucket Inventory
- Deleting nonexistent object
- Compatibility Matrix
- Boto3 Examples
- FAQs & Troubleshooting
- Further reading
Environment assumption – Local Playground
The CLI examples below use localhost:8080, which is the default endpoint when running AIS in the Local Playground. For other deployment modes (including Kubernetes Playground, Docker Compose, bare-metal cluster, or Kubernetes for production deployments), replace the host:port with any AIS gateway endpoint. See Deployment Options and the main project Features list for a broader overview.
Quick Start
Quick start with aws CLI
Quick start with s3cmd
One‑liner (HTTP):
Tip: use a cluster-specific .s3cfg so you can drop the --host* flags. See the Example .s3cfg section below.
Configuring Clients
Finding the AIS endpoint
Choose any gateway’s host:port and append /s3, e.g. 10.10.0.1:51080/s3. All gateways accept reads and writes, so you can connect to any of them.
In fact, AIS gateways are completely equivalent, API-wise.
Checksum considerations
Amazon S3’s ETag is MD5 (or a multipart hash); AIS defaults to xxhash for better performance. To avoid client mismatch warnings, set MD5 per bucket:
Setting the checksum type to MD5 ensures compatibility with S3 clients that validate checksums, though it comes with a minor performance cost compared to xxhash.
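To see what checksum-validating S3 clients actually compare against, the single-part ETag can be reproduced locally with the standard library (the sample content is arbitrary):

```python
import hashlib

content = b"hello world"

# For single-part uploads, S3 reports the ETag as the quoted hex MD5 of the body.
etag = '"%s"' % hashlib.md5(content).hexdigest()
print(etag)  # "5eb63bbbe01eeed093cb22bb8f5acdc3"
```

With the bucket's checksum type set to MD5, AIS can serve the same value, so clients that verify downloads byte-for-byte stay happy.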
HTTPS vs HTTP
By default, AIS uses HTTP, while many S3 clients expect HTTPS. Either enable TLS in ais.json (net.http.use_https=true), or pass your client's --no-ssl/--insecure flags to keep using plain HTTP.
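A sketch of the corresponding ais.json fragment (field names follow the net.http.use_https option named above; surrounding configuration elided):

```
"net": {
    "http": {
        "use_https": true
    }
}
```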
Using s3cmd with AIS
Interactive s3cmd --configure transcript
During the test, enter your AIS endpoint when prompted:
Example .s3cfg
Edit your ~/.s3cfg file to include these lines (replace with your actual gateway endpoint):
This configuration allows you to run s3cmd commands without having to specify the host parameters each time.
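A hedged example of the relevant lines (path-style addressing; localhost:8080 is the Local Playground default, and the keys shown are standard s3cmd options):

```
# ~/.s3cfg (relevant lines only; replace localhost:8080 with your gateway)
host_base = localhost:8080/s3
host_bucket = localhost:8080/s3
use_https = False
```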
Multipart uploads with s3cmd
For large files, use multipart uploads to improve reliability and performance:
The optimal chunk size depends on your network conditions and file size, but 8-16MB chunks work well for most cases.
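The arithmetic behind chunking is straightforward; a sketch for an assumed 100 MiB file with 8 MiB chunks (matching s3cmd's --multipart-chunk-size-mb=8):

```python
import math

chunk = 8 * 1024 * 1024   # 8 MiB, as in --multipart-chunk-size-mb=8
size = 100 * 1024 * 1024  # assumed file size: 100 MiB

nparts = math.ceil(size / chunk)         # number of parts uploaded
last_part = size - (nparts - 1) * chunk  # the final (short) part
print(nparts, last_part)  # 13 parts; the last part is 4 MiB
```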
Authentication (JWT) tips
When AIStore Authentication is enabled, each request must include a JWT Bearer token in the Authorization header.
However, s3cmd’s built-in AWS signer overwrites any --add-header values, so you need to patch the client directly.
Edit the sign() method of the S3Request class in s3cmd/S3/S3.py.
Add the following line to override the Authorization header:
Replace <token> with your actual JWT token. This modification ensures the token is included in every request.
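s3cmd aside, the requirement itself is easy to satisfy from any HTTP client; a stdlib sketch (the endpoint, bucket/object, and token value are placeholders):

```python
import urllib.request

token = "my.jwt.token"  # placeholder: your actual JWT
req = urllib.request.Request(
    "http://localhost:8080/s3/my-bucket/obj.txt",  # placeholder bucket/object
    headers={"Authorization": "Bearer " + token},
)
# urllib.request.urlopen(req) would send the authenticated request
# to a running AIS gateway.
```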
Supported Operations
PUT / GET / HEAD
Regular verbs work with aws, s3cmd, or the native ais CLI:
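Under the hood, all three verbs use path-style S3 URLs; a stdlib sketch of the request shapes (the endpoint and the bucket/object names are placeholders):

```python
import urllib.request

base = "http://localhost:8080/s3"  # Local Playground gateway, /s3 root

# Path-style addressing: /s3/<bucket>/<object>
put_req = urllib.request.Request(base + "/nnn/obj.txt", data=b"hello", method="PUT")
get_req = urllib.request.Request(base + "/nnn/obj.txt", method="GET")
head_req = urllib.request.Request(base + "/nnn/obj.txt", method="HEAD")
# urllib.request.urlopen(...) would send these against a running cluster.
```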
Range reads
S3 API supports byte range requests for partial object downloads:
This would download only the first 100 bytes of the file.
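The Range header uses inclusive byte positions, so the first 100 bytes are bytes=0-99; a stdlib sketch (endpoint and object name are placeholders):

```python
import urllib.request

# Byte-range request for the first 100 bytes (positions 0 through 99, inclusive).
req = urllib.request.Request(
    "http://localhost:8080/s3/my-bucket/file.bin",  # placeholder object
    headers={"Range": "bytes=0-99"},
)

# Semantically, a successful 206 response carries the same bytes as:
data = bytes(range(200))  # stand-in for the full object
first_100 = data[:100]
```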
Multipart uploads with aws CLI
Example parts.json:
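The file handed to aws s3api complete-multipart-upload lists each part's ETag and number under a top-level Parts key; a sketch of generating it (the ETag values are placeholders, taken in practice from each upload-part response):

```python
import json

parts_doc = {
    "Parts": [
        {"ETag": '"<etag-of-part-1>"', "PartNumber": 1},  # placeholder ETags
        {"ETag": '"<etag-of-part-2>"', "PartNumber": 2},
    ]
}
parts_json = json.dumps(parts_doc, indent=2)  # write this out as parts.json
print(parts_json)
```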
Presigned S3 requests
Presigned URLs allow temporary access to objects without sharing credentials:
- Enable the feature:
- Generate the URL (typically done on the system with AWS credentials):
- Replace the host with AIS_ENDPOINT/s3 and add the header when using path style:
This allows AIS to handle the authenticated S3 request on behalf of the client.
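The host-rewrite step can be sketched with the standard library; everything here is a placeholder (AWS bucket, signature query, and the assumed Local Playground endpoint), and the additional path-style header is not shown:

```python
from urllib.parse import urlsplit, urlunsplit

# A presigned URL as generated against AWS (signature params abbreviated):
presigned = ("https://my-bucket.s3.amazonaws.com/obj.txt"
             "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Signature=abc123")

u = urlsplit(presigned)
# Virtual-hosted -> path style under the AIS gateway: the bucket moves into
# the path, and the signed query string is preserved verbatim.
bucket = u.netloc.split(".")[0]
rewritten = urlunsplit(("http", "localhost:8080", "/s3/" + bucket + u.path,
                        u.query, ""))
print(rewritten)
```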
Use Native Bucket Inventory
The older S3-specific inventory integration has been removed in v4.4.
Use native bucket inventory (NBI) for fast, inventory-backed listing of large remote buckets, including (but not limited to) S3 buckets.
Note that S3-compatible clients may request NBI-backed listing via Ais-Bucket-Inventory: true, and optionally select a specific inventory via Ais-Inv-Name.
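A stdlib sketch of attaching those headers to a list request (the endpoint and inventory name are placeholders):

```python
import urllib.request

# List a bucket via the inventory-backed path, using the headers named above.
req = urllib.request.Request(
    "http://localhost:8080/s3/my-bucket",  # placeholder gateway and bucket
    headers={
        "Ais-Bucket-Inventory": "true",  # opt in to NBI-backed listing
        "Ais-Inv-Name": "my-inventory",  # optional: select a specific inventory
    },
)
```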
Deleting nonexistent object
AIStore does not emulate S3’s silent-success delete semantics. In AIS, deleting a missing object is reported as “not found” - with a single exception: when the bucket has an S3 backend. This does not violate HTTP idempotency: a repeated DELETE still has the same intended effect on server state, even though the response differs.
When the bucket does have an S3 backend and the object is missing, we return whatever the backend gives us - which, for S3, is 204 (no error).
Apart from this single exception, returning an error on delete of a nonexistent object is an intentional semantic choice for consistency with the rest of AIS.
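Clients that want uniform, S3-style delete behavior regardless of backend can normalize on their side; a hedged sketch:

```python
def delete_succeeded(status: int) -> bool:
    """Treat 'deleted' (200/204) and 'already absent' (404) alike,
    mirroring S3's silent-success delete semantics client-side."""
    return status in (200, 204, 404)
```

This keeps retry loops idempotent from the caller's point of view: a repeated DELETE is a success whether the first attempt removed the object or not.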
Compatibility Matrix
Not yet supported: Regions, CORS, Website hosting, CloudFront; full ACL parity (AIS uses its own ACL model).
Boto3 Examples
Python applications can use Boto3 (the AWS SDK for Python) to connect to AIStore. Since AIStore implements S3 API compatibility, most standard Boto3 S3 operations work with minimal changes.
Prerequisites
For Boto3 to work with AIStore, you need to patch Boto3’s redirect handling:
This patch modifies Boto3’s HTTP client behavior to handle AIStore’s redirect-based load balancing. For details, see the Boto3 compatibility documentation.
Client Initialization
Basic Bucket Operations
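Hedged helper sketches over a boto3-style client (standard boto3 method names; the s3 argument is whatever boto3.client("s3", endpoint_url=...) returned):

```python
def list_bucket_names(s3):
    """Names of all buckets visible through the AIS gateway."""
    return [b["Name"] for b in s3.list_buckets().get("Buckets", [])]

def ensure_bucket(s3, name):
    """Create the bucket if it does not already exist."""
    try:
        s3.head_bucket(Bucket=name)
    except Exception:  # head_bucket raises ClientError on a missing bucket
        s3.create_bucket(Bucket=name)
```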
Object Operations
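Object-level sketches in the same hedged style (standard boto3 calls; the s3 argument is a boto3 S3 client):

```python
def put_text(s3, bucket, key, text):
    """PUT a small string object."""
    s3.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))

def get_text(s3, bucket, key):
    """GET an object body back as a string."""
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")

def exists(s3, bucket, key):
    """HEAD an object; True if it exists."""
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except Exception:  # boto3 raises ClientError on 404
        return False
```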
Multipart Upload Example
For large files, you can use multipart uploads:
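A sketch built from the standard boto3 multipart calls (the 8 MiB part size is an assumption; abort-on-failure handling is omitted for brevity):

```python
PART_SIZE = 8 * 1024 * 1024  # assumption: 8 MiB parts

def multipart_upload(s3, bucket, key, data, part_size=PART_SIZE):
    """Upload `data` (bytes) in parts via create/upload/complete-multipart."""
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    for i in range(0, len(data), part_size):
        num = i // part_size + 1
        resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                              PartNumber=num, Body=data[i:i + part_size])
        parts.append({"ETag": resp["ETag"], "PartNumber": num})
    s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                 MultipartUpload={"Parts": parts})
    return parts
```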