Bucket operations | NVIDIA AIStore

Background and Introduction

A bucket is a named container for objects - monolithic files or chunked representations - with associated metadata. It is the fundamental unit of data organization and data management.

AIS buckets are categorized by their provider and origin. Native ais:// buckets managed by this cluster are always created explicitly (via ais create or the respective Go and/or Python APIs).

Remote buckets (including s3://, gs://, etc., and ais:// buckets in remote AIS clusters) are usually discovered and auto-added on-the-fly on first access.

In a cluster, every bucket is assigned a unique, cluster-wide bucket ID (BID). Same-name remote buckets with different namespaces get different IDs. Every object a) belongs to exactly one bucket and b) is identified by a unique name within that bucket.

Bucket properties define data protection (checksums, mirroring, erasure coding), chunked representation, versioning and synchronization with remote sources, access control, backend linkage, feature flags, rate-limit settings, and more.

For types of supported buckets (AIS, Cloud, remote AIS, etc.), bucket identity, properties, lifecycle, and associated policies, storage services and usage examples, see the comprehensive:

AIS Buckets: Design and Operations

It is easy to see all CLI operations on buckets:

$ ais bucket <TAB-TAB>

ls         validate   evict      show       cp         etl      rm
summary    lru        prefetch   create     archive    mv       props

For convenience, a few of the most popular verbs are also aliased:

$ ais alias | grep bucket
cp              bucket cp
create          bucket create
evict           bucket evict
ls              bucket ls
rmb             bucket rm

Create bucket

ais create BUCKET [BUCKET...]

Create bucket(s).

$ ais create --help
NAME:
   ais create - (alias for "bucket create") Create AIS buckets or explicitly attach remote buckets with non-default credentials/properties.
     Normally, AIS auto-adds remote buckets on first access (ls/get/put): when a user references a new bucket,
     AIS looks it up behind the scenes, confirms its existence and accessibility, and "on-the-fly" updates its
     cluster-wide global (BMD) metadata containing bucket definitions, management policies, and properties.
     Use this command when you need to:
       1) create an ais:// bucket in this cluster;
       2) create a bucket in a remote AIS cluster (e.g., 'ais://@remais/BUCKET');
       3) set up a cloud bucket with a custom profile and/or endpoint/region;
       4) set bucket properties before first access;
       5) attach multiple same-name cloud buckets under different namespaces (e.g., 's3://#ns1/bucket', 's3://#ns2/bucket');
       6) and finally, register a cloud bucket that is not (yet) accessible (advanced-usage '--skip-lookup' option).
   Examples:
     - ais create ais://mybucket                                                                              - create AIS bucket 'mybucket' (must be done explicitly);
     - ais create ais://@remais/BUCKET                                                                        - create a bucket in a remote AIS cluster referenced by the cluster's alias or UUID;
     - ais create s3://mybucket                                                                               - add existing cloud (S3) bucket; normally AIS would auto-add it on first access;
     - ais create s3://mybucket --props='extra.aws.profile=prod extra.aws.multipart_size=333M'                - add S3 bucket using a non-default cloud profile;
     - ais create s3://#myaccount/mybucket --props='extra.aws.profile=swift extra.aws.endpoint=$S3_ENDPOINT'  - attach S3-compatible bucket via namespace '#myaccount';
     - ais create oc://#phx/mybucket --props='extra.oci.region=us-phoenix-1'                                  - add OCI bucket using a non-default region and namespace '#phx';
     - ais create s3://mybucket --skip-lookup --props='extra.aws.profile=...'                                 - advanced: register bucket without verifying its existence/accessibility (use with care);
     - ais create gs://mybucket --skip-lookup --props='extra.gcp.application_creds=/mnt/vault/sa.json'        - GCS bucket with per-bucket service-account credentials.

USAGE:
   ais create BUCKET [BUCKET...] [command options]

OPTIONS:
   force,f       Force execution of the command (caution: advanced usage only)
   ignore-error  Ignore "soft" failures such as "bucket already exists", etc.
   props         Create bucket with the specified (non-default) properties, e.g.:
                 * ais create ais://mmm --props="versioning.validate_warm_get=false versioning.synchronize=true"
                 * ais create ais://nnn --props='mirror.enabled=true mirror.copies=4 checksum.type=md5'
                 * ais create s3://bbb --props='extra.cloud.profile=prod extra.cloud.endpoint=https://s3.example.com'
                 Tips:
                   1) Use '--props' to override properties that a new bucket would normally inherit from cluster config at creation time.
                   2) Use '--props' to set up an existing cloud bucket with a custom profile and/or custom endpoint/region.
                 See also: 'ais bucket props show' and 'ais bucket props set'
   skip-lookup   Do not execute HEAD(bucket) request to lookup remote bucket and its properties; possible usage scenarios include:
                  1) adding remote bucket to aistore without first checking the bucket's accessibility
                     (e.g., to configure the bucket's aistore properties with alternative security profile and/or endpoint)
                  2) listing public-access Cloud buckets where certain operations (e.g., 'HEAD(bucket)') may be disallowed
   help, h       Show help

Examples

Create AIS bucket

Create buckets bucket_name1 and bucket_name2, both with AIS provider.

$ ais create ais://bucket_name1 ais://bucket_name2
"ais://bucket_name1" bucket created
"ais://bucket_name2" bucket created

Create AIS bucket in local namespace

Create bucket bucket_name in ml namespace.

$ ais create ais://#ml/bucket_name
"ais://#ml/bucket_name" bucket created

Create bucket in remote AIS cluster

Create bucket bucket_name in the global namespace of the remote AIS cluster with UUID Bghort1l.

$ ais create ais://@Bghort1l/bucket_name
"ais://@Bghort1l/bucket_name" bucket created

Create bucket bucket_name in the ml namespace of the remote AIS cluster with UUID Bghort1l.

$ ais create ais://@Bghort1l#ml/bucket_name
"ais://@Bghort1l#ml/bucket_name" bucket created

Create bucket with custom properties

Create bucket bucket_name with custom properties specified.

$ # Key-value format
$ ais create ais://@Bghort1l/bucket_name --props="mirror.enabled=true mirror.copies=2"
"ais://@Bghort1l/bucket_name" bucket created
$
$ # JSON format
$ ais create ais://@Bghort1l/bucket_name --props='{"versioning": {"enabled": true, "validate_warm_get": true}}'
"ais://@Bghort1l/bucket_name" bucket created
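The two `--props` formats above carry the same kind of information: dotted keys in the key-value form correspond to nested objects in the JSON form. A minimal Python sketch of that correspondence follows. The helper name `props_to_nested` and the value-typing rules (booleans, integers, everything else kept as a string) are assumptions made for illustration, not the CLI's actual parser.

```python
import json


def props_to_nested(props: str) -> dict:
    """Convert 'a.b=1 a.c=true' into {'a': {'b': 1, 'c': True}} (illustrative only)."""
    result: dict = {}
    for pair in props.split():
        key, _, raw = pair.partition("=")
        # Assumed typing rules: true/false -> bool, digits -> int, otherwise string.
        if raw in ("true", "false"):
            value = raw == "true"
        elif raw.isdigit():
            value = int(raw)
        else:
            value = raw
        # Walk (and create) one nested dict per dotted key component.
        node = result
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


nested = props_to_nested("mirror.enabled=true mirror.copies=2")
print(json.dumps(nested))  # {"mirror": {"enabled": true, "copies": 2}}
```

Either spelling can be used on the command line; the key-value form is usually shorter for a handful of settings, while JSON is easier to generate from scripts.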

Incorrect bucket creation

$ ais create aws://bucket_name
Create bucket "aws://bucket_name" failed: creating a bucket for any of the cloud or HTTP providers is not supported

Delete bucket

ais bucket rm BUCKET [BUCKET...]

Delete one or more AIS buckets.

Examples

Remove AIS buckets

Remove AIS buckets bucket_name1 and bucket_name2.

$ ais bucket rm ais://bucket_name1 ais://bucket_name2
"ais://bucket_name1" bucket destroyed
"ais://bucket_name2" bucket destroyed

Remove AIS bucket in local namespace

Remove bucket bucket_name from ml namespace.

$ ais bucket rm ais://#ml/bucket_name
"ais://#ml/bucket_name" bucket destroyed

Remove bucket in remote AIS cluster

Remove bucket bucket_name from the global namespace of the remote AIS cluster with UUID Bghort1l.

$ ais bucket rm ais://@Bghort1l/bucket_name
"ais://@Bghort1l/bucket_name" bucket destroyed

Remove bucket bucket_name from the ml namespace of the remote AIS cluster with UUID Bghort1l.

$ ais bucket rm ais://@Bghort1l#ml/bucket_name
"ais://@Bghort1l#ml/bucket_name" bucket destroyed

Incorrect bucket removal

Removing remote buckets is not supported.

$ ais bucket rm aws://bucket_name
Operation "destroy-bck" is not supported by "aws://bucket_name"

List buckets

ais ls PROVIDER:[//BUCKET_NAME] [command options]

Notice the optional [//BUCKET_NAME]. When there’s no bucket, ais ls will list buckets. Otherwise, it’ll list objects.

Usage

$ ais ls --help
NAME:
   ais ls - (alias for "bucket ls") List buckets, objects in buckets, and files in (.tar, .tgz, .tar.gz, .zip, .tar.lz4)-formatted objects,
   e.g.:
     * ais ls                                              - list all buckets in a cluster (all providers);
     * ais ls ais://abc -props name,size,copies,location   - list objects with only these specific properties;
     * ais ls ais://abc -props all                         - list objects with all available properties;
     * ais ls ais://abc --page-size 20 --refresh 3s        - list large bucket (20 items per page), progress every 3s;
     * ais ls ais://abc --page-size 20 --refresh 3         - same as above;
     * ais ls ais                                          - list all ais buckets;
     * ais ls s3                                           - list all s3 buckets present in the cluster;
     * ais ls s3 --all                                     - list all s3 buckets (both in-cluster and remote).
   list archive contents:
     * ais ls ais://abc/sample.tar --archive   - list files inside a tar archive;
   list in pages (continues until '--max-pages', '--limit', Ctrl-C, or end of bucket):
     * ais ls s3://abc --paged --limit 1234000        - limited paged output (1234 pages), with default properties;
     * ais ls s3://abc --paged --limit 1234000 --nr   - same as above, non-recursively (skips nested directories);
   with template, regex, and/or prefix:
     * ais ls gs: --regex "^abc" --all                          - list all accessible GCP buckets with names starting with "abc";
     * ais ls ais://abc --regex "\.md$" --props size,checksum   - list markdown files with size and checksum;
     * ais ls gs://abc --template images/                       - list all objects from virtual subdirectory "images";
     * ais ls gs://abc --prefix images/                         - same as above (for more examples, see '--template' below);
     * ais ls gs://abc/images/                                  - same as above.
   with in-cluster vs remote content comparison (diff):
     * ais ls s3://abc --check-versions            - for each remote object: check for identical in-cluster copy
       →                                             and show missing objects;
     * ais ls s3://abc --check-versions --cached   - for each in-cluster object: check for identical remote copy
       →                                             and show deleted objects.
   with summary (bucket sizes and numbers of objects):
     * ais ls ais://nnn --summary --prefix=aaa/bbb   - summarize objects matching the given prefix;
     * ais ls ais://nnn/aaa/bbb --summary            - same as above;
     * ais ls az://azure-bucket --count-only         - fastest way to count objects in a bucket;
     * ais ls s3 --summary                           - for each s3 bucket: print object count and total size;
     * ais ls s3 --summary --all                     - summary report for all s3 buckets including remote/non-present;
     * ais ls s3 --summary --all --dont-add          - same, without adding non-present buckets to cluster metadata.
...
...

Assorted options

The options are numerous. Here’s a non-exhaustive list (for the most recent version, run ais ls --help):

OPTIONS:
   --all                depending on the context:
                        - all objects in a given bucket, including misplaced and copies, or
                        - all buckets, including accessible (visible) remote buckets that are _not present_ in the cluster
   --cached             list only those objects from a remote bucket that are present ("cached")
   --name-only          faster request to retrieve only the names of objects (if defined, '--props' flag will be ignored)
   --props value        comma-separated list of object properties including name, size, version, copies and more; e.g.:
                        --props all
                        --props name,size,cached
                        --props "ec, copies, custom, location"
   --regex value        regular expression; use it to match either bucket names or objects in a given bucket, e.g.:
                        ais ls --regex "(m|n)"         - match buckets such as ais://nnn, s3://mmm, etc.;
                        ais ls ais://nnn --regex "^A"  - match object names starting with letter A
   --summary            show object numbers, bucket sizes, and used capacity; applies _only_ to buckets and objects that are _present_ in the cluster
   --units value        show statistics and/or parse command-line specified sizes using one of the following _units of measurement_:
                        iec - IEC format, e.g.: KiB, MiB, GiB (default)
                        si  - SI (metric) format, e.g.: KB, MB, GB
                        raw - do not convert to (or from) human-readable format
   --no-headers, -H     display tables without headers
   --no-footers         display tables without footers
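To make the `--units` choices concrete, here is a small Python sketch that renders one byte count in each of the three modes. The function `format_size` is hypothetical, and the two-decimal formatting is an assumption modeled on the sample listings in this document (for example, `16.00KiB` and `959B`); the CLI's exact rounding may differ.

```python
def format_size(num_bytes: int, units: str = "iec") -> str:
    """Render a byte count as iec (base 1024), si (base 1000), or raw."""
    if units == "raw":
        return str(num_bytes)
    if units == "iec":
        base, suffixes = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    elif units == "si":
        base, suffixes = 1000, ["B", "KB", "MB", "GB", "TB"]
    else:
        raise ValueError(f"unknown units: {units!r}")
    value = float(num_bytes)
    for suffix in suffixes:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < base or suffix == suffixes[-1]:
            break
        value /= base
    # Plain bytes are shown without decimals; larger units with two decimals.
    return f"{num_bytes}B" if suffix == "B" else f"{value:.2f}{suffix}"


for mode in ("iec", "si", "raw"):
    print(mode, format_size(16384, mode))
```

The same 16,384-byte object therefore reads as `16.00KiB` in IEC units and `16.38KB` in SI units, which is worth keeping in mind when comparing listings produced with different `--units` settings.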

`ais ls --regex "ngn*"`

List all buckets whose names match the regular expression ngn*.
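Note that `ngn*` is a regular expression, not a shell glob: it reads as "ng" followed by zero or more "n" characters. The Python sketch below shows which bucket names such a pattern would select. The bucket names are made up, and unanchored (substring) matching is an assumption about the CLI's behavior rather than something stated above; Python's `re` module may also differ from the CLI's regex engine in corner cases.

```python
import re

# Hypothetical bucket names, for illustration only.
buckets = ["ngn", "ngnn-data", "training", "images"]

pattern = re.compile(r"ngn*")  # "ng" followed by zero or more "n"

# re.search matches anywhere in the string (unanchored), so "training" also
# qualifies because it contains "ng".
matched = [name for name in buckets if pattern.search(name)]
print(matched)  # ['ngn', 'ngnn-data', 'training']
```

To restrict the match to names that begin with the pattern, anchor it explicitly, as in the `--regex "^abc"` example from the inline help.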

`ais ls aws:` or (same) `ais ls s3`

List all existing buckets for the specific provider.

`ais ls aws --all` or (same) `ais ls s3: --all`

List all buckets that the cluster can “see”, including those that are not present in the cluster.

`ais ls ais://` or (same) `ais ls ais`

List all AIS buckets.

`ais ls ais://#name`

List all buckets for the ais provider and name namespace.

`ais ls ais://@uuid#namespace`

List all remote AIS buckets in the uuid#namespace namespace. Note that:

the uuid must be the remote cluster's UUID (or its alias);
the namespace is the optional name of a namespace in that remote cluster.

As a rule of thumb, when a (logical) #namespace in the bucket’s name is omitted we use the global namespace that always exists.
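The naming forms listed above (`PROVIDER://`, `#namespace`, `@uuid#namespace`) can be summarized with a small parser. The following Python sketch is illustrative only: the `BucketRef` type, the `parse_bucket` function, and the use of the literal string `global` to stand in for an omitted namespace are assumptions chosen for readability, not AIS internals.

```python
import re
from typing import NamedTuple, Optional


class BucketRef(NamedTuple):
    provider: str
    cluster: Optional[str]  # remote cluster UUID or alias, if any
    namespace: str          # "global" stands in for an omitted namespace
    name: str               # empty when only a provider/namespace is given


_URI = re.compile(
    r"^(?P<provider>[a-z0-9]+)://"
    r"(?:@(?P<cluster>[^#/]+))?"    # optional @uuid (or alias)
    r"(?:#(?P<namespace>[^/]+))?"   # optional #namespace
    r"/?(?P<name>[^/]*)"            # optional bucket name
)


def parse_bucket(uri: str) -> BucketRef:
    m = _URI.match(uri)
    if m is None:
        raise ValueError(f"not a bucket URI: {uri!r}")
    return BucketRef(
        provider=m["provider"],
        cluster=m["cluster"],
        namespace=m["namespace"] or "global",
        name=m["name"],
    )


for uri in ("ais://bucket_name", "ais://#ml/bucket_name",
            "ais://@Bghort1l/bucket_name", "ais://@Bghort1l#ml/bucket_name"):
    print(uri, "->", parse_bucket(uri))
```

In this sketch, a reference with an empty `name` (for example `ais://#name`) corresponds to listing buckets, while a non-empty `name` corresponds to addressing one bucket.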

List objects

ais ls is one of those commands that only keeps growing, in terms of supported options and capabilities.

The command:

ais ls PROVIDER:[//BUCKET_NAME] [command options]

can conveniently list buckets (with or without “summarizing” object counts and sizes) and objects.

Notice the optional [//BUCKET_NAME]. When there’s no bucket, ais ls will list buckets. Otherwise, it’ll list objects.

The command’s inline help is also quite extensive, with (inline) examples followed by numerous supported options:

$ ais ls --help
NAME:
   ais ls - (alias for "bucket ls") List buckets, objects in buckets, and files in (.tar, .tgz, .tar.gz, .zip, .tar.lz4)-formatted objects,
   e.g.:
     * ais ls                                              - list all buckets in a cluster (all providers);
     * ais ls ais://abc -props name,size,copies,location   - list objects with only these specific properties;
     * ais ls ais://abc -props all                         - list objects with all available properties;
     * ais ls ais://abc --page-size 20 --refresh 3s        - list large bucket (20 items per page), progress every 3s;
     * ais ls ais://abc --page-size 20 --refresh 3         - same as above;
     * ais ls ais                                          - list all ais buckets;
     * ais ls s3                                           - list all s3 buckets present in the cluster;
     * ais ls s3 --all                                     - list all s3 buckets (both in-cluster and remote).
   list archive contents:
     * ais ls ais://abc/sample.tar --archive   - list files inside a tar archive;
   list in pages (continues until '--max-pages', '--limit', Ctrl-C, or end of bucket):
     * ais ls s3://abc --paged --limit 1234000        - limited paged output (1234 pages), with default properties;
     * ais ls s3://abc --paged --limit 1234000 --nr   - same as above, non-recursively (skips nested directories);
   with template, regex, and/or prefix:
     * ais ls gs: --regex "^abc" --all                          - list all accessible GCP buckets with names starting with "abc";
     * ais ls ais://abc --regex "\.md$" --props size,checksum   - list markdown files with size and checksum;
     * ais ls gs://abc --template images/                       - list all objects from virtual subdirectory "images";
     * ais ls gs://abc --prefix images/                         - same as above (for more examples, see '--template' below);
     * ais ls gs://abc/images/                                  - same as above.
   with in-cluster vs remote content comparison (diff):
     * ais ls s3://abc --check-versions            - for each remote object: check for identical in-cluster copy
       →                                             and show missing objects;
     * ais ls s3://abc --check-versions --cached   - for each in-cluster object: check for identical remote copy
       →                                             and show deleted objects.
   with summary (bucket sizes and numbers of objects):
     * ais ls ais://nnn --summary --prefix=aaa/bbb   - summarize objects matching the given prefix;
     * ais ls ais://nnn/aaa/bbb --summary            - same as above;
     * ais ls az://azure-bucket --count-only         - fastest way to count objects in a bucket;
     * ais ls s3 --summary                           - for each s3 bucket: print object count and total size;
     * ais ls s3 --summary --all                     - summary report for all s3 buckets including remote/non-present;
     * ais ls s3 --summary --all --dont-add          - same, without adding non-present buckets to cluster metadata.

USAGE:
   ais ls [BUCKET[/PREFIX]] [PROVIDER] [command options]

OPTIONS:
   --all                  Depending on the context, list:
                          - all buckets, including accessible (visible) remote buckets that are not in-cluster
                          - all objects in a given accessible (visible) bucket, including remote objects and misplaced copies
   --archive              List archived content (see docs/archive.md for details)
   --cached               Only list in-cluster objects, i.e., objects from the respective remote bucket that are present ("cached") in the cluster
   --count-only           Print only the resulting number of listed objects and elapsed time
   --diff                 Perform a bidirectional diff between in-cluster and remote content, which further entails:
                          - detecting remote version changes (a.k.a. out-of-band updates), and
                          - remotely deleted objects (out-of-band deletions (*));
                            the option requires remote backends supporting some form of versioning (e.g., object version, checksum, and/or ETag);
                          see related:
                               (*) options: --cached; --latest
                               commands:    'ais get --latest'; 'ais cp --sync'; 'ais prefetch --latest'
   --dont-add             List remote bucket without adding it to cluster's metadata - e.g.:
                            - let's say, s3://abc is accessible but not present in the cluster (e.g., 'ais ls' returns error);
                            - then, if we ask aistore to list remote buckets: `ais ls s3://abc --all'
                              the bucket will be added (in effect, it'll be created);
                            - to prevent this from happening, either use this '--dont-add' flag or run 'ais evict' command later
   --dont-wait            When _summarizing_ buckets do not wait for the respective job to finish -
                          use the job's UUID to query the results interactively
   --inv-id value         Bucket inventory ID (optional; by default, we use bucket name as the bucket's inventory ID)
   --inv-name value       Bucket inventory name (optional; system default name is '.inventory')
   --inventory            List objects using _bucket inventory_ (docs/s3compat.md); requires s3:// backend; will provide significant performance
                          boost when used with very large s3 buckets; e.g. usage:
                            1) 'ais ls s3://abc --inventory'
                            2) 'ais ls s3://abc --inventory --paged --prefix=subdir/'
                          (see also: docs/s3compat.md)
   --limit value          The maximum number of objects to list, get, or otherwise handle (0 - unlimited; see also '--max-pages'),
                          e.g.:
                          - 'ais ls gs://abc/dir --limit 1234 --cached --props size,custom,atime'  - list no more than 1234 objects
                          - 'ais get gs://abc /dev/null --prefix dir --limit 1234'                 - get --/--
                          - 'ais scrub gs://abc/dir --limit 1234'                                  - scrub --/-- (default: 0)
   --max-pages value      Maximum number of pages to display (see also '--page-size' and '--limit')
                          e.g.: 'ais ls az://abc --paged --page-size 123 --max-pages 7 (default: 0)
   --name-only            Faster request to retrieve only the names of objects (if defined, '--props' flag will be ignored)
   --no-dirs              Do not return virtual subdirectories (applies to remote buckets only)
   --no-footers, -F       Display tables without footers
   --no-headers, -H       Display tables without headers
   --non-recursive, --nr  Non-recursive operation, e.g.:
                          - 'ais ls gs://bucket/prefix --nr'   - list objects and/or virtual subdirectories with names starting with the specified prefix;
                          - 'ais ls gs://bucket/prefix/ --nr'  - list contained objects and/or immediately nested virtual subdirectories _without_ recursing into the latter;
                          - 'ais prefetch s3://bck/abcd --nr'  - prefetch a single named object (see 'ais prefetch --help' for details);
                          - 'ais rmo gs://bucket/prefix --nr'  - remove a single object with the specified name (see 'ais rmo --help' for details)
   --page-size value      Maximum number of object names per page; when the flag is omitted or 0 (zero)
                          the maximum is defined by the corresponding backend; see also '--max-pages' and '--paged' (default: 0)
   --paged                List objects page by page - one page at a time (see also '--page-size' and '--limit')
                          note: recommended for use with very large buckets
   --prefix value         List objects with names starting with the specified prefix, e.g.:
                          '--prefix a/b/c' - list virtual directory a/b/c and/or objects from the virtual directory
                          a/b that have their names (relative to this directory) starting with the letter 'c'
   --props value          Comma-separated list of object properties including name, size, version, copies, and more; e.g.:
                          --props all
                          --props name,size,cached
                          --props "ec, copies, custom, location"
   --refresh value        Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                          valid time units: ns, us (or µs), ms, s (default), m, h
   --regex value          Regular expression; use it to match either bucket names or objects in a given bucket, e.g.:
                          ais ls --regex "(m|n)"         - match buckets such as ais://nnn, s3://mmm, etc.;
                          ais ls ais://nnn --regex "^A"  - match object names starting with letter A
   --show-unmatched       List also objects that were not matched by regex and/or template (range)
   --silent               Server-side flag, an indication for aistore _not_ to log assorted errors (e.g., HEAD(object) failures)
   --skip-lookup          Do not execute HEAD(bucket) request to lookup remote bucket and its properties; possible usage scenarios include:
                           1) adding remote bucket to aistore without first checking the bucket's accessibility
                              (e.g., to configure the bucket's aistore properties with alternative security profile and/or endpoint)
                           2) listing public-access Cloud buckets where certain operations (e.g., 'HEAD(bucket)') may be disallowed
   --start-after value    List bucket's content alphabetically starting with the first name _after_ the specified
   --summary              Show object numbers, bucket sizes, and used capacity;
                          note: applies only to buckets and objects that are _present_ in the cluster
   --template value       Template to match object or file names; may contain prefix (that could be empty) with zero or more ranges
                          (with optional steps and gaps), e.g.:
                          --template "" # (an empty or '*' template matches everything)
                          --template 'dir/subdir/'
                          --template 'shard-{1000..9999}.tar'
                          --template "prefix-{0010..0013..2}-gap-{1..2}-suffix"
                          and similarly, when specifying files and directories:
                          --template '/home/dir/subdir/'
                          --template "/abc/prefix-{0010..9999..2}-suffix"
   --units value          Show statistics and/or parse command-line specified sizes using one of the following units of measurement:
                          iec - IEC format, e.g.: KiB, MiB, GiB (default)
                          si  - SI (metric) format, e.g.: KB, MB, GB
                          raw - do not convert to (or from) human-readable format
   --help, -h             Show help

Assorted options

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `--regex` | `string` | regular expression to match and select items in question | `""` |
| `--template` | `string` | template for matching object names, e.g.: `shard-{900..999}.tar` | `""` |
| `--prefix` | `string` | list objects matching a given prefix | `""` |
| `--page-size` | `int` | maximum number of names per page (0 - the maximum is defined by the corresponding backend) | `0` |
| `--props` | `string` | comma-separated list of object properties including name, size, version, copies, EC data and parity info, custom metadata, location and more; to include all properties, type `--props all` | `"name,size"` |
| `--limit` | `int` | limit object name count (0 - unlimited) | `0` |
| `--show-unmatched` | `bool` | list objects that were not matched by regex and/or template | `false` |
| `--all` | `bool` | depending on context: all objects (including misplaced ones and copies) or all buckets (including remote buckets that are not present in the cluster) | `false` |
| `--no-headers, -H` | `bool` | display tables without headers | `false` |
| `--no-footers` | `bool` | display tables without footers | `false` |
| `--paged` | `bool` | list objects page by page, one page at a time (see also `--page-size` and `--limit`) | `false` |
| `--max-pages` | `int` | display up to this number of pages of bucket objects | `0` |
| `--marker` | `string` | list bucket’s content alphabetically starting with the first name after the specified | `""` |
| `--start-after` | `string` | object name (marker) after which the listing should start | `""` |
| `--cached` | `bool` | list only those objects from a remote bucket that are present (“cached”) | `false` |
| `--skip-lookup` | `bool` | list public-access Cloud buckets that may disallow certain operations (e.g., `HEAD(bucket)`); use this option for performance or to read Cloud buckets that allow anonymous access | `false` |
| `--archive` | `bool` | list archived content | `false` |
| `--check-versions` | `bool` | check whether listed remote objects and their in-cluster copies are identical, i.e., have the same versions; applies to remote backends that maintain at least some form of versioning information (e.g., version, checksum, ETag) | `false` |
| `--summary` | `bool` | show bucket sizes and used capacity; by default, applies only to the buckets that are present in the cluster (use the `--all` option to override) | `false` |
| `--bytes` | `bool` | show sizes in bytes (i.e., do not convert to KiB, MiB, GiB, etc.) | `false` |
| `--name-only` | `bool` | fast request to retrieve only the names of objects in the bucket; if defined, all comma-separated fields in the `--props` flag will be ignored with only two exceptions: `name` and `status` | `false` |
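The `--template` ranges shown above can be pictured as a cross product of numeric ranges. The Python sketch below expands such a template into concrete names. It assumes inclusive bounds, an optional third field interpreted as the step, and zero-padding taken from the width of the start value; these are assumptions made for illustration, and the function `expand_template` is hypothetical rather than part of any AIS API.

```python
import itertools
import re
from typing import Iterator

# Matches {START..END} or {START..END..STEP}
_RANGE = re.compile(r"\{(\d+)\.\.(\d+)(?:\.\.(\d+))?\}")


def expand_template(template: str) -> Iterator[str]:
    """Yield every name produced by the brace ranges in `template`."""
    literals, ranges, pos = [], [], 0
    for m in _RANGE.finditer(template):
        literals.append(template[pos:m.start()])
        start, end, step = m.group(1), int(m.group(2)), int(m.group(3) or 1)
        width = len(start)  # assumed: pad to the width of the start value
        ranges.append([f"{n:0{width}d}" for n in range(int(start), end + 1, step)])
        pos = m.end()
    literals.append(template[pos:])
    # Combine every value of every range, re-inserting the literal text between them.
    for combo in itertools.product(*ranges):
        parts = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            parts += [value, literal]
        yield "".join(parts)


for name in expand_template("prefix-{0010..0013..2}-gap-{1..2}-suffix"):
    print(name)
```

Under these assumptions, `shard-{900..999}.tar` expands to 100 names, from `shard-900.tar` through `shard-999.tar`, and a template without any range (such as `dir/subdir/`) expands to itself.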

When listing objects, a footer will be displayed showing:

Total number of objects listed
For remote buckets with --cached option: number of objects present in-cluster
For --paged option: current page number
For --count-only option: time elapsed to fetch the list

Examples of footer variations:

Listed 12345 names
Listed 12345 names (in-cluster: 456)
Page 123: 1000 names (in-cluster: none)

Examples

List AIS and Cloud buckets with all defaults

1. List objects in the AIS bucket bucket_name.

$ ais ls ais://bucket_name
NAME           SIZE
shard-0.tar    16.00KiB
shard-1.tar    16.00KiB
...

2. List objects in the remote bucket bucket_name.

$ ais ls aws://bucket_name
NAME           SIZE
shard-0.tar    16.00KiB
shard-1.tar    16.00KiB
...

3. List objects from a remote AIS cluster with a namespace:

$ ais ls ais://@Bghort1l#ml/bucket_name
NAME                SIZE        VERSION
shard-0.tar         16.00KiB    1
shard-1.tar         16.00KiB    1
...

4. List objects with paged output (showing page numbers):

$ ais ls ais://bucket_name --paged --limit 100
[... object listing ...]
Page 1: 100 names

5. List cached objects from a remote bucket:

$ ais ls s3://bucket_name --cached
[... listing of only in-cluster objects ...]
Listed 456789 names

6. Count objects in a bucket:

$ ais ls s3://bucket_name/aprefix --count-only
Listed 28,230 names in 5.62s

7. Count objects with paged output:

$ ais ls s3://bucket_name/bprefix --count-only --paged
Page 1: 1,000 names in 772ms
Page 2: 1,000 names in 180ms
Page 3: 1,000 names in 265ms
...
Page 29: 230 names in 130ms
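The paged output above is simple chunking: if the elided pages are all full, 28 pages of 1,000 names plus a final page of 230 add up to 28,230 names. A minimal Python sketch of that arithmetic follows. The page size of 1,000 is inferred from the sample output, and the `paginate` helper and the synthetic object names are hypothetical.

```python
from typing import Iterator, List, Sequence


def paginate(names: Sequence[str], page_size: int) -> Iterator[List[str]]:
    """Yield consecutive pages of at most `page_size` names."""
    if page_size <= 0:
        raise ValueError("page_size must be positive in this sketch")
    for start in range(0, len(names), page_size):
        yield list(names[start:start + page_size])


names = [f"bprefix/obj-{i:05d}" for i in range(28_230)]  # synthetic names
pages = list(paginate(names, 1_000))
print(len(pages), len(pages[0]), len(pages[-1]))  # 29 1000 230
```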

Notes:

When using --paged with remote buckets, the footer will show both page number and in-cluster object count when applicable
The --diff option requires remote backends supporting some form of versioning (e.g., object version, checksum, and/or ETag)
For more information on working with archived content, see docs/archive.md
To fully synchronize in-cluster content with remote backend, see documentation on out-of-band updates
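As a rough mental model for the version comparison mentioned in the notes, the Python sketch below diffs two name-to-version maps, one for the remote bucket and one for the in-cluster copies. The function name `diff_versions`, the dictionary representation, and the three result categories are assumptions for illustration; how AIS actually tracks and compares versions is not described here.

```python
from typing import Dict, List, NamedTuple


class Diff(NamedTuple):
    missing: List[str]   # remote objects with no in-cluster copy
    changed: List[str]   # present on both sides, versions differ
    deleted: List[str]   # in-cluster objects no longer present remotely


def diff_versions(remote: Dict[str, str], in_cluster: Dict[str, str]) -> Diff:
    missing = sorted(remote.keys() - in_cluster.keys())
    deleted = sorted(in_cluster.keys() - remote.keys())
    changed = sorted(
        name for name in remote.keys() & in_cluster.keys()
        if remote[name] != in_cluster[name]
    )
    return Diff(missing, changed, deleted)


# Made-up object names and version strings.
remote = {"a.tar": "v2", "b.tar": "v1", "c.tar": "v1"}
cached = {"a.tar": "v1", "b.tar": "v1", "d.tar": "v1"}
print(diff_versions(remote, cached))
```

In this toy example `c.tar` is missing from the cluster, `a.tar` was updated out-of-band, and `d.tar` was deleted remotely.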

Include all properties

$ ais ls gs://webdataset-abc --skip-lookup --props all
NAME                             SIZE          CHECKSUM                           ATIME   VERSION                 CACHED  TARGET URL            STATUS  COPIES
coco-train2014-seg-000000.tar    958.48MiB     bdb89d1b854040b6050319e80ef44dde           1657297128665686        no      http://aistore:8081   ok      0
coco-train2014-seg-000001.tar    958.47MiB     8b94939b7d166114498e794859fb472c           1657297129387272        no      http://aistore:8081   ok      0
coco-train2014-seg-000002.tar    958.47MiB     142a8e81f965f9bcafc8b04eda65a0ce           1657297129904067        no      http://aistore:8081   ok      0
coco-train2014-seg-000003.tar    958.22MiB     113024d5def81365cbb6c404c908efb1           1657297130555590        no      http://aistore:8081   ok      0
...

List bucket from AIS remote cluster

List objects in the bucket bucket_name, in the ml namespace, on the remote AIS cluster with UUID Bghort1l.

$ ais ls ais://@Bghort1l#ml/bucket_name
NAME                SIZE        VERSION
shard-0.tar         16.00KiB    1
shard-1.tar         16.00KiB    1
...

With prefix

List objects whose names start with the given prefix.

$ ais ls ais://bucket_name --prefix "shard-1"
NAME                SIZE        VERSION
shard-1.tar         16.00KiB    1
shard-10.tar        16.00KiB    1
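
Note that the prefix is a plain string prefix, which is why `shard-10.tar` is listed alongside `shard-1.tar`. The matching rule can be sketched offline as follows; `filter_by_prefix` is a hypothetical helper for illustration, not an ais command.

```shell
# filter_by_prefix PREFIX: print the names read from stdin that start with PREFIX.
# Plain string-prefix matching, as in the '--prefix' example above.
filter_by_prefix() {
    prefix=$1
    while IFS= read -r name; do
        case $name in
            "$prefix"*) printf '%s\n' "$name" ;;   # quoted: PREFIX is taken literally
        esac
    done
}

# Prints shard-1.tar and shard-10.tar
printf '%s\n' shard-0.tar shard-1.tar shard-10.tar shard-2.tar | filter_by_prefix "shard-1"
```

To restrict the selection to a virtual directory, end the prefix with a slash: under this rule `a/b/c` matches both `a/b/c/d` and `a/b/cdef`, while `a/b/c/` matches only the former.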

Bucket inventory

Here’s a quick four-step sequence that demonstrates the functionality:

1. In the beginning, the bucket is accessible (notice --all) but has no in-cluster content:

$ ais ls s3://abc --cached --all
NAME                     SIZE

2. The first (remote) list-objects has the side effect of loading the remote inventory:

$ ais ls s3://abc --inventory --count-only
Note: listing remote objects in s3://abc may take a while
(Tip: use '--cached' to speed up and/or '--paged' to show pages)

Listed 2,319,231 names in 23.91s

3. The second and subsequent list-objects calls run much faster:

$ ais ls s3://abc --inventory --count-only
Listed 2,319,231 names in 4.18s

4. Finally, observe that the in-cluster content now includes the inventory (.csv) itself:

$ ais ls s3://abc --cached
NAME                     SIZE
.inventory/ais-vm.csv    143.61MiB

List archived content

$ ais ls ais://abc/ --prefix log
NAME             SIZE
log.tar.gz      3.11KiB

$ ais ls ais://abc/ --prefix log --archive
NAME                                             SIZE
log.tar.gz                                       3.11KiB
    log2.tar.gz/t_2021-07-27_14-08-50.log        959B
    log2.tar.gz/t_2021-07-27_14-10-36.log        959B
    log2.tar.gz/t_2021-07-27_14-12-18.log        959B
    log2.tar.gz/t_2021-07-27_14-13-23.log        295B
    log2.tar.gz/t_2021-07-27_14-13-31.log        1.02KiB
    log2.tar.gz/t_2021-07-27_14-14-16.log        1.71KiB
    log2.tar.gz/t_2021-07-27_14-15-15.log        1.90KiB

List anonymously (i.e., list public-access Cloud bucket)

$ ais ls gs://webdataset-abc --skip-lookup
NAME                             SIZE
coco-train2014-seg-000000.tar    958.48MiB
coco-train2014-seg-000001.tar    958.47MiB
coco-train2014-seg-000002.tar    958.47MiB
coco-train2014-seg-000003.tar    958.22MiB
coco-train2014-seg-000004.tar    958.56MiB
coco-train2014-seg-000005.tar    958.19MiB
...

Use '--prefix' that crosses shard boundary

For starters, we archive all aistore docs into a single shard in the ais://nnn bucket:

$ ais put docs ais://nnn/A.tar --archive -r

To list a certain virtual subdirectory inside this newly created shard:

$ ais archive ls ais://nnn --prefix "A.tar/tutorials"
NAME                                             SIZE
    A.tar/tutorials/README.md                    561B
    A.tar/tutorials/etl/compute_md5.md           8.28KiB
    A.tar/tutorials/etl/etl_imagenet_pytorch.md  4.16KiB
    A.tar/tutorials/etl/etl_webdataset.md        3.97KiB
Listed: 4 names

Or, equivalently:

$ ais ls ais://nnn --prefix "A.tar/tutorials" --archive
NAME                                             SIZE
    A.tar/tutorials/README.md                    561B
    A.tar/tutorials/etl/compute_md5.md           8.28KiB
    A.tar/tutorials/etl/etl_imagenet_pytorch.md  4.16KiB
    A.tar/tutorials/etl/etl_webdataset.md        3.97KiB
Listed: 4 names

Evict remote bucket

AIS supports multiple storage backends:

Type	Description	Example Name
AIS Bucket	Native bucket managed by AIS	`ais://mybucket`
Remote AIS Bucket	Bucket in a remote AIS cluster	`ais://@cluster/mybucket`
Cloud Bucket	Remote bucket (e.g., S3, GCS, Azure)	`s3://dataset`
Backend Bucket	AIS bucket linked to a remote bucket	`ais://cachebucket → s3://x`

See Unified Namespace for details on remote AIS clusters.
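
The naming conventions in the table can be told apart mechanically from the bucket URI alone. Here is a minimal sketch; `bucket_kind` is a hypothetical helper, and its provider list is an assumption drawn only from the schemes that appear in this document, so it is not exhaustive.

```shell
# bucket_kind URI: classify a bucket URI according to the table above.
# The provider list is illustrative (taken from examples in this document), not exhaustive.
bucket_kind() {
    case $1 in
        ais://@*) echo "remote-ais" ;;   # '@' introduces a remote cluster alias or UUID
        ais://*)  echo "ais" ;;
        s3://*|aws://*|gs://*|gcp://*|az://*) echo "cloud" ;;
        *)        echo "unknown" ;;
    esac
}

bucket_kind ais://mybucket            # ais
bucket_kind ais://@cluster/mybucket   # remote-ais
bucket_kind s3://dataset              # cloud
```

An AIS bucket that has a backend bucket still carries an `ais://` name, so the URI alone does not reveal the linkage; that information lives in the bucket's properties.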

One major distinction between an AIS bucket (e.g., ais://mybucket) and a remote bucket (e.g., ais://@cluster/mybucket, s3://dataset, etc.) boils down to the fact that - for a variety of real-life reasons - in-cluster content of the remote bucket may be different from its remote content.

Note that the terms in-cluster and cached are used interchangeably throughout the entire documentation and CLI.

Remote buckets can be prefetched and evicted from AIS, entirely or selectively:

CLI: Three Ways to Evict Remote Bucket

Some of the supported functionality can be quickly demonstrated with the following examples:

$ ais bucket evict aws://abc
"aws://abc" bucket evicted

# Dry run: the cluster will not be modified
$ ais bucket evict --dry-run aws://abc
[DRY RUN] No modifications on the cluster
EVICT: "aws://abc"

# Only evict the remote bucket's data (AIS will retain the bucket's metadata)
$ ais bucket evict --keep-md aws://abc
"aws://abc" bucket evicted

Here’s a more complete example that lists a remote bucket, reads one of its objects (which makes that object cached), and then evicts the bucket:

$ ais ls gs://wrQkliptRt
NAME             SIZE
TDXBNBEZNl.tar   8.50KiB
qFpwOOifUe.tar   8.50KiB
thmdpZXetG.tar   8.50KiB

$ ais get gcp://wrQkliptRt/qFpwOOifUe.tar /tmp/qFpwOOifUe.tar
GET "qFpwOOifUe.tar" from bucket "gcp://wrQkliptRt" as "/tmp/qFpwOOifUe.tar" [8.50KiB]

$ ais ls gs://wrQkliptRt --props all
NAME             SIZE            CHECKSUM                                ATIME                   VERSION                 CACHED  STATUS  COPIES
TDXBNBEZNl.tar   8.50KiB         33345a69bade096a30abd42058da4537                                1622133976984266        no      ok      0
qFpwOOifUe.tar   8.50KiB         47dd59e41f6b7723                        28 May 21 12:02 PDT     1622133846120151        yes     ok      1
thmdpZXetG.tar   8.50KiB         cfe0c386e91daa1571d6a659f49b1408                                1622137609269706        no      ok      0

$ ais bucket evict gcp://wrQkliptRt
"gcp://wrQkliptRt" bucket evicted

$ ais ls gs://wrQkliptRt --props all
NAME             SIZE            CHECKSUM                                ATIME   VERSION                 CACHED  STATUS  COPIES
TDXBNBEZNl.tar   8.50KiB         33345a69bade096a30abd42058da4537                1622133976984266        no      ok      0
qFpwOOifUe.tar   8.50KiB         8b5919c0850a07d931c3c46ed9101eab                1622133846120151        no      ok      0
thmdpZXetG.tar   8.50KiB         cfe0c386e91daa1571d6a659f49b1408                1622137609269706        no      ok      0
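
To check from a script which objects are in-cluster before and after an eviction, the CACHED column can be extracted from the listing. This is a sketch under stated assumptions: `list_cached` is a hypothetical helper, and it relies on the column layout of the sample output above, where CACHED is the third column from the right. Other layouts (for instance, one that includes a TARGET URL column) would need a different index.

```shell
# list_cached: read 'ais ls BUCKET --props all' output on stdin and print
# the names of objects whose CACHED column is "yes".
# Assumption: the last three columns are CACHED, STATUS, COPIES, as in the
# sample above; counting from the right sidesteps the ATIME column, which
# may be empty or contain spaces.
list_cached() {
    awk 'NR > 1 && NF >= 3 && $(NF - 2) == "yes" { print $1 }'
}

# Usage with the pre-eviction listing from the example; prints qFpwOOifUe.tar
list_cached <<'EOF'
NAME             SIZE            CHECKSUM                                ATIME                   VERSION                 CACHED  STATUS  COPIES
TDXBNBEZNl.tar   8.50KiB         33345a69bade096a30abd42058da4537                                1622133976984266        no      ok      0
qFpwOOifUe.tar   8.50KiB         47dd59e41f6b7723                        28 May 21 12:02 PDT     1622133846120151        yes     ok      1
thmdpZXetG.tar   8.50KiB         cfe0c386e91daa1571d6a659f49b1408                                1622137609269706        no      ok      0
EOF
```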

Move or Rename a bucket

ais bucket mv BUCKET NEW_BUCKET

Move (i.e., rename) an AIS bucket. If NEW_BUCKET already exists, the mv operation will not proceed.

Cloud bucket move is not supported.

Examples

Move AIS bucket

Move AIS bucket bucket_name to AIS bucket new_bucket_name.

$ ais bucket mv ais://bucket_name ais://new_bucket_name
Moving bucket "ais://bucket_name" to "ais://new_bucket_name" in progress.
To check the status, run: ais show job xaction mvlb ais://new_bucket_name

Copy (list, range, and/or prefix) selected objects or entire (in-cluster or remote) buckets

ais cp SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] DST_BUCKET [command options]

$ ais cp --help
NAME:
   ais cp - (alias for "bucket cp") Copy entire bucket or selected objects (to select, use '--list', '--template', or '--prefix'),
     e.g.:
     - 'ais cp gs://webdataset-coco ais://dst'                                  - copy entire Cloud bucket;
     - 'ais cp s3://abc ais://nnn --all'                                        - copy Cloud bucket that may _not_ be present in cluster (and create destination if doesn't exist);
     - 'ais cp s3://abc ais://nnn --all --num-workers 16'                       - same as above employing 16 concurrent workers;
     - 'ais cp s3://abc ais://nnn --all --num-workers 16 --prefix dir/subdir/'  - same as above, but limit copying to a given virtual subdirectory;
     - 'ais cp s3://abc gs://xyz --all'                                         - copy Cloud bucket to another Cloud.
     similar to prefetch:
     - 'ais cp s3://data s3://data --all'  - copy remote source (and create namesake destination in-cluster bucket if doesn't exist).
     synchronize with out-of-band updates:
     - 'ais cp s3://abc ais://nnn --latest'  - copy Cloud bucket; make sure that already present in-cluster copies are updated to the latest versions;
     - 'ais cp s3://abc ais://nnn --sync'    - same as above, but in addition delete in-cluster copies that do not exist (any longer) in the remote source.
     with template, prefix, and progress:
     - 'ais cp s3://abc ais://nnn --prepend backup/'                                              - copy objects into 'backup/' virtual subdirectory in destination bucket;
     - 'ais cp ais://nnn/111 ais://mmm'                                                           - copy all ais://nnn objects that match prefix '111';
     - 'ais cp gs://webdataset-coco ais:/dst --template d-tokens/shard-{000000..000999}.tar.lz4'  - copy up to 1000 objects that share the specified prefix;
     - 'ais cp gs://webdataset-coco ais:/dst --prefix d-tokens/ --progress --all'                 - show progress while copying virtual subdirectory 'd-tokens';
     - 'ais cp gs://webdataset-coco/d-tokens/ ais:/dst --progress --all'                          - same as above;
     - 'ais cp s3://abc/dir/ ais://dst --nr'                                                      - copy only immediate contents of 'dir/' (non-recursive).

USAGE:
   ais cp SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] DST_BUCKET [command options]

OPTIONS:
   --all                  Copy all objects from a remote bucket including those that are not present (not cached) in cluster
   --cont-on-err          Keep running archiving xaction (job) in presence of errors in a any given multi-object transaction
   --dry-run              Show total size of new objects without really creating them
   --force, -f            Force execution of the command (caution: advanced usage only)
   --latest               Check in-cluster metadata and, possibly, GET, download, prefetch, or otherwise copy the latest object version
                          from the associated remote bucket;
                          the option provides operation-level control over object versioning (and version synchronization)
                          without the need to change the corresponding bucket configuration: 'versioning.validate_warm_get';
                          see also:
                            - 'ais show bucket BUCKET versioning'
                            - 'ais bucket props set BUCKET versioning'
                            - 'ais ls --check-versions'
                          supported commands include:
                            - 'ais cp', 'ais prefetch', 'ais get'
   --list value           Comma-separated list of object or file names, e.g.:
                          --list 'o1,o2,o3'
                          --list "abc/1.tar, abc/1.cls, abc/1.jpeg"
                          or, when listing files and/or directories:
                          --list "/home/docs, /home/abc/1.tar, /home/abc/1.jpeg"
   --non-recursive, --nr  Non-recursive operation, e.g.:
                          - 'ais ls gs://bucket/prefix --nr'   - list objects and/or virtual subdirectories with names starting with the specified prefix;
                          - 'ais ls gs://bucket/prefix/ --nr'  - list contained objects and/or immediately nested virtual subdirectories _without_ recursing into the latter;
                          - 'ais prefetch s3://bck/abcd --nr'  - prefetch a single named object (see 'ais prefetch --help' for details);
                          - 'ais rmo gs://bucket/prefix --nr'  - remove a single object with the specified name (see 'ais rmo --help' for details)
   --non-verbose, --nv    Non-verbose (quiet) output, minimized reporting, fewer warnings
   --num-workers value    Number of concurrent workers (readers); defaults to a number of target mountpaths if omitted or zero;
                          use (-1) to indicate single-threaded serial execution (ie., no workers);
                          any positive value will be adjusted _not_ to exceed the number of target CPUs (default: 0)
   --prefix value         Select virtual directories or objects with names starting with the specified prefix, e.g.:
                          '--prefix a/b/c'   - matches names 'a/b/c/d', 'a/b/cdef', and similar;
                          '--prefix a/b/c/'  - only matches objects from the virtual directory a/b/c/
   --prepend value        Prefix to prepend to every object name during operation (copy or transform), e.g.:
                          --prepend=abc   - prefix all object names with "abc"
                          --prepend=abc/  - use "abc" as a virtual directory (note trailing filepath separator)
                                          - during 'copy', this flag applies to copied objects
                                          - during 'transform', this flag applies to transformed objects
   --progress             Show progress bar(s) and progress of execution in real time
   --refresh value        Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                          valid time units: ns, us (or µs), ms, s (default), m, h
   --sync                 Fully synchronize in-cluster content of a given remote bucket with its (Cloud or remote AIS) source;
                          the option is, effectively, a stronger variant of the '--latest' (option):
                          in addition to bringing existing in-cluster objects in-sync with their respective out-of-band updates (if any)
                          it also entails removing in-cluster objects that are no longer present remotely;
                          like '--latest', this option provides operation-level control over synchronization
                          without requiring to change the corresponding bucket configuration: 'versioning.synchronize';
                          see also:
                            - 'ais show bucket BUCKET versioning'
                            - 'ais bucket props set BUCKET versioning'
                            - 'ais ls --check-versions'
   --template value       Template to match object or file names; may contain prefix (that could be empty) with zero or more ranges
                          (with optional steps and gaps), e.g.:
                          --template "" # (an empty or '*' template matches everything)
                          --template 'dir/subdir/'
                          --template 'shard-{1000..9999}.tar'
                          --template "prefix-{0010..0013..2}-gap-{1..2}-suffix"
                          and similarly, when specifying files and directories:
                          --template '/home/dir/subdir/'
                          --template "/abc/prefix-{0010..9999..2}-suffix"
   --timeout value        Maximum time to wait for a job to finish; if omitted: wait forever or until Ctrl-C;
                          valid time units: ns, us (or µs), ms, s (default), m, h
   --wait                 Wait for an asynchronous operation to finish (optionally, use '--timeout' to limit the waiting time)
   --help, -h             Show help

Source bucket must exist. When the destination bucket is remote (e.g. in the Cloud) it must also exist and be writeable.

NOTE: there’s no requirement that either of the buckets is present in aistore.

NOTE: do not confuse in-cluster presence with existence: a remote object may exist remotely without being present in the cluster.

NOTE: to fully synchronize in-cluster content with remote backend, please refer to out of band updates.

Moreover, when the destination is AIS (ais://) or remote AIS (ais://@remote-alias) bucket, the existence is optional: the destination will be created on the fly, with bucket properties copied from the source (SRC_BUCKET).

NOTE: similar to delete, evict and prefetch operations, cp also supports embedded prefix - see disambiguating multi-object operation

Finally, the option to copy remote bucket onto itself is also supported - syntax-wise. Here’s an example that’ll shed some light:

## 1. at first, we don't have any gs:// buckets in the cluster

$ ais ls gs
No "gs://" buckets in the cluster. Use '--all' option to list matching remote buckets, if any.

## 2. notwithstanding, we go ahead and start copying gs://coco-dataset

$ ais cp gs://coco-dataset gs://coco-dataset --prefix d-tokens --progress --all
Copied objects:                  282/393 [===========================================>------------------] 72 %
Copied size:    719.48 MiB / 1000.08 MiB [============================================>-----------------] 72 %

## 3. and done: all 393 objects matching the prefix are now present ("cached") in the cluster

$ ais ls gs://coco-dataset --cached | grep Listed
Listed: 393 names

Incidentally, notice the --cached difference:

$ ais ls gs://coco-dataset --cached | grep Listed
Listed: 393 names

## vs _all_ including remote:

$ ais ls gs://coco-dataset | grep Listed
Listed: 2,290 names

Examples

Copy non-existing remote bucket to a non-existing in-cluster destination

$ ais ls s3
No "s3://" buckets in the cluster. Use '--all' option to list matching remote buckets, if any.

$ ais cp s3://abc ais://nnn --all
Warning: destination ais://nnn doesn't exist and will be created with configuration copied from the source (s3://abc))
Copying s3://abc => ais://nnn. To monitor the progress, run 'ais show job tco-JcTKbhvFy'

Copy AIS bucket

Copy AIS bucket src_bucket to AIS bucket dst_bucket.

$ ais cp ais://src_bucket ais://dst_bucket
Copying bucket "ais://src_bucket" to "ais://dst_bucket" in progress.
To check the status, run: ais show job xaction copy-bck ais://dst_bucket

Copy AIS bucket and wait until the job finishes

The same as above, but wait until copying is finished.

$ ais cp ais://src_bucket ais://dst_bucket --wait

Copy cloud bucket to another cloud bucket

Copy AWS bucket src_bucket to AWS bucket dst_bucket.

# Make sure that both buckets exist.
$ ais ls aws://
AWS Buckets (2)
  aws://src_bucket
  aws://dst_bucket
$ ais cp aws://src_bucket aws://dst_bucket
Copying bucket "aws://src_bucket" to "aws://dst_bucket" in progress.
To check the status, run: ais show job xaction copy-bck aws://dst_bucket

Use (list, range, and/or prefix) options to copy selected objects

Example 1. Copy objects obj1.tar and obj1.info from bucket ais://bck1 to ais://bck2, and wait until the operation finishes

$ ais cp ais://bck1 ais://bck2 --list obj1.tar,obj1.info --wait
copying objects operation ("ais://bck1" => "ais://bck2") is in progress...
copying objects operation succeeded.

Example 2. Copy objects matching the Bash brace-expansion template obj{2..4}, without waiting for the operation to finish.

$ ais cp ais://bck1 ais://bck2 --template "obj{2..4}"
copying objects operation ("ais://bck1" => "ais://bck2") is in progress...
To check the status, run: ais show job xaction copy-bck ais://bck2
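
Before launching a template-based copy, it can help to preview which names a range stands for. The sketch below enumerates a single numeric range with optional zero padding and step. It assumes that simple AIS ranges expand the way Bash brace expansion does, as the example above suggests; `expand_range` is a hypothetical helper and does not parse template strings.

```shell
# expand_range PREFIX START END STEP WIDTH SUFFIX
# Print PREFIX<n>SUFFIX for n = START, START+STEP, ... up to END (inclusive),
# zero-padding <n> to WIDTH digits. START and END are plain decimal integers
# (pass 10, not 0010; a leading zero would be read as octal in shell arithmetic).
expand_range() (
    prefix=$1 i=$2 end=$3 step=$4 width=$5 suffix=$6
    fmt="%s%0${width}d%s\n"
    while [ "$i" -le "$end" ]; do
        printf "$fmt" "$prefix" "$i" "$suffix"
        i=$((i + step))
    done
)

expand_range obj 2 4 1 1 ""                      # obj{2..4}: obj2, obj3, obj4
expand_range shard- 0 999 1 6 .tar.lz4 | wc -l   # shard-{000000..000999}.tar.lz4: 1000 names
```

The function body runs in a subshell (parentheses instead of braces), so its working variables do not leak into the calling shell.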

Example 3. Use --sync option to copy remote virtual subdirectory

$ ais cp gs://coco-dataset --sync --prefix d-tokens
Copying objects gs://coco-dataset. To monitor the progress, run 'ais show job tco-kJPUtYJld'

In the example, --sync synchronizes destination bucket with its remote (e.g., Cloud) source.

In particular, the option makes sure that aistore has the latest versions of remote objects; it may also remove in-cluster objects that no longer exist remotely.
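
The removal part of this behavior is a set difference: names present in the cluster but absent from the remote source. The sketch below computes that difference from two plain-text name lists. It only illustrates the idea and is not how aistore implements `--sync`; `stale_in_cluster` and the sample object names are hypothetical.

```shell
# stale_in_cluster CLUSTER_LIST REMOTE_LIST
# Print names listed in CLUSTER_LIST (one per line) that do not appear in REMOTE_LIST.
# These are the in-cluster objects that a '--sync' style reconciliation would remove.
stale_in_cluster() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$2" "$1" || true   # grep exits 1 when nothing is printed
}

# Usage with two small sample lists; prints b.tar
tmp=$(mktemp -d)
printf '%s\n' a.tar b.tar c.tar > "$tmp/cluster"
printf '%s\n' a.tar c.tar d.tar > "$tmp/remote"
stale_in_cluster "$tmp/cluster" "$tmp/remote"
rm -rf "$tmp"
```

In this sample, `d.tar` exists only remotely, so it is not reported: it is a candidate for copying, not for removal.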

Example copying buckets

This example demonstrates how to copy objects between buckets using the AIStore CLI, and how to monitor the progress of the copy operation. AIStore supports all possible permutations of copying: Cloud to AIStore, Cloud to another (or same) Cloud, AIStore to Cloud, and between AIStore buckets.

To copy all objects with a common prefix from an S3 bucket to an AIStore bucket:

$ ais cp s3://src-bucket/a ais://dst-bucket --all

Warning: destination ais://dst-bucket doesn't exist and will be created with configuration copied from the source (s3://src-bucket))
Copying objects s3://src-bucket => ais://dst-bucket. To monitor the progress, run 'ais show job tco-goDbhCxtf'

Note: The “Warning” message is benign and will only appear if the destination bucket does not exist.

Monitoring progress

You can monitor the progress of the copy operation using the ais show job copy command. Add the --refresh flag followed by a time in seconds to get automatic updates:

$ ais show job copy --refresh 10

copy-objects[tco-goDbhCxtf] (ctl: s3://src-bucket=>ais://dst-bucket prefix:a, parallelism: w[6])
NODE             ID              KIND            SRC BUCKET      DST BUCKET      OBJECTS         BYTES           START           END     STATE
KactABCD         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 82              11.00MiB        18:04:15        -       Running
XXytEFGH         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 80              8.00MiB         18:04:15        -       Running
YMjtIJKL         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 104             23.00MiB        18:04:15        -       Running
oJXtMNOP         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 134             18.00MiB        18:04:15        -       Running
vWrtQRST         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 118             12.00MiB        18:04:15        -       Running
ybTtUVWX         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 71              10.02MiB        18:04:15        -       Running
                                Total:                                          589             82.02MiB ✓

The output shows statistics for each node in the AIStore cluster:

NODE: The name of the node
ID: The job ID
KIND: The type of operation
SRC BUCKET: Source bucket
DST BUCKET: Destination bucket
OBJECTS: Number of objects processed
BYTES: Amount of data transferred
START: Job start time
END: Job end time (empty if job is still running)
STATE: Current job state

The output also includes a “Total” row at the bottom that provides cluster-wide aggregated values for the number of objects processed and bytes transferred. The checkmark (✓) indicates that all nodes are reporting byte statistics.
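
The "Total" row can be cross-checked by summing the per-node OBJECTS values. The following sketch does that with awk. It assumes the whitespace-separated column order shown above, with exactly ten fields per node row; `sum_job_objects` is a hypothetical helper, not an ais command.

```shell
# sum_job_objects: read 'ais show job' table output on stdin and print the sum of
# the OBJECTS column over per-node rows.
# Assumption: per-node rows have exactly 10 whitespace-separated fields
# (NODE ID KIND SRC DST OBJECTS BYTES START END STATE) with a numeric 6th field;
# the job title line, the header, and the "Total:" row do not fit that shape
# and are skipped.
sum_job_objects() {
    awk 'NF == 10 && $6 ~ /^[0-9]+$/ { total += $6 } END { print total + 0 }'
}

# Usage with the six node rows from the example; prints 589
sum_job_objects <<'EOF'
NODE             ID              KIND            SRC BUCKET      DST BUCKET      OBJECTS         BYTES           START           END     STATE
KactABCD         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 82              11.00MiB        18:04:15        -       Running
XXytEFGH         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 80              8.00MiB         18:04:15        -       Running
YMjtIJKL         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 104             23.00MiB        18:04:15        -       Running
oJXtMNOP         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 134             18.00MiB        18:04:15        -       Running
vWrtQRST         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 118             12.00MiB        18:04:15        -       Running
ybTtUVWX         tco-goDbhCxtf   copy-listrange  s3://src-bucket ais://dst-bucket 71              10.02MiB        18:04:15        -       Running
                                Total:                                          589             82.02MiB
EOF
```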

Stopping all jobs

To stop all in-progress jobs:

$ ais stop --all
Stopped copy-listrange[tco-goDbhCxtf]

In our example, there is a single job, with ID tco-goDbhCxtf.

Example copying buckets and multi-objects with simultaneous synchronization

There’s a script that we use for testing. When run, it produces the following output:

$ ./ais/test/scripts/cp-sync-remais-out-of-band.sh --bucket gs://abc

 1. generate and write 500 random shards => gs://abc
 2. copy gs://abc => ais://dst-9408
 3. remove 10 shards from the source
 4. copy gs://abc => ais://dst-9408 w/ synchronization ('--sync' option)
 5. remove another 10 shards
 6. copy multiple objects using bash-expansion defined range and '--sync'
 #
 # out of band DELETE using remote AIS (remais)
 #
 7. use remote AIS cluster ("remais") to out-of-band remove 10 shards from the source
 8. copy gs://abc => ais://dst-9408 w/ --sync
 9. when copying, we always synchronize content of the in-cluster source as well
10. use remais to out-of-band remove 10 more shards from gs://abc source
11. copy a range of shards from gs://abc to ais://dst-9408, and compare
12. and again: when copying, we always synchronize content of the in-cluster source as well
 #
 # out of band ADD using remote AIS (remais)
 #
13. use remais to out-of-band add (i.e., PUT) 17 new shards
14. copy a range of shards from gs://abc to ais://dst-9408, and check whether the destination has new shards
15. compare the contents but NOTE: as of v3.22, this part requires multi-object copy (using '--list' or '--template')

The script executes a sequence of steps (above).

Notice a certain limitation (that also shows up as the last step #15):

As of version 3.22, aistore cp commands always synchronize deleted and updated remote content.
However, to pick up content that was added out-of-band, you currently need to run a multi-object copy, with the source objects specified using --list or --template.

Show bucket summary

ais storage summary PROVIDER:[//BUCKET_NAME] [command options] - show bucket sizes and the respective percentages of used capacity on a per-bucket basis.

ais bucket summary - same as above.

Options

$ ais storage summary --help

NAME:
   ais storage summary - Show bucket sizes and %% of used capacity on a per-bucket basis

USAGE:
   ais storage summary [BUCKET[/PREFIX]] [PROVIDER] [command options]

OPTIONS:
   --cached          Only list in-cluster objects, i.e., objects from the respective remote bucket that are present ("cached") in the cluster
   --count value     Used together with '--refresh' to limit the number of generated reports, e.g.:
                      '--refresh 10 --count 5' - run 5 times with 10s interval (default: 0)
   --dont-wait       When _summarizing_ buckets do not wait for the respective job to finish -
                     use the job's UUID to query the results interactively
   --no-headers, -H  Display tables without headers
   --prefix value    For each bucket, select only those objects (names) that start with the specified prefix, e.g.:
                     '--prefix a/b/c' - sum up sizes of the virtual directory a/b/c and objects from the virtual directory
                     a/b that have names (relative to this directory) starting with the letter c
   --refresh value   Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --units value     Show statistics and/or parse command-line specified sizes using one of the following units of measurement:
                     iec - IEC format, e.g.: KiB, MiB, GiB (default)
                     si  - SI (metric) format, e.g.: KB, MB, GB
                     raw - do not convert to (or from) human-readable format
   --verbose, -v     Verbose output
   --help, -h        Show help

If BUCKET is omitted, the command applies to all AIS buckets.

The output includes the total number of objects in a bucket, the bucket’s size (bytes, megabytes, etc.), and the percentage of the total capacity used by the bucket.

A few additional words must be said about --validate. The option is provided to run integrity checks, namely: locations of objects, replicas, and EC slices in the bucket, the number of replicas (and whether this number agrees with the bucket configuration), and more.

The location of each stored object must at any point in time correspond to the current cluster map and, within each storage target, to the target’s mountpaths. A failure to abide by the location rules is called misplacement; misplaced objects, if any, must be migrated to their proper locations via automated processes called global rebalance and resilver.

Notes

--validate may take considerable time to execute (depending, of course, on the sizes of the datasets in question and the capabilities of the underlying hardware).
A non-zero count of misplaced objects in the (validated) output is a direct indication that the cluster requires rebalancing and/or resilvering.
An alternative way to execute validation is to run ais storage validate or (simply) ais scrub:

$ ais scrub --help

NAME:
   ais scrub - (alias for "storage validate") Check in-cluster content for misplaced objects, objects that have insufficient numbers of copies, zero size, and more
   e.g.:
     * ais storage validate                 - validate all in-cluster buckets;
     * ais scrub                            - same as above;
     * ais storage validate ais             - validate (a.k.a. scrub) all ais:// buckets;
     * ais scrub s3                         - ditto, all s3:// buckets;
     * ais scrub s3 --refresh 10            - same as above while refreshing runtime counter(s) every 10s;
     * ais scrub gs://abc/images/           - validate part of the gcp bucket under 'images/`;
     * ais scrub gs://abc --prefix images/  - same as above.

USAGE:
   ais scrub [BUCKET[/PREFIX]] [PROVIDER] [command options]

OPTIONS:
   --all-columns          Show all columns, including those with only zero values
   --cached               Only visit in-cluster objects, i.e., objects from the respective remote bucket that are present ("cached") in the cluster
   --count value          Used together with '--refresh' to limit the number of generated reports, e.g.:
                           '--refresh 10 --count 5' - run 5 times with 10s interval (default: 0)
   --large-size value     Count and report all objects that are larger or equal in size  (e.g.: 4mb, 1MiB, 1048576, 128k; default: 5 GiB)
   --limit value          The maximum number of objects to list, get, or otherwise handle (0 - unlimited; see also '--max-pages'),
                          e.g.:
                          - 'ais ls gs://abc/dir --limit 1234 --cached --props size,custom,atime'  - list no more than 1234 objects
                          - 'ais get gs://abc /dev/null --prefix dir --limit 1234'                 - get --/--
                          - 'ais scrub gs://abc/dir --limit 1234'                                  - scrub --/-- (default: 0)
   --max-pages value      Maximum number of pages to display (see also '--page-size' and '--limit')
                          e.g.: 'ais ls az://abc --paged --page-size 123 --max-pages 7 (default: 0)
   --no-headers, -H       Display tables without headers
   --non-recursive, --nr  Non-recursive operation, e.g.:
                          - 'ais ls gs://bucket/prefix --nr'   - list objects and/or virtual subdirectories with names starting with the specified prefix;
                          - 'ais ls gs://bucket/prefix/ --nr'  - list contained objects and/or immediately nested virtual subdirectories _without_ recursing into the latter;
                          - 'ais prefetch s3://bck/abcd --nr'  - prefetch a single named object (see 'ais prefetch --help' for details);
                          - 'ais rmo gs://bucket/prefix --nr'  - remove a single object with the specified name (see 'ais rmo --help' for details)
   --page-size value      Maximum number of object names per page; when the flag is omitted or 0
                          the maximum is defined by the corresponding backend; see also '--max-pages' and '--paged' (default: 0)
   --prefix value         For each bucket, select only those objects (names) that start with the specified prefix, e.g.:
                          '--prefix a/b/c' - sum up sizes of the virtual directory a/b/c and objects from the virtual directory
                          a/b that have names (relative to this directory) starting with the letter c
   --refresh value        Time interval for continuous monitoring; can be also used to update progress bar (at a given interval);
                          valid time units: ns, us (or µs), ms, s (default), m, h
   --small-size value     Count and report all objects that are smaller or equal in size (e.g.: 4, 4b, 1k, 128kib; default: 0)
   --help, -h             Show help

For details and additional examples, please see:

Validate in-cluster content for misplaced objects and missing copies

Examples

# 1. show summary for a specific bucket
$ ais bucket summary ais://abc
NAME             OBJECTS         SIZE ON DISK    USAGE(%)
ais://abc        10902           5.38GiB         1%

For min/avg/max object sizes, use `--fast=false`.

# 2. "summarize" all buckets(*)
$ ais bucket summary
NAME             OBJECTS         SIZE ON DISK    USAGE(%)
ais://abc        10902           5.38GiB         1%
ais://nnn        49873           200.00MiB       0%

# 3. "summarize" all s3:// buckets; count both "cached" and remote objects:
$ ais bucket summary s3: --all

# 4. same as above with progress updates every 3 seconds:
$ ais bucket summary s3: --all --refresh 3

# 5. "summarize" a given gs:// bucket; start the job and exit without waiting for it to finish
# (see prompt below):
$ ais bucket summary gs://abc --all --dont-wait

Job summary[wl-s5lIWA] has started. To monitor, run 'ais storage summary gs://abc wl-s5lIWA --dont-wait' or 'ais show job wl-s5lIWA;
see '--help' for details'

Start N-way Mirroring

ais start mirror BUCKET --copies <value>

Start an extended action to bring a given bucket to a certain redundancy level (value copies). Read more about this feature here.

Options

$ ais start mirror --help

NAME:
   ais start mirror - Configure (or unconfigure) bucket as n-way mirror, and run the corresponding batch job, e.g.:
     - 'ais start mirror ais://m --copies 3'  - configure ais://m as a 3-way mirror;
     - 'ais start mirror ais://m --copies 1'  - configure ais://m for no redundancy (no extra copies).
   (see also: 'ais start ec-encode')

USAGE:
   ais start mirror BUCKET [command options]

OPTIONS:
   --copies value       Number of object replicas (default: 1)
   --non-verbose, --nv  Non-verbose (quiet) output, minimized reporting, fewer warnings
   --help, -h           Show help

Start Erasure Coding

ais start ec-encode BUCKET --data-slices <value> --parity-slices <value>

Start an extended action that encodes and recovers all objects and slices in a given bucket. The action enables erasure coding if it is disabled, and runs the encoding for all objects in the bucket in the background. If erasure coding for the bucket was enabled beforehand, the extended action recovers missing objects and slices if possible.

When running the extended action for a bucket that already has erasure coding enabled, you must pass the bucket's current numbers of data and parity slices on the command line. Run ais bucket props show <bucket-name> ec to see the current erasure coding settings. Read more about this feature here.

Options

$ ais start ec-encode --help

NAME:
   ais start ec-encode - Erasure code entire bucket, e.g.:
     - 'ais start ec-encode ais://nnn -d 8 -p 2'                          - erasure-code ais://nnn for 8 data and 2 parity slices;
     - 'ais start ec-encode ais://nnn --data-slices 8 --parity-slices 2'  - same as above;
     - 'ais start ec-encode ais://nnn --recover'                          - check and make sure that every ais://nnn object is properly erasure-coded.
   see also: 'ais start mirror'

USAGE:
   ais start ec-encode BUCKET [command options]

OPTIONS:
   --data-slices value, -d value    Number of data slices (default: 2)
   --non-verbose, --nv              Non-verbose (quiet) output, minimized reporting, fewer warnings
   --parity-slices value, -p value  Number of parity slices (default: 2)
   --recover                        Check and make sure that each and every object is properly erasure coded
   --help, -h                       Show help

The numbers of data slices and parity slices must both be greater than 0.

Show bucket properties

Overall, the topic called “bucket properties” is rather involved and includes sub-topics “bucket property inheritance” and “cluster-wide global defaults”. For background, please first see:

On the CLI side, run the following command to list the properties of the specified bucket. By default, the properties are shown in a compact, sectioned form.

ais bucket props show BUCKET [PROP_PREFIX] [command options]

When PROP_PREFIX is set, only props that start with PROP_PREFIX are displayed. Useful PROP_PREFIX values are: access, checksum, ec, lru, mirror, provider, versioning.

ais bucket show is an alias for ais show bucket - both can be used interchangeably.

Options

$ ais bucket props show --help

NAME:
   ais bucket props show - Show bucket properties

USAGE:
   ais bucket props show BUCKET [PROP_PREFIX] [command options]

OPTIONS:
   --add             Add remote bucket to cluster's metadata
                       - let's say, s3://abc is accessible but not present in the cluster (e.g., 'ais ls' returns error);
                       - most of the time, there's no need to worry about it as aistore handles presence/non-presence
                         transparently behind the scenes;
                       - but if you do want to (explicitly) add the bucket, you could also use '--add' option
   --compact, -c     Display properties grouped in human-readable mode
   --json, -j        JSON input/output
   --no-headers, -H  Display tables without headers
   --help, -h        Show help

Examples

Show bucket props with provided section

Show all properties of a bucket in compact form, then only the lru section (first compact, then expanded).

$ ais bucket props show s3://bucket-name --compact
PROPERTY     VALUE
access       GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT,HEAD-BUCKET,LIST-OBJECTS,PATCH,SET-BUCKET-ACL,LIST-BUCKETS,SHOW-CLUSTER,CREATE-BUCKET,DESTROY-BUCKET,MOVE-BUCKET,ADMIN
checksum     Type: xxhash | Validate: Nothing
created      2024-01-31T15:42:59-08:00
ec           Disabled
lru          lru.dont_evict_time=2h0m, lru.capacity_upd_time=10m
mirror       Disabled
present      yes
provider     aws
versioning   Disabled

$ ais bucket props show s3://bucket_name lru --compact
PROPERTY     VALUE
lru          lru.dont_evict_time=2h0m, lru.capacity_upd_time=10m

$ ais bucket props show s3://ais-abhishek lru
PROPERTY                 VALUE
lru.capacity_upd_time    10m
lru.dont_evict_time      2h0m
lru.enabled              true

Set bucket properties

ais bucket props set [OPTIONS] BUCKET JSON_SPECIFICATION|KEY=VALUE [KEY=VALUE...]

Set bucket properties. For the available options, see bucket-properties.

If JSON_SPECIFICATION is used, all properties of the bucket are set based on the values in the JSON object.

Options

$ ais bucket props set --help

NAME:
   ais bucket props set - Update bucket properties; the command accepts both JSON-formatted input and plain Name=Value pairs, e.g.:
     * ais bucket props set ais://nnn backend_bck=s3://mmm
     * ais bucket props set ais://nnn backend_bck=none
     * ais bucket props set gs://vvv versioning.validate_warm_get=false versioning.synchronize=true
     * ais bucket props set gs://vvv mirror.enabled=true mirror.copies=4 checksum.type=md5
     * ais bucket props set s3://mmm ec.enabled true ec.data_slices 6 ec.parity_slices 4 --force
     References:
     * for details and many more examples, see docs/cli/bucket.md
     * to show bucket properties (names and current values), use 'ais bucket show'

USAGE:
   ais bucket props set BUCKET JSON-formatted-KEY-VALUE | KEY=VALUE [KEY=VALUE...] [command options]

OPTIONS:
   --force, -f    Force execution of the command (caution: advanced usage only)
   --skip-lookup  Do not execute HEAD(bucket) request to lookup remote bucket and its properties; possible usage scenarios include:
                   1) adding remote bucket to aistore without first checking the bucket's accessibility
                      (e.g., to configure the bucket's aistore properties with alternative security profile and/or endpoint)
                   2) listing public-access Cloud buckets where certain operations (e.g., 'HEAD(bucket)') may be disallowed
   --help, -h     Show help

When JSON specification is not used, some properties support user-friendly aliases:

Property	Value alias	Description
access	`ro`	Disables bucket modifications: denies PUT, DELETE, and ColdGET requests
access	`rw`	Enables object modifications: allows PUT, DELETE, and ColdGET requests
access	`su`	Enables full access: all `rw` permissions, bucket deletion, and changing bucket permissions

Examples

Enable mirroring for a bucket

Set the mirror.enabled and mirror.copies properties to true and 2, respectively, for the bucket bucket_name.

$ ais bucket props set ais://bucket_name 'mirror.enabled=true' 'mirror.copies=2'
Bucket props successfully updated
"mirror.enabled" set to:"true" (was:"false")

Make a bucket read-only

Set read-only access to the bucket bucket_name. All PUT and DELETE requests will fail.

$ ais bucket props set ais://bucket_name 'access=ro'
Bucket props successfully updated
"access" set to:"GET,HEAD-OBJECT,HEAD-BUCKET,LIST-OBJECTS" (was:"<PREV_ACCESS_LIST>")

Configure custom AWS S3 endpoint

When a bucket is hosted by an S3-compatible backend (e.g., minio), we may want to specify an alternative S3 endpoint, so that AIS nodes use it when reading, writing, listing, and generally performing all operations on remote S3 bucket(s).

Globally, the S3 endpoint can be overridden for all S3 buckets via the “S3_ENDPOINT” environment variable. If you decide to make the change, you may need to restart the AIS cluster while making sure that “S3_ENDPOINT” is available to the AIS nodes when they start up.

The endpoint can also be set on a per-bucket basis, in which case the per-bucket setting takes precedence over the global one.

Here are some examples:

# Let's say, there exists a bucket called s3://abc:
$ ais ls s3://abc
NAME             SIZE
README.md        8.96KiB

# First, we override the empty endpoint property in the bucket's configuration.
# To see that a non-empty value *applies* and works, we will use the default AWS S3 endpoint: https://s3.amazonaws.com
$ ais bucket props set s3://abc extra.aws.endpoint=s3.amazonaws.com
Bucket "aws://abc": property "extra.aws.endpoint=s3.amazonaws.com", nothing to do
$ ais ls s3://abc
NAME             SIZE
README.md        8.96KiB

# Second, set the endpoint=foo (or, it could be any other invalid value), and observe that the bucket becomes unreachable:
$ ais bucket props set s3://abc extra.aws.endpoint=foo
Bucket props successfully updated
"extra.aws.endpoint" set to: "foo" (was: "s3.amazonaws.com")
$ ais ls s3://abc
RequestError: send request failed: dial tcp: lookup abc.foo: no such host

# Finally, revert the endpoint back to empty, and check that the bucket is visible again:
$ ais bucket props set s3://abc extra.aws.endpoint=""
Bucket props successfully updated
"extra.aws.endpoint" set to: "" (was: "foo")
$ ais ls s3://abc
NAME             SIZE
README.md        8.96KiB

The global export S3_ENDPOINT=... override is static and read-only. Use it with extreme caution, as it applies to all buckets.

On the other hand, for any given s3://bucket, the S3 endpoint can be set, unset, and otherwise changed at any time at runtime, as shown above.
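
The precedence between the per-bucket setting and the global override can be modeled in a few lines of shell. This is only an illustration of the rule stated above, not AIS code: `effective_endpoint` is a hypothetical helper, the host name is made up, and treating an empty per-bucket value as "fall back to the global setting" is an assumption.

```shell
#!/bin/sh
# effective_endpoint BUCKET_ENDPOINT   (hypothetical helper, illustration only)
# Precedence modeled here:
#   1. non-empty per-bucket extra.aws.endpoint
#   2. non-empty S3_ENDPOINT environment variable
#   3. the default AWS S3 endpoint
effective_endpoint() {
  if [ -n "$1" ]; then
    echo "$1"
  elif [ -n "${S3_ENDPOINT:-}" ]; then
    echo "$S3_ENDPOINT"
  else
    echo "s3.amazonaws.com"
  fi
}

S3_ENDPOINT="minio.example.local:9000"   # hypothetical global override
effective_endpoint ""                    # no per-bucket value: global override applies
effective_endpoint "s3.amazonaws.com"    # per-bucket value takes precedence
```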

Connect/Disconnect AIS bucket to/from cloud bucket

Set the backend bucket for AIS bucket bucket_name to the GCP cloud bucket cloud_bucket. Once the backend bucket is set, operations (get, put, list, etc.) on ais://bucket_name behave exactly as they would on gcp://cloud_bucket. It’s like a symlink to a cloud bucket. The only difference is that all objects are cached in ais://bucket_name (and reflected in the cloud as well) instead of gcp://cloud_bucket.

$ ais bucket props set ais://bucket_name backend_bck=gcp://cloud_bucket
Bucket props successfully updated
"backend_bck.name" set to: "cloud_bucket" (was: "")
"backend_bck.provider" set to: "gcp" (was: "")

To disconnect the cloud bucket, run:

$ ais bucket props set ais://bucket_name backend_bck=none
Bucket props successfully updated
"backend_bck.name" set to: "" (was: "cloud_bucket")
"backend_bck.provider" set to: "" (was: "gcp")

Ignore non-critical errors

To create an erasure-encoded bucket or enable EC for an existing bucket, AIS requires at least ec.data_slices + ec.parity_slices + 1 targets. At the same time, for small objects (size is less than ec.objsize_limit) it is sufficient to have only ec.parity_slices + 1 targets. The --force option allows creating erasure-encoded buckets when the number of targets is insufficient but still exceeds ec.parity_slices.

Note that if the number of targets is less than ec.data_slices + ec.parity_slices + 1, the cluster accepts only objects smaller than ec.objsize_limit. Bigger objects are rejected on PUT.
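
The target-count rule above is simple arithmetic, and a short sketch may help when planning a configuration. The helper below is hypothetical (it is not part of the `ais` CLI) and only encodes the two conditions stated in the text: `data + parity + 1` targets in the general case, and more than `parity` targets when `--force` is used.

```shell
#!/bin/sh
# ec_check TARGETS DATA_SLICES PARITY_SLICES [force]   (hypothetical helper)
# Reports whether an EC configuration would be accepted under the rule described above.
ec_check() {
  targets=$1; data=$2; parity=$3; force=${4:-}
  required=$((data + parity + 1))
  if [ "$targets" -ge "$required" ]; then
    echo "ok"
  elif [ "$force" = "force" ] && [ "$targets" -gt "$parity" ]; then
    echo "ok with --force (only objects smaller than ec.objsize_limit are accepted)"
  else
    echo "rejected (need $required targets, have $targets)"
  fi
}

ec_check 6 6 4          # rejected (need 11 targets, have 6)
ec_check 6 6 4 force    # ok with --force (...)
ec_check 6 6 8 force    # rejected (need 15 targets, have 6)
```

The three calls mirror the 6-target scenarios used in this section.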

The examples below use a cluster with 6 targets:

# Creating a bucket
$ ais create ais://bck --props "ec.enabled=true ec.data_slices=6 ec.parity_slices=4"
Create bucket "ais://bck" failed: EC config (6 data, 4 parity) slices requires at least 11 targets (have 6)

$ ais create ais://bck --props "ec.enabled=true ec.data_slices=6 ec.parity_slices=4" --force
"ais://bck" bucket created

# If the number of targets is less than or equal to ec.parity_slices, even `--force` does not help

$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 8
EC config (6 data, 8 parity)slices requires at least 15 targets (have 6). To show bucket properties, run "ais show bucket BUCKET -v".

$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 8 --force
EC config (6 data, 8 parity)slices requires at least 15 targets (have 6). To show bucket properties, run "ais show bucket BUCKET -v".

# Use force to enable EC if the number of targets is sufficient to keep `ec.parity_slices+1` replicas

$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 4
EC config (6 data, 4 parity)slices requires at least 11 targets (have 6). To show bucket properties, run "ais show bucket BUCKET -v".

$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 4 --force
Bucket props successfully updated
"ec.enabled" set to: "true" (was: "false")
"ec.parity_slices" set to: "4" (was: "2")

Once erasure encoding is enabled for a bucket, the number of data and parity slices cannot be modified. The minimum object size ec.objsize_limit can be changed on the fly. To avoid accidental modification when EC for a bucket is enabled, the option --force must be used.

$ ais bucket props set ais://bck ec.enabled true
Bucket props successfully updated
"ec.enabled" set to: "true" (was: "false")

$ ais bucket props set ais://bck ec.objsize_limit 320000
P[dBbfp8080]: once enabled, EC configuration can be only disabled but cannot change. To show bucket properties, run "ais show bucket BUCKET -v".

$ ais bucket props set ais://bck ec.objsize_limit 320000 --force
Bucket props successfully updated
"ec.objsize_limit" set to:"320000" (was:"262144")

Set bucket properties with JSON

Set all bucket properties for bucket_name bucket based on the provided JSON specification.

$ ais bucket props set ais://bucket_name '{
>     "provider": "ais",
>     "versioning": {
>       "enabled": true,
>       "validate_warm_get": false
>     },
>     "checksum": {
>       "type": "xxhash",
>       "validate_cold_get": true,
>       "validate_warm_get": false,
>       "validate_obj_move": false,
>       "enable_read_range": false
>     },
>     "lru": {
>       "dont_evict_time": "20m",
>       "capacity_upd_time": "1m",
>       "enabled": true
>     },
>     "mirror": {
>       "copies": 2,
>       "burst_buffer": 512,
>       "enabled": false
>     },
>     "ec": {
>         "objsize_limit": 256000,
>         "data_slices": 2,
>         "parity_slices": 2,
>         "enabled": true
>     },
>     "access": "255"
> }'
"access" set to: "GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT" (was: "GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT,HEAD-BUCKET,LIST-OBJECTS,PATCH,SET-BUCKET-ACL,LIST-BUCKETS,SHOW-CLUSTER,CREATE-BUCKET,DESTROY-BUCKET,MOVE-BUCKET,ADMIN")
"ec.enabled" set to: "true" (was: "false")
"ec.objsize_limit" set to: "256000" (was: "262144")
"lru.capacity_upd_time" set to: "1m" (was: "10m")
"lru.dont_evict_time" set to: "20m" (was: "1s")
"lru.enabled" set to: "true" (was: "false")
"mirror.enabled" set to: "false" (was: "true")

Bucket props successfully updated.

$ ais show bucket ais://bucket_name --compact
PROPERTY     VALUE
access       GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT
checksum     Type: xxhash | Validate: ColdGET
created      2024-02-02T12:57:17-08:00
ec           2:2 (250KiB)
lru          lru.dont_evict_time=20m, lru.capacity_upd_time=1m
mirror       Disabled
present      yes
provider     ais
versioning   Enabled | Validate on WarmGET: no
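
In the JSON specification above, `"access": "255"` is a numeric bitmask, while the CLI reports it as a list of permission names. The sketch below decodes such a value under one explicit assumption: that the n-th name, in the order the CLI prints a fully open bucket's access list, corresponds to bit n-1. This mapping is inferred from the outputs shown in this section and is not confirmed here against the AIS source; `decode_access` is a hypothetical helper.

```shell
#!/bin/sh
# Permission names in the order the CLI prints them for a fully open bucket.
# ASSUMPTION: the n-th name corresponds to bit n-1 of the numeric access value.
ACCESS_NAMES="GET HEAD-OBJECT PUT APPEND DELETE-OBJECT MOVE-OBJECT PROMOTE UPDATE-OBJECT HEAD-BUCKET LIST-OBJECTS PATCH SET-BUCKET-ACL LIST-BUCKETS SHOW-CLUSTER CREATE-BUCKET DESTROY-BUCKET MOVE-BUCKET ADMIN"

# decode_access MASK: print the comma-separated names whose bits are set in MASK.
decode_access() {
  mask=$1; bit=1; out=""
  for name in $ACCESS_NAMES; do
    if [ $((mask & bit)) -ne 0 ]; then
      out="${out:+$out,}$name"
    fi
    bit=$((bit << 1))
  done
  echo "$out"
}

decode_access 255
```

Under that assumption, 255 (the lowest eight bits) decodes to the same eight object-level permissions that the `ais bucket props set` output reports above.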

If not all properties are mentioned in the JSON, the missing ones are set to zero values (empty / false / nil):

$ ais bucket props set ais://bucket-name '{
>   "mirror": {
>     "enabled": true,
>     "copies": 2
>   },
>   "versioning": {
>     "enabled": true,
>     "validate_warm_get": true
>   }
> }'
"mirror.enabled" set to: "true" (was: "false")
"versioning.validate_warm_get" set to: "true" (was: "false")

Bucket props successfully updated.

$ ais show bucket ais://bucket-name --compact
PROPERTY     VALUE
access       GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT,HEAD-BUCKET,LIST-OBJECTS,PATCH,SET-BUCKET-ACL,LIST-BUCKETS,SHOW-CLUSTER,CREATE-BUCKET,DESTROY-BUCKET,MOVE-BUCKET,ADMIN
checksum     Type: xxhash | Validate: Nothing
created      2024-02-02T12:52:30-08:00
ec           Disabled
lru          lru.dont_evict_time=2h0m, lru.capacity_upd_time=10m
mirror       2 copies
present      yes
provider     ais
versioning   Enabled | Validate on WarmGET: yes

Archive multiple objects

ais archive bucket - Archive selected or matching objects from SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] as (.tar, .tgz or .tar.gz, .zip, .tar.lz4)-formatted object (a.k.a. shard).

$ ais archive bucket --help
NAME:
   ais archive bucket - Archive selected or matching objects from SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] as
   (.tar, .tgz or .tar.gz, .zip, .tar.lz4)-formatted object (a.k.a. "shard"):
     - 'ais archive bucket ais://src gs://dst/a.tar.lz4 --template "trunk-{001..997}"'       - archive (prefix+range) matching objects from ais://src;
     - 'ais archive bucket "ais://src/trunk-{001..997}" gs://dst/a.tar.lz4'                  - same as above (notice double quotes);
     - 'ais archive bucket "ais://src/trunk-{998..999}" gs://dst/a.tar.lz4 --append-or-put'  - add two more objects to an existing shard;
     - 'ais archive bucket s3://src/trunk-00 ais://dst/b.tar'                                - archive "trunk-00" prefixed objects from an s3 bucket as a given TAR destination

USAGE:
   ais archive bucket SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] DST_BUCKET/SHARD_NAME [command options]

OPTIONS:
   --append-or-put        Append to an existing destination object ("archive", "shard") iff exists; otherwise PUT a new archive (shard);
                          note that PUT (with subsequent overwrite if the destination exists) is the default behavior when the flag is omitted
   --cont-on-err          Keep running archiving xaction (job) in presence of errors in any given multi-object transaction
   --dry-run              Preview the results without really running the action
   --include-src-bck      Prefix the names of archived files with the source bucket name
   --list                 Comma-separated list of object or file names, e.g.:
                          --list 'o1,o2,o3'
                          --list "abc/1.tar, abc/1.cls, abc/1.jpeg"
                          or, when listing files and/or directories:
                          --list "/home/docs, /home/abc/1.tar, /home/abc/1.jpeg"
   --non-recursive, --nr  Non-recursive operation, e.g.:
                          - 'ais ls gs://bck/sub --nr'               - list objects and/or virtual subdirectories with names starting with the specified prefix;
                          - 'ais ls gs://bck/sub/ --nr'              - list only immediate contents of 'sub/' subdirectory (non-recursive);
                          - 'ais prefetch s3://bck/abcd --nr'        - prefetch a single named object;
                          - 'ais evict gs://bck/sub/ --nr'           - evict only immediate contents of 'sub/' subdirectory (non-recursive);
                          - 'ais evict gs://bck --prefix=sub/ --nr'  - same as above
   --prefix               Select virtual directories or objects with names starting with the specified prefix, e.g.:
                          '--prefix a/b/c'   - matches names 'a/b/c/d', 'a/b/cdef', and similar;
                          '--prefix a/b/c/'  - only matches objects from the virtual directory a/b/c/
   --skip-lookup          Skip checking source and destination buckets' existence (trading off extra lookup for performance)
   --template             Template to match object or file names; may contain prefix (that could be empty) with zero or more ranges
                          (with optional steps and gaps), e.g.:
                          --template "" # (an empty or '*' template matches everything)
                          --template 'dir/subdir/'
                          --template 'shard-{1000..9999}.tar'
                          --template "prefix-{0010..0013..2}-gap-{1..2}-suffix"
                          and similarly, when specifying files and directories:
                          --template '/home/dir/subdir/'
                          --template "/abc/prefix-{0010..9999..2}-suffix"
   --wait                 Wait for an asynchronous operation to finish (optionally, use '--timeout' to limit the waiting time)
   --help, -h             Show help
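
To preview which names a range template such as `trunk-{001..003}` or `prefix-{0010..0013..2}-gap-{1..2}-suffix` is expected to cover, the expansion can be approximated locally. The sketch below is not the AIS template parser: `expand_range` is a hypothetical helper, and it assumes that ranges are inclusive, that the optional third number is a step, and that numbers are zero-padded to the width of the start value as written.

```shell
#!/bin/sh
# expand_range PREFIX START END [STEP] [SUFFIX]   (hypothetical helper)
# Prints one name per line for START..END inclusive, zero-padded to the width of START.
expand_range() {
  prefix=$1; start=$2; end=$3; step=${4:-1}; suffix=${5:-}
  width=${#start}
  # Drop leading zeros so that shell arithmetic does not treat the values as octal.
  i=$(printf '%s' "$start" | sed 's/^0*\([0-9]\)/\1/')
  last=$(printf '%s' "$end" | sed 's/^0*\([0-9]\)/\1/')
  while [ "$i" -le "$last" ]; do
    printf "%s%0${width}d%s\n" "$prefix" "$i" "$suffix"
    i=$((i + step))
  done
}

# 'trunk-{001..003}'
expand_range "trunk-" 001 003

# 'prefix-{0010..0013..2}-gap-{1..2}-suffix': one loop per range
for base in $(expand_range "prefix-" 0010 0013 2); do
  expand_range "${base}-gap-" 1 2 1 "-suffix"
done
```

Running `ais archive bucket` with `--dry-run` (listed in the options above) remains the authoritative way to preview an actual operation.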

See also:

Build and summarize shard indexes

ais bucket shard-index builds and summarizes shard indexes for TAR objects. Indexes allow AIS to read archived files via direct lookup instead of scanning the TAR object.

For motivation, access semantics, and implementation details, see Shard Index.

$ ais bucket shard-index build ais://mybucket --prefix shards/ --wait
$ ais bucket shard-index summary ais://mybucket --prefix shards/
BUCKET             TAR OBJECTS  TAR SIZE  SHARDS  SHARD SIZE  NOT INDEXED  ARCHIVED OBJECTS  STALE  INVALID
ais://mybucket     100          600MiB    100     600MiB      0            409600            0      0

Use --refresh to print periodic progress while waiting:

$ ais bucket shard-index summary ais://mybucket --refresh 1s
ais://mybucket: 42/100 indexed (252MiB indexed, 600MiB total)

Use --dont-wait to start the summary job asynchronously, then poll with the returned job ID:

$ ais bucket shard-index summary ais://mybucket --dont-wait
Job shard-summary[abcDEF123] has started. To monitor, run 'ais bucket shard-index summary ais://mybucket abcDEF123 --dont-wait'

$ ais bucket shard-index summary ais://mybucket abcDEF123 --dont-wait

Show and set AWS-specific properties

AIStore supports AWS-specific configuration on a per-bucket basis. Any bucket that is backed by an AWS S3 bucket (**) can be configured to use:

named AWS profiles (with alternative credentials and/or region)
alternative s3 endpoints

For background and usage examples, please see AWS-specific bucket configuration.

(**) Terminology-wise, “s3 bucket” is a shortcut phrase indicating a bucket in an AIS cluster that either (A) has the same name (e.g. s3://abc) or (B) is a differently named AIS bucket whose backend_bck property specifies the s3 bucket in question.

Reset bucket properties to cluster defaults

ais bucket props reset BUCKET

Reset bucket properties to cluster defaults.

Examples

$ ais bucket props reset bucket_name
Bucket props successfully reset

Show bucket metadata

ais show cluster bmd

Show bucket metadata (BMD).

Examples

$ ais show cluster bmd
PROVIDER  NAMESPACE  NAME        BACKEND  COPIES  EC(D/P, minsize)  CREATED
ais                  test                 2                         25 Mar 21 18:28 PDT
ais                  validation                                     25 Mar 21 18:29 PDT
ais                  train                                          25 Mar 21 18:28 PDT

Version:        9
UUID:           jcUfFDyTN
