Aws Profile Endpoint

AIStore supports vendor-specific configuration on a per-bucket basis. For instance, any bucket backed by an AWS S3 bucket (**) can be configured to use alternative:

  • named AWS profiles (with alternative credentials and/or AWS region)
  • s3 endpoints

(**) Terminology-wise, when we say “s3 bucket” or “google cloud bucket” we in fact reference a bucket in an AIS cluster that is either:

  • (A) denoted with the respective s3: or gs: protocol schema, or
  • (B) a differently named AIS (that is, ais://) bucket whose backend_bck property references the s3 (or google cloud) bucket in question.

For supported backends (that include, but are not limited to, AWS S3), see also:

Viewing vendor-specific properties

While ais show bucket shows all properties (a lengthy list), the way to focus on vendor-specific extensions is to look for the section called “extra”. For example:

$ ais show bucket s3://abc | grep extra

or, same:

$ ais bucket props show s3://abc extra

The typical result would include the following defaults:

extra.aws.cloud_region      us-east-2
extra.aws.endpoint
extra.aws.max_pagesize      0
extra.aws.multipart_size    0B
extra.aws.profile

Notice that the bucket’s region (cloud_region above) is automatically populated when AIS looks up the bucket in s3. The endpoint and profile properties, on the other hand, are settable and can provide alternative credentials and/or an alternative access endpoint.
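To pick a single value out of that listing in a script, a small filter can help. The following is a minimal sketch, assuming the two-column "name value" layout shown above; `prop_value` is a hypothetical helper name, and the sample text is copied from this page rather than taken from a live cluster.

```shell
# prop_value NAME
# Reads "name value" lines on stdin and prints the value of NAME.
# Prints an empty line when the property is present but unset,
# and returns non-zero when the property is not listed at all.
prop_value() {
    awk -v key="$1" '$1 == key { print $2; found = 1 } END { exit !found }'
}

# Sample input copied from the listing above (not live cluster output).
sample='extra.aws.cloud_region us-east-2
extra.aws.endpoint
extra.aws.max_pagesize 0
extra.aws.multipart_size 0B
extra.aws.profile'

printf '%s\n' "$sample" | prop_value extra.aws.cloud_region   # prints us-east-2
```

Against a real cluster the same filter could be fed from `ais bucket props show s3://abc extra`, provided the output keeps this layout.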

Environment variables

AIStore supports the well-known S3_ENDPOINT and AWS_PROFILE environment variables. While S3_ENDPOINT is often used to utilize an AIS cluster as an s3-providing service, the configurable AWS_PROFILE specifies what’s called a named configuration profile.

The rule is simple:

  • S3_ENDPOINT and AWS_PROFILE are loaded once upon AIS node startup.
  • Bucket configuration takes precedence over the environment and can be changed at any time.
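The precedence rule can be pictured with a few lines of shell. This is only an illustration of the rule as stated above, not AIS's actual implementation; `effective_setting` and the profile name "staging" are hypothetical.

```shell
# effective_setting BUCKET_VALUE ENV_VALUE
# A non-empty bucket property wins; otherwise the value the node read
# from its environment at startup applies.
effective_setting() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

effective_setting "prod" "staging"   # bucket property set: prints prod
effective_setting ""     "staging"   # bucket property empty: prints staging
```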

Setting profile with alternative access/secret keys and/or region

Assuming, on the one hand:

$ cat ~/.aws/config
[default]
region = us-east-2

[profile prod]
region = us-west-1

and

$ cat ~/.aws/credentials
[default]
aws_access_key_id = foo
aws_secret_access_key = bar

[prod]
aws_access_key_id = 123
aws_secret_access_key = 456

on the other, we can then go ahead and set the “prod” profile directly into the bucket:

$ ais bucket props set s3://abc extra.aws.profile prod
"extra.aws.profile" set to: "prod" (was: "")

and show resulting “extra.aws” configuration:

$ ais show bucket s3://abc | grep extra
extra.aws.cloud_region      us-west-1
extra.aws.endpoint
extra.aws.profile           prod

From this point on, all calls to read, write, and list s3://abc, and to get/set its properties, will use the AWS “prod” profile (see above).
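Before pointing a bucket at a named profile, it can be useful to confirm that the profile is actually defined. The sketch below assumes the two-file layout shown above, where the config file uses a "[profile NAME]" header (or "[default]") and the credentials file uses "[NAME]". `has_profile` is a hypothetical helper, and it only matches headers that appear on a line by themselves with no trailing whitespace.

```shell
# has_profile NAME [CONFIG_FILE] [CREDENTIALS_FILE]
# Succeeds when NAME has a section in both files.
has_profile() {
    name=$1
    config=${2:-$HOME/.aws/config}
    creds=${3:-$HOME/.aws/credentials}
    # The config file prefixes non-default profiles with "profile ".
    if [ "$name" = default ]; then
        cfg_header='[default]'
    else
        cfg_header="[profile $name]"
    fi
    # -x: match the whole line, -F: treat the brackets literally.
    grep -qxF "$cfg_header" "$config" 2>/dev/null &&
        grep -qxF "[$name]" "$creds" 2>/dev/null
}

# Example:
#   has_profile prod && ais bucket props set s3://abc extra.aws.profile prod
```

Note that the check would need to run where the AIS nodes read their AWS files, which is not necessarily the machine running the CLI.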

When bucket does not exist

But what if we need to set an alternative profile (with alternative access and secret keys) on a bucket that does not yet exist in the cluster?

That must be a fairly common situation, and the way to resolve it is to use the --skip-lookup option:

$ ais create --help
...
OPTIONS:
   --props value    bucket properties, e.g. --props="mirror.enabled=true mirror.copies=4 checksum.type=md5"
   --skip-lookup    add Cloud bucket to aistore without checking the bucket's accessibility and getting its Cloud properties
                    (usage must be limited to setting up bucket's aistore properties with alternative profile and/or endpoint)

$ ais create s3://abc --skip-lookup
"s3://abc" created

Once this is done (**), we simply go ahead and run ais bucket props set s3://abc extra.aws.profile (as shown above). Assuming the updated profile contains correct access keys, the bucket will then be fully available for reading, writing, listing, and all other operations.

(**) The ais create command adds the bucket to the aistore BMD: protected, versioned, and replicated bucket metadata that is then used to update the properties of any bucket in the cluster, including, of course, the one that we have just added.
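The two steps can be wrapped into one helper. This is a minimal sketch built only from the commands shown above; `add_bucket_with_profile` and the `AIS` override variable are hypothetical conveniences, the latter so that the sequence can be previewed without a cluster.

```shell
# add_bucket_with_profile BUCKET PROFILE
# AIS names the CLI binary; setting AIS=echo turns the call into a dry run.
add_bucket_with_profile() {
    ais_cmd=${AIS:-ais}
    # Step 1: register the bucket without contacting the backend.
    "$ais_cmd" create "$1" --skip-lookup || return 1
    # Step 2: attach the named AWS profile to the bucket's properties.
    "$ais_cmd" bucket props set "$1" extra.aws.profile "$2"
}

# Dry run in a subshell: prints the arguments of both commands
# instead of executing them.
(AIS=echo; add_bucket_with_profile s3://abc prod)
```

Without the `AIS=echo` override, the helper calls the real `ais` binary and stops if the create step fails.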

Configuring custom AWS S3 endpoint

When a bucket is hosted by an S3-compliant backend (such as minio), we may want to specify an alternative S3 endpoint, so that AIS nodes use it when reading, writing, listing, and generally performing all operations on remote S3 bucket(s).

Globally, the S3 endpoint can be overridden for all S3 buckets via the “S3_ENDPOINT” environment variable.

If you decide to make the change, you may need to restart AIS cluster while making sure that “S3_ENDPOINT” is available for the AIS nodes when they are starting up.

But it can also be done on a per-bucket basis, in which case the bucket setting takes precedence over the global one.

Here are some examples:

# Let's say, there exists a bucket called s3://abc:
$ ais ls s3://abc
NAME         SIZE
README.md    8.96KiB

First, we override the empty endpoint property in the bucket’s configuration. To see that a non-empty value applies and works, we will use the default AWS S3 endpoint: https://s3.amazonaws.com

$ ais bucket props set s3://abc extra.aws.endpoint=s3.amazonaws.com
Bucket "aws://abc": property "extra.aws.endpoint=s3.amazonaws.com", nothing to do
$ ais ls s3://abc
NAME         SIZE
README.md    8.96KiB

Second, set the endpoint=foo (or, it could be any other invalid value), and observe that the bucket becomes unreachable:

$ ais bucket props set s3://abc extra.aws.endpoint=foo
Bucket props successfully updated
"extra.aws.endpoint" set to: "foo" (was: "s3.amazonaws.com")

$ ais ls s3://abc
RequestError: send request failed: dial tcp: lookup abc.foo: no such host

Finally, revert the endpoint back to empty, and check that the bucket is visible again:

$ ais bucket props set s3://abc extra.aws.endpoint=""
Bucket props successfully updated
"extra.aws.endpoint" set to: "" (was: "foo")

$ ais ls s3://abc
NAME         SIZE
README.md    8.96KiB

The global export S3_ENDPOINT=... override is static and read-only. Use it with extreme caution as it applies to all buckets.

On the other hand, for any given s3://bucket its S3 endpoint can be set, unset, and otherwise changed at any time at runtime, as shown above.
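Since a wrong endpoint makes the bucket unreachable, a change can be paired with an immediate check and a rollback. The sketch below strings together the same commands used in the walkthrough above. `try_endpoint` and the `AIS` override are hypothetical, and it assumes that `ais ls` exits with a non-zero status when the backend cannot be reached.

```shell
# try_endpoint BUCKET NEW_ENDPOINT [OLD_ENDPOINT]
# Sets the endpoint, verifies it by listing the bucket, and restores
# OLD_ENDPOINT (empty by default) if the listing fails.
try_endpoint() {
    ais_cmd=${AIS:-ais}
    "$ais_cmd" bucket props set "$1" "extra.aws.endpoint=$2" || return 1
    # Assumption: a failed listing is reported through the exit status.
    if "$ais_cmd" ls "$1" >/dev/null 2>&1; then
        return 0
    fi
    # Listing failed: put the previous endpoint back.
    "$ais_cmd" bucket props set "$1" "extra.aws.endpoint=${3:-}"
    return 1
}

# Example:
#   try_endpoint s3://abc foo || echo "endpoint rejected, reverted"
```

Listing a very large bucket only to test reachability can be slow, so a cheaper probe may be preferable in practice.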

Multipart size threshold

Multipart upload size threshold is, effectively, yet another performance tunable that, according to Amazon documentation, must be greater than or equal to 5MB.

Further,

In addition, you can always configure a different value on a per-bucket basis, as follows:

$ ais bucket props show s3://abc extra.aws.multipart_size
PROPERTY                    VALUE
extra.aws.multipart_size    0B

$ ais bucket props set s3://abc extra.aws.multipart_size 1gb
"extra.aws.multipart_size" set to: "1GiB" (was: "0B")

Bucket props successfully updated.

To show all AWS-specific settings for the bucket:

$ ais bucket props show s3://abc extra.aws
PROPERTY                    VALUE
extra.aws.cloud_region      us-east-2
extra.aws.endpoint
extra.aws.max_pagesize      0
extra.aws.multipart_size    1GiB
extra.aws.profile
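A candidate value can be sanity-checked locally before it is applied. The following sketch interprets the 5MB minimum mentioned above as 5 MiB (5 × 1024 × 1024 bytes), which is an assumption, and also accepts the two special values that appear on this page: 0 (the default) and -1 (which disables multipart uploads, see the next section). `valid_multipart_size` is a hypothetical helper that takes a plain byte count; whether AIS itself enforces the same bounds is not confirmed here.

```shell
# valid_multipart_size BYTES
# Succeeds for -1, 0, or any whole number of bytes >= 5 MiB.
valid_multipart_size() {
    min=$((5 * 1024 * 1024))
    case $1 in
        -1|0) return 0 ;;              # special values used on this page
        ''|*[!0-9]*) return 1 ;;       # reject empty or non-numeric input
    esac
    [ "$1" -ge "$min" ]
}

valid_multipart_size 1073741824 && echo "1GiB is acceptable"
valid_multipart_size 1048576    || echo "1MiB is below the minimum"
```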

Disabling MultiPart Uploads

Some 3rd party s3 providers don’t fully support the aws-chunked-encoding used by the AWS SDK for multipart upload.

With recent updates, the s3 SDK now by default provides the checksum for any supported API, and does so using the proprietary aws-chunked-encoding. AWS provides a way to disable this with the option request_checksum_calculation=when_required. However, as of this writing, the s3 manager tool does not support this option. For these backend buckets, the checksum cannot be read from the client and users will see this error:

InvalidArgument: x-amz-content-sha256 must be UNSIGNED-PAYLOAD, or a valid sha256 value

To disable multipart uploads for compatibility with these backends (or any other reason), you can set

extra.aws.multipart_size -1

NOTE: Setting this to -1 will result in much slower “single-part” uploads.
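To decide whether a failed upload is a case for this setting, the error text can be matched mechanically. This is a minimal sketch that simply looks for the message quoted above in text read from standard input; `is_chunked_encoding_error` is a hypothetical helper, and where the error actually surfaces (CLI output or node logs) depends on the deployment.

```shell
# is_chunked_encoding_error
# Succeeds when stdin contains the InvalidArgument message quoted above.
is_chunked_encoding_error() {
    grep -qF 'x-amz-content-sha256 must be UNSIGNED-PAYLOAD'
}

# Example, with a hypothetical file upload.log holding captured error output:
#   is_chunked_encoding_error < upload.log &&
#       echo "consider setting extra.aws.multipart_size to -1 for this bucket"
```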
