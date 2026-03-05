Install the AWS CLI version 2 by following the instructions in aws-cli-getting-started.
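One way to confirm that version 2 is the one on the PATH is to look at the banner printed by `aws --version`. The sketch below assumes the banner starts with `aws-cli/<version>`; the helper name `is_aws_cli_v2` is hypothetical and the version string in the comment is only an illustration.

```shell
#!/usr/bin/env bash
# Returns 0 when the given version banner belongs to AWS CLI major version 2.
# The banner is the line printed by `aws --version`,
# for example "aws-cli/2.15.30 Python/3.11.8 ..." (illustrative value).
is_aws_cli_v2() {
  case "$1" in
    aws-cli/2.*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Only query the CLI when it is actually on the PATH.
if command -v aws >/dev/null 2>&1; then
  banner="$(aws --version 2>&1)"
  if is_aws_cli_v2 "$banner"; then
    echo "AWS CLI v2 detected: $banner"
  else
    echo "AWS CLI found, but it is not version 2: $banner" >&2
  fi
else
  echo "AWS CLI not found on PATH" >&2
fi
```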
Set the configuration settings and credentials of the AWS CLI by creating credentials and config files as described in aws-cli-configure-files.
If the AWS CLI configuration doesn’t use the default values, explicitly set the environment variables that the tools CLI picks up, such as:
AWS_PROFILE,
AWS_DEFAULT_REGION,
AWS_CONFIG_FILE,
AWS_SHARED_CREDENTIALS_FILE.
Refer to the full list of variables in aws-cli-configure-envvars.
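As a rough illustration of a non-default setup, the variables listed above could be exported in the shell that launches the tools. The profile name, region, and file paths here are hypothetical placeholders, not values the tools require.

```shell
#!/usr/bin/env bash
# Hypothetical values: replace the profile name, region, and file paths
# with the ones that match your own setup.
export AWS_PROFILE="emr-tools"
export AWS_DEFAULT_REGION="us-west-2"
export AWS_CONFIG_FILE="$HOME/.aws-emr/config"
export AWS_SHARED_CREDENTIALS_FILE="$HOME/.aws-emr/credentials"

# Print only these four variables so that no secret values are echoed.
for name in AWS_PROFILE AWS_DEFAULT_REGION AWS_CONFIG_FILE AWS_SHARED_CREDENTIALS_FILE; do
  printf '%s=%s\n' "$name" "$(printenv "$name")"
done
```

If the AWS CLI is installed, `aws configure list` reports which profile, region, and credential source are in effect, which is a quick way to confirm that the exports were picked up.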
It’s important to configure the correct region for the S3 bucket being used. If the region isn’t set, the AWS SDK chooses a default value that may not be valid. In addition, the tools CLI inspects the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables if the credentials can’t be pulled from the credential files.
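A small pre-flight check can surface a missing region or missing fallback credentials before the tools run. This is only a sketch: the function name `check_aws_env` is hypothetical, and it inspects environment variables only. It does not read the config or credential files, so a warning here is not necessarily an error.

```shell
#!/usr/bin/env bash
# Warn when the region or the fallback credential variables are absent from
# the environment. Returns 0 when all three are set, 1 otherwise.
check_aws_env() {
  rc=0
  if [ -z "${AWS_DEFAULT_REGION:-}" ]; then
    echo "AWS_DEFAULT_REGION is not set; the region must come from the config file" >&2
    rc=1
  fi
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not both set; credentials must come from the credential files" >&2
    rc=1
  fi
  return "$rc"
}

check_aws_env || echo "Review the warnings above before running the tools" >&2
```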
The Spark event logs are stored in HDFS on the EMR cluster at the path /var/log/spark/apps/. Make sure the logs are copied to S3 (or a local directory), and provide that path when you run the Qualification tool.
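One possible way to get the event logs out of HDFS is to pull them to a local directory with `hdfs dfs -get` and then upload them with `aws s3 cp --recursive`. The sketch below is meant to run on an EMR node where both commands are available; the bucket name and local directory are hypothetical placeholders. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: copy Spark event logs out of HDFS, then upload them to S3.
# The local directory and bucket below are hypothetical placeholders.
HDFS_EVENTLOG_DIR="/var/log/spark/apps"
LOCAL_DIR="./spark-eventlogs"
S3_DEST="s3://my-bucket/spark-eventlogs/"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

copy_event_logs() {
  # Step 1: HDFS -> local directory on the node.
  run hdfs dfs -get "$HDFS_EVENTLOG_DIR" "$LOCAL_DIR"
  # Step 2: local directory -> S3 (skip this step to keep a local copy only).
  run aws s3 cp "$LOCAL_DIR" "$S3_DEST" --recursive
}

copy_event_logs
```

After a real run, either the S3 destination or the local directory is the path to hand to the Qualification tool.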
Note
To run tools that require SSH on the EMR nodes (that is, bootstrap):
make sure that you have SSH access to the cluster nodes; and
create a key pair using Amazon EC2 through the AWS CLI command aws ec2 create-key-pair, as instructed in aws-cli-create-key-pairs.
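The key pair step might look like the following sketch. The key name is a hypothetical placeholder, and the function is only defined here, not called, so nothing is created in your account until you invoke it yourself.

```shell
#!/usr/bin/env bash
# Sketch: create an EC2 key pair and store the private key locally.
# KEY_NAME is a hypothetical placeholder; choose your own name.
KEY_NAME="my-emr-key-pair"
KEY_FILE="${KEY_NAME}.pem"

create_key_pair() {
  # --query and --output extract only the private key material
  # from the command's response.
  aws ec2 create-key-pair \
    --key-name "$KEY_NAME" \
    --query 'KeyMaterial' \
    --output text > "$KEY_FILE" &&
  # SSH refuses private keys that other users can read.
  chmod 400 "$KEY_FILE"
}

echo "Call create_key_pair to write the private key to $KEY_FILE"
```

The private key is useful for SSH only if the cluster was launched with this key pair; in that case the file can be passed to `ssh -i` when connecting to the cluster nodes.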