Users can pass CLUSTER_PROPS, the path to a cluster property file (in JSON or YAML format), to the command. This is useful if the cluster is inaccessible or has been permanently deleted.
spark_rapids qualification --cluster $CLUSTER_PROPS [flags]
For an on-prem platform, the user defines the cluster configuration manually. The following is a sample cluster property file CLUSTER_PROPS in YAML format.
config:
  masterConfig:
    numCores: 2
    memory: 7680MiB
  workerConfig:
    numCores: 8
    memory: 7680MiB
    numWorkers: 2
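Since the property file may also be JSON, the same on-prem configuration can be generated with a short script. The following is a minimal sketch, not part of the tool: the helper names build_onprem_cluster_props and to_json and the output file name are hypothetical, and the nesting (numWorkers under workerConfig) is assumed from the YAML sample above.

```python
import json
from pathlib import Path


def build_onprem_cluster_props(master_cores, master_memory,
                               worker_cores, worker_memory, num_workers):
    """Return a dict mirroring the sample YAML layout (hypothetical helper)."""
    return {
        "config": {
            "masterConfig": {
                "numCores": master_cores,
                "memory": master_memory,
            },
            "workerConfig": {
                "numCores": worker_cores,
                "memory": worker_memory,
                # Assumption: numWorkers is nested under workerConfig.
                "numWorkers": num_workers,
            },
        }
    }


def to_json(props):
    """Serialize the property dict as indented JSON text."""
    return json.dumps(props, indent=2) + "\n"


if __name__ == "__main__":
    # Same values as the YAML sample; the file name is only an example.
    props = build_onprem_cluster_props(2, "7680MiB", 8, "7680MiB", 2)
    Path("cluster_props.json").write_text(to_json(props), encoding="utf-8")
```

The resulting file path can then be used as CLUSTER_PROPS in the command shown earlier.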
target_platform is required for on-prem clusters. Currently, Dataproc is the only supported target platform.
Given a Dataproc CLUSTER_NAME, users can generate the cluster property file CLUSTER_PROPS with the following command (refer to the gcloud CLI docs).
gcloud dataproc clusters describe $CLUSTER_NAME > $CLUSTER_PROPS
Given an EMR CLUSTER_ID, users can generate the cluster property file CLUSTER_PROPS with the following command (refer to the AWS CLI docs).
aws emr describe-cluster --cluster-id $CLUSTER_ID > $CLUSTER_PROPS
Given a Databricks CLUSTER_ID, users can generate the cluster property file CLUSTER_PROPS with the following command (refer to the Databricks CLI docs).
databricks clusters get $CLUSTER_ID > $CLUSTER_PROPS
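Because the commands above redirect output straight into CLUSTER_PROPS, a failed call can leave an empty or partial file behind. The following is a rough sketch of a sanity check to run before passing the file to the tool. It is not part of any of these CLIs: the helper names are hypothetical, and it only handles JSON output. The AWS and Databricks CLIs typically emit JSON, whereas gcloud prints YAML unless told otherwise, so a YAML file would need a YAML parser instead.

```python
import json
import sys
from pathlib import Path


def parse_json_cluster_props(text, source="<input>"):
    """Parse property-file text as JSON and return the top-level object.

    Hypothetical helper: raises ValueError with a readable message if the
    text is empty, is not valid JSON, or is not a JSON object.
    """
    if not text.strip():
        raise ValueError(f"{source} is empty; the describe command may have failed")
    try:
        props = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"{source} is not valid JSON: {err}") from err
    if not isinstance(props, dict):
        raise ValueError(f"{source} does not contain a JSON object")
    return props


def load_json_cluster_props(path):
    """Read a property file from disk and validate it as JSON."""
    text = Path(path).read_text(encoding="utf-8")
    return parse_json_cluster_props(text, source=str(path))


if __name__ == "__main__":
    # Usage: python check_props.py $CLUSTER_PROPS
    props = load_json_cluster_props(sys.argv[1])
    print("Top-level keys:", sorted(props))
```

This only confirms that the file is well-formed JSON; it does not check that the fields the qualification tool expects are present.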