Users can pass CLUSTER_PROPS, the path to a cluster property file (in JSON or YAML format), to the command. This is useful if the cluster is inaccessible or has been permanently deleted.
spark_rapids qualification --cluster $CLUSTER_PROPS [flags]
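Because the tool reads the cluster shape from this file, it can help to check that the file exists and is non-empty before launching a run. The following is a minimal sketch, not part of the tool: run_qualification is a hypothetical helper name, and every argument after the file path is forwarded unchanged as the command's flags.

```shell
# Hypothetical helper (not part of spark_rapids): fail early when the
# property file is missing or empty, then forward everything to the tool.
run_qualification() {
  local props="$1"
  shift
  if [ ! -s "$props" ]; then
    echo "cluster property file not found or empty: $props" >&2
    return 1
  fi
  # Remaining arguments are passed through unchanged as the tool's flags.
  spark_rapids qualification --cluster "$props" "$@"
}
```

With this in place, run_qualification "$CLUSTER_PROPS" followed by the usual flags behaves like the command above, except that it stops with an error message when the file is missing or empty.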
For an on-prem platform, the user defines the cluster configuration. The following is a sample cluster property file CLUSTER_PROPS in YAML format.
config:
  masterConfig:
    numCores: 2
    memory: 7680MiB
  workerConfig:
    numCores: 8
    memory: 7680MiB
    numWorkers: 2
target_platform is required for on-prem clusters. Currently, only Dataproc is supported.
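The sample on-prem property file can be created directly from the shell. The sketch below writes the same values with a heredoc, assuming the nesting shown (numWorkers under workerConfig); the default file name onprem-cluster-props.yaml is a hypothetical choice, and any writable path works.

```shell
# Hypothetical default location; override by exporting CLUSTER_PROPS first.
CLUSTER_PROPS="${CLUSTER_PROPS:-./onprem-cluster-props.yaml}"

# Write the sample on-prem configuration. The quoted 'EOF' delimiter keeps
# the shell from expanding anything inside the YAML body.
cat > "$CLUSTER_PROPS" <<'EOF'
config:
  masterConfig:
    numCores: 2
    memory: 7680MiB
  workerConfig:
    numCores: 8
    memory: 7680MiB
    numWorkers: 2
EOF
```

Indentation is significant in YAML, so the two spaces per level need to be preserved when adapting the values to a real cluster.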
Given a Dataproc CLUSTER_NAME, the user can generate its cluster property file CLUSTER_PROPS using the following command (refer to the gcloud CLI docs).
gcloud dataproc clusters describe $CLUSTER_NAME > $CLUSTER_PROPS
Given an EMR CLUSTER_ID, the user can generate its cluster property file CLUSTER_PROPS using the following command (refer to the AWS CLI docs).
aws emr describe-cluster --cluster-id $CLUSTER_ID > $CLUSTER_PROPS
Given a Databricks CLUSTER_ID, the user can generate its cluster property file CLUSTER_PROPS using the following command (refer to the Databricks CLI docs).
databricks clusters get $CLUSTER_ID > $CLUSTER_PROPS
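One caveat with the redirections above: the shell creates or truncates CLUSTER_PROPS before the CLI runs, so a failed call (expired credentials, wrong cluster ID) can leave an empty file behind or overwrite a good one. The following sketch writes to a temporary file first and replaces the target only on success. export_cluster_props is a hypothetical helper name, not part of any of the CLIs mentioned.

```shell
# Hypothetical helper: run an export command and only replace the target
# file when the command succeeds and produces non-empty output.
export_cluster_props() {
  local out="$1"
  shift
  local tmp
  tmp="$(mktemp)" || return 1
  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    echo "export failed; $out was not modified" >&2
    return 1
  fi
}
```

For example, export_cluster_props "$CLUSTER_PROPS" gcloud dataproc clusters describe "$CLUSTER_NAME" wraps the Dataproc command, and the EMR and Databricks commands can be wrapped the same way.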