User Guide (24.08.01)
RAPIDS Accelerator for Apache Spark - User Guide (24.08.01)

Users can pass CLUSTER_PROPS, the path to a cluster property file (in JSON or YAML format), to the command. This is useful when the cluster is no longer accessible or has been permanently deleted.

spark_rapids qualification --cluster $CLUSTER_PROPS [flags]

For an on-prem platform, the user defines the cluster configuration directly. The following is a sample cluster property file CLUSTER_PROPS in YAML format.

config:
  masterConfig:
    numCores: 2
    memory: 7680MiB
  workerConfig:
    numCores: 8
    memory: 7680MiB
    numWorkers: 2
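As a minimal sketch, the sample above can be written to disk from a shell session and then passed to the tool. The file name and location are arbitrary choices for illustration, not something the tool requires.

```shell
# Hypothetical location for the property file; any writable path works.
CLUSTER_PROPS="${TMPDIR:-/tmp}/cluster_props.yaml"

# Write the sample on-prem cluster configuration shown above.
cat > "$CLUSTER_PROPS" <<'EOF'
config:
  masterConfig:
    numCores: 2
    memory: 7680MiB
  workerConfig:
    numCores: 8
    memory: 7680MiB
    numWorkers: 2
EOF

# The file can then be passed to the tool as shown earlier:
# spark_rapids qualification --cluster "$CLUSTER_PROPS" [flags]
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the YAML body.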

target_platform is required for on-prem clusters. Currently, Dataproc is the only supported target platform.

Given a Dataproc CLUSTER_NAME, users can generate the cluster property file CLUSTER_PROPS with the following command (refer to the gcloud CLI docs).

gcloud dataproc clusters describe $CLUSTER_NAME > $CLUSTER_PROPS

Given an EMR CLUSTER_ID, users can generate the cluster property file CLUSTER_PROPS with the following command (refer to the AWS CLI docs).

aws emr describe-cluster --cluster-id $CLUSTER_ID > $CLUSTER_PROPS

Given a Databricks CLUSTER_ID, users can generate the cluster property file CLUSTER_PROPS with the following command (refer to the Databricks CLI docs).

databricks clusters get $CLUSTER_ID > $CLUSTER_PROPS
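
Because each of these commands redirects its output into CLUSTER_PROPS, a failed describe call can leave an empty file behind. The following is a minimal sketch of a guard that checks the file before it is handed to the tool. The function name check_cluster_props is a hypothetical helper for illustration and is not part of spark_rapids or of any of the cloud CLIs.

```shell
# Hypothetical helper: fail early if the property file is missing or empty,
# for example because the describe command above did not succeed.
check_cluster_props() {
  if [ ! -s "$1" ]; then
    echo "cluster property file '$1' is missing or empty" >&2
    return 1
  fi
}

# Usage sketch:
# check_cluster_props "$CLUSTER_PROPS" && \
#   spark_rapids qualification --cluster "$CLUSTER_PROPS" [flags]
```

The check only confirms that the file exists and has content. It does not validate that the contents are a well-formed cluster description.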

© Copyright 2024, NVIDIA. Last updated on Aug 29, 2024.