Currently, the Auto-Tuner calculates a set of configurations that impact the performance of Apache Spark apps executing on GPU. Those calculations can leverage cluster information (for example, memory, cores, Spark default configurations) as well as information processed in the application event logs. The tool also will recommend settings for the application assuming that the job will be able to use all the cluster resources (CPU and GPU) when it’s running. The values loaded from the app logs have higher precedence than the default configs.
Note
Auto-Tuner limitations:
It’s assumed that all the worker nodes on the cluster are homogenous.
To run the Auto-Tuner, enable the auto-tuner flag. Optionally, provide target cluster information using --target-cluster-info <FILE_PATH> to specify the GPU worker node configuration for generating optimized recommendations. The file path can be local or remote (for example, HDFS).
If the --target-cluster-info argument isn’t supplied, the Auto-Tuner will use platform-specific default worker instance types for tuning recommendations. See AutoTuner Configuration for details on default instance types, supported platforms, and how to customize AutoTuner behavior.