The Auto-Tuner output has two main sections:
Spark Properties: A list of Apache Spark configurations to tune the performance of the app. The list is the result of a diff between the existing app configurations and the recommended ones. Therefore, if a recommendation matches the existing app configuration, it will not show up in the list.
Comments: A list of messages that highlight properties missing from the app configurations, or that report why the recommendations could not be generated.
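The diff behavior described above can be illustrated with a minimal sketch. This is not the Auto-Tuner's actual implementation; the function name `diff_recommendations` and the sample values are hypothetical and only show why a recommendation that matches the existing configuration is left out of the list.

```python
def diff_recommendations(existing, recommended):
    """Keep only recommended properties that are missing from, or differ from,
    the existing app configuration (hypothetical helper, for illustration)."""
    return {
        key: value
        for key, value in recommended.items()
        if existing.get(key) != value
    }


# Hypothetical existing and recommended configurations.
existing = {
    "spark.executor.cores": "16",
    "spark.executor.instances": "4",
}
recommended = {
    "spark.executor.cores": "16",           # matches existing, so it is dropped
    "spark.executor.instances": "8",        # differs from existing, so it is kept
    "spark.sql.shuffle.partitions": "200",  # not set in the app, so it is kept
}

changes = diff_recommendations(existing, recommended)
for key in sorted(changes):
    print(f"--conf {key}={changes[key]}")
```

In this sketch, `spark.executor.cores` does not appear in the result because the recommended value equals the existing one.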
Examples
Spark Properties:

--conf spark.executor.cores=16
--conf spark.executor.instances=8
--conf spark.executor.memory=32768m
--conf spark.executor.memoryOverhead=7372m
--conf spark.rapids.memory.pinnedPool.size=4096m
--conf spark.rapids.sql.concurrentGpuTasks=2
--conf spark.sql.files.maxPartitionBytes=512m
--conf spark.sql.shuffle.partitions=200
--conf spark.task.resource.gpu.amount=0.0625

Comments:

- 'spark.executor.instances' was not set.
- 'spark.executor.cores' was not set.
- 'spark.task.resource.gpu.amount' was not set.
- 'spark.rapids.sql.concurrentGpuTasks' was not set.
- 'spark.executor.memory' was not set.
- 'spark.rapids.memory.pinnedPool.size' was not set.
- 'spark.executor.memoryOverhead' was not set.
- 'spark.sql.files.maxPartitionBytes' was not set.
- 'spark.sql.shuffle.partitions' was not set.
- 'spark.sql.adaptive.enabled' should be enabled for better
  performance.

Spark Properties:

--conf spark.executor.instances=8
--conf spark.sql.shuffle.partitions=200

Comments:

- 'spark.sql.shuffle.partitions' was not set.
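If the output needs to be consumed by a script, its two sections can be separated with a small parser. The following is a rough sketch that assumes the layout shown in the examples here (a `Spark Properties:` header followed by `--conf key=value` lines, and a `Comments:` header followed by `- ` items); `parse_autotuner_output` is a hypothetical helper, not part of the tool.

```python
def parse_autotuner_output(text):
    """Split Auto-Tuner output into a properties dict and a list of comments.

    Assumes the section headers and line prefixes seen in the examples.
    """
    properties, comments = {}, []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "Spark Properties:":
            section = "properties"
        elif line == "Comments:":
            section = "comments"
        elif section == "properties" and line.startswith("--conf "):
            key, _, value = line[len("--conf "):].partition("=")
            properties[key] = value
        elif section == "comments" and line.startswith("- "):
            comments.append(line[2:])
        elif section == "comments" and comments:
            # A wrapped line continues the previous comment.
            comments[-1] += " " + line
    return properties, comments


sample = """\
Spark Properties:

--conf spark.executor.instances=8
--conf spark.sql.shuffle.partitions=200

Comments:

- 'spark.sql.shuffle.partitions' was not set.
"""

properties, comments = parse_autotuner_output(sample)
print(properties)
print(comments)
```

When the output starts with `Cannot recommend properties. See Comments.`, this sketch returns an empty properties dict and only the comments.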

Cannot recommend properties. See Comments.

Comments:
- java.io.FileNotFoundException: File worker-info.yaml does not exist
- 'spark.executor.memory' should be set to at least 2GB/core.
- 'spark.executor.instances' should be set to (gpuCount * numWorkers).
- 'spark.task.resource.gpu.amount' should be set to Max(1, (numCores / gpuCount)).
- 'spark.rapids.sql.concurrentGpuTasks' should be set to Min(4, (gpuMemory / 7.5G)).
- 'spark.rapids.memory.pinnedPool.size' should be set to 2048m.
- 'spark.rapids.sql.enabled' should be true to enable SQL operations on the GPU.
- 'spark.sql.adaptive.enabled' should be enabled for better performance.