AutoTuner Configuration#

The RAPIDS Accelerator tools include an AutoTuner module that automatically generates optimized Spark configuration recommendations for GPU clusters. The AutoTuner can be customized through two types of YAML configuration files to better match your specific cluster and workload requirements.

These configuration options are available for both the Qualification and Profiling tools, and can be used with either the Tools JAR or the Tools CLI.

Target Cluster Information#

The --target-cluster-info argument provides a platform-aware way to specify cluster configuration. It accepts simplified cluster information such as instance types, which the tool uses to automatically determine system specifications.

Usage#

  • When using Tools JAR: --target-cluster-info /path/to/targetCluster.yaml

  • When using Tools CLI: --target_cluster_info /path/to/targetCluster.yaml

Examples#

Additional sample target cluster configuration files are available in the targetClusterInfo samples directory of the GitHub repository.

Example 1: CSP configuration with instance type#
# Simple CSP configuration using instance type.
# The tool will automatically determine system specifications based
# on the instance type.
workerInfo:
  instanceType: g2-standard-24

Dataproc n1-standard instances support 1, 2, or 4 GPUs, so the GPU count must be specified explicitly in the configuration:

Example 2: Dataproc n1-standard with explicit GPU count#
workerInfo:
  instanceType: n1-standard-16
  gpu:
    count: 1

Example 3: OnPrem configuration with custom Spark properties#
# OnPrem configuration with explicit resource specifications
# and Spark property controls.
workerInfo:
  cpuCores: 8
  memoryGB: 40
  gpu:
    count: 1
    name: l4
sparkProperties:
  # Enforced properties override AutoTuner recommendations
  enforced:
    spark.rapids.sql.concurrentGpuTasks: 2
    spark.executor.cores: 8
  # Properties preserved from the source application
  preserve:
    - spark.sql.shuffle.partitions
    - spark.sql.files.maxPartitionBytes
  # Properties excluded from AutoTuner recommendations
  exclude:
    - spark.rapids.shuffle.multiThreaded.reader.threads
    - spark.rapids.shuffle.multiThreaded.writer.threads
    - spark.rapids.sql.multiThreadedRead.numThreads

Note

The sparkProperties section is optional but allows fine-grained control over which properties are enforced, preserved from the source cluster, or excluded from tuning recommendations.
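For instance, a minimal file might enforce a single property while letting the AutoTuner recommend everything else. The sketch below assumes that the sparkProperties section can also be paired with a CSP-style workerInfo (as in Example 1) and that each subsection (enforced, preserve, exclude) can be supplied on its own; the property value shown is illustrative only.

# Sketch: CSP instance type combined with an enforced-only
# sparkProperties section. Assumes the subsections are individually
# optional; the partition count is an illustrative value.
workerInfo:
  instanceType: g2-standard-24
sparkProperties:
  enforced:
    spark.sql.shuffle.partitions: 400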

Instance Types by Platform#

The following table shows the default instance types used when --target-cluster-info is not provided, as well as the supported instance types that can be specified in the target cluster configuration file:

Default and Supported Instance Types#

| Platform | Default Instance Type | Supported Instance Types |
|---|---|---|
| EMR | g6.4xlarge | G6 series: g6.xlarge, g6.2xlarge, g6.4xlarge, g6.8xlarge, g6.12xlarge, g6.16xlarge |
| Databricks AWS | g5.8xlarge | G5 series: g5.xlarge, g5.2xlarge, g5.4xlarge, g5.8xlarge, g5.12xlarge, g5.16xlarge |
| Databricks Azure | Standard_NC8as_T4_v3 | Standard_NC*as_T4_v3 series: Standard_NC4as_T4_v3, Standard_NC8as_T4_v3, Standard_NC16as_T4_v3, Standard_NC64as_T4_v3 |
| Dataproc | g2-standard-16 | g2-standard series and n1-standard series with GPU attachments. For n1-standard instances, GPU count must be specified explicitly (see Example 2) |
| Dataproc-GKE | g2-standard-16 | g2-standard series and n1-standard series with GPU attachments. For n1-standard instances, GPU count must be specified explicitly (see Example 2) |
| Dataproc-Serverless | g2-standard-16 | g2-standard series |
| OnPrem | 16 cores with L4 GPU | Any configuration using cpuCores, memoryGB, and gpu properties (see Example 3) |

Note

Support for additional instance types is planned for future releases. In the meantime, for unsupported instance types on CSP platforms, you can use the OnPrem configuration format by specifying cpuCores, memoryGB, and gpu properties directly (see Example 3).
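For example, a worker running on an instance type the tool does not yet recognize could be described directly with explicit resources, following the OnPrem format from Example 3. The CPU, memory, and GPU count values below are hypothetical placeholders, not recommendations:

# Hypothetical worker specification for an unsupported instance type:
# resources are declared directly instead of via instanceType.
# The values below are placeholders; substitute your actual hardware.
workerInfo:
  cpuCores: 32
  memoryGB: 128
  gpu:
    count: 2
    name: l4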

Custom Tuning Configurations#

The --tuning-configs argument allows you to override default AutoTuner tuning parameters. The AutoTuner uses a set of predefined constants for calculations such as memory allocation, GPU task concurrency, and partition sizing. You can customize these values to match your specific workload requirements.

Usage#

  • When using Tools JAR: --tuning-configs /path/to/custom.yaml

  • When using Tools CLI: --tuning_configs /path/to/custom.yaml

Example#

The default tuning configuration parameters and their descriptions are available in the tuningConfigs.yaml file. For more examples, see the customTuningConfigs.yaml file in the GitHub repository.

Example custom tuning configuration file#
# Custom tuning configurations override default AutoTuner parameters.
# Only specify parameters that need to be changed from defaults.
# Description and usedBy fields are optional.
default:
  - name: CONC_GPU_TASKS
    max: 1
  - name: HEAP_PER_CORE
    default: 0.8g

Note

When providing custom tuning configurations, you only need to specify the parameters you want to override. All other parameters will use their default values.
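For instance, a file that only caps GPU task concurrency could be as small as the sketch below, which reuses the CONC_GPU_TASKS parameter from the example above; every other tuning parameter keeps its default:

# Minimal override sketch: cap concurrent GPU tasks only.
# All other tuning parameters retain their default values.
default:
  - name: CONC_GPU_TASKS
    max: 1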