User Guide (24.02)

The Profiling tool analyzes both CPU- and GPU-generated event logs and generates information that can be used for debugging and profiling Apache Spark applications.

In addition, the Profiling tool can process a GPU driver log to list any unsupported operators.

The output information contains the Spark version, executor details, properties, etc. The tool also recommends settings for the application, assuming that the job can use all of the cluster resources (CPU and GPU) while it is running. The Profiling tool optionally provides optimized RAPIDS configurations based on the worker’s information (see Auto-Tuner Support).

The Auto-Tuner aims to optimize Apache Spark applications by recommending a set of configurations that tune the performance of the RAPIDS Accelerator.

Currently, the Auto-Tuner calculates a set of configurations that impact the performance of Apache Spark applications executing on GPU. Those calculations can leverage cluster information (e.g. memory, cores, Spark default configurations) as well as information processed from the application event logs. The recommended settings assume that the job can use all of the cluster resources (CPU and GPU) while it is running.


Auto-Tuner limitations:

RAPIDS Accelerator for Apache Spark CLI tool

The CLI tool is the simplest way to run the Profiling tool. To run the Profiling tool standalone on Spark event logs, use the user tool command provided by a pip package. It supports CSP environments (Google Dataproc, AWS EMR, Databricks-AWS, and Databricks-Azure) as well as on-prem.

The tool outputs the profile information per application along with a recommended GPU cluster shape and RAPIDS configurations. Additionally, it provides a set of tuning recommendations specifically tailored for Spark applications running on GPU clusters, as part of the default output from the Auto-Tuner feature. For more information on running the Profiling tool from the pip package, visit the quick start guide.

Java API

The Java API can be used in environments that are not supported by the CLI tool.

The Java API can run the Profiling tool in three different modes:

Collection Mode

It simply collects information on each application individually and outputs a file per application. This is the default mode when no other options are specified. The tool generates a summary text file named profile.log along with a profile summary for each application under “rapids_4_spark_profile/{APP_ID}”.

Note that this is the only mode that supports the “auto-tuner” option, which is described in more detail in the Tuning Spark Properties For GPU Clusters section.

Combined Mode

Combined mode is similar to collection mode, but it combines all the applications together so that you get one file for all applications. The output goes into a summary text file named rapids_4_spark_tools_combined.log located inside the subdirectory rapids_4_spark_profile/combined/.

Compare Mode

Compare mode combines the information from all the applications into the same tables in a single file and also adds tables that compare stages and SQL IDs across all of those applications. This mode uses more memory when comparing many applications. The tool creates a summary text file named rapids_4_spark_tools_compare.log located inside the subdirectory rapids_4_spark_profile/compare/.
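
As a quick reference for where each mode writes its summary, the following Python sketch maps a mode to the expected summary file path. The directory and file names come from the mode descriptions above. The function name, the mode strings, and the example output directory are hypothetical and not part of the tool. The sketch also assumes that profile.log sits inside the per-application directory.

```python
from pathlib import PurePosixPath

# Top-level output directory name, as given in the mode descriptions.
OUTPUT_ROOT = "rapids_4_spark_profile"


def expected_summary_path(output_dir, mode, app_id=None):
    """Return the expected location of the summary text file for a mode.

    Hypothetical helper for illustration only; it is not part of the tool.
    """
    root = PurePosixPath(output_dir) / OUTPUT_ROOT
    if mode == "collection":
        if app_id is None:
            raise ValueError("collection mode writes one summary per application; app_id is required")
        # Assumption: profile.log is placed inside the {APP_ID} directory.
        return root / app_id / "profile.log"
    if mode == "combined":
        return root / "combined" / "rapids_4_spark_tools_combined.log"
    if mode == "compare":
        return root / "compare" / "rapids_4_spark_tools_compare.log"
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # "/tmp/out" and "app-0001" are placeholder values.
    for mode, app in [("collection", "app-0001"), ("combined", None), ("compare", None)]:
        print(mode, "->", expected_summary_path("/tmp/out", mode, app))
```

A script that post-processes the tool output could use a helper like this to check that the expected summary file exists before parsing it.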

The Profiling tool generates some SQL metrics that can be found in the following docs:

© Copyright 2023-2024, NVIDIA. Last updated on Mar 12, 2024.