Self Profiler#

The Spark RAPIDS plugin can automatically collect Nsight profiles through configuration settings. The profiling toolset is packaged with the plugin, so no external dependencies are required. Profiles are written to the configured path, typically on a distributed file system, while the job is running, and can be collected after the job completes. If only some stages are profiled, writing finishes when the last configured stage finishes. The result can then be converted and viewed with the Nsight Systems viewer.

Caveat: Enabling the self profiler incurs a performance overhead of roughly 2% to 3%.

Configuration#

To enable this feature, set the following Spark configurations:

spark.driver.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true
spark.executor.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true
# Path prefix for profile output; distributed file system mount points are supported
spark.rapids.profile.pathPrefix=your-path
# Compression codec used when writing profile data: zstd or none
spark.rapids.profile.compression=zstd
# Comma-separated list of stage IDs and hyphenated ranges of stage IDs
spark.rapids.profile.stages=8
# Maximum number of tasks to profile per stage; a value <= 0 profiles all tasks
spark.rapids.profile.taskLimitPerStage=10
# Comma-separated list of executor IDs and hyphenated ranges of executor IDs
spark.rapids.profile.executors=0,1,2

In the above example, profiling is enabled for stage 8 on executors 0, 1, and 2, at most 10 tasks are profiled per stage, and the profile data is compressed with zstd.
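As a minimal sketch, the settings above could be passed to `spark-submit` as `--conf` flags. The function name `submit_with_profiling` and the `SPARK_SUBMIT` override variable are hypothetical and exist only for illustration; the configuration keys and values are the ones listed above. The sketch assumes the RAPIDS plugin itself is already enabled through the arguments you pass along.

```shell
# Hypothetical wrapper: forwards the profiling settings listed above to
# spark-submit. Set SPARK_SUBMIT to point at a different launcher.
submit_with_profiling() {
    # $1 = profile path prefix; all remaining arguments are passed through
    # unchanged (plugin jars, other --conf flags, the application itself).
    prefix=$1
    shift
    "${SPARK_SUBMIT:-spark-submit}" \
        --conf "spark.driver.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true" \
        --conf "spark.executor.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true" \
        --conf "spark.rapids.profile.pathPrefix=$prefix" \
        --conf "spark.rapids.profile.compression=zstd" \
        --conf "spark.rapids.profile.stages=8" \
        --conf "spark.rapids.profile.taskLimitPerStage=10" \
        --conf "spark.rapids.profile.executors=0,1,2" \
        "$@"
}
```

A call might look like `submit_with_profiling /mnt/shared/profiles my-app.jar`, where both the path and the jar name are placeholders. If the job already sets `extraJavaOptions`, merge the `-D` flag into the existing value instead of passing the same key twice.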

Profile Conversion#

The profiler output cannot be opened in Nsight Systems directly. It must be converted in a separate step after the job has run.

Building the Converter Tool#

The spark_rapids_profile_converter tool is part of the spark-rapids-jni repository and can be built from source.

To clone and build the repository:

git clone https://github.com/NVIDIA/spark-rapids-jni.git
cd spark-rapids-jni
git submodule update --init
build/build-in-docker -DGPU_ARCHS=89-real package -DCPP_PARALLEL_LEVEL=60 -DBUILD_TESTS=OFF -DskipTests

After the build completes, the converter tool will be located at:

target/cmake-build/profiler/spark_rapids_profile_converter

Converting to NVTXT Format#

To convert the profile data to NVTXT format, use the following command:

~/src/spark-rapids-jni/target/cmake-build/profiler/spark_rapids_profile_converter -t -o /tmp/my-profile.nvtxt /tmp/my-profile.bin

If the profile was written with zstd compression, decompress it before running the converter:

zstd -dv -o /tmp/my-profile.bin /tmp/my-profile.bin.zstd

Converting to Nsight Systems Format#

To convert the NVTXT file into an nsys-rep file that the Nsight Systems viewer can display, run the ImportNvtxt tool:

/opt/nvidia/nsight-systems/2023.4.4/host-linux-x64/ImportNvtxt --cmd create -n /tmp/my-profile.nvtxt -o /tmp/my-profile.nsys-rep

The resulting nsys-rep file can now be loaded and viewed in Nsight Systems.
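The conversion steps above can be sketched as a single helper that prints the commands in the order they need to run: decompress (only when needed), convert to NVTXT, then import into nsys-rep. The function name `plan_profile_conversion` and the `CONVERTER` and `IMPORT_NVTXT` variables are hypothetical. The defaults are the example paths from this page and will likely differ on your system. The sketch assumes that compressed profiles end in `.zstd`, that raw profiles end in `.bin`, and that paths contain no spaces.

```shell
# Hypothetical helper: prints, in order, the commands that turn a raw profile
# into an nsys-rep file. It only prints them; nothing is executed.
plan_profile_conversion() {
    input=$1
    converter=${CONVERTER:-$HOME/src/spark-rapids-jni/target/cmake-build/profiler/spark_rapids_profile_converter}
    importer=${IMPORT_NVTXT:-/opt/nvidia/nsight-systems/2023.4.4/host-linux-x64/ImportNvtxt}

    # Step 1: decompress first if the profile was written with zstd.
    case $input in
        *.zstd)
            raw=${input%.zstd}
            printf 'zstd -dv -o %s %s\n' "$raw" "$input"
            ;;
        *)
            raw=$input
            ;;
    esac

    # Step 2: raw profile -> NVTXT. Step 3: NVTXT -> nsys-rep.
    base=${raw%.bin}
    printf '%s -t -o %s %s\n' "$converter" "$base.nvtxt" "$raw"
    printf '%s --cmd create -n %s -o %s\n' "$importer" "$base.nvtxt" "$base.nsys-rep"
}
```

Running `plan_profile_conversion /tmp/my-profile.bin.zstd` prints three commands for review. Once they look right, the same output can be piped to `sh` to execute them.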