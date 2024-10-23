Spark event log(s) from Spark 2.0 or later. Both rolled and compressed event logs are supported, with .lz4, .lzf, .snappy and .zstd suffixes, as well as Databricks-specific rolled and compressed (.gz) event logs.
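
As a rough illustration of the suffix list above, the following Python sketch checks whether a file name carries one of the listed compression suffixes. It is not part of the tool; the function name `compression_suffix` and the exact-match, case-sensitive comparison are assumptions made for this example only.

```python
from pathlib import PurePosixPath
from typing import Optional

# Suffixes taken from the text above; ".gz" applies to Databricks-specific logs.
COMPRESSED_SUFFIXES = {".lz4", ".lzf", ".snappy", ".zstd", ".gz"}


def compression_suffix(path: str) -> Optional[str]:
    """Return the listed compression suffix of `path`, or None if it has none.

    Hypothetical helper: the tool's own detection logic may differ.
    """
    suffix = PurePosixPath(path).suffix  # final ".ext" of the last path component
    return suffix if suffix in COMPRESSED_SUFFIXES else None
```

A file with no listed suffix returns None, which in this sketch simply means "not recognised as compressed", not "unsupported".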

The tool requires the Spark 3.x+ jars to run, but it doesn’t need an Apache Spark runtime. If you don’t already have Spark 3.x+ installed, you can download the Apache Spark distribution to any machine and include its jars in the classpath.
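
One way to assemble such a classpath is sketched below in Python. It assumes the downloaded Spark distribution keeps its jars in a `jars` subdirectory and relies on the standard Java classpath wildcard (`*`). The function name `build_classpath` and the `tool_jar` argument are hypothetical placeholders; substitute the actual jar you are running.

```python
import os


def build_classpath(tool_jar: str, spark_home: str) -> str:
    """Join the tool jar and the Spark distribution jars into one classpath string.

    Assumption: the Spark jars live under <spark_home>/jars.
    """
    # "<dir>/*" is the Java classpath wildcard for every jar in that directory.
    spark_jars = os.path.join(spark_home, "jars", "*")
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    return os.pathsep.join([tool_jar, spark_jars])
```

The resulting string can be passed to `java -cp`; quote it in a shell so the `*` is not expanded before Java sees it.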

This tool parses the Spark CPU event log(s) and creates an output report. Acceptable inputs are one or more event log files, or directories containing Spark event logs, in the local filesystem, HDFS, S3, ABFS, GCS, or a mix of these. To point to the local filesystem, be sure to include the file: prefix in the path. If any input is a remote file or directory path, the connector dependencies for that filesystem need to be on the classpath.
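
The path rules above can be sketched as a small Python helper that adds the file: prefix to bare local paths and flags inputs that would need connector dependencies. This is an illustration only: the name `normalize_input` is hypothetical, and treating every scheme other than file as remote is an assumption of this sketch.

```python
import os
from typing import Tuple
from urllib.parse import urlparse


def normalize_input(path: str) -> Tuple[str, bool]:
    """Return (path_with_scheme, needs_connector) for one event log input.

    Assumption: any URI scheme other than "file" refers to a remote filesystem.
    Windows drive letters (e.g. "C:\\logs") are not handled here.
    """
    scheme = urlparse(path).scheme.lower()
    if not scheme:
        # Bare local path: make it absolute and add the file: prefix.
        return "file:" + os.path.abspath(path), False
    # Already has a scheme: only non-local schemes need connector jars.
    return path, scheme != "file"
```

Calling it on each input before launching the tool gives a quick check of whether any connector jars must be added to the classpath.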