RAPIDS Accelerator for Apache Spark - User Guide
Qualification Tool
Getting Started
- Overview
 - RAPIDS Accelerator with On-prem Cluster or Local Mode
- Spark Deployment Methods
 - Apache Spark Setup for GPU
 - Install Spark
 - Download the RAPIDS Accelerator jar
 - Install the GPU Discovery Script
 - Local Mode
 - Spark Standalone Cluster
 - Running on YARN
 - Running on Kubernetes
 - RAPIDS Accelerator Configuration and Tuning
 - Example Join Operation
 - Enabling RAPIDS Shuffle Manager
 - Advanced Configuration
 - Monitoring
 - Debugging
 - Out of GPU Memory
 
 - RAPIDS Accelerator on AWS EMR
- Leveraging RAPIDS Accelerator User Tools for Qualification and Bootstrap
 - Qualify CPU Workloads for GPU Acceleration
 - Configure and Launch AWS EMR with GPU Nodes
- Launch an EMR Cluster using AWS Console (GUI)
 - Launch an EMR Cluster using AWS CLI
 - Running the RAPIDS Accelerator User Tools Bootstrap for Optimal Cluster Spark Settings
 - Running an Example Join Operation Using Spark Shell
 - Submit Spark jobs to an EMR Cluster Accelerated by GPUs
 - Running GPU Accelerated Mortgage ETL Example using EMR Notebook
 
 
 - RAPIDS Accelerator on Databricks
 - RAPIDS Accelerator on GCP Dataproc
- Create a Dataproc Cluster Accelerated by GPUs
 - Run Python or Scala Spark Notebook on a Dataproc Cluster Accelerated by GPUs
 - Submit Spark jobs to a Dataproc Cluster Accelerated by GPUs
 - Diagnosing a GPU Cluster
 - Bootstrap GPU Cluster with Optimized Settings
 - Qualify CPU Workloads for GPU Acceleration
 - Tune Applications on GPU Cluster
 
 - RAPIDS Accelerator on Dataproc Serverless
 - RAPIDS Accelerator on Azure Synapse Analytics
 - RAPIDS and Kubernetes
 - RAPIDS and Alluxio
- Prerequisites
 - Alluxio setup
 - RAPIDS Configuration
 - Alluxio auto mount for AWS S3 buckets
 - Configure whether the disks used by Alluxio are fast
 - Alluxio Troubleshooting
 - Alluxio reliability
 - Alluxio limitations
 - Alluxio metrics
 
 - Spark Workload Qualification
 - RAPIDS Accelerator on Oracle Cloud Infrastructure
 - Spark3 GPU Configuration Guide on Yarn 3.2.1
 
Tuning
- Tuning Guide
 - Best Practices on the RAPIDS Accelerator for Apache Spark
- Workload Qualification
 - Performance Tuning
 - How to handle GPU OOM issues
- Reduce the number of concurrent tasks per GPU
 - Install CUDA 11.5 or above version
 - Identify which SQL, job and stage is involved in the error
 - Increase the number of tasks/partitions based on the type of the problematic stage
 - Reduce columnar batch size and file reader batch size
 - File an issue or ask a question on the GitHub repo
 
 
 
Profiling Tool
Additional Functionality
- RAPIDS Accelerator for Apache Spark ML Library Integration
 - RAPIDS Shuffle Manager
 - Apache Iceberg Support
 - Delta Lake Support
 - RAPIDS Accelerator File Cache
 
Appendixes
- Frequently Asked Questions
- What versions of Apache Spark does the RAPIDS Accelerator for Apache Spark support?
 - Which distributions are supported?
 - What CUDA versions are supported?
 - What hardware is supported?
 - How can I check if the RAPIDS Accelerator is installed and which version is running?
 - What parts of Apache Spark are accelerated?
 - Is the Spark Dataset API supported?
 - What is the road-map like?
 - How much faster will my query run?
 - What operators are best suited for the GPU?
 - Are there initialization costs?
 - How long does it take to translate a query to run on the GPU?
 - How can I tell what will run on the GPU and what will not run on it?
 - Why does the plan for the GPU query look different from the CPU query?
 - Why does explain() show that the GPU will be used even after setting spark.rapids.sql.enabled to false?
 - How are failures handled?
 - How does the Spark scheduler decide what to do on the GPU vs the CPU?
 - Is Dynamic Partition Pruning (DPP) Supported?
 - Is Adaptive Query Execution (AQE) Supported?
 - Why does my query show as not on the GPU when Adaptive Query Execution is enabled?
 - Does the RAPIDS Shuffle Manager support External Shuffle Service (ESS)?
 - Are cache and persist supported?
 - Can I cache data into GPU memory?
 - Is PySpark supported?
 - Are the R APIs for Spark supported?
 - Are the Java APIs for Spark supported?
 - Are the Scala APIs for Spark supported?
 - Is the GPU needed on the driver? Are there any benefits to having a GPU on the driver?
 - Are table layout formats supported?
 - How many tasks can I run per executor? How many should I run per executor?
 - How are spark.executor.cores, spark.task.resource.gpu.amount, and spark.rapids.sql.concurrentGpuTasks related?
 - Why are multiple GPUs per executor not supported?
 - Why are multiple executors per GPU not supported?
 - Is Multi-Instance GPU (MIG) supported?
 - How can I run custom expressions/UDFs on the GPU?
 - Why is the size of my output Parquet/ORC file different?
 - Why am I getting the error “Failed to open the timezone file” when reading files?
 - Why am I getting an error when trying to use pinned memory?
 - Why am I getting a buffer overflow error when using the KryoSerializer?
 - Why am I getting “Unable to acquire buffer” or “Trying to free an invalid buffer”?
 - Is speculative execution supported?
 - Why is my query in GPU mode slower than CPU mode?
 - Why is the Avro library not found by RAPIDS?
 - What is the default RMM pool allocator?
 - What is a RetryOOM or SplitAndRetryOOM exception?
 - Encryption Support
 - Can the RAPIDS Accelerator work with Spark on Ray (RayDP)?
 - I have more questions, where do I go?
 
 - Examples
 - Glossary
 - Contact Us