Date: 2024-10-23

The RAPIDS Accelerator for Apache Spark provides limited support for Apache Iceberg tables. This document details the Apache Iceberg features that are supported.

Reading Tables#

Metadata Queries#

Reads of Apache Iceberg metadata tables, such as the history, snapshots, and other metadata tables associated with a table, aren’t GPU-accelerated. The CPU continues to process these metadata-level queries.
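As an illustration of what counts as a metadata-level query: in Spark SQL, Iceberg exposes metadata tables by appending the metadata table name to the table identifier. The sketch below only builds such query strings. The helper name `metadata_query` and the table `local.db.events` are hypothetical, and the set of metadata table names is an assumed subset drawn from the Iceberg Spark documentation, not from the RAPIDS Accelerator.

```python
# Metadata tables named in this document, plus a few others that Iceberg
# documents for Spark (an assumed subset, not an exhaustive list).
METADATA_TABLES = {"history", "snapshots", "files", "manifests", "partitions"}


def metadata_query(table: str, metadata_table: str) -> str:
    """Build a Spark SQL query against an Iceberg metadata table."""
    if metadata_table not in METADATA_TABLES:
        raise ValueError(f"unknown metadata table: {metadata_table}")
    return f"SELECT * FROM {table}.{metadata_table}"


# Hypothetical table name. With a live session this would be run as
# spark.sql(query).show(), and per this document it stays on the CPU.
query = metadata_query("local.db.events", "snapshots")
print(query)
```

Queries of this shape run on the CPU even when the accelerator is enabled, so they aren't a useful target when looking for GPU speedups.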

Schema Evolution#

Columns that are added or removed at the top level of the table schema are supported. Columns that are added or removed within struct columns aren’t supported.
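To make the distinction concrete, the sketch below builds two Iceberg `ALTER TABLE ... ADD COLUMN` statements and classifies each by whether the column path is top-level or nested inside a struct. The helper names, the table `local.db.events`, and the column names are hypothetical. The check assumes that a dot in the path always denotes struct nesting, which doesn't hold for quoted column names that contain a literal dot.

```python
def is_top_level(column_path: str) -> bool:
    """Return True when the path names a top-level column.

    Assumption: a dot always separates a struct from its field.
    """
    return "." not in column_path


def add_column_sql(table: str, column_path: str, data_type: str) -> str:
    """Build an ALTER TABLE statement that adds a column."""
    return f"ALTER TABLE {table} ADD COLUMN {column_path} {data_type}"


# Hypothetical schema changes: one top-level, one inside a struct column.
changes = [("region", "string"), ("location.altitude", "double")]
for path, data_type in changes:
    status = "supported" if is_top_level(path) else "not supported"
    print(f"{add_column_sql('local.db.events', path, data_type)}  -- {status}")
```

Under the rule above, the `region` change is a supported form of schema evolution, while the `location.altitude` change modifies a struct column and falls outside it.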

Data Formats#

Apache Iceberg can store data in various formats. The sections below detail the level of support for each underlying data format.

Parquet#

Data stored in Parquet is supported with the same limitations that apply to loading data from raw Parquet files. Refer to the Input/Output documentation for details. The following compression codecs applied to the Parquet data are supported:

gzip (Apache Iceberg default)

snappy

uncompressed

zstd

ORC#

The RAPIDS Accelerator doesn’t support Apache Iceberg tables using the ORC data format.

Avro#

The RAPIDS Accelerator doesn’t support Apache Iceberg tables using the Avro data format.
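The format and codec rules above can be summarized as a small lookup. The following is a minimal sketch, not part of the RAPIDS Accelerator: the function `gpu_read_supported` is hypothetical and simply encodes the support matrix from this document. In Iceberg, the values it takes would typically come from the table properties `write.format.default` and `write.parquet.compression-codec`.

```python
# Parquet compression codecs listed as supported in this document.
SUPPORTED_PARQUET_CODECS = {"gzip", "snappy", "uncompressed", "zstd"}


def gpu_read_supported(file_format: str, codec: str = "gzip") -> bool:
    """Return True if this document lists the format/codec pair as supported.

    The default codec is gzip, which this document gives as the Apache
    Iceberg default. The codec is ignored for non-Parquet formats.
    """
    if file_format.lower() != "parquet":
        return False  # ORC and Avro tables aren't supported
    return codec.lower() in SUPPORTED_PARQUET_CODECS


# Hypothetical table configurations to check.
for fmt, codec in [("parquet", "zstd"), ("parquet", "brotli"), ("orc", "zstd")]:
    print(fmt, codec, gpu_read_supported(fmt, codec))
```

A check like this only reflects the data format and codec. The limitations that apply to raw Parquet reads still hold for tables that pass it.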