Commit 7b85d03

docs: Update confs to bypass Iceberg Spark issues (#2166)

* Update confs to bypass Iceberg Spark issues - document current limitation
* Update iceberg.md
* Users can disable Spark's AQE as well
* Let users turn off AQE or Comet's broadcastExchange

1 parent eb197ca commit 7b85d03

File tree

1 file changed: +12 −3 lines changed


docs/source/user-guide/iceberg.md

Lines changed: 12 additions & 3 deletions
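The "12 additions & 3 deletions" figure is simply a count of `+` and `-` lines across the hunks below. As a hedged aside, such a summary can be derived from unified-diff text like this (an illustrative sketch, not GitHub's actual implementation; `diffStats` is a made-up name):

```scala
// Count additions and deletions in unified-diff text.
// Lines starting with '+'/'-' are counted; the '+++'/'---'
// file headers are excluded.
def diffStats(diff: String): (Int, Int) = {
  val lines = diff.split("\n").toList
  val additions = lines.count(l => l.startsWith("+") && !l.startsWith("+++"))
  val deletions = lines.count(l => l.startsWith("-") && !l.startsWith("---"))
  (additions, deletions)
}
```

Applied to the two hunks below, which contain 12 `+` lines and 3 `-` lines, this would reproduce the summary shown above.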
````diff
@@ -80,11 +80,13 @@ $SPARK_HOME/bin/spark-shell \
     --conf spark.sql.catalog.spark_catalog.type=hadoop \
     --conf spark.sql.catalog.spark_catalog.warehouse=/tmp/warehouse \
     --conf spark.plugins=org.apache.spark.CometPlugin \
-    --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
+    --conf spark.comet.exec.shuffle.enabled=false \
     --conf spark.sql.iceberg.parquet.reader-type=COMET \
     --conf spark.comet.explainFallback.enabled=true \
     --conf spark.memory.offHeap.enabled=true \
-    --conf spark.memory.offHeap.size=2g
+    --conf spark.memory.offHeap.size=2g \
+    --conf spark.comet.use.lazyMaterialization=false \
+    --conf spark.comet.schemaEvolution.enabled=true
 ```

 Create an Iceberg table. Note that Comet will not accelerate this part.
````
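Once spark-shell is up with the flags above, the new settings can be sanity-checked from the prompt. A hedged sketch (assumes the session launched by the command in this diff; `spark` is the session spark-shell provides):

```scala
// Inside spark-shell: read back the confs passed on the command line above.
// The conf names are taken from the diff; the values mirror the --conf flags.
println(spark.conf.get("spark.comet.exec.shuffle.enabled"))    // "false", as passed above
println(spark.conf.get("spark.comet.use.lazyMaterialization")) // "false"
println(spark.conf.get("spark.comet.schemaEvolution.enabled")) // "true"
```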
````diff
@@ -138,4 +140,11 @@ scala> spark.sql(s"SELECT * from t1").explain()
 == Physical Plan ==
 *(1) CometColumnarToRow
 +- CometBatchScan spark_catalog.default.t1[c0#26, c1#27] spark_catalog.default.t1 (branch=null) [filters=, groupedBy=] RuntimeFilters: []
-```
+```
+
+## Known issues
+- We temporarily disable Comet when there are delete files in an Iceberg scan; see the Iceberg [1.8.1 diff](../../../dev/diffs/iceberg/1.8.1.diff) and this [PR](https://github.com/apache/iceberg/pull/13793)
+- Iceberg scans with delete files lead to [runtime exceptions](https://github.com/apache/datafusion-comet/issues/2117) and [incorrect results](https://github.com/apache/datafusion-comet/issues/2118)
+- Enabling `CometShuffleManager` leads to [runtime exceptions](https://github.com/apache/datafusion-comet/issues/2086)
+- Spark Runtime Filtering isn't [working](https://github.com/apache/datafusion-comet/issues/2116)
+  - You can bypass the issue by either setting `spark.sql.adaptive.enabled=false` or `spark.comet.exec.broadcastExchange.enabled=false`
````
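The two bypass settings in the last bullet can also be toggled from a running spark-shell session. A hedged sketch (assumes a session with the Comet plugin loaded; pick one option, not both, and note that some confs may only take effect when set at launch via `--conf`):

```scala
// Option 1: disable Spark's adaptive query execution entirely.
spark.conf.set("spark.sql.adaptive.enabled", "false")

// Option 2: keep AQE, but stop Comet from replacing broadcast exchanges.
spark.conf.set("spark.comet.exec.broadcastExchange.enabled", "false")
```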
