http_path="sql/protocolv1/..."
)
```
## Python models and ANSI mode
When ANSI mode is enabled (`spark.sql.ansi.enabled=true`), there are limitations when using pandas DataFrames in Python models:
1. **Regular pandas DataFrames**: dbt-databricks will automatically handle the conversion even when ANSI mode is enabled, falling back to `spark.createDataFrame()` if needed (see the sketch after this list).
2. **pandas-on-Spark DataFrames**: If you create pandas-on-Spark DataFrames directly in your model (using `pyspark.pandas` or `databricks.koalas`), you may encounter errors with ANSI mode enabled. In this case, you have two options:
- Disable ANSI mode for your session: set `spark.sql.ansi.enabled=false` in your cluster or SQL warehouse configuration (a per-model alternative is sketched at the end of this section)
- Set the pandas-on-Spark option in your model code:
```python
import pyspark.pandas as ps
# Allow pandas-on-Spark to run even with ANSI mode enabled (the default is to fail)
ps.set_option('compute.fail_on_ansi_mode', False)
```
Note: This may cause unexpected behavior, as pandas-on-Spark follows pandas semantics (returning null/NaN for invalid operations) rather than ANSI SQL semantics (raising errors).
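
For example, here is a minimal sketch of case 1: a Python model that returns a regular pandas DataFrame. The model body and column names are illustrative, not taken from the dbt-databricks docs:

```python
import pandas as pd

def model(dbt, session):
    dbt.config(materialized="table")

    # A plain pandas DataFrame: dbt-databricks converts it to a Spark
    # DataFrame on materialization, falling back to
    # spark.createDataFrame() when ANSI mode is enabled.
    return pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})
```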
For more information about ANSI mode and its implications, see the [Spark documentation on ANSI compliance](https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html).
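
If you prefer to scope the change to a single model rather than the whole cluster, one possible pattern (a sketch, not an officially documented approach) is to flip the runtime conf on the `session` passed to the model, since `spark.sql.ansi.enabled` is configurable at the session level:

```python
import pyspark.pandas as ps

def model(dbt, session):
    dbt.config(materialized="table")

    # Disable ANSI mode for this Spark session only; the cluster-wide
    # setting is left untouched.
    session.conf.set("spark.sql.ansi.enabled", "false")

    # pandas-on-Spark now behaves as it would without ANSI mode.
    return ps.DataFrame({"id": [1, 2, 3]})
```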