Commit fb46424

xinrong-meng authored and dongjoon-hyun committed
[SPARK-53626][DOCS] Add invalid mixed-type operations to ANSI migration guide
### What changes were proposed in this pull request?
Add invalid mixed-type operations to ANSI migration guide

### Why are the changes needed?
Smooth user migration since ANSI is on by default for Pandas API on Spark

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests; manual verification as:
<img width="571" height="735" alt="image" src="https://github.com/user-attachments/assets/9bb2a652-34b5-4a52-8ee6-f95c38e30c7a" />

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #52376 from xinrong-meng/ansi_guide_2.

Authored-by: Xinrong Meng <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent 2639792 commit fb46424

1 file changed: +76 -0 lines changed

python/docs/source/user_guide/ansi_migration_guide.ipynb

Lines changed: 76 additions & 0 deletions
@@ -147,6 +147,82 @@
  "```"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "id": "a9ceb6cb-3bc4-4c23-b74b-84e60fd64e11",
+ "metadata": {},
+ "source": [
+ "### Invalid Mixed-Type Operations\n",
+ "**ANSI off:** Spark implicitly coerces so these operations succeed.\n",
+ "\n",
+ "**ANSI on:** Behaves like pandas, such operations are disallowed and raise errors.\n",
+ "\n",
+ "Operation types that show behavior changes under ANSI mode:\n",
+ "\n",
+ "- **Decimal–Float Arithmetic**: `/`, `//`, `*`, `%` \n",
+ "- **Boolean vs. None**: `|`, `&`, `^`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2a8d5705-11ea-458c-8528-c7b1b7c88472",
+ "metadata": {},
+ "source": [
+ "Example: Decimal–Float Arithmetic\n",
+ "```python\n",
+ ">>> import decimal\n",
+ ">>> pser = pd.Series([decimal.Decimal(1), decimal.Decimal(2)])\n",
+ ">>> psser = ps.from_pandas(pser)\n",
+ "\n",
+ "# ANSI on\n",
+ ">>> spark.conf.set(\"spark.sql.ansi.enabled\", True)\n",
+ ">>> psser * 0.1\n",
+ "Traceback (most recent call last):\n",
+ "...\n",
+ "TypeError: Multiplication can not be applied to given types.\n",
+ "\n",
+ "# ANSI off\n",
+ ">>> spark.conf.set(\"spark.sql.ansi.enabled\", False)\n",
+ ">>> psser * 0.1\n",
+ "0 0.1\n",
+ "1 0.2\n",
+ "dtype: float64\n",
+ "\n",
+ "# Pandas\n",
+ ">>> pser * 0.1\n",
+ "...\n",
+ "TypeError: unsupported operand type(s) for *: 'decimal.Decimal' and 'float'\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0d2b8268-4b98-4239-95db-5269f9c658d2",
+ "metadata": {},
+ "source": [
+ "Example: Boolean vs. None\n",
+ "```python\n",
+ "# ANSI on\n",
+ ">>> spark.conf.set(\"spark.sql.ansi.enabled\", True)\n",
+ ">>> ps.Series([True, False]) | None\n",
+ "Traceback (most recent call last):\n",
+ "...\n",
+ "TypeError: OR can not be applied to given types.\n",
+ "\n",
+ "# ANSI off\n",
+ ">>> spark.conf.set(\"spark.sql.ansi.enabled\", False)\n",
+ ">>> ps.Series([True, False]) | None\n",
+ "0 False \n",
+ "1 False\n",
+ "dtype: bool\n",
+ "\n",
+ "# Pandas\n",
+ ">>> pd.Series([True, False]) | None\n",
+ "...\n",
+ "TypeError: unsupported operand type(s) for |: 'bool' and 'NoneType'\n",
+ "```"
+ ]
+ },
  {
  "cell_type": "markdown",
  "id": "fe146afd",
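
Reviewer note (not part of the commit): the pandas errors quoted in the added cells come from plain Python refusing to combine these operand types element by element. The sketch below reproduces that with the standard library only, with no Spark or pandas, and shows one possible way to make the intent explicit before the operation. The helper `try_op` and the choice of replacement values are illustrative assumptions, not something this commit prescribes. Applying the same idea in pandas API on Spark (for example, casting with `astype` before the arithmetic) is likewise an assumption to verify against your own workload.

```python
import operator
from decimal import Decimal


def try_op(op, left, right):
    """Return op(left, right), or the TypeError text if Python rejects the operand types.

    Hypothetical helper used only for this illustration.
    """
    try:
        return op(left, right)
    except TypeError as exc:
        return f"TypeError: {exc}"


decimals = [Decimal(1), Decimal(2)]

# Decimal * float: plain Python refuses to mix the two types.
mixed_mul = try_op(operator.mul, decimals[0], 0.1)

# Option 1: convert to float first (accepts binary floating-point rounding).
as_float = [float(d) * 0.1 for d in decimals]

# Option 2: keep Decimal on both sides (exact decimal arithmetic).
as_decimal = [d * Decimal("0.1") for d in decimals]

# bool | None: also rejected, because None is not a boolean.
bool_or_none = try_op(operator.or_, True, None)

# Replace the missing operand with an explicit boolean. Choosing False here is
# an illustrative assumption; it does not reproduce the ANSI-off output above.
explicit_or = [b | False for b in [True, False]]

print(mixed_mul)
print(as_float, as_decimal)
print(bool_or_none)
print(explicit_or)
```

Either conversion removes the ambiguity that ANSI mode rejects: the caller decides up front whether the result should be a float or a decimal, and what a missing boolean operand should mean.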
