|
147 | 147 | "```"
|
148 | 148 | ]
|
149 | 149 | },
|
| 150 | + { |
| 151 | + "cell_type": "markdown", |
| 152 | + "id": "a9ceb6cb-3bc4-4c23-b74b-84e60fd64e11", |
| 153 | + "metadata": {}, |
| 154 | + "source": [ |
| 155 | + "### Invalid Mixed-Type Operations\n", |
| 156 | + "**ANSI off:** Spark implicitly coerces so these operations succeed.\n", |
| 157 | + "\n", |
| 158 | + "**ANSI on:** Behaves like pandas, such operations are disallowed and raise errors.\n", |
| 159 | + "\n", |
| 160 | + "Operation types that show behavior changes under ANSI mode:\n", |
| 161 | + "\n", |
| 162 | + "- **Decimal–Float Arithmetic**: `/`, `//`, `*`, `%` \n", |
| 163 | + "- **Boolean vs. None**: `|`, `&`, `^`" |
| 164 | + ] |
| 165 | + }, |
| 166 | + { |
| 167 | + "cell_type": "markdown", |
| 168 | + "id": "2a8d5705-11ea-458c-8528-c7b1b7c88472", |
| 169 | + "metadata": {}, |
| 170 | + "source": [ |
| 171 | + "Example: Decimal–Float Arithmetic\n", |
| 172 | + "```python\n", |
| 173 | + ">>> import decimal\n", |
| 174 | + ">>> pser = pd.Series([decimal.Decimal(1), decimal.Decimal(2)])\n", |
| 175 | + ">>> psser = ps.from_pandas(pser)\n", |
| 176 | + "\n", |
| 177 | + "# ANSI on\n", |
| 178 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| 179 | + ">>> psser * 0.1\n", |
| 180 | + "Traceback (most recent call last):\n", |
| 181 | + "...\n", |
| 182 | + "TypeError: Multiplication can not be applied to given types.\n", |
| 183 | + "\n", |
| 184 | + "# ANSI off\n", |
| 185 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", False)\n", |
| 186 | + ">>> psser * 0.1\n", |
| 187 | + "0 0.1\n", |
| 188 | + "1 0.2\n", |
| 189 | + "dtype: float64\n", |
| 190 | + "\n", |
| 191 | + "# Pandas\n", |
| 192 | + ">>> pser * 0.1\n", |
| 193 | + "...\n", |
| 194 | + "TypeError: unsupported operand type(s) for *: 'decimal.Decimal' and 'float'\n", |
| 195 | + "```" |
| 196 | + ] |
| 197 | + }, |
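| | + { |
| | + "cell_type": "markdown", |
| | + "id": "decimal-cast-workaround-sketch", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "A minimal workaround sketch (not part of the original example): with ANSI on, the mixed Decimal–float multiplication can be avoided by casting the decimal Series to float first, so both operands share a type. It reuses `psser` from the cell above; the expected result should mirror the ANSI-off `float64` output shown there.\n", |
| | + "```python\n", |
| | + "# Keep ANSI on, but cast the decimal Series to float before multiplying,\n", |
| | + "# so the operation no longer mixes Decimal and float operands.\n", |
| | + "spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| | + "psser.astype(float) * 0.1  # expected: the same float64 result as the ANSI-off output above\n", |
| | + "```" |
| | + ] |
| | + }, |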
| 198 | + { |
| 199 | + "cell_type": "markdown", |
| 200 | + "id": "0d2b8268-4b98-4239-95db-5269f9c658d2", |
| 201 | + "metadata": {}, |
| 202 | + "source": [ |
| 203 | + "Example: Boolean vs. None\n", |
| 204 | + "```python\n", |
| 205 | + "# ANSI on\n", |
| 206 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| 207 | + ">>> ps.Series([True, False]) | None\n", |
| 208 | + "Traceback (most recent call last):\n", |
| 209 | + "...\n", |
| 210 | + "TypeError: OR can not be applied to given types.\n", |
| 211 | + "\n", |
| 212 | + "# ANSI off\n", |
| 213 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", False)\n", |
| 214 | + ">>> ps.Series([True, False]) | None\n", |
| 215 | + "0 False \n", |
| 216 | + "1 False\n", |
| 217 | + "dtype: bool\n", |
| 218 | + "\n", |
| 219 | + "# Pandas\n", |
| 220 | + ">>> pd.Series([True, False]) | None\n", |
| 221 | + "...\n", |
| 222 | + "TypeError: unsupported operand type(s) for |: 'bool' and 'NoneType'\n", |
| 223 | + "```" |
| 224 | + ] |
| 225 | + }, |
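| | + { |
| | + "cell_type": "markdown", |
| | + "id": "ansi-config-restore-sketch", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "A minimal housekeeping sketch (not part of the original examples): the cells above toggle the session-wide `spark.sql.ansi.enabled` config, so it is worth snapshotting the current value before experimenting and restoring it afterwards. It assumes an active SparkSession named `spark`, as in the cells above.\n", |
| | + "```python\n", |
| | + "# Snapshot the current ANSI setting so it can be restored afterwards.\n", |
| | + "original = spark.conf.get(\"spark.sql.ansi.enabled\")\n", |
| | + "try:\n", |
| | + "    spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| | + "    # ... re-run the mixed-type operations above with ANSI on ...\n", |
| | + "finally:\n", |
| | + "    # Restore the previous value so later cells behave as before.\n", |
| | + "    spark.conf.set(\"spark.sql.ansi.enabled\", original)\n", |
| | + "```" |
| | + ] |
| | + }, |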
150 | 226 | {
|
151 | 227 | "cell_type": "markdown",
|
152 | 228 | "id": "fe146afd",
|
|