|
147 | 147 | "```"
|
148 | 148 | ]
|
149 | 149 | },
|
| 150 | + { |
| 151 | + "cell_type": "markdown", |
| 152 | + "id": "a9ceb6cb-3bc4-4c23-b74b-84e60fd64e11", |
| 153 | + "metadata": {}, |
| 154 | + "source": [ |
| 155 | + "### Invalid Mixed-Type Operations\n", |
| 156 | + "**ANSI off:** Spark implicitly coerces so these operations succeed.\n", |
| 157 | + "\n", |
| 158 | + "**ANSI on:** Behaves like pandas, such operations are disallowed and raise errors.\n", |
| 159 | + "\n", |
| 160 | + "Operation types that show behavior changes under ANSI mode:\n", |
| 161 | + "\n", |
| 162 | + "- **Decimal–Float Arithmetic**: `/`, `//`, `*`, `%` \n", |
| 163 | + "- **Boolean vs. None**: `|`, `&`, `^`" |
| 164 | + ] |
| 165 | + }, |
| 166 | + { |
| 167 | + "cell_type": "markdown", |
| 168 | + "id": "2a8d5705-11ea-458c-8528-c7b1b7c88472", |
| 169 | + "metadata": {}, |
| 170 | + "source": [ |
| 171 | + "Example: Decimal–Float Arithmetic\n", |
| 172 | + "```python\n", |
| 173 | + ">>> import decimal\n", |
| 174 | + ">>> pser = pd.Series([decimal.Decimal(1), decimal.Decimal(2)])\n", |
| 175 | + ">>> psser = ps.from_pandas(pser)\n", |
| 176 | + "\n", |
| 177 | + "# ANSI on\n", |
| 178 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| 179 | + ">>> psser * 0.1\n", |
| 180 | + "Traceback (most recent call last):\n", |
| 181 | + "...\n", |
| 182 | + "TypeError: Multiplication can not be applied to given types.\n", |
| 183 | + "\n", |
| 184 | + "# ANSI off\n", |
| 185 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", False)\n", |
| 186 | + ">>> psser * 0.1\n", |
| 187 | + "0 0.1\n", |
| 188 | + "1 0.2\n", |
| 189 | + "dtype: float64\n", |
| 190 | + "\n", |
| 191 | + "# Pandas\n", |
| 192 | + ">>> pser * 0.1\n", |
| 193 | + "...\n", |
| 194 | + "TypeError: unsupported operand type(s) for *: 'decimal.Decimal' and 'float'\n", |
| 195 | + "```" |
| 196 | + ] |
| 197 | + }, |
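| | + { |
| | + "cell_type": "markdown", |
| | + "id": "decimal-cast-workaround-sketch", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "A minimal workaround sketch (not part of the original example): with ANSI on, the mixed Decimal–float multiplication can be avoided by casting the decimal Series to float first, so both operands share a type. It reuses `psser` from the cell above; the expected result should mirror the ANSI-off `float64` output shown there.\n", |
| | + "```python\n", |
| | + "# Keep ANSI on, but cast the decimal Series to float before multiplying,\n", |
| | + "# so the operation no longer mixes Decimal and float operands.\n", |
| | + "spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| | + "psser.astype(float) * 0.1  # expected: the same float64 result as the ANSI-off output above\n", |
| | + "```" |
| | + ] |
| | + }, |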
| 198 | + { |
| 199 | + "cell_type": "markdown", |
| 200 | + "id": "0d2b8268-4b98-4239-95db-5269f9c658d2", |
| 201 | + "metadata": {}, |
| 202 | + "source": [ |
| 203 | + "Example: Boolean vs. None\n", |
| 204 | + "```python\n", |
| 205 | + "# ANSI on\n", |
| 206 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| 207 | + ">>> ps.Series([True, False]) | None\n", |
| 208 | + "Traceback (most recent call last):\n", |
| 209 | + "...\n", |
| 210 | + "TypeError: OR can not be applied to given types.\n", |
| 211 | + "\n", |
| 212 | + "# ANSI off\n", |
| 213 | + ">>> spark.conf.set(\"spark.sql.ansi.enabled\", False)\n", |
| 214 | + ">>> ps.Series([True, False]) | None\n", |
| 215 | + "0 False \n", |
| 216 | + "1 False\n", |
| 217 | + "dtype: bool\n", |
| 218 | + "\n", |
| 219 | + "# Pandas\n", |
| 220 | + ">>> pd.Series([True, False]) | None\n", |
| 221 | + "...\n", |
| 222 | + "TypeError: unsupported operand type(s) for |: 'bool' and 'NoneType'\n", |
| 223 | + "```" |
| 224 | + ] |
| 225 | + }, |
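| | + { |
| | + "cell_type": "markdown", |
| | + "id": "ansi-config-restore-sketch", |
| | + "metadata": {}, |
| | + "source": [ |
| | + "A minimal housekeeping sketch (not part of the original examples): the cells above toggle the session-wide `spark.sql.ansi.enabled` config, so it is worth snapshotting the current value before experimenting and restoring it afterwards. It assumes an active SparkSession named `spark`, as in the cells above.\n", |
| | + "```python\n", |
| | + "# Snapshot the current ANSI setting so it can be restored afterwards.\n", |
| | + "original = spark.conf.get(\"spark.sql.ansi.enabled\")\n", |
| | + "try:\n", |
| | + "    spark.conf.set(\"spark.sql.ansi.enabled\", True)\n", |
| | + "    # ... re-run the mixed-type operations above with ANSI on ...\n", |
| | + "finally:\n", |
| | + "    # Restore the previous value so later cells behave as before.\n", |
| | + "    spark.conf.set(\"spark.sql.ansi.enabled\", original)\n", |
| | + "```" |
| | + ] |
| | + }, |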
150 | 226 | {
|
151 | 227 | "cell_type": "markdown",
|
152 | 228 | "id": "fe146afd",
|
|