Commit c31c297
SNOW-2044877: fix snowpandas in apply error msg (#3678)
Signed-off-by: Labanya Mukhopadhyay <labanya.mukhopadhyay@snowflake.com>
1 parent 9567971 commit c31c297

3 files changed: 27 additions, 1 deletion

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@
 - Introduce faster pandas: Improved performance by deferring row position computation.
   - The following operations are currently supported and can benefit from the optimization: `read_snowflake`, `repr`, `loc`, `reset_index`, `merge`, and binary operations.
   - If a lazy object (e.g., DataFrame or Series) depends on a mix of supported and unsupported operations, the optimization will not be used.
+- Updated the error message for when Snowpark pandas is referenced within apply.
 
 #### Dependency Updates

src/snowflake/snowpark/modin/plugin/compiler/snowflake_query_compiler.py

Lines changed: 14 additions & 1 deletion
@@ -8884,7 +8884,20 @@ def _apply_with_udtf_and_dynamic_pivot_along_axis_1(
         # materially slow down CI or individual groupby.apply() calls.
         # TODO(SNOW-1345395): Investigate why and to what extent the cache_result
         # is useful.
-        ordered_dataframe = cache_result(udtf_dataframe)
+        try:
+            ordered_dataframe = cache_result(udtf_dataframe)
+        except SnowparkSQLException as e:
+            if "No module named 'snowflake'" in str(
+                e
+            ) or "Modin is not installed" in str(e):
+                raise SnowparkSQLException(
+                    "modin.pandas cannot be referenced within a Snowpark pandas apply() function. "
+                    "You can only use native pandas inside apply(). Please check developer guide for details "
+                    "https://docs.snowflake.com/developer-guide/snowpark/python/pandas-on-snowflake#limitations."
+                )
+            else:
+                # retry the try-block logic
+                ordered_dataframe = cache_result(udtf_dataframe)
 
         # After applying the udtf, the underlying Snowpark DataFrame becomes
         # -------------------------------------------------------------------------------------------
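
The hunk above catches a server-side failure and replaces its text with a clearer message when it recognizes one of two marker substrings. A minimal, offline sketch of that pattern follows. `FakeSQLException`, `run_with_friendly_error`, and the two sample callables are hypothetical names introduced here for illustration and are not part of Snowpark. The sketch also differs from the commit in two ways: it chains the original exception with `from e`, and it re-raises unrecognized errors, whereas the commit's `else` branch calls `cache_result` a second time.

```python
class FakeSQLException(Exception):
    """Hypothetical stand-in for SnowparkSQLException so the sketch runs offline."""


# Substrings the commit looks for in the server-side error text.
MODIN_IN_APPLY_MARKERS = ("No module named 'snowflake'", "Modin is not installed")

# Shortened version of the message raised by the commit (URL omitted here).
FRIENDLY_MESSAGE = (
    "modin.pandas cannot be referenced within a Snowpark pandas apply() function. "
    "You can only use native pandas inside apply()."
)


def run_with_friendly_error(action):
    """Call action(); rewrite recognized failures, re-raise everything else."""
    try:
        return action()
    except FakeSQLException as e:
        if any(marker in str(e) for marker in MODIN_IN_APPLY_MARKERS):
            # Chain the original exception so the server-side details stay visible.
            raise FakeSQLException(FRIENDLY_MESSAGE) from e
        # Assumption of this sketch: unrecognized errors propagate unchanged.
        raise


def failing_udtf():
    """Simulates the error seen when modin.pandas is used inside apply()."""
    raise FakeSQLException("ModuleNotFoundError: No module named 'snowflake'")


def unrelated_failure():
    """Simulates any other SQL failure, which should not be rewritten."""
    raise FakeSQLException("SQL compilation error")
```

Calling `run_with_friendly_error(failing_udtf)` raises the rewritten message with the original error attached as `__cause__`, while `run_with_friendly_error(unrelated_failure)` surfaces the original error untouched.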

tests/integ/modin/frame/test_apply.py

Lines changed: 12 additions & 0 deletions
@@ -1255,3 +1255,15 @@ def operation(col, arg):
     eval_snowpark_pandas_result(
         *create_test_dfs(test_data), lambda df: df.apply(operation, arg=arg2)
     )
+
+
+@sql_count_checker(query_count=3)
+def test_snowpandas_in_apply_negative():
+    df = pd.DataFrame({"date": ["2025-01-01"], "time": ["12:34:56"]})
+    with pytest.raises(
+        SnowparkSQLException,
+        match=re.escape(
+            "modin.pandas cannot be referenced within a Snowpark pandas apply() function"
+        ),
+    ):
+        df.apply(lambda row: pd.to_datetime(f"{row.date} {row.time}"), axis=1)
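
The test wraps the expected text in `re.escape` because `pytest.raises(match=...)` treats its argument as a regular expression and checks it with `re.search`. The expected text contains `apply()`, and an unescaped `()` is an empty group rather than two literal parentheses. The following standard-library sketch shows the difference. `FULL_MESSAGE` is a shortened copy of the message from the commit, and the variable names are illustrative only.

```python
import re

# Shortened copy of the message raised by the commit (URL omitted here).
FULL_MESSAGE = (
    "modin.pandas cannot be referenced within a Snowpark pandas apply() function. "
    "You can only use native pandas inside apply()."
)

# The fragment the test expects to find in the exception text.
EXPECTED_FRAGMENT = (
    "modin.pandas cannot be referenced within a Snowpark pandas apply() function"
)

# Unescaped: "()" is parsed as an empty group, so the pattern looks for
# "apply function" and never consumes the literal parentheses in the message.
raw_match = re.search(EXPECTED_FRAGMENT, FULL_MESSAGE)

# Escaped: every metacharacter is matched literally.
escaped_match = re.search(re.escape(EXPECTED_FRAGMENT), FULL_MESSAGE)

print("unescaped pattern matched:", raw_match is not None)
print("escaped pattern matched:", escaped_match is not None)
```

Without `re.escape`, the assertion in the test would fail even though the message text is exactly what the test author intended.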
