Commit fd01458

Merge branch 'main' into helmeleegy-SNOW-2432713
2 parents f0469e6 + 30fa5f3 commit fd01458

27 files changed (+1715, -557 lines)

CHANGELOG.md

Lines changed: 17 additions & 1 deletion
@@ -7,7 +7,6 @@
 #### New Features
 
 - Added a new function `service` in `snowflake.snowpark.functions` that allows users to create a callable representing a Snowpark Container Services (SPCS) service.
-- Added a new function `group_by_all()` to the `DataFrame` class.
 - Added `connection_parameters` parameter to `DataFrameReader.dbapi()` (PuPr) method to allow passing keyword arguments to the `create_connection` callable.
 - Added support for `Session.begin_transaction`, `Session.commit` and `Session.rollback`.
 - Added support for the following functions in `functions.py`:
@@ -67,6 +66,8 @@
 - Fixed a bug where writing Snowpark pandas dataframes on the pandas backend with a column multiindex to Snowflake with `to_snowflake` would raise `KeyError`.
 - Fixed a bug that `DataFrameReader.dbapi` (PuPr) is not compatible with oracledb 3.4.0.
 - Fixed a bug where `modin` would unintentionally be imported during session initialization in some scenarios.
+- Fixed a bug where `session.udf|udtf|udaf|sproc.register` failed when an extra session argument was passed. These methods do not expect a session argument; please remove it if provided.
+- Fixed a bug in `DataFrameGroupBy.agg` where `func` is a list of tuples used to set the names of the output columns.
 
 #### Improvements
 
@@ -83,6 +84,7 @@
 - Added support for the `dtypes` parameter of `pd.get_dummies`
 - Added support for `nunique` in `df.pivot_table`, `df.agg` and other places where aggregate functions can be used.
 - Added support for `DataFrame.interpolate` and `Series.interpolate` with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL `INTERPOLATE_LINEAR`, `INTERPOLATE_FFILL`, and `INTERPOLATE_BFILL` functions (PuPr).
+- Added support for `DataFrame.groupby.rolling()`.
 
 #### Improvements
 
@@ -93,6 +95,16 @@
 - `skew()` with `axis=1` or `numeric_only=False` parameters
 - `round()` with `decimals` parameter as a Series
 - `corr()` with `method!=pearson` parameter
+- `shift()` with `suffix` or non-integer `periods` parameters
+- `sort_index()` with `axis=1` or `key` parameters
+- `sort_values()` with `axis=1`
+- `melt()` with `col_level` parameter
+- `apply()` with `result_type` parameter for DataFrame
+- `pivot_table()` with `sort=True`, non-string `index` list, non-string `columns` list, non-string `values` list, or `aggfunc` dict with non-string values
+- `fillna()` with `downcast` parameter or using `limit` together with `value`
+- `dropna()` with `axis=1`
+
+
 - Set `cte_optimization_enabled` to True for all Snowpark pandas sessions.
 - Add support for the following in faster pandas:
 - `isin`
@@ -127,6 +139,7 @@
 - `sort_values`
 - `loc` (setting columns)
 - `to_datetime`
+- `rename`
 - `drop`
 - `invert`
 - `duplicated`
@@ -151,6 +164,9 @@
 - `groupby.median`
 - `groupby.std`
 - `groupby.var`
+- `groupby.nunique`
+- `groupby.size`
+- `groupby.apply`
 - `drop_duplicates`
 - Reuse row count from the relaxed query compiler in `get_axis_len`.
 

docs/source/modin/supported/groupby_supported.rst

Lines changed: 4 additions & 1 deletion
@@ -153,7 +153,10 @@ Computations/descriptive stats
 |                             |                                 | will be lost. ``rule`` frequencies 's', 'min',     |
 |                             |                                 | 'h', and 'D' are supported.                        |
 +-----------------------------+---------------------------------+----------------------------------------------------+
-| ``rolling``                 | N                               |                                                    |
+| ``rolling``                 | P                               | Implemented for DataFrameGroupBy objects. ``N`` for|
+|                             |                                 | ``on``, non-integer ``window``, ``axis = 1``,      |
+|                             |                                 | ``method`` != ``single``, ``min_periods = 0``, or  |
+|                             |                                 | ``closed`` != ``None``.                            |
 +-----------------------------+---------------------------------+----------------------------------------------------+
 | ``sample``                  | N                               |                                                    |
 +-----------------------------+---------------------------------+----------------------------------------------------+
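
For readers unfamiliar with the operation behind the ``rolling`` row, the following is a rough pure-Python sketch of what a grouped rolling sum computes for an integer window. It is not the Snowpark pandas implementation: `grouped_rolling_sum` is a hypothetical helper, it keeps results in input order (pandas orders `groupby.rolling` output by group), and it assumes the pandas default of requiring a full window before emitting a value.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable, Iterable, List, Optional, Tuple


def grouped_rolling_sum(
    rows: Iterable[Tuple[Hashable, float]], window: int
) -> List[Optional[float]]:
    """Rolling sum over the last `window` values seen within each group.

    Hypothetical helper for illustration only. A position yields None until
    its group has supplied `window` values (assumed min_periods == window).
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    # One bounded buffer per group key; old values fall off automatically.
    recent: Dict[Hashable, Deque[float]] = defaultdict(lambda: deque(maxlen=window))
    out: List[Optional[float]] = []
    for key, value in rows:
        buf = recent[key]
        buf.append(value)
        out.append(sum(buf) if len(buf) == window else None)
    return out


rows = [("a", 1.0), ("a", 2.0), ("b", 10.0), ("a", 3.0), ("b", 20.0)]
result = grouped_rolling_sum(rows, window=2)
print(result)
```

Each group keeps its own window, so the values for group `b` never contribute to the sums for group `a`.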

docs/source/snowpark/dataframe.rst

Lines changed: 0 additions & 1 deletion
@@ -57,7 +57,6 @@ DataFrame
 DataFrame.flatten
 DataFrame.groupBy
 DataFrame.group_by
-DataFrame.group_by_all
 DataFrame.group_by_grouping_sets
 DataFrame.intersect
 DataFrame.join

setup.py

Lines changed: 4 additions & 1 deletion
@@ -68,7 +68,10 @@
     "lxml",  # used in XML reader unit tests
 ]
 MODIN_DEVELOPMENT_REQUIREMENTS = [
-    "scipy",  # Snowpark pandas 3rd party library testing
+    # Snowpark pandas 3rd party library testing. Cap the scipy version because
+    # Snowflake cannot find newer versions of scipy for python 3.11+. See
+    # SNOW-2452791.
+    "scipy<=1.16.0",
     "statsmodels",  # Snowpark pandas 3rd party library testing
     "scikit-learn",  # Snowpark pandas 3rd party library testing
     # plotly version restricted due to foreseen change in query counts in version 6.0.0+

src/snowflake/snowpark/_internal/analyzer/analyzer.py

Lines changed: 0 additions & 3 deletions
@@ -91,7 +91,6 @@
 from snowflake.snowpark._internal.analyzer.grouping_set import (
     GroupingSet,
     GroupingSetsExpression,
-    GroupByAll,
 )
 from snowflake.snowpark._internal.analyzer.select_statement import (
     Selectable,
@@ -347,8 +346,6 @@ def analyze(
             )
 
         if isinstance(expr, GroupingSet):
-            if isinstance(expr, GroupByAll):
-                return "ALL"
             return self.grouping_extractor(expr, df_aliased_col_name_to_real_col_name)
 
         if isinstance(expr, WindowExpression):

src/snowflake/snowpark/_internal/analyzer/grouping_set.py

Lines changed: 0 additions & 4 deletions
@@ -40,10 +40,6 @@ class Rollup(GroupingSet):
     pass
 
 
-class GroupByAll(GroupingSet):
-    pass
-
-
 class GroupingSetsExpression(Expression):
     def __init__(self, args: List[List[Expression]]) -> None:
         super().__init__()
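
To make the removed lines concrete, here is a minimal, self-contained sketch of the dispatch pattern this merge drops from `analyze`. The class bodies, the standalone `analyze` function, and the `ROLLUP(a, b)` string format are simplified assumptions for illustration, not the Snowpark implementation; only the class names and the `"ALL"` return value come from the diff.

```python
class Expression:
    """Hypothetical stand-in for the analyzer's expression base class."""


class GroupingSet(Expression):
    def __init__(self, *columns: str) -> None:
        self.columns = list(columns)


class Rollup(GroupingSet):
    pass


class GroupByAll(GroupingSet):
    pass


def analyze(expr: Expression) -> str:
    if isinstance(expr, GroupingSet):
        # The subclass check has to run first: GroupByAll is also a
        # GroupingSet, so the generic branch would otherwise handle it.
        if isinstance(expr, GroupByAll):
            return "ALL"
        # Assumed output format, for illustration only.
        return f"{type(expr).__name__.upper()}({', '.join(expr.columns)})"
    raise TypeError(f"unsupported expression: {type(expr).__name__}")


all_sql = analyze(GroupByAll())
rollup_sql = analyze(Rollup("a", "b"))
print(all_sql, rollup_sql)
```

With the `GroupByAll` class and its branch removed, as in this commit, every `GroupingSet` goes through the generic path again.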
