snowflakedb
diff --git a/.github/workflows/parameters/rsa_keys/rsa_key_aws.p8.gpg b/.github/workflows/parameters/rsa_keys/rsa_key_aws.p8.gpg
2.54 KB
diff --git a/.github/workflows/parameters/rsa_keys/rsa_key_azure.p8.gpg b/.github/workflows/parameters/rsa_keys/rsa_key_azure.p8.gpg
2.54 KB
diff --git a/.github/workflows/parameters/rsa_keys/rsa_key_gcp.p8.gpg b/.github/workflows/parameters/rsa_keys/rsa_key_gcp.p8.gpg
2.53 KB
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 63 additions & 1 deletion
diff --git a/docs/source/modin/hybrid_execution.rst b/docs/source/modin/hybrid_execution.rst
Lines changed: 12 additions & 4 deletions
diff --git a/docs/source/modin/supported/agg_supp.rst b/docs/source/modin/supported/agg_supp.rst
Lines changed: 3 additions & 0 deletions
diff --git a/docs/source/modin/supported/dataframe_supported.rst b/docs/source/modin/supported/dataframe_supported.rst
Lines changed: 5 additions & 1 deletion
diff --git a/docs/source/modin/supported/general_supported.rst b/docs/source/modin/supported/general_supported.rst
Lines changed: 2 additions & 2 deletions
diff --git a/docs/source/modin/supported/groupby_supported.rst b/docs/source/modin/supported/groupby_supported.rst
Lines changed: 4 additions & 1 deletion
diff --git a/docs/source/modin/supported/series_supported.rst b/docs/source/modin/supported/series_supported.rst
Lines changed: 5 additions & 1 deletion
@@ -49,21 +49,43 @@
     - `st_y`
     - `st_ymax`
     - `st_ymin`
-
+    - `st_geogfromgeohash`
+    - `st_geogpointfromgeohash`
+    - `st_geographyfromwkb`
+    - `st_geographyfromwkt`
+    - `st_geometryfromwkb`
+    - `st_geometryfromwkt`
+    - `try_to_geography`
+    - `try_to_geometry`
+- Added a parameter to enable and disable automatic column name aliasing for `interval_day_time_from_parts` and `interval_year_month_from_parts` functions.
 
 #### Bug Fixes
 
 - Fixed a bug that `DataFrameReader.xml` fails to parse XML files with undeclared namespaces when `ignoreNamespace` is `True`.
 - Added a fix for floating point precision discrepancies in `interval_day_time_from_parts`.
 - Fixed a bug where writing Snowpark pandas dataframes on the pandas backend with a column multiindex to Snowflake with `to_snowflake` would raise `KeyError`.
 - Fixed a bug that `DataFrameReader.dbapi` (PuPr) is not compatible with oracledb 3.4.0.
+- Fixed a bug where `modin` would unintentionally be imported during session initialization in some scenarios.
+- Fixed a bug where `session.udf|udtf|udaf|sproc.register` failed when an extra session argument was passed. These methods do not expect a session argument; please remove it if provided.
+- Fixed a bug in `DataFrameGroupBy.agg` where `func` is a list of tuples used to set the names of the output columns.
+
+#### Improvements
+
+- Increased the default maximum length for inferred StringType columns during schema inference in `DataFrameReader.dbapi` from 16MB to 128MB for Parquet-file-based ingestion.
 
 #### Dependency Updates
 
 - Updated dependency of `snowflake-connector-python>=3.17,<5.0.0`.
 
 ### Snowpark pandas API Updates
 
+#### New Features
+
+- Added support for the `dtype` parameter of `pd.get_dummies`.
+- Added support for `nunique` in `df.pivot_table`, `df.agg` and other places where aggregate functions can be used.
+- Added support for `DataFrame.interpolate` and `Series.interpolate` with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL `INTERPOLATE_LINEAR`, `INTERPOLATE_FFILL`, and `INTERPOLATE_BFILL` functions (PuPr).
+- Added support for `DataFrame.groupby.rolling()`.
+
 #### Improvements
 
 - Improved performance of `Series.to_snowflake` and `pd.to_snowflake(series)` for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable `modin.config.PandasToSnowflakeParquetThresholdBytes`.
@@ -73,6 +95,16 @@
   - `skew()` with `axis=1` or `numeric_only=False` parameters
   - `round()` with `decimals` parameter as a Series
   - `corr()` with `method!=pearson` parameter
+  - `shift()` with `suffix` or non-integer `periods` parameters
+  - `sort_index()` with `axis=1` or `key` parameters
+  - `sort_values()` with `axis=1`
+  - `melt()` with `col_level` parameter
+  - `apply()` with `result_type` parameter for DataFrame
+  - `pivot_table()` with `sort=True`, non-string `index` list, non-string `columns` list, non-string `values` list, or `aggfunc` dict with non-string values
+  - `fillna()` with `downcast` parameter or using `limit` together with `value`
+  - `dropna()` with `axis=1`
+
+
 - Set `cte_optimization_enabled` to True for all Snowpark pandas sessions.
 - Add support for the following in faster pandas:
   - `isin`
@@ -105,7 +137,37 @@
   - `dt.days_in_month`
   - `dt.daysinmonth`
   - `sort_values`
+  - `loc` (setting columns)
   - `to_datetime`
+  - `rename`
+  - `drop`
+  - `invert`
+  - `duplicated`
+  - `iloc`
+  - `head`
+  - `columns` (e.g., df.columns = ["A", "B"])
+  - `agg`
+  - `min`
+  - `max`
+  - `count`
+  - `sum`
+  - `mean`
+  - `median`
+  - `std`
+  - `var`
+  - `groupby.agg`
+  - `groupby.min`
+  - `groupby.max`
+  - `groupby.count`
+  - `groupby.sum`
+  - `groupby.mean`
+  - `groupby.median`
+  - `groupby.std`
+  - `groupby.var`
+  - `groupby.nunique`
+  - `groupby.size`
+  - `groupby.apply`
+  - `drop_duplicates`
 - Reuse row count from the relaxed query compiler in `get_axis_len`.
 
 #### Bug Fixes
 
@@ -1,5 +1,5 @@
 ===========================================
-Hybrid Execution (Public Preview)
+Hybrid Execution
 ===========================================
 
 Snowpark pandas supports workloads on mixed underlying execution engines and will automatically
@@ -37,8 +37,8 @@ read_snowflake, value_counts, tail, var, std, sum, sem, max, min, mean, agg, agg
 Examples
 ========
 
-Enabling Hybrid Execution
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Disabling or Enabling Hybrid Execution
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. code-block:: python
 
@@ -140,4 +140,12 @@ Debugging Hybrid Execution
 
 `pd.explain_switch()` provides information on how execution engine decisions
 are made. This method prints a simplified version of the command unless `simple=False` is
-passed as an argument.
+passed as an argument.
+
+Performance Considerations
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+Hybrid mode will generally perform well with small datasets and traditional notebook
+workloads, but merge-heavy workloads using a star schema can result in moving data too
+often, particularly when tables in the star schema straddle the transfer-cost boundary.
+Because the Snowflake warehouse is designed for these SQL-like workloads, turning off
+hybrid mode may be desirable.
@@ -38,6 +38,9 @@ methods ``pd.pivot_table``, ``DataFrame.pivot_table``, and ``pd.crosstab``.
 | ``median``                  | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   | ``Y``                                   |
 |                             | ``N`` for  ``axis=1``.              |                                  |                                            |                                         |                                         |
 +-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+-----------------------------------------+
+| ``nunique``                 | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   | ``Y``                                   |
+|                             | ``N`` for  ``axis=1``.              |                                  |                                            |                                         |                                         |
++-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+-----------------------------------------+
 | ``size``                    | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   | ``N``                                   |
 |                             | ``N`` for  ``axis=1``.              |                                  |                                            |                                         |                                         |
 +-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+-----------------------------------------+
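The ``nunique`` row added above can be illustrated with a short sketch. It uses plain pandas rather than a Snowflake session (an assumption made so it runs locally), and the column names and data are hypothetical. It shows the two column-wise (``axis=0``) uses that the table marks as supported.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "west"],
        "product": ["a", "a", "a", "b", "c"],
        "qty": [1, 2, 2, 3, 3],
    }
)

# Column-wise (axis=0) distinct counts via DataFrame.agg.
per_column = df.agg("nunique")

# Distinct products per region via pivot_table with aggfunc="nunique".
per_region = df.pivot_table(index="region", values="product", aggfunc="nunique")

print(per_column)
print(per_region)
```

With Snowpark pandas the same calls would presumably be made through ``modin.pandas`` with an active session, which is not shown here.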
 
@@ -227,7 +227,11 @@ Methods
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``insert``                  | Y                               |                                  |                                                    |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
-| ``interpolate``             | N                               |                                  |                                                    |
+| ``interpolate``             | P                               |                                  | ``N`` if ``axis == 1``, ``limit`` is set,          |
+|                             |                                 |                                  | ``limit_area`` is "outside", or ``method`` is not  |
+|                             |                                 |                                  | "linear", "bfill", "backfill", "ffill", or "pad".  |
+|                             |                                 |                                  | ``limit_area="inside"`` is supported only when     |
+|                             |                                 |                                  | ``method`` is ``linear``.                          |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``isetitem``                | N                               |                                  |                                                    |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
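As a rough illustration of the ``interpolate`` notes above, the sketch below runs in plain pandas (an assumption, since no Snowflake session is available here) with hypothetical data. It covers only ``method="linear"``, with and without ``limit_area="inside"``, both of which the table lists as supported.

```python
import pandas as pd

# Hypothetical data: "x" has an interior gap and a trailing gap,
# "y" has a leading gap and an interior gap.
df = pd.DataFrame(
    {
        "x": [1.0, None, 3.0, None],
        "y": [None, 2.0, None, 4.0],
    }
)

# Default linear interpolation: interior gaps are filled linearly, and
# trailing gaps are filled forward with the last valid value.
filled = df.interpolate(method="linear")

# limit_area="inside" restricts filling to gaps surrounded by valid values,
# so the trailing NaN in "x" stays NaN.
inside = df.interpolate(method="linear", limit_area="inside")

print(filled)
print(inside)
```

Leading gaps are left untouched in both calls because the default fill direction is forward.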
 
@@ -32,8 +32,8 @@ Data manipulations
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``from_dummies``            | N                               |                                  |                                                    |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
-| ``get_dummies``             | P                               | ``sparse`` is ignored            | ``Y`` if params ``dummy_na``, ``drop_first``       |
-|                             |                                 |                                  | and ``dtype`` are default, otherwise ``N``         |
+| ``get_dummies``             | P                               | ``sparse`` is ignored            | ``Y`` if params ``dummy_na`` and ``drop_first``    |
+|                             |                                 |                                  | are default, otherwise ``N``                       |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``json_normalize``          | Y                               |                                  |                                                    |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
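To show what the ``dtype`` parameter of ``get_dummies`` controls, here is a minimal sketch in plain pandas (assumed here so it runs without a Snowflake session). The series name and values are hypothetical, and ``dummy_na`` and ``drop_first`` are left at their defaults, matching the supported case in the table.

```python
import pandas as pd

# Hypothetical categorical column.
colors = pd.Series(["red", "green", "red"], name="color")

# Request compact integer indicator columns instead of the default dtype.
dummies = pd.get_dummies(colors, dtype="int8")

print(dummies)
print(dummies.dtypes)
```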
 
@@ -153,7 +153,10 @@ Computations/descriptive stats
 |                             |                                 | will be lost. ``rule`` frequencies 's', 'min',     |
 |                             |                                 | 'h', and 'D' are supported.                        |
 +-----------------------------+---------------------------------+----------------------------------------------------+
-| ``rolling``                 | N                               |                                                    |
+| ``rolling``                 | P                               | Implemented for DataFrameGroupBy objects. ``N`` for|
+|                             |                                 | ``on``, non-integer ``window``, ``axis = 1``,      |
+|                             |                                 | ``method`` != ``single``, ``min_periods = 0``, or  |
+|                             |                                 | ``closed``  != ``None``.                           |
 +-----------------------------+---------------------------------+----------------------------------------------------+
 | ``sample``                  | N                               |                                                    |
 +-----------------------------+---------------------------------+----------------------------------------------------+
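The ``rolling`` entry above can be sketched as follows, using plain pandas (an assumption so the example runs locally) and hypothetical data. The call stays within the listed constraints: an integer ``window``, a non-zero ``min_periods``, and defaults for ``on``, ``axis``, ``method``, and ``closed``.

```python
import pandas as pd

# Hypothetical data: two groups with a numeric value column.
df = pd.DataFrame(
    {
        "key": ["a", "a", "a", "b", "b"],
        "val": [1.0, 2.0, 3.0, 10.0, 20.0],
    }
)

# Rolling sum over a window of 2 rows, computed independently per group.
# min_periods=1 lets the first row of each group produce a value.
rolled = df.groupby("key").rolling(window=2, min_periods=1).sum()

print(rolled["val"])
```

The window restarts at each group boundary, so the first ``b`` row is not combined with the last ``a`` row.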
 
@@ -243,7 +243,11 @@ Methods
 | ``info``                    | D                               |                                  | Different Index types are used in pandas but not   |
 |                             |                                 |                                  | in Snowpark pandas                                 |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
-| ``interpolate``             | N                               |                                  |                                                    |
+| ``interpolate``             | P                               |                                  | ``N`` if ``limit`` is set,                         |
+|                             |                                 |                                  | ``limit_area`` is "outside", or ``method`` is not  |
+|                             |                                 |                                  | "linear", "bfill", "backfill", "ffill", or "pad".  |
+|                             |                                 |                                  | ``limit_area="inside"`` is supported only when     |
+|                             |                                 |                                  | ``method`` is ``linear``.                          |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``isin``                    | Y                               |                                  | Snowpark pandas deviates with respect to handling  |
 |                             |                                 |                                  | NA values                                          |