Releases: snowflakedb/snowpark-python
1.12.1 (2024-02-08)
Improvements
- Use `split_blocks=True` by default during `to_pandas` conversion, for optimal memory allocation. This parameter is passed to `pyarrow.Table.to_pandas`, which enables `PyArrow` to split the memory allocation into smaller, more manageable blocks instead of allocating a single contiguous block. This results in better memory management when dealing with larger datasets.
Bug Fixes
- Fixed a bug in `DataFrame.to_pandas` that caused an error when evaluating on a DataFrame with an `IntegerType` column with null values.
1.12.0 (2024-01-30)
New Features
- Exposed `statement_params` in `StoredProcedure.__call__`.
- Added two optional arguments to `Session.add_import`:
  - `chunk_size`: The number of bytes to hash per chunk of the uploaded files.
  - `whole_file_hash`: By default, only the first chunk of the uploaded import is hashed to save time. When this is set to `True`, each uploaded file is fully hashed instead.
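The idea behind `chunk_size` and `whole_file_hash` can be sketched in plain Python (a hypothetical standalone illustration with `hashlib`, not Snowpark's actual implementation):

```python
import hashlib

def file_digest(path, chunk_size=8192, whole_file_hash=False):
    """Hash only the first chunk of a file (fast) or the entire file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        if whole_file_hash:
            # Read and hash the whole file, one chunk at a time.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        else:
            # Hash the first chunk only, trading accuracy for speed.
            h.update(f.read(chunk_size))
    return h.hexdigest()
```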
- Added parameters `external_access_integrations` and `secrets` when creating a UDAF from Snowpark Python to allow integration with external access.
- Added a new method `Session.append_query_tag`. Allows an additional tag to be added to the current query tag by appending it as a comma-separated value.
- Added a new method `Session.update_query_tag`. Allows updates to a JSON-encoded dictionary query tag.
- `SessionBuilder.getOrCreate` will now attempt to replace the singleton it returns when token expiration has been detected.
- Added support for new functions in `snowflake.snowpark.functions`:
  - `array_except`
  - `create_map`
  - `sign`/`signum`
- Added the following functions to `DataFrame.analytics`:
  - Added the `moving_agg` function in `DataFrame.analytics` to enable moving aggregations like sums and averages with multiple window sizes.
  - Added the `cumulative_agg` function in `DataFrame.analytics` to enable cumulative aggregations like sums and averages.
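As a rough analogy, the kinds of results these helpers produce correspond to pandas' rolling and cumulative operations (illustrative only; the Snowpark API, window definitions, and output column naming differ):

```python
import pandas as pd

df = pd.DataFrame({"sales": [10, 20, 30, 40]})
# Moving sum over a window of 2 rows, like moving_agg with window size 2.
df["sales_sum_2"] = df["sales"].rolling(window=2).sum()
# Running total, like a cumulative sum aggregation.
df["sales_cumsum"] = df["sales"].cumsum()
```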
Bug Fixes
- Fixed a bug in `DataFrame.na.fill` that caused Boolean values to erroneously override integer values.
- Fixed a bug in `Session.create_dataframe` where the Snowpark DataFrames created using pandas DataFrames were not inferring the type for timestamp columns correctly. The behavior is as follows:
  - Earlier, timestamp columns without a timezone would be converted to nanosecond epochs and inferred as `LongType()`, but will now be correctly maintained as timestamp values and be inferred as `TimestampType(TimestampTimeZone.NTZ)`.
  - Earlier, timestamp columns with a timezone would be inferred as `TimestampType(TimestampTimeZone.NTZ)` and lose timezone information, but will now be correctly inferred as `TimestampType(TimestampTimeZone.LTZ)` and timezone information is retained correctly.
  - Set session parameter `PYTHON_SNOWPARK_USE_LOGICAL_TYPE_FOR_CREATE_DATAFRAME` to revert to the old behavior. It is recommended that you update your code to align with the correct behavior because the parameter will be removed in the future.
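On the pandas side, the distinction Snowpark now preserves is between timezone-naive and timezone-aware timestamps (illustrative sketch, not Snowpark code):

```python
import pandas as pd

# Naive timestamp: no timezone attached, maps to TIMESTAMP_NTZ semantics.
naive = pd.Timestamp("2024-01-30 12:00:00")
# Timezone-aware timestamp: maps to TIMESTAMP_LTZ semantics, tz is retained.
aware = pd.Timestamp("2024-01-30 12:00:00", tz="UTC")
```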
- Fixed a bug that `DataFrame.to_pandas` gets decimal type when scale is not 0, and creates an object dtype in `pandas`. Instead, we cast the value to a float64 type.
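The dtype issue can be reproduced in pandas alone (a sketch): `Decimal` values land in an `object` column unless cast to `float64`, which is what the fix now does for nonzero-scale decimals:

```python
from decimal import Decimal

import pandas as pd

# Decimal values are stored as Python objects, giving an object dtype.
s_obj = pd.Series([Decimal("1.5"), Decimal("2.25")])
# Casting to float64 yields a proper numeric column.
s_f64 = s_obj.astype("float64")
```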
- Fixed bugs that wrongly flattened the generated SQL when one of the following happens:
  - `DataFrame.filter()` is called after `DataFrame.sort().limit()`.
  - `DataFrame.sort()` or `filter()` is called on a DataFrame that already has a window function or sequence-dependent data generator column. For instance, `df.select("a", seq1().alias("b")).select("a", "b").sort("a")` won't flatten the sort clause anymore.
  - A window or sequence-dependent data generator column is used after `DataFrame.limit()`. For instance, `df.limit(10).select(row_number().over())` won't flatten the limit and select in the generated SQL.
- Fixed a bug where aliasing a DataFrame column raised an error when the DataFrame was copied from another DataFrame with an aliased column. For instance,

  ```python
  df = df.select(col("a").alias("b"))
  df = copy(df)
  df.select(col("b").alias("c"))  # threw an error. Now it's fixed.
  ```
- Fixed a bug in `Session.create_dataframe` that the non-nullable field in a schema is not respected for boolean type. Note that this fix is only effective when the user has the privilege to create a temp table.
- Fixed a bug in SQL simplifier where non-select statements in `session.sql` dropped a SQL query when used with `limit()`.
- Fixed a bug that raised an exception when session parameter `ERROR_ON_NONDETERMINISTIC_UPDATE` is true.
Behavior Changes (API Compatible)
- When parsing data types during a `to_pandas` operation, we rely on the GS precision value to fix precision issues for large integer values. This may affect users where a column that was earlier returned as `int8` gets returned as `int64`. Users can fix this by explicitly specifying precision values for their return column.
- Aligned behavior for `Session.call` in case of table stored procedures, where running `Session.call` would not trigger the stored procedure unless a `collect()` operation was performed.
- `StoredProcedureRegistration` will now automatically add `snowflake-snowpark-python` as a package dependency. The added dependency will be on the client's local version of the library, and an error is thrown if the server cannot support that version.
1.11.1 (2023-12-07)
Bug Fixes
- Fixed a bug where `numpy` was imported at the top level of the mock module instead of lazily.
1.11.0 (2023-12-05)
New Features
- Added the `conn_error` attribute to `SnowflakeSQLException` that stores the whole underlying exception from `snowflake-connector-python`.
- Added support for `RelationalGroupedDataframe.pivot()` to access `pivot` in the following pattern: `Dataframe.group_by(...).pivot(...)`.
- Added experimental feature: Local Testing Mode, which allows you to create and operate on Snowpark Python DataFrames locally without connecting to a Snowflake account. You can use the local testing framework to test your DataFrame operations locally, on your development machine or in a CI (continuous integration) pipeline, before deploying code changes to your account.
- Added support for the new function `arrays_to_object` in `snowflake.snowpark.functions`.
- Added support for the vector data type.
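A pandas analogy of what the `group_by(...).pivot(...)` pattern expresses (illustrative only; the Snowpark call compiles to SQL and runs on Snowflake):

```python
import pandas as pd

df = pd.DataFrame(
    {"emp": [1, 1, 2], "month": ["JAN", "FEB", "JAN"], "amount": [10.0, 20.0, 30.0]}
)
# Group by employee, then pivot months into columns, summing amounts.
pivoted = df.pivot_table(index="emp", columns="month", values="amount", aggfunc="sum")
```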
Dependency Updates
- Bumped cloudpickle dependency to work with `cloudpickle==2.2.1`.
- Updated `snowflake-connector-python` to `3.4.0`.
Bug Fixes
- DataFrame column names quoting check now supports newline characters.
- Fixed a bug where a DataFrame generated by `session.read.with_metadata` creates an inconsistent table when doing `df.write.save_as_table`.
1.10.0 (2023-11-03)
New Features
- Added support for managing case sensitivity in `DataFrame.to_local_iterator()`.
- Added support for specifying vectorized UDTF's input column names by using the optional parameter `input_names` in `UDTFRegistration.register/register_file` and `functions.pandas_udtf`. By default, `RelationalGroupedDataFrame.applyInPandas` will infer the column names from the current dataframe schema.
- Added `sql_error_code` and `raw_message` attributes to `SnowflakeSQLException` when it is caused by a SQL exception.
Bug Fixes
- Fixed a bug in `DataFrame.to_pandas()` where converting Snowpark dataframes to pandas dataframes was losing precision on integers with more than 19 digits.
- Fixed a bug that `session.add_packages` cannot handle requirement specifiers that contain a project name with an underscore and a version.
- Fixed a bug in `DataFrame.limit()` when `offset` is used and the parent `DataFrame` uses `limit`. Now the `offset` won't impact the parent DataFrame's `limit`.
- Fixed a bug in `DataFrame.write.save_as_table` where dataframes created from the read API could not save data into Snowflake because of an invalid column name `$1`.
Behavior Changes
- Changed the behavior of `date_format`:
  - The `format` argument changed from optional to required.
  - The returned result changed from a date object to a date-formatted string.
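In Python terms, the change is analogous to returning `strftime` output instead of a date object (a sketch; Snowflake's format tokens such as `YYYY/MM/DD` differ from `strftime` codes):

```python
from datetime import date

d = date(2023, 11, 3)
# The result is a formatted string, not a date object.
formatted = d.strftime("%Y/%m/%d")
```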
- When a window function, or a sequence-dependent data generator function (`normal`, `zipf`, `uniform`, `seq1`, `seq2`, `seq4`, `seq8`), is used, the sort and filter operation will no longer be flattened when generating the query.
1.9.0 (2023-10-13)
New Features
- Added support for the Python 3.11 runtime environment.
Dependency updates
- Added back the dependency of `typing-extensions`.
Bug Fixes
- Fixed a bug where imports from permanent stage locations were ignored for temporary stored procedures, UDTFs, UDFs, and UDAFs.
- Reverted back to using a CTAS (create table as select) statement for `Dataframe.writer.save_as_table`, which does not need insert permission for writing tables.
New Features
- Support `PythonObjJSONEncoder` json-serializable objects for `ARRAY` and `OBJECT` literals.
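The mechanism resembles the standard-library pattern of extending `json.JSONEncoder` (a minimal stdlib sketch of the general pattern, not the Snowpark class itself):

```python
import json
from datetime import date

class DateEncoder(json.JSONEncoder):
    """Serialize date objects that json.dumps cannot handle natively."""
    def default(self, obj):
        if isinstance(obj, date):
            return obj.isoformat()
        return super().default(obj)

encoded = json.dumps({"d": date(2023, 10, 13)}, cls=DateEncoder)
```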
1.8.0 (2023-09-14)
New Features
- Added support for VOLATILE/IMMUTABLE keyword when registering UDFs.
- Added support for specifying clustering keys when saving dataframes using `DataFrame.save_as_table`.
- Accept `Iterable` objects as input for `schema` when creating dataframes using `Session.create_dataframe`.
- Added the property `DataFrame.session` to return a `Session` object.
- Added the property `Session.session_id` to return an integer that represents the session ID.
- Added the property `Session.connection` to return a `SnowflakeConnection` object.
- Added support for creating a Snowpark session from a configuration file or environment variables.
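A sketch of sourcing connection parameters from environment variables (the variable names here are hypothetical; with real credentials present, `Session.builder.configs(...).create()` would create the session):

```python
import os

# Hypothetical environment-variable names, for illustration only.
connection_parameters = {
    "account": os.environ.get("SNOWFLAKE_ACCOUNT", "<account>"),
    "user": os.environ.get("SNOWFLAKE_USER", "<user>"),
    "password": os.environ.get("SNOWFLAKE_PASSWORD", "<password>"),
}
# With credentials in place, a session could then be created with:
# Session.builder.configs(connection_parameters).create()
```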
Dependency updates
- Updated `snowflake-connector-python` to 3.2.0.
Bug Fixes
- Fixed a bug where automatic package upload would raise `ValueError` even when compatible package versions were added in `session.add_packages`.
- Fixed a bug where table stored procedures were not registered correctly when using `register_from_file`.
- Fixed a bug where dataframe joins failed with `invalid_identifier` error.
- Fixed a bug where `DataFrame.copy` disables the SQL simplifier for the returned copy.
- Fixed a bug where `session.sql().select()` would fail if any parameters are specified to `session.sql()`.
1.7.0 (2023-08-28)
New Features
- Added parameters `external_access_integrations` and `secrets` when creating a UDF, UDTF or Stored Procedure from Snowpark Python to allow integration with external access.
- Added support for these new functions in `snowflake.snowpark.functions`:
  - `array_flatten`
  - `flatten`
- Added support for `apply_in_pandas` in `snowflake.snowpark.relational_grouped_dataframe`.
- Added support for replicating your local Python environment on Snowflake via `Session.replicate_local_environment`.
Bug Fixes
- Fixed a bug where `session.create_dataframe` fails to properly set nullable columns where nullability was affected by the order in which data was given.
- Fixed a bug where `DataFrame.select` could not identify and alias columns in presence of table functions when output columns of the table function overlapped with columns in the dataframe.
Behavior Changes
- When creating stored procedures, UDFs, UDTFs, UDAFs with parameter `is_permanent=False`, temporary objects will now be created even when `stage_name` is provided. The default value of `is_permanent` is `False`, so users who do not explicitly set this value to `True` for permanent objects will notice a change in behavior.
- `types.StructField` now enquotes column identifiers by default.
1.6.1 (2023-08-02)
New Features
- Added support for these new functions in `snowflake.snowpark.functions`:
  - `array_sort`
  - `sort_array`
  - `array_min`
  - `array_max`
  - `explode_outer`
- Added support for pure Python packages specified via `Session.add_requirements` or `Session.add_packages`. They are now usable in stored procedures and UDFs even if packages are not present on the Snowflake Anaconda channel.
  - Added Session parameters `custom_packages_upload_enabled` and `custom_packages_force_upload_enabled` to enable the support for the pure Python packages feature mentioned above. Both parameters default to `False`.
- Added support for specifying package requirements by passing a Conda environment yaml file to `Session.add_requirements`.
- Added support for asynchronous execution of multi-query dataframes that contain binding variables.
- Added support for renaming multiple columns in `DataFrame.rename`.
- Added support for Geometry datatypes.
- Added support for `params` in `session.sql()` in stored procedures.
- Added support for user-defined aggregate functions (UDAFs). This feature is currently in private preview.
- Added support for vectorized UDTFs (user-defined table functions). This feature is currently in public preview.
- Added support for Snowflake Timestamp variants (i.e., `TIMESTAMP_NTZ`, `TIMESTAMP_LTZ`, `TIMESTAMP_TZ`):
  - Added `TimestampTimezone` as an argument in the `TimestampType` constructor.
  - Added type hints `NTZ`, `LTZ`, `TZ` and `Timestamp` to annotate functions when registering UDFs.
Improvements
- Removed redundant dependency `typing-extensions`.
- `DataFrame.cache_result` now creates temp tables with fully qualified names under the current database and current schema.
Bug Fixes
- Fixed a bug where type check happens on pandas before it is imported.
- Fixed a bug when creating a UDF from `numpy.ufunc`.
- Fixed a bug where `DataFrame.union` was not generating the correct `Selectable.schema_query` when the SQL simplifier is enabled.
Behavior Changes
- `DataFrameWriter.save_as_table` now respects the `nullable` field of the schema provided by the user or the inferred schema based on data from user input.
Dependency updates
- Updated `snowflake-connector-python` to 3.0.4.
1.5.1 (2023-06-20)
New Features
- Added support for the Python 3.10 runtime environment.