Releases: snowflakedb/snowpark-python

Release

13 Jun 23:21
14456f6

1.5.0 (2023-06-09)

Behavior Changes

  • Aggregation results, from functions such as DataFrame.agg and DataFrame.describe, no longer strip away non-printing characters from column names.

New Features

  • Added support for the Python 3.9 runtime environment.
  • Added support for new functions in snowflake.snowpark.functions:
    • array_generate_range
    • array_unique_agg
    • collect_set
    • sequence
  • Added support for registering and calling stored procedures with TABLE return type.
  • Added support for parameter length in StringType() to specify the maximum number of characters that can be stored by the column.
  • Added the alias functions.element_at() for functions.get().
  • Added the alias Column.contains for functions.contains.
  • Added experimental feature DataFrame.alias.
  • Added support for querying metadata columns from stage when creating DataFrame using DataFrameReader.
  • Added support for StructType.add to append more fields to existing StructType objects.
  • Added support for parameter execute_as in StoredProcedureRegistration.register_from_file() to specify stored procedure caller rights.

Bug Fixes

  • Fixed a bug where DataFrame.join_table_function did not run all of the necessary queries to set up the join table function when the SQL simplifier was enabled.
  • Fixed type hint declarations for the custom types ColumnOrName, ColumnOrLiteralStr, ColumnOrSqlExpr, LiteralType and ColumnOrLiteral, which were breaking mypy checks.
  • Fixed a bug where DataFrameWriter.save_as_table and DataFrame.copy_into_table failed to parse fully qualified table names.

Release

24 Apr 22:21
1d6973e

1.4.0 (2023-04-24)

New Features

  • Added support for session.getOrCreate.
  • Added support for alias Column.getField.
  • Added support for new functions in snowflake.snowpark.functions:
    • date_add and date_sub to make add and subtract operations easier.
    • daydiff
    • explode
    • array_distinct
    • regexp_extract
    • struct
    • format_number
    • bround
    • substring_index
  • Added parameter skip_upload_on_content_match when creating UDFs, UDTFs and stored procedures using register_from_file to skip uploading files to a stage if the same version of the files is already on the stage.
  • Added support for DataFrame.save_as_table method to take table names that contain dots.
  • Flattened generated SQL when DataFrame.filter() or DataFrame.order_by() is followed by a projection statement (e.g. DataFrame.select(), DataFrame.with_column()).
  • Added support for creating dynamic tables (in private preview) using DataFrame.create_or_replace_dynamic_table.
  • Added an optional argument params in session.sql() to support binding variables. Note that this is not supported in stored procedures yet.
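The params argument to session.sql() could be used roughly as follows. This is a minimal sketch, assuming an existing Snowpark session is passed in; the table name orders, the column amount and the helper name orders_above are hypothetical, and the ? placeholder assumes qmark-style binding.

```python
def orders_above(session, min_amount):
    """Return rows from the hypothetical `orders` table above a threshold."""
    # `?` marks a bind variable; the value travels separately from the SQL text.
    query = "SELECT * FROM orders WHERE amount > ?"
    # `params` is the optional argument added in 1.4.0.
    return session.sql(query, params=[min_amount]).collect()
```

Binding the value instead of formatting it into the query string avoids quoting mistakes. As noted above, this is not yet supported inside stored procedures.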

Bug Fixes

  • Fixed a bug in strtok_to_array where an exception was thrown when a delimiter was passed in.
  • Fixed a bug in session.add_import where the module had the same namespace as other dependencies.

Release

29 Mar 00:59
667ea4e

1.3.0 (2023-03-28)

New Features

  • Added support for delimiters parameter in functions.initcap().
  • Added support for functions.hash() to accept a variable number of input expressions.
  • Added API Session.conf for getting, setting or checking the mutability of any runtime configuration.
  • Added support for managing case sensitivity in Row results from DataFrame.collect using case_sensitive parameter.
  • Added indexer support for snowflake.snowpark.types.StructType.
  • Added a keyword argument log_on_exception to DataFrame.collect and DataFrame.collect_nowait to optionally disable error logging for SQL exceptions.

Bug Fixes

  • Fixed a bug where a DataFrame set operation (DataFrame.subtract, DataFrame.union, etc.) called after another DataFrame set operation and DataFrame.select or DataFrame.with_column threw an exception.
  • Fixed a bug where chained sort statements were overwritten by the SQL simplifier.

Improvements

  • Simplified JOIN queries to use constant subquery aliases (SNOWPARK_LEFT, SNOWPARK_RIGHT) by default. Users can disable this at runtime with session.conf.set('use_constant_subquery_alias', False) to use randomly generated alias names instead.
  • Allowed specifying statement parameters in session.call().
  • Enabled the uploading of large pandas DataFrames in stored procedures by defaulting to a chunk size of 100,000 rows.

Release

03 Mar 01:37
04ce69d

1.2.0 (2023-03-02)

New Features

  • Added support for displaying source code as comments in the generated scripts when registering stored procedures. This is enabled by default; to turn it off, specify source_code_display=False at registration.
  • Added a parameter if_not_exists when creating a UDF, UDTF or Stored Procedure from Snowpark Python to ignore creating the specified function or procedure if it already exists.
  • Added support for integers when calling snowflake.snowpark.functions.get to extract a value from an array.
  • Added functions.reverse to expose the Snowflake built-in function reverse.
  • Added parameter require_scoped_url in snowflake.snowpark.files.SnowflakeFile.open() (in Private Preview) to replace is_owner_file, which is marked for deprecation.

Bug Fixes

  • Fixed a bug that overwrote paramstyle to qmark when creating a Snowpark session.
  • Fixed a bug where df.join(..., how="cross") fails with SnowparkJoinException: (1112): Unsupported using join type 'Cross'.
  • Fixed a bug where querying a DataFrame column created from chained function calls used a wrong column name.

1.1.0

27 Jan 05:44
dc1e0c8

1.1.0 (2023-01-26)

New Features:

  • Added asc, asc_nulls_first, asc_nulls_last, desc, desc_nulls_first, desc_nulls_last, date_part and unix_timestamp in functions.
  • Added the property DataFrame.dtypes to return a list of column name and data type pairs.
  • Added the following aliases:
    • functions.expr() for functions.sql_expr().
    • functions.date_format() for functions.to_date().
    • functions.monotonically_increasing_id() for functions.seq8()
    • functions.from_unixtime() for functions.to_timestamp()

Bug Fixes:

  • Fixed a bug in SQL simplifier that didn’t handle Column alias and join well in some cases. See #658 for details.
  • Fixed a bug in SQL simplifier that generated wrong column names for function calls, NaN and INF.

Improvements

  • The session parameter PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER is True as of the Snowflake 7.3 release. In snowpark-python, session.sql_simplifier_enabled reads the value of PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER by default, so the SQL simplifier is enabled by default after the Snowflake 7.3 release. To turn it off, set PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER in Snowflake to False or run session.sql_simplifier_enabled = False from Snowpark. Using the SQL simplifier is recommended because it helps generate more concise SQL.
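If the simplifier only needs to be off for part of a program, for example while comparing generated SQL, the toggle can be scoped. The sketch below assumes an existing Snowpark session; the helper name sql_simplifier_disabled is hypothetical and not part of the library. It relies only on reading and assigning session.sql_simplifier_enabled.

```python
from contextlib import contextmanager

@contextmanager
def sql_simplifier_disabled(session):
    """Temporarily turn the SQL simplifier off, then restore the old value."""
    previous = session.sql_simplifier_enabled
    session.sql_simplifier_enabled = False
    try:
        yield session
    finally:
        # Restore the earlier setting even if the body raised an exception.
        session.sql_simplifier_enabled = previous
```

Usage would look like `with sql_simplifier_disabled(session): ...`, with DataFrame operations inside the block generating SQL without the simplifier.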

1.0.0

01 Nov 04:32
1ec8b19

1.0.0 (2022-11-01)

New Features

  • Added Session.generator() to create a new DataFrame using the Generator table function.
  • Added a parameter secure to the functions that create a secure UDF or UDTF.

v0.12.0

14 Oct 21:15
87faa2f

Pre-release

0.12.0 (2022-10-14)

New Features

  • Added new APIs for async job:
    • Session.create_async_job() to create an AsyncJob instance from a query id.
    • AsyncJob.result() now accepts argument result_type to return the results in different formats.
    • AsyncJob.to_df() returns a DataFrame built from the result of this asynchronous job.
    • AsyncJob.query() returns the SQL text of the executed query.
  • DataFrame.agg() and RelationalGroupedDataFrame.agg() now accept variable-length arguments.
  • Added parameters lsuffix and rsuffix to DataFrame.join() and DataFrame.cross_join() to conveniently rename overlapping columns.
  • Added Table.drop_table() so you can drop the temp table after DataFrame.cache_result(). Table is also a context manager so you can use the with statement to drop the cache temp table after use.
  • Added Session.use_secondary_roles().
  • Added functions first_value() and last_value(). (contributed by @chasleslr)
  • Added on as an alias for using_columns and how as an alias for join_type in DataFrame.join().
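Two of the additions above, the join aliases with suffixes and the Table context manager, might be used as sketched here. The helper names, the key argument and the suffix values "_l" and "_r" are hypothetical; both functions assume they receive existing Snowpark DataFrames.

```python
def join_with_suffixes(left, right, key):
    """Left-join two DataFrames, renaming overlapping non-key columns."""
    # `on` and `how` are the new aliases for using_columns and join_type.
    return left.join(right, on=key, how="left", lsuffix="_l", rsuffix="_r")


def count_with_cache(df):
    """Count rows through a cached temp table that is dropped afterwards."""
    # cache_result() returns a Table; leaving the `with` block drops the
    # cache temp table, per the release note above.
    with df.cache_result() as cached:
        return cached.count()
```

The with form avoids a separate Table.drop_table() call when the cached result is only needed briefly.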

Bug Fixes

  • Fixed a bug in Session.create_dataframe() that raised an error when schema names had special characters.
  • Fixed a bug in which options set in Session.read.option() were not passed to DataFrame.copy_into_table() as default values.
  • Fixed a bug in which DataFrame.copy_into_table() raises an error when a copy option has single quotes in the value.

v0.11.0

29 Sep 16:55
7a8f511

Pre-release

0.11.0 (2022-09-28)

Behavior Changes:

  • Session.add_packages() now raises ValueError when the version of a package cannot be found in Snowflake Anaconda channel. Previously, Session.add_packages() succeeded, and a SnowparkSQLException exception was raised later in the UDF/SP registration step.
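Code that previously caught the failure at registration time may want to handle it earlier. A minimal sketch, assuming an existing Snowpark session; the helper name add_packages_or_report is hypothetical.

```python
def add_packages_or_report(session, *packages):
    """Try to add packages; return an error message instead of raising."""
    try:
        session.add_packages(*packages)
    except ValueError as exc:
        # Since 0.11.0 an unavailable package version fails here,
        # not later during UDF or stored procedure registration.
        return str(exc)
    return None
```

A caller can then decide whether to fall back to another version or abort before any registration is attempted.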

New Features:

  • Added method FileOperation.get_stream() to support downloading stage files as stream.
  • Added support in functions.ntile() to accept int argument.
  • Added the following aliases:
    • functions.call_function() for functions.call_builtin().
    • functions.function() for functions.builtin().
    • DataFrame.order_by() for DataFrame.sort()
    • DataFrame.orderBy() for DataFrame.sort()
  • Improved DataFrame.cache_result() to return a more accurate Table class instead of a DataFrame class.
  • Added support to allow session as the first argument when calling StoredProcedure.

Improvements:

  • Improved nested query generation by flattening queries when applicable.
    • This improvement can be enabled by setting Session.sql_simplifier_enabled = True.
    • DataFrame.select(), DataFrame.with_column(), DataFrame.drop() and other select-related APIs have more flattened SQLs.
    • DataFrame.union(), DataFrame.union_all(), DataFrame.except_(), DataFrame.intersect(), DataFrame.union_by_name() have flattened SQLs generated when multiple set operators are chained.
  • Improved type annotations for async job APIs.

Bug Fixes:

  • Fixed a bug in which Table.update(), Table.delete(), Table.merge() try to reference a temp table that does not exist.

v0.10.0

16 Sep 19:30

Pre-release

0.10.0 (2022-09-16)

New Features:

  • Added experimental APIs for evaluating Snowpark dataframes with asynchronous queries:
    • Added keyword argument block to the following action APIs on Snowpark dataframes (which execute queries) to allow asynchronous evaluations:
      • DataFrame.collect(), DataFrame.to_local_iterator(), DataFrame.to_pandas(), DataFrame.to_pandas_batches(), DataFrame.count(), DataFrame.first().
      • DataFrameWriter.save_as_table(), DataFrameWriter.copy_into_location().
      • Table.delete(), Table.update(), Table.merge().
    • Added method DataFrame.collect_nowait() to allow asynchronous evaluations.
    • Added class AsyncJob to retrieve results from asynchronously executed queries and check their status.
  • Added support for table_type in Session.write_pandas(). You can now choose from these table_type options: "temporary", "temp", and "transient".
  • Added support for using Python structured data (list, tuple and dict) as literal values in Snowpark.
  • Added keyword argument execute_as to functions.sproc() and session.sproc.register() to allow registering a stored procedure as a caller or owner.
  • Added support for specifying a pre-configured file format when reading files from a stage in Snowflake.
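The experimental asynchronous evaluation described above might be used as follows. This is a sketch only, assuming an existing Snowpark DataFrame; the helper name collect_async is hypothetical.

```python
def collect_async(df):
    """Start a query without blocking, then fetch its rows."""
    # block=False makes collect() return an AsyncJob instead of the rows.
    job = df.collect(block=False)
    # Other client-side work could run here while the query executes.
    # result() waits for the query to finish and returns its results.
    return job.result()
```

DataFrame.collect_nowait() offers the same non-blocking start without the keyword argument.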

Improvements:

  • Added support for displaying details of a Snowpark session.

Bug Fixes:

  • Fixed a bug in which DataFrame.copy_into_table() and DataFrameWriter.save_as_table() mistakenly created a new table if the table name is fully qualified, and the table already exists.

Deprecations:

  • Deprecated keyword argument create_temp_table in Session.write_pandas().
  • Deprecated invoking UDFs using arguments wrapped in a Python list or tuple. You can use variable-length arguments without a list or tuple.

Dependency updates

  • Updated snowflake-connector-python to 2.7.12.

v0.9.0

31 Aug 00:52
d45eb5c

Pre-release

0.9.0 (2022-08-30)

New Features:

  • Added support for displaying source code as comments in the generated scripts when registering UDFs.
    This feature is turned on by default. To turn it off, pass the new keyword argument source_code_display as False when calling register() or @udf().
  • Added support for calling table functions from DataFrame.select(), DataFrame.with_column() and DataFrame.with_columns() which now take parameters of type table_function.TableFunctionCall for columns.
  • Added keyword argument overwrite to session.write_pandas() to allow overwriting contents of a Snowflake table with that of a Pandas DataFrame.
  • Added keyword argument column_order to df.write.save_as_table() to specify the matching rules when inserting data into table in append mode.
  • Added method FileOperation.put_stream() to upload local files to a stage via file stream.
  • Added methods TableFunctionCall.alias() and TableFunctionCall.as_() to allow aliasing the names of columns that come from the output of table function joins.
  • Added function get_active_session() in module snowflake.snowpark.context to get the current active Snowpark session.

Bug Fixes:

  • Fixed a bug in which batch insert raised an error when statement_params was not passed to the function.
  • Fixed a bug in which column names were not quoted when session.create_dataframe() was called with dicts and a given schema.
  • Fixed a bug in which table creation was not skipped when the table already existed and df.write.save_as_table() was called in append mode.
  • Fixed a bug in which third-party packages with underscores could not be added when registering UDFs.

Improvements:

  • Improved functions.uniform() to infer the types of the inputs max_ and min_ and cast the limits to IntegerType or FloatType correspondingly.