Releases: pathwaycom/pathway
v0.7.9
Changed
- `pw.io.http.rest_connector` now also accepts port as a string for backwards compatibility.
v0.7.8
Added
- Support for comparisons of tuples has been added.
- Standalone versions of methods such as `pw.groupby`, `pw.join`, `pw.join_inner`, `pw.join_left`, `pw.join_right`, and `pw.join_outer` are now available.
- The `abs` function from Python can now be used on Pathway expressions.
- The `asof_join` method now has configurable temporal behavior. The `behavior` parameter can be used to pass the configuration.
- The state of the `deduplicate` operator can now be persisted.
Changed
- `interval_join` can now work with intervals of zero length.
- The `pw.io.http.rest_connector` can now open multiple endpoints on the same port using a new `pw.io.http.PathwayWebserver` class.
- The `pw.xpacks.connectors.sharepoint.read` and `pw.io.gdrive.read` methods now support a size limit for a single object. If set, files exceeding the limit are excluded and not read.
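To illustrate what a zero-length interval means for `interval_join`, here is a minimal pure-Python sketch. It is not Pathway code. It assumes the usual interval-join condition, where a left row at time `t` matches right rows whose time lies in `[t + lower, t + upper]` with both bounds inclusive. The function name `interval_join_pairs` and the sample data are hypothetical.

```python
def interval_join_pairs(left_times, right_times, lower, upper):
    """Return (left, right) time pairs with left + lower <= right <= left + upper."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return [
        (lt, rt)
        for lt in left_times
        for rt in right_times
        if lt + lower <= rt <= lt + upper
    ]


# With lower == upper == 0 the interval has zero length,
# so only rows with exactly equal times are paired.
pairs = interval_join_pairs([1, 3, 5], [3, 4, 5], lower=0, upper=0)
```

Under these assumptions, a zero-length interval degenerates into an equality join on the time column.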
v0.7.7
Added
- pathway.xpacks.llm.splitter.TokenCountSplitter.
v0.7.6
New Features
Conversion Methods in pw.Json
- Introducing new methods for strict conversion of `pw.Json` to desired types within a UDF body: `as_int()`, `as_float()`, `as_str()`, `as_bool()`, `as_list()`, `as_dict()`.
DateTime Functionality
- Added `table.col.dt.utc_from_timestamp` method: Creates `DateTimeUtc` from timestamps represented as `int`s or `float`s.
- Enhanced the `table.col.dt.timestamp` method with a new `unit` argument to specify the unit of the returned timestamp.
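The role of a `unit` argument in timestamp conversions can be sketched with the Python standard library. This is only an analogy, not Pathway's implementation: the helper names mirror the methods above for readability, and the unit strings `"s"`, `"ms"` and `"us"` are assumptions. Nanoseconds are left out because `datetime.timedelta` has microsecond resolution.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumed unit names; how many of each unit fit in one second.
UNITS_PER_SECOND = {"s": 1, "ms": 1_000, "us": 1_000_000}


def utc_from_timestamp(value, unit="s"):
    """Build a UTC datetime from an int or float timestamp in the given unit."""
    return EPOCH + timedelta(seconds=value / UNITS_PER_SECOND[unit])


def timestamp(moment, unit="s"):
    """Return the time elapsed since the epoch, expressed in the given unit."""
    return (moment - EPOCH) / timedelta(seconds=1) * UNITS_PER_SECOND[unit]


moment = utc_from_timestamp(1_500, unit="ms")  # 1.5 seconds after the epoch
as_millis = timestamp(moment, unit="ms")
as_seconds = timestamp(moment, unit="s")
```

The same instant yields different numbers depending on the unit, which is why an explicit `unit` removes ambiguity.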
Experimental Features
- Introduced an experimental xpack with a Microsoft SharePoint input connector.
Enhancements
Improved JSON Handling
- Index operator (`[]`) can now be directly applied to `pw.Json` within UDFs to access elements of JSON objects, arrays, and strings.
Expanded Timestamp Functionality
- Enhanced the `table.col.dt.from_timestamp` method to create `DateTimeNaive` from timestamps represented as `int`s or `float`s.
- Deprecated not specifying the `unit` argument of the `table.col.dt.timestamp` method.
KNNIndex Enhancements
- `KNNIndex` now supports returning computed distances.
- Added support for cosine similarity in `KNNIndex`.
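As a rough illustration of nearest-neighbour search that uses cosine similarity and returns distances, the following pure-Python sketch may help. It does not use or reproduce `KNNIndex`. It assumes the common convention that cosine distance equals one minus cosine similarity, and the names `cosine_distance` and `nearest_items` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_items(query, items, k):
    """Return the k closest (key, distance) pairs, smallest distance first."""
    scored = [(cosine_distance(query, vec), key) for key, vec in items.items()]
    scored.sort()
    return [(key, dist) for dist, key in scored[:k]]


items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = nearest_items([1.0, 0.0], items, k=2)
```

Returning the distance next to each key lets a caller apply its own cut-off instead of trusting the top `k` blindly.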
Deprecated Features
- The `offset` argument of `pw.stdlib.temporal.sliding` and `pw.stdlib.temporal.tumbling` is deprecated. Use `origin` instead, as it represents a point in time, not a duration.
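The idea that `origin` is a point in time anchoring the window grid can be sketched in plain Python. This is an illustration under stated assumptions, not Pathway's implementation: windows are taken to be half-open `[start, end)`, times are integers, and the helper `tumbling_window` is hypothetical. How Pathway treats events earlier than `origin` is not modelled here.

```python
def tumbling_window(t, duration, origin=0):
    """Return the (start, end) of the window containing t on a grid anchored at origin."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    index = (t - origin) // duration  # how many whole windows lie between origin and t
    start = origin + index * duration
    return start, start + duration


default_grid = tumbling_window(17, duration=10)            # grid: 0, 10, 20, ...
shifted_grid = tumbling_window(17, duration=10, origin=5)  # grid: 5, 15, 25, ...
```

Moving `origin` shifts every window boundary by the same amount while the window length stays fixed.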
Bug Fixes
DateTime Fixes
- Sliding window now works correctly with UTC Datetimes.
asof_join Improvements
- Temporal column in `asof_join` no longer has to be named `t`.
- `asof_join` includes rows with equal times for all values of the `direction` parameter.
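The effect of including equal times for every direction can be shown with a small pure-Python matcher. It is a sketch only: the direction names `"backward"`, `"forward"` and `"nearest"` are assumed labels, the helper `asof_match` is hypothetical, and the tie-breaking rule for `"nearest"` is an arbitrary choice made for this example.

```python
from bisect import bisect_left, bisect_right


def asof_match(left_time, right_times, direction="backward"):
    """Pick the matching time from sorted right_times, or None if there is none."""
    if direction == "backward":
        i = bisect_right(right_times, left_time)  # first index with time > left_time
        return right_times[i - 1] if i > 0 else None
    if direction == "forward":
        i = bisect_left(right_times, left_time)  # first index with time >= left_time
        return right_times[i] if i < len(right_times) else None
    if direction == "nearest":
        back = asof_match(left_time, right_times, "backward")
        fwd = asof_match(left_time, right_times, "forward")
        candidates = [c for c in (back, fwd) if c is not None]
        # On a tie, min() keeps the earlier (backward) candidate.
        return min(candidates, key=lambda c: abs(c - left_time)) if candidates else None
    raise ValueError(f"unknown direction: {direction}")


right = [1, 4, 7]
equal_matches = [asof_match(4, right, d) for d in ("backward", "forward", "nearest")]
```

A left row at time 4 matches the right row at time 4 in all three directions, which is the behaviour the fix above describes.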
Fixed Issues
- Fixed an issue with `pw.io.gdrive.read`: shared folders are now supported.
v0.7.5
Added
- Added Table.split() method for splitting table based on an expression into two tables.
- Columns with datatype duration can now be multiplied and divided by floats.
- Columns with datatype duration now support both true and floor division (`/` and `//`) by integers.
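The `Table.split()` entry above describes splitting a table into two based on an expression. The underlying idea is a partition, sketched here on plain Python dictionaries rather than Pathway tables. The helper `split_rows` and the sample rows are hypothetical, and returning the matching rows first is an assumption of this sketch.

```python
def split_rows(rows, predicate):
    """Partition rows into (matching, non_matching) according to predicate."""
    positive, negative = [], []
    for row in rows:
        (positive if predicate(row) else negative).append(row)
    return positive, negative


rows = [
    {"name": "a", "value": 3},
    {"name": "b", "value": 12},
    {"name": "c", "value": 7},
]
large, small = split_rows(rows, lambda row: row["value"] >= 7)
```

Every input row lands in exactly one of the two outputs, so the results are disjoint and together cover the input.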
Changed
- Pathway is better at typing if_else expressions when optional types are involved.
- `table.flatten()` operator now supports Json array.
- Buffers (used to delay outputs, configured via delay in `common_behavior`) now flush the data when the computation is finished. The effect of this change can be seen when run in bounded (batch / multi-revision) mode.
- `pw.io.subscribe()` takes an additional argument `on_time_end`: the callback function to be called on each closed time of computation.
- `pw.io.subscribe()` is now a single-worker operator, guaranteeing that `on_end` is triggered at most once.
- `KNNIndex` now supports metadata filtering. Each query can specify its own filter in the JMESPath format.
Fixed
- Resolved an optimization bug causing `pw.iterate` to malfunction when handling columns effectively pointing to the same data.
v0.7.4
Fixed
- Fixed issues with standalone panel+Bokeh dashboards to ensure optimal functionality and performance.
v0.7.3
Added
- A method `weekday` has been added to the `dt` namespace, that can be called on column expressions containing datetime data. This method returns an integer that represents the day of the week.
- EXPERIMENTAL: Methods `show` and `plot` on Tables, providing visualizations of data using HoloViz Panel.
- Added support for `instance` parameter to `groupby`, `join`, `windowby` and temporal join methods.
- `pw.PersistenceMode.UDF_CACHING` persistence mode enabling automatic caching of `AsyncTransformer` invocations.
Changed
- Methods `round` and `floor` on columns with datetimes now accept the duration argument as a string.
- `pw.debug.compute_and_print` and `pw.debug.compute_and_print_update_stream` have a new argument `n_rows` that limits the number of rows printed.
- `pw.debug.table_to_pandas` has a new argument `include_id` (by default `True`). If set to `False`, creates a new index for the Pandas DataFrame, rather than using the keys of the Pathway Table.
- The `shard` argument of the `windowby` function is now deprecated; `instance` should be used instead.
- Special column name `_pw_shard` is now deprecated, and `_pw_instance` should be used.
- `pw.ReplayMode` can now be accessed as `pw.PersistenceMode`, while the `SPEEDRUN` and `REALTIME` variants are now accessible as `SPEEDRUN_REPLAY` and `REALTIME_REPLAY`.
- EXPERIMENTAL: `pw.io.gdrive.read` has a new argument `with_metadata` (by default `False`). If set to `True`, adds a `_metadata` column containing file metadata to the resulting table.
- Methods `get_nearest_items` and `get_nearest_items_asof_now` of `KNNIndex` allow specifying `k` (number of returned elements) separately in each query.
v0.7.2
Added
- Added the ability to create custom reducers using the `pw.reducers.udf_reducer` decorator. Use `pw.BaseCustomAccumulator` as a base class for creating accumulators. Decorating an accumulator returns a reducer following the custom logic.
- A function `pw.debug.compute_and_print_update_stream` that computes and prints the update stream of the table.
- SQLite input connector (`pw.io.sqlite`).
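The accumulator pattern behind custom reducers can be sketched without Pathway. In the stand-in below, the method names `from_row`, `update` and `compute_result` are assumptions about what such an accumulator looks like, not a confirmed copy of the `pw.BaseCustomAccumulator` interface. The driver `reduce_rows` is a hypothetical replacement for what the decorator would wire up.

```python
class MeanAccumulator:
    """Accumulates a count and a running total to produce a mean."""

    def __init__(self, count, total):
        self.count = count
        self.total = total

    @classmethod
    def from_row(cls, row):
        # Build the accumulator for a single input row.
        [value] = row
        return cls(1, value)

    def update(self, other):
        # Merge another accumulator into this one.
        self.count += other.count
        self.total += other.total

    def compute_result(self):
        return self.total / self.count


def reduce_rows(accumulator_cls, rows):
    """Fold a non-empty list of rows into one result using the accumulator."""
    accumulators = [accumulator_cls.from_row(row) for row in rows]
    merged = accumulators[0]
    for other in accumulators[1:]:
        merged.update(other)
    return merged.compute_result()


mean = reduce_rows(MeanAccumulator, [[2.0], [4.0], [9.0]])
```

Keeping the state mergeable through `update` is what allows partial results to be combined in any grouping.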
Changed
- `pw.debug.parse_to_table` is now deprecated, `pw.debug.table_from_markdown` should be used instead.
- `pw.schema_from_csv` now has `quote` and `double_quote_escapes` arguments.
Fixed
- Schema returned from `pw.schema_from_csv` will have quotes removed from column names, so it will now work properly with `pw.io.csv.read`.
v0.7.1
Added
- Experimental Google Drive input connector.
- Stateful deduplication function (`pw.stateful.deduplicate`) allowing alerting on significant changes.
- The ability to split data into batches in `pw.debug.table_from_markdown` and `pw.debug.table_from_pandas`.
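To make "alerting on significant changes" concrete, here is a minimal pure-Python sketch of stateful deduplication. It is not the `pw.stateful.deduplicate` implementation. It assumes the state is the last accepted value and that a user-supplied function decides whether a new value differs enough. The names `deduplicate`, `acceptor` and `significant_change`, and the threshold of 5, are hypothetical.

```python
def deduplicate(values, acceptor):
    """Keep a value only if it is the first one or acceptor(new, last_accepted) is true."""
    accepted = []
    for value in values:
        if not accepted or acceptor(value, accepted[-1]):
            accepted.append(value)
    return accepted


def significant_change(new, old):
    # Hypothetical rule: report only moves of at least 5 units.
    return abs(new - old) >= 5


alerts = deduplicate([10, 11, 13, 16, 17, 30, 29], significant_change)
```

Each value is compared with the last accepted one, not the previous input, so slow drift still triggers an alert once it adds up.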
v0.7.0
Added
- class `Behavior`, a superclass of all behavior classes.
- class `ExactlyOnceBehavior` indicating we want to create a `CommonBehavior` that results in each window producing exactly one output (shifted in time by an optional `shift` parameter).
- function `exactly_once_behavior` creating an instance of `ExactlyOnceBehavior`.
Changed
- BREAKING: `WindowBehavior` is now called `CommonBehavior`, as it can be also used with interval joins.
- BREAKING: `window_behavior` is now called `common_behavior`, as it can be also used with interval joins.
- Deprecating parameter `keep_queries` in `pw.io.http.rest_connector`. Now `delete_completed_queries` with an opposite meaning should be used instead. The default is still `delete_completed_queries=True` (equivalent to `keep_queries=False`) but it will soon be required to be set explicitly.
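Because `delete_completed_queries` has the opposite meaning to `keep_queries`, migrating call sites means negating the old value. The sketch below shows one way such a compatibility shim could resolve the two parameters. It is an assumption-based illustration, not the code inside `rest_connector`, and the helper `resolve_delete_completed_queries` is hypothetical.

```python
import warnings


def resolve_delete_completed_queries(keep_queries=None, delete_completed_queries=None):
    """Return the effective delete_completed_queries value from old and new arguments."""
    if keep_queries is not None and delete_completed_queries is not None:
        raise ValueError("pass only one of keep_queries and delete_completed_queries")
    if keep_queries is not None:
        warnings.warn(
            "keep_queries is deprecated; use delete_completed_queries instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return not keep_queries  # the new flag has the opposite meaning
    if delete_completed_queries is None:
        return True  # default stated in the release note
    return delete_completed_queries


explicit = resolve_delete_completed_queries(delete_completed_queries=False)
default = resolve_delete_completed_queries()
```

Setting `delete_completed_queries` explicitly at each call site avoids relying on a default that the note says will soon be required.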