Releases · pola-rs/polars

20 Mar 11:16

github-actions

py-1.39.3

1cd236c

Python Polars 1.39.3 Latest

Latest

No changes

Thank you to all our contributors for making this release possible!
@ritchie46

Contributors

ritchie46

Assets 2

17 Mar 17:19

github-actions

py-1.39.2

4f31f72

Python Polars 1.39.2

No changes

Thank you to all our contributors for making this release possible!
@nameexhaustion and @ritchie46

Contributors

ritchie46 and nameexhaustion

Assets 2

17 Mar 09:40

github-actions

py-1.39.1

39d1d4c

Python Polars 1.39.1

🐞 Bug fixes

Handle empty rolling windows in streaming engine (#26903)

📖 Documentation

Add documentation for on_columns for LazyFrame pivot (#26859)

🛠️ Other improvements

Bump build deps used in ARM64 Windows release pipeline (#26892)

Thank you to all our contributors for making this release possible!
@Kevin-Patyk, @RenzoMXD, @TNieuwdorp, @dsprenkels, @gautamvarmadatla, @nameexhaustion, @nicholaslegrand102 and @ritchie46

Contributors

dsprenkels, ritchie46, and 6 other contributors

Assets 2

12 Mar 14:25

github-actions

py-1.39.0

2bce04a

Python Polars 1.39.0

🚀 Performance improvements

Lower arg_{min,max} to streaming engine (#26845)
Additional IR slice pushdown after filter pushdown (#26815)
Streaming first/last on Enum through physical (#26783)
Fast filter for scalar predicates (#26745)
Allow SimpleProjection in streaming engine to rename (#26709)
Streaming cloud download for scan_csv (#26637)
Drop columns only needed for predicates after the predicate is applied (#26703)
Run projection pushdown after predicate pushdown (#26688)
Comparison literal downcasting (#26663)
Add dynamic predicates for TopK (#26495)
Increase minimum default parquet row group prefetch to 8 (#26632)
Partial predicate conversion to PyArrow (#26567)
Streaming cloud download for scan_ndjson / scan_lines (#26563)
Grab GIL fewer times during Object join materialization (#26587)
Improve CSV and NDJSON cloud sink performance (#26545)
Tune cloud writer performance (#26518)
Allow parallel InMemorySinks in streaming engine (#26501)
Add streaming AsOf join node (#26398)
Don't always rechunk on gather of nested types (#26478)

✨ Enhancements

Support Expr for holidays in business day calculations (#26193)
Parameter for pivot to always include value column name (#26730)
Raise error in .collect_schema() when arr.get() is out-of-bounds (#26866)
Extend Expr.reinterpret to all numeric types of the same size (#26401)
Add missing_columns parameter to scan_csv (#26787)
Clear no-op scan projections (#26858)
Support nested datatypes for {min,max}_by (#26849)
Support SQL ARRAY init from typed literals (#26622)
Accept table identifier string in scan_iceberg() (#26826)
Add a convenience make fresh command to the Makefile (#26809)
Expose "use_zip64" Workbook option for write_excel (#26699)
Add unstable LazyFrame.sink_iceberg (#26799)
Add maintain order argument on implode (#26782)
Speed up casting primitive to bool by at least 2x (#26823)
Support ASCII format table input to pl.from_repr (#26806)
Enable rowgroup skipping for float columns (#26805)
Add expression context to errors (#26716)
Add Decimal support for product reduction (#26725)
Support all Iceberg V2 arrow types in sink_parquet arrow_schema parameter (#26669)
Re-work behavior of arrow_schema parameter on sink_parquet (#26621)
Add contains_dtype() method for Schema (#26661)
Implement truncate as a "to_zero" rounding mode (#26677)
More generic streaming GroupBy lowering (#26696)
Create an Alignment TypeAlias (#26668)
Add basic MemoryManager to track buffered dataframes for out-of-core support later (#26443)
Add truncate Expression for numeric values (#26666)
Better error messages for hex literal conversion issues in the SQL interface (#26657)
Add SQL support for LPAD and RPAD string functions (#26631)
Support SQL "FROM-first" SELECT query syntax (#26598)
Improve base_type typing (#26602)
Bump Chrono to 0.4.24, enabling stricter parsing of %.3f/%.6f/%.9f specifiers (#26075)
Expose unstable assert_schema_equal in py-polars (#24869)
Allow parsing of compact ISO 8601 strings (#24629)
Add optional "label" param to DataFrame corr (#26588)
Streaming cloud download for scan_ndjson / scan_lines (#26563)
Configuration to cast integers to floats in cast_options for scan_parquet (#26492)
Add escaping to quotes and newlines when reading JSON object into string (#26578)
Standardise on RFC-5545 when doing datetime arithmetic on timezone-aware datetimes (#26425)
Support sas_token in Azure credential provider (#26565)
Relax SQL requirement for derived tables and subqueries to have aliases (#26543)
Add polars-config and pl.Config.reload_env_vars() (#26524)
Record path for object store error raised from sinks (#26541)
Use CRC64NVME for checksum in aws sinks (#26522)
Add get() for binary Series (#26514)
Add streaming AsOf join node (#26398)
Add primitive filter -> agg lowering in streaming GroupBy (#26459)
Support for the SQL FETCH clause (#26449)

🐞 Bug fixes

Prevent Boolean arithmetic with integer literals producing Unknown type in streaming engine (#26878)
Fix sink to partitioned S3 from Windows corrupted slashes (#26889)
Remove outdated warning about List columns in unique() (#26295) (#26890)
Restore pyarrow predicate conversion for is_in (#26811)
Release GIL before df.to_ndarray() to avoid deadlock (#26832)
Fix panic on CSV count_rows with FORCE_ASYNC (#26883)
Add scalar comparisons for UInt128 series (#26886)
Fix shape error not raised for 0 width inputs with non-0 height for streaming horizontal concat (#26877)
Fix streaming zip-broadcast node did not raise shape mismatch on empty recv from ready port (#26871)
Fix incorrect output list.eval with scalar expr, fix panic on list.agg with nulls (#26868)
Allow list argument in group_by().map_groups() (#26707)
Support for ADBC drivers instantiated with dbc in DataFrame.write_database (#26157)
Incorrect arg_sort with descending+limit (#26839)
Raise error in .collect_schema() when arr.get() is out-of-bounds (#26866)
Return ComputeError instead of panicking in map_groups UDF (#26665)
Issue PerformanceWarning in LazyFrame.__contains__ (#26734)
Correct type hint for map_columns function parameter (#26487)
Apply thousands_separator to count/null_count in describe() for non-numeric columns (#26486)
Ensure proper handling of timedelta when multiplying with a Series (#26830)
Correct type hint for function parameter in DataFrame.map_columns (#26372)
Segfault in JoinExec on deep plan (#26796)
Fix unary expressions on literal in over context (#26827)
Fix {min,max}_by in streaming engine for Boolean full {min,max} value column (#26848)
Fix debug panic on clip with nan bound (#26854)
Support grouped {arg_,}_{min,max} for Categoricals (#26856)
Throw an error if a string is passed to LazyFrame.pivot on_columns (#26852)
Preserve input float precision in rolling_cov() and rolling_corr() with mixed input types (#26820)
Preserve row count when converting zero-column DataFrame via arrow PyCapsule interface (#26835)
Prevent infinite recursion in streaming group_by fallback (#26801)
Use RowEncodingContext::Struct when determining D::Struct encoded item len (#26817)
Incorrectly applied CSE on different map_batches functions (#26822)
Fix duplicated query execution on todo panic when combining collect(engine='streaming') with POLARS_AUTO_NEW_STREAMING (#26792)
Prevent predicate pushdown across Sort with baked-in slice (#26804)
Restore compatibility with pd.Timedelta (#26785)
Fix panic on lazy sink_parquet created in pipe_with_schema (#26784)
Support {column_name} and {index} placeholders in pl.format string (#26771)
Do not use merge-join if nulls_last is unknown (#26778)
Normalize float zeros in Parquet column statistics (#26776)
Fix out-of-bounds for positive offset in windowed rolling (#26724)
Raise error when .get() is out-of-bounds in group by context (#26752)
Boolean bitwise_xor aggregation inverted when column contains nulls (#26749)
Parameter nulls_last was ignored in over (#26718)
Allow missing time in inexact strptime (#26714)
Respect nulls_last in sort_by within group_by().agg() slow path (#26681)
Return NaN when using corr() with a literal and expr (#26697)
Allow strict horizontal concat with empty df (#26345)
Fix PoisonError panic caused by reentrant usage of file cache (#26627)
Return null for int values exceeding 128-bit range with strict=False (#26674)
Incorrect boolean min/max with nulls (#26671)
Slice-slice pushdown for n_rows (#26673)
Resolve panic in Enum struct slicing (#26643)
Fix CSPE for group_by.map_groups (#26640)
Remove non-existent parameter from SQLContext typing overloads (#26658)
Address pl.from_epoch losing fractional seconds (#26419)
Fix to_pandas() on empty enum Series did not preserve enum dictionary (#26610)
Rounding behaviour for f32 values with "HalfAwayFromZero" mode (#26624)
Updated Sum Type Hint (#26629)
Don't allow namespace registration to override standard methods or properties (#26450)
Correct arg_(min|max) for scalar columns (#26609)
Use monkeypatch.chdir in test_sink_path_slicing_utf8_boundaries_26324 (#26616)
Respect SQL semantics for cumulative functions mapped via OVER clause (#26570)
Fix incorrect multiplexer output ordering on source token stop request (#26561)
Fix PyIceberg filter on boolean column (#26550)
Set dictionary_page_offset when dictionary encoding is used and point data_page_offset to the first data page (#26542)
Move query parameters to request body when retrieving Unity Catalog temporary credentials (#26539)
Ensure read_csv_batched() prints deprecation warning (#26530)
Implement PhysicalExpr for MinBy/MaxBy nodes (#26506)
Refactor row-encoding logic in IR join lowering into separate function (#26512)
Correctly check for path extensions (#26513)
Change AsOf join to be based on TotalOrd (#26497)
Correctly raise error on failing nested strict casts (#26499)
Prevent invalid type casts in replace_strict() (#26453)
Return null when dividing literals by 0 (#26343)
Fix type-hint for Series.quantile (#26422)

📖 Documentation

Mention ComputeContexts create ephemeral environments by default and hint at re-use (#26692)
Remove confusing join validation note (#26795)
Fix formatting in categorical documentation (#26746)
Fix broken AI policy link (#26728)
Create Polars Cloud Glossary (#26690)
Additional SQL documentation (#26662)
Include invalidate_caches in bisect instructions (#26641)
Add git bisect guide to contributing docs (#26634)
Fix Polars Cloud examples (formatting & type hints) (#26625)
Updated Airflow orchestration documentation (#26585)
Improve SQL docs for...

Contributors

orlp, dsprenkels, and 43 other contributors

Assets 2

09 Feb 09:16

github-actions

rs-0.53.0

16c0d99

Rust Polars 0.53.0

🏆 Highlights

Add Extension types (#25322)

🚀 Performance improvements

Don't always rechunk on gather of nested types (#26478)
Enable zero-copy object_store put upload for IPC sink (#26288)
Resolve file schema's and metadata concurrently (#26325)
Run elementwise CSEE for the streaming engine (#26278)
Disable morsel splitting for fast-count on streaming engine (#26245)
Implement streaming decompression for scan_ndjson and scan_lines (#26200)
Improve string slicing performance (#26206)
Refactor scan_delta to use python dataset interface (#26190)
Add dedicated kernel for group-by arg_max/arg_min (#26093)
Add streaming merge-join (#25964)
Generalize Bitmap::new_zeroed opt for Buffer::zeroed (#26142)
Reduce fs stat calls in path expansion (#26173)
Lower streaming group_by n_unique to unique().len() (#26109)
Speed up SQL interface "UNION" clauses (#26039)
Speed up SQL interface "ORDER BY" clauses (#26037)
Add fast kernel for is_nan and use it for numpy NaN->null conversion (#26034)
Optimize ArrayFromIter implementations for ObjectArray (#25712)
New streaming NDJSON sink pipeline (#25948)
New streaming CSV sink pipeline (#25900)
Dispatch partitioned usage of sink_* functions to new-streaming by default (#25910)
Replace ryu with faster zmij (#25885)
Reduce memory usage for .item() count in grouped first/last (#25787)
Skip schema inference if schema provided for scan_csv/ndjson (#25757)
Add width-aware chunking to prevent degradation with wide data (#25764)
Use new sink pipeline for write/sink_ipc (#25746)
Reduce memory usage when scanning multiple parquet files in streaming (#25747)
Don't call cluster_with_columns optimization if not needed (#25724)
Tune partitioned sink_parquet cloud performance (#25687)
New single file IO sink pipeline enabled for sink_parquet (#25670)
New partitioned IO sink pipeline enabled for sink_parquet (#25629)
Correct overly eager local predicate insertion for unpivot (#25644)
Reduce HuggingFace API calls (#25521)
Use strong hash instead of traversal for CSPE equality (#25537)
Fix panic in is_between support in streaming Parquet predicate push down (#25476)
Faster kernels for rle_lengths (#25448)
Allow detecting plan sortedness in more cases (#25408)
Enable predicate expressions on unsigned integers (#25416)
Mark output of more non-order-maintaining ops as unordered (#25419)
Fast find start window in group_by_dynamic with large offset (#25376)
Add streaming native LazyFrame.group_by_dynamic (#25342)
Add streaming sorted Group-By (#25013)
Add parquet prefiltering for string regexes (#25381)
Use fast path for agg_min/agg_max when nulls present (#25374)
Fuse positive slice into streaming LazyFrame.rolling (#25338)
Mark Expr.reshape((-1,)) as row separable (#25326)
Use bitmap instead of Vec<bool> in first/last w. skip_nulls (#25318)
Return references from aexpr_to_leaf_names_iter (#25319)

✨ Enhancements

Add primitive filter -> agg lowering in streaming GroupBy (#26459)
Support for the SQL FETCH clause (#26449)
Add get() to retrieve a byte from binary data (#26454)
Remove with_context in SQL lowering (#26416)
Avoid OOM for scan_ndjson and scan_lines if input is compressed and negative slice (#26396)
Add JoinBuildSide (#26403)
Support annoymous agg in-mem (#26376)
Add unstable arrow_schema parameter to sink_parquet (#26323)
Improve error message formatting for structs (#26349)
Remove parquet field overwrites (#26236)
Enable zero-copy object_store put upload for IPC sink (#26288)
Improved disambiguation for qualified wildcard columns in SQL projections (#26301)
Expose upload_concurrency through env var (#26263)
Allow quantile to compute multiple quantiles at once (#25516)
Allow empty LazyFrame in LazyFrame.group_by(...).map_groups (#26275)
Use delta file statistics for batch predicate pushdown (#26242)
Add streaming UnorderedUnion (#26240)
Implement compression support for sink_ndjson (#26212)
Add unstable record batch statistics flags to {sink/scan}_ipc (#26254)
Cloud retry/backoff configuration via storage_options (#26204)
Use same sort order for expanded paths across local / cloud / directory / glob (#26191)
Expose physical plan NodeStyle (#26184)
Add streaming merge-join (#25964)
Serialize optimization flags for cloud plan (#26168)
Add compression support to write_csv and sink_csv (#26111)
Add scan_lines (#26112)
Support regex in str.split (#26060)
Add unstable IPC Statistics read/write to scan_ipc/sink_ipc (#26079)
Add nulls support for all rolling_by operations (#26081)
ArrowStreamExportable and sink_delta (#25994)
Release musl builds (#25894)
Implement streaming decompression for CSV COUNT(*) fast path (#25988)
Add nulls support for rolling_mean_by (#25917)
Add lazy collect_all (#25991)
Add streaming decompression for NDJSON schema inference (#25992)
Improved handling of unqualified SQL JOIN columns that are ambiguous (#25761)
Expose record batch size in {sink,write}_ipc (#25958)
Add null_on_oob parameter to expr.get (#25957)
Suggest correct timezone if timezone validation fails (#25937)
Support streaming IPC scan from S3 object store (#25868)
Implement streaming CSV schema inference (#25911)
Support hashing of meta expressions (#25916)
Improve SQLContext recognition of possible table objects in the Python globals (#25749)
Add pl.Expr.(min|max)_by (#25905)
Improve MemSlice Debug impl (#25913)
Implement or fix json encode/decode for (U)Int128, Categorical, Enum, Decimal (#25896)
Expand scatter to more dtypes (#25874)
Implement streaming CSV decompression (#25842)
Add Series sql method for API consistency (#25792)
Mark Polars as safe for free-threading (#25677)
Support Binary and Decimal in arg_(min|max) (#25839)
Allow Decimal parsing in str.json_decode (#25797)
Add shift support for Object data type (#25769)
Add node status to NodeMetrics (#25760)
Allow scientific notation when parsing Decimals (#25711)
Allow creation of Object literal (#25690)
Don't collect schema in SQL union processing (#25675)
Add bin.slice(), bin.head(), and bin.tail() methods (#25647)
Add SQL support for the QUALIFY clause (#25652)
New partitioned IO sink pipeline enabled for sink_parquet (#25629)
Add SQL syntax support for CROSS JOIN UNNEST(col) (#25623)
Add separate env var to log tracked metrics (#25586)
Expose fields for generating physical plan visualization data (#25562)
Allow pl.Object in pivot value (#25533)
Extend SQL UNNEST support to handle multiple array expressions (#25418)
Minor improvement for as_struct repr (#25529)
Temporal quantile in rolling context (#25479)
Add support for Float16 dtype (#25185)
Add strict parameter to pl.concat(how='horizontal') (#25452)
Add leftmost option to str.replace_many / str.find_many / str.extract_many (#25398)
Add quantile for missing temporals (#25464)
Expose and document pl.Categories (#25443)
Support decimals in search_sorted (#25450)
Use reference to Graph pipes when flushing metrics (#25442)
Add SQL support for named WINDOW references (#25400)
Add Extension types (#25322)
Add having to group_by context (#23550)
Allow elementwise Expr.over in aggregation context (#25402)
Add SQL support for ROW_NUMBER, RANK, and DENSE_RANK functions (#25409)
Automatically Parquet dictionary encode floats (#25387)
Add empty_as_null and keep_nulls to {Lazy,Data}Frame.explode (#25369)
Allow hash for all List dtypes (#25372)
Support unique_counts for all datatypes (#25379)
Add maintain_order to Expr.mode (#25377)
Display function of streaming physical plan map node (#25368)
Allow slice on scalar in aggregation context (#25358)
Allow implode and aggregation in aggregation context (#25357)
Add empty_as_null and keep_nulls flags to Expr.explode (#25289)
Add ignore_nulls to first / last (#25105)
Move GraphMetrics into StreamingQuery (#25310)
Allow Expr.unique on List/Array with non-numeric types (#25285)
Allow Expr.rolling in aggregation contexts (#25258)
Support additional forms of SQL CREATE TABLE statements (#25191)
Add LazyFrame.pivot (#25016)
Support column-positional SQL UNION operations (#25183)
Allow arbitrary expressions as the Expr.rolling index_column (#25117)
Allow arbitrary Expressions in "subset" parameter of unique frame method (#25099)
Support arbitrary expressions in SQL JOIN constraints (#25132)

🐞 Bug fixes

Do not overwrite used names in cluster_with_columns pushdown (#26467)
Do not mark output of concat_str on multiple inputs as sorted (#26468)
Fix CSV schema inference content line duplication bug (#26452)
Fix InvalidOperationError using scan_delta with filter (#26448)
Alias giving missing column after streaming GroupBy CSE (#26447)
Ensure by_name selector selects only names (#26437)
Restore compatibility of strings written to parquet with pyarrow filter (#26436)
Update schema in cluster_with_columns optimization (#26430)
Fix negative slice in groups slicing (#26442)
Don't run CPU check on aarch64 musl (#26439)
Remove the POLARS_IDEAL_MORSEL_SIZE monkeypatching in the parametric merge-join test (#26418)
Correct off-by-one in RLE row counting for nullable dictionary-encoded columns (#26411)
Support very large integers in env var limits (#26399)
Fix PlPath panic from incorrect slicing of UTF8 boundaries (#26389)
Fix Float dtype for spearman correlation (#26392)
Fix optimizer panic in right joins with type coercion (#26365)
Don't serialize retry config from ...

Contributors

wtn, orlp, and 65 other contributors

Assets 2

06 Feb 18:13

github-actions

py-1.38.1

50a3bfb

Python Polars 1.38.1

✨ Enhancements

Add get() to retrieve a byte from binary data (#26454)
Remove with_context in SQL lowering (#26416)

🐞 Bug fixes

Do not overwrite used names in cluster_with_columns pushdown (#26467)
Do not mark output of concat_str on multiple inputs as sorted (#26468)
Fix CSV schema inference content line duplication bug (#26452)
Fix InvalidOperationError using scan_delta with filter (#26448)
Alias giving missing column after streaming GroupBy CSE (#26447)
Ensure by_name selector selects only names (#26437)
Restore compatibility of strings written to parquet with pyarrow filter (#26436)
Update schema in cluster_with_columns optimization (#26430)
Fix negative slice in groups slicing (#26442)
Don't run CPU check on aarch64 musl (#26439)
Fixed annotations shadowed by class methods (#26356)
Remove the POLARS_IDEAL_MORSEL_SIZE monkeypatching in the parametric merge-join test (#26418)
Fix selector match patterns for multiline column names (#26320)

📖 Documentation

Add sink_delta to API reference (#26446)

🛠️ Other improvements

Cleanup unused attributes in optimizer (#26464)
Use Expr::Display as catch all for IR - DSL asymmetry (#26471)
Ignore pytz in mypy (#26441)
Remove the POLARS_IDEAL_MORSEL_SIZE monkeypatching in the parametric merge-join test (#26418)
Cleanup the parametric merge-join test (#26413)

Thank you to all our contributors for making this release possible!
@Voultapher, @alexander-beedie, @azimafroozeh, @cmdlineluser, @dependabot[bot], @dsprenkels, @hamdanal, @kdn36, @nameexhaustion, @orlp, @ritchie46 and dependabot[bot]

Contributors

orlp, dsprenkels, and 9 other contributors

Assets 2

04 Feb 12:01

github-actions

py-1.38.0

e1612c2

Python Polars 1.38.0

⚠️ Deprecations

Deprecate retries=n in favor of storage_options={"max_retries": n} (#26155)

🚀 Performance improvements

Enable zero-copy object_store put upload for IPC sink (#26288)
Resolve file schema's and metadata concurrently (#26325)
Run elementwise CSEE for the streaming engine (#26278)
Disable morsel splitting for fast-count on streaming engine (#26245)
Implement streaming decompression for scan_ndjson and scan_lines (#26200)
Improve string slicing performance (#26206)
Refactor scan_delta to use python dataset interface (#26190)
Add dedicated kernel for group-by arg_max/arg_min (#26093)
Add streaming merge-join (#25964)
Generalize Bitmap::new_zeroed opt for Buffer::zeroed (#26142)
Reduce fs stat calls in path expansion (#26173)
Lower streaming group_by n_unique to unique().len() (#26109)

✨ Enhancements

Avoid OOM for scan_ndjson and scan_lines if input is compressed and negative slice (#26396)
Support annoymous agg in-mem (#26376)
Add unstable arrow_schema parameter to sink_parquet (#26323)
Improve error message formatting for structs (#26349)
Remove parquet field overwrites (#26236)
Enable zero-copy object_store put upload for IPC sink (#26288)
Improved disambiguation for qualified wildcard columns in SQL projections (#26301)
Expose upload_concurrency through env var (#26263)
Allow quantile to compute multiple quantiles at once (#25516)
Allow empty LazyFrame in LazyFrame.group_by(...).map_groups (#26275)
Use delta file statistics for batch predicate pushdown (#26242)
Add streaming UnorderedUnion (#26240)
Implement compression support for sink_ndjson (#26212)
Add unstable record batch statistics flags to {sink/scan}_ipc (#26254)
Support CSE for python UDFs on the same address (#26253)
Cloud retry/backoff configuration via storage_options (#26204)
Use same sort order for expanded paths across local / cloud / directory / glob (#26191)
Add streaming merge-join (#25964)
Serialize optimization flags for cloud plan (#26168)
Add compression support to write_csv and sink_csv (#26111)
Add scan_lines (#26112)
Support regex in str.split (#26060)
Add unstable IPC Statistics read/write to scan_ipc/sink_ipc (#26079)
Add unstable height parameter to DataFrame/LazyFrame (#26014)
Remove old partition sink API (#26100)
Expose ArrowStreamExportable on python collect batches iterator (#26074)
Add nulls support for all rolling_by operations (#26081)

🐞 Bug fixes

Correct off-by-one in RLE row counting for nullable dictionary-encoded columns (#26411)
Support very large integers in env var limits (#26399)
Fix PlPath panic from incorrect slicing of UTF8 boundaries (#26389)
Fix Float dtype for spearman correlation (#26392)
Fix optimizer panic in right joins with type coercion (#26365)
Don't serialize retry config from local environment vars (#26289)
Fix PartitionBy with scalar key expressions and diff() (#26370)
Add {Float16, Float32} -> Float32 lossless upcast (#26373)
Fix panic using with_columns and collect_all (#26366)
Add multi-page support for writing dictionary-encoded Parquet columns (#26360)
Ensure slice advancement when skipping non-inlinable values in is_in with inlinable needles (#26361)
Pin xlsx2csv version temporarily (#26352)
Bugs in ViewArray total_bytes_len (#26328)
Overflow in i128::abs in Decimal fits check (#26341)
Make Expr.hash on Categorical mapping-independent (#26340)
Clone shared GroupBy node before mutation in physical plan creation (#26327)
Fixed "sheet_name" typing for read_ods and read_excel (#26317)
Improve Polars dtype inference from Python Union typing (#26303)
Consider the "current location" of an item when computing rolling_rank_by (#26287)
Reset is_count_star flag between queries in collect_all (#26256)
Fix incorrect is_between filter on scan_parquet (#26284)
Make polars compatible with ty (#26270)
Lower AnonymousStreamingAgg in group-by as aggregate (#26258)
Avoid overflow in pl.duration scalar arguments case (#26213)
Broadcast arr.get on single array with multiple indices (#26219)
Fix panic on CSPE with sorts (#26231)
Eager DataFrame.slice with negative offset and length=None (#26215)
Use correct schema side for streaming merge join lowering (#26218)
Overflow panic in scan_csv with multiple files and skip_rows + n_rows larger than total row count (#26128)
Respect allow_object flag after cache (#26196)
Raise error on non-elementwise PartitionBy keys (#26194)
Allow ordered categorical dictionary in scan_parquet (#26180)
Allow excess bytes on IPC bitmap compressed length (#26176)
Address a macOS-specific compile issue (#26172)
Fix deadlock on hash_rows() of 0-width DataFrame (#26154)
Fix NameError filtering pyarrow dataset (#26166)
Fix concat_arr panic when using categoricals/enums (#26146)
Fix NDJSON/scan_lines negative slice splitting with extremely long lines (#26132)
Incorrect group_by min/max fast path (#26139)
Remove a source of non-determinism from lowering (#26137)
Error when with_row_index or unpivot create duplicate columns on a LazyFrame (#26107)
Panics on shift with head (#26099)

📖 Documentation

Fix Expr.get referencing incorrect dtype for index parameter (#26364)
Fix Expr.quantile formatting (#26351)
Drop sphinx-llms-txt extension (#26285)
Remove deprecated cublet_id (#26260)
Update for new release (#26255)
Update MCP server section with new URL (#26241)
Fix unmatched paren and punctuation in pandas migration guide (#26251)
Add observatory database_path to docs (#26201)
Note plugins in Python user-defined functions (#26138)

📦 Build system

Address remaining Python 3.14 issues with make requirements-all (#26195)
Address a macOS-specific compile issue (#26172)

🛠️ Other improvements

Ensure local doctests skip from_torch if module not installed (#26405)
Change linked timezones in test suite to canonical timezones (#26310)
Implement various deprecations (#26314)
Rename Operator::Divide to RustDivide (#26339)
Properly disable the Pyodide tests (#26382)
Remove unused field (#26367)
Fix runtime nesting (#26359)
Remove xlsx2csv dependency pin (#26355)
Use outer runtime if exists in to_alp (#26353)
Make CategoricalMapping::new pub(crate) to avoid misuse (#26308)
Clarify IPC buffer read limit/length paramter (#26334)
Add dtype test coverage for delta predicate filter (#26291)
Add AI policy (#26286)
Unpin "pandas<3" in dev dependencies (#26249)
Remove all non CSV fast-count paths (#26233)
Pin pandas to 2.x for now (#26221)
Remove unnecessary xfail (#26199)
Ensure optimization flag modification happens local (#26185)
Simplify IcebergDataset (#26165)
Reorganize unit tests into logical subdirectories (#26149)
Lint leftover fixme (#26122)
Improve backtrace for POLARS_PANIC_ON_ERR (#26125)
Fix Python docs build (#26117)
Disable unused-ignore mypy lint (#26110)
Ignore mypy warning (#26105)
Raise error on file://hostname/path (#26061)
Disable debug info for docs workflow (#26086)
Update docs for next polars cloud release (#26091)
Support Python 3.14 in dev environment (#26073)

Thank you to all our contributors for making this release possible!
@Atarust, @EndPositive, @Kevin-Patyk, @LeeviLindgren, @MarcoGorelli, @Matt711, @MrAttoAttoAtto, @Voultapher, @WaffleLapkin, @agossard, @alex-gregory-ds, @alexander-beedie, @azimafroozeh, @bayoumi17m, @c-peters, @carnarez, @dependabot[bot], @dsprenkels, @hallmason17, @hamdanal, @ion-elgreco, @kdn36, @lun3x, @mcrumiller, @nameexhaustion, @orlp, @qxzcode, @r-brink, @ritchie46, @sweb and dependabot[bot]

Contributors

orlp, sweb, and 28 other contributors

Assets 2

12 Jan 23:27

github-actions

py-1.37.1

bb79993

Python Polars 1.37.1

🚀 Performance improvements

Speed up SQL interface "UNION" clauses (#26039)

🐞 Bug fixes

Optimize slicing support on compressed IPC (#26071)
CPU check for musl builds (#26076)
Propagate C Stream import errors instead of panicking (#26036)
Fix slicing on compressed IPC (#26066)

📖 Documentation

Clarify min_by/max_by behavior on ties (#26077)

🛠️ Other improvements

Mark top slow normal tests as slow (#26080)
Update breaking deps (#26055)
Fix for upstream url bug and update deps (#26052)
Properly pin chrono (#26051)
Don't run rust doctests (#26046)
Update deps (#26042)
Ignore very slow test (#26041)

Thank you to all our contributors for making this release possible!
@Voultapher, @alexander-beedie, @kdn36, @nameexhaustion, @orlp, @ritchie46 and @wtn

Contributors

wtn, orlp, and 5 other contributors

Assets 2

10 Jan 12:28

github-actions

py-1.37.0

1674b37

Python Polars 1.37.0

🚀 Performance improvements

Speed up SQL interface "ORDER BY" clauses (#26037)
Add fast kernel for is_nan and use it for numpy NaN->null conversion (#26034)
Optimize ArrayFromIter implementations for ObjectArray (#25712)
New streaming NDJSON sink pipeline (#25948)
New streaming CSV sink pipeline (#25900)
Dispatch partitioned usage of sink_* functions to new-streaming by default (#25910)
Replace ryu with faster zmij (#25885)
Reduce memory usage for .item() count in grouped first/last (#25787)
Skip schema inference if schema provided for scan_csv/ndjson (#25757)
Add width-aware chunking to prevent degradation with wide data (#25764)
Use new sink pipeline for write/sink_ipc (#25746)
Reduce memory usage when scanning multiple parquet files in streaming (#25747)
Don't call cluster_with_columns optimization if not needed (#25724)

✨ Enhancements

Add new pl.PartitionBy API (#26004)
ArrowStreamExportable and sink_delta (#25994)
Release musl builds (#25894)
Implement streaming decompression for CSV COUNT(*) fast path (#25988)
Add nulls support for rolling_mean_by (#25917)
Add lazy collect_all (#25991)
Add streaming decompression for NDJSON schema inference (#25992)
Improved handling of unqualified SQL JOIN columns that are ambiguous (#25761)
Drop Python 3.9 support (#25984)
Expose record batch size in {sink,write}_ipc (#25958)
Add null_on_oob parameter to expr.get (#25957)
Suggest correct timezone if timezone validation fails (#25937)
Support streaming IPC scan from S3 object store (#25868)
Implement streaming CSV schema inference (#25911)
Support hashing of meta expressions (#25916)
Improve SQLContext recognition of possible table objects in the Python globals (#25749)
Add pl.Expr.(min|max)_by (#25905)
Improve MemSlice Debug impl (#25913)
Implement or fix json encode/decode for (U)Int128, Categorical, Enum, Decimal (#25896)
Expand scatter to more dtypes (#25874)
Implement streaming CSV decompression (#25842)
Add Series sql method for API consistency (#25792)
Mark Polars as safe for free-threading (#25677)
Support Binary and Decimal in arg_(min|max) (#25839)
Allow Decimal parsing in str.json_decode (#25797)
Add shift support for Object data type (#25769)
Add missing Series.arr.mean (#25774)
Allow scientific notation when parsing Decimals (#25711)

🐞 Bug fixes

Release GIL on collect_batches (#26033)
Missing buffer update in String is_in Parquet pushdown (#26019)
Make struct.with_fields data model coherent (#25610)
Incorrect output order for order sensitive operations after join_asof (#25990)
Use SeriesExport for pyo3-polars FFI (#26000)
Add pl.Schema to type signature for DataFrame.cast (#25983)
Don't write Parquet min/max statistics for i128 (#25986)
Ensure chunk consistency in in-memory join (#25979)
Fix varying block metadata length in IPC reader (#25975)
Implement collect_batches properly in Rust (#25918)
Fix panic on arithmetic with bools in list (#25898)
Convert to index type with strict cast in some places (#25912)
Empty dataframe in streaming non-strict hconcat (#25903)
Infer large u64 in json as i128 (#25904)
Set http client timeouts to 10 minutes (#25902)
Correct lexicographic ordering for Parquet BYTE_ARRAY statistics (#25886)
Raise error on duplicate group_by names in upsample() (#25811)
Correctly export view buffer sizes nested in Extension types (#25853)
Fix DataFrame.estimated_size not handling overlapping chunks correctly (#25775)
Ensure Kahan sum does not introduce NaN from infinities (#25850)
Trim excess bytes in parquet decode (#25829)
Fix panic/deadlock sinking parquet with rows larger than 64MB estimated size (#25836)
Fix quantile midpoint interpolation (#25824)
Don't use cast when converting from physical in list.get (#25831)
Invalid null count on int -> categorical cast (#25816)
Update groups in list.eval (#25826)
Use downcast before FFI conversion in PythonScan (#25815)
Double-counting of row metrics (#25810)
Cast nulls to expected type in streaming union node (#25802)
Incorrect slice pushdown into map_groups (#25809)
Fix panic writing parquet with single bool column (#25807)
Fix upsample with group_by incorrectly introduced NULLs on group key columns (#25794)
Panic in top_k pruning (#25798)
Fix incorrect collect_schema for unpivot followed by join (#25782)
Verify arr namespace is called from array column (#25650)
Ensure LazyFrame.serialize() unchanged after collect_schema() (#25780)
Function map_(rows|elements) with return_dtype = pl.Object (#25753)
Fix incorrect cargo sub-feature (#25738)

📖 Documentation

Fix display of deprecation warning (#26010)
Document null behaviour for rank (#25887)
Add QUALIFY clause and SUBSTRING function to the SQL docs (#25779)
Update mixed-offset datetime parsing example in user guide (#25915)
Update bare-metal docs for mounted anonymous results (#25801)
Fix credential parameter name in cloud-storage.py (#25788)
Configuration options update (#25756)

🛠️ Other improvements

Update rust compiler (#26017)
Improve csv test coverage (#25980)
Ramp up CSV read size (#25997)
Mark lazy parameter to collect_all as unstable (#25999)
Update ruff action and simplify version handling (#25940)
Run python lint target as part of pre-commit (#25982)
Disable HTTP timeout for receiving response body (#25970)
Fix mypy lint (#25963)
Add AI contribution policy (#25956)
Fix failing scan delta S3 test (#25932)
Improve MemSlice Debug impl (#25913)
Remove and deprecate batched csv reader (#25884)
Remove unused AnonymousScan functions (#25872)
Filter DeprecationWarning from pyparsing indirectly through pyiceberg (#25854)
Various small improvements (#25835)
Clear venv with appropriate version of Python (#25851)
Skip schema inference if schema provided for scan_csv/ndjson (#25757)
Ensure proper async connection cleanup on DB test exit (#25766)
Ensure we uninstall other Polars runtimes in CI (#25739)
Make 'make requirements' more robust (#25693)
Remove duplicate compression level types (#25723)

Thank you to all our contributors for making this release possible!
@AndreaBozzo, @EndPositive, @Kevin-Patyk, @MarcoGorelli, @Voultapher, @alexander-beedie, @anosrepenilno, @arlyon, @azimafroozeh, @carnarez, @dependabot[bot], @dsprenkels, @edizeqiri, @eitanf, @gab23r, @henryharbeck, @hutch3232, @ion-elgreco, @jqnatividad, @kdn36, @lun3x, @m1guelperez, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @sachinn854, @yonikremer and dependabot[bot]

Contributors

orlp, dsprenkels, and 26 other contributors

Assets 2

10 Dec 01:15

github-actions

py-1.36.1

2a151c1

Python Polars 1.36.1

🚀 Performance improvements

Tune partitioned sink_parquet cloud performance (#25687)

✨ Enhancements

Allow creation of Object literal (#25690)
Don't collect schema in SQL union processing (#25675)

🐞 Bug fixes

Don't invalidate node in cluster-with-columns (#25714)
Move boto3 extra from s3fs in dev requirements (#25667)
Add missing type stubs for bin_slice, bin_head, and bin_tail (#25697)
Binary slice methods missing from Series and docs (#25683)
Mix-up of variable_name/value_name in unpivot (#25685)
Invalid usage of drop_first in to_dummies when nulls present (#25435)

📖 Documentation

Fix typos in Excel and Pandas migration guides (#25709)
Add "right" to how options in join() docstrings (#25678)

🛠️ Other improvements

Move Object lit fix earlier in the function (#25713)
Remove unused decimal file (#25701)
Move boto3 extra from s3fs in dev requirements (#25667)
Upgrade to latest version of sqlparser-rs (#25673)
Update slab to version without RUSTSEC (#25686)
Fix typo (#25684)

Thank you to all our contributors for making this release possible!
@AndreaBozzo, @Kevin-Patyk, @alexander-beedie, @dsprenkels, @jamesfricker, @mcrumiller, @nameexhaustion, @orlp and @ritchie46

Contributors

orlp, dsprenkels, and 7 other contributors

Assets 2

Releases: pola-rs/polars

Python Polars 1.39.3

Contributors

Uh oh!

Python Polars 1.39.2

Contributors

Uh oh!

Python Polars 1.39.1

🐞 Bug fixes

📖 Documentation

🛠️ Other improvements

Contributors

Uh oh!

Python Polars 1.39.0

🚀 Performance improvements

✨ Enhancements

🐞 Bug fixes

📖 Documentation

Contributors

Uh oh!

Rust Polars 0.53.0

🏆 Highlights

🚀 Performance improvements

✨ Enhancements

🐞 Bug fixes

Contributors

Uh oh!

Python Polars 1.38.1

✨ Enhancements

🐞 Bug fixes

📖 Documentation

🛠️ Other improvements

Contributors

Uh oh!

Python Polars 1.38.0

⚠️ Deprecations

🚀 Performance improvements

✨ Enhancements

🐞 Bug fixes

📖 Documentation

📦 Build system

🛠️ Other improvements

Contributors

Uh oh!

Python Polars 1.37.1

🚀 Performance improvements

🐞 Bug fixes

📖 Documentation

🛠️ Other improvements

Contributors

Uh oh!

Python Polars 1.37.0

🚀 Performance improvements

✨ Enhancements

🐞 Bug fixes

📖 Documentation

🛠️ Other improvements

Contributors

Uh oh!

Python Polars 1.36.1

🚀 Performance improvements

✨ Enhancements

🐞 Bug fixes

📖 Documentation

🛠️ Other improvements

Contributors

Uh oh!