Releases · unionai-oss/pandera

29 Jan 02:48

v0.29.0

7614754

Release 0.29.0: support list, dict, and tuple of dataframes

Latest

⭐️ Highlight

Pandera now supports collection types (list, dict, and tuple) containing dataframes. Shoutout to @garethellis0 for an amazing first contribution!

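import pandas as pd
import pandera.pandas as pa
from pandera.typing import DataFrame, Series

# Assumed schema definitions; the release note does not show them.
class OnlyZeroesSchema(pa.DataFrameModel):
    a: Series[int] = pa.Field(eq=0)

class OnlyOnesSchema(pa.DataFrameModel):
    a: Series[int] = pa.Field(eq=1)

@pa.check_types
def process_tuple_and_return_dict(
    dfs: tuple[DataFrame[OnlyZeroesSchema], DataFrame[OnlyOnesSchema]],
) -> dict[str, DataFrame[OnlyZeroesSchema]]:
    return {
        "foo": dfs[0],
        "bar": dfs[0],
    }


result = process_tuple_and_return_dict((
    pd.DataFrame({"a": [0, 0]}),
    pd.DataFrame({"a": [1, 1]}),
))
print(result)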

What's Changed

feature/1078: Added Support For List, Dict, And Tuples Of Dataframes by @garethellis0 in #2204
pin sphinx version by @cosmicBboy in #2208
Add map datatype to the Ibis engine implementation by @deepyaman in #2206

New Contributors

@garethellis0 made their first contribution in #2204

Full Changelog: v0.28.1...v0.29.0

Contributors

cosmicBboy, garethellis0, and deepyaman


08 Jan 14:10

cosmicBboy

v0.28.1

71f860a

v0.28.1: Fix regressions in Check behavior

What's Changed

fix bugs in Check interface and Field by @cosmicBboy in #2203

Full Changelog: v0.28.0...v0.28.1

Contributors

cosmicBboy


06 Jan 20:37

cosmicBboy

v0.28.0

82096dd

Release 0.28.0: Add support for PySpark 4

⭐️ Highlight

Pandera now supports PySpark 4 🚀

What's Changed

refactor(pyspark): restructure pyspark components by @ELC in #2007
add support for pyspark 4 by @cosmicBboy in #2193
Decouple import dependencies for io serialization formats by @cosmicBboy in #2195
Use get_annotations instead of direct __annotations__ access by @amerberg in #2196
Re-implement improvements to str_length check by @cosmicBboy in #2198
Support the Decimal data type in the Ibis engine by @deepyaman in #2194
Update .git-blame-ignore-revs to add Ruff refactor by @deepyaman in #2199
Avoid full materialization of levels in failing MultiIndex validations by @amerberg in #2187
schema descriptor should raise AttributeError if build_schema_ is not implemented by @amerberg in #2197

New Contributors

@ELC made their first contribution in #2007

Full Changelog: v0.27.1...v0.28.0

Contributors

amerberg, cosmicBboy, and 2 other contributors


22 Dec 19:01

cosmicBboy

v0.27.1

70abc5c

Release v0.27.1: bugfix related to numpy==2.4.0

What's Changed

enhancement #2122 by @Jarek-Rolski in #2177
Fix failure_cases index value for MultiIndex schema errors by @amerberg in #2186
handle new numpy 2.4.0 ValueError when type is not recognized by @cosmicBboy in #2191

Full Changelog: v0.27.0...v0.27.1

Contributors

amerberg, cosmicBboy, and Jarek-Rolski


25 Nov 16:11

cosmicBboy

v0.27.0

ff8674a

v0.27.0: Support Python 3.14

⭐️ Highlight

Pandera now supports Python 3.14! This release also drops support for Python 3.9.

What's Changed

scipy-stubs by @jorenham in #2121
bugfix: set SPARK_LOCAL_IP to 127.0.0.1 if not set. by @cosmicBboy in #2123
fix: collect failure_cases in check_column_values_are_unique by @MikeEvansLarah in #2120
Adding import to code example in data_synthesis_strategies.md by @OwenLund in #2126
Pin DuckDB<1.4.0 in dev env due to breaking change by @deepyaman in #2140
fix mypy polars issues by @cosmicBboy in #2142
Add descriptors to DataFrameModel by @lundybernard in #2136
Support nonnullable equivalents for all data types by @deepyaman in #2146
Fix failure case count for Ibis tables and strings by @deepyaman in #2145
Implement check_nullable checks for Ibis backend by @deepyaman in #2149
feat: create empty dataframe with index (and multiindex) when present… by @davidkleiven in #2133
Bugfix/1994 Error loading frictionless schema by @Jarek-Rolski in #2159
Do not pass removed name argument to memtables by @deepyaman in #2162
Fix: Add enum.Enum serialization support for to_json() by @chris-wright-nl in #2163
Implement drop_invalid_rows for the Ibis backend by @deepyaman in #2151
Support Python 3.14 by @glatterf42 in #2158
optimize pandas MultiIndex validation by avoiding materializing level values when possible by @amerberg in #2118
fix: remove pandas.concat signature hook by @kitagry in #2173
add codecov token to ci by @cosmicBboy in #2175
Use Ruff instead of Black, pyupgrade and isort by @deepyaman in #2171

New Contributors

@jorenham made their first contribution in #2121
@MikeEvansLarah made their first contribution in #2120
@OwenLund made their first contribution in #2126
@chris-wright-nl made their first contribution in #2163
@glatterf42 made their first contribution in #2158
@kitagry made their first contribution in #2173

Full Changelog: v0.26.1...v0.27.0

Contributors

amerberg, MikeEvansLarah, and 10 other contributors


23 Nov 13:51

cosmicBboy

v0.27.0b0

b48e0e3

v0.27.0b0: beta release, add Python 3.14

Pre-release

What's Changed

scipy-stubs by @jorenham in #2121
bugfix: set SPARK_LOCAL_IP to 127.0.0.1 if not set. by @cosmicBboy in #2123
fix: collect failure_cases in check_column_values_are_unique by @MikeEvansLarah in #2120
Adding import to code example in data_synthesis_strategies.md by @OwenLund in #2126
Pin DuckDB<1.4.0 in dev env due to breaking change by @deepyaman in #2140
fix mypy polars issues by @cosmicBboy in #2142
Add descriptors to DataFrameModel by @lundybernard in #2136
Support nonnullable equivalents for all data types by @deepyaman in #2146
Fix failure case count for Ibis tables and strings by @deepyaman in #2145
Implement check_nullable checks for Ibis backend by @deepyaman in #2149
feat: create empty dataframe with index (and multiindex) when present… by @davidkleiven in #2133
Bugfix/1994 Error loading frictionless schema by @Jarek-Rolski in #2159
Do not pass removed name argument to memtables by @deepyaman in #2162
Fix: Add enum.Enum serialization support for to_json() by @chris-wright-nl in #2163
Implement drop_invalid_rows for the Ibis backend by @deepyaman in #2151
Support Python 3.14 by @glatterf42 in #2158
optimize pandas MultiIndex validation by avoiding materializing level values when possible by @amerberg in #2118
fix: remove pandas.concat signature hook by @kitagry in #2173

New Contributors

@jorenham made their first contribution in #2121
@MikeEvansLarah made their first contribution in #2120
@OwenLund made their first contribution in #2126
@chris-wright-nl made their first contribution in #2163
@glatterf42 made their first contribution in #2158
@kitagry made their first contribution in #2173

Full Changelog: v0.26.1...v0.27.0b0

Contributors

amerberg, MikeEvansLarah, and 10 other contributors


26 Aug 16:48

cosmicBboy

v0.26.1

f8384ae

v0.26.1: Multi-index, `@check_types` Bugfixes

What's Changed

fix MultiIndex check regression by @amerberg in #2116
implement multiindex_strict and multiindex_unique add test cases by @amerberg in #2114
Bugfix: #2058 Check_types for callable by @ybressler in #2069

New Contributors

@ybressler made their first contribution in #2069

Full Changelog: v0.26.0...v0.26.1

Contributors

amerberg and ybressler


13 Aug 01:12

cosmicBboy

v0.26.0

24fe938

v0.26.0: Add support for Python 3.13

⭐️ Highlight

📣 Pandera now supports Python 3.13! Now go forth and use bare forward reference types to your heart's content 🤗

What's Changed

Enh/future annotations py3.13 by @cosmicBboy in #1980
fix pyspark check registration by @cosmicBboy in #2087
remove top-level pandera init import warning by @cosmicBboy in #2088
Bugfix 2075: Polar dataframe default values - fill_nan AND fill_null for float columns by @cmsommerville in #2076
Remove pylint by @cosmicBboy in #2086
Upgrade pyupgrade hook and target Python version by @deepyaman in #2093
Fix passing an empty column list to check duplicates by @rush4ratio in #2092
Replace Literal imports from typing_extensions by @deepyaman in #2100
Add .git-blame-ignore-revs to avoid bulk changes by @deepyaman in #2101
limit polars version on Mac OS by @amerberg in #2105
delete monthly downloads, not available by @cosmicBboy in #2112
Implement parser machinery and the strict parser by @deepyaman in #2096
Support checking joint uniqueness of table columns by @deepyaman in #2097
Reimplement pandas MultiIndex backend without inheriting from DataFrame backend by @amerberg in #2103
fix(doc): clarify check_fn signature by @Farley-Chen in #2107
Fix missing tests core directory by @rush4ratio in #2102
fix polars Categorical bug by @cosmicBboy in #2113

New Contributors

@cmsommerville made their first contribution in #2076
@rush4ratio made their first contribution in #2092
@Farley-Chen made their first contribution in #2107

Full Changelog: v0.25.0...v0.26.0

Contributors

amerberg, cosmicBboy, and 4 other contributors


08 Jul 19:19

cosmicBboy

v0.25.0

c49b18f

v0.25.0: 🦩 Support Ibis table validation

⭐️ Highlight

Pandera now supports Ibis 🦩! You can now validate data on all available Ibis backends using the pandera.ibis module.

In-memory table example:

import ibis
import pandera.ibis as pa

class Schema(pa.DataFrameModel):
    state: str
    city: str
    price: int = pa.Field(in_range={"min_value": 5, "max_value": 20})

t = ibis.memtable(
    {
        'state': ['FL','FL','FL','CA','CA','CA'],
        'city': [
            'Orlando',
            'Miami',
            'Tampa',
            'San Francisco',
            'Los Angeles',
            'San Diego',
        ],
        'price': [8, 12, 10, 16, 20, 18],
    }
)
Schema.validate(t).execute()

SQLite example:

con = ibis.sqlite.connect()
t = con.create_table(
    "table",
    schema=ibis.schema(dict(state="string", city="string", price="int64"))
)

con.insert(
    "table",
    obj=[
        ("FL", "Orlando", 8),
        ("FL", "Miami", 12),
        ("FL", "Tampa", 10),
        ("CA", "San Francisco", 16),
        ("CA", "Los Angeles", 20),
        ("CA", "San Diego", 18),
    ]
)

Schema.validate(t).execute()

What does this mean?

This release unlocks in-database validation on some of the most widely used data platforms, including PostgreSQL, Snowflake, BigQuery, MySQL, and more ✨. You can validate data at scale, on the database or data framework of your choice, before fetching it for downstream analysis or modeling work.

Naturally, this also means that you can develop your schemas locally on a DuckDB or SQLite backend and then use the same schemas in production on a remote database like Postgres.

Learn more about the integration here.

What's Changed

Add Polars pydantic integration with format support and native JSON schema generation by @halicki in #1979
exclude python 3.12 and pyspark combo in ci by @cosmicBboy in #2005
Delete previously-added foo.txt and new_example.py by @deepyaman in #2013
Pin PySpark due to test failures/incompatibilities by @deepyaman in #2010
Temporarily pin polars due to test failure in CI by @deepyaman in #2011
Replace event_loop removed in pytest-asyncio 1.0 by @deepyaman in #2014
Fix typehint in unique_values_eq (issue #1492) by @AhmetZamanis in #2015
fix pyarrow string issue, fix docs failing issues by @cosmicBboy in #2026
bugfix: PANDERA_VALIDATION_ENABLED=False should disable validation by @cosmicBboy in #2028
Expect Python slice index errors after Python 3.10 by @deepyaman in #2033
Ibis dev by @deepyaman in #2040
handle dataframe-level failure cases: convert row to dict by @cosmicBboy in #2050
bugfix/1927 by @Jarek-Rolski in #2019
[🐻‍❄️ polars] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2055
[🦩 ibis] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2056
Add link to the documentation about Ibis datatypes by @deepyaman in #2057
Test column presence, mark other features not impl by @deepyaman in #2060
Run pre-commit on all files to fix linter issues by @deepyaman in #2063
Implement regex option and add additional checks by @deepyaman in #2061
Implement binary and boolean types (and test them) by @deepyaman in #2064
Add unit test suite for Ibis components, fix a bug by @deepyaman in #2065
bugfix: fix format_vectorized_error_message to properly format nested pyarrow failed cases by @AndrejIring in #2036
handle empty dataframes with PydanticModel: show warning by @cosmicBboy in #2066
bugfix/2031: Allow strict='filter' and coerce='True' at the same time for PySpark schemas by @gfilaci in #2032
Set validation scope for pandas run_checks methods by @amerberg in #2003
DataFrameSchema.update_index correctly sets title, description, and metadata by @cosmicBboy in #2067
[ibis 🦩] remove inplace=True in column validate call by @cosmicBboy in #2068
[ibis 🦩] check backend: use positional join for duckdb and polars, fix ibis DataFrameModel.validate types by @cosmicBboy in #2071

New Contributors

@halicki made their first contribution in #1979
@AhmetZamanis made their first contribution in #2015
@AndrejIring made their first contribution in #2036
@gfilaci made their first contribution in #2032
@amerberg made their first contribution in #2003

Full Changelog: v0.24.0...v0.25.0

Contributors

amerberg, cosmicBboy, and 6 other contributors


07 Jul 00:34

cosmicBboy

v0.25.0rc0

ad8f08d

v0.25.0rc0: Support ibis table validation

Pre-release

What's Changed

Add Polars pydantic integration with format support and native JSON schema generation by @halicki in #1979
exclude python 3.12 and pyspark combo in ci by @cosmicBboy in #2005
Delete previously-added foo.txt and new_example.py by @deepyaman in #2013
Pin PySpark due to test failures/incompatibilities by @deepyaman in #2010
Temporarily pin polars due to test failure in CI by @deepyaman in #2011
Replace event_loop removed in pytest-asyncio 1.0 by @deepyaman in #2014
Fix typehint in unique_values_eq (issue #1492) by @AhmetZamanis in #2015
fix pyarrow string issue, fix docs failing issues by @cosmicBboy in #2026
bugfix: PANDERA_VALIDATION_ENABLED=False should disable validation by @cosmicBboy in #2028
Expect Python slice index errors after Python 3.10 by @deepyaman in #2033
Ibis dev by @deepyaman in #2040
handle dataframe-level failure cases: convert row to dict by @cosmicBboy in #2050
bugfix/1927 by @Jarek-Rolski in #2019
[🐻‍❄️ polars] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2055
[🦩 ibis] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2056
Add link to the documentation about Ibis datatypes by @deepyaman in #2057
Test column presence, mark other features not impl by @deepyaman in #2060
Run pre-commit on all files to fix linter issues by @deepyaman in #2063
Implement regex option and add additional checks by @deepyaman in #2061
Implement binary and boolean types (and test them) by @deepyaman in #2064
Add unit test suite for Ibis components, fix a bug by @deepyaman in #2065
bugfix: fix format_vectorized_error_message to properly format nested pyarrow failed cases by @AndrejIring in #2036
handle empty dataframes with PydanticModel: show warning by @cosmicBboy in #2066
bugfix/2031: Allow strict='filter' and coerce='True' at the same time for PySpark schemas by @gfilaci in #2032
Set validation scope for pandas run_checks methods by @amerberg in #2003
DataFrameSchema.update_index correctly sets title, description, and metadata by @cosmicBboy in #2067
[ibis 🦩] remove inplace=True in column validate call by @cosmicBboy in #2068

New Contributors

@halicki made their first contribution in #1979
@AhmetZamanis made their first contribution in #2015
@AndrejIring made their first contribution in #2036
@gfilaci made their first contribution in #2032
@amerberg made their first contribution in #2003

Full Changelog: v0.24.0...v0.25.0rc0

Contributors

amerberg, cosmicBboy, and 6 other contributors
