Labels
bug: Something isn't working
Description
Describe the bug
When validating a DataFrame that has an extra column and two columns out of order against a DataFrameSchema with both strict and ordered set to True, validation reports only a COLUMN_NOT_ORDERED schema error. I would also expect a COLUMN_NOT_IN_SCHEMA error, because the DataFrame contains a column that is not in the schema.
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandera.
- (optional) I have confirmed this bug exists on the main branch of pandera.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
import pandas as pd
import pandera as pa

schema = pa.DataFrameSchema(
    columns={
        "id": pa.Column(pa.Int64, nullable=False),
        "name": pa.Column(pa.String, nullable=True),
    },
    strict=True,
    ordered=True,
)

# This dataframe incorrectly raises only a COLUMN_NOT_ORDERED schema error,
# even though it also contains an extra unexpected column.
df = pd.DataFrame(
    {
        "name": ["Alice", "Bob", "Charlie"],
        "id": [1, 2, 3],
        "extra_column": ["extra1", "extra2", "extra3"],
    },
)

# This dataframe correctly raises only the COLUMN_NOT_IN_SCHEMA schema error.
_df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["Alice", "Bob", "Charlie"],
        "extra_column": ["extra1", "extra2", "extra3"],
    },
)
Expected behavior
When a DataFrameSchema has both strict and ordered set to True and the validated DataFrame has both problems, I'd expect validation to raise both the COLUMN_NOT_ORDERED and COLUMN_NOT_IN_SCHEMA schema errors.
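To make the expected behavior concrete, here is a minimal sketch in plain Python that runs the two checks independently and collects every error that applies. It is not pandera's implementation: the function `column_errors` and its parameters are hypothetical names chosen for illustration, and only the two error codes are taken from this report.

```python
from typing import List, Sequence


def column_errors(
    schema_columns: Sequence[str],
    df_columns: Sequence[str],
    strict: bool = True,
    ordered: bool = True,
) -> List[str]:
    """Hypothetical helper: return every applicable error code
    instead of stopping at the first failing check."""
    errors: List[str] = []
    known = set(schema_columns)

    # Strict check: any DataFrame column absent from the schema is an error.
    if strict and any(col not in known for col in df_columns):
        errors.append("COLUMN_NOT_IN_SCHEMA")

    # Ordered check: compare only the columns the schema knows about, so an
    # extra column neither hides nor causes an ordering error.
    if ordered:
        present = [col for col in df_columns if col in known]
        expected = [col for col in schema_columns if col in present]
        if present != expected:
            errors.append("COLUMN_NOT_ORDERED")

    return errors


# Column layouts of the two dataframes from the code sample.
both = column_errors(["id", "name"], ["name", "id", "extra_column"])
only_extra = column_errors(["id", "name"], ["id", "name", "extra_column"])
print(both)
print(only_extra)
```

With the column layout of the first dataframe in the code sample, this sketch reports both error codes. With the second layout it reports only COLUMN_NOT_IN_SCHEMA.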
Desktop (please complete the following information):
- OS: iOS
- Browser: N/A
- Version: [e.g. 22]
Screenshots
N/A
Additional context
N/A