feat(ingestion/deps): add upper bounds to dependency versions in setup.py #15813
kyungsoo-datahub
commented
Jan 6, 2026
- Add upper bounds to all dependencies using next major version (e.g., <3.0.0)
- For 0.x packages, use tight bounds (<=current) or <1.0.0 where no 1.x exists
- Add comments documenting automatic dependency chains:
- e.g., numpy<2 -> feast<=0.47 -> pyarrow<18.1 (resolved automatically)
- Keep explicit constraints for deliberate choices (sqlalchemy<2, numpy<2)
Linear: ING-1334
Codecov Report: ✅ All modified and coverable lines are covered by tests.
✅ Meticulous spotted 0 visual differences across 967 screens tested. Meticulous evaluated ~8 hours of user flows against your PR. Last updated for commit 974146c.
Bundle Report: Bundle size has no change ✅
metadata-ingestion/setup.py
Outdated
- "click>=7.1.2, !=8.2.0",
+ "click>=7.1.2,!=8.2.0,<9.0.0",
  "click-default-group",
  "PyYAML",
[MINOR] Some packages are still unbounded. Is the plan to add upper bounds to all packages?
Done. Thanks.
metadata-ingestion/setup.py
Outdated
  # `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token` were deprecated and removed in version
  # 0.8.0.
- "pyiceberg[glue,hive,dynamodb,snappy,hive,s3fs,adlfs,pyarrow,zstandard]>=0.8.0",
+ "pyiceberg[glue,hive,dynamodb,snappy,hive,s3fs,adlfs,pyarrow,zstandard]>=0.8.0,<=0.10.0",
pyiceberg is still actively developed. Should we pin strictly to the latest version? The same applies to databricks-sdk.
askumar27
left a comment
Question: Do we need the lower bounds, or would an upper bound alone suffice?
…p.py
- Add upper bounds to all dependencies using next major version (e.g., <3.0.0)
- For 0.x packages, use tight bounds (<=current) or <1.0.0 where no 1.x exists
- Add comments documenting automatic dependency chains:
  - numpy<2 -> feast<=0.47 -> pyarrow<18.1 (resolved automatically)
  - sqlalchemy<2 -> sqlalchemy-pytds<1.0, sqlalchemy-hana<4.0 (resolved automatically)
  - protobuf<5 -> grpcio-tools<1.63 (resolved automatically)
  - urllib3<2 -> tableauserverclient<0.27 (resolved automatically)
- Keep explicit constraints for deliberate choices (sqlalchemy<2, numpy<2)
Add version upper bounds to packages that previously had no constraints.
Strategy:
- For 1.x+ packages: <next_major.0.0
- For 0.x packages: <0.{minor+1}.0
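The strategy above can be sketched as a small helper. This is only an illustration, not code from the PR: the function name `upper_bound` is hypothetical, and it assumes plain numeric versions such as `3.1.5` (no pre-release or epoch segments).

```python
def upper_bound(latest: str) -> str:
    """Return an upper-bound specifier for a plain X[.Y[.Z]] version string.

    Hypothetical helper: 1.x+ packages get <next_major.0.0,
    0.x packages get <0.{minor+1}.0.
    """
    parts = [int(p) for p in latest.split(".")[:2]]
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else 0
    if major >= 1:
        return f"<{major + 1}.0.0"
    return f"<0.{minor + 1}.0"


print(upper_bound("3.1.5"))   # <4.0.0
print(upper_bound("0.10.0"))  # <0.11.0
```

Note that the PR description also mentions `<=current` and `<1.0.0` as alternatives for 0.x packages; the sketch covers only the `<0.{minor+1}.0` rule stated in this commit message.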
9506d43 to 321530f
metadata-ingestion/setup.py
Outdated
  "excel": {
- "openpyxl>=3.1.5",
+ "openpyxl>=3.1.5,<4.0.0",
  "pandas",
Let's add an upper bound here as well.
Added. Thanks.
…kages
Add version upper bounds to packages that were missed by the automated script (packages with no existing version specifiers):
- google-cloud-bigquery<4.0.0
- google-cloud-resource-manager<2.0.0
- google-cloud-dataplex<3.0.0
- pandas<3.0.0 (excel plugin)
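One way to catch requirements that carry no version specifier at all is a simple pattern check. The sketch below is not the automated script referred to in the commit message; `has_no_specifier` is a hypothetical name, and the sample list reuses requirement strings quoted in the diffs of this PR.

```python
import re

# A bare requirement: a distribution name, optional extras, and nothing else.
_BARE = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?$")


def has_no_specifier(requirement: str) -> bool:
    """True if the requirement string has no version clause (hypothetical helper)."""
    # Drop any environment marker (the part after ';') before checking.
    spec = requirement.split(";", 1)[0].strip()
    return bool(_BARE.match(spec))


requirements = [
    "click>=7.1.2,!=8.2.0,<9.0.0",
    "click-default-group",
    "PyYAML",
    "openpyxl>=3.1.5,<4.0.0",
    "pandas",
]
unbounded = [r for r in requirements if has_no_specifier(r)]
print(unbounded)  # ['click-default-group', 'PyYAML', 'pandas']
```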
askumar27
left a comment
Notes from review meetings:
There are four cases for how Python package versions are bounded:
- Strict upper and lower bounds: left untouched.
- Lower bound only (the most common case): an upper bound is added that caps the package at its current latest MAJOR version, to avoid pulling in breaking changes.
- No bounds: an upper bound is added that caps the package at its current latest MAJOR version, to avoid pulling in breaking changes.
- Upper bound only: left untouched.
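The four cases in these notes can be expressed as a small classifier. This is a rough sketch under stated assumptions: `classify` and `ACTION` are hypothetical names, `==` and `~=` clauses are treated as bounding both sides, and `!=` exclusions are ignored when deciding which case applies.

```python
import re

# Action per case, following the review notes (hypothetical mapping).
ACTION = {
    "both": "untouched",
    "lower_only": "add upper bound",
    "none": "add upper bound",
    "upper_only": "untouched",
}


def classify(requirement: str) -> str:
    """Return which of the four bounding cases a requirement string falls into."""
    spec = requirement.split(";", 1)[0]  # ignore environment markers
    match = re.match(r"^\s*[A-Za-z0-9._-]+(\[[^\]]*\])?(.*)$", spec)
    if match is None:
        raise ValueError(f"unrecognised requirement: {requirement!r}")
    clauses = [c.strip() for c in match.group(2).split(",") if c.strip()]
    has_lower = any(c.startswith((">", "==", "~=")) for c in clauses)
    has_upper = any(c.startswith(("<", "==", "~=")) for c in clauses)
    if has_lower and has_upper:
        return "both"
    if has_lower:
        return "lower_only"
    if has_upper:
        return "upper_only"
    return "none"


for req in ["openpyxl>=3.1.5,<4.0.0", "click>=7.1.2,!=8.2.0", "pandas", "numpy<2"]:
    print(req, "->", ACTION[classify(req)])
```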
…ow 3.1.x compatibility