Skip to content

Conversation

notatallshaw
Copy link
Member

@notatallshaw notatallshaw commented Aug 2, 2025

Closes #6257
Supplants #12717

Changes from the previous PR:

  • upload-before -> exclude-newer-than uploaded-prior-to
  • Uses UTC by default so that is reproducible across environments
  • Pass to isolated build install
  • Adds tests

@notatallshaw
Copy link
Member Author

I've made some changes:

  • I've reverted to using local timezone based on @BurntSushi feedback
  • I've added help for users on how to explicitly specify a timezone
  • I am now passing --exclude-newer-than to the isolated build install command

Copy link
Member

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my first review pass. I'll need to think about the larger design more, but I like where this is going. Thank you for working on this!

Comment on lines +307 to +311
upload_time: datetime.datetime | None
if upload_time_data := file_data.get("upload-time"):
upload_time = parse_iso_datetime(upload_time_data)
else:
upload_time = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the JSON version of the Simple API is used in the vast majority of installs, but we should probably support this feature with the HTML Simple API if possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature doesn't exist in the HTML version of the API, it only exists as addition JSON fields in the spec: https://peps.python.org/pep-0700/#specification

For the HTML version of the API, there is no change from version 1.0. For the JSON version of the API, the following changes are made:

IMO this was a short sighted choice by the spec authors, as it has prevented certain simple index libraries from supporting this feature. But we can't change the past.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for the information. /me wonders if it'd be useful to propose a v1.1 of the HTML spec to bring it to feature parity with the JSON API, but I don't have the time for that, haha.

("2023-01-01T00:00:00", (2023, 1, 1, 0, 0, 0)),
("2023-12-31T23:59:59", (2023, 12, 31, 23, 59, 59)),
# Test date only (will be extended to midnight)
("2023-01-01", (2023, 1, 1, 0, 0, 0)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but this means if I specify --exclude-newer-than 2025-08-14 (aka today), then pip will exclude candidates published on August 14, right? That behaves opposite to what I'd expect by the name --exclude-newer-than <day> -> exclude anything published after this day.

Should this not be extended to (2023, 1, 2, 0, 0, 0) or (2023, 1, 1, 23, 59, 59) ...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The datetime 2025-08-14 is equivalent so 2025-08-14 00:00:00 in the same way the float 1 is equivalent to the float 1.000.

To me your suggestion is the same as saying "not larger than 1" and allowing 1.1 because it is less than 2.

I agree though on the boundary "exclude newer than 2025-08-14" should allow 2025-08-14 00:00:00, I'll take a look at that.

Happy to accept a better phrase than "exclude newer than", but no one seems to have been confused by uv's "exclude newer"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me your suggestion is the same as saying "not larger than 1" and allowing 1.1 because it is less than 2.

This is personal, but I don't read --exclude-newer-than like that. I read it like "ignore any packages published after <date/time>". exclude means ignore and than is exclusive (as is, it won't ignore... double negatives are hard).

The difference with uv's --exclude-newer is that the word "than" is missing, thus it's ambiguous whether it's inclusive or exclusive. I do agree that uv's option name looks too much like a boolean flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my explanation doesn't make sense. I agree with @ofek's opinion on the inclusive/exclusive question: #12717 (comment)

For better or worse, I do think it's important that we match uv in semantics. Otherwise, the two tools will function subtly differently which is bound to become a footgun. I guess that means I am most in favour of --uploaded-before. Fun 🙂

Copy link
Member Author

@notatallshaw notatallshaw Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a good suggestion or strong option on the naming or semantics of the opinion, and am open to alternatives.

But I am strongly against mutating the user input or changing the logic to anything other than package datetime <= user datetime or exlusive <. It might not be semantically clear at first but it is logically simple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to think on the additional feedback given and any feedback over the next couple of days and hold off from changing the name in this PR for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking hard about this and I am still against this inclusive "date" comparison approach, one problem that demonstrates its complexity is time zones:

  1. If I do implicit type conversion and parse the user input as a date and then naively take the date part of the upload-time then this creates wrong values, as the user date and the UTC date will not exactly align unless the user happens to be in UTC
  2. If I follow uv's example and add 1 day to the user datetime and then do a comparison with the upload-time that's precarious because, IMO, Python's datetime library design is not easy to understand where as uv gets to follow jiff, which is a lot more intuitive
  3. Which leaves me following @pfmoore's suggestion where I tack on 23:59:59 to the user input (reading the spec I think to be technically correct it would be 23:59:59.999999) and I can not bring myself to mutate user input in that way, for a few reasons, but the biggest one is having this kind of "smart" comparison (like Python package versions and pre-releases) creates hard to follow logical situations

I am therefore going to use an exclusive comparison, and then this issue of date vs. datetime goes away. Everyone agrees that "uploaded prior to 2025-08-14" is the same as "uploaded prior to 2025-08-14 00:00:00".

Though I appreciate that:

  • Everyone else seems to disagree with my viewpoint about inclusive operators
  • This breaks semantics with uv's option

If this causes this PR to be rejected that's fine, I won't block another PR if someone wants to follow the inclusive comparison using the semantics everyone else seems to agree on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, does this mean you will change the option name to --uploaded-prior-to?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(If we're going to have different semantics a different name seems better)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, does this mean you will change the option name to --uploaded-prior-to?

Yes, sorry, meant to do it last night but ran out of time to work on stuff, commit landed now.

@notatallshaw notatallshaw force-pushed the exclude-newer-than branch 2 times, most recently from 9f528c7 to c7877f2 Compare August 19, 2025 04:48
@notatallshaw notatallshaw changed the title Implement --exclude-newer-than Implement --uploaded-prior-to to filter packages by upload-time Aug 21, 2025
@notatallshaw
Copy link
Member Author

I believe I have addressed all outstanding review items, I also:

  • Added this option to the lock command (was missing because original PR predated lock command)
  • Added a section to the user guide
  • Added a few more tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Install packages up to a certain date
7 participants