-
Notifications
You must be signed in to change notification settings - Fork 262
Only apply RD range when parsing from_str #7531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Needs #7533 to land first. |
1bc7719 to
1900dcf
Compare
1900dcf to
2316767
Compare
|
I don't see what the actual issue you're trying to fix is. The test you have written could be written with the current ranges as well. It looks like you're trying to align IXDTF parsing to the constructor-documented year ranges. I don't think those should be aligned, we arbitrarily chose the constructor year ranges so that they are easy to document for the constructors. They are not invariants of the |
|
Sorry, the PR description is from when this was still a draft, so it doesn't describe what this PR does, it describes why this PR doesn't work :) That's fixed now that I wrote #7533 I updated the PR description.
No, I'm aligning it to the RD range, which is an internal invariant of the Date type.
I agree that IXDTF parsing should not be aligned with constructor documented year ranges, and I'm glad to hear you feel the same. Currently the ISO year range is applied to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the second commit, but not with the first.
I think what you're trying to do here is:
- you're tightening the year range, because
- you're tightening the RD range, because
- you want to add IXDTF serialization in the future, and that requires an RD range smaller than -999999-01-01..999999-12-31
Is that accurate? The explanations/justifications I've seen all seem backward
| // Month must be 1-12 | ||
| if !(1..=12).contains(&month) { | ||
| return Err(RangeError { | ||
| field: "month", | ||
| value: i32::from(month), | ||
| min: 1, | ||
| max: 12, | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Month must be 1-12 | |
| if !(1..=12).contains(&month) { | |
| return Err(RangeError { | |
| field: "month", | |
| value: i32::from(month), | |
| min: 1, | |
| max: 12, | |
| }); | |
| } | |
| range_check(month, "month", 1..=12)?; |
| // Day must be valid for the month and year | ||
| let days_in_month = AbstractGregorian::<IsoEra>::days_in_provided_month(year, month); | ||
| if !(1..=days_in_month).contains(&day) { | ||
| return Err(RangeError { | ||
| field: "day", | ||
| value: i32::from(day), | ||
| min: 1, | ||
| max: i32::from(days_in_month), | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Day must be valid for the month and year | |
| let days_in_month = AbstractGregorian::<IsoEra>::days_in_provided_month(year, month); | |
| if !(1..=days_in_month).contains(&day) { | |
| return Err(RangeError { | |
| field: "day", | |
| value: i32::from(day), | |
| min: 1, | |
| max: i32::from(days_in_month), | |
| }); | |
| } | |
| range_check(day, "day", 1..=AbstractGregorian::<IsoEra>::days_in_provided_month(year, month))?; |
| if !VALID_RD_RANGE.contains(&rd) { | ||
| return Err(RangeError { | ||
| field: "RataDie", | ||
| value: rd.to_i64_date() as i32, | ||
| min: VALID_RD_RANGE.start().to_i64_date() as i32, | ||
| max: VALID_RD_RANGE.end().to_i64_date() as i32, | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to throw a range error over RD, this is not actionable for callers.
Given that IXDTF is written as Gregorian dates, I'm tempted to restrict it to the same range as Date::try_new_gregorian after all
| /// | ||
| /// This allows for construction in cases where interchange is needed, e.g. from IXDTF | ||
| #[cfg(feature = "ixdtf")] | ||
| pub(crate) fn try_new_iso_without_range_check( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please inline this to the IXDTF parser, it will make it easier to reason about error cases
| use crate::calendar_arithmetic::VALID_RD_RANGE; | ||
|
|
||
| // Month must be 1-12 | ||
| if !(1..=12).contains(&month) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IXDTF parser already returns a ParseError here:
icu4x/utils/ixdtf/src/parsers/datetime.rs
Line 294 in 50912ea
| return Err(ParseError::InvalidMonthRange); |
|
|
||
| // Day must be valid for the month and year | ||
| let days_in_month = AbstractGregorian::<IsoEra>::days_in_provided_month(year, month); | ||
| if !(1..=days_in_month).contains(&day) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IXDTF parser already returns a ParseError here:
icu4x/utils/ixdtf/src/parsers/datetime.rs
Line 226 in 50912ea
| return Err(ParseError::InvalidMonthDay); |
| #[test] | ||
| fn test_parse_out_of_iso_year_range() { | ||
| // This test ensures that dates with large years, representable in a calendar | ||
| // like Islamic-tabular, can roundtrip through the IXDTF string format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no round-tripping going on in this test
| Iso, | ||
| ); | ||
| let rd = date.to_rata_die(); | ||
| if !VALID_RD_RANGE.contains(&rd) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if you don't change the year range to 500000 (and update the RD range to stay tight), you don't need this check. The IXDTF parser allows at most 999999, which is within the RD range.
No, that's not accurate. We already have IXDTF here, there's no "future" thing here. IXDTF as a serialization format is specced by Temporal. We don't have serde impls but it is philosophically a serialization format and we should support it in a non-buggy way. My path on this issue is that I was investigating our date ranges for #7207, and realized that the current impl applies the ISO year range for I went to fix it (this PR), and realized that it was untestable because our RD range is not actually parseable. This I consider to be a bigger bug, so I went ahead and proposed a fix to that and based this PR on top of it. There are multiple ways of doing the first commit, we should discuss that on the other issue. The first commit does not drive this one. My core motivator here is I do not want any of our dates to be non-serializable. |
It's "untestable" because there is currently no bug. Currently every valid IXDTF string is a valid
It's silly that it checks it, because the parser already guarantees that we're in range. Let's just remove the checks and replace them by a comment. |
Yes, but there's the underlying bug that all of our valid dates are not roundtrippable through IXDTF. Again, I noticed what I seemed to be a bug in the code, and while attempting to test it, discovered what I considered to be a worse bug, fixed it, and then fixed the original bug. You're right that the bug this fixes is not present in the current code, but it is not present due to what I consider to be a worse bug. |
Let's do this one step at a time then, with clear justifications for each step. You keep talking about round tripping, and I keep saying we don't have IXDTF serialization. |
I am? I made two PRs, one depending on the other? That's ... how that works. I'm frustrating with your insistence on debating this PR outside of the context of the PR it is based on. It's based on another PR. That's how that works. Yes, this "fix" is not necessary in the current code. I discovered that via a somewhat circituous route, but before anyone saw these PRs I set them up to depend on each other in that way. It's a waste of time to try and debate the merits of this PR while assuming hte other one won't land. That's not what this PR was written for, and I agree that without the other PR we shouldn't land this.
Huh, you're right. We only have deserialization, it is temporal_rs which implements serialization. That's fair, I crossed my wires between the two codebases in understanding what we do and don't have. We probably should support IXTDF serialization, though. |
|
I'm removing myself from reviewer because I don't want to spend time reviewing things that we might revert or change again once we decide on the policy. |
Currently, we have soft per-calendar year ranges (±1M years, ±500K after #7533) that affect per-calendar constructors, and a hard internal RD range that is the minimum range required to support all of the soft ranges.
It is possible to get out of the soft range by e.g. converting dates between calendars. This is by design.
The soft year range works for all of the constructors from codes/etc. However, string constructors work on dates like
1970-01-01[u-ca=islamic-tbla], where the date is specified as an ISO date, and annotated as being interpreted in a particular calendar.This is the primary serialization format for such dates, as specified in IXDTF and Temporal.
Unfortunately, in the current implementation, the ISO year range is applied to this serialization format, regardless of the calendar chosen. If a date is in range for
islamic-tbla, but out of range for ISO, it cannot be deserialized from its serialized form.Instead, we should just apply the R.D. range when calling
from_str. An alternate route is to apply the calendar-specific year range when deserializing from a calendar, however this results in dates that are theoretically possible to construct but will not deserialize.Based on #7533