Use simple approximation for LunarChinese #7006

robertbastian · 2025-09-30T19:42:55Z

Replaces #6995


sffc

Initial feedback. I want to do a more thorough review before this lands.

components/calendar/tests/pingqi.rs

components/calendar/src/cal/chinese.rs

sffc · 2025-09-30T20:41:10Z

components/calendar/src/cal/chinese.rs

+            if !(1900..2100).contains(&case.iso_year) {
+                continue;
+            }


doesn't pass outside the hardcoded range

Then update the tests to either be in-range or use the new dates?

I'll wait until I have an approval before I delete test cases that I might have to restore later

I have removed these tests because consistency with ICU is no longer a goal, and we have tests against the actual ground truth

components/calendar/src/cal/chinese.rs

Manishearth · 2025-09-30T22:47:31Z

Dangi data: https://gist.github.com/Manishearth/d8c94a7df22a9eacefc4472a5805322e. Please ignore the data for the years 1899 and 2050; it is incomplete.

Manishearth · 2025-09-30T22:47:54Z

I'm going to figure out a home for the scraper and scraped data, but for now we can just cite KASI.

The code and data used for fetching this will be pushed up to a separate (private) Unicode repo once we have one. You can find the cleaned up source data in https://gist.github.com/Manishearth/d8c94a7df22a9eacefc4472a5805322e. I'm imagining that post-1950 data will change or be removed with #7006. The initial motivation here was to fix the apparent ground truth mismatch found in https://github.com/unicode-org/icu4x/pull/7007/files#r2393049682. Turns out it was a different problem, and it has been fixed in #7013. We may potentially need the same discussion as #6970 about whether we care about these pre-1912 dates, since that's the only time this diverges.

robertbastian · 2025-10-01T16:32:31Z

components/calendar/src/cal/chinese.rs

+            LunarChineseYearData::simple(
+                // Future reference time is probably UTC+9
+                day_fraction_to_ms!(9 / 24),
+                // This is required for continuity with the hardcoded data


these years start on the same day with both methods:

2092, 2093, 2094, 2095, 2096, 2097, 2098, 2103, 2104, 2106, 2107, 2108, 2109, 2110

crucially, not 2101, which is why we need a correction

I don't like adjusting the new moon because it skews not only this year but all future years. I want us to at least try to approximate GB/T, with an average error near zero, even if local errors are several hours one way or the other. Can we just project Reingold two more years before cutting over?

we could, that's why I posted this list

we're not going to get an average error of zero, because the mean lunar cycle varies considerably over centuries. this method does not really align with GB/T, even without lunar correction

We should pick a new moon that is close to the mean of the synodic cycle. If the moon was at the outer extreme of the ellipse on the date you picked in January 2000, then we'll be carrying that error forward.

Thought: We should run Reingold's new moon code for several hundred years and pick the new moon instant that minimizes the error. This should be an easy simulation to write, and it makes the approximation less arbitrary.

I don't consider minimising an error to be a goal here, at least not in this PR

components/calendar/src/cal/chinese.rs

Manishearth · 2025-10-01T17:13:34Z

components/calendar/src/cal/chinese.rs

+            if !(1900..2100).contains(&case.iso_year) {
+                continue;
+            }


components/calendar/src/cal/chinese.rs

Manishearth · 2025-10-01T17:18:32Z

components/calendar/src/cal/chinese.rs

+        new_moon_correction: Milliseconds,
+        related_iso: i32,
+    ) -> LunarChineseYearData {
+        fn periodic_duration_on_or_before(


Manishearth · 2025-10-01T17:32:35Z

components/calendar/src/cal/chinese.rs

+            base_moment: LocalMoment,
+            duration: Milliseconds,
+        ) -> LocalMoment {
+            let num_periods = ((rata_die - base_moment.rata_die + 1)


question: what's going on with this +1 / -1 ?

For functions like this either the math should be explained so that it's clear what it does, or the function should be documented so I can verify the math, doing neither leaves me guessing. Ideally both are documented.

My understanding is: this function is attempting to find the closest whole period from base_moment of period duration looking back from rata_die. The +/- 1 handles the edge cases, but I'd like a comment explaining how.

I initially wrote this formula before @robertbastian refactored it. The +1 / -1 is to get us to the final millisecond of the day. We're given a RD, but the function returns "on or before", so we add 1 to the RD and then subtract 1 from the millisecond. We do the -1 because if the new moon occurs exactly at midnight, it belongs to the following day.

idk, I can remove this, we don't have a requirement of millisecond precision here

I think we need the +1 / -1. @Manishearth was just asking why.

well it wouldn't be the end of the world if we didn't move a new moon at exactly the millisecond of midnight to the next day. lunar times are ± minutes from reality anyway

Either works, if we're being "inaccurate" there I'd still prefer a comment saying where we diverge.

diverge from what? this whole algorithm is hand waving
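For readers following the +1 / -1 discussion above, here is a minimal sketch of an "on or before" periodic lookup. It is not the code from the PR: `Moment`, `periodic_on_or_before`, and the plain `i64` day number are hypothetical stand-ins for `LocalMoment`, the PR's helper, and `RataDie`. It only illustrates the explanation given in the thread: add 1 to the day to reach the following midnight, then subtract 1 ms to land on the final millisecond of the requested day.

```rust
/// Milliseconds in one day.
const MS_PER_DAY: i64 = 24 * 60 * 60 * 1000;

/// A local instant: a day number plus milliseconds since local midnight.
/// Hypothetical stand-in for the PR's `LocalMoment`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Moment {
    pub day: i64,
    pub ms: i64,
}

impl Moment {
    /// Total milliseconds since midnight of day 0.
    fn to_ms(self) -> i64 {
        self.day * MS_PER_DAY + self.ms
    }

    /// Inverse of `to_ms`; Euclidean division keeps `ms` in `0..MS_PER_DAY`
    /// even for negative totals.
    fn from_ms(total: i64) -> Self {
        Moment {
            day: total.div_euclid(MS_PER_DAY),
            ms: total.rem_euclid(MS_PER_DAY),
        }
    }
}

/// Latest instant of the form `base + k * period_ms` (k any integer) that
/// falls on or before `day`, i.e. no later than that day's last millisecond.
pub fn periodic_on_or_before(day: i64, base: Moment, period_ms: i64) -> Moment {
    // `+ 1` moves to midnight at the start of the following day;
    // `- 1` steps back to the final millisecond of `day`, so an event at
    // exactly the next midnight is attributed to the following day.
    let end_of_day = (day + 1) * MS_PER_DAY - 1;
    let periods = (end_of_day - base.to_ms()).div_euclid(period_ms);
    Moment::from_ms(base.to_ms() + periods * period_ms)
}
```

With a period of 1.5 days starting at day 0, events fall at day 0 00:00, day 1 12:00, day 3 00:00, and so on. Asking for day 2 yields day 1 12:00, because the event at midnight opening day 3 belongs to day 3.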

components/calendar/src/cal/chinese.rs

Manishearth · 2025-10-01T17:39:58Z

components/calendar/src/cal/chinese.rs

    pub(crate) fn md_from_rd(self, rd: RataDie) -> (u8, u8) {
        debug_assert!(
-            rd < self.next_new_year() || !WELL_BEHAVED_ASTRONOMICAL_RANGE.contains(&rd),
+            rd < self.next_new_year(),


issue: I'm worried about hitting these again in V8: can we write a fuzzer for these?

Both RD-to-chinese and chinese-from-fields.

It's pretty easy, use cargo-fuzz.

but we have a test at the extreme values, right? I don't see what you'd be worried about, there's no floating-point math that can degenerate

For this PR I'd slightly prefer the well-behaved guard be kept around, and removing them is a separate change.

But if you feel confident about the assertions I guess it's fine.

this isn't using any logic from calendrical_calculations anymore, so that crate should not define how the assertions here behave

In general our calendar code has been full of broken invariants, and I'm not convinced this new code doesn't have that problem. The astronomical range isn't just a calendrical_calculations concept, it's an ICU4X one too, defined in calendrical_calculations because both need them. calendrical_calculations already defines two such constants, one for islamic, and one for chinese. ICU4X could internally define its own for Pingqi.

So I don't think "calendrical_calculations is no longer used so we no longer need to gate the assertions" is a valid reason. The question is whether we should be applying a valid astronomical range (well, a valid "approximation calculation" range) to invariants here at all. I'm okay with the answer being no, but I'm wary about these debug assertions, I already spent a significant amount of time playing whack-a-mole with the earlier ones.

it would be useful to be able to reproduce the issues V8 is having in our repo, so that we can eventually confidently remove these. right now it's unclear if they are needed or not

components/calendar/src/cal/chinese.rs

sffc · 2025-10-06T23:47:47Z

components/calendar/src/cal/chinese.rs

+// 2000-01-06T18:14 https://aa.usno.navy.mil/calculated/moon/phases?date=2000-01-01&nump=1&format=t
+const UTC_NEW_MOON: LocalMoment = LocalMoment {
+    rata_die: calendrical_calculations::gregorian::fixed_from_gregorian(2000, 1, 6),
+    local_milliseconds: ((18 * 60) + 14) * 60 * 1000,
+};


Suggestion: pick a new moon that is as close to the mean new moon as possible, i.e., one where the forces that cause the synodic month length to change are in equilibrium. Here's a good post on the subject: https://astronomy.stackexchange.com/questions/55048/what-causes-the-variation-in-the-length-of-the-synodic-month-besides-the-eccent

(does not block the PR)

the mean synodic month length we use is for the year 2000, which is why I'm using this new moon
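A small sketch of how a reference new moon like the one in this hunk can be shifted to local time and projected forward by mean lunations. Everything here is illustrative: the `Moment` tuple, `to_local`, and `nth_mean_new_moon` are hypothetical names, the day number 730125 assumes R.D. 730120 is 2000-01-01, and the synodic month of about 29.530589 days (2,551,442,890 ms) is an assumed round value, not the constant used in the PR.

```rust
/// Milliseconds in one day.
const MS_PER_DAY: i64 = 86_400_000;

/// (day number, milliseconds since midnight).
/// Hypothetical stand-in for the PR's `LocalMoment`.
pub type Moment = (i64, i64);

/// Shift a UTC moment by a fixed offset, carrying overflow into the day number.
pub fn to_local(utc: Moment, offset_ms: i64) -> Moment {
    let total = utc.1 + offset_ms;
    (utc.0 + total.div_euclid(MS_PER_DAY), total.rem_euclid(MS_PER_DAY))
}

/// Mean new moon `n` lunations after (or, for negative `n`, before) `base`,
/// assuming every lunation lasts exactly `synodic_month_ms`.
pub fn nth_mean_new_moon(base: Moment, n: i64, synodic_month_ms: i64) -> Moment {
    let total = base.0 * MS_PER_DAY + base.1 + n * synodic_month_ms;
    (total.div_euclid(MS_PER_DAY), total.rem_euclid(MS_PER_DAY))
}
```

For example, `to_local((730125, 65_640_000), 9 * 3_600_000)` moves 2000-01-06T18:14 UTC to 03:14 on the following day at UTC+9, which is why the choice of offset can change the day a month starts on.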

sffc · 2025-10-06T23:49:30Z

components/calendar/src/cal/chinese.rs

+    ///
+    /// Stays anchored in the Gregorian calendar, even as the Gregorian calendar drifts
+    /// from the seasons in the distant future and distant past.
+    fn simple(utc_offset: Milliseconds, related_iso: i32) -> LunarChineseYearData {


Suggestion (optional): Make a version of this function where the new moon and solar term calculations are pluggable, and then assert that when you plug in the Reingold ones, they match HKO. This would increase our confidence that this approximation is a principled approximation.
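One possible shape for the "pluggable" version suggested here, as a rough sketch only. The `Astronomy` trait, `lacks_major_term`, and `FixedPeriods` are hypothetical names that do not exist in ICU4X, and the fixed integer periods are toy values chosen to exercise the generic code, not real astronomy. The idea is that the calendar logic is written once against the trait, so a Reingold-based source and a simple approximation could both be plugged in and compared against reference data.

```rust
/// Hypothetical abstraction over the two astronomical inputs the calendar needs.
pub trait Astronomy {
    /// Day number of the first new moon on or after `day`.
    fn new_moon_on_or_after(&self, day: i64) -> i64;
    /// Day number of the first major solar term on or after `day`.
    fn major_term_on_or_after(&self, day: i64) -> i64;
}

/// Returns true if the lunar month starting on `month_start` (a new-moon day)
/// contains no major solar term. Such months are the leap-month candidates.
pub fn lacks_major_term<A: Astronomy>(astro: &A, month_start: i64) -> bool {
    let next_month = astro.new_moon_on_or_after(month_start + 1);
    astro.major_term_on_or_after(month_start) >= next_month
}

/// Toy source with fixed whole-day periods, both anchored at day 0.
pub struct FixedPeriods {
    pub moon_period: i64,
    pub term_period: i64,
}

/// Smallest multiple of `period` that is greater than or equal to `day`.
fn next_multiple(day: i64, period: i64) -> i64 {
    // ceil(day / period) computed as -floor(-day / period).
    -((-day).div_euclid(period)) * period
}

impl Astronomy for FixedPeriods {
    fn new_moon_on_or_after(&self, day: i64) -> i64 {
        next_multiple(day, self.moon_period)
    }
    fn major_term_on_or_after(&self, day: i64) -> i64 {
        next_multiple(day, self.term_period)
    }
}
```

A real implementation of the trait would wrap the Reingold calculations or the mean-period approximation; the generic function stays unchanged, which is what would make a side-by-side comparison against HKO data possible.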

sffc

my remaining comments are stylistic or suitable for a follow-up

robertbastian requested review from a team, Manishearth and sffc as code owners September 30, 2025 19:42

robertbastian force-pushed the pinqi2 branch 3 times, most recently from 4d87b46 to 00dabee Compare September 30, 2025 20:00

sffc requested changes Sep 30, 2025

robertbastian force-pushed the pinqi2 branch from e8fcb02 to 77cf05e Compare September 30, 2025 21:46

robertbastian requested a review from sffc September 30, 2025 22:19

Manishearth mentioned this pull request Sep 30, 2025

Hardcode KASI-derived data #7008

Merged

robertbastian marked this pull request as draft October 1, 2025 10:32

robertbastian force-pushed the pinqi2 branch 7 times, most recently from 077ed47 to acf1c4a Compare October 1, 2025 16:52

sffc and others added 6 commits October 1, 2025 19:03

Initial commit of FastPinqi

d0918d2

Pingqi with a G, and add more comments

b774e95

constants

d8df178

don't calculate full sui

439ca13

store durations as milliseconds

1555411

move

c06b435

robertbastian force-pushed the pinqi2 branch from acf1c4a to c06b435 Compare October 1, 2025 17:03

robertbastian marked this pull request as ready for review October 1, 2025 17:03

robertbastian commented Oct 1, 2025

Manishearth reviewed Oct 1, 2025

move changeover to avoid corrective term

983058a

sffc requested changes Oct 1, 2025

fix M12L

bf55426

robertbastian requested a review from sffc October 6, 2025 14:25

sffc reviewed Oct 6, 2025

robertbastian added 3 commits October 8, 2025 12:52

Merge remote-tracking branch 'upstream/main' into pinqi2

a070eaf

review

5a1916c

move

e2bd0da

robertbastian requested a review from sffc October 8, 2025 11:17

recover assertion opt-outs

cb02917

robertbastian force-pushed the pinqi2 branch from 4459f75 to cb02917 Compare October 8, 2025 11:23

remove icu consistency checks

48ad05b

sffc approved these changes Oct 8, 2025

robertbastian merged commit b31494d into unicode-org:main Oct 8, 2025
31 checks passed

robertbastian deleted the pinqi2 branch October 8, 2025 21:03
