-
-
Notifications
You must be signed in to change notification settings - Fork 382
nv2a: Implement PTIMER alarm interrupts #2568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
nv2a: Implement PTIMER alarm interrupts #2568
Conversation
79e6178 to
4a6a08d
Compare
|
I tested only other game that freezes during intro and is unrelated issue. NHL Rivals 2004 Note: this game goes in-game on cxbx-reloaded. Also tested NFL Fever 2004 no change. |
PatrickvL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a bug
hw/xbox/nv2a/ptimer.c
Outdated
| uint32_t current_time_low = (guest_clock & CLOCK_LOW_MASK) << 5; | ||
| uint32_t current_time_high = (guest_clock >> 27) & CLOCK_HIGH_MASK; | ||
|
|
||
| if (d->ptimer.alarm_time <= current_time_low || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is wrongly succeeding when current time low bits is lower or equal to alarm time (low bits) but current high bits are less than alarm high bits.
It seems wiser to reconstruct the full set of alarm bits and do a single comparison of that against the full bits of the current time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is correct due to the fact that there is no mechanism to specify the high bits of the alarm time.
- If current_time_low < alarm the first condition is false so behavior is governed by whether current_time_high is > alarm (the timer has wrapped).
- If current_time_low >= alarm the timer will fire (I'm assuming you meant that it wrongly succeeds if current time low bits is higher than alarm time?). If the high bits were controllable, this could indeed be erroneous.
I'm not sure what the HW behavior is if the alarm is set into the ambiguous past or if the caller is able to race the clock and set TIME_1 into the past before the alarm fires.
If the values observed being set to the ALARM_0 weren't so close to the current time I would've expected it to be a delta into the future, in which case it would be unambiguous and using 64-bit would certainly make things clearer.
The fact that it seems to really just be the low time bits makes me think that it's doing a 32-bit comparison with special affordance for the case where the value is very close to rollover (as implemented here).
I've also seen values with the high bit set, so it's not a case of using a 31-bit value to catch rollover.
I suspect this may be a case where an assumption was made that the driver writer would do the right thing and that regular users either wouldn't know the register existed or wouldn't attempt ~nonsensical things like setting a nanosecond timer into the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out the alarm is not intended to be a one-shot (the game I was using to infer the behavior is just very spammy about updating the alarm time). I ended up writing a trivial test case by adding a counter to pbkit's DPC that is incremented on alarm interrupts and found that it fires every time the low 32 bits of the counter rolls over.
mborgerson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Some initial feedback
hw/xbox/nv2a/ptimer.c
Outdated
|
|
||
| /* PTIMER - time measurement and time-based alarms */ | ||
| static uint64_t ptimer_get_clock(NV2AState *d) | ||
| static uint64_t ptimer_get_host_clock(NV2AState *d) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This host/guest terminology here is a bit confusing. "Host" typically refers to the machine running xemu (and that is not the time domain source we are using--we are actually using the virtual clock here which stops when emulation is paused) and "guest" typically means the emulated machine state. Maybe host/guest → absolute/relative would be more clear in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to absolute/relative. I think it's still unusual enough to justify a bit of documentation.
hw/xbox/nv2a/nv2a.c
Outdated
| VMSTATE_UINT32(ptimer.numerator, NV2AState), | ||
| VMSTATE_UINT32(ptimer.denominator, NV2AState), | ||
| VMSTATE_UINT32(ptimer.alarm_time, NV2AState), | ||
| VMSTATE_UINT32(ptimer.alarm_time_high, NV2AState), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be compatible with older vmstate versions, we should append new fields to the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, also added a note to hopefully ensure future additions are also appended.
hw/xbox/nv2a/nv2a.c
Outdated
| } | ||
|
|
||
| /* PTIMER */ | ||
| ptimer_process_alarm(d); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this function is called regularly, alarm management should be independent from this interrupt management function. We should use QEMUTimer instead for time based event signaling
For reference:
xemu/hw/xbox/mcpx/nvnet/nvnet.c
Line 669 in 9b7a43f
| timer_mod(s->autoneg_timer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks for the pointer!
4a6a08d to
f0310fc
Compare
|
Looking at HW behavior, it seems that the alarm is not reset after firing, it's up to the application to clear it or disable the interrupt. |
|
Updated to match observed HW behavior in a trivial test that sets the alarm to an arbitrary value and increments a counter every time the interrupt fires. Hopefully this also fixes the freezes observed in DJFfNY (I've restarted dozens of times without my tracing plugin attached now and have been unable to repro). |
|
Tested this build and freezing on intro fixed but freezing still occur when on start and first fight intro. |
|
Also tested this with Def Jam Fight for NY. Game loads now, and navigating menus seems reliable. But it freezes randomly during cut scenes (seemingly at different points, so truly random). Game is still unplayable. |
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request implements PTIMER alarm interrupts, which appears to have been a partially implemented feature. The changes involve adding the necessary interrupt handling logic, state management for the timer alarm, and VM state integration. My review focuses on the correctness of the timer emulation logic. I've found a potential race condition when reading the 64-bit timer value and a critical issue that could lead to an interrupt storm. The rest of the implementation seems reasonable.
| if (alarm_time <= now) { | ||
| d->ptimer.pending_interrupts |= NV_PTIMER_INTR_0_ALARM; | ||
| d->ptimer.alarm_time_high = now >> 32; | ||
| nv2a_update_irq(d); | ||
| } | ||
|
|
||
| schedule_qemu_timer(d); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of ptimer_alarm_fired will likely cause an interrupt storm. After an alarm fires and the interrupt is raised, schedule_qemu_timer(d) is called unconditionally. Since the alarm time has not been changed by the guest, schedule_qemu_timer will see that the alarm is still in the past and schedule the timer to fire again immediately. This will repeat, causing a high CPU load and flooding the guest with interrupts.
The line d->ptimer.alarm_time_high = now >> 32; does not prevent this and might be incorrect as it modifies the alarm time without guest intervention.
A safer approach is to treat the QEMU timer as a one-shot event. Once it fires, the interrupt is raised, and the timer is not rescheduled. The guest is then responsible for clearing the interrupt and programming a new alarm, which will in turn schedule a new QEMU timer via ptimer_write.
I suggest modifying the logic to only reschedule the timer in case of a spurious wakeup (where alarm_time > now).
| if (alarm_time <= now) { | |
| d->ptimer.pending_interrupts |= NV_PTIMER_INTR_0_ALARM; | |
| d->ptimer.alarm_time_high = now >> 32; | |
| nv2a_update_irq(d); | |
| } | |
| schedule_qemu_timer(d); | |
| } | |
| if (alarm_time <= now) { | |
| d->ptimer.pending_interrupts |= NV_PTIMER_INTR_0_ALARM; | |
| nv2a_update_irq(d); | |
| } else { | |
| /* Spurious wakeup, reschedule. */ | |
| schedule_qemu_timer(d); | |
| } |
| case NV_PTIMER_TIME_0: { | ||
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | ||
| r = now & 0xffffffff; | ||
| } break; | ||
| case NV_PTIMER_TIME_1: { | ||
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | ||
| r = (now >> 32) & CLOCK_HIGH_MASK; | ||
| } break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading a 64-bit timer value from two separate 32-bit register reads can be racy. If the timer ticks over and crosses a 32-bit boundary between the read of the low part (NV_PTIMER_TIME_0) and the high part (NV_PTIMER_TIME_1), the guest will get a corrupted time value. Both case statements call get_ptimer_clock independently, which can return different values.
A common hardware pattern to solve this is to latch the high part of the timer when the low part is read. The subsequent read of the high part returns the latched value.
To implement this, you could add a time_high_latched field to the ptimer struct in nv2a_int.h and initialize it in ptimer_reset. Then, modify the read logic as suggested.
| case NV_PTIMER_TIME_0: { | |
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | |
| r = now & 0xffffffff; | |
| } break; | |
| case NV_PTIMER_TIME_1: { | |
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | |
| r = (now >> 32) & CLOCK_HIGH_MASK; | |
| } break; | |
| case NV_PTIMER_TIME_0: { | |
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | |
| d->ptimer.time_high_latched = (now >> 32) & CLOCK_HIGH_MASK; | |
| r = now & 0xffffffff; | |
| } break; | |
| case NV_PTIMER_TIME_1: { | |
| r = d->ptimer.time_high_latched; | |
| } break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gemini-code-assist: Latching could be the right approach, but we should review access patterns and confirm what hardware actually does. The proposed change might not be correct if NV_PTIMER_TIME_1 is read before NV_PTIMER_TIME_0, for instance. I suspect hardware leaves this to be checked by software anyway and that the current solution is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You raise a very valid point, @mborgerson, regarding the need to confirm actual hardware behavior and access patterns. My suggestion was based on a common hardware pattern to mitigate race conditions when a 64-bit timer is exposed via two separate 32-bit registers.
The core issue is that in the current implementation, both NV_PTIMER_TIME_0 and NV_PTIMER_TIME_1 independently call get_ptimer_clock(d, ptimer_get_absolute_clock(d)). This means that if the guest reads NV_PTIMER_TIME_0, then the timer ticks, and then the guest reads NV_PTIMER_TIME_1, the resulting 64-bit value constructed by the guest will be inconsistent (the high part will be from a later time than the low part). This is a classic race condition that can lead to incorrect time values for the guest OS.
My proposed solution addresses this specific race condition by latching the high 32 bits of the timer when the low 32 bits (NV_PTIMER_TIME_0) are read. This ensures that when the guest subsequently reads the high 32 bits (NV_PTIMER_TIME_1), it receives the value that was current at the time of the NV_PTIMER_TIME_0 read, thus providing a consistent 64-bit snapshot.
Your concern about NV_PTIMER_TIME_1 being read before NV_PTIMER_TIME_0 is important. If the hardware indeed allows NV_PTIMER_TIME_1 to be read independently and still expects consistency, or if it has a different latching mechanism (e.g., a dedicated register write to latch both parts), then my proposed solution might not perfectly mirror that specific hardware behavior. However, in the absence of precise hardware documentation, the read low then read high pattern with implicit latching on the low read is a very common design to ensure atomic 64-bit reads from split registers. The current implementation, without any form of latching, will always be susceptible to the described race condition if the guest attempts to reconstruct a 64-bit timer value from two separate reads.
Even if software can work around this by reading the high part multiple times until it stabilizes (a common but inefficient software workaround), the goal of emulation is often to present a consistent and predictable hardware interface. Providing an atomic 64-bit read through a latching mechanism is generally a more robust emulation of a 64-bit timer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting response. My expectation is also that the HW isn't doing magic here and it really is left up to the SW to detect race-rollover if it actually cares to. When I was debugging the issue that precipitated this change I saw several places where the timer was being read in a tight loop, presumably to work around the potential race.
| case NV_PTIMER_TIME_0: { | ||
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | ||
| r = now & 0xffffffff; | ||
| } break; | ||
| case NV_PTIMER_TIME_1: { | ||
| uint64_t now = get_ptimer_clock(d, ptimer_get_absolute_clock(d)); | ||
| r = (now >> 32) & CLOCK_HIGH_MASK; | ||
| } break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gemini-code-assist: Latching could be the right approach, but we should review access patterns and confirm what hardware actually does. The proposed change might not be correct if NV_PTIMER_TIME_1 is read before NV_PTIMER_TIME_0, for instance. I suspect hardware leaves this to be checked by software anyway and that the current solution is correct.
hw/xbox/nv2a/nv2a.c
Outdated
| .post_save = nv2a_post_save, | ||
| .post_load = nv2a_post_load, | ||
| .pre_load = nv2a_pre_load, | ||
| // NOTE: New fields should be appended, regardless of similarity to existing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's drop the comment. VMState is documented elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, though I think I've broken this at least a couple times over the years by inserting or resizing things inline.
I wonder if the review bots read AGENTS.md and could be instructed to look for this specifically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe so! Not sure about the actual details. Maybe they can be instructed to read select pages in docs/ based on related subsystem under review.
hw/xbox/nv2a/nv2a_regs.h
Outdated
| #define NV_PMC_INTR_0 0x00000100 | ||
| # define NV_PMC_INTR_0_PFIFO (1 << 8) | ||
| # define NV_PMC_INTR_0_PGRAPH (1 << 12) | ||
| # define NV_PMC_INTR_0_PTIMER_PENDING (1 << 20) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To match other fields, drop _PENDING suffix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| uint64_t diff_ns = 0; | ||
| if (alarm_time > now) { | ||
| diff_ns = ptimer_ticks_to_ns(d, (alarm_time - now) >> 5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit lost tracking the bit shifting. Is the left shifting in get_ptimer_clock just putting the clock in register format? We shift left in get_ptimer_clock, then shift right here to compute actual clock deltas. I wonder if this means we should really just do the shifting/masking at the register interface (e.g. in ptimer_read)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's just moving in and out of the format used in the registers.
Aren't the read/write callbacks expected to act on data in HW format?
I wonder if some helper macros would be appropriate rather than messing with the guest-accessible callback behavior (but maybe I'm just not following your suggestion).
0dce0ec to
a3daf09
Compare
|
I've been continuing the investigation in #1124 - I think the remaining softlock is timing related rather than an issue with the PTIMER alarm code added in this PR. To summarize my understanding at the moment, the game tries to keep a consistent timer interval and does some checking using the CPU timestamp counter ( This will require a separate fix to address. At this point I've created additional PTIMER tests and am relatively confident that the PTIMER behavior on its own matches hardware. In the meantime you may be able to get the game to work by enabling the qemu |
Looks like the PTIMER alarm handling was started but never finished. This implements my best guess at what the HW is doing (it's hard to test with a nanosecond timer). It fixes the blocked loading for Def Jam: Fight for NY, which I determined to be caused by a event handle never being set due to the lack of an alarm interrupt in the same DPC that handles video blanking and other PMC interrupts. The PTIMER pending flag is known from the nxdk: https://github.com/XboxDev/nxdk/blob/d0ae96acc926a6c3f6254b289a1a9a3c1db182f1/lib/pbkit/outer.h#L83 Fixes xemu-project#1124
- Timers on HW continue to fire every time the low 27 bits wrap around. - Standardized the logic such that the "now" clock is shifted 5 bits to match the setter and observed HW behavior.
a3daf09 to
b03d8a4
Compare




Looks like the PTIMER alarm handling was started but never finished. This implements my best guess at what the HW is doing (it's hard to test with a nanosecond timer). It fixes the blocked loading for Def Jam: Fight for NY, which I determined to be caused by a event handle never being set due to the lack of an alarm interrupt in the same DPC that handles video blanking and other PMC interrupts. The PTIMER pending flag is known from the nxdk:
https://github.com/XboxDev/nxdk/blob/d0ae96acc926a6c3f6254b289a1a9a3c1db182f1/lib/pbkit/outer.h#L83
PTIMER tests
Fixes #1124