Description
Hi,
Our SP developers have reported an IRQ-related issue. The problem is as follows:
Hafnium receives a physical IRQ (pIRQ) and signals the S-EL1 UP SP via a virtual IRQ (vIRQ). The IRQ is level-triggered, and it is the SP's handler that deasserts the interrupt line. However, Hafnium performs the EOI in S-EL2 before switching back to the SP:
	if (v_intid != SPURIOUS_INTID_OTHER_WORLD) {
		/*
		 * End the interrupt to drop the running priority. It also
		 * deactivates the physical interrupt. If not, the interrupt
		 * could trigger again after resuming current vCPU.
		 */
		plat_interrupts_end_of_interrupt(intid);
	}
so as soon as we resume the SP the same pIRQ is immediately re-presented and the CPU returns to Hafnium’s irq_lower handler.
With instrumentation we observe the SP’s PC stuck at the IRQ handler entrypoint (0x93603280) and the system bouncing between Hafnium and the SP without making forward progress.
[3.611642][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.617131][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93600094
[3.622696][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.628344][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93603280
[3.633909][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.639558][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93603280
[3.645122][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.650771][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93603280
[3.656336][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.661985][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93603280
[3.667550][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.673198][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93603280
[3.678764][HF] (0) ERROR: ffa_interrupts_handle_secure_interrupt
[3.684412][HF] (0) ERROR: intid: 681 vm: 0x8001, pc: 0x93603280
Additionally, we found that removing Hafnium’s EOI allows the SP to complete its IRQ handler.
We also noticed the comment stating that delaying the EOI could cause a re-trigger after resuming the current vCPU. For level-triggered IRQs our observation appears to be the opposite: an early EOI, issued before the device deasserts the line, causes an immediate re-trigger and the ping-pong described above. Do you have any suggestions?
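To make the ordering problem concrete, here is a toy model in C of a single level-triggered interrupt. It is only a sketch, not Hafnium or GIC driver code: every name in it is hypothetical, and it assumes that the EOI both drops priority and deactivates the interrupt (as the quoted comment states) and that a signalled physical IRQ preempts the SP before its handler executes any instruction.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for one level-triggered interrupt (hypothetical model). */
struct level_irq {
	bool line_asserted; /* Device is still driving the interrupt line. */
	bool active;        /* Acknowledged but not yet deactivated. */
};

/* Signalled to the CPU while the line is high and the IRQ is not active. */
static bool irq_signalled(const struct level_irq *irq)
{
	return irq->line_asserted && !irq->active;
}

/* Acknowledge: the interrupt becomes active and stops being signalled. */
static void irq_acknowledge(struct level_irq *irq)
{
	assert(irq_signalled(irq));
	irq->active = true;
}

/* EOI modelled as combined priority drop and deactivation. */
static void irq_end(struct level_irq *irq)
{
	irq->active = false;
}

/* The SP handler quiesces the device, which lowers the line. */
static void sp_handler(struct level_irq *irq)
{
	irq->line_asserted = false;
}

/*
 * Returns how many times the hypervisor is entered before the SP handler
 * completes, or max_traps if the handler never gets to run.
 */
int traps_until_handled(bool eoi_before_resume, int max_traps)
{
	struct level_irq irq = { .line_asserted = true, .active = false };
	bool handled = false;
	int traps = 0;

	while (!handled && traps < max_traps && irq_signalled(&irq)) {
		traps++;
		irq_acknowledge(&irq);

		if (eoi_before_resume) {
			/* Line is still high, so the IRQ is signalled again. */
			irq_end(&irq);
		}

		/* Resume the SP: it only runs if no physical IRQ is signalled. */
		if (!irq_signalled(&irq)) {
			sp_handler(&irq);
			handled = true;
		}

		if (!eoi_before_resume) {
			/* Deactivate only after the line has been lowered. */
			irq_end(&irq);
		}
	}

	return traps;
}
```

In this model, `traps_until_handled(true, 8)` returns 8 (the cap, meaning the handler never ran), while `traps_until_handled(false, 8)` returns 1. This matches what we see on hardware, but the model says nothing about which component should perform the deferred deactivation.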