-
Notifications
You must be signed in to change notification settings - Fork 1k
Fix triggers for accesses wider than XLEN #2198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
riscv/mmu.cc
Outdated
|
|
||
| for (size_t offset = 0; offset < data_size; offset += sizeof(reg_t)) { | ||
| auto this_size = std::min(data_size - offset, sizeof(reg_t)); | ||
| auto this_data = reg_from_bytes(this_size, bytes + offset); | ||
| check_triggers(operation, addr + offset, virt, this_size, this_data); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, consider the following alternative:
| for (size_t offset = 0; offset < data_size; offset += sizeof(reg_t)) { | |
| auto this_size = std::min(data_size - offset, sizeof(reg_t)); | |
| auto this_data = reg_from_bytes(this_size, bytes + offset); | |
| check_triggers(operation, addr + offset, virt, this_size, this_data); | |
| } | |
| check_triggers(operation, addr, virt, data_size, reg_from_bytes(std::min(data_size, sizeof(reg_t)), bytes)); |
The motivation is:
- AFAIU, an access wider then
sizeof(reg_t)(please note, this is not XLEN) is already split into chunks at this point e.g.:
riscv-isa-sim/riscv/v_ext_macros.h
Lines 1267 to 1271 in b92958a
for (reg_t fn = 0; fn < nf; ++fn) { \ elt_width##_t val = P.VU.elt<elt_width##_t>(vs3 + fn * emul, vreg_inx); \ MMU.store<elt_width##_t>( \ baseAddr + (stride) + (offset) * sizeof(elt_width##_t), val); \ } \
This is why I think the assertion is valid. - However, if there is such an access, an alternative aproach that checks a number of "chunks" that are no wider then the size of
reg_tbreaks address matching.
Consider the following:
- 16-byte-wide access on address 0x0.
- There are two triggers in a chain:
- The first checks whether an address less then 0x1 is accessed.
- The second checks whether an address greater or equal to 0x8 is accessed.
- If the access is split into two 8-byte wide checks, the chain will not match (the first "chunk" matches the first trigger, the second "chunk" matches the second trigger, neither matches the chain as a whole).
While the scenario is synthetic, IMHO the new behavior better complies with the spec recommendation for address match triggers (select is 0 (address)) to consider all the virtual addresses being accessed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@en-sc AMOCAS.Q, FLQ, and FSQ perform accesses wider than sizeof(reg_t). They will fail the assertion. With that in mind, what do you recommend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. I can apply your suggestion without reinstating the assertion. I guess that works for me. Seems like an oversight in the trigger spec to not consider atomic accesses greater than XLEN bits, though.
2915a24 to
1fc5c48
Compare
I believe that @en-sc's comment here is correct: #2161 (comment) Nevertheless, failing an assertion when someone sets a trigger on memory accessed by a wide access is not reasonable behavior for Spike. Better to do something that follows the principle of least surprise, despite the debug spec's lack of clarity on this point.
I'd like to remove this routine eventually, but let's make it a bit less visually unappealing in the meantime.
1fc5c48 to
dfbd749
Compare
I believe that @en-sc's comment here is correct: #2161 (comment)
Nevertheless, failing an assertion when someone sets a trigger on memory accessed by a wide access is not reasonable behavior for Spike. Better to do something that follows the principle of least surprise, despite the debug spec's lack of clarity on this point.
This PR also fixes some unrelated code-quality issues in adjacent code.