-
Notifications
You must be signed in to change notification settings - Fork 29
HyperRAM test software #437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| hyperram_memset_wd: | ||
| addi a2, a2, -8 | ||
| bltz a2, memset_wd_8fix | ||
| memset_wd_8: | ||
| csw a1, -4(ca0) | ||
| csw a1, -8(ca0) | ||
| cincoffset ca0, ca0, -8 | ||
| addi a2, a2, -8 | ||
| bgez a2, memset_wd_8 | ||
| memset_wd_8fix: | ||
| addi a2, a2, 8 | ||
| bgtz a2, memset_b_desc_tail | ||
| cret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this has an inner loop of two stores rather than the eight of the ascending version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code was just written to perform back-to-back writes until the maximum burst length is achieved; that's just two words for a descending burst, but 8 for an ascending burst. It makes little difference in practice because the write coalescing logic will not time out and flush the burst write to HyperRAM for dozens of cycles and the Ibex will complete the loop overhead in far fewer cycles than that.
| } | ||
| } | ||
|
|
||
| // ----- Avoid the use of randomisation before this point ----- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any use of randomisation after this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not presently, at least not on that loop iteration. The point was to demarcate - above and below - the region where it must be avoided.
elliotb-lowrisc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I appreciated the chance to read some assembly functions.
sw/cheri/common/hyperram_perf_test.S
Outdated
| copy4: | ||
| lw a5, (ca1) | ||
| cincoffset ca1, ca1, 4 | ||
| sw a5, (ca0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these lw and sw instructions be written clw and csw for consistency with copy32 above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manual editing after noting I had been using a mix of clw and lw etc. throughout. Tried to tidy for consistency and just missed those, thanks.
sw/cheri/common/hyperram_perf_test.S
Outdated
| clw a4, (ca1) | ||
| clw a5, 4(ca1) | ||
| clw t0, 8(ca1) | ||
| clw t1, 12(ca1) // End of 16 bytes from first buffer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct? It looks like it just read 16 bytes from the "second" buffer (as defined above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, I'll change that. It just meant the first buffer to be read, rather than how they were named in the function API.
sw/cheri/common/hyperram_perf_test.S
Outdated
| csrw 0x7c0, a0 | ||
| cret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accidental indentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh; what's happened there is that I've copied some code from another .S file that uses tabs instead of spaces, and the tab characters are not visible in the editor I used.
- performance test; memory-to-memory copy using bursts. - write tests; specialised implementations of `memset`-like memory initialisation code to exercise the write coalescing logic of the HyperRAM controller interface. - alignment tests; exercise all possible alignments of source and destination addresses, to check the behaviour of wrapping read bursts and coalsced linear write bursts. - buffering test, issuing pseudo-random write traffic into a destination buffer with known content.
This PR extends the existing HyperRAM test software and adds new tests specific to the recent HyperRAM development work.
Since the HyperRAM interface is now considerably more involved, retaining data in read buffers (both Instruction and Data ports) as well as a write burst that is under construction, additional tests are introduced to catch edge cases and to test for any coherency issues, both between write and read traffic,as well as between the Instruction and Data ports.
To this end a couple of the tests make extensive use of handwritten assembler to achieve the bus traffic and performance. Compiled code was used initially but this results in well-separated individual read or write transactions and thus does not stress the logic.
A note on coding style
Those tests that remain in C++ have also been carefully expressed in places to ensure that the traffic occurs in a short time interval, because otherwise - for example - the hardware will have flushed write traffic out to memory and the test would not be exercising read-write address collisions as intended. In particular note that the pseudo-random number generation is a slow process at the level of CPU cycles/bus transactions, and must thus be avoided between bus transactions that must occur close together.
A brief overview of the modified hardware:
The tests added are as follows:
memset-like memory initialisation code to exercise the write coalescing logic of the HyperRAM controller interface.Sample output from the current HyperRAM implementation (FPGA):
Sample output from the modified HyperRAM implementation (FPGA) - to be raised in another PR.
The test
checks/hyperram_testhas been modified to run indefinitely until one or more failures is observed within an iteration, at which point it will halt. It may thus be used to soak test an FPGA build overnight and has run throughout a couple of nights without failure.