-
-
Notifications
You must be signed in to change notification settings - Fork 80
[EV3] Enable caching #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EV3] Enable caching #364
Conversation
|
fwiw this seems to improve system performance so much that pybricks/support#2303 doesn't manifest anymore (at least, not in a way that is visible to the naked eye) |
04240ad to
22c31be
Compare
Can confirm visually that this is way better! If you look carefully you can still see it with the NXT color sensor but there should be other ways to smooth it out if we want to. (EDIT: if I apply another pending patch on top of yours to use EDIT2: Woah, this cuts the ADC read time from 2100 us to 86 us 😄 |
|
The new 3 commits rebase cleanly. Let me know if you'd like me to rebase and merge this yet or refine further. |
This function flushes the D-cache, not the I-cache.
This allows the main memory to be write-back cached. Special note regarding DMA buffers and the uncached alias mapping: DMA buffers and uncached data cannot safely share cache lines with unrelated data! We use a special macro to align these buffers so that issues do not occur. If DMA RX buffers share cache lines with other data, it can lead to race conditions which corrupt data. If we flush and invalidate the cache before performing the DMA, reading the unrelated data can cause the cache line to be loaded back into cache, cancelling out the invalidation. If we wait until after performing the DMA and perform a flush and invalidate, the flush can corrupt part of the data which was written by the DMA. If we only perform an invalidate, writes to the unrelated data can be lost. A similar race condition can happen if an uncached alias is used to read/write to data which sits next to unrelated data. This currently applies to the ADC RX buffer. The USB driver's CPPI descriptors are safe as they happen to already be cache line aligned. The solution we use here is to place each buffer in its own cache-line-aligned block of memory and only perform an invalidate upon DMA completion. This makes sure that no unrelated data can get caught up in the same cache lines. This issue does not affect DMA TX buffers written using the normal cached mapping. Those are handled by flushing the cache.
|
I think this is fine. I just wanted someone else to look over it in case I missed a location where a driver depended on DMA. I also noticed that the PRU soft UART doesn't use DMA, which is potential room for improvement in the future. |
|
Yeah, and the PRU writes only one byte at the time. This was copied from other existing implementations (and stripped down by about 75% to drop all kinds of Linux wrappers), but it should be able to write more at once. // Attempt to write one byte with the PRU UART. We could be sending
// more than one byte at a time, but we keep it consistent with the
// current PRU API. We don't send much data to EV3 sensors anyway.
if (pbdrv_uart_ev3_pru_write_bytes(pdata->peripheral_id, &uart->write_buf[uart->write_pos], 1)) {
uart->write_pos++;
}Another issue is that it has two ringbuffers due to how the APIs are glued together, which is also redundant. Thanks, merged too! |
| void pbdrv_cache_prepare_after_dma(const void *buf, size_t sz); | ||
|
|
||
| // Accesses a variable via the uncached memory alias | ||
| #define PBDRV_UNCACHED(x) (*(volatile __typeof__(x) *)((uint32_t)(&(x)) + 0x10000000)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems very specific to EV3, so we should rename things accordingly. Or just put this in the tiam1808 library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely not in tiam1808, since it's specific to the way we've specifically programmed the MMU. PBDRV_EV3_UNCACHED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK. Then I would expect a macro for the 0x10000000 value then that is shared between this and where we program the MMU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also wondering about the uint32_t. Would it make more sense to use uintptr_t?
This depends on the in-progress USB and MMU work, but this enables the caches on the EV3 (caching the main system RAM).