Alternative function for potential optimization of data copying in stream buffers #1233
…am buffers. Useful for optimization in systems that allow DMA to be used only in some memory areas.
Thanks for contributing to FreeRTOS+TCP. Since the memory allocator for TCP stream buffers can be configured if needed by defining … I'm curious to know about the usage of DMA based …
@tony-josi-aws |
Because of the features of the platform, and to save the cost of context switching, the readiness flag is polled, but this does not prevent the copy from being very fast.
Thanks for the update. So the
That's a good improvement; was it measured using iperf? I'm also wondering which hardware platform you are using.
You can take a look at this page, if you haven't already: TCP/IP Stack Network Buffers Allocation Schemes and their implication on simplicity, CPU load, and throughput performance, to see if …
Yes, that's right. This increases the speed of copying.
No, the check was carried out with an algorithm similar to the actual application. I can't say which platform yet; it's a trade secret. But I can describe some of its features. It is a video processing chip, similar to those in GoPro or other such cameras, but with some interesting effects. The CPU is a 32-bit RISC-V. TCM is used for firmware operation. SRAM stores stack buffers and other buffers that DMA should work with. The main flow is uploading data over the network to DDR, processing it, and downloading it back to the PC. This is where the bottleneck is: DDR is very slow memory compared to SRAM, and byte-by-byte copying takes a very long time. DMA does this very quickly and in large transactions.
Thank you for this suggestion. I did this a few days ago and it didn't have the desired effect. The allocator is not currently in use. But that doesn't solve the whole problem.
/bot run formatting
This reverts commit 22105bc.
Description
On some platforms, copying data with the CPU is inefficient.
The easiest and fastest alternative is to use DMA.
To enable this, the standard memcpy used for stream buffers is replaced with a DMA-based memcpy.
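As a rough illustration of what a DMA-based memcpy with a polled readiness flag might look like, here is a minimal host-runnable sketch. Everything in it is an assumption made for illustration: `ExampleDmaChannel_t`, `xExampleDma`, `prvExampleDmaHardwareStep` and `pvExampleDmaMemCpy` are hypothetical names, and the "hardware" is simulated with a plain struct and `memcpy`. A real port would program the platform's DMA controller registers instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical DMA channel. On real hardware this would be a memory-mapped
 * register block; here it is a plain struct so the sketch runs on a host. */
typedef struct
{
    const void * volatile pvSource;
    void * volatile pvDest;
    volatile size_t uxLength;
    volatile uint32_t ulStart; /* Written by software to start a transfer. */
    volatile uint32_t ulReady; /* Set by "hardware" when the transfer is done. */
} ExampleDmaChannel_t;

static ExampleDmaChannel_t xExampleDma;

/* Stand-in for the DMA engine: performs a pending transfer and raises the
 * readiness flag. On a real platform this work happens in hardware. */
static void prvExampleDmaHardwareStep( ExampleDmaChannel_t * pxDma )
{
    if( pxDma->ulStart != 0U )
    {
        memcpy( pxDma->pvDest, pxDma->pvSource, pxDma->uxLength );
        pxDma->ulStart = 0U;
        pxDma->ulReady = 1U;
    }
}

/* memcpy-compatible copy: start the transfer, then poll the readiness flag
 * instead of blocking on an interrupt, so no context switch is needed. */
static void * pvExampleDmaMemCpy( void * pvDest,
                                  const void * pvSource,
                                  size_t uxLength )
{
    if( uxLength == 0U )
    {
        return pvDest;
    }

    xExampleDma.pvSource = pvSource;
    xExampleDma.pvDest = pvDest;
    xExampleDma.uxLength = uxLength;
    xExampleDma.ulReady = 0U;
    xExampleDma.ulStart = 1U;

    while( xExampleDma.ulReady == 0U )
    {
        /* Only needed in this simulation; real hardware runs on its own. */
        prvExampleDmaHardwareStep( &xExampleDma );
    }

    return pvDest;
}
```

The function keeps the same signature and return value as memcpy, which is what lets it stand in for memcpy without touching the call sites.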
Test Steps
No additional actions are required. This change makes the code more flexible.
To use the alternative function, define pvPortMemCpyStreamBuffer in the FreeRTOSIPConfig.h file.
If it is not defined, memcpy from the standard library is used.
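The configuration described above could be sketched as follows. This is an assumed shape, not the exact code from the pull request: only the macro name `pvPortMemCpyStreamBuffer` and the fallback to memcpy come from the description. The function-like form of the macro, its three-argument signature, and `pvExamplePlatformCopy` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical platform copy routine. A real port would drive DMA here;
 * this placeholder simply forwards to memcpy so the sketch is runnable. */
static void * pvExamplePlatformCopy( void * pvDest,
                                     const void * pvSource,
                                     size_t uxLength )
{
    return memcpy( pvDest, pvSource, uxLength );
}

/* --- What the application might put in FreeRTOSIPConfig.h (assumed form) --- */
#define pvPortMemCpyStreamBuffer( pvDest, pvSource, uxLength ) \
    pvExamplePlatformCopy( ( pvDest ), ( pvSource ), ( uxLength ) )

/* --- What the stack might do when the macro is not defined (assumed form) --- */
#ifndef pvPortMemCpyStreamBuffer
    #define pvPortMemCpyStreamBuffer( pvDest, pvSource, uxLength ) \
    memcpy( ( pvDest ), ( pvSource ), ( uxLength ) )
#endif
```

With this pattern, existing configurations that never define the macro keep the standard memcpy behaviour unchanged.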
Checklist:
Related Issue
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.