
perf: optimize IO read/write usage #8219

Closed
j2rong4cn wants to merge 10 commits into AlistGo:main from j2rong4cn:perf

Conversation

@j2rong4cn
Contributor

@j2rong4cn j2rong4cn commented Mar 24, 2025

I only tested the drivers modified in the first commit; the others are untested.

Development notes

When a bytes.Buffer is used as the destination of io.CopyBuffer, CopyBuffer calls Buffer.ReadFrom.
Even when the amount of data written exactly matches Buffer.Cap, the Buffer still grows.

@petercat-assistant

Walkthrough

This PR optimizes IO read/write operations across multiple drivers by refactoring the handling of file streams. It introduces a more efficient method for caching and hashing files, reducing redundant operations and improving performance.

Changes

Files | Summary
drivers/115/driver.go, drivers/115_open/driver.go, drivers/123/driver.go, drivers/139/driver.go, drivers/189pc/utils.go, drivers/aliyundrive_open/upload.go, drivers/baidu_netdisk/driver.go, drivers/baidu_photo/driver.go, drivers/ilanzou/driver.go, drivers/mediatrack/driver.go, drivers/mopan/driver.go, drivers/netease_music/util.go, drivers/pikpak/driver.go, drivers/quark_uc/driver.go, drivers/quqi/driver.go, drivers/terabox/driver.go, drivers/thunder/driver.go, drivers/thunder_browser/driver.go, drivers/thunderx/driver.go | Replaced direct file stream handling with a more efficient caching and hashing mechanism using stream.CacheFullInTempFileAndHash.
internal/fs/archive.go, internal/model/obj.go, internal/stream/stream.go, internal/stream/util.go | Updated internal stream utilities to support the new caching and hashing methods.
🪧 Tips For further assistance, please describe your question in the comments and @petercat-assistant to start a conversation with me.

Comment thread drivers/139/driver.go Outdated
Comment thread drivers/baidu_photo/driver.go
Comment thread drivers/quark_uc/driver.go Outdated
@j2rong4cn
Contributor Author

@petercat-assistant Re-check

@petercat-assistant

The pull request titled "perf: optimize IO read/write usage" is indeed present in the AlistGo/alist repository. You can view the details of this pull request here.

If you have any specific questions or need further assistance regarding this pull request, feel free to ask!


@j2rong4cn
Contributor Author

@petercat-assistant Review again

@petercat-assistant

The pull request titled "perf: optimize IO read/write usage" is indeed present in the AlistGo/alist repository. You can view the details of this pull request here.

If you have any specific questions or need further assistance regarding this pull request, feel free to ask!


@j2rong4cn j2rong4cn closed this Mar 25, 2025
@j2rong4cn j2rong4cn reopened this Mar 25, 2025
@Mmx233
Contributor

Mmx233 commented Mar 27, 2025

I don't see any logic for reusing buf in this PR, such as using sync.Pool. This means that memory usage may increase significantly under high traffic compared to using the io package directly. Additionally, frequently allocating a large number of buf instances leads to frequent initialization and memory allocations, as well as increased GC pressure, which introduces additional performance issues.

@Mmx233
Contributor

Mmx233 commented Mar 27, 2025

Moreover, manually handling or reusing buf is a highly sensitive operation that may lead to data corruption, loss, or even unintended memory leaks, requiring careful testing and review. However, this PR modifies multiple drivers, covering a large codebase and a broad scope of functionality, which makes it quite risky.

@Mmx233
Contributor

Mmx233 commented Mar 27, 2025

Optimizing I/O operations is a good approach, and I’m not saying that the changes in this PR are unacceptable. However, I hope you understand that the current implementation may not be as effective as expected, and I’m concerned that it could introduce critical bugs.

@j2rong4cn
Contributor Author

I don't see any logic for reusing buf in this PR, such as using sync.Pool. This means that memory usage may increase significantly under high traffic compared to directly using io. Additionally, frequently allocating a large number of buf instances will lead to frequent initialization and memory allocations, as well as increased GC pressure, which introduces additional performance issues.

I only changed it to avoid caching the upload stream to a temp file when a hash value is already available; I did not introduce any additional buf usage.

@j2rong4cn
Contributor Author

Moreover, manually handling or reusing buf is a highly sensitive operation that may lead to data corruption, loss, or even unintended memory leaks, requiring careful testing and review. However, this PR modifies multiple drivers, covering a large codebase and a broad scope of functionality, which makes it quite risky.

OK. I get it. Then I will just modify the drivers I can test.

@j2rong4cn j2rong4cn closed this Mar 27, 2025
@j2rong4cn
Contributor Author

I don't see any logic for reusing buf in this PR, such as using sync.Pool. This means that memory usage may increase significantly under high traffic compared to directly using io. Additionally, frequently allocating a large number of buf instances will lead to frequent initialization and memory allocations, as well as increased GC pressure, which introduces additional performance issues.

I've resubmitted the PR: #8243e3f9b90. Is this what you meant?

@Mmx233
Contributor

Mmx233 commented Mar 28, 2025

I've resubmitted the PR: #8243e3f9b90. Is this what you meant?

I will take a look in my spare time and update to the new PR. Besides that, I personally suggest splitting this large PR. This way, the outcome won't be bad for you, and it will be more friendly to other contributors who want to take a look.



Development

Successfully merging this pull request may close these issues.

降低IO占用 (Reduce IO usage)

3 participants