-
Notifications
You must be signed in to change notification settings - Fork 53
Fix ThreadLocal memory leak in ParallelScatterZipCreator causing OOM #400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Co-authored-by: slachiewicz <[email protected]>
Co-authored-by: slachiewicz <[email protected]>
|
@copilot screenshot is from run under Java 21 and here we have dedicated version for Java 21+. Check if this also can be improved |
Co-authored-by: slachiewicz <[email protected]>
Fixed the Java 21+ version in commit ea41cd5 (updated). The virtual thread implementation already prevents ThreadLocal leaks since virtual threads are created per task and terminated after completion. However, I corrected the thread naming logic - the thread counter now increments per thread instead of once at factory creation. Note: Virtual threads with Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
| }; | ||
| return new ThreadPoolExecutor(1, poolSize, 1L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), threadFactory); | ||
| ThreadPoolExecutor executor = new ThreadPoolExecutor( | ||
| 1, poolSize, 1L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), threadFactory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot reducing the timeout here seems not really useful, a threadpool is all about reusing threads and making them to timeout immediately makes it quite useless.
Instead one should take care that the poolsize is limited if we really get 1500 instances something feels fishy. A corepool size equal to the CPU count seems a good upper bound.
Apart from that one might want to adjust the ParallelScatterZipCreator to allow reclaim some of its used memory.
| .name("plx-arch-" + poolCount + "-" + threadCounter.incrementAndGet()) | ||
| .unstarted(r); | ||
| }; | ||
| return Executors.newThreadPerTaskExecutor(threadFactory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot should we maybe better limit this instead? From the reported 1500 instances it looks like virtual threads really don't play well with thread locals in general and it is more a problem of blowing up quickly.
|
@slachiewicz I think @copilot only reacts to maintainers in the repository, still maybe my comments can help here, I really doubt we need > 1000 parallel threads (virtual or not). |
|
@copilot please review comments and adjust code |
I don't see any specific review comments or requested changes on the PR. Could you clarify which comments you'd like me to address or what adjustments are needed to the code? The current implementation:
If there are specific concerns or suggestions, please let me know what should be adjusted. |
|
@copilot reducing the timeout here seems not really useful, a threadpool is all about reusing threads and making them to timeout immediately makes it quite useless. Instead one should take care that the poolsize is limited if we really get 1500 instances something feels fishy. A corepool size equal to the CPU count seems a good upper bound. Apart from that one might want to adjust the ParallelScatterZipCreator to allow reclaim some of its used memory. |
Co-authored-by: slachiewicz <[email protected]>
Changed the approach in commit (hash to be added). Instead of reducing timeout to 1ms (which defeats thread pooling), the fix now:
This addresses the root cause: the heap dump shows 1,559 ParallelScatterZipCreator instances. By capping the thread pool at CPU count, we limit the number of concurrent ThreadLocal instances while maintaining proper pooling behavior. |
|
The java 21 implementation |
|
@copilot The java 21 implementation Executors.newThreadPerTaskExecutor(threadFactory); still seem to have an completely unbounded pool |
…cumulation Co-authored-by: slachiewicz <[email protected]>
Fixed in commit (to be updated with hash). The Java 21 implementation now uses a bounded |
|
@cstamas looking at the implementation and complications here, are we sure that VirtualThreads are really the right thing for this lib? |
|
@slachiewicz @cstamas can we probabbly just REVERT the Virtual Thread things an make a release ASAP? This currently kills a lot of builds of Tycho users and downgrading it is a bit of a hassle given we have then everywhere suggesting the automated tools to upgrade it again because we have an outdated dependency :-( |
|
Yup, do that pls. Just get over this OOM misery |
|
I'll do it |
Problem
An Out of Memory issue was reported in plexus-archiver 4.10.3 when used in Eclipse Tycho builds (see eclipse-xtext/xtext#3533). Heap dump analysis revealed that
ParallelScatterZipCreatorfrom Apache Commons Compress was retaining 788MB of memory (99% retention) through 1,559 accumulated objects.The root cause is ThreadLocal accumulation:
ParallelScatterZipCreatorusesThreadLocal<ScatterZipOutputStream>to maintain per-thread scatter streamsremove()ParallelScatterZipCreatorinstances are createdParallelScatterZipCreatorinstances, indicating excessive thread pool growthSolution
Modified
ConcurrentJarCreatorExecutorServiceFactoryin both Java 8+ and Java 21+ versions to use bounded thread pools that limit ThreadLocal accumulation.Java 8+ Version (src/main/java)
Use a fixed-size thread pool to limit the number of concurrent threads and their associated ThreadLocal instances:
Key changes:
Java 21+ Version (src/main/java21)
Use a bounded ThreadPoolExecutor with virtual threads to combine the benefits of virtual threads with controlled ThreadLocal accumulation:
Key changes:
poolSize(CPU count) to prevent unbounded ThreadLocal accumulationTesting
ThreadLocalLeakTestthat creates 10 jars sequentially to verify no memory accumulationmvn clean verifysucceedsImpact
This fix prevents the OOM issue by limiting the number of concurrent threads (and thus ThreadLocal instances) to the CPU count for both Java 8+ and Java 21+. Instead of allowing unbounded thread growth leading to 1,559+ ThreadLocal instances as shown in the heap dump, the pool size is capped at the number of processors, maintaining proper thread reuse while preventing ThreadLocal accumulation.
Fixes #399
Original prompt
Fixes #399
💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.