Description
Describe the bug
We have an open-source Java application, the OpenSearch Migration Assistant, that uses S3TransferManager and S3CrtAsyncClient.
We have observed OutOfMemoryErrors when downloading S3 directories with the transfer manager.
I do not have consistent reproduction steps. I followed advice from @alextwoods to limit the transfer manager's executor, but it did not help.
Scenario:
Downloading a directory with S3TransferManager#downloadDirectory.
The directory consists of 132 files: 53 files between 125 MB and 264 MB, 12 files between 3 MB and 80 MB, 2 files of ~40 KB, and 65 files of 479 B.
Configuration:
Client:
S3AsyncClient s3Client = S3AsyncClient.crtBuilder()
    .region(Region.of(s3Region))
    .credentialsProvider(DefaultCredentialsProvider.builder().build())
    .retryConfiguration(r -> r.numRetries(3))
    .targetThroughputInGbps(8.0)
    .maxNativeMemoryLimitInBytes(1073741824L) // 1 GiB
    .minimumPartSizeInBytes(8388608L) // 8 MiB
    .maxConcurrency(RANDOM_SIZE) // (1 - 500, explained later)
    .endpointOverride(s3Endpoint)
    .build();
Transfer Manager:
S3TransferManager transferManager = S3TransferManager.builder()
    .s3Client(s3Client)
    .executor(executor)
    .build();
Executor:
// Modified from TransferManagerConfiguration#defaultExecutor
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    0, RANDOM_POOL_SIZE, // Pool size between 1 - 500
    60, TimeUnit.SECONDS,
    new LinkedBlockingQueue<>(1_000),
    new ThreadFactoryBuilder()
        .threadNamePrefix("rfs-s3-transfer-manager").build());
// Allow idle core threads to time out
executor.allowCoreThreadTimeOut(true);
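One detail that may matter when reading the pool sizes in this report: under the documented java.util.concurrent.ThreadPoolExecutor queuing rules, a pool with corePoolSize 0 and a bounded queue starts one thread for the first task and only adds more threads once the queue is full. The following is a minimal standalone sketch of that behavior using only the JDK. The class and method names are hypothetical, it does not touch the SDK, and it does not show that this behavior is related to the OutOfMemoryError.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthSketch {

    /**
     * Submits `tasks` blocking tasks to an executor shaped like the one above
     * (corePoolSize 0, bounded queue) and returns the number of threads the
     * pool holds while every task is still pending.
     */
    static int observedPoolSize(int maxPoolSize, int queueCapacity, int tasks) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                0, maxPoolSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
        executor.allowCoreThreadTimeOut(true);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return executor.getPoolSize();
        } finally {
            release.countDown();
            executor.shutdown();
            try {
                executor.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        // Queue never fills: the pool stays at a single thread.
        System.out.println("queue 1000: " + observedPoolSize(8, 1_000, 8));
        // Queue fills after 2 entries: further tasks start new threads.
        System.out.println("queue 2:    " + observedPoolSize(8, 2, 8));
    }
}
```

If this applies to the configuration above, the executor would run on a single thread until more than 1,000 tasks are queued, so the randomized pool size may have had less effect than intended.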
I ran 41 downloads over the course of 12 hours; each attempt downloads 9.2 GB in total. Of these downloads, 7 failed with OOM (logs are included below).
For each run, the pool size and maxConcurrency were chosen at random, independently of each other, from the options (1, 5, 10, 20, 50, 100, 200, 500). The failures occurred with the following configurations:
- max concurrency 5 pool 50
- max concurrency 50 pool 100
- max concurrency 5 pool 5
- max concurrency 5 pool 500
- max concurrency 5 pool 500
- max concurrency 10 pool 10
Environment:
Base Image: amazoncorretto:17-al2023-headless
Architecture: ARM64
Machine: Fargate, 2 vCPU, 4 GB memory
Disk: EBS gp3 (~500 GB)
Java Options: -XX:MaxRAMPercentage=65.0 (2.6 GB)
AWS Region: us-east-1
Library Version: aws-crt = "0.38.7", aws-sdk = "2.32.4"
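As a rough sanity check on the numbers above, the memory budget of the task can be worked out from the stated settings. This is a sketch under stated assumptions: the 4 GB Fargate limit is treated as 4096 MiB, the heap is taken as exactly 65% of that, and the in-flight buffer figure is a naive estimate (one part-sized buffer per concurrent request), not a documented bound of the SDK or the CRT. The class and method names are hypothetical.

```java
public class MemoryBudgetSketch {
    static final long MIB = 1024L * 1024L;

    /** Approximate max heap for a container limit under -XX:MaxRAMPercentage. */
    static long maxHeapBytes(long containerBytes, double maxRamPercentage) {
        return (long) (containerBytes * maxRamPercentage / 100.0);
    }

    /** Naive estimate (assumption): one part-sized buffer per concurrent request. */
    static long inFlightPartBytes(int maxConcurrency, long partSizeBytes) {
        return maxConcurrency * partSizeBytes;
    }

    public static void main(String[] args) {
        long container = 4096 * MIB;                 // Fargate task memory (assumed 4096 MiB)
        long heap = maxHeapBytes(container, 65.0);   // -XX:MaxRAMPercentage=65.0
        long crtNative = 1_073_741_824L;             // maxNativeMemoryLimitInBytes (1 GiB)
        long partSize = 8_388_608L;                  // minimumPartSizeInBytes (8 MiB)
        long remainder = container - heap - crtNative;

        System.out.println("heap MiB:       " + heap / MIB);       // 2662
        System.out.println("CRT native MiB: " + crtNative / MIB);  // 1024
        System.out.println("remainder MiB:  " + remainder / MIB);  // 409
        // In-flight part buffers for the smallest failing and the largest tested concurrency.
        System.out.println("parts @5 MiB:   " + inFlightPartBytes(5, partSize) / MIB);   // 40
        System.out.println("parts @500 MiB: " + inFlightPartBytes(500, partSize) / MIB); // 4000
    }
}
```

Under these assumptions, a maxConcurrency of 5 would account for only about 40 MiB of part buffers against a heap of roughly 2.6 GB, which is why the heap exhaustion at low concurrency is unexpected.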
Exception:
2025-08-08 06:52:50,922 INFO o.o.m.b.c.S3Repo [main] Downloading blob files from S3 with max concurrency 5 pool 5: s3://migration-artifacts-REDACTED-us-east-1/newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/ to /storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3
--
Exception in thread "AwsEventLoop 2" java.lang.OutOfMemoryError: Java heap space
2025-08-08 06:53:22,198 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-34] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
    at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
    at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
    at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
    at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
    at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Exception in thread "AwsEventLoop 1" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AwsEventLoop 2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AwsEventLoop 2" java.lang.OutOfMemoryError: Java heap space
2025-08-08 06:53:25,110 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-32] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
    at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
    at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
    at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
    at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
    at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-08 06:53:26,500 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-14] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
    at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
    at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
    at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
    at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
    at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-08 06:53:28,271 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-10] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
    at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
    at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
    at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
    at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
    at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Exception in thread "AwsEventLoop 1" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AwsEventLoop 1" java.lang.OutOfMemoryError: Java heap space
2025-08-08 06:53:33,684 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [AwsEventLoop 1] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
    at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
    at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
    at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
    at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
    at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
    at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
    at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
    at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-08 06:53:56,242 INFO o.o.m.b.c.S3Repo [main] Blob file download(s) complete
2025-08-08 06:53:56,243 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__iLfGFpKiQD2AXo5P1ByBCQ, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__iLfGFpKiQD2AXo5P1ByBCQ)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__ZcBNnD9NRQGU7ecOFluSng, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__ZcBNnD9NRQGU7ecOFluSng)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__h7L4rQa1TKaLNPweeF8hRg, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__h7L4rQa1TKaLNPweeF8hRg)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__jq57CPb0SL2_VGJUpyUdkQ, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__jq57CPb0SL2_VGJUpyUdkQ)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__lXRI75L0TySLJ1tsAPwrtw, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__lXRI75L0TySLJ1tsAPwrtw)), exception=software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: subscription has been cancelled. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__pcK2GVEfQ-63-LZKHVvB6w, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__pcK2GVEfQ-63-LZKHVvB6w)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
Regression Issue
- Select this option if this issue appears to be a regression.
Expected Behavior
No OutOfMemoryError. The maximum on-heap and off-heap memory usage of the Transfer Manager and CRT client should be easy to configure and understand.
Current Behavior
See above
Reproduction Steps
I was not able to create a self-contained, concise snippet of code that reproduces the issue.
Possible Solution
No response
Additional Information/Context
No response
AWS Java SDK version used
2.32.4
JDK version used
Corretto 17
Operating System and version
Amazon Linux 2023 (ARM64)