OutOfMemoryError with S3TransferManager and S3CrtAsyncClient #6323

@AndreKurait

Describe the bug

We have an open-source Java application, the OpenSearch Migration Assistant, that uses S3TransferManager and S3CrtAsyncClient.

We have observed OutOfMemoryErrors when downloading S3 directories with the transfer manager.

While I do not have consistent reproduction steps, I followed advice from @alextwoods about limiting the transfer manager executor, to no avail.

Scenario:
Downloading a directory with S3TransferManager#downloadDirectory.
The directory consists of 132 files: 53 files between 125 MB and 264 MB, 12 files from 3 MB to 80 MB, 2 files of ~40 KB, and 65 files of 479 B.

Configuration:
Client:

        S3AsyncClient s3Client = S3AsyncClient.crtBuilder()
            .region(Region.of(s3Region))
            .credentialsProvider(DefaultCredentialsProvider.builder().build())
            .retryConfiguration(r -> r.numRetries(3))
            .targetThroughputInGbps(8.0)
            .maxNativeMemoryLimitInBytes(1073741824L) // 1 GiB
            .minimumPartSizeInBytes(8388608L) // 8MiB
            .maxConcurrency(RANDOM_SIZE) // (1 - 500, explained later)
            .endpointOverride(s3Endpoint)
            .build();

Transfer Manager:

        S3TransferManager transferManager = S3TransferManager.builder()
            .s3Client(s3Client)
            .executor(executor)
            .build();

Executor:

        // Modified from TransferManagerConfiguration#defaultExecutor
        ThreadPoolExecutor executor = new ThreadPoolExecutor(0, RANDOM_POOL_SIZE, // Pool size between 1 - 500
            60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(1_000),
            new ThreadFactoryBuilder()
                .threadNamePrefix("rfs-s3-transfer-manager").build());
        // Allow idle core threads to time out
        executor.allowCoreThreadTimeOut(true);
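
A side note on how this executor grows, as a minimal sketch using only java.util.concurrent (the class and method names here are hypothetical, and the transfer manager itself is not involved). With a core size of 0 and a bounded LinkedBlockingQueue, ThreadPoolExecutor starts a single worker and adds more threads only once the queue is full. Unless more than 1,000 tasks are queued at once, the maximum pool size may therefore have had little effect in these runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {

    /** Submits `tasks` blocking tasks and returns how many worker threads the pool started. */
    static int threadsStarted(int maxPoolSize, int queueCapacity, int tasks) {
        // Same shape as the executor above: core size 0, bounded queue.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(0, maxPoolSize,
            60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(queueCapacity));
        executor.allowCoreThreadTimeOut(true);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                // Each task parks until released, so every worker stays busy.
                executor.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return executor.getPoolSize();
        } finally {
            release.countDown();
            executor.shutdown();
            try {
                executor.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        // Queue never fills: only the first worker is created.
        System.out.println("queue 1000: " + threadsStarted(500, 1_000, 50));
        // Queue overflows: extra workers are created up to the maximum.
        System.out.println("queue 10:   " + threadsStarted(500, 10, 50));
    }
}
```

If the pool size is meant to bound parallelism directly, a larger core size or a SynchronousQueue would behave differently; which of those suits the transfer manager is not something this sketch can answer.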

I ran 41 downloads over the course of 12 hours; each attempt downloads 9.2 GB in total. Of these downloads, 7 failed with OOM (logs included below).

For each run, the pool size and maxConcurrency were each chosen independently at random from (1, 5, 10, 20, 50, 100, 200, 500). The failures occurred with the following configurations:

  • max concurrency 5 pool 50
  • max concurrency 50 pool 100
  • max concurrency 5 pool 5
  • max concurrency 5 pool 500
  • max concurrency 5 pool 500
  • max concurrency 10 pool 10
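
For reference, the independent random selection described above could look roughly like the following. This is an illustrative sketch, not the code used in the test runs; the class and method names are hypothetical.

```java
import java.util.Random;

public class RandomRunConfig {
    // The option set listed above.
    static final int[] OPTIONS = {1, 5, 10, 20, 50, 100, 200, 500};

    /** Returns {maxConcurrency, poolSize}, each drawn independently from OPTIONS. */
    static int[] pick(Random random) {
        int maxConcurrency = OPTIONS[random.nextInt(OPTIONS.length)];
        int poolSize = OPTIONS[random.nextInt(OPTIONS.length)];
        return new int[] {maxConcurrency, poolSize};
    }

    public static void main(String[] args) {
        int[] config = pick(new Random());
        System.out.println("max concurrency " + config[0] + " pool " + config[1]);
    }
}
```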

Environment:
Base Image: amazoncorretto:17-al2023-headless
Architecture: ARM64
Machine: Fargate, 2 vCPU, 4 GB memory
Disk: EBS GP3 (~500GB)
Java Options: -XX:MaxRAMPercentage=65.0 (2.6 GB)
AWS Region: US-EAST-1
Library Version: aws-crt = "0.38.7", aws-sdk = "2.32.4"
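
To put the numbers above side by side, here is a rough memory budget as a small Java sketch. It assumes "4gb" means 4096 MiB and that the JVM sizes its heap from that full amount; the real heap ceiling may be rounded differently. The class and method names are hypothetical.

```java
public class MemoryBudget {
    static final long MIB = 1024L * 1024L;

    /** Heap ceiling implied by -XX:MaxRAMPercentage for a given container memory size. */
    static long maxHeapBytes(long containerBytes, double maxRamPercentage) {
        return (long) (containerBytes * maxRamPercentage / 100.0);
    }

    /** Memory left for everything outside the Java heap and the configured CRT native limit. */
    static long remainingBytes(long containerBytes, long heapBytes, long crtNativeLimitBytes) {
        return containerBytes - heapBytes - crtNativeLimitBytes;
    }

    public static void main(String[] args) {
        long container = 4096 * MIB;        // assumption: 4 GB task memory = 4096 MiB
        long heap = maxHeapBytes(container, 65.0);
        long crtNativeLimit = 1073741824L;  // maxNativeMemoryLimitInBytes from the client config
        long rest = remainingBytes(container, heap, crtNativeLimit);

        System.out.println("heap ceiling (MiB): " + heap / MIB);
        System.out.println("CRT native limit (MiB): " + crtNativeLimit / MIB);
        System.out.println("remaining (MiB): " + rest / MIB);
    }
}
```

Under these assumptions roughly 2.6 GiB goes to the heap and 1 GiB to the CRT native limit, leaving about 400 MiB for metaspace, thread stacks, and other native allocations. The errors reported here are "Java heap space", so it is the heap share that is being exhausted.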

Exception:


2025-08-08 06:52:50,922 INFO o.o.m.b.c.S3Repo [main] Downloading blob files from S3 with max concurrency 5 pool 5: s3://migration-artifacts-REDACTED-us-east-1/newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/ to /storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3
--
Exception in thread "AwsEventLoop 2" java.lang.OutOfMemoryError: Java heap space
2025-08-08 06:53:22,198 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-34] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
	at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
	at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
	at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
	at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
	at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
	at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
	at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Exception in thread "AwsEventLoop 1" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AwsEventLoop 2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AwsEventLoop 2" java.lang.OutOfMemoryError: Java heap space
2025-08-08 06:53:25,110 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-32] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
	at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
	at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
	at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
	at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
	at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
	at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
	at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-08 06:53:26,500 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-14] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
	at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
	at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
	at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
	at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
	at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
	at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
	at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-08 06:53:28,271 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [Thread-10] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
	at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
	at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
	at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
	at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
	at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
	at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
	at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Exception in thread "AwsEventLoop 1" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AwsEventLoop 1" java.lang.OutOfMemoryError: Java heap space
2025-08-08 06:53:33,684 WARN s.a.a.s.s.i.c.S3CrtResponseHandlerAdapter [AwsEventLoop 1] Exception thrown in responsePublisher#error, ignoring
java.util.concurrent.CancellationException: subscription has been cancelled.
	at software.amazon.awssdk.utils.async.SimplePublisher.lambda$doProcessQueue$9(SimplePublisher.java:286)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.get(SimplePublisher.java:413)
	at software.amazon.awssdk.utils.async.SimplePublisher$FailureMessage.access$700(SimplePublisher.java:397)
	at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:260)
	at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
	at software.amazon.awssdk.utils.async.SimplePublisher.access$1300(SimplePublisher.java:58)
	at software.amazon.awssdk.utils.async.SimplePublisher$SubscriptionImpl.cancel(SimplePublisher.java:389)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.async.listener.SubscriberListener$NotifyingSubscriber$NotifyingSubscription.cancel(SubscriberListener.java:124)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:244)
	at software.amazon.awssdk.core.internal.async.FileAsyncResponseTransformer$FileSubscriber$1.failed(FileAsyncResponseTransformer.java:223)
	at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:131)
	at java.base/sun.nio.ch.Invoker$3.run(Invoker.java:241)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-08 06:53:56,242 INFO o.o.m.b.c.S3Repo [main] Blob file download(s) complete
2025-08-08 06:53:56,243 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__iLfGFpKiQD2AXo5P1ByBCQ, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__iLfGFpKiQD2AXo5P1ByBCQ)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__ZcBNnD9NRQGU7ecOFluSng, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__ZcBNnD9NRQGU7ecOFluSng)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__h7L4rQa1TKaLNPweeF8hRg, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__h7L4rQa1TKaLNPweeF8hRg)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__jq57CPb0SL2_VGJUpyUdkQ, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__jq57CPb0SL2_VGJUpyUdkQ)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__lXRI75L0TySLJ1tsAPwrtw, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__lXRI75L0TySLJ1tsAPwrtw)), exception=software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: subscription has been cancelled. (SDK Attempt Count: 1))
2025-08-08 06:53:56,244 ERROR o.o.m.b.c.S3Repo [main] FailedFileDownload(request=DownloadFileRequest(destination=/storage/s3_files/indices/PYO6TRrfSyuReHLFwBDXQw/3/__pcK2GVEfQ-63-LZKHVvB6w, getObjectRequest=GetObjectRequest(Bucket=migration-artifacts-REDACTED-transforms-us-east-1, Key=newsnap/indices/PYO6TRrfSyuReHLFwBDXQw/3/__pcK2GVEfQ-63-LZKHVvB6w)), exception=software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: OutOfMemoryError has been raised from JVM. (SDK Attempt Count: 1))

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

No OutOfMemoryError. The maximum on-heap and off-heap memory usage of the Transfer Manager and CRT client should be easy to configure and understand.

Current Behavior

See above

Reproduction Steps

I was not able to create a self-contained, concise code snippet that reproduces the issue.

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

2.32.4

JDK version used

Corretto 17

Operating System and version

Amazon Linux 2023 (ARM64)


Labels

bug (This issue is a bug.), p2 (This is a standard priority issue)
