Enabling ChecksumMode when calling getObject increases the response time #6497

@trivikr

Description

Describe the feature

Enabling ChecksumMode when calling getObject increases the response time.

This happens because the response stream is fully consumed to validate the checksum during the API call.
This was done in #5043 so that a checksum validation error is thrown during the API call. However, since we return a stream, checksum validation should be deferred until the stream is consumed by the caller.
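To make the difference concrete, here is a minimal sketch that contrasts the two behaviours using plain Node.js streams instead of the SDK. All names (`createCountingSource`, `resolveAfterBuffering`, `resolveImmediately`) are hypothetical and only illustrate the idea; they are not SDK APIs.

```javascript
import { Readable } from "node:stream";

// Hypothetical source stream that records how many bytes have been pulled from it.
function createCountingSource(chunks) {
  const state = { bytesRead: 0 };
  async function* generate() {
    for (const chunk of chunks) {
      state.bytesRead += chunk.length;
      yield chunk;
    }
  }
  return { stream: Readable.from(generate()), state };
}

// Eager: drains the whole source before resolving, so the call
// takes at least as long as reading the entire body.
async function resolveAfterBuffering(source) {
  const buffered = [];
  for await (const chunk of source) buffered.push(chunk);
  // A checksum over `buffered` could be validated here, before returning.
  return Readable.from(buffered);
}

// Deferred: resolves right away; the source is only read
// once the caller starts reading the returned stream.
async function resolveImmediately(source) {
  async function* forward() {
    for await (const chunk of source) yield chunk;
  }
  return Readable.from(forward());
}
```

In the eager variant the source has been read in full by the time the promise resolves, while in the deferred variant nothing has been read yet. That is the behaviour this issue asks for when `ChecksumMode` is enabled.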

Use Case

Test case

import { S3 } from "@aws-sdk/client-s3"; // v3.654.0
import { equal } from "assert";

const client = new S3();
const Bucket = "test-checksum-mode"; // Replace with your test bucket name.
const Key = "hello-world.txt";
const Body = "Hello World\n".repeat(100_000); // File of size ~1 MB.
const ChecksumAlgorithm = "CRC32";

// The putObject call can be commented out on subsequent runs when benchmarking getObject.
await client.putObject({ Bucket, Key, Body, ChecksumAlgorithm });

console.time("getObject");
const response = await client.getObject({
  Bucket,
  Key,
  // ChecksumMode: "ENABLED",
});
console.timeEnd("getObject");

equal(Body, await response.Body.transformToString());

API call times compared with file size/repetitions

| File size | Repeat # for putObject | getObject without ChecksumMode | getObject with ChecksumMode |
| --- | --- | --- | --- |
| ~1 MB | 100_000 | 27ms | 85ms |
| ~10 MB | 1_000_000 | 28ms | 450ms |
| ~100 MB | 10_000_000 | 39ms | 3676ms |

The numbers will differ depending on your network speed, but the gap between the two columns will remain.

Proposed Solution

Write a checksum stream wrapper class that consumes the stream for validation only when the end user consumes it.
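A rough sketch of what such a wrapper could look like in Node.js, assuming a `Transform` stream that hashes chunks as they pass through and reports a mismatch once the source ends. `ChecksumValidatingStream` and `readAll` are hypothetical names, not the SDK's implementation. The sketch uses SHA-256 from `node:crypto` rather than the CRC32 used in the test case, since `node:crypto` does not provide CRC32, and it assumes the expected checksum is a base64-encoded digest.

```javascript
import { createHash } from "node:crypto";
import { Readable, Transform } from "node:stream";

// Hypothetical wrapper: forwards data unchanged and validates the checksum
// only after the consumer has pulled the whole body through it.
class ChecksumValidatingStream extends Transform {
  constructor({ expectedChecksum, algorithm = "sha256" }) {
    super();
    this.expectedChecksum = expectedChecksum; // assumed to be base64-encoded
    this.hash = createHash(algorithm);
  }

  _transform(chunk, encoding, callback) {
    this.hash.update(chunk); // update the running digest
    callback(null, chunk); // pass the chunk through untouched
  }

  _flush(callback) {
    // Called once the source has ended: compare digests and fail the stream on mismatch.
    const actual = this.hash.digest("base64");
    if (actual !== this.expectedChecksum) {
      callback(
        new Error(`Checksum mismatch: expected ${this.expectedChecksum}, received ${actual}`)
      );
      return;
    }
    callback();
  }
}

// Helper for the usage example: collects a stream into a single Buffer.
async function readAll(stream) {
  const chunks = [];
  for await (const chunk of stream) chunks.push(chunk);
  return Buffer.concat(chunks);
}

// Usage: the wrapper is created up front, but hashing happens only while reading.
const body = Buffer.from("Hello World\n");
const expectedChecksum = createHash("sha256").update(body).digest("base64");
const validated = Readable.from([body]).pipe(
  new ChecksumValidatingStream({ expectedChecksum })
);
const received = await readAll(validated); // rejects if the checksum does not match
```

With this design the API call can return as soon as the response headers arrive, and a checksum error surfaces as a stream error while the caller reads the body. Callers would then need to handle errors on the stream rather than on the API call, which may be why this is flagged as a possible breaking change.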

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

SDK version used

3.654.0

Environment details (OS name and version, etc.)

v20.10.0

Metadata

    Labels

    feature-request: New feature or enhancement. May require GitHub community feedback.
    p2: This is a standard priority issue
    queued: This issue is on the AWS team's backlog