| 1 | +# MD5 checksum interceptor |
| 2 | + |
| 3 | +A recent update to the AWS SDK for Kotlin removed support for MD5 checksums in favor of newer algorithms. This may |
| 4 | +affect SDK compatibility with third-party "S3-like" services, particularly when invoking the |
| 5 | +[`DeleteObjects`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) operation. If you still |
| 6 | +require MD5 checksums for S3-like services, you may re-enable them by writing |
| 7 | +[a custom interceptor](https://docs.aws.amazon.com/sdk-for-kotlin/latest/developer-guide/interceptors.html). |
| 8 | + |
| 9 | +## Example interceptor code |
| 10 | + |
| 11 | +The following code defines an interceptor which calculates MD5 checksums for S3's `DeleteObjects` operation: |
| 12 | + |
| 13 | +```kotlin |
| 14 | +@OptIn(InternalApi::class) |
| 15 | +class DeleteObjectsMd5Interceptor : HttpInterceptor { |
| 16 | +    companion object { |
| 17 | +        private const val MD5_HEADER = "Content-MD5" |
| 18 | +        private const val OTHER_CHECKSUMS_PREFIX = "x-amz-checksum-" |
| 19 | +        private const val TRAILER_HEADER = "x-amz-trailer" |
| 20 | +    } |
| 21 | + |
| 22 | +    override suspend fun modifyBeforeSigning(context: ProtocolRequestInterceptorContext<Any, HttpRequest>): HttpRequest { |
| 23 | +        // Only execute for the `DeleteObjects` operation |
| 24 | +        if (context.executionContext.operationName != "DeleteObjects") return context.protocolRequest |
| 25 | + |
| 26 | +        val body = context.protocolRequest.body |
| 27 | +        val newRequest = context.protocolRequest.toBuilder() |
| 28 | + |
| 29 | +        // Remove any conflicting headers |
| 30 | +        removeOtherChecksums(newRequest.headers) |
| 31 | +        removeOtherChecksums(newRequest.trailingHeaders) |
| 32 | + |
| 33 | +        newRequest |
| 34 | +            .headers |
| 35 | +            .getAll(TRAILER_HEADER) |
| 36 | +            .orEmpty() |
| 37 | +            .filter { it.startsWith(OTHER_CHECKSUMS_PREFIX, ignoreCase = true) } |
| 38 | +            .forEach { newRequest.headers.remove(TRAILER_HEADER, it) } |
| 39 | +        newRequest.headers.removeKeysWithNoEntries() |
| 40 | + |
| 41 | +        if (body.isEligibleForAwsChunkedStreaming) { |
| 42 | +            // Calculate MD5 while streaming, append as a trailing header |
| 43 | + |
| 44 | +            val parentJob = context.executionContext.coroutineContext.job |
| 45 | +            val deferredHash = CompletableDeferred<String>(parentJob) |
| 46 | + |
| 47 | +            newRequest.body = body.toHashingBody(Md5(), body.contentLength).toCompletingBody(deferredHash) |
| 48 | +            newRequest.headers.append(TRAILER_HEADER, MD5_HEADER) |
| 49 | +            newRequest.trailingHeaders[MD5_HEADER] = deferredHash |
| 50 | +        } else { |
| 51 | +            val hash = if (body.isOneShot) { |
| 52 | +                // One-shot stream must be fully read into memory, hashed, and then body replaced with in-memory bytes |
| 53 | + |
| 54 | +                val bytes = body.readAll() ?: byteArrayOf() |
| 55 | +                newRequest.body = bytes.toHttpBody() |
| 56 | +                bytes.md5().encodeBase64String() |
| 57 | +            } else { |
| 58 | +                // All other streams can be converted to a channel for which the hash is computed eagerly |
| 59 | + |
| 60 | +                val scope = context.executionContext |
| 61 | +                val channel = requireNotNull(body.toSdkByteReadChannel(scope)) { "Cannot convert $body to channel" } |
| 62 | +                channel.rollingHash(Md5()).encodeBase64String() |
| 63 | +            } |
| 64 | + |
| 65 | +            newRequest.headers[MD5_HEADER] = hash |
| 66 | +        } |
| 67 | + |
| 68 | +        return newRequest.build() |
| 69 | +    } |
| 70 | + |
| 71 | +    private fun removeOtherChecksums(source: ValuesMapBuilder<*>) = |
| 72 | +        source |
| 73 | +            .entries() |
| 74 | +            .map { it.key } |
| 75 | +            .filter { it.startsWith(OTHER_CHECKSUMS_PREFIX, ignoreCase = true) } |
| 76 | +            .forEach { source.remove(it) } |
| 77 | +} |
| 78 | +``` |
| 79 | + |
| 80 | +A few notes about particular parts of this code: |
| 81 | + |
| 82 | +* `@OptIn(InternalApi::class)` |
| 83 | + |
| 84 | + This example makes use of several SDK APIs which are public but not supported for |
| 85 | + external use. Thus, calling code must [opt in](https://kotlinlang.org/docs/opt-in-requirements.html#opt-in-to-api) to |
| 86 | + successfully build. |
| 87 | + |
| 88 | + |
| 89 | +* `if (context.executionContext.operationName != "DeleteObjects") return context.protocolRequest` |
| 90 | + |
| 91 | + MD5 checksums are generally only required for `DeleteObjects` invocations on third-party S3-like services. If you |
| 92 | + require MD5 for more operations, adjust this predicate accordingly. |
| 93 | + |
| 94 | + |
| 95 | +* `if (body.isOneShot)` |
| 96 | + |
| 97 | + Some streaming payloads come from "one-shot" sources, meaning they cannot be rewound or replayed. This presents |
| 98 | + particular challenges for calculating checksums and for retrying requests which previously failed (e.g., because of a |
| 99 | + transient condition like connection drops or throttling). The only way to correctly handle such payloads is to read |
| 100 | + them completely into memory and then calculate the checksum. This may cause memory issues for very large payloads or |
| 101 | + resource-constrained environments. |
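
The buffer-then-hash approach described in the `isOneShot` note can be sketched without the SDK. This is a minimal illustration that uses only JDK classes, not the SDK's own implementation; `contentMd5` and `bufferAndHash` are hypothetical helper names introduced here. It shows that a `Content-MD5` value is the Base64 encoding of the raw MD5 digest, and that a one-shot source is exhausted after the first read, so the buffered bytes are the only copy left to send.

```kotlin
import java.io.InputStream
import java.security.MessageDigest
import java.util.Base64

// Hypothetical helper (not an SDK API): Base64-encoded MD5 digest of the
// payload, which is the value format used by the Content-MD5 header.
fun contentMd5(bytes: ByteArray): String =
    Base64.getEncoder().encodeToString(MessageDigest.getInstance("MD5").digest(bytes))

// Hypothetical helper: drains a non-replayable stream into memory so the same
// bytes can be hashed now and sent (or re-sent on retry) afterwards.
fun bufferAndHash(oneShot: InputStream): Pair<ByteArray, String> {
    val bytes = oneShot.readBytes() // consumes the stream; it cannot be read again
    return bytes to contentMd5(bytes)
}

fun main() {
    val source = "hello world".byteInputStream()
    val (bytes, md5) = bufferAndHash(source)
    println("Buffered ${bytes.size} bytes, Content-MD5: $md5")
    println("Bytes left in the original stream: ${source.available()}")
}
```

Because the whole payload is held in a `ByteArray`, memory use grows with payload size, which is the trade-off the note above describes.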
| 102 | + |
| 103 | +## Using the interceptor |
| 104 | + |
| 105 | +Once the interceptor is written, add it to an S3 client through the client configuration: |
| 106 | + |
| 107 | +```kotlin |
| 108 | +val s3 = S3Client.fromEnvironment { |
| 109 | + interceptors += DeleteObjectsMd5Interceptor() |
| 110 | +} |
| 111 | + |
| 112 | +s3.deleteObjects { ... } // Will calculate and send MD5 checksum for request |
| 113 | +``` |
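
Since the interceptor targets third-party S3-like services, a fuller usage sketch might also point the client at a custom endpoint. The following is a hedged example, not a verified configuration: the endpoint URL, region, bucket name, and object keys are placeholders, and it assumes the `DeleteObjectsMd5Interceptor` class defined earlier is on the classpath. Whether path-style addressing is needed depends on the service in question.

```kotlin
import aws.sdk.kotlin.services.s3.S3Client
import aws.sdk.kotlin.services.s3.deleteObjects
import aws.sdk.kotlin.services.s3.model.ObjectIdentifier
import aws.smithy.kotlin.runtime.net.url.Url

suspend fun main() {
    S3Client.fromEnvironment {
        // Assumption: placeholder endpoint and region for an S3-like service
        endpointUrl = Url.parse("https://s3.example.com")
        region = "us-east-1"
        // Assumption: many S3-like services expect path-style bucket addressing
        forcePathStyle = true
        // Re-enable MD5 checksums for DeleteObjects using the interceptor above
        interceptors += DeleteObjectsMd5Interceptor()
    }.use { s3 ->
        s3.deleteObjects {
            bucket = "example-bucket" // placeholder bucket name
            delete {
                // Placeholder keys to delete in a single request
                objects = listOf(
                    ObjectIdentifier { key = "reports/a.txt" },
                    ObjectIdentifier { key = "reports/b.txt" },
                )
            }
        }
    }
}
```

The `use` block closes the client when the request completes, releasing its underlying HTTP resources.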