ECS credentials provider does not retry on socket errors #1626

@ianbotsf

Describe the bug

Multiple users are reporting that the EcsCredentialsProvider does not properly retry on socket exceptions such as "IOException: unexpected end of stream". This is because credentials providers use customized retry policies, which may not cover all of the known error types that the policies for regular SDK client requests cover.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected behavior

EcsCredentialsProvider and other providers should be updated to use a retry policy that correctly handles common errors. It need not be identical to the SDK client policies, but it should cover common error cases that are not specific to SDK client requests, such as socket and connection-closed errors.
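
As a rough illustration of the behavior requested here, the following is a minimal sketch in plain Kotlin using only the JVM standard library. It is not the SDK's retry API: `isRetryableIoFailure` and `retryOnIoFailure` are hypothetical names, and treating every `IOException` in the cause chain as retryable is an assumption made for brevity (a real policy would likely be more selective). The point it shows is that the classifier walks the cause chain, since in the stack trace below the `IOException` is wrapped in other exceptions.

```kotlin
import java.io.IOException

// Hypothetical helper (not part of the AWS SDK for Kotlin): returns true if the
// failure, or any exception in its cause chain, is an IOException.
fun isRetryableIoFailure(failure: Throwable): Boolean {
    val seen = HashSet<Throwable>() // guards against cyclic cause chains
    var current: Throwable? = failure
    while (current != null && seen.add(current)) {
        if (current is IOException) return true
        current = current.cause
    }
    return false
}

// Hypothetical retry loop: runs `block` up to `maxAttempts` times, sleeping with
// exponential backoff between attempts, and rethrows non-retryable failures at once.
fun <T> retryOnIoFailure(
    maxAttempts: Int = 3,
    initialDelayMs: Long = 100,
    block: (attempt: Int) -> T,
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMs = initialDelayMs
    for (attempt in 1..maxAttempts) {
        try {
            return block(attempt)
        } catch (e: Exception) {
            // Give up on the last attempt or when the failure is not I/O related.
            if (attempt == maxAttempts || !isRetryableIoFailure(e)) throw e
            Thread.sleep(delayMs)
            delayMs *= 2
        }
    }
    error("unreachable: the loop either returns or throws")
}

fun main() {
    var calls = 0
    // Simulated credentials fetch that fails twice with a wrapped socket error.
    val result = retryOnIoFailure(maxAttempts = 3, initialDelayMs = 10) { attempt ->
        calls++
        if (attempt < 3) {
            throw RuntimeException("fetch failed", IOException("unexpected end of stream"))
        }
        "credentials"
    }
    println("$result after $calls attempts")
}
```

Checking the whole cause chain rather than only the top-level exception is the design choice that matters for this report: a policy that inspects only the outermost exception type would never see the underlying socket error.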

Current behavior

Example stack trace:

aws.smithy.kotlin.runtime.identity.IdentityProviderException: No identity could be resolved from the chain: SystemPropertyCredentialsProvider -> EnvironmentCredentialsProvider -> aws.sdk.kotlin.runtime.auth.credentials.StsWebIdentityProvider@1e5023d -> ProfileCredentialsProvider -> EcsCredentialsProvider -> ImdsCredentialsProvider
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$2$chainException$1.invoke(IdentityProviderChain.kt:37)
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$2$chainException$1.invoke(IdentityProviderChain.kt:37)
	at kotlin.SynchronizedLazyImpl.getValue(LazyJVM.kt:74)
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$suspendImpl$$inlined$withSpan$default$1.invokeSuspend(CoroutineContextTraceExt.kt:89)
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$suspendImpl$$inlined$withSpan$default$1.invoke(CoroutineContextTraceExt.kt)
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$suspendImpl$$inlined$withSpan$default$1.invoke(CoroutineContextTraceExt.kt)
	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:42)
	at kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:164)
	at kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain.resolve$suspendImpl(IdentityProviderChain.kt:94)
	at aws.smithy.kotlin.runtime.identity.IdentityProviderChain.resolve(IdentityProviderChain.kt)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CredentialsProviderChain.resolve(CredentialsProviderChain.kt:23)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider$resolve$3.invokeSuspend(CachedCredentialsProvider.kt:63)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider$resolve$3.invoke(CachedCredentialsProvider.kt)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider$resolve$3.invoke(CachedCredentialsProvider.kt)
	at aws.smithy.kotlin.runtime.util.CachedValue.getOrLoad(CachedValue.kt:80)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider.resolve(CachedCredentialsProvider.kt:61)
	at aws.sdk.kotlin.runtime.auth.credentials.DefaultChainCredentialsProvider.resolve(DefaultChainCredentialsProvider.kt:74)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider$resolve$3.invokeSuspend(CachedCredentialsProvider.kt:63)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider$resolve$3.invoke(CachedCredentialsProvider.kt)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider$resolve$3.invoke(CachedCredentialsProvider.kt)
	at aws.smithy.kotlin.runtime.util.CachedValue.getOrLoad(CachedValue.kt:80)
	at aws.smithy.kotlin.runtime.auth.awscredentials.CachedCredentialsProvider.resolve(CachedCredentialsProvider.kt:61)
	at aws.smithy.kotlin.runtime.http.operation.AuthHandler.call(SdkOperationExecution.kt:288)
	at aws.smithy.kotlin.runtime.http.operation.AuthHandler.call(SdkOperationExecution.kt:264)
	at aws.sdk.kotlin.runtime.http.middleware.AwsRetryHeaderMiddleware.handle(AwsRetryHeaderMiddleware.kt:39)
	at aws.sdk.kotlin.runtime.http.middleware.AwsRetryHeaderMiddleware.handle(AwsRetryHeaderMiddleware.kt:26)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.io.middleware.Phase.handle(Phase.kt:68)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware.tryAttempt-BWLJW6A(RetryMiddleware.kt:87)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware.access$tryAttempt-BWLJW6A(RetryMiddleware.kt:35)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware$handle$result$outcome$1$invokeSuspend$$inlined$withSpan$default$1.invokeSuspend(CoroutineContextTraceExt.kt:92)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware$handle$result$outcome$1$invokeSuspend$$inlined$withSpan$default$1.invoke(CoroutineContextTraceExt.kt)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware$handle$result$outcome$1$invokeSuspend$$inlined$withSpan$default$1.invoke(CoroutineContextTraceExt.kt)
	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:42)
	at kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:164)
	at kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware$handle$result$outcome$1.invokeSuspend(RetryMiddleware.kt:144)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware$handle$result$outcome$1.invoke(RetryMiddleware.kt)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware$handle$result$outcome$1.invoke(RetryMiddleware.kt)
	at aws.smithy.kotlin.runtime.retries.StandardRetryStrategy.doTryLoop(StandardRetryStrategy.kt:64)
	at aws.smithy.kotlin.runtime.retries.StandardRetryStrategy.retry$suspendImpl(StandardRetryStrategy.kt:44)
	at aws.smithy.kotlin.runtime.retries.StandardRetryStrategy.retry(StandardRetryStrategy.kt)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware.handle(RetryMiddleware.kt:50)
	at aws.smithy.kotlin.runtime.http.middleware.RetryMiddleware.handle(RetryMiddleware.kt:35)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.http.operation.MutateHandler.call(SdkOperationExecution.kt:261)
	at aws.smithy.kotlin.runtime.http.operation.MutateHandler.call(SdkOperationExecution.kt:258)
	at aws.smithy.kotlin.runtime.io.middleware.ModifyRequestMiddleware.handle(ModifyRequest.kt:26)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.io.middleware.ModifyRequestMiddleware.handle(ModifyRequest.kt:26)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.io.middleware.ModifyRequestMiddleware.handle(ModifyRequest.kt:26)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.io.middleware.Phase.handle(Phase.kt:68)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.http.operation.SerializeHandler.call(SdkOperationExecution.kt:254)
	at aws.smithy.kotlin.runtime.http.operation.SerializeHandler.call(SdkOperationExecution.kt:233)
	at aws.smithy.kotlin.runtime.http.operation.InitializeHandler.call(SdkOperationExecution.kt:230)
	at aws.smithy.kotlin.runtime.io.middleware.Phase.handle(Phase.kt:64)
	at aws.smithy.kotlin.runtime.io.middleware.DecoratedHandler.call(Middleware.kt:47)
	at aws.smithy.kotlin.runtime.http.operation.OperationHandler.call(SdkOperationExecution.kt:210)
	at aws.smithy.kotlin.runtime.http.operation.OperationHandler.call(SdkOperationExecution.kt:202)
	at aws.smithy.kotlin.runtime.http.operation.SdkHttpOperationKt$execute$$inlined$withSpan$1.invokeSuspend(CoroutineContextTraceExt.kt:76)
	at aws.smithy.kotlin.runtime.http.operation.SdkHttpOperationKt$execute$$inlined$withSpan$1.invoke(CoroutineContextTraceExt.kt)
	at aws.smithy.kotlin.runtime.http.operation.SdkHttpOperationKt$execute$$inlined$withSpan$1.invoke(CoroutineContextTraceExt.kt)
	at kotlinx.coroutines.intrinsics.UndispatchedKt.startUndispatchedOrReturn(Undispatched.kt:42)
	at kotlinx.coroutines.BuildersKt__Builders_commonKt.withContext(Builders.common.kt:164)
	at kotlinx.coroutines.BuildersKt.withContext(Unknown Source)
	at aws.smithy.kotlin.runtime.http.operation.SdkHttpOperationKt.execute(SdkHttpOperation.kt:219)
	at aws.smithy.kotlin.runtime.http.operation.SdkHttpOperationKt.roundTrip(SdkHttpOperation.kt:107)
	at aws.sdk.kotlin.services.dynamodb.DefaultDynamoDbClient.query(DefaultDynamoDbClient.kt:1866)
	at aws.sdk.kotlin.hll.dynamodbmapper.operations.QueryKt$queryOperation$3.invokeSuspend(Query.kt:215)
	at aws.sdk.kotlin.hll.dynamodbmapper.operations.QueryKt$queryOperation$3.invoke(Query.kt)
	at aws.sdk.kotlin.hll.dynamodbmapper.operations.QueryKt$queryOperation$3.invoke(Query.kt)
	at aws.sdk.kotlin.hll.dynamodbmapper.pipeline.internal.Operation.doLowLevelInvoke(Operation.kt:97)
	at aws.sdk.kotlin.hll.dynamodbmapper.pipeline.internal.Operation.execute(Operation.kt:39)
	at aws.sdk.kotlin.hll.dynamodbmapper.operations.IndexOperationsImpl.query(IndexOperations.kt:71)
	at aws.sdk.kotlin.hll.dynamodbmapper.model.internal.IndexImplKt$indexImpl$2.query(IndexImpl.kt)
	at com.amazon.pomelo.simulation.SimulatorService.getPendingSimulations(SimulatorService.kt:280)
	at com.amazon.pomelo.simulation.SimulatorService.getPendingSimulations$default(SimulatorService.kt:158)
	at com.amazon.pomelo.simulation.PendingSimulationPoller.updateOldestSimulation$PomeloService(PendingSimulationPoller.kt:47)
	at com.amazon.pomelo.simulation.PendingSimulationPoller$startPolling$1.invokeSuspend(PendingSimulationPoller.kt:35)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:101)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:589)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:832)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:720)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:707)
	Suppressed: aws.sdk.kotlin.runtime.auth.credentials.ProviderConfigurationException: Missing value for system property `aws.accessKeyId`
		at aws.sdk.kotlin.runtime.auth.credentials.SystemPropertyCredentialsProvider.requireProperty(SystemPropertyCredentialsProvider.kt:35)
		at aws.sdk.kotlin.runtime.auth.credentials.SystemPropertyCredentialsProvider.resolve(SystemPropertyCredentialsProvider.kt:43)
		at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$suspendImpl$$inlined$withSpan$default$1.invokeSuspend(CoroutineContextTraceExt.kt:86)
		... 86 more
	Suppressed: aws.sdk.kotlin.runtime.auth.credentials.ProviderConfigurationException: Missing value for environment variable `AWS_ACCESS_KEY_ID`
		at aws.sdk.kotlin.runtime.auth.credentials.EnvironmentCredentialsProvider.requireEnv(EnvironmentCredentialsProvider.kt:35)
		at aws.sdk.kotlin.runtime.auth.credentials.EnvironmentCredentialsProvider.resolve(EnvironmentCredentialsProvider.kt:43)
		at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$suspendImpl$$inlined$withSpan$default$1.invokeSuspend(CoroutineContextTraceExt.kt:86)
		... 86 more
	Suppressed: aws.sdk.kotlin.runtime.auth.credentials.ProviderConfigurationException: Required field `roleArn` could not be automatically inferred for StsWebIdentityCredentialsProvider. Either explicitly pass a value, set the environment variable `AWS_ROLE_ARN`, or set the JVM system property `aws.roleArn`
		at aws.sdk.kotlin.runtime.auth.credentials.StsWebIdentityCredentialsProvider$Companion.fromEnvironment-TUY-ock(StsWebIdentityCredentialsProvider.kt:197)
		at aws.sdk.kotlin.runtime.auth.credentials.StsWebIdentityCredentialsProvider$Companion.fromEnvironment-TUY-ock$default(StsWebIdentityCredentialsProvider.kt:90)
		at aws.sdk.kotlin.runtime.auth.credentials.StsWebIdentityProvider.resolve(DefaultChainCredentialsProvider.kt:97)
		at aws.smithy.kotlin.runtime.identity.IdentityProviderChain$resolve$suspendImpl$$inlined$withSpan$default$1.invokeSuspend(CoroutineContextTraceExt.kt:86)
		... 86 more
	Suppressed: aws.sdk.kotlin.runtime.auth.credentials.ProviderConfigurationException: could not find source profile default
		at aws.sdk.kotlin.runtime.auth.credentials.profile.ProfileChain$Companion.resolve$aws_config(ProfileChain.kt:355)
		at aws.sdk.kotlin.runtime.auth.credentials.ProfileCredentialsProvider.resolve(ProfileCredentialsProvider.kt:123)
		at aws.sdk.kotlin.runtime.auth.credentials.ProfileCredentialsProvider$resolve$1.invokeSuspend(ProfileCredentialsProvider.kt)
		at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
		at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:266)
		at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:100)
		at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
		... 5 more
	Suppressed: aws.smithy.kotlin.runtime.auth.awscredentials.CredentialsProviderException: Failed to get credentials from container metadata service
		at aws.sdk.kotlin.runtime.auth.credentials.EcsCredentialsProvider.resolve(EcsCredentialsProvider.kt:110)
		at aws.sdk.kotlin.runtime.auth.credentials.EcsCredentialsProvider$resolve$1.invokeSuspend(EcsCredentialsProvider.kt)
		at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
		at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:266)
		at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:100)
		at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
		at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:266)
		at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:100)
		at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46)
		at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:99)
		... 4 more
	Caused by: aws.smithy.kotlin.runtime.http.HttpException: java.io.IOException: unexpected end of stream on http://169.254.170.2/...; HttpErrorCode(CONNECTION_CLOSED)
		at aws.smithy.kotlin.runtime.http.engine.okhttp.OkHttpEngine.roundTrip(OkHttpEngine.kt:196)
		at aws.smithy.kotlin.runtime.http.engine.okhttp.OkHttpEngine$roundTrip$1.invokeSuspend(OkHttpEngine.kt)
		at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
		... 5 more
	Caused by: java.io.IOException: unexpected end of stream on http://169.254.170.2/...
		at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:220)
		at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:114)
		at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93)
		at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:126)
		at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
		at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:126)
		at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
		at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:126)
		at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
		at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:126)
		at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:72)
		at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:126)
		at aws.smithy.kotlin.runtime.http.engine.okhttp.MetricsInterceptor.intercept(MetricsInterceptor.kt:32)
		at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:126)
		at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:203)
		at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:527)
		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
		at java.base/java.lang.Thread.run(Thread.java:1583)
	Caused by: java.io.EOFException: \n not found: limit=0 content=…
		at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.kt:341)
		at okhttp3.internal.http1.HeadersReader.readLine(HeadersReader.kt:29)
		at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:188)
		... 18 more
	Suppressed: aws.smithy.kotlin.runtime.auth.awscredentials.CredentialsProviderException: Failed to load instance profile name
		at aws.sdk.kotlin.runtime.auth.credentials.ImdsCredentialsProviderKt.wrapAsCredentialsProviderException(ImdsCredentialsProvider.kt:236)
		at aws.sdk.kotlin.runtime.auth.credentials.ImdsCredentialsProviderKt.access$wrapAsCredentialsProviderException(ImdsCredentialsProvider.kt:1…

Steps to Reproduce

Unknown, likely related to network conditions or request volume

Possible Solution

No response

Context

No response

AWS SDK for Kotlin version

latest

Platform (JVM/JS/Native)

JVM

Operating system and version

any

Metadata

Assignees

No one assigned

Labels

bug: This issue is a bug.
needs-triage: This issue or PR still needs to be triaged.
