fix: http call waiting for idle connection monitor to finish #1178
Conversation
```kotlin
all {
    languageSettings.optIn("aws.smithy.kotlin.runtime.InternalApi")
    languageSettings.optIn("okhttp3.ExperimentalOkHttpApi")
```
Correctness: We should generally not opt into experimental/internal annotations we don't own across entire modules. Our opt-ins should be narrowly scoped and intentional. Please remove this and add @OptIn(ExperimentalOkHttpApi::class) on the specific methods or lines that require the opt-in.
```kotlin
@OptIn(ExperimentalOkHttpApi::class)
internal class ConnectionIdleMonitor(val pollInterval: Duration) : ConnectionListener() {
    private val monitorScope = CoroutineScope(Dispatchers.IO + SupervisorJob())
```
Question: HTTP engines have their own coroutine scope. Could we reuse that as a parent scope instead of creating a new one from scratch?
I think so, let me test this.
Hmm, that's unfortunate. We could probably structure engine closure a bit better to allow for child scopes/jobs but that's a bigger and riskier change. We can stick with this for now.
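As a hedged sketch of what parenting the monitor to an engine-owned scope could look like (note `engineScope` here is a stand-in for illustration, not the actual engine API):

```kotlin
import kotlinx.coroutines.*

fun main() = runBlocking {
    // Stand-in for the engine's own scope; the real engine wiring may differ.
    val engineScope = CoroutineScope(SupervisorJob())

    // Parent the monitor's Job to the engine's Job so cancelling the engine
    // scope propagates to the monitor automatically.
    val monitorScope = CoroutineScope(
        Dispatchers.IO + SupervisorJob(engineScope.coroutineContext[Job]),
    )

    val monitor = monitorScope.launch(start = CoroutineStart.UNDISPATCHED) {
        try {
            awaitCancellation() // stand-in for the idle-polling loop
        } finally {
            println("monitor cancelled")
        }
    }

    // Closing the engine cancels the whole job tree, monitor included,
    // and join() waits for every descendant to finish.
    engineScope.coroutineContext[Job]!!.cancelAndJoin()
    println("monitor.isCancelled = ${monitor.isCancelled}")
}
```

The key point is that structured concurrency makes engine closure subsume monitor closure, which is exactly the restructuring described above as a bigger, riskier change.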
```kotlin
fun close(): Unit = runBlocking {
    monitors.values.forEach { it.cancelAndJoin() }
    monitorScope.cancel()
}
```
Question: Cancelling/joining child jobs sequentially in a loop is time-consuming. Moreover, the docs say that cancelling a parent job (even a SupervisorJob) should automatically cancel all child jobs. Does this code accomplish the same thing?
```kotlin
fun close(): Unit = runBlocking {
    monitorScope.cancel()
}
```
The difference is that we're not waiting for the child jobs to actually finish cancelling (join). So we could be leaving some resources behind.
I think it should improve performance if we just cancel and don't wait for a join. It shouldn't be an issue unless someone is using a lot of clients.
I tested this and time to cancel the connection idle monitor(s) went from ~200ms to ~600us. Now that I think of it, if someone is opening and closing clients very often and with a lot of connections per client then not waiting for the connection monitors to join could be a bigger issue.
Ah I see, you're right that could be an issue. So then how about:
```kotlin
fun close(): Unit = runBlocking {
    monitorScope.cancelAndJoin()
}
```
Unfortunately there's no cancelAndJoin for CoroutineScope, only Job.
Oh jeez you're right. Well the CoroutineScope.cancel() method implementation pretty much just verifies the scope contains a job and then calls cancel on that job. We can do something similar here:
```kotlin
val monitorJob = requireNotNull(monitorScope.coroutineContext[Job]) { "<some error msg>" }
monitorJob.cancelAndJoin()
```
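A minimal runnable sketch of that pattern, using stand-in monitor jobs: `CoroutineScope.cancel()` simply cancels the `Job` in the scope's context, so pulling that `Job` out and calling `cancelAndJoin()` on it both cancels and waits for every child monitor.

```kotlin
import kotlinx.coroutines.*

fun main() {
    val monitorScope = CoroutineScope(Dispatchers.IO + SupervisorJob())

    // Simulate a few connection-idle monitors parked in their polling loops.
    repeat(3) { i ->
        monitorScope.launch(start = CoroutineStart.UNDISPATCHED) {
            try {
                awaitCancellation()
            } finally {
                println("monitor $i finished")
            }
        }
    }

    runBlocking {
        // Same idea as the suggestion above: grab the scope's Job directly.
        val monitorJob = requireNotNull(monitorScope.coroutineContext[Job]) {
            "monitorScope must contain a Job"
        }
        monitorJob.cancelAndJoin() // cancels and waits for all child monitors
    }
    println("scope closed")
}
```

Because `Job.join()` only returns once all children have completed, "scope closed" is guaranteed to print after every monitor has finished cleaning up.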
```kotlin
private val metrics = HttpClientMetrics(TELEMETRY_SCOPE, config.telemetryProvider)
private val client = config.buildClient(metrics)
private val connectionIdleMonitor = if (config.connectionIdlePollingInterval != null) ConnectionIdleMonitor(config.connectionIdlePollingInterval) else null
```
Nit:
```kotlin
private val connectionIdleMonitor = config.connectionIdlePollingInterval?.let { ConnectionIdleMonitor(it) }
```

```kotlin
@OptIn(ExperimentalOkHttpApi::class)
val connectionListener = if (config.connectionIdlePollingInterval == null) {
    ConnectionListener.NONE
} else {
    ConnectionIdleMonitor(connectionIdlePollingInterval)
}
```
```diff
 // use our own pool configured with the timeout settings taken from config
 @OptIn(ExperimentalOkHttpApi::class)
 val pool = ConnectionPool(
     maxIdleConnections = 5, // The default from the no-arg ConnectionPool() constructor
     keepAliveDuration = config.connectionIdleTimeout.inWholeMilliseconds,
     TimeUnit.MILLISECONDS,
     connectionListener = connectionListener,
 )
-connectionPool(pool)
+connectionPool(poolOverride ?: pool)
```
Nit: With this change we're now constructing a ConnectionPool we'll just throw away if poolOverride is set. We should only construct the OkHttp4-safe pool if poolOverride is null.
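A runnable analogue of this nit using a stand-in `Pool` class (the OkHttp types are omitted for brevity): Kotlin's Elvis operator evaluates its right-hand side lazily, so the default pool is only constructed when no override is supplied.

```kotlin
// Stand-in for OkHttp's ConnectionPool; logs when it is constructed.
class Pool(val name: String) {
    init { println("constructed $name") }
}

// The right-hand side of ?: is not evaluated when override is non-null,
// so no throwaway default Pool is ever built.
fun buildPool(override: Pool?): Pool = override ?: Pool("default")

fun main() {
    val a = buildPool(Pool("override")) // default never constructed
    println("using ${a.name}")
    val b = buildPool(null)             // now the default is constructed
    println("using ${b.name}")
}
```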
```kotlin
logger.trace { "Attempting to reset soTimeout..." }
try {
    conn.socket().soTimeout = oldTimeout
    logger.trace { "SoTimeout reset." }
```
Nit: soTimeout (lowercase "s" in the log message)
```kotlin
override fun shutdown() {
    client.connectionPool.evictAll()
    client.dispatcher.executorService.shutdown()
    metrics.close()
}
```
Nice catch
ianbotsf left a comment:
Approved pending cancelAndJoin feedback.
Affected Artifacts: changed in size
Issue #
Original issue that sparked idle connection monitoring: aws/aws-sdk-kotlin#1214
OkHttp4 was broken by idle connection monitoring: #1176
Description of changes
- `shutdown` was missing from `OkHttp4Engine`, so added it
- `OkHttpEngineConfig.buildClient` fixes OkHttp4 Support Broken Due to Idle Connection PR #1176 by not including `connectionListener` as part of `OkHttp4Engine`
- Calling S3 `getObject` twice went from taking ~6 sec to ~370ms

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.