@herin049
Contributor

A few months ago my team began noticing occasional bind: address already in use errors on cold starts when using the Lambda collector layer (see #1740). We came up with a fix internally, but since then the issue appears to have been mostly solved by #1751. However, from a code maintainability and personal taste perspective, that fix is not ideal: it addresses the issue naively by running a brute-force search over the ephemeral port range. A more optimal solution is to bind to port 0, which lets the OS automatically assign an available ephemeral port (see https://man7.org/linux/man-pages/man7/ip.7.html). This PR switches to that approach and adds a basic set of tests for the telemetry API listener. Any additional comments are welcome, as always.
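For readers unfamiliar with the technique, here is a minimal sketch of binding to port 0 in Go. It is not the code from this PR: the helper name listenOnFreePort and the 127.0.0.1 host are assumptions chosen for illustration, and the actual listener may bind to a different host.

```go
package main

import (
	"fmt"
	"net"
)

// listenOnFreePort is a hypothetical helper (not from this PR). It binds a
// TCP listener on host with port 0, which asks the OS to pick an unused
// ephemeral port, and returns the listener together with the chosen port.
func listenOnFreePort(host string) (net.Listener, int, error) {
	ln, err := net.Listen("tcp", net.JoinHostPort(host, "0"))
	if err != nil {
		return nil, 0, err
	}
	// The OS-assigned port is only known after the bind succeeds, so read
	// it back from the listener's address.
	addr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		ln.Close()
		return nil, 0, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return ln, addr.Port, nil
}

func main() {
	// 127.0.0.1 is an assumption for local experimentation.
	ln, port, err := listenOnFreePort("127.0.0.1")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	fmt.Printf("listening on port %d\n", port)
}
```

Because the port is chosen at bind time, whatever URI is later registered for the listener has to be built from the returned port rather than from a constant.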

@herin049 herin049 requested a review from a team as a code owner November 15, 2025 06:00
@wpessers wpessers added go Pull requests that update Go code enhancement New feature or request labels Nov 19, 2025
@wpessers
Contributor

Thank you for the work @herin049, I'm definitely more in favour of this approach, getting rid of the random port selection. Have you been running your version of the collector in a production environment?

@herin049
Contributor Author

herin049 commented Nov 27, 2025

> Thank you for the work @herin049, I'm definitely more in favour of this approach, getting rid of the random port selection. Have you been running your version of the collector in a production environment?

Thank you for the review @wpessers!

Unfortunately, we have not been running these changes in production, but we have run them in our lower environments to validate that they fix the original bug.

After a little bit of digging, it seems that other Lambda extension authors use this approach as well, in particular the NewRelic extension: https://github.com/newrelic/newrelic-lambda-extension/blob/2a5c012f5082d2f8f056252e87d2d3eecad5479a/lambda/logserver/logserver.go#L270
Personally, I therefore view these changes as safe to make.

Unfortunately, AWS does not seem to give any guidance on whether it is preferable to bind to port 0 when registering telemetry listeners. In my mind this is the only viable approach, especially considering that multiple listeners may be created by multiple extensions.
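The point about multiple listeners can be illustrated with a small sketch. It opens several listeners on port 0 inside a single process, which only simulates the real situation of separate extension processes; the helper name bindAll and the 127.0.0.1 host are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// bindAll is a hypothetical helper. It opens n TCP listeners on
// 127.0.0.1:0 and returns them along with the ports the OS assigned.
// On failure it closes any listeners it already opened.
func bindAll(n int) ([]net.Listener, []int, error) {
	listeners := make([]net.Listener, 0, n)
	ports := make([]int, 0, n)
	for i := 0; i < n; i++ {
		ln, err := net.Listen("tcp", "127.0.0.1:0")
		if err != nil {
			for _, open := range listeners {
				open.Close()
			}
			return nil, nil, err
		}
		listeners = append(listeners, ln)
		ports = append(ports, ln.Addr().(*net.TCPAddr).Port)
	}
	return listeners, ports, nil
}

func main() {
	listeners, ports, err := bindAll(3)
	if err != nil {
		panic(err)
	}
	for _, ln := range listeners {
		defer ln.Close()
	}
	// While all listeners stay open, each one holds a different port.
	fmt.Println("assigned ports:", ports)
}
```

Since the OS hands out a port that is free at bind time, no listener needs to know which ports the others are using, which is the property that a hard-coded or searched port cannot guarantee.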

@wpessers
Contributor

@herin049 The listener code looks good to me; I just have some comments and questions I'd like answered before moving forward with this one. Again, a disclaimer that I'm far from an expert in Go, which is also why it took me some time to get back to you.

@serkan-ozal
Contributor

I am OK with the changes here

@herin049 herin049 force-pushed the improve-telemetryapi-listener branch from f031c9d to cb4c892 Compare November 30, 2025 06:48
@herin049
Contributor Author

@wpessers I've addressed all your original comments on the unit tests and made some additional improvements.

@herin049 herin049 requested a review from wpessers November 30, 2025 06:49
Contributor

@wpessers wpessers left a comment

Addressed concerns wherever relevant, looks good to me!

@wpessers
Contributor

wpessers commented Nov 30, 2025

Thank you for contributing @herin049 🚀

@wpessers wpessers merged commit cc43f96 into open-telemetry:main Nov 30, 2025
14 checks passed
