Merge Lambda Managed Instance feature branch #947
Merged
`INVOKE` event subscription in elevator would crash
* add `ec2-capacity-provider` init type needed for elevator mode
* improve debug log for telemetry error while serializing
* update `ReportMetrics` to be an enum to allow `Elevator` metrics, which allows us to have a diff to check in the other components
* set metrics for lifecycle given the type of report
* only send log for `OnDemand` metrics
* send correct enhanced metrics given the type of report
* add doc comments
* fmt
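To illustrate the shape of that change, here is a minimal sketch of a `ReportMetrics` enum where only the `OnDemand` variant produces a log line. The variant names come from the commit message; the payload structs, their fields, and the `report_log` helper are assumptions made for illustration and are not taken from the extension's code.

```rust
/// Metrics carried by an on-demand report.
/// Field names are illustrative assumptions, not the extension's real fields.
#[derive(Debug, Clone, PartialEq)]
pub struct OnDemandReport {
    pub duration_ms: f64,
    pub billed_duration_ms: u64,
    pub max_memory_used_mb: u64,
}

/// Metrics carried by an elevator-mode report (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct ElevatorReport {
    pub duration_ms: f64,
}

/// One enum instead of one struct, so callers must handle both report kinds.
#[derive(Debug, Clone, PartialEq)]
pub enum ReportMetrics {
    OnDemand(OnDemandReport),
    Elevator(ElevatorReport),
}

impl ReportMetrics {
    /// Duration is available for either kind of report.
    pub fn duration_ms(&self) -> f64 {
        match self {
            ReportMetrics::OnDemand(m) => m.duration_ms,
            ReportMetrics::Elevator(m) => m.duration_ms,
        }
    }

    /// Hypothetical helper: build a log line only for on-demand reports.
    pub fn report_log(&self) -> Option<String> {
        match self {
            ReportMetrics::OnDemand(m) => Some(format!(
                "REPORT Duration: {:.2} ms Billed Duration: {} ms Max Memory Used: {} MB",
                m.duration_ms, m.billed_duration_ms, m.max_memory_used_mb
            )),
            ReportMetrics::Elevator(_) => None,
        }
    }
}
```

Because every `match` on the enum is exhaustive, adding another report kind later would surface each call site that needs a decision at compile time.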
route was changed, as opposed to schema version
…ce) mode support with stats generation

https://datadoghq.atlassian.net/browse/SVLS-7584

Implement comprehensive LMI mode support for concurrent Lambda invocations:
* Add background periodic flusher for continuous data collection in LMI mode
* Implement PlatformReport event handling with proper stats generation
* Add LMI mode REPORT log formatting with status, duration, and error details
* Integrate StatsGenerator and StatsConcentratorService throughout event pipeline
* Add missing stats_generator field to SendingTraceProcessor for both PlatformReport and PlatformRuntimeDone events

Architecture improvements:
* Remove InvocationProcessorService wrapper, use Arc<TokioMutex> directly
* Simplify event handling by passing stats_concentrator to all event handlers
* Add #[must_use] attribute to Listener::new() for better API safety
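For readers unfamiliar with the pattern, a background periodic flusher can be sketched roughly as below. This is not the extension's implementation: the commit uses Tokio, while this sketch swaps in plain `std` threads and channels so it is self-contained. `PeriodicFlusher`, `start`, `stop`, and the `buffer`/`sink` parameters are all hypothetical names.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical flusher: moves buffered items to a sink on a fixed interval.
pub struct PeriodicFlusher {
    stop_tx: Sender<()>,
    handle: JoinHandle<()>,
}

impl PeriodicFlusher {
    /// Spawn the background thread. `#[must_use]` makes the compiler warn if
    /// the returned handle is discarded, which would make a clean stop impossible.
    #[must_use]
    pub fn start(
        buffer: Arc<Mutex<Vec<String>>>,
        sink: Arc<Mutex<Vec<String>>>,
        interval: Duration,
    ) -> Self {
        let (stop_tx, stop_rx) = mpsc::channel::<()>();
        let handle = thread::spawn(move || loop {
            // Wait for either the interval to elapse or a stop signal.
            let outcome = stop_rx.recv_timeout(interval);

            // Flush on every wake-up, including the final one before exit.
            let drained: Vec<String> = buffer.lock().unwrap().drain(..).collect();
            sink.lock().unwrap().extend(drained);

            match outcome {
                Err(RecvTimeoutError::Timeout) => continue, // keep ticking
                _ => break, // stop requested or sender dropped
            }
        });
        Self { stop_tx, handle }
    }

    /// Signal the thread, then wait for its final flush to complete.
    pub fn stop(self) {
        let _ = self.stop_tx.send(());
        let _ = self.handle.join();
    }
}
```

Flushing once more after the stop signal means data buffered between the last tick and shutdown is not dropped, which matters when invocations run concurrently and there is no single end-of-invocation point to flush at.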
https://datadoghq.atlassian.net/browse/SVLS-7836?atlOrigin=eyJpIjoiMWNmZTMzOGE4NGEwNDE4MTk5Njk0N2ZmMmU3MzExMjgiLCJwIjoiaiJ9

The extension currently neither creates SnapStart spans nor emits SnapStart metrics. This PR adds both.

When a Lambda with SnapStart enabled is invoked for the first time, we receive `Platform.RestoreStart` and `Platform.RestoreReport`. These effectively take the place of the `Platform.InitStart` and `Platform.InitReport` events, so the code flow is nearly identical to how we handle the cold start span and duration metric.

Note: when a SnapStart instance is restored, we still receive `Platform.InitStart` and `Platform.InitReport` in addition to `Platform.RestoreStart` and `Platform.RestoreReport`. However, these `Init` events do not come from the sandbox starting for that invoke; they are generated when the snapshot is created. This is very misleading: this [trace](https://ddserverless.datadoghq.com/serverless/aws/lambda?fromUser=false&graphType=flamegraph&group=&highlight=snapstart-java-cdk-function&panel_end=1761860524106&panel_paused=false&panel_start=1761846124106&shouldShowLegend=true&sp=%5B%7B%22p%22%3A%7B%22entityId%22%3A%22aws-lambda-functions%2Bsnapstart-java-cdk-function%2Bus-east-1%2B425362996713%22%7D%2C%22i%22%3A%22lambda-panel%22%7D%2C%7B%22p%22%3A%7B%22traceID%22%3A%225400520227836710313%22%2C%22selectedSpanID%22%3A%22644948261311059067%22%7D%2C%22i%22%3A%22trace-panel%22%7D%5D&spanID=644948261311059067&text_search=snapstart&traceID=5400520227836710313&traceQuery=&start=1761845683104&end=1761860083104&paused=false) is more than 3 hours long, because the Lambda was invoked more than 3 hours after the snapshot version was created. (This is the current experience.)
I deployed my own extension with the changes and confirmed we are now getting a restore span and not an init span, [link](https://ddserverless.datadoghq.com/serverless/aws/lambda?fromUser=false&graphType=flamegraph&group=&panel_end=1761860640000&panel_paused=false&panel_start=1761846240000&shouldShowLegend=true&sp=%5B%7B%22p%22%3A%7B%22entityId%22%3A%22aws-lambda-functions%2Bsnapstart-java-function%2Bus-east-1%2B425362996713%22%7D%2C%22i%22%3A%22lambda-panel%22%7D%2C%7B%22p%22%3A%7B%22traceID%22%3A%226634828896084800457%22%2C%22selectedSpanID%22%3A%222017721198037440020%22%7D%2C%22i%22%3A%22trace-panel%22%7D%5D&spanID=2017721198037440020&text_search=snapstart&traceID=6634828896084800457&traceQuery=&start=1761845683104&end=1761860083104&paused=false).
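The rule described above (prefer the restore events over the init events when both arrive) can be sketched as follows. This is an illustrative reduction, not the PR's code: the `PlatformEvent` and `StartupSpan` types, their fields, and `startup_span` are hypothetical, and real telemetry events carry more data than shown.

```rust
/// Simplified stand-ins for the telemetry events named in the description.
/// The fields are assumptions chosen for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PlatformEvent {
    InitStart { time_ms: u64 },
    InitReport { duration_ms: f64 },
    RestoreStart { time_ms: u64 },
    RestoreReport { duration_ms: f64 },
}

/// Which startup span to emit for the first invocation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StartupSpan {
    ColdStart { start_ms: u64, duration_ms: f64 },
    SnapStartRestore { start_ms: u64, duration_ms: f64 },
}

/// Pick the startup span. If a restore start/report pair is present, it wins,
/// because any init events then date from snapshot creation, not this invoke.
pub fn startup_span(events: &[PlatformEvent]) -> Option<StartupSpan> {
    let mut init_start = None;
    let mut init_duration = None;
    let mut restore_start = None;
    let mut restore_duration = None;

    for event in events {
        match *event {
            PlatformEvent::InitStart { time_ms } => init_start = Some(time_ms),
            PlatformEvent::InitReport { duration_ms } => init_duration = Some(duration_ms),
            PlatformEvent::RestoreStart { time_ms } => restore_start = Some(time_ms),
            PlatformEvent::RestoreReport { duration_ms } => restore_duration = Some(duration_ms),
        }
    }

    if let (Some(start_ms), Some(duration_ms)) = (restore_start, restore_duration) {
        return Some(StartupSpan::SnapStartRestore { start_ms, duration_ms });
    }
    match (init_start, init_duration) {
        (Some(start_ms), Some(duration_ms)) => {
            Some(StartupSpan::ColdStart { start_ms, duration_ms })
        }
        _ => None,
    }
}
```

With this selection, an init start recorded hours earlier at snapshot creation no longer stretches the trace, since the span is anchored at the restore start instead.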
Switch to new value of `AWS_LAMBDA_INIT_TYPE`

Minor fix to ensure successful local testing.
…LS-7879] (#44)
* ship logs between invocations without request_id
* fmt
* test
* Minor change to prepare for code merge
…el [SVLS-7906] (#47)
* emit fd/threads metrics at shutdown
* pause monitoring on no active invocations
* fmt
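One way to picture the two behaviors in this commit (skip sampling while nothing is running, report the observed peaks at shutdown) is the sketch below. Everything in it is hypothetical: `UsageMonitor`, its methods, and the metric names `fd_use` and `threads_use` are placeholders for illustration and are not taken from the extension.

```rust
/// Hypothetical monitor that tracks peak file-descriptor and thread usage.
#[derive(Debug, Default)]
pub struct UsageMonitor {
    active_invocations: usize,
    peak_fds: u64,
    peak_threads: u64,
}

impl UsageMonitor {
    pub fn invocation_started(&mut self) {
        self.active_invocations += 1;
    }

    pub fn invocation_finished(&mut self) {
        // saturating_sub guards against an unmatched "finished" call.
        self.active_invocations = self.active_invocations.saturating_sub(1);
    }

    /// Monitoring is paused whenever no invocation is in flight.
    pub fn is_paused(&self) -> bool {
        self.active_invocations == 0
    }

    /// Record one sample; returns false if it was skipped because of the pause.
    pub fn record_sample(&mut self, fds: u64, threads: u64) -> bool {
        if self.is_paused() {
            return false;
        }
        self.peak_fds = self.peak_fds.max(fds);
        self.peak_threads = self.peak_threads.max(threads);
        true
    }

    /// Metrics to emit once at shutdown (names are placeholders).
    pub fn shutdown_metrics(&self) -> Vec<(&'static str, u64)> {
        vec![("fd_use", self.peak_fds), ("threads_use", self.peak_threads)]
    }
}
```

Counting active invocations rather than using a boolean keeps the pause correct when several invocations overlap, which is the case this feature branch targets.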
* create empty context on init start to be updated on platform start/invoke
* clippy
astuyve approved these changes Dec 1, 2025
https://datadoghq.atlassian.net/browse/SVLS-8080
Overview
Merge Lambda Managed Instance feature branch
Testing
Covered by individual commits