docs: adding new Lambda features #7917
base: develop
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@ Coverage Diff @@
##          develop    #7917   +/-   ##
========================================
  Coverage    96.72%   96.72%
========================================
  Files          275      275
  Lines        13214    13214
  Branches      1006     1006
========================================
  Hits         12781    12781
  Misses         325      325
  Partials       108      108
```
> ### Logger
>
> The Durable Execution SDK provides a `context.logger` that automatically handles **log deduplication during replays**. You can integrate Powertools Logger to get structured JSON logging while keeping the deduplication benefits.
Feels like there should be another word after `context.logger` — is it a class, an instance, a function?
> Tracer works with Durable Functions. Each execution creates trace segments.
>
> ???+ note "Trace continuity"
>     Due to the replay mechanism, traces may not show a continuous flow. Each execution (including replays) creates separate trace segments. Use the `execution_arn` to correlate traces.
Suggested change:

```suggestion
Due to the replay mechanism, traces may not show contiguously. Each execution (including replays) creates separate trace segments. Use the `execution_arn` to correlate traces.
```
Let's try to use a simpler term or expression, as a non-native speaker I'd struggle to understand this term in a tech doc.
How about "due to the replay mechanism, traces may be interleaved"?
> ### Parameters
>
> Parameters utility works correctly, but be aware that **cache is per-process**.
Suggested change:

```suggestion
Parameters utility works as expected, but be aware that **caching is per-process**.
```
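Since "caching is per-process" is easy to gloss over, a stdlib-only sketch of what that means might help the section. The `TTLCache` name and shape are illustrative, not the utility's real internals — the point is only that this state lives in one process, so every worker warms up its own copy:

```python
import time
from typing import Callable

class TTLCache:
    """Illustrative stand-in for the per-process value cache the
    Parameters utility keeps. Instance state lives in ONE process only:
    every worker process has to warm up its own copy."""

    def __init__(self, max_age: float = 5.0):
        self.max_age = max_age
        self._store: dict[str, tuple[float, str]] = {}
        self.fetches = 0  # counts simulated remote calls (e.g. SSM)

    def get(self, name: str, fetch: Callable[[str], str]) -> str:
        entry = self._store.get(name)
        if entry and time.monotonic() - entry[0] < self.max_age:
            return entry[1]  # fresh enough: served from this process's cache
        value = fetch(name)  # stand-in for a real SSM/Secrets Manager call
        self.fetches += 1
        self._store[name] = (time.monotonic(), value)
        return value
```

Powertools' real knob for this is the `max_age` argument to `get_parameter`, which bounds how stale a per-process value can get.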
> 1. **VPC Endpoints** - Private connectivity without internet access
> 2. **NAT Gateway** - Internet access from private subnets
> 3. **Public subnet with Internet Gateway** - Direct internet access
Another option is an egress only IPv6 internet gateway: https://docs.aws.amazon.com/vpc/latest/userguide/egress-only-internet-gateway.html
> ```
>
> ???+ note "Other utilities"
>     All other Powertools for AWS utilities (Feature Flags, Validation, Parser, Data Masking, etc.) work without any changes. If you encounter any issues, please [open an issue](https://github.com/aws-powertools/powertools-lambda-python/issues/new?template=bug_report.yml){target="_blank"}.
All other Powertools for AWS Lambda (Python) utilities ...
We've already said this at L46:

> Powertools for AWS Lambda (Python) works seamlessly with Lambda Managed Instances. All utilities are compatible with the multi-process concurrency model used by Python.
I would remove this "Other utilities" section and just leave the "if you find any issues..." callout.
> ## How Lambda Python runtime handles concurrency
>
> Unlike Java or Node.js which use threads, the **Lambda Python runtime uses multiple processes** for concurrent requests. Each request runs in a separate process, which provides natural isolation between requests.
I think here I'd rather focus on what Python managed runtime does rather than draw a parallel with other languages, if I'm here I'm interested in Python specifically.
> ## Key differences from Lambda (default)
>
> | Aspect | Lambda (default) | Lambda Managed Instances |
Let's use Lambda On Demand as default
> | **Concurrency** | Single invocation per execution environment | Multiple concurrent invocations per environment |
> | **Python model** | One process, one request | Multiple processes, one request each |
> | **Pricing** | Per-request duration | EC2-based with Savings Plans support |
> | **Scaling** | Scale on demand with cold starts | Async scaling based on CPU, no cold starts |
nit: you can still have cold starts if your request volume exceeds the capacity provided in LMI, no?
> This means:
>
> - **Memory is not shared** between concurrent requests
Let's avoid the negation here, if possible at all
> ## Isolation model
>
> Lambda Managed Instances use a different isolation model than Lambda (default):
>
> | Layer | Lambda (default) | Lambda Managed Instances |
> | ---------------------- | ---------------------------------------- | ------------------------------------------ |
> | **Instance level** | Firecracker microVMs on shared AWS fleet | Containers on EC2 Nitro in your account |
> | **Security boundary** | Execution environment | Capacity provider |
> | **Function isolation** | Strong isolation via microVMs | Container-based isolation within instances |
>
> **Capacity providers** serve as the security boundary. Functions within the same capacity provider share the underlying EC2 instances. For workloads requiring strong isolation between functions, use separate capacity providers.
>
> For Python specifically, the multi-process model adds another layer of isolation - each concurrent request runs in its own process with separate memory space.
I am unsure about this section, should we link to the LMI docs with a "Go here to learn about the isolation model of LMI" (or similar) instead?
Previous sections have an immediate impact on the programming model, this is more indirect and a characteristic of LMI that might not belong in the docs of a toolkit like ours.
> ???+ tip "Cache behavior"
>     Since each process has its own cache, you might see more calls to SSM/Secrets Manager during initial warm-up. Once each process has cached the value, subsequent requests within that process use the cache.
I'm aware this is not a problem we've solved either here or in On Demand, but reading this I can't help but think about cache invalidation for parameters, which is now even more apparent with LMI.
I think a sentence about "you can customize the caching behavior with ..." would help here.
> ### Logger
>
> Logger works without any changes. Each process has its own logger instance.
>
> ```python hl_lines="4 7" title="Using Logger with Managed Instances"
> --8<-- "examples/lambda_features/managed_instances/src/using_logger.py"
> ```
I find these "&lt;Utility&gt; works without any changes" sections quite repetitive. Does it make sense to merge at least the core utilities into a single code snippet?
We're already saying above that PT works seamlessly with LMI. We can keep those other sections that require special clarifications (if any).
> ## Working with shared resources
>
> ### The `/tmp` directory
>
> The `/tmp` directory is **shared across all processes** in the execution environment. Use caution when writing files.
>
> ```python title="Safe file handling with unique names"
> --8<-- "examples/lambda_features/managed_instances/src/tmp_file_handling.py"
> ```
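The included snippet isn't visible in the diff view, so for reviewers: a minimal stdlib sketch of the "unique names" idea. The helper name is hypothetical and not necessarily what the example file does:

```python
import os
import uuid

def write_scratch_file(data: bytes, base_dir: str = "/tmp") -> str:
    """Write to a collision-free scratch file.

    /tmp is shared by every process in the execution environment, so a
    fixed filename can be clobbered by a concurrent request. Embedding
    the pid plus a random UUID makes each write unique."""
    path = os.path.join(base_dir, f"scratch-{os.getpid()}-{uuid.uuid4().hex}.bin")
    with open(path, "wb") as handle:
        handle.write(data)
    return path
```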
> ### Database connections
>
> Since each process is independent, connection pooling behaves differently than in threaded runtimes.
>
> ```python title="Database connections per process"
> --8<-- "examples/lambda_features/managed_instances/src/database_connections.py"
> ```
Similar to the isolation model section, unsure if this should be in our docs.
We've already mentioned above that these are differences in the programming model, this doesn't have a direct impact on Powertools since we don't use the /tmp folder but it's more of a general LMI characteristic, and we need to draw the line somewhere.
Other sections like VPC connectivity instead make a lot of sense.
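If the section stays, the per-process connection pattern could be shown with something as small as this — sqlite3 is only a stand-in for a real database client, and the helper name is illustrative:

```python
import os
import sqlite3
from typing import Optional

_conn: Optional[sqlite3.Connection] = None
_owner_pid: Optional[int] = None

def get_connection(dsn: str = ":memory:") -> sqlite3.Connection:
    """Lazily open one connection per process.

    In a multi-process runtime there is no shared pool: each worker
    opens its own connection on first use. The pid check also discards
    a connection object inherited across a fork, which is unsafe to reuse."""
    global _conn, _owner_pid
    if _conn is None or _owner_pid != os.getpid():
        _conn = sqlite3.connect(dsn)
        _owner_pid = os.getpid()
    return _conn
```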
> ### Is my code thread-safe?
>
> For Python, you don't need to worry about thread safety because Lambda Managed Instances uses **multiple processes**, not threads. Each request runs in its own process with isolated memory.
I see what you're going for with this Q/A, but I think the answer can be improved.
I'd leave something like:

> Lambda Managed Instances uses multiple processes, instead of threads. Each request runs in its own process with isolated memory. If you implement multi-threading, you're responsible for it.

or something similar.
> ### Do I need to change my existing Powertools for AWS code?
>
> No changes are required. Your existing code will work as-is with Lambda Managed Instances.
I'd call out that they should upgrade to at least version 3.x.x of Powertools for this statement to be true.
> )
>
> # Emit only at the end
> metrics.add_metric(name="WorkflowCompleted", unit=MetricUnit.Count, value=1)
Should this be in its own step?
I know that in this case it gets run only once because it's the last operation, but having it here might lead to think emitting a metric like this in the function body is always safe, which is technically not.
It feels like this, plus the examples/lambda_features/durable_functions/src/using_logger.py and the examples/lambda_features/durable_functions/src/log_deduplication.py are repeating the same concept over and over.
Should we have a single snippet that shows a couple of logs, one using `context.logger` and one without, with comments that explain what happens?
Unless I'm missing something, this is a copy of examples/lambda_features/durable_functions/src/best_practice_metrics.py or close to it - can we keep only one?
> @durable_execution
> def handler(event: dict, context: DurableContext) -> str:
>     # Parameters are fetched on each execution
Is this always true?
I thought that if the replay or execution happens within a certain time frame the request could land on the same execution environment and thus hit the cache?
I don't think this adds much to our docs tbh.
There's really nothing specific to LMI here as far as I can tell - unsure we need a code snippet
There's nothing DF specific here, I would consider skipping this snippet.
> | **Steps** | Business logic with built-in retries and progress tracking |
> | **Waits** | Suspend execution without incurring compute charges |
Use singular names like Step and Wait for consistency
> Durable functions use a **checkpoint/replay mechanism**:
>
> 1. Your code runs from the beginning
Suggested change:

```suggestion
1. Your code always runs from the beginning
```
> 1. Your code runs from the beginning
> 2. Completed operations are skipped using stored results
> 3. Execution continues from where it left off
This wording is slightly off - it might be me - but it reads as if it contradicts the first point.
I'd say that execution of new steps continues from where it left off or similar.
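A toy model might defuse the confusion between points 1 and 3 — pure Python, not the SDK's actual storage or API, just the checkpoint/replay idea:

```python
# Checkpoints survive across executions: every execution starts at the top
# of the handler, but completed steps return their stored result instead of
# re-running, so side effects happen exactly once per step.
checkpoints: dict[str, object] = {}  # stand-in for the service's durable store
executed: list[str] = []             # records which steps actually ran

def step(name, fn):
    if name in checkpoints:
        return checkpoints[name]     # replay: skip the work, reuse the result
    result = fn()
    executed.append(name)
    checkpoints[name] = result       # checkpoint before moving on
    return result

def workflow() -> int:
    total = step("reserve_stock", lambda: 10)
    return step("charge_card", lambda: total + 5)

first = workflow()   # both steps run
replay = workflow()  # the code runs again from the top, zero re-executed steps
```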
> ## Powertools integration
>
> Powertools for AWS Lambda (Python) works seamlessly with Durable Functions. The [Durable Execution SDK](https://github.com/aws/aws-durable-execution-sdk-python){target="_blank" rel="nofollow"} has native integration with Powertools Logger via `context.set_logger()`.
"with Logger" - either remove "Powertools" or use the full name "Powertools for AWS Logger".
Do a find & replace, this is happening multiple times throughout the page
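For readers puzzled by what the integration buys, a toy stand-in for a replay-aware context logger could help. This is NOT the SDK's implementation — only an illustration of the deduplication idea behind `context.logger`:

```python
import logging

class ReplayAwareLogger:
    """Toy stand-in for a context-owned logger: it remembers what was
    already emitted so a replayed execution doesn't log the same line twice."""

    def __init__(self, logger: logging.Logger):
        self._logger = logger
        self._seen: set[tuple[str, str]] = set()
        self.emitted: list[str] = []  # kept for inspection in this sketch

    def info(self, step: str, message: str) -> None:
        key = (step, message)
        if key in self._seen:
            return                    # replay: line was already logged
        self._seen.add(key)
        self.emitted.append(message)
        self._logger.info(message)

log = ReplayAwareLogger(logging.getLogger("durable"))
log.info("charge_card", "payment accepted")  # first execution: emitted
log.info("charge_card", "payment accepted")  # replay: suppressed
```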
> ### Parser
>
> Parser works with Durable Functions for validating and parsing event payloads.
>
> ```python hl_lines="9 14" title="Using Parser with Durable Functions"
> --8<-- "examples/lambda_features/durable_functions/src/using_parser.py"
> ```
As I mentioned in the code snippet file, unsure if this is needed since it works exactly the same and it's stateless.
> ## Best practices
>
> ### Use context.logger for log deduplication
>
> Always use `context.set_logger()` and `context.logger` instead of using the Powertools Logger directly. This ensures logs are deduplicated during replays.
>
> ```python title="Recommended logging pattern"
> --8<-- "examples/lambda_features/durable_functions/src/best_practice_logging.py"
> ```
>
> ### Emit metrics at workflow completion
>
> To avoid counting replays as new executions, emit metrics only when the workflow completes successfully.
>
> ```python title="Metrics at completion"
> --8<-- "examples/lambda_features/durable_functions/src/best_practice_metrics.py"
> ```
>
> ### Use Idempotency for ESM triggers
>
> When your durable function is triggered by Event Source Mappings (SQS, Kinesis, DynamoDB Streams), use the `@idempotent` decorator to protect against duplicate invocations.
>
> ```python title="Idempotency for ESM"
> --8<-- "examples/lambda_features/durable_functions/src/best_practice_idempotency.py"
> ```
This is a rehashing of what's written above as well as what's in several FAQs below - I'd remove it imo
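If the section is kept, the ESM point could be demonstrated without DynamoDB — a toy version of the idempotency pattern (the real utility uses `@idempotent` with a `DynamoDBPersistenceLayer`; the decorator name here is hypothetical):

```python
import functools
import hashlib
import json

_results: dict[str, object] = {}  # stand-in for a DynamoDB persistence layer

def idempotent_sketch(fn):
    """Toy idempotency: the first call for a given payload runs the
    function; duplicate deliveries (e.g. SQS redrives) return the
    stored result without repeating side effects."""
    @functools.wraps(fn)
    def wrapper(event: dict):
        key = hashlib.sha256(
            json.dumps(event, sort_keys=True).encode()
        ).hexdigest()
        if key in _results:
            return _results[key]  # duplicate delivery: reuse stored result
        result = fn(event)
        _results[key] = result
        return result
    return wrapper

side_effects: list[str] = []

@idempotent_sketch
def process(event: dict) -> str:
    side_effects.append(event["id"])  # must happen once per unique event
    return f"processed-{event['id']}"
```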
dreamorosi left a comment:
Left a few comments, great work so far, thanks for leading the way
Issue number: closes #7915
Summary
This PR adds documentation for two new Lambda features: Lambda Managed Instances and Durable Functions.
Changes
I created a new `lambda-features` section in the docs with two pages:

- Lambda Managed Instances
- Durable Functions

The idea here is to create those pages as integration guides, not feature documentation. They follow a different structure than core utilities because:
1/ The focus is "how Powertools works with X" rather than "how to use Powertools feature Y"
2/ Current customers that use Powertools should find guidance on Powertools compatibility with those new features
3/ The content is more about considerations and gotchas than step-by-step tutorials
User experience
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.