
Conversation

@leandrodamascena
Contributor

Issue number: closes #7915

Summary

This PR adds documentation for two new Lambda features: Lambda Managed Instances and Durable Functions.

Changes

I created a new lambda-features section in the docs with two pages:

Lambda Managed Instances

  • Explains the multi-process concurrency model used by the Python runtime
  • Shows how each Powertools utility works (Logger, Tracer, Metrics, Parameters, Idempotency, Batch)
  • FAQ addressing common questions about cache behavior, thread safety, etc.

Durable Functions

  • Documents the native integration between Powertools Logger and the Durable Execution SDK via context.set_logger()
  • Explains log deduplication during replays
  • Shows how to use Tracer, Metrics, Idempotency, Parser, and Parameters
  • Clarifies when to use Powertools Idempotency vs built-in step idempotency (ESM triggers, methods you don't want as steps)

The idea here is to create those pages as integration guides, not feature documentation. They follow a different structure than core utilities because:

1/ The focus is "how Powertools works with X" rather than "how to use Powertools feature Y"
2/ Customers already using Powertools should find guidance on its compatibility with these new features
3/ The content is more about considerations and gotchas than step-by-step tutorials

User experience

Please share what the user experience looks like before and after this change


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@leandrodamascena leandrodamascena requested a review from a team as a code owner January 7, 2026 11:19
@pull-request-size pull-request-size bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jan 7, 2026
@boring-cyborg boring-cyborg bot added the documentation Improvements or additions to documentation label Jan 7, 2026
@sonarqubecloud

sonarqubecloud bot commented Jan 7, 2026

Quality Gate failed

Failed conditions
2 Security Hotspots
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud


@codecov

codecov bot commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.72%. Comparing base (139107a) to head (d5fe7c3).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #7917   +/-   ##
========================================
  Coverage    96.72%   96.72%           
========================================
  Files          275      275           
  Lines        13214    13214           
  Branches      1006     1006           
========================================
  Hits         12781    12781           
  Misses         325      325           
  Partials       108      108           



### Logger

The Durable Execution SDK provides a `context.logger` that automatically handles **log deduplication during replays**. You can integrate Powertools Logger to get structured JSON logging while keeping the deduplication benefits.
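The deduplication idea can be sketched in plain Python. This is a hypothetical illustration of the concept only, not the Durable Execution SDK's actual mechanism: a wrapper remembers which (step, message) pairs it already emitted and drops repeats during a replay.

```python
import logging

class DedupLogger:
    """Hypothetical sketch of replay-aware log deduplication.

    Not the Durable Execution SDK's real implementation: it simply
    remembers which (step, message) pairs were already emitted and
    silently drops repeats on replay.
    """

    def __init__(self, logger):
        self._logger = logger
        self._seen = set()

    def info(self, step, message):
        """Log once per (step, message); return True if actually emitted."""
        key = (step, message)
        if key in self._seen:
            return False  # replay: this line was already logged earlier
        self._seen.add(key)
        self._logger.info("%s: %s", step, message)
        return True
```

Swapping in a structured JSON logger (like Powertools Logger) for the plain `logging.Logger` keeps the same dedup behavior while changing the output format.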

Feels like there should be another word after context.logger, is it a class, an instance, or a function?

Tracer works with Durable Functions. Each execution creates trace segments.

???+ note "Trace continuity"
Due to the replay mechanism, traces may not show a continuous flow. Each execution (including replays) creates separate trace segments. Use the `execution_arn` to correlate traces.
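The correlation idea can be illustrated with a toy grouping. The segment shapes and key names below are made up for illustration, not X-Ray's or the Durable Execution SDK's actual data model:

```python
from collections import defaultdict

# Illustrative only: group per-execution trace segments by a shared
# execution_arn attribute so replays can be viewed alongside the
# original run. Values are fabricated.
segments = [
    {"execution_arn": "exec-1", "segment": "initial-run"},
    {"execution_arn": "exec-1", "segment": "replay-1"},
    {"execution_arn": "exec-2", "segment": "initial-run"},
]

by_execution = defaultdict(list)
for seg in segments:
    by_execution[seg["execution_arn"]].append(seg["segment"])
```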

Suggested change
Due to the replay mechanism, traces may not show a continuous flow. Each execution (including replays) creates separate trace segments. Use the `execution_arn` to correlate traces.
Due to the replay mechanism, traces may not show contiguously. Each execution (including replays) creates separate trace segments. Use the `execution_arn` to correlate traces.


Let's try to use a simpler term or expression, as a non-native speaker I'd struggle to understand this term in a tech doc.


How about due to the replay mechanism, traces may be interleaved?


### Parameters

Parameters utility works correctly, but be aware that **cache is per-process**.

Suggested change
Parameters utility works correctly, but be aware that **cache is per-process**.
Parameters utility works as expected, but be aware that **caching is per-process**.


1. **VPC Endpoints** - Private connectivity without internet access
2. **NAT Gateway** - Internet access from private subnets
3. **Public subnet with Internet Gateway** - Direct internet access

Another option is an egress only IPv6 internet gateway: https://docs.aws.amazon.com/vpc/latest/userguide/egress-only-internet-gateway.html

```

???+ note "Other utilities"
All other Powertools for AWS utilities (Feature Flags, Validation, Parser, Data Masking, etc.) work without any changes. If you encounter any issues, please [open an issue](https://github.com/aws-powertools/powertools-lambda-python/issues/new?template=bug_report.yml){target="_blank"}.

All other Powertools for AWS Lambda (Python) utilities ...


We've already said this at L46:

Powertools for AWS Lambda (Python) works seamlessly with Lambda Managed Instances. All utilities are compatible with the multi-process concurrency model used by Python.

I would remove this "Other utilities" section and just leave the "if you find any issues..." callout.


## How Lambda Python runtime handles concurrency

Unlike Java or Node.js which use threads, the **Lambda Python runtime uses multiple processes** for concurrent requests. Each request runs in a separate process, which provides natural isolation between requests.
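The isolation claim can be demonstrated with the standard library. A minimal sketch, assuming a POSIX system where the `fork` start method is available: module-level state mutated in a child process never leaks back into the parent.

```python
import multiprocessing as mp

state = {"count": 0}  # module-level state: each process gets its own copy

def handle_request(queue):
    state["count"] += 100      # mutates only this process's copy
    queue.put(state["count"])

ctx = mp.get_context("fork")   # assumes POSIX; Windows has no fork
queue = ctx.Queue()
worker = ctx.Process(target=handle_request, args=(queue,))
worker.start()
child_value = queue.get()
worker.join()
# state["count"] in the parent is still 0: memory is per-process
```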

I think here I'd rather focus on what Python managed runtime does rather than draw a parallel with other languages, if I'm here I'm interested in Python specifically.


## Key differences from Lambda (default)

| Aspect | Lambda (default) | Lambda Managed Instances |

Let's use Lambda On Demand as default

| **Concurrency** | Single invocation per execution environment | Multiple concurrent invocations per environment |
| **Python model** | One process, one request | Multiple processes, one request each |
| **Pricing** | Per-request duration | EC2-based with Savings Plans support |
| **Scaling** | Scale on demand with cold starts | Async scaling based on CPU, no cold starts |

nit: you can still have cold starts if your request volume exceeds the capacity provided in LMI, no?


This means:

- **Memory is not shared** between concurrent requests

Let's avoid the negation here, if possible at all

Comment on lines +30 to +42
## Isolation model

Lambda Managed Instances use a different isolation model than Lambda (default):

| Layer | Lambda (default) | Lambda Managed Instances |
| ---------------------- | ---------------------------------------- | ------------------------------------------ |
| **Instance level** | Firecracker microVMs on shared AWS fleet | Containers on EC2 Nitro in your account |
| **Security boundary** | Execution environment | Capacity provider |
| **Function isolation** | Strong isolation via microVMs | Container-based isolation within instances |

**Capacity providers** serve as the security boundary. Functions within the same capacity provider share the underlying EC2 instances. For workloads requiring strong isolation between functions, use separate capacity providers.

For Python specifically, the multi-process model adds another layer of isolation - each concurrent request runs in its own process with separate memory space.

I am unsure about this section, should we link to the LMI docs with a "Go here to learn about the isolation model of LMI" (or similar) instead?

Previous sections have an immediate impact on the programming model, this is more indirect and a characteristic of LMI that might not belong in the docs of a toolkit like ours.

Comment on lines +86 to +87
???+ tip "Cache behavior"
Since each process has its own cache, you might see more calls to SSM/Secrets Manager during initial warm-up. Once each process has cached the value, subsequent requests within that process use the cache.

I'm aware this is not a problem we've solved neither here nor in On Demand, but reading this I can't help but think about cache invalidation for parameters, which is now even more apparent with LMI.

I think a sentence about "you can customize the caching behavior with ..." would help here.

Comment on lines +48 to +54
### Logger

Logger works without any changes. Each process has its own logger instance.

```python hl_lines="4 7" title="Using Logger with Managed Instances"
--8<-- "examples/lambda_features/managed_instances/src/using_logger.py"
```

I find these <Utility> works without any changes sections quite repetitive. Does it make sense to merge at least the core utilities in a single code snippet?

We're already saying above that PT works seamlessly with LMI. We can keep those other sections that require special clarifications (if any).

Comment on lines +108 to +124
## Working with shared resources

### The `/tmp` directory

The `/tmp` directory is **shared across all processes** in the execution environment. Use caution when writing files.

```python title="Safe file handling with unique names"
--8<-- "examples/lambda_features/managed_instances/src/tmp_file_handling.py"
```
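The collision-avoidance idea can be sketched as follows. The `tmp_path` helper is made up for illustration (`tempfile.mkstemp` is the robust stdlib way to do this); embedding the PID guarantees concurrent processes never pick the same name.

```python
import os
import tempfile

def tmp_path(prefix):
    # Embed the PID so concurrent processes never choose the same name.
    return os.path.join(tempfile.gettempdir(), f"{prefix}-{os.getpid()}.tmp")

path = tmp_path("scratch")
with open(path, "w") as f:
    f.write("per-process scratch data")
```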

### Database connections

Since each process is independent, connection pooling behaves differently than in threaded runtimes.

```python title="Database connections per process"
--8<-- "examples/lambda_features/managed_instances/src/database_connections.py"
```
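A minimal sketch of the lazy, one-connection-per-process pattern, using sqlite3 as a stand-in for a real database driver; the PID check re-opens the connection if the code ever runs in a freshly forked process.

```python
import os
import sqlite3

_conn = None
_conn_pid = None

def get_connection():
    """Open one connection per process, lazily, instead of sharing a pool."""
    global _conn, _conn_pid
    if _conn is None or _conn_pid != os.getpid():
        _conn = sqlite3.connect(":memory:")  # stand-in for a real database
        _conn_pid = os.getpid()
    return _conn
```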

@dreamorosi dreamorosi Jan 7, 2026


Similar to the isolation model section, unsure if this should be in our docs.

We've already mentioned above that these are differences in the programming model, this doesn't have a direct impact on Powertools since we don't use the /tmp folder but it's more of a general LMI characteristic, and we need to draw the line somewhere.

Other sections like VPC connectivity instead make a lot of sense.

Comment on lines +148 to +150
### Is my code thread-safe?

For Python, you don't need to worry about thread safety because Lambda Managed Instances uses **multiple processes**, not threads. Each request runs in its own process with isolated memory.

I see what you're going for with this Q/A, but I think the answer can be improved.

I'd leave something like:

Lambda Managed Instances uses multiple processes, instead of threads. Each request runs in its own process with isolated memory. If you implement multi-threading you're responsible for it

or something similar.


### Do I need to change my existing Powertools for AWS code?

No changes are required. Your existing code will work as-is with Lambda Managed Instances.

I'd call out that they should upgrade to at least version 3.x.x of Powertools for this statement to be true.

)

# Emit only at the end
metrics.add_metric(name="WorkflowCompleted", unit=MetricUnit.Count, value=1)
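The replay-safety reasoning behind emitting at the end can be sketched like this (names are illustrative, not the Powertools Metrics API): because the metric is appended only after the last step succeeds, an interrupted execution that later replays contributes exactly one datapoint.

```python
emitted = []  # stand-in for metrics actually flushed to CloudWatch

def run_workflow(steps, fail_at=None):
    """Run steps in order; emit the completion metric only at the very end."""
    for step in steps:
        if step == fail_at:
            return False  # interrupted: nothing emitted, so a replay is safe
    emitted.append(("WorkflowCompleted", 1))  # reached only on full success
    return True
```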

Should this be in its own step?

I know that in this case it gets run only once because it's the last operation, but having it here might lead readers to think that emitting a metric like this in the function body is always safe, which it technically is not.


It feels like this, plus the examples/lambda_features/durable_functions/src/using_logger.py and the examples/lambda_features/durable_functions/src/log_deduplication.py are repeating the same concept over and over.

Should we have a single snippet that shows a couple of logs, one using context.logger and one without, with comments that explain what happens?


Unless I'm missing something, this is a copy of examples/lambda_features/durable_functions/src/best_practice_metrics.py or close to it - can we keep only one?


@durable_execution
def handler(event: dict, context: DurableContext) -> str:
# Parameters are fetched on each execution

Is this always true?

I thought that if the replay or execution happens within a certain time frame the request could land on the same execution environment and thus hit the cache?


I don't think this adds much to our docs tbh.


There's really nothing specific to LMI here as far as I can tell - unsure we need a code snippet


There's nothing DF specific here, I would consider skipping this snippet.

Comment on lines +17 to +19
| **Steps** | Business logic with built-in retries and progress tracking |
| **Waits** | Suspend execution without incurring compute charges |


Use singular names like Step and Wait for consistency


Durable functions use a **checkpoint/replay mechanism**:

1. Your code runs from the beginning

Suggested change
1. Your code runs from the beginning
1. Your code runs always from the beginning


1. Your code runs from the beginning
2. Completed operations are skipped using stored results
3. Execution continues from where it left off

This wording is slightly off - it might be just me - but it reads as if it contradicts the first point.

I'd say that execution of new steps continues from where it left off or similar.
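The checkpoint/replay mechanism under discussion can be sketched in a few lines. This is a conceptual model only (the `step` helper is hypothetical, not the Durable Execution SDK API): the workflow always runs from the top, but any step with a stored result returns it immediately instead of re-executing, so only new steps do work.

```python
checkpoints = {}  # persisted step results (durable storage in the real SDK)
executions = []   # records which steps actually did work

def step(name, fn):
    if name in checkpoints:
        return checkpoints[name]   # replay: reuse the stored result
    result = fn()
    executions.append(name)
    checkpoints[name] = result     # checkpoint before moving on
    return result

def workflow():
    a = step("reserve", lambda: "reserved")
    b = step("charge", lambda: "charged")
    return f"{a},{b}"
```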


## Powertools integration

Powertools for AWS Lambda (Python) works seamlessly with Durable Functions. The [Durable Execution SDK](https://github.com/aws/aws-durable-execution-sdk-python){target="_blank" rel="nofollow"} has native integration with Powertools Logger via `context.set_logger()`.

with Logger - remove Powertools or you have to use Powertools for AWS Logger


Do a find & replace, this is happening multiple times throughout the page

Comment on lines +114 to +120
### Parser

Parser works with Durable Functions for validating and parsing event payloads.

```python hl_lines="9 14" title="Using Parser with Durable Functions"
--8<-- "examples/lambda_features/durable_functions/src/using_parser.py"
```

As I mentioned in the code snippet file, unsure if this is needed since it works exactly the same and it's stateless.

Comment on lines +133 to +157
## Best practices

### Use context.logger for log deduplication

Always use `context.set_logger()` and `context.logger` instead of using the Powertools Logger directly. This ensures logs are deduplicated during replays.

```python title="Recommended logging pattern"
--8<-- "examples/lambda_features/durable_functions/src/best_practice_logging.py"
```

### Emit metrics at workflow completion

To avoid counting replays as new executions, emit metrics only when the workflow completes successfully.

```python title="Metrics at completion"
--8<-- "examples/lambda_features/durable_functions/src/best_practice_metrics.py"
```

### Use Idempotency for ESM triggers

When your durable function is triggered by Event Source Mappings (SQS, Kinesis, DynamoDB Streams), use the `@idempotent` decorator to protect against duplicate invocations.

```python title="Idempotency for ESM"
--8<-- "examples/lambda_features/durable_functions/src/best_practice_idempotency.py"
```

@dreamorosi dreamorosi Jan 7, 2026


This is a rehashing of what's written above as well as what's in several FAQs below - I'd remove it imo


@dreamorosi dreamorosi left a comment


Left a few comments, great work so far, thanks for leading the way


Development

Successfully merging this pull request may close these issues.

Docs: Add documentation for Durable functions & Lambda Managed Instances

4 participants