Description
Package
langgraph-checkpoint-aws
Checked other resources
- I added a descriptive title to this issue
- I searched the LangChain documentation with the integrated search
- I used the GitHub search to find a similar issue and didn't find it
- I am sure this is a feature request and not a bug report or question
Feature Description
Add a configuration option to AgentCoreMemorySaver that allows checkpointing only at the end of workflow execution, instead of after every graph node (super-step).
Currently, LangGraph checkpoints after every node, which results in a large number of API calls to AgentCore Memory.
For a typical 6-node conversational workflow with tool calling, this generates 62 API calls (49 createEvent + 13 listEvents), adding ~8.7 seconds of latency per user request.
Use Case
We're building a real-time conversational agent using LangGraph with AgentCoreMemorySaver for session persistence. Our workflow:
User Message → LLM → Tool Call → LLM → Tool Call → LLM → Response
Current behavior: checkpoint saved after each of the 6 nodes = 62 API calls = ~8.7s overhead
Desired behavior: checkpoint saved once at the end = 2 API calls = ~300ms overhead
For our use case:
- We don't need mid-workflow fault tolerance or recovery
- We only need the final conversation state persisted for session continuity
- Response latency is critical for user experience
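To make the difference concrete, here is a toy model (plain Python, no LangGraph) contrasting the current per-super-step checkpointing with the proposed end-of-workflow mode. The node names mirror the workflow above; the counting logic is purely illustrative, not the library's actual scheduling.

```python
# Toy model: how many checkpoint saves each mode produces for the
# 6-node workflow described above. Illustrative only.

NODES = ["llm_1", "tool_1", "llm_2", "tool_2", "llm_3", "respond"]

def checkpoint_count(nodes, mode):
    saves = 0
    for _node in nodes:
        # ... node executes ...
        if mode == "per_node":
            saves += 1        # current default: one save per super-step
    if mode == "end_of_workflow":
        saves += 1            # proposed: a single save when the graph finishes
    return saves
```

Since each per-node save costs at least one createEvent round-trip (plus listEvents reads), collapsing six saves into one removes most of the network overhead.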
Proposed Implementation (optional)
Add a checkpoint_mode parameter to AgentCoreMemorySaver:
```python
from langgraph_checkpoint_aws import AgentCoreMemorySaver

# Current behavior (default)
checkpointer = AgentCoreMemorySaver(
    MEMORY_ID,
    region_name="us-east-1",
)

# New: checkpoint only at end of workflow
checkpointer = AgentCoreMemorySaver(
    MEMORY_ID,
    region_name="us-east-1",
    checkpoint_mode="end_of_workflow",  # or "deferred" / "batch"
)
```
Implementation options:
1. Buffer writes internally - Override put() and put_writes() to buffer checkpoint data, then flush only when the graph execution completes
2. Expose LangGraph's checkpoint hooks - If LangGraph supports conditional checkpointing, expose that configuration
3. Add a flush() method - Let users manually control when checkpoints are written
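A minimal sketch of option 1, assuming a wrapper around the real saver: `BufferedSaver` and `flush()` are hypothetical names (not part of the langgraph-checkpoint-aws API), and the `put()`/`put_writes()` signatures follow LangGraph's `BaseCheckpointSaver` interface.

```python
class BufferedSaver:
    """Wraps an underlying checkpointer, deferring writes until flush().

    Hypothetical sketch; not part of langgraph-checkpoint-aws.
    """

    def __init__(self, inner):
        self.inner = inner     # the real saver, e.g. AgentCoreMemorySaver
        self._buffer = []      # pending (method, args) tuples

    def put(self, config, checkpoint, metadata, new_versions):
        # Record the checkpoint instead of calling the backend immediately.
        self._buffer.append(("put", (config, checkpoint, metadata, new_versions)))
        return config

    def put_writes(self, config, writes, task_id):
        # Pending per-task writes are buffered the same way.
        self._buffer.append(("put_writes", (config, writes, task_id)))

    def flush(self):
        # Replay only the most recent checkpoint (the final state) against
        # the real backend, discarding intermediate super-steps.
        last_put = next(
            (entry for entry in reversed(self._buffer) if entry[0] == "put"),
            None,
        )
        if last_put is not None:
            self.inner.put(*last_put[1])
        self._buffer.clear()
```

At the end of a run the caller invokes `flush()`, so only the final state ever reaches AgentCore Memory; intermediate super-step checkpoints are dropped, which is exactly the trade-off (no mid-workflow recovery) described above.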
Additional Context
Performance impact:
- Current (every node): 62 API calls, ~8.7s latency overhead
- End-of-workflow: 2 API calls, ~300ms latency overhead
Breakdown of current API calls:
- createEvent: 49 calls × ~84ms average = 4.1s total
- listEvents: 13 calls × ~350ms average = 4.6s total
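As a sanity check, the two per-API totals quoted above sum to the overall ~8.7s overhead, and the implied per-call averages follow directly from total / call count:

```python
# Sanity check of the latency breakdown above.
create_event_calls, create_event_total_s = 49, 4.1
list_events_calls, list_events_total_s = 13, 4.6

overall_s = create_event_total_s + list_events_total_s                  # ~8.7 s
create_event_avg_ms = create_event_total_s / create_event_calls * 1000  # ~84 ms
list_events_avg_ms = list_events_total_s / list_events_calls * 1000     # ~354 ms
```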
Environment:
- langgraph-checkpoint-aws version: latest
- LangGraph workflow: 6 nodes (3 LLM calls, 2 tool executions, 1 conditional)
- Region: eu-west-1
Related: LangGraph checkpointing docs indicate per-node checkpointing is by design for fault tolerance, but many real-time use cases don't require mid-workflow recovery. No existing configuration found in:
- AWS AgentCore Memory docs: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/memory-integrate-lang.html
- langgraph-checkpoint-aws PyPI: https://pypi.org/project/langgraph-checkpoint-aws/