Guidance Needed for Managing Message Retention and Blob Cleanup with PostgresSaver #5166
Replies: 4 comments 4 replies
-
Hey, just wanted to say I'm running into the exact same problem on my end. Even after summarizing old messages and trimming state, the checkpoint_blobs table just keeps growing and eating up disk space. I also tried cleaning up checkpoints and checkpoint_writes, but the blobs stick around forever. Did you ever figure out a good way to handle this, or get any hints from the maintainers? Would love to hear if you found any solutions (or workarounds)! 🙏
-
Having the same issue here. Did you manage to get an implementation going? Currently running into statement timeouts and huge DB size. I read on an issue that a possible solution is to remove base64 images from state and save only an S3 reference; a rough sketch of that idea is below. I'm going to investigate this tomorrow! Would love to hear if you guys found any possible workarounds!
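In case it helps, here's a minimal sketch of that offloading idea, assuming boto3 and a bucket you control. The bucket name and the `offload_image` helper are illustrative, not anything LangGraph provides:

```python
# Sketch: upload the raw image to S3 and keep only a lightweight reference in
# graph state, so the checkpointer never serializes megabytes of base64.
import base64
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "my-graph-artifacts"  # hypothetical bucket name


def offload_image(b64_data: str) -> str:
    """Store a base64-encoded image in S3 and return an s3:// reference."""
    key = f"images/{uuid.uuid4()}.png"
    s3.put_object(Bucket=BUCKET, Key=key, Body=base64.b64decode(b64_data))
    return f"s3://{BUCKET}/{key}"
```

Inside a graph node you'd then put `offload_image(raw_b64)` into the message content instead of the base64 payload, and resolve the reference back to bytes only when a node actually needs the image.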
-
I have been dealing with the same issue and found something interesting in the PostgresSaver repo: there is an implementation of a shallow checkpointer. If you have no interest in time travel (which is what causes you to accumulate lots of checkpoints, and therefore blobs), it saves only the last checkpoint on each iteration; a usage sketch is below. Another observation: you may get the recommendation to use Redis since it has TTL. Unfortunately, in Postgres that is not possible without a cron job, so I've sketched a hedged attempt below as well; if anyone has a better snippet they'd like to share, that would be nice. For more info on the shallow checkpointer, go to https://github.com/langchain-ai/langgraph/blob/main/libs/checkpoint-postgres/langgraph/checkpoint/postgres/shallow.py#L526 and read the warning, which says what argument to pass to invoke to achieve this.
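A minimal sketch of the shallow saver, assuming the import path matches the linked module and that `builder` is your graph builder:

```python
# Sketch: ShallowPostgresSaver keeps only the most recent checkpoint per
# thread, so checkpoint_blobs stops accumulating history (no time travel).
from langgraph.checkpoint.postgres import ShallowPostgresSaver

DB_URI = "postgresql://user:pass@localhost:5432/mydb"  # placeholder

with ShallowPostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # create/migrate the checkpoint tables once
    graph = builder.compile(checkpointer=checkpointer)
    graph.invoke(
        {"messages": [("user", "hi")]},
        config={"configurable": {"thread_id": "thread-1"}},
    )
```

And a hedged attempt at a cron-driven TTL. The table and column names are assumptions based on the default checkpoint-postgres migrations, and the newest-first ordering of `checkpoint_id` mirrors what the library's own `list()` query does; verify both against your installed version before running this on real data:

```python
# Prune all but the newest KEEP_LAST checkpoints per thread, then drop the
# checkpoint_writes rows that no longer belong to a surviving checkpoint.
# (checkpoint_blobs needs a similar unreferenced-rows pass.)
import psycopg

DB_URI = "postgresql://user:pass@localhost:5432/mydb"  # placeholder
KEEP_LAST = 5

PRUNE_CHECKPOINTS_SQL = """
WITH ranked AS (
    SELECT thread_id, checkpoint_ns, checkpoint_id,
           ROW_NUMBER() OVER (
               PARTITION BY thread_id, checkpoint_ns
               ORDER BY checkpoint_id DESC
           ) AS rn
    FROM checkpoints
)
DELETE FROM checkpoints c
USING ranked r
WHERE c.thread_id = r.thread_id
  AND c.checkpoint_ns = r.checkpoint_ns
  AND c.checkpoint_id = r.checkpoint_id
  AND r.rn > %s;
"""

PRUNE_WRITES_SQL = """
DELETE FROM checkpoint_writes w
WHERE NOT EXISTS (
    SELECT 1 FROM checkpoints c
    WHERE c.thread_id = w.thread_id
      AND c.checkpoint_ns = w.checkpoint_ns
      AND c.checkpoint_id = w.checkpoint_id
);
"""

with psycopg.connect(DB_URI) as conn:
    conn.execute(PRUNE_CHECKPOINTS_SQL, (KEEP_LAST,))
    conn.execute(PRUNE_WRITES_SQL)
    conn.commit()
```

Schedule it with an ordinary crontab entry (e.g. nightly), or with pg_cron if you prefer keeping the job inside the database.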
-
Tried AsyncShallowPostgresSaver for a few days and it made absolutely no difference compared to the AsyncPostgresSaver. I found what I believe is the answer to this issue in the LangGraph documentation. I'm going to run some tests this week and see how it behaves. I will be running the graph with the LangGraph SDK.
Let's cross our fingers this fixes the issue.
-
Issue Description
Hi team,
I'm using `PostgresSaver` as the checkpointer in my LangGraph setup and have implemented a summarization mechanism that trims older messages from the state (`state["messages"]`). Functionally, this works well: after summarization, I remove earlier messages and keep only the latest N messages in memory for downstream use.
My next step was to physically delete the older messages from the `checkpoints` and `checkpoint_writes` tables (e.g., messages 1 to 15 if my retention limit is 10). However, I noticed that despite these deletions, the state still reflects all messages, as if nothing was removed.
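For context, the trimming itself follows the usual `RemoveMessage` pattern. A simplified sketch of my summarization node (names are illustrative):

```python
# Keep the latest N messages in state; emit RemoveMessage for everything
# older so the add_messages reducer drops those ids from state["messages"].
from langchain_core.messages import RemoveMessage

KEEP_LAST_N = 10  # the retention limit referenced below


def trim_messages_node(state: dict) -> dict:
    messages = state["messages"]
    if len(messages) <= KEEP_LAST_N:
        return {}
    return {"messages": [RemoveMessage(id=m.id) for m in messages[:-KEEP_LAST_N]]}
```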
Findings
Upon deeper investigation, I found that:
- Message data is actually stored in the `checkpoint_blobs` table.
- The `checkpoints` table is primarily used to track the latest version for a channel via the `channel_versions` mapping.
- Even if messages are deleted from `checkpoints.metadata.writes`, the corresponding blobs persist indefinitely in `checkpoint_blobs`.
As a result, database size keeps growing over time, and there's currently no built-in way to clean up unreferenced or orphaned blobs.
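For illustration, this is the kind of garbage collection I have been sketching: delete every `checkpoint_blobs` row whose (channel, version) pair is no longer referenced by any checkpoint's `channel_versions` map. The JSONB path and column names are assumptions based on the default checkpoint-postgres schema, so please verify them against the migrations in your installed version:

```python
# Sketch: remove checkpoint_blobs rows not referenced by any checkpoint.
# Assumes checkpoints.checkpoint is JSONB containing a channel_versions map
# and checkpoint_blobs is keyed by (thread_id, checkpoint_ns, channel, version).
import psycopg

DB_URI = "postgresql://user:pass@localhost:5432/mydb"  # placeholder

ORPHAN_BLOBS_SQL = """
DELETE FROM checkpoint_blobs b
WHERE NOT EXISTS (
    SELECT 1
    FROM checkpoints c
    WHERE c.thread_id = b.thread_id
      AND c.checkpoint_ns = b.checkpoint_ns
      AND c.checkpoint -> 'channel_versions' ->> b.channel = b.version
);
"""

with psycopg.connect(DB_URI) as conn:
    conn.execute(ORPHAN_BLOBS_SQL)
    conn.commit()
```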
My Use Case and Attempts So Far
- Summarizing the conversation and trimming `state["messages"]` down to the latest N messages (this part works).
- Manually deleting entries from the `checkpoints` and `checkpoint_writes` tables based on message IDs (this does not free the blobs).
My Questions
1. Is there a recommended way to configure `PostgresSaver` to persist only the latest N messages, similar to how TTL works in NoSQL systems?
2. Would modifying `PostgresSaver`, or extending the saver to implement a GC (garbage collection) strategy, be safe or advisable?
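As an aside, for threads that can simply expire wholesale, recent versions of the checkpoint savers appear to expose a `delete_thread` method that removes a thread's rows across the checkpoint tables. I have not verified which release introduced it, so check your installed version before relying on it:

```python
# Hedged sketch: drop everything stored for one thread. Assumes your
# installed langgraph-checkpoint-postgres version ships delete_thread.
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:pass@localhost:5432/mydb"  # placeholder

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.delete_thread("thread-1")  # hypothetical thread id
```

That still doesn't cover my case, though, which is pruning within a live thread.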
What I'm Hoping For
- A supported way to clean up `checkpoints`, `checkpoint_writes`, and `checkpoint_blobs` so that retention limits actually reclaim disk space.
This seems like a common use case (summarization + state pruning), and I imagine others may have encountered similar scalability concerns when persisting state in Postgres.