Guidance Needed for Managing Message Retention and Blob Cleanup with PostgresSaver #5166

NikhilSuthar · 2025-06-23T09:04:46Z

NikhilSuthar
Jun 23, 2025

Issue Description

Hi team,

I'm using PostgresSaver as the checkpointer in my LangGraph setup and have implemented a summarization mechanism that trims older messages from the state (state["messages"]). Functionally, this works well — after summarization, I remove earlier messages and keep only the latest N messages in memory for downstream use.

My next step was to physically delete the older messages from the checkpoints and checkpoint_writes tables (e.g., messages 1 to 15 if my retention limit is 10). However, I noticed that despite these deletions, the state still reflects all messages — as if nothing was removed.

Findings

Upon deeper investigation, I found that:

LangGraph stores all serialized messages in the checkpoint_blobs table.
The checkpoints table is primarily used to track the latest version for a channel via the channel_versions mapping.
Even if messages are no longer referenced in checkpoints.metadata.writes, the corresponding blobs persist indefinitely in checkpoint_blobs.

As a result, database size keeps growing over time, and there's currently no built-in way to clean up unreferenced or orphaned blobs.

My Use Case and Attempts So Far

Implemented a custom reducer that summarizes older messages and retains only the most recent 10 messages.
After summarization, I remove the summarized messages from the state and do not pass them to the LLM.
I also manually deleted entries from the checkpoints and checkpoint_writes tables based on message IDs.
I considered automating cleanup via a scheduled job (cron), but realized that blobs remain untouched, so it doesn't solve the underlying issue.

My Questions

Is this blob retention behavior by design? If so, why store all serialized message history in blobs even when only a subset is retained in state?
Is there a supported or recommended way to delete unused blobs, e.g., those not referenced in any active checkpoint?
Could we have a way to configure PostgresSaver to persist only the latest N messages, similar to how TTL works in NoSQL systems?
Would subclassing PostgresSaver or extending the saver to implement a GC (garbage collection) strategy be safe or advisable?

What I'm Hoping For

An official recommendation or built-in support for pruning unused messages and blobs.
Or guidance on how to safely implement deletion logic that cleans up all three related tables: checkpoints, checkpoint_writes, and checkpoint_blobs.

This seems like a common use case (summarization + state pruning), and I imagine others may have encountered similar scalability concerns when persisting state in Postgres.

KoreanThinker · 2025-07-16T04:52:43Z

KoreanThinker
Jul 16, 2025

Hey, just wanted to say I’m running into the exact same problem on my end. Even after summarizing old messages and trimming state, the checkpoint_blobs table just keeps growing and eating up disk space like crazy. I also tried cleaning up checkpoints and checkpoint_writes, but those blobs stick around forever.

Did you ever figure out a good way to handle this? Or maybe get any hints from the maintainers?
I feel like I must be missing something obvious, but right now the only thing I can think of is hacking together some custom SQL for blob cleanup—which doesn’t feel super safe.

Would love to hear if you found any solutions (or workarounds)! 🙏

0 replies

tomascenteno42 · 2025-08-21T01:05:35Z

tomascenteno42
Aug 21, 2025

Having the same issue here. Did you manage to get some implementation going? Currently running into statement timeouts and hugh DB size. I read on an issue that a possible solution is to remove base 64 images from state and save only an S3 reference. I'm going to investigate this tomorrow!

Would love to hear if you guys found any possible workarounds!

0 replies

FiboDev · 2025-08-26T18:48:29Z

FiboDev
Aug 26, 2025

I have been dealing with the same issue and found something interesting in the repo of PostgresSaver where they have implementation for a shallow checkpointer which basically if you have no interest in travel time, which causes having a lots of checkpoints therefore blobs, you can save the last checkpoint in each iteration. Another observation is that you may get recommended to use Redis as it has ttl, unfortunately in postgres it is not possible without a cron job, if anyone got a snippet they would love to share that would be nice.

For more info of the shallow checkpointer go to https://github.com/langchain-ai/langgraph/blob/main/libs/checkpoint-postgres/langgraph/checkpoint/postgres/shallow.py#L526, read the warn where it says what argument to pass to invoke to achieve this.

1 reply

satyaprakash1729 Aug 30, 2025

@FiboDev Thanks for the pointer, but as per one of the PR discussion looks like it's going to be completely removed from 3.0.0 and currently it's in deprecation state. - #4813

Also check this -

langgraph/libs/checkpoint-postgres/langgraph/checkpoint/postgres/shallow.py

Line 549 in 120ae38

    
           "AsyncShallowPostgresSaver is deprecated as of version 2.0.20 and will be removed in 3.0.0. "

tomascenteno42 · 2025-08-31T00:08:52Z

tomascenteno42
Aug 31, 2025

Tried ASyncShallowPostgresSaver for a few days and it made absolutely no difference compared to the AsyncPostgresSaver. I found what i believe is the answer to this issue on this documentation from langgraph. I'm going to run some tests this week and see how it behaves.

I will be running the graph with the langgraph sdk with durability: exit that esentially will only save the latest checkpoint on storage.

import { Client } from '@langchain/langgraph-sdk';

  const client = new Client({
    apiUrl: getRequiredEnv('LANGGRAPH_ENDPOINT'),
    apiKey: getRequiredEnv('LANGSMITH_API_KEY')
  });
  
client.runs.wait(langsmithThreadId, assistantId, {
      multitaskStrategy: 'enqueue',
      durability: 'exit'
     })

Let's cross fingers this fixes the issue.

3 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Guidance Needed for Managing Message Retention and Blob Cleanup with PostgresSaver #5166

Uh oh!

{{title}}

Uh oh!

Replies: 4 comments 4 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Guidance Needed for Managing Message Retention and Blob Cleanup with PostgresSaver #5166

Uh oh!

Issue Description

Findings

My Use Case and Attempts So Far

My Questions

What I'm Hoping For

Replies: 4 comments · 4 replies

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Replies: 4 comments 4 replies