@kfsone I agree with your assessment that the basic conversation pattern is not optimal, and we have used more customized model context management when building special agents like WebSurfer and FileSurfer. There are several built-in knobs you can tune for context management, and you can also directly code your custom agent with domain-specific context management logic.
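For example (a rough sketch; the exact module paths and parameter names depend on which AutoGen version you're on), a buffered model context caps how many recent messages an `AssistantAgent` replays to the model on each call:

```python
from autogen_agentchat.agents import AssistantAgent
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Keep only the last few messages in the context sent to the model,
# rather than replaying the whole conversation on every call.
model_client = OpenAIChatCompletionClient(model="gpt-4o")
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    model_context=BufferedChatCompletionContext(buffer_size=5),
)
```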
-
I had given up on autogen a while back, but I'm circling back because nobody else seems to have avoided the major pitfall autogen falls into: treating the payloads exchanged between endpoints as "conversations".
Foremost, almost every tool tries diligently to dispatch the same content to the LLM on the other end, and from reading the code, prompts, data, papers, etc., it's quite clear that the engineers succumb during development to thinking of this as essential.
If you send a model "You are Phillip, an AI expert", what does that do? Does it alter anything persistent or stateful on the machine running the model, or on disk?
Instead of wasting ridiculous amounts of time and compute, why don't we:
1. Stop writing wordy, theatrical, performative prompts. Very few models were actually trained on a significant number of conversations where a user said: "You are X, an AI with significant Python skills." As a result, it is very difficult to be scientific about the impact this has on the inference process, and it's why we get such varied and unreliable results.
2. Unbloat the context. The current conversational approach to contexts guarantees degradation of model performance by diluting attention; we've all seen how models get stuck doing the very thing you or they said they must not do. This gets worse the more times the problem item, or the request not to do it, appears in the context, because that's how attention actually works.
A great example of Autogen shooting itself in the foot is when two models discuss a change to a large section of text - a large function, a chapter, etc. When the second model replies, there are now two copies of that large text.
This is a total waste of tokens and processing power, and it's incompatible with how inference actually works.
Autogen ought to be peeling back/reducing the context to stay on track. For instance, the reviewer agent might take the incoming "here's my solution" context and probe the coder agent with something like:
"{base context} Could we change line 325 to 'print(hello, world)'? Would that break anything"
and collect the feedback. Instead of appending that feedback to the base context, the reviewer agent would adapt its next ask to the coder agent: "{base context} and assume we had changed line 325 to ... Now what would happen if we changed blue to pink?"
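Concretely, here's a rough, library-agnostic sketch of what I mean; `llm_complete` is a hypothetical helper standing in for whatever client you actually use to send a message list to a model:

```python
# Sketch of "rewrite the ask instead of appending the transcript".

def llm_complete(messages: list[dict]) -> str:
    """Hypothetical: send `messages` to a model, return the reply text."""
    raise NotImplementedError

LARGE_FUNCTION = "...hundreds of lines of code under review, included once..."

BASE_CONTEXT = [
    {"role": "system", "content": "You are reviewing the code below for defects."},
    {"role": "user", "content": LARGE_FUNCTION},
]

def probe(question: str) -> str:
    # Every probe reuses the same fixed base context plus one self-contained
    # question; earlier probes and replies are never replayed.
    return llm_complete(BASE_CONTEXT + [{"role": "user", "content": question}])

feedback = probe(
    "Could we change line 325 to print('hello, world')? Would that break anything?"
)

# Rather than appending `feedback` to a growing transcript, fold the accepted
# change into the next ask so the context stays roughly the same size.
followup = probe(
    "Assume line 325 has been changed to print('hello, world'). "
    "Now what would happen if we changed blue to pink?"
)
```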
There are also all kinds of cases where an agent should be able to create a temporary context to send to its own model as a way to do inline reasoning without bloating the ongoing context: "How do I enable aarch64 compatibility?" That question doesn't require the 120k tokens of context you're currently sitting at.
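Again as a rough sketch, reusing the same hypothetical `llm_complete` helper from above:

```python
# A throwaway side context: the question goes to the model with a tiny,
# purpose-built context instead of the 120k-token main thread.

def side_question(question: str) -> str:
    scratch = [
        {"role": "system", "content": "Answer concisely."},
        {"role": "user", "content": question},
    ]
    answer = llm_complete(scratch)
    # `scratch` is discarded here; only the answer (or a summary of it) gets
    # carried back into the main context, and only if it's actually needed.
    return answer

hint = side_question("How do I enable aarch64 compatibility?")
```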