This repository was archived by the owner on Jul 22, 2025. It is now read-only.

Conversation

@romanrizzi
Member

We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we cannot send the original post separately. This was important for letting the model focus on what's new in the topic.

The algorithm doesn’t give us full control over how prompts are written, and figuring out how to format the content isn't straightforward. This means we're having to use more complicated workarounds, like regex.

To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize.

Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.

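The summarize-upfront-then-extend loop described above might be sketched roughly like this; every method name and the tokens-per-post figure are illustrative stand-ins, not the plugin's actual API:

```ruby
# Hypothetical sketch: summarize as many posts as fit in the context
# window up front, then keep extending the summary with the remaining
# posts until there's nothing left to summarize.
def fold(posts, window_tokens:, tokens_per_post: 100)
  summary = nil
  remaining = posts.dup

  until remaining.empty?
    # How many posts fit in one pass, under a crude per-post token budget.
    batch_size = [window_tokens / tokens_per_post, 1].max
    batch = remaining.shift(batch_size)
    summary = summarize(summary, batch) # one LLM call per pass
  end

  summary
end

# Stand-in for the actual LLM call; just concatenates for illustration.
def summarize(previous_summary, batch)
  [previous_summary, *batch].compact.join(" | ")
end
```

With a large enough window this degenerates into a single pass, which is the common case the thread discusses below.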
@xfalcox
Member

xfalcox commented Oct 24, 2024

> Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.

Given that 99% of the time we will only do a single pass, shouldn't we only support that to simplify code?

Like

```
if "topic fits in context"
  summarize with all posts
else
  summarize using best replies mode
end
```

@romanrizzi
Member Author

> > Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.
>
> Given that 99% of the time we will only do a single pass, shouldn't we only support that to simplify code?
>
> Like
>
> ```
> if "topic fits in context"
>   summarize with all posts
> else
>   summarize using best replies mode
> end
> ```

This brings up another point: with the average context window being much larger than it was a year ago, are we being overly cautious in how we choose posts to summarize? I think the answer is probably yes, and we could start sending the entire topic to the model.

However, I don’t think we should completely drop the folding. It’s a good safety net for avoiding overwhelming the LLM with too much content. I wouldn't expect this approach to change much, especially now that we've separated the "what" and the "how" into different strategies.

@xfalcox
Member

xfalcox commented Oct 25, 2024

> This brings up another point: with the average context window being much larger than it was a year ago, are we being overly cautious in how we choose posts to summarize? I think the answer is probably yes, and we could start sending the entire topic to the model.

The LLM config gives us the exact context size, so we know whether it fits or not. We should check: if the topic size is < 80% of the context window, we should send it all.
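
A minimal sketch of that check, assuming a naive chars-per-token heuristic; `estimate_tokens`, `single_pass?`, and the 4-chars-per-token figure are all hypothetical, not the actual plugin code:

```ruby
# Send the whole topic in one pass only when its estimated token count
# fits within 80% of the model's context window; otherwise fall back
# to the best-replies selection mode.
CONTEXT_HEADROOM = 0.8

def estimate_tokens(text)
  (text.length / 4.0).ceil # rough heuristic, not a real tokenizer
end

def single_pass?(topic_text, context_window_tokens)
  estimate_tokens(topic_text) < context_window_tokens * CONTEXT_HEADROOM
end
```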

@romanrizzi
Member Author

> > This brings up another point: with the average context window being much larger than it was a year ago, are we being overly cautious in how we choose posts to summarize? I think the answer is probably yes, and we could start sending the entire topic to the model.
>
> The LLM config gives us the exact context size, so we know whether it fits or not. We should check: if the topic size is < 80% of the context window, we should send it all.

Fair enough. I'll follow up in a different PR.

@romanrizzi romanrizzi merged commit ec97996 into main Oct 25, 2024
5 checks passed
@romanrizzi romanrizzi deleted the fold_revamp branch October 25, 2024 14:51