This repository was archived by the owner on Jul 22, 2025. It is now read-only.

Commit ec97996

FIX/REFACTOR: FoldContent revamp (#866)
* FIX/REFACTOR: FoldContent revamp

  We hit a snag with our hot-topic gist strategy: the regex we used to split the content didn't work, so we couldn't send the original post separately. This was important for letting the model focus on what's new in the topic.

  The algorithm doesn't give us full control over how prompts are written, and figuring out how to format the content isn't straightforward, which forces us into complicated workarounds like regex. To tackle this, I'm suggesting we simplify the approach: summarize as much content as we can upfront, then gradually fold in new content until there's nothing left to summarize.

  Also, the "extend" step mostly matters for models with small context windows, which shouldn't be a problem 99% of the time given the content volume we're dealing with.

* Fix fold docs

* Use #shift instead of #pop to get the first element, not the last
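The "summarize upfront, then extend with the leftovers" idea can be sketched as a tiny standalone fold. This is an illustrative sketch only: `BATCH`, `summarize_batch`, and `extend_summary` are hypothetical stand-ins for the token budget and LLM calls, not the plugin's API.

```ruby
# Minimal sketch of the folding idea, assuming a fixed-size batch in place of a
# real token budget. summarize_batch/extend_summary stand in for LLM calls.
BATCH = 3

def summarize_batch(items)
  "summary of #{items.join(", ")}"
end

def extend_summary(summary, items)
  "#{summary}, extended with #{items.join(", ")}"
end

def fold(items, summary = nil)
  return summary if items.empty?

  # Take as much leftover content as fits in one pass (here: BATCH items).
  chunk = items.shift(BATCH)
  summary = summary.nil? ? summarize_batch(chunk) : extend_summary(summary, chunk)
  fold(items, summary)
end

puts fold(%w[a b c d e])
# => summary of a, b, c, extended with d, e
```

When everything fits in the first batch, the recursion terminates after a single "summarize" call, which matches the commit's claim that the extend step is rarely needed with typical content volumes.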
1 parent 12869f2 commit ec97996

File tree

12 files changed: +229, −261 lines

app/controllers/discourse_ai/summarization/chat_summary_controller.rb

Lines changed: 1 addition & 1 deletion

@@ -26,7 +26,7 @@ def show
     strategy = DiscourseAi::Summarization::Strategies::ChatMessages.new(channel, since)

     summarized_text =
-      if strategy.targets_data[:contents].empty?
+      if strategy.targets_data.empty?
         I18n.t("discourse_ai.summarization.chat.no_targets")
       else
         summarizer.summarize(current_user)&.summarized_text

lib/summarization/fold_content.rb

Lines changed: 46 additions & 95 deletions

@@ -18,43 +18,26 @@ def initialize(llm, strategy, persist_summaries: true)
     attr_reader :llm, :strategy

     # @param user { User } - User object used for auditing usage.
-    #
     # @param &on_partial_blk { Block - Optional } - The passed block will get called with the LLM partial response alongside a cancel function.
     #   Note: The block is only called with results of the final summary, not intermediate summaries.
     #
     # @returns { AiSummary } - Resulting summary.
     def summarize(user, &on_partial_blk)
-      opts = content_to_summarize.except(:contents)
-
-      initial_chunks =
-        rebalance_chunks(
-          content_to_summarize[:contents].map do |c|
-            { ids: [c[:id]], summary: format_content_item(c) }
-          end,
-        )
-
-      # Special case where we can do all the summarization in one pass.
-      result =
-        if initial_chunks.length == 1
-          {
-            summary:
-              summarize_single(initial_chunks.first[:summary], user, opts, &on_partial_blk),
-            chunks: [],
-          }
-        else
-          summarize_chunks(initial_chunks, user, opts, &on_partial_blk)
-        end
+      base_summary = ""
+      initial_pos = 0
+      folded_summary =
+        fold(content_to_summarize, base_summary, initial_pos, user, &on_partial_blk)

       clean_summary =
-        Nokogiri::HTML5.fragment(result[:summary]).css("ai")&.first&.text || result[:summary]
+        Nokogiri::HTML5.fragment(folded_summary).css("ai")&.first&.text || folded_summary

       if persist_summaries
         AiSummary.store!(
           strategy.target,
           strategy.type,
           llm_model.name,
           clean_summary,
-          content_to_summarize[:contents].map { |c| c[:id] },
+          content_to_summarize.map { |c| c[:id] },
         )
       else
         AiSummary.new(summarized_text: clean_summary)

@@ -96,90 +79,58 @@ def content_to_summarize
     end

     def latest_sha
-      @latest_sha ||= AiSummary.build_sha(content_to_summarize[:contents].map { |c| c[:id] }.join)
+      @latest_sha ||= AiSummary.build_sha(content_to_summarize.map { |c| c[:id] }.join)
     end

-    def summarize_chunks(chunks, user, opts, &on_partial_blk)
-      # Safely assume we always have more than one chunk.
-      summarized_chunks = summarize_in_chunks(chunks, user, opts)
-      total_summaries_size =
-        llm_model.tokenizer_class.size(summarized_chunks.map { |s| s[:summary].to_s }.join)
-
-      if total_summaries_size < available_tokens
-        # Chunks are small enough, we can concatenate them.
-        {
-          summary:
-            concatenate_summaries(
-              summarized_chunks.map { |s| s[:summary] },
-              user,
-              &on_partial_blk
-            ),
-          chunks: summarized_chunks,
-        }
-      else
-        # We have summarized chunks but we can't concatenate them yet. Split them into smaller summaries and summarize again.
-        rebalanced_chunks = rebalance_chunks(summarized_chunks)
+    # @param items { Array<Hash> } - Content to summarize. Structure will be: { poster: who wrote the content, id: a way to order content, text: content }
+    # @param summary { String } - Intermediate summaries that we'll keep extending as part of our "folding" algorithm.
+    # @param cursor { Integer } - Idx to know how much we already summarized.
+    # @param user { User } - User object used for auditing usage.
+    # @param &on_partial_blk { Block - Optional } - The passed block will get called with the LLM partial response alongside a cancel function.
+    #   Note: The block is only called with results of the final summary, not intermediate summaries.
+    #
+    # The summarization algorithm.
+    # The idea is to build an initial summary packing as much content as we can. Once we have the initial summary, we'll keep extending using the leftover
+    # content until there is nothing left.
+    #
+    # @returns { String } - Resulting summary.
+    def fold(items, summary, cursor, user, &on_partial_blk)
+      tokenizer = llm_model.tokenizer_class
+      tokens_left = available_tokens - tokenizer.size(summary)
+      iteration_content = []

-        summarize_chunks(rebalanced_chunks, user, opts, &on_partial_blk)
-      end
-    end
+      items.each_with_index do |item, idx|
+        next if idx < cursor

-    def format_content_item(item)
-      "(#{item[:id]} #{item[:poster]} said: #{item[:text]} "
-    end
+        as_text = "(#{item[:id]} #{item[:poster]} said: #{item[:text]} "

-    def rebalance_chunks(chunks)
-      section = { ids: [], summary: "" }
-
-      chunks =
-        chunks.reduce([]) do |sections, chunk|
-          if llm_model.tokenizer_class.can_expand_tokens?(
-               section[:summary],
-               chunk[:summary],
-               available_tokens,
-             )
-            section[:summary] += chunk[:summary]
-            section[:ids] = section[:ids].concat(chunk[:ids])
-          else
-            sections << section
-            section = chunk
-          end
-
-          sections
+        if tokenizer.below_limit?(as_text, tokens_left)
+          iteration_content << item
+          tokens_left -= tokenizer.size(as_text)
+          cursor += 1
+        else
+          break
         end
+      end

-      chunks << section if section[:summary].present?
-
-      chunks
-    end
-
-    def summarize_single(text, user, opts, &on_partial_blk)
-      prompt = strategy.summarize_single_prompt(text, opts)
-
-      llm.generate(prompt, user: user, feature_name: "summarize", &on_partial_blk)
-    end
-
-    def summarize_in_chunks(chunks, user, opts)
-      chunks.map do |chunk|
-        prompt = strategy.summarize_single_prompt(chunk[:summary], opts)
-
-        chunk[:summary] = llm.generate(
-          prompt,
-          user: user,
-          max_tokens: 300,
-          feature_name: "summarize",
+      prompt =
+        (
+          if summary.blank?
+            strategy.first_summary_prompt(iteration_content)
+          else
+            strategy.summary_extension_prompt(summary, iteration_content)
+          end
         )

-        chunk
+      if cursor == items.length
+        llm.generate(prompt, user: user, feature_name: "summarize", &on_partial_blk)
+      else
+        latest_summary =
+          llm.generate(prompt, user: user, max_tokens: 600, feature_name: "summarize")
+        fold(items, latest_summary, cursor, user, &on_partial_blk)
       end
     end

-    def concatenate_summaries(texts_to_summarize, user, &on_partial_blk)
-      prompt = strategy.concatenation_prompt(texts_to_summarize)
-
-      llm.generate(prompt, user: user, &on_partial_blk)
-    end
-
     def available_tokens
       # Reserve tokens for the response and the base prompt
       # ~500 words
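The per-iteration packing inside the new `#fold` can be demonstrated in isolation. In this sketch, character counts stand in for the real tokenizer (`below_limit?` and `size` are approximated with `String#length`), so the names and budget are illustrative, not the plugin's actual values.

```ruby
# Sketch of the packing loop in #fold: consume items in order until the token
# budget runs out, tracking a cursor so the next fold pass resumes from there.
# String length approximates tokenizer.size / tokenizer.below_limit?.
def pack(items, cursor, tokens_left)
  batch = []

  items.each_with_index do |item, idx|
    next if idx < cursor

    as_text = "(#{item[:id]} #{item[:poster]} said: #{item[:text]} "

    break if as_text.length > tokens_left # stand-in for tokenizer.below_limit?

    batch << item
    tokens_left -= as_text.length
    cursor += 1
  end

  [batch, cursor]
end

items = [
  { id: 1, poster: "alice", text: "hi" },
  { id: 2, poster: "bob", text: "hello" },
  { id: 3, poster: "carol", text: "bye" },
]

batch, cursor = pack(items, 0, 50)
# batch holds items 1 and 2; cursor == 2, so the next pass starts at item 3
```

Because `cursor` only advances past items that actually fit, a later `fold` call with the extended summary picks up exactly where this pass stopped.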

lib/summarization/strategies/base.rb

Lines changed: 7 additions & 18 deletions

@@ -11,46 +11,35 @@ def initialize(target)
       @target = target
     end

-    attr_reader :target
+    attr_reader :target, :opts

     # The summary type differentiates instances of `AiSummary` pointing to a single target.
     # See the `summary_type` enum for available options.
     def type
       raise NotImplementedError
     end

-    # @returns { Hash } - Content to summarize.
+    # @returns { Array<Hash> } - Content to summarize.
     #
-    # This method returns a hash with the content to summarize and additional information.
-    # The only mandatory key is `contents`, which must be an array of hashes with
-    # the following structure:
+    # This method returns an array of hashes with the content to summarize using the following structure:
     #
     #   {
     #     poster: A way to tell who write the content,
     #     id: A number to signal order,
     #     text: Text to summarize
     #   }
     #
-    # Additionally, you could add more context, which will be available in the prompt. e.g.:
-    #
-    #   {
-    #     resource_path: "#{Discourse.base_path}/t/-/#{target.id}",
-    #     content_title: target.title,
-    #     contents: [...]
-    #   }
-    #
     def targets_data
       raise NotImplementedError
     end

-    # @returns { DiscourseAi::Completions::Prompt } - Prompt passed to the LLM when concatenating multiple chunks.
-    def contatenation_prompt(_texts_to_summarize)
+    # @returns { DiscourseAi::Completions::Prompt } - Prompt passed to the LLM when extending an existing summary.
+    def summary_extension_prompt(_summary, _texts_to_summarize)
       raise NotImplementedError
     end

-    # @returns { DiscourseAi::Completions::Prompt } - Prompt passed to the LLM on each chunk,
-    # and when the whole content fits in one call.
-    def summarize_single_prompt(_input, _opts)
+    # @returns { DiscourseAi::Completions::Prompt } - Prompt passed to the LLM for summarizing a single chunk of content.
+    def first_summary_prompt(_input)
       raise NotImplementedError
     end
   end
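A toy subclass can illustrate the revamped strategy contract: `targets_data` now returns an array of `{ id:, poster:, text: }` hashes, and `first_summary_prompt`/`summary_extension_prompt` replace the old single-chunk/concatenation pair. This is a hypothetical sketch only: `Base` is re-stubbed locally and the prompts are plain strings rather than `DiscourseAi::Completions::Prompt` objects.

```ruby
# Stubbed version of the revamped Base contract, for illustration only.
class Base
  def targets_data
    raise NotImplementedError
  end

  def first_summary_prompt(_contents)
    raise NotImplementedError
  end

  def summary_extension_prompt(_summary, _contents)
    raise NotImplementedError
  end
end

# Hypothetical strategy over a static list of notes.
class StaticNotes < Base
  def initialize(notes)
    @notes = notes
  end

  # Each item follows the { id:, poster:, text: } structure the folder expects.
  def targets_data
    @notes.each_with_index.map { |text, i| { id: i + 1, poster: "system", text: text } }
  end

  def first_summary_prompt(contents)
    "Summarize: #{contents.map { |c| c[:text] }.join(" | ")}"
  end

  def summary_extension_prompt(summary, contents)
    "Extend \"#{summary}\" with: #{contents.map { |c| c[:text] }.join(" | ")}"
  end
end

strategy = StaticNotes.new(["note one", "note two"])
puts strategy.first_summary_prompt(strategy.targets_data)
# => Summarize: note one | note two
```

Because the array structure is uniform, `FoldContent` no longer needs per-strategy knowledge of extra context keys like the old `content_title`; anything strategy-specific lives inside the prompt methods instead.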

lib/summarization/strategies/chat_messages.rb

Lines changed: 37 additions & 15 deletions

@@ -14,38 +14,60 @@ def initialize(target, since)
     end

     def targets_data
-      content = { content_title: target.name }
-
-      content[:contents] = target
+      target
         .chat_messages
         .where("chat_messages.created_at > ?", since.hours.ago)
         .includes(:user)
         .order(created_at: :asc)
         .pluck(:id, :username_lower, :message)
         .map { { id: _1, poster: _2, text: _3 } }
-
-      content
     end

-    def contatenation_prompt(texts_to_summarize)
+    def summary_extension_prompt(summary, contents)
+      input =
+        contents
+          .map { |item| "(#{item[:id]} #{item[:poster]} said: #{item[:text]} " }
+          .join("\n")
+
       prompt = DiscourseAi::Completions::Prompt.new(<<~TEXT.strip)
-        You are a summarization bot tasked with creating a cohesive narrative by intelligently merging multiple disjointed summaries.
-        Your response should consist of well-structured paragraphs that combines these summaries into a clear and comprehensive overview.
-        Avoid adding any additional text or commentary. Format your output using Discourse forum Markdown.
+        You are a summarization bot tasked with expanding on an existing summary by incorporating new chat messages.
+        Your goal is to seamlessly integrate the additional information into the existing summary, preserving the clarity and insights of the original while reflecting any new developments, themes, or conclusions.
+        Analyze the new messages to identify key themes, participants' intentions, and any significant decisions or resolutions.
+        Update the summary to include these aspects in a way that remains concise, comprehensive, and accessible to someone with no prior context of the conversation.
+
+        ### Guidelines:
+
+        - Merge the new information naturally with the existing summary without redundancy.
+        - Only include the updated summary, WITHOUT additional commentary.
+        - Don't mention the channel title. Avoid extraneous details or subjective opinions.
+        - Maintain the original language of the text being summarized.
+        - The same user could write multiple messages in a row, don't treat them as different persons.
+        - Aim for summaries to be extended by a reasonable amount, but strive to maintain a total length of 400 words or less, unless absolutely necessary for comprehensiveness.
+
       TEXT

       prompt.push(type: :user, content: <<~TEXT.strip)
-        THESE are the summaries, each one separated by a newline, all of them inside <input></input> XML tags:
+        ### Context:
+
+        This is the existing summary:
+
+        #{summary}

-        <input>
-        #{texts_to_summarize.join("\n")}
-        </input>
+        These are the new chat messages:
+
+        #{input}
+
+        Intengrate the new messages into the existing summary.
       TEXT

       prompt
     end

-    def summarize_single_prompt(input, opts)
+    def first_summary_prompt(contents)
+      content_title = target.name
+      input =
+        contents.map { |item| "(#{item[:id]} #{item[:poster]} said: #{item[:text]} " }.join
+
       prompt = DiscourseAi::Completions::Prompt.new(<<~TEXT.strip)
         You are a summarization bot designed to generate clear and insightful paragraphs that conveys the main topics
         and developments from a series of chat messages within a user-selected time window.

@@ -62,7 +84,7 @@ def summarize_single_prompt(input, opts)
       TEXT

       prompt.push(type: :user, content: <<~TEXT.strip)
-        #{opts[:content_title].present? ? "The name of the channel is: " + opts[:content_title] + ".\n" : ""}
+        #{content_title.present? ? "The name of the channel is: " + content_title + ".\n" : ""}

         Here are the messages, inside <input></input> XML tags:

0 commit comments
