Commit 237fc4f

feat: add summarization to optimize GPT usage (#7)
* feat: add summarization of conversation according to config
* docs: add summarization logic related readme section, remove disclaimer
1 parent 8e99161 commit 237fc4f

File tree

11 files changed: +474 −94 lines

README.md

Lines changed: 43 additions & 4 deletions

````diff
@@ -1,16 +1,18 @@
 # @wisegpt/slack-bot
 
-Slack Bot for communicating with WiseGPT. Uses OpenAI GPT-3 to simulate a ChatGPT like conversation. The GPT model is in early development and not optimized, will consume more token as the conversation gets bigger.
+Slack Bot for communicating with WiseGPT. Uses OpenAI GPT-3 to simulate a ChatGPT-like conversation. With each new message, the whole conversation is sent to GPT for completion. The conversation is summarized (per configuration parameters) to keep the prompts small, even as the conversation gets long.
 
 https://user-images.githubusercontent.com/3743507/209952150-4555aee0-3f1b-4481-893a-0675a6108e3d.mp4
 
 ## Bot Features
 
-- Persona of the bot can be customized since it is using GPT-3
+- Persona of the bot can be customized
 - Only gets involved to the conversations that start with mentioning the bot E.g. `@WiseGPT hello!`
 - No need to mention bot again for further messages in the same thread
 - Keeps conversation history per thread
-- Can keep reference of multiple actors in the same conversation.
+- Can keep references to multiple actors and their details. Mentions actors properly when addressing them.
+- Conversations **are summarized** per configuration to keep token usage small
+- You can limit the maximum tokens used per conversation
 - Simple loading indicator is shown before calling OpenAI
 - Markdown and Slack Blocks are used for output
 
@@ -30,10 +32,47 @@ This project uses AWS CDK for deployment. All infrastructure is automatically pr
 6. Deploy with `AWS_PROFILE=your-profile npx projen deploy`
 7. Add output Slack Events Request URL to Slack API Events URL
 
+## Token Usage Limits
+
+The configuration of allowed maximum tokens and summarization is stored in `src/config.ts`, following the structure below:
+
+```typescript
+export default {
+  // ...
+  conversation: {
+    // ...
+    maximumSpentTokens: Number.MAX_SAFE_INTEGER,
+    // conditions for triggering summarization
+    summarization: {
+      // the minimum token sum that all human/bot messages (since the last summarization) should reach
+      // the summary size itself is not included in this count
+      minimumTokens: 500,
+      // minimum number of user messages since the last summarization
+      minimumUserMessages: 2,
+    },
+  },
+};
+```
+
+To limit how much can be spent per conversation, use `maximumSpentTokens` to set a limit that no conversation can exceed. The total is calculated from how many tokens all completion and summarization requests used. The source of truth for spent tokens is the OpenAI API itself.
+
+### Summarization Logic
+
+Every time the bot responds to the chat, the configuration is checked to decide whether a summarization should be done. If summarization is triggered, OpenAI GPT is asked to summarize the whole conversation into a paragraph.
+
+All further completion calls then use only the summary and the messages since the last summary. An example summary of a conversation: `<@U04G77EL6CW> asked <@bot> for a recursive Fibonacci function written in Typescript, with comments and explanation. <@bot> provided a code example and asked if there was anything else they could help with. <@U04G77EL6CW> then asked <@bot> to refactor the last code to be iterative, with more explanation. <@bot> provided a code example for an iterative version of the Fibonacci function.`
+
+The summarization decision is made each time a bot response is added to the conversation. The following checks decide whether to summarize:
+
+- The conversation is still ongoing: there are no errors, ongoing operations, etc.
+- The total conversation tokens are calculated from the messages sent since the last summary (both bot and user messages are counted), and `conversation.summarization.minimumTokens` is checked to make sure the minimum threshold is reached.
+- The number of non-bot messages since the last summary is calculated, and `conversation.summarization.minimumUserMessages` is checked to make sure the minimum threshold is reached.
+
+Summarization may happen multiple times. Each time, the previous summary, along with all messages since that summary, is sent. The first summarization request does not have any previous summary.
+
 ## Disclaimer
 
 1. The bot is in active early development and there maybe non-backwards compatible change. E.g. some older conversations may stop working after deploying a newer version of the bot.
-2. The token usage maybe high since there was no effort to optimize the used tokens for now. It is a planned feature.
 
 ## Thanks
````
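The summarization checks described in the README section above can be sketched roughly as follows. This is a minimal TypeScript sketch, not the repository's actual code: the `ConversationState` and `Message` shapes, the `shouldSummarize` name, and the inlined `config` object are illustrative assumptions.

```typescript
// Hypothetical shapes; the real aggregate types in the repository may differ.
interface Message {
  authorIsBot: boolean;
  tokens: number; // token count as reported by the OpenAI API
}

interface ConversationState {
  status: "ONGOING" | "ERRORED" | "BUSY" | "ENDED";
  totalTokensSpent: number;
  messagesSinceLastSummary: Message[];
}

// Mirrors the summarization block of src/config.ts shown above.
const config = {
  conversation: {
    maximumSpentTokens: Number.MAX_SAFE_INTEGER,
    summarization: { minimumTokens: 500, minimumUserMessages: 2 },
  },
};

// Returns true when a summarization should be triggered after a bot response.
function shouldSummarize(state: ConversationState): boolean {
  const { summarization } = config.conversation;

  // 1. Conversation must still be ongoing (no errors or pending operations).
  if (state.status !== "ONGOING") return false;

  // 2. Token sum of all human/bot messages since the last summary must
  //    reach the minimum threshold (the summary itself is excluded).
  const tokensSinceSummary = state.messagesSinceLastSummary.reduce(
    (sum, m) => sum + m.tokens,
    0
  );
  if (tokensSinceSummary < summarization.minimumTokens) return false;

  // 3. Enough non-bot messages must have arrived since the last summary.
  const userMessages = state.messagesSinceLastSummary.filter(
    (m) => !m.authorIsBot
  ).length;
  return userMessages >= summarization.minimumUserMessages;
}
```

All three conditions must hold at once, which matches the README's description of the decision being re-evaluated after every bot response.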

src/application/conversation/conversation-command.handler.ts

Lines changed: 15 additions & 2 deletions

```diff
@@ -6,6 +6,7 @@ import {
   ConversationCommand,
   CreateConversationCommand,
   ProcessCompletionResponseCommand,
+  ProcessSummaryResponseCommand,
 } from "../../domain/conversation/conversation.commands";
 import { ConversationAggregateDynamodbRepository } from "../../infrastructure/dynamodb/conversation-aggregate-dynamodb.repository";
 import { OpenAILambdaInvoke } from "../../infrastructure/lambdas/invoke/open-ai-lambda-invoke";
@@ -29,6 +30,8 @@ export class ConversationCommandHandler {
         return this.executeAddUserMessage(cmd);
       case "PROCESS_COMPLETION_RESPONSE_COMMAND":
         return this.executeProcessCompletionResponse(cmd);
+      case "PROCESS_SUMMARY_RESPONSE_COMMAND":
+        return this.executeProcessSummaryResponse(cmd);
       default:
         return assertUnreachable(cmd);
     }
@@ -71,8 +74,18 @@ export class ConversationCommandHandler {
   ): Promise<void> {
     return this.transaction(
       cmd.conversationId,
-      async (aggregate: ConversationAggregate) =>
-        aggregate.processCompletionResponse(cmd)
+      (aggregate: ConversationAggregate) =>
+        aggregate.processCompletionResponse(cmd, this.conversationAIService)
     );
   }
+
+  private async executeProcessSummaryResponse(
+    cmd: ProcessSummaryResponseCommand
+  ): Promise<void> {
+    return this.transaction(
+      cmd.conversationId,
+      (aggregate: ConversationAggregate) =>
+        aggregate.processSummaryResponse(cmd, this.conversationAIService)
+    );
+  }
```

src/application/open-ai/open-ai-command.handler.ts

Lines changed: 67 additions & 3 deletions

```diff
@@ -6,6 +6,7 @@ import { CommandBus, globalCommandBus } from "../../domain/bus/command-bus";
 import {
   ConversationAICommand,
   TriggerCompletionCommand,
+  TriggerSummaryCommand,
 } from "../../domain/conversation/ai/conversation-ai.commands";
 import { ConversationCommand } from "../../domain/conversation/conversation.commands";
 import { OpenAIService } from "../../infrastructure/openai/openai.service";
@@ -21,6 +22,9 @@ export class OpenAICommandHandler {
       case "TRIGGER_COMPLETION_COMMAND": {
         return this.executeTriggerCompletion(cmd);
       }
+      case "TRIGGER_SUMMARY_COMMAND": {
+        return this.executeTriggerSummary(cmd);
+      }
       default:
         throw new Error(`unknown type of command: ${JSON.stringify(cmd)}`);
     }
@@ -30,11 +34,26 @@ export class OpenAICommandHandler {
     cmd: TriggerCompletionCommand
   ): Promise<void> {
     try {
-      const { text, usage } = await this.openAIService.completion(cmd.messages);
+      // TODO: make debug logging better and count usage with proper metrics
+      console.log(
+        JSON.stringify({
+          cmd: {
+            ...cmd,
+            conversation: {
+              summarySize: cmd.conversation.summary?.length,
+              messagesCount: cmd.conversation.messages.length,
+            },
+          },
+        })
+      );
+
+      const { text, usage } = await this.openAIService.completion(
+        cmd.conversation
+      );
 
       await this.conversationCommandBus.send({
         type: "PROCESS_COMPLETION_RESPONSE_COMMAND",
-        botResponseType: "BOT_RESPONSE_SUCCESS",
+        responseType: "BOT_COMPLETION_SUCCESS",
         conversationId: cmd.conversationId,
         correlationId: cmd.correlationId,
         message: text,
@@ -46,7 +65,52 @@ export class OpenAICommandHandler {
     } catch (err: any) {
       await this.conversationCommandBus.send({
         type: "PROCESS_COMPLETION_RESPONSE_COMMAND",
-        botResponseType: "BOT_RESPONSE_ERROR",
+        responseType: "BOT_COMPLETION_ERROR",
+        conversationId: cmd.conversationId,
+        correlationId: cmd.correlationId,
+        error: {
+          message: err.message,
+        },
+      });
+    }
+  }
+
+  private async executeTriggerSummary(
+    cmd: TriggerSummaryCommand
+  ): Promise<void> {
+    try {
+      // TODO: make debug logging better and count usage with proper metrics
+      console.log(
+        JSON.stringify({
+          cmd: {
+            ...cmd,
+            conversation: {
+              summarySize: cmd.conversation.summary?.length,
+              messagesCount: cmd.conversation.messages.length,
+            },
+          },
+        })
+      );
+
+      const { summary, usage } = await this.openAIService.summary(
+        cmd.conversation
+      );
+
+      await this.conversationCommandBus.send({
+        type: "PROCESS_SUMMARY_RESPONSE_COMMAND",
+        responseType: "BOT_SUMMARY_SUCCESS",
+        conversationId: cmd.conversationId,
+        correlationId: cmd.correlationId,
+        summary,
+        // We rely on the fact that only 1 completion is done; this number could be wrong
+        // if we used `best_of` and `n` parameters.
+        summaryTokens: usage.completionTokens,
+        totalTokensSpent: usage.totalTokens,
+      });
+    } catch (err: any) {
+      await this.conversationCommandBus.send({
+        type: "PROCESS_SUMMARY_RESPONSE_COMMAND",
+        responseType: "BOT_SUMMARY_ERROR",
         conversationId: cmd.conversationId,
         correlationId: cmd.correlationId,
         error: {
```
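The handler above forwards the OpenAI-reported usage (`summaryTokens`, `totalTokensSpent`) back to the conversation aggregate, which the README says is counted against `maximumSpentTokens`. A rough sketch of how such usage figures could be accumulated and checked against a spending cap follows; the `Usage` shape mirrors OpenAI's usage fields, but the `TokenBudget` class and its names are illustrative assumptions, not the repository's actual aggregate code.

```typescript
// Per-request usage as reported by the OpenAI API (the source of truth).
interface Usage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number; // promptTokens + completionTokens
}

// Hypothetical accumulator for a single conversation's token spending.
class TokenBudget {
  private spent = 0;

  constructor(private readonly maximumSpentTokens: number) {}

  // Record the usage of one completion or summarization request.
  record(usage: Usage): void {
    this.spent += usage.totalTokens;
  }

  get totalSpent(): number {
    return this.spent;
  }

  // True once the conversation has exhausted its budget.
  get exceeded(): boolean {
    return this.spent > this.maximumSpentTokens;
  }
}
```

Because every completion and summarization request reports its own usage, the running total stays accurate even as summaries replace older messages in the prompt.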

src/application/slack-adapter/conversation-event-handler.ts

Lines changed: 6 additions & 6 deletions

```diff
@@ -1,7 +1,7 @@
 import { WebClient } from "@slack/web-api";
 import {
   BotResponseAdded,
-  BotResponseRequested,
+  BotCompletionRequested,
   ConversationEnded,
   ConversationEvent,
   ConversationStarted,
@@ -20,8 +20,8 @@ export class ConversationEventHandler {
     switch (event.type) {
       case "CONVERSATION_STARTED":
         return this.handleConversationStarted(event);
-      case "BOT_RESPONSE_REQUESTED":
-        return this.handleBotResponseRequested(event);
+      case "BOT_COMPLETION_REQUESTED":
+        return this.handleBotCompletionRequested(event);
       case "BOT_RESPONSE_ADDED":
         return this.handleBotResponseAdded(event);
       case "CONVERSATION_ENDED":
@@ -43,8 +43,8 @@ export class ConversationEventHandler {
     });
   }
 
-  private async handleBotResponseRequested(
-    event: BotResponseRequested
+  private async handleBotCompletionRequested(
+    event: BotCompletionRequested
   ): Promise<void> {
     const [view, slackService] = await Promise.all([
       this.getOrFailByConversationId(event.conversationId),
@@ -98,7 +98,7 @@ export class ConversationEventHandler {
 
     await this.repository.update(updatedView);
 
-    if (event.reason.type === "BOT_RESPONSE_ERROR") {
+    if (event.reason.type === "BOT_COMPLETION_ERROR") {
       await this.completeBotMessage({
         view: updatedView,
         slackService,
```

src/config.ts

Lines changed: 8 additions & 0 deletions

```diff
@@ -14,5 +14,13 @@ export default {
     // all summarization and completion requests are counted towards the total tokens
     // persona adds overhead to each request
     maximumSpentTokens: Number.MAX_SAFE_INTEGER,
+    // conditions for triggering summarization
+    summarization: {
+      // the minimum token sum that all human/bot messages (since the last summarization) should reach
+      // the summary size itself is not included in this count
+      minimumTokens: 500,
+      // minimum number of user messages since the last summarization
+      minimumUserMessages: 2,
+    },
   },
 };
```
Lines changed: 10 additions & 3 deletions

```diff
@@ -1,4 +1,4 @@
-import { Message } from "@wisegpt/gpt-conversation-prompt";
+import { Conversation } from "@wisegpt/gpt-conversation-prompt";
 
 type BaseCommand = {
   conversationId: string;
@@ -7,7 +7,14 @@ type BaseCommand = {
 
 export type TriggerCompletionCommand = BaseCommand & {
   type: "TRIGGER_COMPLETION_COMMAND";
-  messages: Message[];
+  conversation: Conversation;
 };
 
-export type ConversationAICommand = TriggerCompletionCommand;
+export type TriggerSummaryCommand = BaseCommand & {
+  type: "TRIGGER_SUMMARY_COMMAND";
+  conversation: Conversation;
+};
+
+export type ConversationAICommand =
+  | TriggerCompletionCommand
+  | TriggerSummaryCommand;
```
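The widened `ConversationAICommand` union pairs with the `assertUnreachable` pattern seen in the command handlers: the compiler forces every command variant to be handled. A minimal self-contained sketch of that pattern, with simplified command shapes (the real commands carry a `conversation` payload and extend `BaseCommand`):

```typescript
// Simplified stand-ins for the repository's command types.
type TriggerCompletionCommand = {
  type: "TRIGGER_COMPLETION_COMMAND";
  conversationId: string;
};

type TriggerSummaryCommand = {
  type: "TRIGGER_SUMMARY_COMMAND";
  conversationId: string;
};

type ConversationAICommand = TriggerCompletionCommand | TriggerSummaryCommand;

// Compiles only when every union variant has been narrowed away, so adding
// a new command without a matching `case` becomes a compile-time error.
function assertUnreachable(x: never): never {
  throw new Error(`unknown command: ${JSON.stringify(x)}`);
}

function describe(cmd: ConversationAICommand): string {
  switch (cmd.type) {
    case "TRIGGER_COMPLETION_COMMAND":
      return `completion for ${cmd.conversationId}`;
    case "TRIGGER_SUMMARY_COMMAND":
      return `summary for ${cmd.conversationId}`;
    default:
      return assertUnreachable(cmd);
  }
}
```

This is why the handlers can dispatch on `cmd.type` without a runtime registry: the discriminated union carries the routing information in the type itself.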
