Commit 237fc4f

feat: add summarization to optimize GPT usage (#7)
* feat: add summarization of conversation according to config
* docs: add summarization logic related readme section, remove disclaimer
1 parent 8e99161 commit 237fc4f

File tree

11 files changed: +474 −94 lines

README.md

Lines changed: 43 additions & 4 deletions

````diff
@@ -1,16 +1,18 @@
 # @wisegpt/slack-bot
 
-Slack Bot for communicating with WiseGPT. Uses OpenAI GPT-3 to simulate a ChatGPT like conversation. The GPT model is in early development and not optimized, will consume more token as the conversation gets bigger.
+Slack Bot for communicating with WiseGPT. Uses OpenAI GPT-3 to simulate a ChatGPT-like conversation. With each new message, the whole conversation is sent to GPT for completion. The conversation is summarized (per configuration parameters) to keep the prompts small, even as the conversation gets long.
 
 https://user-images.githubusercontent.com/3743507/209952150-4555aee0-3f1b-4481-893a-0675a6108e3d.mp4
 
 ## Bot Features
 
-- Persona of the bot can be customized since it is using GPT-3
+- Persona of the bot can be customized
 - Only gets involved to the conversations that start with mentioning the bot E.g. `@WiseGPT hello!`
 - No need to mention bot again for further messages in the same thread
 - Keeps conversation history per thread
-- Can keep reference of multiple actors in the same conversation.
+- Can keep references to multiple actors and their details. Mentions actors properly when addressing them.
+- Conversations **are summarized** per configuration to keep token usage small
+- You can limit the maximum tokens used per conversation
 - Simple loading indicator is shown before calling OpenAI
 - Markdown and Slack Blocks are used for output
 
@@ -30,10 +32,47 @@ This project uses AWS CDK for deployment. All infrastructure is automatically pr
 6. Deploy with `AWS_PROFILE=your-profile npx projen deploy`
 7. Add output Slack Events Request URL to Slack API Events URL
 
+## Token Usage Limits
+
+The configuration of allowed maximum tokens and summarization is stored in `src/config.ts`, following the structure below:
+
+```typescript
+export default {
+  // ...
+  conversation: {
+    // ...
+    maximumSpentTokens: Number.MAX_SAFE_INTEGER,
+    // conditions for triggering summarization
+    summarization: {
+      // the minimum token sum that all human/bot messages (since the last summarization) should reach
+      // the summary size itself is not included in this count
+      minimumTokens: 500,
+      // minimum number of user messages since the last summarization
+      minimumUserMessages: 2,
+    },
+  },
+};
+```
+
+To limit how much can be spent per conversation, use `maximumSpentTokens` to set a limit that no conversation can exceed. The total is calculated from how many tokens all completion and summarization requests used. The source of truth for spent tokens is the OpenAI API itself.
+
+### Summarization Logic
+
+Every time the bot responds to the chat, the configuration is checked to decide whether a summarization should be done. If summarization is triggered, OpenAI GPT is asked to summarize the whole conversation into a paragraph.
+
+All further completion calls then use only the summary and the messages since the last summary. An example summary of a conversation: `<@U04G77EL6CW> asked <@bot> for a recursive Fibonacci function written in Typescript, with comments and explanation. <@bot> provided a code example and asked if there was anything else they could help with. <@U04G77EL6CW> then asked <@bot> to refactor the last code to be iterative, with more explanation. <@bot> provided a code example for an iterative version of the Fibonacci function.`
+
+The summarization decision is made each time a bot response is added to the conversation. The following checks decide whether to summarize:
+
+- The conversation is still ongoing: there are no errors, ongoing operations, etc.
+- The total conversation tokens are calculated from the messages sent since the last summary (both bot and user messages are counted), and `conversation.summarization.minimumTokens` is checked to make sure the minimum threshold is reached.
+- The number of non-bot messages since the last summary is calculated, and `conversation.summarization.minimumUserMessages` is checked to make sure the minimum threshold is reached.
+
+Summarization may happen multiple times. Each time, the previous summary, along with all messages since that summary, is sent. The first summarization request does not have any previous summary.
+
 ## Disclaimer
 
 1. The bot is in active early development and there maybe non-backwards compatible change. E.g. some older conversations may stop working after deploying a newer version of the bot.
-2. The token usage maybe high since there was no effort to optimize the used tokens for now. It is a planned feature.
 
 ## Thanks
````
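The summarization checks described in the README section above can be sketched roughly as follows. This is a minimal TypeScript sketch, not the repository's actual code: the `ConversationState` and `Message` shapes, the `shouldSummarize` name, and the inlined `config` object are illustrative assumptions.

```typescript
// Hypothetical shapes; the real aggregate types in the repository may differ.
interface Message {
  authorIsBot: boolean;
  tokens: number; // token count as reported by the OpenAI API
}

interface ConversationState {
  status: "ONGOING" | "ERRORED" | "BUSY" | "ENDED";
  totalTokensSpent: number;
  messagesSinceLastSummary: Message[];
}

// Mirrors the summarization block of src/config.ts shown above.
const config = {
  conversation: {
    maximumSpentTokens: Number.MAX_SAFE_INTEGER,
    summarization: { minimumTokens: 500, minimumUserMessages: 2 },
  },
};

// Returns true when a summarization should be triggered after a bot response.
function shouldSummarize(state: ConversationState): boolean {
  const { summarization } = config.conversation;

  // 1. Conversation must still be ongoing (no errors or pending operations).
  if (state.status !== "ONGOING") return false;

  // 2. Token sum of all human/bot messages since the last summary must
  //    reach the minimum threshold (the summary itself is excluded).
  const tokensSinceSummary = state.messagesSinceLastSummary.reduce(
    (sum, m) => sum + m.tokens,
    0
  );
  if (tokensSinceSummary < summarization.minimumTokens) return false;

  // 3. Enough non-bot messages must have arrived since the last summary.
  const userMessages = state.messagesSinceLastSummary.filter(
    (m) => !m.authorIsBot
  ).length;
  return userMessages >= summarization.minimumUserMessages;
}
```

All three conditions must hold at once, which matches the README's description of the decision being re-evaluated after every bot response.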

src/application/conversation/conversation-command.handler.ts

Lines changed: 15 additions & 2 deletions

```diff
@@ -6,6 +6,7 @@ import {
   ConversationCommand,
   CreateConversationCommand,
   ProcessCompletionResponseCommand,
+  ProcessSummaryResponseCommand,
 } from "../../domain/conversation/conversation.commands";
 import { ConversationAggregateDynamodbRepository } from "../../infrastructure/dynamodb/conversation-aggregate-dynamodb.repository";
 import { OpenAILambdaInvoke } from "../../infrastructure/lambdas/invoke/open-ai-lambda-invoke";
@@ -29,6 +30,8 @@ export class ConversationCommandHandler {
         return this.executeAddUserMessage(cmd);
       case "PROCESS_COMPLETION_RESPONSE_COMMAND":
         return this.executeProcessCompletionResponse(cmd);
+      case "PROCESS_SUMMARY_RESPONSE_COMMAND":
+        return this.executeProcessSummaryResponse(cmd);
       default:
         return assertUnreachable(cmd);
     }
@@ -71,8 +74,18 @@ export class ConversationCommandHandler {
   ): Promise<void> {
     return this.transaction(
       cmd.conversationId,
-      async (aggregate: ConversationAggregate) =>
-        aggregate.processCompletionResponse(cmd)
+      (aggregate: ConversationAggregate) =>
+        aggregate.processCompletionResponse(cmd, this.conversationAIService)
     );
   }
+
+  private async executeProcessSummaryResponse(
+    cmd: ProcessSummaryResponseCommand
+  ): Promise<void> {
+    return this.transaction(
+      cmd.conversationId,
+      (aggregate: ConversationAggregate) =>
+        aggregate.processSummaryResponse(cmd, this.conversationAIService)
+    );
+  }
```

src/application/open-ai/open-ai-command.handler.ts

Lines changed: 67 additions & 3 deletions

```diff
@@ -6,6 +6,7 @@ import { CommandBus, globalCommandBus } from "../../domain/bus/command-bus";
 import {
   ConversationAICommand,
   TriggerCompletionCommand,
+  TriggerSummaryCommand,
 } from "../../domain/conversation/ai/conversation-ai.commands";
 import { ConversationCommand } from "../../domain/conversation/conversation.commands";
 import { OpenAIService } from "../../infrastructure/openai/openai.service";
@@ -21,6 +22,9 @@ export class OpenAICommandHandler {
       case "TRIGGER_COMPLETION_COMMAND": {
         return this.executeTriggerCompletion(cmd);
       }
+      case "TRIGGER_SUMMARY_COMMAND": {
+        return this.executeTriggerSummary(cmd);
+      }
       default:
         throw new Error(`unknown type of command: ${JSON.stringify(cmd)}`);
     }
@@ -30,11 +34,26 @@ export class OpenAICommandHandler {
     cmd: TriggerCompletionCommand
   ): Promise<void> {
     try {
-      const { text, usage } = await this.openAIService.completion(cmd.messages);
+      // TODO: make debug logging better and count usage with proper metrics
+      console.log(
+        JSON.stringify({
+          cmd: {
+            ...cmd,
+            conversation: {
+              summarySize: cmd.conversation.summary?.length,
+              messagesCount: cmd.conversation.messages.length,
+            },
+          },
+        })
+      );
+
+      const { text, usage } = await this.openAIService.completion(
+        cmd.conversation
+      );
 
       await this.conversationCommandBus.send({
         type: "PROCESS_COMPLETION_RESPONSE_COMMAND",
-        botResponseType: "BOT_RESPONSE_SUCCESS",
+        responseType: "BOT_COMPLETION_SUCCESS",
         conversationId: cmd.conversationId,
         correlationId: cmd.correlationId,
         message: text,
@@ -46,7 +65,52 @@ export class OpenAICommandHandler {
     } catch (err: any) {
       await this.conversationCommandBus.send({
         type: "PROCESS_COMPLETION_RESPONSE_COMMAND",
-        botResponseType: "BOT_RESPONSE_ERROR",
+        responseType: "BOT_COMPLETION_ERROR",
+        conversationId: cmd.conversationId,
+        correlationId: cmd.correlationId,
+        error: {
+          message: err.message,
+        },
+      });
+    }
+  }
+
+  private async executeTriggerSummary(
+    cmd: TriggerSummaryCommand
+  ): Promise<void> {
+    try {
+      // TODO: make debug logging better and count usage with proper metrics
+      console.log(
+        JSON.stringify({
+          cmd: {
+            ...cmd,
+            conversation: {
+              summarySize: cmd.conversation.summary?.length,
+              messagesCount: cmd.conversation.messages.length,
+            },
+          },
+        })
+      );
+
+      const { summary, usage } = await this.openAIService.summary(
+        cmd.conversation
+      );
+
+      await this.conversationCommandBus.send({
+        type: "PROCESS_SUMMARY_RESPONSE_COMMAND",
+        responseType: "BOT_SUMMARY_SUCCESS",
+        conversationId: cmd.conversationId,
+        correlationId: cmd.correlationId,
+        summary,
+        // We rely on the fact that only 1 completion is done; this number could be wrong
+        // if we used `best_of` and `n` parameters.
+        summaryTokens: usage.completionTokens,
+        totalTokensSpent: usage.totalTokens,
+      });
+    } catch (err: any) {
+      await this.conversationCommandBus.send({
+        type: "PROCESS_SUMMARY_RESPONSE_COMMAND",
+        responseType: "BOT_SUMMARY_ERROR",
         conversationId: cmd.conversationId,
         correlationId: cmd.correlationId,
         error: {
```
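The handler above forwards the OpenAI-reported usage (`summaryTokens`, `totalTokensSpent`) back to the conversation aggregate, which the README says is counted against `maximumSpentTokens`. A rough sketch of how such usage figures could be accumulated and checked against a spending cap follows; the `Usage` shape mirrors OpenAI's usage fields, but the `TokenBudget` class and its names are illustrative assumptions, not the repository's actual aggregate code.

```typescript
// Per-request usage as reported by the OpenAI API (the source of truth).
interface Usage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number; // promptTokens + completionTokens
}

// Hypothetical accumulator for a single conversation's token spending.
class TokenBudget {
  private spent = 0;

  constructor(private readonly maximumSpentTokens: number) {}

  // Record the usage of one completion or summarization request.
  record(usage: Usage): void {
    this.spent += usage.totalTokens;
  }

  get totalSpent(): number {
    return this.spent;
  }

  // True once the conversation has exhausted its budget.
  get exceeded(): boolean {
    return this.spent > this.maximumSpentTokens;
  }
}
```

Because every completion and summarization request reports its own usage, the running total stays accurate even as summaries replace older messages in the prompt.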

src/application/slack-adapter/conversation-event-handler.ts

Lines changed: 6 additions & 6 deletions

```diff
@@ -1,7 +1,7 @@
 import { WebClient } from "@slack/web-api";
 import {
   BotResponseAdded,
-  BotResponseRequested,
+  BotCompletionRequested,
   ConversationEnded,
   ConversationEvent,
   ConversationStarted,
@@ -20,8 +20,8 @@ export class ConversationEventHandler {
     switch (event.type) {
       case "CONVERSATION_STARTED":
         return this.handleConversationStarted(event);
-      case "BOT_RESPONSE_REQUESTED":
-        return this.handleBotResponseRequested(event);
+      case "BOT_COMPLETION_REQUESTED":
+        return this.handleBotCompletionRequested(event);
       case "BOT_RESPONSE_ADDED":
         return this.handleBotResponseAdded(event);
       case "CONVERSATION_ENDED":
@@ -43,8 +43,8 @@ export class ConversationEventHandler {
     });
   }
 
-  private async handleBotResponseRequested(
-    event: BotResponseRequested
+  private async handleBotCompletionRequested(
+    event: BotCompletionRequested
   ): Promise<void> {
     const [view, slackService] = await Promise.all([
       this.getOrFailByConversationId(event.conversationId),
@@ -98,7 +98,7 @@ export class ConversationEventHandler {
 
     await this.repository.update(updatedView);
 
-    if (event.reason.type === "BOT_RESPONSE_ERROR") {
+    if (event.reason.type === "BOT_COMPLETION_ERROR") {
       await this.completeBotMessage({
         view: updatedView,
         slackService,
```

src/config.ts

Lines changed: 8 additions & 0 deletions

```diff
@@ -14,5 +14,13 @@ export default {
     // all summarization and completion requests are counted towards the total tokens
     // persona adds overhead to each request
     maximumSpentTokens: Number.MAX_SAFE_INTEGER,
+    // conditions for triggering summarization
+    summarization: {
+      // the minimum token sum that all human/bot messages (since the last summarization) should reach
+      // the summary size itself is not included in this count
+      minimumTokens: 500,
+      // minimum number of user messages since the last summarization
+      minimumUserMessages: 2,
+    },
   },
 };
```
Lines changed: 10 additions & 3 deletions

```diff
@@ -1,4 +1,4 @@
-import { Message } from "@wisegpt/gpt-conversation-prompt";
+import { Conversation } from "@wisegpt/gpt-conversation-prompt";
 
 type BaseCommand = {
   conversationId: string;
@@ -7,7 +7,14 @@ type BaseCommand = {
 
 export type TriggerCompletionCommand = BaseCommand & {
   type: "TRIGGER_COMPLETION_COMMAND";
-  messages: Message[];
+  conversation: Conversation;
 };
 
-export type ConversationAICommand = TriggerCompletionCommand;
+export type TriggerSummaryCommand = BaseCommand & {
+  type: "TRIGGER_SUMMARY_COMMAND";
+  conversation: Conversation;
+};
+
+export type ConversationAICommand =
+  | TriggerCompletionCommand
+  | TriggerSummaryCommand;
```
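The widened `ConversationAICommand` union pairs with the `assertUnreachable` pattern seen in the command handlers: the compiler forces every command variant to be handled. A minimal self-contained sketch of that pattern, with simplified command shapes (the real commands carry a `conversation` payload and extend `BaseCommand`):

```typescript
// Simplified stand-ins for the repository's command types.
type TriggerCompletionCommand = {
  type: "TRIGGER_COMPLETION_COMMAND";
  conversationId: string;
};

type TriggerSummaryCommand = {
  type: "TRIGGER_SUMMARY_COMMAND";
  conversationId: string;
};

type ConversationAICommand = TriggerCompletionCommand | TriggerSummaryCommand;

// Compiles only when every union variant has been narrowed away, so adding
// a new command without a matching `case` becomes a compile-time error.
function assertUnreachable(x: never): never {
  throw new Error(`unknown command: ${JSON.stringify(x)}`);
}

function describe(cmd: ConversationAICommand): string {
  switch (cmd.type) {
    case "TRIGGER_COMPLETION_COMMAND":
      return `completion for ${cmd.conversationId}`;
    case "TRIGGER_SUMMARY_COMMAND":
      return `summary for ${cmd.conversationId}`;
    default:
      return assertUnreachable(cmd);
  }
}
```

This is why the handlers can dispatch on `cmd.type` without a runtime registry: the discriminated union carries the routing information in the type itself.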
