Commit d445e4a

SimpleChatTC:Reasoning+: Update readme wrt reasoning, flow cleanup
Also cleanup the minimal based showing of chat messages a bit. And add github.com to allowed list.
1 parent fac185d commit d445e4a

File tree

3 files changed: +76 -22 lines changed


tools/server/public_simplechat/local.tools/simpleproxy.json

Lines changed: 2 additions & 1 deletion
@@ -37,7 +37,8 @@
     "^theprint\\.in$",
     ".*\\.ndtv\\.com$",
     "^lwn\\.net$",
-    "^arstechnica\\.com$"
+    "^arstechnica\\.com$",
+    ".*\\.github\\.com$"
   ],
   "bearer.insecure": "NeverSecure"
 }
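The allowed-list entries above are regular expressions matched against the hostname a tool call wants to fetch from. A minimal sketch of the assumed matching semantics (the actual check lives in simpleproxy.py; `isHostAllowed` is an illustrative name, not the proxy's real function):

```javascript
// Hedged sketch: the proxy presumably tests the target hostname against
// each allow-list pattern and permits the fetch on the first match.
// Patterns copied from the json above.
const allowed = [
    "^theprint\\.in$",
    ".*\\.ndtv\\.com$",
    "^lwn\\.net$",
    "^arstechnica\\.com$",
    ".*\\.github\\.com$",
];

function isHostAllowed(host) {
    return allowed.some((pat) => new RegExp(pat).test(host));
}

console.log(isHostAllowed("raw.github.com")); // true
console.log(isHostAllowed("github.com"));     // false: pattern requires a dot before github.com
```

Note that `.*\.github\.com$` matches subdomains such as raw.github.com but not the bare host github.com; if the bare host is wanted too, a separate pattern like `^github\.com$` would have to be added as well.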

tools/server/public_simplechat/readme.md

Lines changed: 72 additions & 19 deletions
@@ -14,17 +14,20 @@ Continue reading for the details.
 ## overview
 
 This simple web frontend allows triggering/testing the server's /completions or /chat/completions endpoints
-in a simple way with minimal code from a common code base. Inturn additionally it tries to allow single or
-multiple independent back and forth chatting to an extent, with the ai llm model at a basic level, with their
-own system prompts.
+in a simple way with minimal code from a common code base. Additionally it allows end users to have
+single or multiple independent chat sessions, with back and forth chatting with the ai llm model at a
+basic level, each with its own system prompt.
 
 This allows seeing the generated text / ai-model response in oneshot at the end, after it is fully generated,
 or potentially as it is being generated, in a streamed manner from the server/ai-model.
 
 ![Chat and Settings (old) screens](./simplechat_screens.webp "Chat and Settings (old) screens")
 
 Auto saves the chat session locally as and when the chat is progressing, and in turn at a later time when you
-open SimpleChat, option is provided to restore the old chat session, if a matching one exists.
+open SimpleChat, an option is provided to restore the old chat session, if a matching one exists. In turn, if
+any of those chat sessions were pending wrt the user triggering a tool call or submitting a tool call
+response, the ui is set up as needed for the end user to continue those previously saved sessions from
+where they left off.
 
 The UI follows a responsive web design so that the layout can adapt to available display space in a usable
 enough manner, in general.
@@ -36,12 +39,17 @@ settings ui.
 For GenAi/LLM models supporting tool / function calling, allows one to interact with them and explore use of
 ai driven augmenting of the knowledge used for generating answers as well as for cross checking ai generated
 answers logically / programmatically and by checking with other sources, and a lot more, by making use of the
-predefined tools / functions. The end user is provided control over tool calling and response submitting.
+simple yet useful predefined tools / functions provided by this client web ui. The end user is provided full
+control over tool calling and response submitting.
 
-NOTE: Current web service api doesnt expose the model context length directly, so client logic doesnt provide
-any adaptive culling of old messages nor of replacing them with summary of their content etal. However there
-is a optional sliding window based chat logic, which provides a simple minded culling of old messages from
-the chat history before sending to the ai model.
+For GenAi/LLM models which support reasoning, the thinking of the model is shown to the end user as the
+model runs through its reasoning.
+
+NOTE: As genai/llm web service apis may or may not expose the model context length directly, and as using
+ai out of band for additional parallel work may not be efficient given the load genai/llm models place on
+current systems, the client logic doesn't provide any adaptive culling of old messages, nor replacing them
+with a summary of their content. However there is an optional sliding window based chat logic, which
+provides a simple minded culling of old messages from the chat history before sending to the ai model.
 
 NOTE: Wrt options sent with the request, it mainly sets temperature, max_tokens and optionally stream as well
 as tool_calls mainly for now. However if someone wants they can update the js file or equivalent member in
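The sliding window based culling mentioned in the NOTE could be sketched roughly as below (the window size and message shape here are assumptions for illustration; the actual logic lives in the client's js):

```javascript
// Hedged sketch of sliding window culling: keep any system prompt plus
// only the last windowSize chat messages when building the request context.
function applySlidingWindow(messages, windowSize) {
    const system = messages.filter((m) => m.role === "system");
    const rest = messages.filter((m) => m.role !== "system");
    return system.concat(rest.slice(-windowSize));
}

const history = [
    { role: "system", content: "be brief" },
    { role: "user", content: "q1" },
    { role: "assistant", content: "a1" },
    { role: "user", content: "q2" },
    { role: "assistant", content: "a2" },
    { role: "user", content: "q3" },
];
console.log(applySlidingWindow(history, 3).map((m) => m.content));
// [ 'be brief', 'q2', 'a2', 'q3' ]
```

A window measured in messages is simple minded by design, as the readme says: without knowing the model's context length in tokens, the client cannot cull adaptively.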
@@ -110,7 +118,7 @@ Once inside
 * try trim garbage in response or not
 * amount of chat history in the context sent to server/ai-model
 * oneshot or streamed mode.
-* use built in tool calling or not
+* use built in tool calling or not, and its related params.
 
 * In completion mode >> note: most recent work has been in chat mode <<
   * one normally doesn't use a system prompt in completion mode.
@@ -149,6 +157,9 @@ Once inside
   * the user input box will be disabled and a working message will be shown in it.
   * if trim garbage is enabled, the logic will try to trim repeating text kind of garbage to some extent.
 
+* any reasoning / thinking by the model is shown to the end user, as it is occurring, if the ai model
+  shares the same over the http interface.
+
 * tool calling flow when working with ai models which support tool / function calling
   * if tool calling is enabled and the user query results in the need for one of the builtin tools to be
     called, then the ai response might include a request for a tool call.
@@ -159,6 +170,9 @@ Once inside
     ie <tool_response> generated result with meta data </tool_response>
   * if the user is ok with the tool response, they can click submit to send the same to the GenAi/LLM.
     The user can even modify the response generated by the tool, if required, before submitting.
+  * ALERT: Sometimes the reasoning or chat from the ai model may indicate a tool call, but you may not
+    actually get/see a tool call; in such situations, don't forget to cross check that tool calling is
+    enabled in the settings.
 
 * just refresh the page, to reset wrt the chat history and or system prompt and start afresh.
   This also helps if you had forgotten to start the bundled simpleproxy.py server before hand.
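The tool calling flow above follows the OpenAI-style message shapes that llama.cpp's server mimics over /chat/completions. A hedged sketch of one handshake round (the tool name run_javascript and the ids are illustrative only, not SimpleChat's actual tool set):

```javascript
// One round of the tool call handshake, as chat messages.
const messages = [
    { role: "user", content: "What is 2**10?" },
    // assistant turn: instead of answering, the model requests a tool call
    {
        role: "assistant",
        content: "",
        tool_calls: [{
            id: "call_0",
            type: "function",
            function: { name: "run_javascript", arguments: "{\"code\": \"console.log(2**10)\"}" },
        }],
    },
    // tool turn: the client executes the tool (after user approval, in
    // SimpleChat's flow) and submits the generated result back
    { role: "tool", tool_call_id: "call_0", content: "1024" },
];
// the next assistant turn would then use the tool result to answer
console.log(messages.map((m) => m.role)); // [ 'user', 'assistant', 'tool' ]
```

Note that `function.arguments` is a json-encoded string, not a nested object, and the tool response must echo back the `id` of the tool call it answers.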
@@ -372,8 +386,7 @@ needed to help generate better responses. this can also be used for
 * searching for specific topics and summarising the results
 * or so
 
-The tool calling feature has been tested with Gemma3N, Granite4 and GptOss (given that
-reasoning is currently unsupported by this client ui, it can mess with things)
+The tool calling feature has been tested with Gemma3N, Granite4 and GptOss.
 
 ALERT: The simple minded way in which this is implemented, it provides some minimal safety
 mechanism like running ai generated code in web workers and restricting web access to user
@@ -454,7 +467,8 @@ Provide a handler which
 * rather in some cases constructs the code to be run to get the tool / function call job done,
   and in turn pass the same to the provided web worker to get it executed. Use console.log while
   generating any response that should be sent back to the ai model, in your constructed code.
-* once the job is done, return the generated result as needed.
+* once the job is done, return the generated result as needed, along with the tool call related meta
+  data like chatSessionId, toolCallId, toolName which were passed along with the tool call.
 
 Update the tc_switch to include an object entry for the tool, which in turn includes
 * the meta data wrt the tool call
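A handler returning its result together with the meta data mentioned above could look roughly like this. The handler signature and the calculator tool itself are assumptions for illustration; only the meta data field names (chatSessionId, toolCallId, toolName) come from the text above:

```javascript
// Hypothetical tool handler: evaluate a simple arithmetic expression and
// return the result along with the meta data that arrived with the call.
function simpleCalculatorHandler(meta, args) {
    // meta: { chatSessionId, toolCallId, toolName }, passed with the tool call
    const result = Function(`"use strict"; return (${args.expr});`)();
    return {
        chatSessionId: meta.chatSessionId,
        toolCallId: meta.toolCallId,
        toolName: meta.toolName,
        result: String(result),
    };
}

const out = simpleCalculatorHandler(
    { chatSessionId: "Default", toolCallId: "call_0", toolName: "simple_calculator" },
    { expr: "2 + 3 * 4" }
);
console.log(out.result); // "14"
```

Echoing the meta data back lets the client route the result to the right chat session and pending tool call, which matters once multiple sessions can each be mid-handshake.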
@@ -495,24 +509,63 @@ gets executed, before tool calling returns and thus data / error generated by th
 get incorporated in result sent to ai engine on the server side.
 
 
-### ToDo
+### Progress
+
+#### Done
+
+Tool calling support added, along with a bunch of useful tool calls, as well as a bundled simple proxy
+if one wants to access the web as part of tool call usage.
+
+Reasoning / thinking responses from ai models are shown to the user, as they are being generated/shared.
+
+Chat messages/session and ui handling have been moved into corresponding classes to an extent; this
+helps ensure that
+* switching chat sessions or loading a previously auto saved chat session will restore state, including
+  the ui, such that the end user can continue the chat session from where they left it, even if in the
+  middle of a tool call handshake.
+* new fields added to the http handshake in oneshot or streaming mode can be handled in a structured
+  way, to an extent.
+
+#### ToDo
 
 Is the tool call promise land trap deep enough; need to think through and explore around this once later.
 
 Trap error responses.
 
-Handle reasoning/thinking responses from ai models.
-
 Handle multimodal handshaking with ai models.
 
 Add fetch_rss and documents|data_store tool calling, through the simpleproxy.py if and where needed.
 
+Save used config entries along with the auto saved chat sessions, and in turn give the option to reload
+the same when a saved chat is loaded.
+
+MAYBE make the settings chat session specific in general, rather than the current global config flow.
+
+
+### Debugging the handshake and beyond
+
+When working with a llama.cpp server based GenAi/LLM running locally, to look at the handshake directly
+from the commandline, you could run something like below
+
-### Debuging the handshake
+* sudo tcpdump -i lo -s 0 -vvv -A host 127.0.0.1 and port 8080 | tee /tmp/td.log
+* or one could also look at the network tab in the browser developer console
 
-When working with llama.cpp server based GenAi/LLM running locally
+One could always remove message entries or manipulate chat sessions by accessing document['gMe']
+in the devel console of the browser
 
-sudo tcpdump -i lo -s 0 -vvv -A host 127.0.0.1 and port 8080 | tee /tmp/td.log
+* if you want the last tool call response you submitted to be re-available for tool call execution and
+  fresh resubmitting of the response, for any reason, follow the steps below
+  * remove the assistant response from the end of the chat session, if any, using
+    * document['gMe'].multiChat.simpleChats['SessionId'].xchat.pop()
+  * reset the role of the tool response chat message to TOOL.TEMP from tool
+    * toolMessageIndex = document['gMe'].multiChat.simpleChats['SessionId'].xchat.length - 1
+    * document['gMe'].multiChat.simpleChats['SessionId'].xchat[toolMessageIndex].role = "TOOL.TEMP"
+  * clicking on the SessionId at the top in the ui should refresh the chat ui, and it should now give
+    the option to control that tool call again
+* this can also help in the case where the chat session fails with context window exceeded
+  * you restart the GenAi/LLM server after increasing the context window as needed
+  * edit the chat session history as mentioned above, to the extent needed
+  * resubmit the last needed user/tool response as needed
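The devel console surgery described in those steps can be modeled with a toy xchat array; the message shapes here are illustrative, not simplechat.js's exact internals:

```javascript
// Toy model of the chat history surgery: pop the trailing assistant
// entry, then re-mark the tool response as pending (TOOL.TEMP) so the
// ui re-arms the tool call controls for that message.
const xchat = [
    { role: "user", content: "what is 2**10" },
    { role: "tool", content: "<tool_response> 1024 </tool_response>" },
    { role: "assistant", content: "partial answer before the context blew up" },
];

// step 1: drop the trailing assistant response, if any
if (xchat[xchat.length - 1].role === "assistant") {
    xchat.pop();
}

// step 2: reset the tool response's role from tool to TOOL.TEMP
const toolMessageIndex = xchat.length - 1;
xchat[toolMessageIndex].role = "TOOL.TEMP";

console.log(xchat.map((m) => m.role)); // [ 'user', 'TOOL.TEMP' ]
```

Clicking the SessionId in the ui then rebuilds the chat display from this edited xchat, which is why the tool call controls reappear.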
 
## At the end

tools/server/public_simplechat/simplechat.js

Lines changed: 2 additions & 2 deletions
@@ -261,7 +261,7 @@ class ChatMessageEx {
         let content = ""
         let toolcall = ""
         if (this.ns.reasoning_content.trim() !== "") {
-            reasoning = `!!!Reasoning: ${this.ns.reasoning_content.trim()} !!!\n`;
+            reasoning = `!!!Reasoning: ${this.ns.reasoning_content.trim()} !!!\n\n`;
         }
         if (this.ns.content !== "") {
             content = this.ns.content;
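The effect of the change above (a blank line after the reasoning block) can be seen with a stripped-down rendering function; this is a toy model of the message-to-text assembly, not ChatMessageEx itself:

```javascript
// Toy version of the reasoning + content assembly: with "\n\n" the
// reasoning block is separated from the main content by a blank line.
function renderMessage(reasoningContent, content) {
    let reasoning = "";
    if (reasoningContent.trim() !== "") {
        reasoning = `!!!Reasoning: ${reasoningContent.trim()} !!!\n\n`;
    }
    return reasoning + content;
}

console.log(renderMessage(" thinking... ", "final answer"));
console.log(renderMessage("", "just content")); // no reasoning prefix at all
```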
@@ -898,7 +898,7 @@ class MultiChatUI {
             }
             continue
         }
-        let entry = ui.el_create_append_p(`${x.ns.role}: ${x.content_equiv()}`, this.elDivChat);
+        let entry = ui.el_create_append_p(`[[ ${x.ns.role} ]]: ${x.content_equiv()}`, this.elDivChat);
         entry.className = `role-${x.ns.role}`;
         last = entry;
         if (x.ns.role === Roles.Assistant) {
