Commit d445e4a

SimpleChatTC:Reasoning+: Update readme wrt reasoning, flow cleanup
Also cleanup the minimal based showing of chat messages a bit. And add github.com to allowed list.
1 parent fac185d commit d445e4a

File tree

3 files changed: +76 -22 lines changed


tools/server/public_simplechat/local.tools/simpleproxy.json

Lines changed: 2 additions & 1 deletion
@@ -37,7 +37,8 @@
     "^theprint\\.in$",
     ".*\\.ndtv\\.com$",
     "^lwn\\.net$",
-    "^arstechnica\\.com$"
+    "^arstechnica\\.com$",
+    ".*\\.github\\.com$"
   ],
   "bearer.insecure": "NeverSecure"
 }
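The allowed-list entries above are regular expressions matched against the hostname a tool call wants to fetch from. A minimal sketch of the assumed matching semantics (the actual check lives in simpleproxy.py; `isHostAllowed` is an illustrative name, not the proxy's real function):

```javascript
// Hedged sketch: the proxy presumably tests the target hostname against
// each allow-list pattern and permits the fetch on the first match.
// Patterns copied from the json above.
const allowed = [
    "^theprint\\.in$",
    ".*\\.ndtv\\.com$",
    "^lwn\\.net$",
    "^arstechnica\\.com$",
    ".*\\.github\\.com$",
];

function isHostAllowed(host) {
    return allowed.some((pat) => new RegExp(pat).test(host));
}

console.log(isHostAllowed("raw.github.com")); // true
console.log(isHostAllowed("github.com"));     // false: pattern requires a dot before github.com
```

Note that `.*\.github\.com$` matches subdomains such as raw.github.com but not the bare host github.com; if the bare host is wanted too, a separate pattern like `^github\.com$` would have to be added as well.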

tools/server/public_simplechat/readme.md

Lines changed: 72 additions & 19 deletions
@@ -14,17 +14,20 @@ Continue reading for the details.
 ## overview
 
 This simple web frontend allows triggering/testing the server's /completions or /chat/completions endpoints
-in a simple way with minimal code from a common code base. Inturn additionally it tries to allow single or
-multiple independent back and forth chatting to an extent, with the ai llm model at a basic level, with their
-own system prompts.
+in a simple way with minimal code from a common code base. Additionally it allows end users to have
+single or multiple independent chat sessions, with back and forth chatting with the ai llm model at a
+basic level, each with its own system prompt.
 
 This allows seeing the generated text / ai-model response in oneshot at the end, after it is fully generated,
 or potentially as it is being generated, in a streamed manner from the server/ai-model.
 
 ![Chat and Settings (old) screens](./simplechat_screens.webp "Chat and Settings (old) screens")
 
 Auto saves the chat session locally as and when the chat is progressing, and in turn at a later time when you
-open SimpleChat, option is provided to restore the old chat session, if a matching one exists.
+open SimpleChat, an option is provided to restore the old chat session, if a matching one exists. In turn, if
+any of those chat sessions were pending wrt the user triggering a tool call or submitting a tool call
+response, the ui is set up as needed for the end user to continue those previously saved sessions from
+where they left off.
 
 The UI follows a responsive web design so that the layout can adapt to available display space in a usable
 enough manner, in general.
@@ -36,12 +39,17 @@ settings ui.
 For GenAi/LLM models supporting tool / function calling, allows one to interact with them and explore use of
 ai driven augmenting of the knowledge used for generating answers as well as for cross checking ai generated
 answers logically / programmatically and by checking with other sources, and a lot more, by making use of the
-predefined tools / functions. The end user is provided control over tool calling and response submitting.
+simple yet useful predefined tools / functions provided by this client web ui. The end user is provided full
+control over tool calling and response submitting.
 
-NOTE: Current web service api doesnt expose the model context length directly, so client logic doesnt provide
-any adaptive culling of old messages nor of replacing them with summary of their content etal. However there
-is a optional sliding window based chat logic, which provides a simple minded culling of old messages from
-the chat history before sending to the ai model.
+For GenAi/LLM models which support reasoning, the thinking of the model is shown to the end user as the
+model runs through its reasoning.
+
+NOTE: As genai/llm web service apis may or may not expose the model context length directly, and as using
+ai out of band for additional parallel work may not be efficient given the load genai/llm models place on
+current systems, the client logic doesn't provide any adaptive culling of old messages, nor replacing them
+with a summary of their content. However there is an optional sliding window based chat logic, which
+provides a simple minded culling of old messages from the chat history before sending to the ai model.
 
 NOTE: Wrt options sent with the request, it mainly sets temperature, max_tokens and optionally stream as well
 as tool_calls mainly for now. However if someone wants they can update the js file or equivalent member in
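The sliding window based culling mentioned in the NOTE could be sketched roughly as below (the window size and message shape here are assumptions for illustration; the actual logic lives in the client's js):

```javascript
// Hedged sketch of sliding window culling: keep any system prompt plus
// only the last windowSize chat messages when building the request context.
function applySlidingWindow(messages, windowSize) {
    const system = messages.filter((m) => m.role === "system");
    const rest = messages.filter((m) => m.role !== "system");
    return system.concat(rest.slice(-windowSize));
}

const history = [
    { role: "system", content: "be brief" },
    { role: "user", content: "q1" },
    { role: "assistant", content: "a1" },
    { role: "user", content: "q2" },
    { role: "assistant", content: "a2" },
    { role: "user", content: "q3" },
];
console.log(applySlidingWindow(history, 3).map((m) => m.content));
// [ 'be brief', 'q2', 'a2', 'q3' ]
```

A window measured in messages is simple minded by design, as the readme says: without knowing the model's context length in tokens, the client cannot cull adaptively.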
@@ -110,7 +118,7 @@ Once inside
 * try trim garbage in response or not
 * amount of chat history in the context sent to server/ai-model
 * oneshot or streamed mode.
-* use built in tool calling or not
+* use built in tool calling or not, and its related params.
 
 * In completion mode >> note: most recent work has been in chat mode <<
   * one normally doesn't use a system prompt in completion mode.
@@ -149,6 +157,9 @@ Once inside
   * the user input box will be disabled and a working message will be shown in it.
   * if trim garbage is enabled, the logic will try to trim repeating text kind of garbage to some extent.
 
+* any reasoning / thinking by the model is shown to the end user, as it is occurring, if the ai model
+  shares the same over the http interface.
+
 * tool calling flow when working with ai models which support tool / function calling
   * if tool calling is enabled and the user query results in the need for one of the builtin tools to be
     called, then the ai response might include a request for a tool call.
@@ -159,6 +170,9 @@ Once inside
     ie <tool_response> generated result with meta data </tool_response>
   * if the user is ok with the tool response, they can click submit to send the same to the GenAi/LLM.
     The user can even modify the response generated by the tool, if required, before submitting.
+  * ALERT: Sometimes the reasoning or chat from the ai model may indicate a tool call, but you may not
+    actually get/see a tool call; in such situations, don't forget to cross check that tool calling is
+    enabled in the settings.
 
 * just refresh the page, to reset wrt the chat history and or system prompt and start afresh.
   This also helps if you had forgotten to start the bundled simpleproxy.py server before hand.
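The tool calling flow above follows the OpenAI-style message shapes that llama.cpp's server mimics over /chat/completions. A hedged sketch of one handshake round (the tool name run_javascript and the ids are illustrative only, not SimpleChat's actual tool set):

```javascript
// One round of the tool call handshake, as chat messages.
const messages = [
    { role: "user", content: "What is 2**10?" },
    // assistant turn: instead of answering, the model requests a tool call
    {
        role: "assistant",
        content: "",
        tool_calls: [{
            id: "call_0",
            type: "function",
            function: { name: "run_javascript", arguments: "{\"code\": \"console.log(2**10)\"}" },
        }],
    },
    // tool turn: the client executes the tool (after user approval, in
    // SimpleChat's flow) and submits the generated result back
    { role: "tool", tool_call_id: "call_0", content: "1024" },
];
// the next assistant turn would then use the tool result to answer
console.log(messages.map((m) => m.role)); // [ 'user', 'assistant', 'tool' ]
```

Note that `function.arguments` is a json-encoded string, not a nested object, and the tool response must echo back the `id` of the tool call it answers.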
@@ -372,8 +386,7 @@ needed to help generate better responses. this can also be used for
 * searching for specific topics and summarising the results
 * or so
 
-The tool calling feature has been tested with Gemma3N, Granite4 and GptOss (given that
-reasoning is currently unsupported by this client ui, it can mess with things)
+The tool calling feature has been tested with Gemma3N, Granite4 and GptOss.
 
 ALERT: The simple minded way in which this is implemented, it provides some minimal safety
 mechanism like running ai generated code in web workers and restricting web access to user
@@ -454,7 +467,8 @@ Provide a handler which
 * rather in some cases constructs the code to be run to get the tool / function call job done,
   and in turn pass the same to the provided web worker to get it executed. Use console.log while
   generating any response that should be sent back to the ai model, in your constructed code.
-* once the job is done, return the generated result as needed.
+* once the job is done, return the generated result as needed, along with the tool call related meta
+  data like chatSessionId, toolCallId, toolName which were passed along with the tool call.
 
 Update the tc_switch to include an object entry for the tool, which in turn includes
 * the meta data wrt the tool call
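A handler returning its result together with the meta data mentioned above could look roughly like this. The handler signature and the calculator tool itself are assumptions for illustration; only the meta data field names (chatSessionId, toolCallId, toolName) come from the text above:

```javascript
// Hypothetical tool handler: evaluate a simple arithmetic expression and
// return the result along with the meta data that arrived with the call.
function simpleCalculatorHandler(meta, args) {
    // meta: { chatSessionId, toolCallId, toolName }, passed with the tool call
    const result = Function(`"use strict"; return (${args.expr});`)();
    return {
        chatSessionId: meta.chatSessionId,
        toolCallId: meta.toolCallId,
        toolName: meta.toolName,
        result: String(result),
    };
}

const out = simpleCalculatorHandler(
    { chatSessionId: "Default", toolCallId: "call_0", toolName: "simple_calculator" },
    { expr: "2 + 3 * 4" }
);
console.log(out.result); // "14"
```

Echoing the meta data back lets the client route the result to the right chat session and pending tool call, which matters once multiple sessions can each be mid-handshake.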
@@ -495,24 +509,63 @@ gets executed, before tool calling returns and thus data / error generated by th
 get incorporated in result sent to ai engine on the server side.
 
 
-### ToDo
+### Progress
+
+#### Done
+
+Tool calling support added, along with a bunch of useful tool calls, as well as a bundled simple proxy
+if one wants to access the web as part of tool call usage.
+
+Reasoning / thinking responses from ai models are shown to the user, as they are being generated/shared.
+
+Chat messages/session and ui handling have been moved into corresponding classes to an extent; this
+helps ensure that
+* switching chat sessions or loading a previously auto saved chat session will restore state, including
+  the ui, such that the end user can continue the chat session from where they left it, even if in the
+  middle of a tool call handshake.
+* new fields added to the http handshake in oneshot or streaming mode can be handled in a structured
+  way, to an extent.
+
+#### ToDo
 
 Is the tool call promise land trap deep enough; need to think through and explore around this once later.
 
 Trap error responses.
 
-Handle reasoning/thinking responses from ai models.
-
 Handle multimodal handshaking with ai models.
 
 Add fetch_rss and documents|data_store tool calling, through the simpleproxy.py if and where needed.
 
+Save used config entries along with the auto saved chat sessions, and in turn give the option to reload
+the same when a saved chat is loaded.
+
+MAYBE make the settings chat session specific in general, rather than the current global config flow.
+
+
+### Debugging the handshake and beyond
+
+When working with a llama.cpp server based GenAi/LLM running locally, to look at the handshake directly
+from the commandline, you could run something like below
+
-### Debuging the handshake
+* sudo tcpdump -i lo -s 0 -vvv -A host 127.0.0.1 and port 8080 | tee /tmp/td.log
+* or one could also look at the network tab in the browser developer console
 
-When working with llama.cpp server based GenAi/LLM running locally
+One could always remove message entries or manipulate chat sessions by accessing document['gMe']
+in the devel console of the browser
 
-sudo tcpdump -i lo -s 0 -vvv -A host 127.0.0.1 and port 8080 | tee /tmp/td.log
+* if you want the last tool call response you submitted to be re-available for tool call execution and
+  fresh resubmitting of the response, for any reason, follow the steps below
+  * remove the assistant response from the end of the chat session, if any, using
+    * document['gMe'].multiChat.simpleChats['SessionId'].xchat.pop()
+  * reset the role of the tool response chat message to TOOL.TEMP from tool
+    * toolMessageIndex = document['gMe'].multiChat.simpleChats['SessionId'].xchat.length - 1
+    * document['gMe'].multiChat.simpleChats['SessionId'].xchat[toolMessageIndex].role = "TOOL.TEMP"
+  * clicking on the SessionId at the top in the ui should refresh the chat ui, and it should now give
+    the option to control that tool call again
+* this can also help in the case where the chat session fails with context window exceeded
+  * you restart the GenAi/LLM server after increasing the context window as needed
+  * edit the chat session history as mentioned above, to the extent needed
+  * resubmit the last needed user/tool response as needed
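The devel console surgery described in those steps can be modeled with a toy xchat array; the message shapes here are illustrative, not simplechat.js's exact internals:

```javascript
// Toy model of the chat history surgery: pop the trailing assistant
// entry, then re-mark the tool response as pending (TOOL.TEMP) so the
// ui re-arms the tool call controls for that message.
const xchat = [
    { role: "user", content: "what is 2**10" },
    { role: "tool", content: "<tool_response> 1024 </tool_response>" },
    { role: "assistant", content: "partial answer before the context blew up" },
];

// step 1: drop the trailing assistant response, if any
if (xchat[xchat.length - 1].role === "assistant") {
    xchat.pop();
}

// step 2: reset the tool response's role from tool to TOOL.TEMP
const toolMessageIndex = xchat.length - 1;
xchat[toolMessageIndex].role = "TOOL.TEMP";

console.log(xchat.map((m) => m.role)); // [ 'user', 'TOOL.TEMP' ]
```

Clicking the SessionId in the ui then rebuilds the chat display from this edited xchat, which is why the tool call controls reappear.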
 
## At the end

tools/server/public_simplechat/simplechat.js

Lines changed: 2 additions & 2 deletions
@@ -261,7 +261,7 @@ class ChatMessageEx {
         let content = ""
         let toolcall = ""
         if (this.ns.reasoning_content.trim() !== "") {
-            reasoning = `!!!Reasoning: ${this.ns.reasoning_content.trim()} !!!\n`;
+            reasoning = `!!!Reasoning: ${this.ns.reasoning_content.trim()} !!!\n\n`;
         }
         if (this.ns.content !== "") {
             content = this.ns.content;
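The effect of the change above (a blank line after the reasoning block) can be seen with a stripped-down rendering function; this is a toy model of the message-to-text assembly, not ChatMessageEx itself:

```javascript
// Toy version of the reasoning + content assembly: with "\n\n" the
// reasoning block is separated from the main content by a blank line.
function renderMessage(reasoningContent, content) {
    let reasoning = "";
    if (reasoningContent.trim() !== "") {
        reasoning = `!!!Reasoning: ${reasoningContent.trim()} !!!\n\n`;
    }
    return reasoning + content;
}

console.log(renderMessage(" thinking... ", "final answer"));
console.log(renderMessage("", "just content")); // no reasoning prefix at all
```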
@@ -898,7 +898,7 @@ class MultiChatUI {
             }
             continue
         }
-        let entry = ui.el_create_append_p(`${x.ns.role}: ${x.content_equiv()}`, this.elDivChat);
+        let entry = ui.el_create_append_p(`[[ ${x.ns.role} ]]: ${x.content_equiv()}`, this.elDivChat);
         entry.className = `role-${x.ns.role}`;
         last = entry;
         if (x.ns.role === Roles.Assistant) {
