
Conversation

@SamSaffron (Member) commented Nov 8, 2024

This re-implements tool support in DiscourseAi::Completions::Llm#generate.

Previously, tool calls were always returned via XML, and it was the caller's responsibility to parse that XML.

The new implementation has the endpoints return ToolCall objects instead.
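
As a rough sketch of the shape such an object might take (illustrative only; the actual ToolCall class in the PR may carry different fields):

```ruby
# Illustrative value object; field names are assumptions.
class ToolCall
  attr_reader :id, :name, :parameters

  def initialize(id:, name:, parameters:)
    @id = id                 # provider-assigned call id
    @name = name             # tool/function the model wants invoked
    @parameters = parameters # already-parsed Hash of arguments
  end
end
```

Callers can now branch on the object's class instead of scanning completion text for XML.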

Additionally, this simplifies the Llm endpoint interface and gives it more clarity. Endpoints must implement:

decode, decode_chunk (for streaming)

It is the implementer's responsibility to figure out how to decode chunks; the base class no longer implements this. To make that easy we ship a flexible JSON decoder which is easy to wire up, as sketched below.
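
A minimal sketch of an endpoint honoring this contract, assuming an OpenAI-style SSE stream. JsonStreamDecoder, ExampleEndpoint, and the response shapes are all assumptions for illustration, not the PR's actual code; it reuses the illustrative ToolCall above:

```ruby
require "json"

# Hypothetical stand-in for the "flexible JSON decoder" the PR mentions:
# buffers partial "data: {...}" SSE lines and yields complete payloads.
class JsonStreamDecoder
  def initialize
    @buffer = +""
  end

  # Feed a raw network chunk; returns an Array of parsed JSON hashes.
  def <<(raw)
    @buffer << raw
    lines = @buffer.split("\n", -1)
    @buffer = lines.pop || +"" # keep any incomplete trailing line buffered
    lines.filter_map do |line|
      next unless line.start_with?("data:")
      payload = line.delete_prefix("data:").strip
      next if payload.empty? || payload == "[DONE]"
      JSON.parse(payload, symbolize_names: true)
    end
  end
end

# Sketch of an endpoint implementing decode / decode_chunk.
class ExampleEndpoint
  def initialize
    @decoder = JsonStreamDecoder.new
  end

  # Non-streaming: whole response body in, completion (or ToolCall) out.
  def decode(body)
    extract(JSON.parse(body, symbolize_names: true))
  end

  # Streaming: each raw chunk in, zero or more decoded items out.
  def decode_chunk(chunk)
    (@decoder << chunk).filter_map { |json| extract(json) }
  end

  private

  # Response shape is assumed (OpenAI-like); real providers differ.
  def extract(json)
    message = json.dig(:choices, 0, :delta) || json.dig(:choices, 0, :message) || {}
    if (tool = message[:tool_call])
      ToolCall.new(id: tool[:id], name: tool[:name], parameters: tool[:parameters])
    else
      message[:content]
    end
  end
end
```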

Also (new):

  • Better debugging for PMs: we now have next / previous buttons to see all the LLM messages associated with a PM
  • Token accounting is fixed for vLLM (we were not correctly counting tokens)

@SamSaffron marked this pull request as ready for review November 11, 2024 06:20
@SamSaffron (Member, Author) commented:

Notable compromise:

#generate will return either an Array (for tool call + completion) or, when the result would be a single-element array, just that element.

This does place some responsibility on the caller, who may receive differently shaped data.

We could always return an array, but that would make consumption more complex in cases where you are not using tools.
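
For example, a caller might normalize the two shapes like this (a sketch of the consuming pattern, not code from this PR; run_tool and render_text are hypothetical helpers):

```ruby
# Caller-side sketch; run_tool and render_text are hypothetical helpers,
# and the #generate call is abbreviated.
result = llm.generate(prompt, user: user)

Array(result).each do |item|
  case item
  when ToolCall then run_tool(item) # model asked us to invoke a tool
  else render_text(item)            # plain completion text
  end
end
```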

@SamSaffron merged commit e817b7d into main Nov 11, 2024
6 checks passed
@SamSaffron deleted the tool-us-no-xml branch November 11, 2024 21:14