Skip to content

Conversation

@VJHack
Copy link
Collaborator

@VJHack VJHack commented Jan 14, 2025

As of 10940, we are now able to serialize only relavent data.

A suggestion was made here to filter out unnecessary fields in the response. This feature ensures that we are transporting the most minimal response from the server to the client.

There are two places where this change was made:

  1. When we make the curl request in ring_update() to asyncronously process extra_context, we filter out all response fields since we don't take any action on the response.
  2. These are the response fields we want to include when making the main fim call
'response_fields':  [ 
                     "content",
                     "timings/prompt_n", 
                     "timings/prompt_ms", 
                     "timings/prompt_per_token_ms",
                     "timings/prompt_per_second",
                     "timings/predicted_n", 
                     "timings/predicted_ms", 
                     "timings/predicted_per_token_ms",
                     "timings/predicted_per_second",
                     "truncated",
                     "tokens_cached",
                     "generation_settings/n_ctx",
                     ], 

@VJHack VJHack requested a review from ggerganov January 14, 2025 17:30
formatting comma

Co-authored-by: Georgi Gerganov <[email protected]>

let l:n_cached = get(l:response, 'tokens_cached', 0)
let l:truncated = get(l:response, 'truncated', v:false)
let l:n_ctx = get(l:response, 'generation_settings/n_ctx', 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After recent refactoring, the server no longer provides the n_ctx field in the responses. We should remove it from the llama.vim client and instead of displaying:

c: 1234 / 0 | ...

We should simply display:

c: 1234 | ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did notice that. I removed all instances of n_ctx field and adjusted the info message accordingly as shown below.
Screen Shot 2025-01-14 at 3 22 19 PM

Thank you!

@ggerganov ggerganov merged commit 9f5cadd into ggml-org:master Jan 15, 2025
ggerganov added a commit that referenced this pull request Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants