
Conversation

@adamgordonbell
Contributor

Caching was no longer happening in the newest version. This reintroduces it for OpenAI chat.

Not sure if this is the ideal place to insert caching, or if model.call could somehow handle caching for all models.
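For readers unfamiliar with the pattern, here is a minimal sketch of what prompt-keyed caching around an OpenAI chat call looks like. The names (cached_chat_complete, _CACHE) are hypothetical and it assumes the openai>=1.0 Python client; this is not the code in this PR.

```python
# Illustrative only: an in-memory cache keyed on the serialized messages.
import hashlib
import json

_CACHE: dict[str, str] = {}  # in-memory here; a real cache could persist to disk

def cached_chat_complete(client, messages: list[dict], model: str) -> str:
    # Identical prompts map to the same cache entry.
    # (Note: keying only on the prompt means different models share entries.)
    key = hashlib.sha256(json.dumps(messages, sort_keys=True).encode()).hexdigest()
    if key not in _CACHE:
        response = client.chat.completions.create(model=model, messages=messages)
        _CACHE[key] = response.choices[0].message.content
    return _CACHE[key]
```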

@slundberg
Collaborator

Thanks @adamgordonbell ! I am just about to merge some extensive client/server updates, then I'll come back and try to loop this in.

@slundberg
Collaborator

One quick question on the PR. Adding poetry support sounds nice, but do you know if there is a way to do that without creating two copies of our dependency lists? (we can also just leave that out for now if it slows things down too much)

@adamgordonbell
Contributor Author

> One quick question on the PR. Adding poetry support sounds nice, but do you know if there is a way to do that without creating two copies of our dependency lists? (we can also just leave that out for now if it slows things down too much)

It's outside my area of expertise, but I think so. The cpp files probably complicate it, though. I can remove that from this PR; it just helped with my development.

> Thanks @adamgordonbell ! I am just about to merge some extensive client/server updates, then I'll come back and try to loop this in.

Awesome!

@slundberg
Collaborator

Merged this with the latest updates. Thanks again!

@slundberg merged commit 496bc53 into guidance-ai:main on Feb 1, 2024
@prescod

prescod commented Feb 18, 2024

Thank you for adding this @adamgordonbell and @slundberg. Caching was about 50% of the reason I used Guidance. It's easy to implement myself, but adding it to each project becomes tedious.

@maximegmd

Why was this merged?

  • It cannot be disabled
  • It doesn't distinguish models, so calling gpt4 with a prompt after calling gpt3.5 will return the gpt3.5 data
  • There is no doc on resetting the cache

@Harsha-Nori
Member

Harsha-Nori commented May 18, 2024

Hi @maximegmd, I was running into similar challenges with caching, and dropped it as part of a simplification we're making to the codebase (#820).

I think this PR was a strong initial concept (thank you @adamgordonbell!!), but as we've started changing the structure of guidance in the last few months, I'd like to pursue a more universal solution (beyond just OpenAI) that also addresses your concerns. If you want to use a no-cache version of the codebase today, just install from source.

@adamgordonbell
Contributor Author

Yeah, this caching implementation had some weaknesses. Excited to see what the next iteration is.

> Why was this merged?
>
> • It cannot be disabled

It was disabled for any call with a temperature above 0. I thought that made sense: at temperature 0 you are asking for a deterministic result, which is exactly what caching gives you.
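As a rough illustration of that gate (hypothetical names, not the merged code), the cache is only consulted when sampling is deterministic:

```python
# Sketch of a temperature gate around a cached call; names are illustrative.
def maybe_cached(call, key: str, temperature: float, cache: dict) -> str:
    if temperature > 0:
        return call()          # non-deterministic sampling: always call the API
    if key not in cache:
        cache[key] = call()    # deterministic request: fill the cache once
    return cache[key]
```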

@maximegmd

> Yeah, this caching implementation had some weaknesses. Excited to see what the next iteration is.
>
> > Why was this merged?
> >
> > • It cannot be disabled
>
> It was disabled for any call with a temperature above 0. I thought that made sense: at temperature 0 you are asking for a deterministic result, which is exactly what caching gives you.

Except it didn't distinguish between models, so at temperature 0 you get the cache of whichever model was used first with a given prompt. We were doing model comparisons, so we scratched our heads a little trying to understand why all the OpenAI models had the same score, haha.
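For reference, including the model (and any other parameter that changes the output) in the cache key avoids this collision. A minimal sketch with hypothetical names, not part of guidance:

```python
# Hypothetical model-aware cache key; not guidance's implementation.
import hashlib
import json

def cache_key(model: str, messages: list[dict], temperature: float) -> str:
    # Anything that can change the response belongs in the key, so one
    # model's cached answers are never returned for another model.
    payload = {"model": model, "messages": messages, "temperature": temperature}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

def clear_cache(cache: dict) -> None:
    cache.clear()  # explicit reset so stale entries can be dropped between runs
```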
