Initial support for Gemma 3 models #717
Conversation
@cjpais What is needed for image support? Also, what is the easiest way to debug or test it? Or should we just go with text only for now?
I think I should build on the granite support feature branch, because that also introduces the attention scale param.
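For context, an attention-scale parameter like the one mentioned above replaces the usual 1/sqrt(head_dim) scaling applied to queries before the softmax. Below is a minimal sketch of what such a parameter changes; the function and parameter names are hypothetical illustrations, not the actual implementation in this PR or in llama.cpp:

```python
import math
import numpy as np

def scaled_attention(q, k, v, attn_scale=None):
    """Single-head attention with a configurable query scale.

    If attn_scale is None, fall back to the conventional
    1/sqrt(head_dim) scaling; otherwise use the model-provided value
    (hypothetical stand-in for a per-model attention-scale hyperparameter).
    """
    head_dim = q.shape[-1]
    scale = attn_scale if attn_scale is not None else 1.0 / math.sqrt(head_dim)
    scores = (q * scale) @ k.T                      # scaled dot-product scores
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

The point of surfacing this as a hyperparameter is that models which deviate from the 1/sqrt(head_dim) default (as the granite branch does) can supply their own value at load time instead of the code hard-coding the conventional scale.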
Sweet, thank you so much. I will take a look at it tomorrow/Tuesday. I suspect the granite branch will be merged in a day or two, so we will probably wait for that before this comes in. If it's easier to rebase on it now, great, but if it's easier later we can do that.
Largely this looks good to me. I will probably do another once-over tomorrow (mostly verifying I've tested it on the 4 sizes and it works). @corebonts if you don't mind rebasing it that would be great, otherwise I can resolve it tomorrow afternoon.
On it.
Tested only on text-to-text.
Thanks @corebonts, I added the
@cjpais Could you tell me why the

And it's also mentioned in the technical report:
@corebonts it's there! It's set from the function call below, from what I can tell. I tested it, and it matches the code currently on the main branch of llama.cpp.
A+++
(Partially?) resolves #711
I still want to test it further, but I thought an initial review would be great.
The commit is based on the llama.cpp implementation:
TODO: