-
Hi ikawrakow, I am not an official developer of KT. @godrosev is my colleague, and I am very sorry about this matter. After he gave me the code, I started the porting work without asking where it came from, but I noticed that the author named in the file is the same person who wrote that module in llamafile, which is you. Afterwards, I completed all the porting work without modifying any author information, because from the beginning KT has said that it uses llamafile for its core optimizations, and I only filled in the missing functionality. I have always felt that the CPU optimization is the best-done part of llamafile. If I had really wanted to hide that you wrote it, I could have changed all the variable and function names. Instead, I ported it in full, modifying only the necessary interface parts, because I still believe the iqk part of llamafile is your contribution!
-
Are you planning to correct it? The ~1800 lines added in your PR are not a "port", but a direct copy of portions of the code here. It would be very nice if the actual origin was acknowledged by you and by the KT developers.
-
Yes, I have always believed that both the earlier llamafile content and the newly "ported" parts originated from your work. What I did was mostly porting and testing, so I never intended to modify your work (apart from the necessary interface adjustments). I consider this your contribution!
-
The KTransformers devs have now merged this PR, which addresses the concern raised in this discussion => closing.
-
This PR is a direct copy from this file in ik_llama.cpp. It never acknowledges the source of the changes, and the KTransformers maintainers did not respond to the comment I left in the PR.

The PR is being sold as an IQ1_S implementation, but it copies not just the IQ1_S GEMM, but also ~1800 LOCs of additional stuff, including the IQ2_XXS implementation, the new implementation of any float type x any other float type GEMM, and a bunch of other optimizations I have done since my contributions to llamafile (394, 405, 428, 435, 453, and 464).

For those who don't know, KTransformers uses the quantized GEMM/GEMV implementation that I contributed to llamafile. llamafile uses the Apache-2.0 license, so I contributed the code under that license. KTransformers have kept the copyright notice in the file, but did not update it after merging PR 754, which contains a copy of MIT-licensed code.

KTransformers PR 754 is interesting anyway. Github user @godrosev entered issue #209 on February 19 asking for IQ1_S support in llamafile. There was already an implementation of the row-interleaved variant IQ1_S_R4 in ik_llama.cpp, so I wasn't planning to also add IQ1_S support, and suggested they use that instead. But after some back-and-forth, I decided to add IQ1_S, which I did in PR #212 on Feb 20. KTransformers PR 754 is from March 3 and comes from Github user @moonshadow-25. There are 5 commits in the PR, and the first 2 come from @godrosev. @godrosev and @moonshadow-25 both have no Github activity other than this PR (and Issue #209).

So now the question is: what do I do about that? Opinions?