What happened?
The text of the new README commit mentions recent quantization improvements, but it's not clear what that means.
For example:
- Are they now correct? (previously in error?)
- Are they more accurate? (previously out of spec?)
- Is the implementation more efficient?
  - ...during inference?
  - ...during quantization?
  - ...or more memory efficient?
And similarly,
- Are old quants still compatible (or even valid)?
- Should they be recomputed?
Name and Version
98d1626