This issue tracks two code quality improvements for gemma/model.py to enhance maintainability and API clarity.
1. Redundant Code in Linear and Embedding Classes
- Problem: The Linear and Embedding classes contain nearly identical boilerplate code for handling quantized weights. This violates the DRY (Don't Repeat Yourself) principle.
- Proposed Solution: Refactor this shared logic into a new QuantizedWeight base class. This will make the code cleaner, more modular, and easier to maintain.
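
A minimal sketch of what the refactor could look like. It assumes the duplicated logic consists of allocating an int8 weight with a per-row float scaler when quantization is on, and a plain float weight otherwise. The constructor arguments, the effective_weight helper, and the exact dtypes are illustrative assumptions, not the current contents of gemma/model.py.

```python
import torch
import torch.nn.functional as F
from torch import nn


class QuantizedWeight(nn.Module):
    """Hypothetical base class that owns weight (and optional scaler) setup."""

    def __init__(self, rows: int, cols: int, quant: bool):
        super().__init__()
        self.quant = quant
        if quant:
            # Assumed layout: int8 weights plus one float scale per row.
            self.weight = nn.Parameter(
                torch.zeros((rows, cols), dtype=torch.int8), requires_grad=False
            )
            self.weight_scaler = nn.Parameter(torch.ones(rows), requires_grad=False)
        else:
            self.weight = nn.Parameter(torch.zeros((rows, cols)), requires_grad=False)

    def effective_weight(self) -> torch.Tensor:
        """Return the weight in floating point, rescaling it if quantized."""
        if self.quant:
            return self.weight * self.weight_scaler.unsqueeze(-1)
        return self.weight


class Linear(QuantizedWeight):
    def __init__(self, in_features: int, out_features: int, quant: bool):
        super().__init__(out_features, in_features, quant)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.effective_weight())


class Embedding(QuantizedWeight):
    def __init__(self, num_embeddings: int, embedding_dim: int, quant: bool):
        super().__init__(num_embeddings, embedding_dim, quant)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        return F.embedding(ids, self.effective_weight())
```

With this shape, each subclass keeps only its own forward logic. Keeping the attribute names weight and weight_scaler in the base class should leave state-dict keys unchanged, which would need to be verified against existing checkpoints.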
2. Unused kv_write_indices Parameter
- Problem: The GemmaForCausalLM.forward method accepts a kv_write_indices parameter that is immediately overwritten by input_positions. The passed argument is never used.
- Proposed Solution: Remove this redundant parameter from the method signature to clean up the API and avoid confusion for developers. Callers that currently pass kv_write_indices will need to drop the argument.
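
The pattern can be illustrated with a simplified stand-in. The two functions below are hypothetical and take far fewer arguments than the real GemmaForCausalLM.forward. They only show why the parameter is dead and why removing it should not change results.

```python
from typing import List, Optional


def forward_before(input_positions: List[int],
                   kv_write_indices: Optional[List[int]]) -> List[int]:
    """Simplified stand-in for the current signature."""
    # The caller's value is discarded here, so the parameter has no effect.
    kv_write_indices = input_positions
    return list(kv_write_indices)


def forward_after(input_positions: List[int]) -> List[int]:
    """Simplified stand-in for the proposed signature."""
    # The write indices are still derived from input_positions, as before.
    kv_write_indices = input_positions
    return list(kv_write_indices)
```

Whatever is passed as kv_write_indices, forward_before returns the same indices as forward_after, so dropping the parameter only affects call sites.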
Both changes are refactors: they should improve code quality and maintainability without altering the model's functionality.