Commit 1eb6d09

update docstrings

Signed-off-by: Kyle Sayers <[email protected]>
1 parent 06d5967 commit 1eb6d09

File tree

2 files changed: +15 −1 lines changed

examples/transform/quip_example.py

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@
 
 # Configure the quantization algorithm to run.
 # * apply spinquant transforms to model in order to make quantization easier
-# * quantize the weights to 4 bit with GPTQ with a group size 128
+# * quantize the weights to 4 bit with a group size 128
 recipe = [
     QuIPModifier(transform_type="random-hadamard"),
     QuantizationModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
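The recipe above relies on a core property of spinquant-style transforms: an orthogonal rotation fused into a weight matrix, paired with its transpose applied online to the activations, leaves the layer's output unchanged while reshaping the weight distribution for easier quantization. A minimal NumPy sketch of that identity for a random Hadamard rotation (illustrative only; this is not llmcompressor code):

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(H.shape[0])  # scale so columns are orthonormal

rng = np.random.default_rng(0)
n = 8

# "random Hadamard": Hadamard matrix times a random diagonal sign matrix,
# which is still orthogonal
s = rng.choice([-1.0, 1.0], size=n)
R = hadamard(n) * s  # multiplies column j by s[j]

W = rng.normal(size=(n, n))  # stand-in for a Linear layer's weight
x = rng.normal(size=n)       # stand-in for an input activation

y = W @ x                 # original layer output
W_fused = W @ R           # rotation fused into the weight offline
x_rot = R.T @ x           # inverse rotation applied online at runtime
y_rot = W_fused @ x_rot   # W @ R @ R.T @ x == W @ x

assert np.allclose(y, y_rot)  # layer function is preserved
```

The fused weight `W_fused` is what actually gets quantized; because the rotation spreads outlier values across channels, it tends to quantize with less error than `W` itself.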

src/llmcompressor/modifiers/transform/quip/base.py

Lines changed: 14 additions & 0 deletions

@@ -30,6 +30,20 @@ class QuIPModifier(Modifier):
     QuIP and QuIP# apply transforms to every linear layer, two of which are fused into
     the model weights and two of which remain as online rotations computed at runtime.
 
+    Lifecycle:
+        - on_initialize
+            - infer SpinQuantMappings & NormMappings
+            - as needed, create transform schemes for R1, R2, R3, & R4
+        - on_start
+            - normalize embeddings
+            - fuse norm layers into subsequent Linear layers
+            - apply TransformConfig
+                - fuse transforms into weights for mergeable transforms
+                - add hooks for online transforms
+        - on sequential epoch end
+        - on_end
+        - on_finalize
+
     :param transform_type: The type of transform to apply to the model.
         `"hadamard"` has the least performance cost but only supports sizes which are
         powers of two.
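The lifecycle added to the docstring describes a fixed ordering of hooks that the modifier moves through. A toy sketch of that ordering (a hypothetical stand-in class, not the real Modifier API) just to make the sequence concrete:

```python
class LifecycleSketch:
    """Hypothetical illustration of the hook order in the QuIPModifier docstring."""

    def __init__(self):
        self.events = []

    def on_initialize(self):
        # docstring: infer mappings, create transform schemes for R1-R4
        self.events.append("on_initialize")

    def on_start(self):
        # docstring: normalize embeddings, fuse norm layers,
        # apply TransformConfig (fuse mergeable transforms, hook online ones)
        self.events.append("on_start")

    def on_end(self):
        self.events.append("on_end")

    def on_finalize(self):
        self.events.append("on_finalize")

m = LifecycleSketch()
for hook in (m.on_initialize, m.on_start, m.on_end, m.on_finalize):
    hook()
assert m.events == ["on_initialize", "on_start", "on_end", "on_finalize"]
```

The point of documenting this order is that transform fusion happens in `on_start`, before any calibration forward passes, so every pass (and the eventual quantization) sees the already-rotated weights.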

0 commit comments