Commit 98918b9

Author: George
Merge branch 'main' into update-readme-quant
2 parents fc761ff + 9d82f35 commit 98918b9

147 files changed, with 2612 additions and 1864 deletions

.github/workflows/test-check-transformers.yaml

Lines changed: 33 additions & 1 deletion
@@ -15,9 +15,41 @@ env:
   CLEARML_API_SECRET_KEY: ${{ secrets.CLEARML_API_SECRET_KEY }}

 jobs:
+  detect-changes:
+    runs-on: ubuntu-latest
+
+    outputs:
+      changes-present: ${{ steps.changed-files.outputs.any_modified }}
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - name: Get changed files
+        id: changed-files
+        uses: tj-actions/changed-files@v45
+        with:
+          files: |
+            **
+            !examples/**
+            !tests/e2e/**
+            !tests/lmeval/**
+            !tests/examples/**
+            !**/*.md
+            !.github/**
+            .github/workflows/test-check-transformers.yaml
+
+      - name: Log relevant output
+        run: |
+          echo "changes-present: ${{ steps.changed-files.outputs.any_modified }}"
+          echo "all modified files: ${{ steps.changed-files.outputs.all_modified_files }}"
+        shell: bash
+
   transformers-tests:
+    needs: [detect-changes]
     runs-on: gcp-k8s-vllm-l4-solo
-    if: contains(github.event.pull_request.labels.*.name, 'ready') || github.event_name == 'push'
+    if: (contains(github.event.pull_request.labels.*.name, 'ready') || github.event_name == 'push') && needs.detect-changes.outputs.changes-present == 'true'
     steps:
       - uses: actions/setup-python@v5
         with:
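
The gating logic added in this hunk can be sketched in Python. This is only an illustration, not the implementation used by `tj-actions/changed-files` or by GitHub Actions. It assumes that patterns are evaluated in order and that the last matching pattern decides whether a file counts as relevant, which is consistent with the workflow file being re-included after `!.github/**`. The helper names (`glob_to_regex`, `is_relevant`, `changes_present`, `should_run_tests`) are hypothetical, and the glob translation covers only the `**/`, `**` and `*` forms that appear in the hunk.

```python
import re

# Patterns copied from the `files:` input in the hunk above.
PATTERNS = [
    "**",
    "!examples/**",
    "!tests/e2e/**",
    "!tests/lmeval/**",
    "!tests/examples/**",
    "!**/*.md",
    "!.github/**",
    ".github/workflows/test-check-transformers.yaml",
]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified glob into a regex (illustrative subset only)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # anything, including path separators
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_relevant(path: str, patterns: list[str]) -> bool:
    """Assumption: the last pattern that matches the path wins."""
    relevant = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        if glob_to_regex(pattern.lstrip("!")).fullmatch(path):
            relevant = not negated
    return relevant


def changes_present(files: list[str], patterns: list[str] = PATTERNS) -> bool:
    """Rough stand-in for the `any_modified` output: any relevant file changed."""
    return any(is_relevant(f, patterns) for f in files)


def should_run_tests(labels: list[str], event_name: str, changes_output: str) -> bool:
    """Mirror of the new `if:` condition on the transformers-tests job.

    Simplification: labels are compared exactly, and the job output is
    compared as the string 'true', as in the workflow expression.
    """
    return ("ready" in labels or event_name == "push") and changes_output == "true"
```

Under these assumptions, a change that touches only `README.md` or files under `examples/` leaves `changes_present` false, so `transformers-tests` is skipped even on a push, while an edit to the workflow file itself still triggers the job.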

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -800,5 +800,6 @@ integrations/pytorch/pytorch_vision*
 nm_temp_test_logs/*
 sparse_logs/*
 wandb/
+timings/
 output_finetune/
 env_log.json

README.md

Lines changed: 2 additions & 3 deletions
@@ -82,10 +82,9 @@ Note that the model can be swapped for a local or remote HF-compatible checkpoin
 Quantization is applied by selecting an algorithm and calling the `oneshot` API.

 ```python
-from llmcompressor.modifiers.quantization import GPTQModifier
 from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
-from llmcompressor.transformers import oneshot
-from transformers import AutoModelForCausalLM
+from llmcompressor.modifiers.quantization import GPTQModifier
+from llmcompressor import oneshot

 # Select quantization algorithm. In this case, we:
 # * apply SmoothQuant to make the activations easier to quantize

examples/big_models_with_accelerate/cpu_offloading_fp8.py

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 from transformers import AutoModelForCausalLM, AutoTokenizer

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import QuantizationModifier
-from llmcompressor.transformers import oneshot

 MODEL_ID = "meta-llama/Meta-Llama-3-70B-Instruct"
 OUTPUT_DIR = MODEL_ID.split("/")[1] + "-FP8-Dynamic"

examples/big_models_with_accelerate/mult_gpus_int8_device_map.py

Lines changed: 1 addition & 1 deletion
@@ -2,9 +2,9 @@
 from datasets import load_dataset
 from transformers import AutoModelForCausalLM, AutoTokenizer

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import GPTQModifier
 from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
-from llmcompressor.transformers import oneshot
 from llmcompressor.transformers.compression.helpers import calculate_offload_device_map

 MODEL_ID = "meta-llama/Meta-Llama-3-70B-Instruct"

examples/big_models_with_accelerate/multi_gpu_int8.py

Lines changed: 1 addition & 1 deletion
@@ -1,8 +1,8 @@
 from datasets import load_dataset
 from transformers import AutoModelForCausalLM, AutoTokenizer

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import GPTQModifier
-from llmcompressor.transformers import oneshot

 MODEL_ID = "meta-llama/Meta-Llama-3-70B-Instruct"
 SAVE_DIR = MODEL_ID.split("/")[1] + "-W8A8-Dynamic"

examples/multimodal_audio/whisper_example.py

Lines changed: 1 addition & 1 deletion
@@ -2,8 +2,8 @@
 from datasets import load_dataset
 from transformers import WhisperProcessor

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import GPTQModifier
-from llmcompressor.transformers import oneshot
 from llmcompressor.transformers.tracing import TraceableWhisperForConditionalGeneration

 # Select model and load it.

examples/multimodal_vision/idefics3_example.py

Lines changed: 1 addition & 1 deletion
@@ -4,8 +4,8 @@
 from PIL import Image
 from transformers import AutoProcessor

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import GPTQModifier
-from llmcompressor.transformers import oneshot
 from llmcompressor.transformers.tracing import TraceableIdefics3ForConditionalGeneration

 # Load model.

examples/multimodal_vision/llava_example.py

Lines changed: 1 addition & 1 deletion
@@ -3,8 +3,8 @@
 from PIL import Image
 from transformers import AutoProcessor

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import GPTQModifier
-from llmcompressor.transformers import oneshot
 from llmcompressor.transformers.tracing import TraceableLlavaForConditionalGeneration

 # Load model.

examples/multimodal_vision/mllama_example.py

Lines changed: 1 addition & 1 deletion
@@ -3,8 +3,8 @@
 from PIL import Image
 from transformers import AutoProcessor

+from llmcompressor import oneshot
 from llmcompressor.modifiers.quantization import GPTQModifier
-from llmcompressor.transformers import oneshot
 from llmcompressor.transformers.tracing import TraceableMllamaForConditionalGeneration

 # Load model.
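
The example files in this commit all make the same change: `oneshot` is imported from the top-level `llmcompressor` package instead of `llmcompressor.transformers`. For downstream scripts that still use the old path, a minimal codemod could look like the sketch below. It is not part of the commit; `migrate_oneshot_import` is a hypothetical helper, and it only rewrites the exact single-name import line seen in these hunks (it does not re-sort imports the way the commit does, and it does not handle multi-name or aliased imports).

```python
import re

# Old import line exactly as it appears in the removed lines of the hunks above.
OLD_IMPORT = re.compile(
    r"^from llmcompressor\.transformers import oneshot$", re.MULTILINE
)
# New import line exactly as it appears in the added lines.
NEW_IMPORT = "from llmcompressor import oneshot"


def migrate_oneshot_import(source: str) -> str:
    """Rewrite the old `oneshot` import to the new top-level import.

    Other imports from `llmcompressor.transformers.*` submodules are left
    untouched, since the commit does not change them.
    """
    return OLD_IMPORT.sub(NEW_IMPORT, source)
```

A script could apply this to each file's text and write the result back, then run an import sorter if the project enforces import ordering.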
