179 commits
96c3fc0
added base model files to adapt and update.
burcgokden Aug 12, 2025
922bcf2
Replace `logger.warning` with `logger.warning_once` in `GradientCheck…
qgallouedec Aug 12, 2025
c997513
Fix regression in mllama vision encoder (#40083)
Isotr0py Aug 12, 2025
17168a5
Switch the order of args in StaticCache (for BC and future logic) (#4…
Cyrilvallez Aug 12, 2025
2ed9a8e
Fix Qwen3 MoE GGUF architecture mismatch (#39976)
ctcanbol Aug 12, 2025
05e5a95
Fix error on importing unavailable torch.distributed (#40038)
m-gallus Aug 12, 2025
d8b7254
Default to dequantize if cpu in device_map for mxfp4 (#39993)
MekkCyber Aug 12, 2025
c9254db
[`Flash Attention`] Fix flash attention integration (#40002)
vasqu Aug 12, 2025
4faefc6
[trainer] ensure special tokens in model configs are aligned with tok…
gante Aug 12, 2025
c4f39cd
Fix Causality Handling in Flash Attention to Support Bidirectional At…
lucaswychan Aug 12, 2025
ed3288a
[docs] Add reference to HF-maintained `custom_generate` collections (…
gante Aug 12, 2025
dc92a0a
Add model card for MobileViT (#40033)
Shivamjan Aug 12, 2025
2a5f7d8
remove sequence parallel in llama4 (#40084)
3outeille Aug 12, 2025
9504be9
🌐 [i18n-KO] Translated `tiny_agents.md` to Korean (#39913)
AhnJoonSung Aug 13, 2025
1a97c7f
[bugfix] Fix tensor device in Idefics2, Idefics3, and SmolVLM (#39975)
qgallouedec Aug 13, 2025
0a87fce
changed xLSTMRMSNorm to RMSNorm (#40113)
nikitazuevblago Aug 13, 2025
191f561
Fix QuantoQuantizedCache import issues (#40109)
manueldeprada Aug 13, 2025
93d2ceb
[serve] allow array `content` inputs for LLMs (#39829)
gante Aug 13, 2025
a734507
`decoding_method` argument in generate (#40085)
manueldeprada Aug 13, 2025
f28d396
Collated reports (#40080)
ivarflakstad Aug 13, 2025
8cb9dee
DOCS: Add missing space in SECURITY.md (#40087)
shivaheidari Aug 13, 2025
d41bada
[trainer] handle case where EOS token is None in `generation_config` …
gante Aug 13, 2025
bdb2946
Fix hidden torchvision>=0.15 dependency issue (#39928)
yonigozlan Aug 13, 2025
78b3efc
🌐 [i18n-KO] Translated `main_classes/processors.md` to Korean (#39519)
TaskerJang Aug 13, 2025
cbb6231
🌐 [i18n-KO] Translated `jamba.md` to Korean (#39890)
skwh54 Aug 13, 2025
6d0da2b
🌐 [i18n-KO] Translated `main_classes/optimizer_schedules.md` to Korea…
luckyvickyricky Aug 13, 2025
72c36a6
🚨🚨 [generate] ignore `cache_implementation="hybrid"` hub defaults (#…
gante Aug 13, 2025
3835371
🌐 [i18n-KO] Translated `gpt2.md` to Korean (#39808)
taemincode Aug 13, 2025
f4b3450
🌐 [i18n-KO] Translated `optimizers.md` to Korean (#40011)
chelsseeey Aug 13, 2025
71a93b3
🌐 [i18n-KO] Translated grounding-dino.md to Korean (#39861)
TaskerJang Aug 13, 2025
8c50091
🚨 Use lru_cache for sine pos embeddings MaskFormer (#40007)
yonigozlan Aug 13, 2025
155e883
🌐 [i18n-KO] Translated `pipelines.md` to Korean (#39577)
xhaktm00 Aug 13, 2025
e593082
gpt oss is important (#40139)
ArthurZucker Aug 13, 2025
c9f71fc
Fix Janus (#40140)
Cyrilvallez Aug 13, 2025
519c373
Add Segment Anything 2 (SAM2) (#32317)
SangbumChoi Aug 13, 2025
ece1357
[docs] Fix ko toctree (#40138)
stevhliu Aug 13, 2025
9d281bb
Remove an old badly designed test (#40142)
Cyrilvallez Aug 13, 2025
4dc335b
updated visualBERT modelcard (#40057)
Anil-Red Aug 13, 2025
faf91a3
🌐 [i18n-KO] Translated `gemma3.md` to Korean (#39865)
seopp Aug 13, 2025
9fb9a53
Fix quantized cache with only cache_implementation in generate (#40144)
Cyrilvallez Aug 13, 2025
bfb3649
Add pytest marker: `torch_compile_test` and `torch_export_test` (#39950)
ydshieh Aug 13, 2025
3ba5f34
Update Dockerfiles to install packages inside a virtual environment (…
Sai-Suraj-27 Aug 13, 2025
42e8f5d
Create self-scheduled-amd-mi355-caller.yml (#40134)
glegendre01 Aug 13, 2025
afb6ebd
[Cohere2Vision] remove unused arg (#40103)
zucchini-nlp Aug 14, 2025
23dbde4
[efficientloftr] fix bugs and follow original cross attn implementati…
sbucaille Aug 14, 2025
fec6cc0
Fix CI: Use correct import in SAM for torchvision InterpolationMode (…
manueldeprada Aug 14, 2025
3222e6a
[Continous Batching] set head_dim when config.head_dim is None (#40159)
kashif Aug 14, 2025
54579db
Replace `self.tokenizer` by `self.processing_class` (#40119)
qgallouedec Aug 14, 2025
f3a08f5
[FA2] Fix it finally - revert fa kwargs preparation (#40161)
Cyrilvallez Aug 14, 2025
d35b845
[bugfix] fix flash-attention2 unavailable error for Ascend NPU (#40151)
FightingZhen Aug 14, 2025
72e7cfc
Fix docs typo (#40167)
qubvel Aug 14, 2025
1360e9d
build: Add fast image processor tvp (#39529)
adutchengineer Aug 14, 2025
854ce07
Add GptOssForSequenceClassification for GPT-OSS models (#40043)
zyfedward Aug 14, 2025
c7c7d21
Standardize BARTpho model card: badges, new examples, fixed broken im…
eshwanthkartitr Aug 14, 2025
3849c13
Add dates to the model docs (#39320)
MHRDYN7 Aug 14, 2025
752f6e5
Pin torch to 2.7.1 on CircleCI for now (#40174)
ydshieh Aug 14, 2025
498f77d
Update dynamic attnt setter for multimodals (#39908)
zucchini-nlp Aug 14, 2025
19c878c
[MINOR:TYPO] Update base.py (#40169)
cakiki Aug 15, 2025
1885b6f
make model doc device agnostic (#40143)
yao-matrix Aug 15, 2025
5c88c77
fix to avoid modifying a view in place (#40162)
3outeille Aug 15, 2025
1945f00
Fix fsdp for generic-task models (#40191)
Cyrilvallez Aug 15, 2025
6b4637c
Add repr to EncoderDecoderCache (#40195)
Cyrilvallez Aug 15, 2025
f6f5ec9
Fix typos (#40175)
cyyever Aug 15, 2025
edbfd71
Remove _prepare_flash_attention_from_position_ids (#40069)
cyyever Aug 15, 2025
2c77338
Avoid CUDA stream sync (#40060)
cyyever Aug 15, 2025
18b34d3
Fix various Pylint warnings (#40107)
cyyever Aug 15, 2025
dbbd254
Update: add type hints to check_tokenizers.py (#40094)
ajeet214 Aug 15, 2025
76e2d3e
Benchmarking improvements (#39768)
ahadnagy Aug 15, 2025
f1f1320
Add X-Codec model (#38248)
Manalelaidouni Aug 15, 2025
a144fdf
Fix GPT-OSS `swiglu_limit` not passed in for MXFP4 (#40197)
danielhanchen Aug 15, 2025
26d603a
docs: Update LayoutLM model card according to new standardized format…
Jin-HoMLee Aug 15, 2025
fc92053
Revert "Pin torch to 2.7.1 on CircleCI for now" + Final fix for `too …
ydshieh Aug 18, 2025
0501790
Use correct `model_input_names` for PixtralImageProcessor (#40226)
rohitrango Aug 18, 2025
2e18a74
fix error vocab_size at Qwen2_5_VLForConditionalGeneration loss_funct…
killight98 Aug 18, 2025
8c7834e
[SAM 2] Change checkpoints in docs and tests (#40213)
yonigozlan Aug 18, 2025
0364e16
Fix more typos (#40212)
cyyever Aug 18, 2025
5dc0472
Fix ESM token_dropout crash when using inputs_embeds instead of input…
notkisk Aug 18, 2025
d9b0d7a
AMD scheduled CI ref env file (#40243)
ivarflakstad Aug 18, 2025
69617d3
Add Ovis2 model and processor implementation (#37088)
thisisiron Aug 18, 2025
4e4a30a
Fix more pylint warnings (#40204)
cyyever Aug 18, 2025
e36c8d0
🚨 Always return Cache objects in modelings (to align with generate) (…
manueldeprada Aug 18, 2025
1cc32a6
remove transpose_for_scores call in ESM-2 (#40210)
pstjohn Aug 18, 2025
c575c99
Add `chat_template` (`jinja2`) as an extra dependency (#40128)
tboerstad Aug 18, 2025
e04f058
[typing] fix type annotation error in DepthPro model image processor …
MengAiDev Aug 18, 2025
0ab4151
[serve] guard imports (#39825)
gante Aug 18, 2025
c5f1e44
[`CI`] Fix repo consistency (#40249)
vasqu Aug 18, 2025
6e33102
Fixes for EncoderDecoderCache (#40008)
remi-or Aug 18, 2025
3511e6a
fix: Catch correct ConnectionError for additional_chat_templates (#39…
akug Aug 18, 2025
d73780a
Model card for NLLB (#40074)
sahil-kabir Aug 18, 2025
585479b
Correct typo and update notes in docs Readme (#40234)
PavloFesenko Aug 18, 2025
ff734c8
Fix benchmark workflow (#40254)
ahadnagy Aug 18, 2025
b0d2c10
docs: Update OLMo model card (#40233)
rafakatri Aug 18, 2025
9824747
Skip broken tests (#40157)
zucchini-nlp Aug 19, 2025
d2ece7c
Remove MI300 CI (#40270)
ivarflakstad Aug 19, 2025
357aa63
set inputs_embeds to None while generate to avoid audio encoder forwa…
BakerBunker Aug 19, 2025
5f0337f
[detection] fix attention mask for RT-DETR-based models (#40269)
materight Aug 19, 2025
8a9d254
Fix slow static cache export tests (#40261)
jackzhxng Aug 19, 2025
88cb955
🚨🚨 Switch default compilation to fullgraph=False (#40137)
Cyrilvallez Aug 19, 2025
80874d8
Fix setting attention for multimodal models (#39984)
zucchini-nlp Aug 19, 2025
c020d0b
[detection] fix correct `k_proj` weight and bias slicing in D-FINE (#…
notkisk Aug 19, 2025
86c15dc
Add Kosmos-2.5 (#31711)
tic-top Aug 19, 2025
ea01821
Skipping pytree registration in case fsdp is enabled (#40075)
romitjain Aug 19, 2025
3917632
Update image_processing_perception_lm_fast.py to allow for proper ove…
tyleryzhu Aug 19, 2025
efbab42
fix which routing method (#40283)
ArthurZucker Aug 19, 2025
48828ff
Fix chat CLI GPU loading and request_id validation issues (#40230) (#…
robin-ede Aug 19, 2025
0346cae
docs(layoutlm): add missing `id=usage` to `<hfoptions>` tag in Layout…
Jin-HoMLee Aug 19, 2025
75ef97a
Standardize RAG model card (#40222)
aayush226 Aug 19, 2025
7dbdf2f
docs: Update TrOCR model card to new format (#40240)
AceHunterr Aug 19, 2025
ab035fe
Update model card for gpt neox japanese (#39862)
ahnjj Aug 19, 2025
3c45aa3
SmolVLM and InternVL: Ensure pixel values are converted to the correc…
qgallouedec Aug 19, 2025
00387c5
Standardize BertGeneration model card (#40250)
nemitha2005 Aug 19, 2025
ef49bb4
Adjust ROCm test output expectations (#40279)
ahadnagy Aug 19, 2025
7ca285d
SmolVLM test fixes (#40275)
ahadnagy Aug 19, 2025
4f1cc77
make model docs device agnostic (2) (#40256)
yao-matrix Aug 19, 2025
8bd0c3e
[3/3] make docs device agnostic, all en docs for existing models done…
yao-matrix Aug 20, 2025
3313a92
Add MetaCLIP 2 (#39826)
NielsRogge Aug 20, 2025
9783ece
Allow to be able to run `torch.compile` tests with `fullgraph=True` (…
ydshieh Aug 20, 2025
7c9cb05
[`FA`] Fix dtype in varlen with position ids (#40295)
vasqu Aug 20, 2025
64e302d
[docs] delete more TF/Flax docs (#40289)
gante Aug 20, 2025
fbcda48
Clean up X-Codec. (#40271)
ebezzam Aug 20, 2025
214ee72
Remove OTel SDK dependencies (#40305)
anuraaga Aug 20, 2025
a25a926
Fix GOT-OCR2 and Cohere2Vision image processor patches caculation (#4…
Isotr0py Aug 20, 2025
3dd101b
[`fix`] Pass adamw optimizer parameters to StableAdamW (#40184)
emapco Aug 20, 2025
879f4f1
chore: fix typo in `find_executable_batch_size` to match new 0.9 rati…
MilkClouds Aug 20, 2025
219a63c
:rotating_light: [`Flash Attention`] Fix sliding window size (#40163)
vasqu Aug 20, 2025
ad0f8a8
Remove unnecessary contiguous calls for modern torch (#40315)
Rocketknight1 Aug 20, 2025
47f6028
Add support for Florence-2 (#38188)
ducviet00 Aug 20, 2025
4cbdeac
Qwen2.5-Omni test fixes (#40307)
ahadnagy Aug 20, 2025
9193d46
Add back `_tp_plan` attribute (#39944)
rishub-tamirisa Aug 20, 2025
cd434dd
byebye torch 2.1 (#40317)
Rocketknight1 Aug 20, 2025
de27972
No more `natten` (#40287)
ydshieh Aug 20, 2025
7cca954
[`GPT OSS`] Refactor the tests as it was not properly checking the ou…
ArthurZucker Aug 20, 2025
90a9871
Update CI with nightly torch workflow file (#40306)
ydshieh Aug 20, 2025
49d574f
Fix: Apply `get_placeholder_mask` in Ovis2 (#40280)
thisisiron Aug 20, 2025
4c54e65
Update notification service amd_daily_ci_workflows definition (#40314)
ivarflakstad Aug 20, 2025
5a8a382
One cache class to rule them all (#40276)
Cyrilvallez Aug 20, 2025
0caa9dc
Fix chunked attention mask with left-padding (#40324)
Cyrilvallez Aug 21, 2025
22cd5e2
[docs] remove flax references from `/en/model_doc` (#40311)
gante Aug 21, 2025
cf6b79c
Fix qwen-omni processor text only mode (#40336)
yuekaizhang Aug 21, 2025
7651143
Change Qwen2RMSNorm to RMSNorm from PyTorch (#40066)
cyyever Aug 21, 2025
d2f3be7
Add DeepseekV3ForSequenceClassification for Deepseek V3 models (#40200)
abdokaseb Aug 21, 2025
235dee2
Fix deprecation warning version (#40343)
Cyrilvallez Aug 21, 2025
efb1668
Add missing arguments to class constructors (#40068)
cyyever Aug 21, 2025
a0b051d
[docs] remove TF references from `/en/model_doc` (#40344)
gante Aug 21, 2025
46300ef
Fix: Only call Trainer.align_special_tokens if model has "config" att…
tomaarsen Aug 21, 2025
c4ccb0e
add type hints (#40319)
wirthual Aug 21, 2025
edb5bc6
Fix an infinite loop bug in recursive search of relative imports (#40…
eladsegal Aug 21, 2025
e546bf0
Fix links in Glm4vMoe configuration classes to point to the correct H…
vvvdwbvvv Aug 21, 2025
afb4851
T5 test and target device fixes (#40313)
ahadnagy Aug 21, 2025
6daa0be
Update `test_spm_converter_bytefallback_warning` (#40284)
ydshieh Aug 21, 2025
b67ff23
(small) fix conditional for input_ids and input_embeds in marian (#40…
cyntqliu Aug 21, 2025
9b7dbb7
Fix attention vizualizer (#40285)
molbap Aug 21, 2025
c4c14b2
[ModernBert] Prevent the attention mask from being None in ModernBert…
ashmikuz Aug 21, 2025
c77923e
Clean up XCodec and other codecs (#40348)
ebezzam Aug 21, 2025
80806c1
[serve] add cors warnings (#40112)
gante Aug 21, 2025
742e596
[detection] use consistent dtype for Conditional and DAB DETR positio…
agkphysics Aug 21, 2025
edf45e3
Remove more PyTorch 2.2 compatible code (#40337)
cyyever Aug 21, 2025
9b39d4e
[`FA`] Fix some model tests (#40350)
vasqu Aug 21, 2025
b00cde2
Qwen2.5-VL test fixes for ROCm (#40308)
ahadnagy Aug 21, 2025
ebda6b7
[generate] handle support for cache classes when num enc layers != nu…
gante Aug 21, 2025
48c2865
[4/N]more docs to device agnostic (#40355)
yao-matrix Aug 21, 2025
d9af01c
DOCS: Clarification on the use of `label_names` as an argument to Tra…
huzaifa-jawad367 Aug 22, 2025
d3cb4ba
HunYuan opensource (#39606)
yjc9696 Aug 22, 2025
6aca74f
Fix idefics3 vision embeddings indices dtype (#40360)
Isotr0py Aug 22, 2025
8855f1a
wav2vec2 fixes (#40341)
remi-or Aug 22, 2025
bad10bf
Change multimodal data links to HF hub (#40309)
zucchini-nlp Aug 22, 2025
6c12c94
[pipelines] add support to `skip_special_tokens` in the main text gen…
gante Aug 22, 2025
9808e0c
⚠️⚠️ Use `dtype` instead of `torch_dtype` everywhere! (#39782)
Cyrilvallez Aug 22, 2025
afb5c51
[processor] move commonalities to mixin (#40339)
zucchini-nlp Aug 22, 2025
2085855
[configuration] allow to overwrite kwargs from subconfigs (#40241)
zucchini-nlp Aug 22, 2025
8c56754
fix(example): align parameter names with the latest function definiti…
developer0hye Aug 22, 2025
a59748a
Addiing ByteDance Seed Seed-OSS (#40272)
Fazziekey Aug 22, 2025
7847025
Add GptOssForTokenClassification for GPT-OSS models (#40190)
abdokaseb Aug 22, 2025
33f092d
Bug Fix: Dynamically set return_lse flag in FlexAttention (#40352)
amd-lalithnc Aug 22, 2025
67c023a
Chat Template Doc Fixes (#40173)
Rocketknight1 Aug 22, 2025
b3dbe1f
Rework the Cache documentation (#40373)
Cyrilvallez Aug 22, 2025
f3aee54
Update README_zh-hans.md (#40380)
TardC Aug 22, 2025
5347cdd
HF papers in doc (#40381)
qgallouedec Aug 22, 2025
a28acc8
initial commit of PLDR-LLM model files.
burcgokden Aug 27, 2025
15 changes: 8 additions & 7 deletions .circleci/create_circleci_config.py
@@ -109,7 +109,9 @@ def __post_init__(self):
             self.docker_image[0]["image"] = f"{self.docker_image[0]['image']}:dev"
         print(f"Using {self.docker_image} docker image")
         if self.install_steps is None:
-            self.install_steps = ["uv venv && uv pip install ."]
+            self.install_steps = ["uv pip install ."]
+            # Use a custom patched pytest to force exit the process at the end, to avoid `Too long with no output (exceeded 10m0s): context deadline exceeded`
+            self.install_steps.append("uv pip install git+https://github.com/ydshieh/[email protected]")
         if self.pytest_options is None:
             self.pytest_options = {}
         if isinstance(self.tests_to_run, str):
@@ -213,7 +215,7 @@ def job_name(self):
     docker_image=[{"image": "huggingface/transformers-torch-light"}],
     # networkx==3.3 (after #36957) cause some issues
     # TODO: remove this once it works directly
-    install_steps=["uv venv && uv pip install ."],
+    install_steps=["uv pip install ."],
     marker="generate",
     parallelism=6,
 )
@@ -250,7 +252,7 @@ def job_name(self):
     additional_env={"OMP_NUM_THREADS": 8},
     docker_image=[{"image":"huggingface/transformers-examples-torch"}],
     # TODO @ArthurZucker remove this once docker is easier to build
-    install_steps=["uv venv && uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
+    install_steps=["uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
     pytest_num_workers=4,
 )

@@ -259,7 +261,7 @@ def job_name(self):
     additional_env={"HUGGINGFACE_CO_STAGING": True},
     docker_image=[{"image":"huggingface/transformers-torch-light"}],
     install_steps=[
-        'uv venv && uv pip install .',
+        'uv pip install .',
         'git config --global user.email "[email protected]"',
         'git config --global user.name "ci"',
     ],
@@ -273,7 +275,6 @@ def job_name(self):
     "onnx",
     docker_image=[{"image":"huggingface/transformers-torch-tf-light"}],
     install_steps=[
-        "uv venv",
         "uv pip install .[testing,sentencepiece,onnxruntime,vision,rjieba]",
     ],
     pytest_options={"k onnx": None},
@@ -303,7 +304,7 @@ def job_name(self):
     docker_image=[{"image": "huggingface/transformers-torch-light"}],
     # networkx==3.3 (after #36957) cause some issues
     # TODO: remove this once it works directly
-    install_steps=["uv venv && uv pip install .[serving]"],
+    install_steps=["uv pip install .[serving]"],
     marker="not generate",
     parallelism=6,
 )
@@ -321,7 +322,7 @@ def job_name(self):
     additional_env={"TRANSFORMERS_VERBOSITY": "error", "DATASETS_VERBOSITY": "error", "SKIP_CUDA_DOCTEST": "1"},
     install_steps=[
         # Add an empty file to keep the test step running correctly even no file is selected to be tested.
-        "uv venv && pip install .",
+        "uv pip install .",
         "touch dummy.py",
         command,
         "cat pr_documentation_tests_temp.txt",
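Taken together, the `create_circleci_config.py` hunks above change every job's default install steps: the explicit `uv venv &&` prefix is dropped, and a patched pytest is appended so the test process exits promptly. A minimal sketch of the resulting defaulting logic (a hypothetical standalone function, simplified from the real `CircleCIJob.__post_init__`):

```python
def default_install_steps(install_steps=None):
    """Sketch of the install-step defaulting after this PR (assumption:
    simplified reproduction, not the actual CircleCIJob dataclass)."""
    if install_steps is None:
        install_steps = ["uv pip install ."]
        # The patched pytest force-exits the process at the end, avoiding
        # CircleCI's "Too long with no output (exceeded 10m0s)" timeout.
        install_steps.append(
            "uv pip install git+https://github.com/ydshieh/[email protected]"
        )
    return install_steps

print(default_install_steps())
```

Jobs that pass their own `install_steps` (as the `generate`, examples, staging, onnx, serving, and doc-test jobs do above) bypass the default and only needed the `uv venv &&` prefix removed.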
2 changes: 1 addition & 1 deletion .github/workflows/benchmark.yml
@@ -48,7 +48,7 @@ jobs:

       - name: Run database init script
         run: |
-          psql -f benchmark/init_db.sql
+          psql -f benchmark/utils/init_db.sql
         env:
           PGDATABASE: metrics
           PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
5 changes: 4 additions & 1 deletion .github/workflows/check_failed_tests.yml
@@ -21,6 +21,9 @@ on:
       report_repo_id:
         required: true
         type: string
+      commit_sha:
+        required: false
+        type: string


 env:
@@ -87,7 +90,7 @@ jobs:
       - name: Update clone
         working-directory: /transformers
         if: ${{ env.process == 'true' }}
-        run: git fetch && git checkout ${{ github.sha }}
+        run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}

       - name: Get target commit
         working-directory: /transformers/utils
49 changes: 49 additions & 0 deletions .github/workflows/collated-reports.yml
@@ -0,0 +1,49 @@
name: CI collated reports

on:
workflow_call:
inputs:
job:
required: true
type: string
report_repo_id:
required: true
type: string
machine_type:
required: true
type: string
gpu_name:
description: Name of the GPU used for the job. Its enough that the value contains the name of the GPU, e.g. "noise-h100-more-noise". Case insensitive.
required: true
type: string

jobs:
collated_reports:
name: Collated reports
runs-on: ubuntu-22.04
if: always()
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4

- name: Collated reports
shell: bash
env:
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
CI_SHA: ${{ github.sha }}
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
run: |
pip install huggingface_hub
python3 utils/collated_reports.py \
--path /transformers/reports/ \
--machine-type ${{ inputs.machine_type }} \
--commit-hash ${{ env.CI_SHA }} \
--job ${{ inputs.job }} \
--report-repo-id ${{ inputs.report_repo_id }} \
--gpu-name ${{ inputs.gpu_name }}

- name: Upload collated reports
uses: actions/upload-artifact@v4
with:
name: collated_reports_${{ env.CI_SHA }}.json
path: collated_reports_${{ env.CI_SHA }}.json
5 changes: 4 additions & 1 deletion .github/workflows/model_jobs.yml
@@ -18,6 +18,9 @@ on:
       docker:
         required: true
         type: string
+      commit_sha:
+        required: false
+        type: string
       report_name_prefix:
         required: false
         default: run_models_gpu
@@ -70,7 +73,7 @@ jobs:

       - name: Update clone
         working-directory: /transformers
-        run: git fetch && git checkout ${{ github.sha }}
+        run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}

       - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
         working-directory: /transformers
37 changes: 13 additions & 24 deletions .github/workflows/self-nightly-caller.yml
@@ -1,43 +1,32 @@
name: Self-hosted runner (nightly-ci)

name: Nvidia CI with nightly torch

on:
repository_dispatch:
schedule:
- cron: "17 2 * * *"
# triggered when the daily scheduled Nvidia CI is completed.
# This way, we can compare the results more easily.
workflow_run:
workflows: ["Nvidia CI"]
branches: ["main"]
types: [completed]
push:
branches:
- run_nightly_ci*
- run_ci_with_nightly_torch*

jobs:
build_nightly_ci_images:
name: Build Nightly CI Docker Images
if: (github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_nightly_ci'))
build_nightly_torch_ci_images:
name: Build CI Docker Images with nightly torch
uses: ./.github/workflows/build-nightly-ci-docker-images.yml
secrets: inherit

model-ci:
name: Model CI
needs: [build_nightly_ci_images]
needs: build_nightly_torch_ci_images
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_models_gpu
slack_report_channel: "#transformers-ci-past-future"
runner: ci
docker: huggingface/transformers-all-latest-torch-nightly-gpu
ci_event: Nightly CI
secrets: inherit

deepspeed-ci:
name: DeepSpeed CI
needs: [build_nightly_ci_images]
uses: ./.github/workflows/self-scheduled.yml
with:
job: run_torch_cuda_extensions_gpu
slack_report_channel: "#transformers-ci-past-future"
runner: ci
# test deepspeed nightly build with the latest release torch
docker: huggingface/transformers-pytorch-deepspeed-latest-gpu
ci_event: Nightly CI
working-directory-prefix: /workspace
report_repo_id: hf-internal-testing/transformers_daily_ci_with_torch_nightly
commit_sha: ${{ github.event.workflow_run.head_sha || github.sha }}
secrets: inherit
25 changes: 0 additions & 25 deletions .github/workflows/self-push-amd-mi300-caller.yml

This file was deleted.

4 changes: 4 additions & 0 deletions .github/workflows/self-scheduled-amd-mi325-caller.yml
@@ -24,6 +24,7 @@ jobs:
       docker: huggingface/transformers-pytorch-amd-gpu
       ci_event: Scheduled CI (AMD) - mi325
       report_repo_id: optimum-amd/transformers_daily_ci
+      env_file: /etc/podinfo/gha-gpu-isolation-settings
     secrets: inherit

   torch-pipeline:
@@ -36,6 +37,7 @@ jobs:
       docker: huggingface/transformers-pytorch-amd-gpu
       ci_event: Scheduled CI (AMD) - mi325
       report_repo_id: optimum-amd/transformers_daily_ci
+      env_file: /etc/podinfo/gha-gpu-isolation-settings
     secrets: inherit

   example-ci:
@@ -48,6 +50,7 @@ jobs:
       docker: huggingface/transformers-pytorch-amd-gpu
       ci_event: Scheduled CI (AMD) - mi325
       report_repo_id: optimum-amd/transformers_daily_ci
+      env_file: /etc/podinfo/gha-gpu-isolation-settings
     secrets: inherit

   deepspeed-ci:
@@ -60,4 +63,5 @@ jobs:
       docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
       ci_event: Scheduled CI (AMD) - mi325
       report_repo_id: optimum-amd/transformers_daily_ci
+      env_file: /etc/podinfo/gha-gpu-isolation-settings
     secrets: inherit
@@ -1,8 +1,8 @@
-name: Self-hosted runner scale set (AMD mi300 scheduled CI caller)
+name: Self-hosted runner scale set (AMD mi355 scheduled CI caller)

 # Note: For every job in this workflow, the name of the runner scale set is finalized in the runner yaml i.e. huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml
-# For example, 1gpu scale set: amd-mi300-ci-1gpu
-#              2gpu scale set: amd-mi300-ci-2gpu
+# For example, 1gpu : amd-mi355-ci-1gpu
+#              2gpu : amd-mi355-ci-2gpu

 on:
   workflow_run:
@@ -20,9 +20,9 @@ jobs:
     with:
       job: run_models_gpu
       slack_report_channel: "#amd-hf-ci"
-      runner_scale_set: amd-mi300-ci
+      runner_scale_set: amd-mi355-ci
       docker: huggingface/transformers-pytorch-amd-gpu
-      ci_event: Scheduled CI (AMD) - mi300
+      ci_event: Scheduled CI (AMD) - mi355
       report_repo_id: optimum-amd/transformers_daily_ci
     secrets: inherit

@@ -32,9 +32,9 @@ jobs:
     with:
       job: run_pipelines_torch_gpu
       slack_report_channel: "#amd-hf-ci"
-      runner_scale_set: amd-mi300-ci
+      runner_scale_set: amd-mi355-ci
       docker: huggingface/transformers-pytorch-amd-gpu
-      ci_event: Scheduled CI (AMD) - mi300
+      ci_event: Scheduled CI (AMD) - mi355
       report_repo_id: optimum-amd/transformers_daily_ci
     secrets: inherit

@@ -44,9 +44,9 @@ jobs:
     with:
       job: run_examples_gpu
       slack_report_channel: "#amd-hf-ci"
-      runner_scale_set: amd-mi300-ci
+      runner_scale_set: amd-mi355-ci
       docker: huggingface/transformers-pytorch-amd-gpu
-      ci_event: Scheduled CI (AMD) - mi300
+      ci_event: Scheduled CI (AMD) - mi355
       report_repo_id: optimum-amd/transformers_daily_ci
     secrets: inherit

@@ -56,8 +56,8 @@ jobs:
     with:
       job: run_torch_cuda_extensions_gpu
       slack_report_channel: "#amd-hf-ci"
-      runner_scale_set: amd-mi300-ci
+      runner_scale_set: amd-mi355-ci
       docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
-      ci_event: Scheduled CI (AMD) - mi300
+      ci_event: Scheduled CI (AMD) - mi355
       report_repo_id: optimum-amd/transformers_daily_ci
     secrets: inherit
11 changes: 8 additions & 3 deletions .github/workflows/self-scheduled-caller.yml
@@ -1,13 +1,12 @@
-name: Self-hosted runner (scheduled)
-
+name: Nvidia CI

 on:
   repository_dispatch:
   schedule:
     - cron: "17 2 * * *"
   push:
     branches:
-      - run_scheduled_ci*
+      - run_nvidia_ci*
   workflow_dispatch:
     inputs:
       prev_workflow_run_id:
@@ -54,6 +53,7 @@ jobs:
       docker: huggingface/transformers-all-latest-gpu
       ci_event: Daily CI
       report_repo_id: hf-internal-testing/transformers_daily_ci
+      commit_sha: ${{ github.sha }}
     secrets: inherit

   torch-pipeline:
@@ -65,6 +65,7 @@ jobs:
       docker: huggingface/transformers-pytorch-gpu
       ci_event: Daily CI
       report_repo_id: hf-internal-testing/transformers_daily_ci
+      commit_sha: ${{ github.sha }}
     secrets: inherit

   example-ci:
@@ -76,6 +77,7 @@ jobs:
       docker: huggingface/transformers-all-latest-gpu
       ci_event: Daily CI
       report_repo_id: hf-internal-testing/transformers_daily_ci
+      commit_sha: ${{ github.sha }}
     secrets: inherit

   trainer-fsdp-ci:
@@ -87,6 +89,7 @@ jobs:
       docker: huggingface/transformers-all-latest-gpu
       ci_event: Daily CI
       report_repo_id: hf-internal-testing/transformers_daily_ci
+      commit_sha: ${{ github.sha }}
     secrets: inherit

   deepspeed-ci:
@@ -99,6 +102,7 @@ jobs:
       ci_event: Daily CI
       working-directory-prefix: /workspace
       report_repo_id: hf-internal-testing/transformers_daily_ci
+      commit_sha: ${{ github.sha }}
     secrets: inherit

   quantization-ci:
@@ -110,4 +114,5 @@ jobs:
       docker: huggingface/transformers-quantization-latest-gpu
       ci_event: Daily CI
       report_repo_id: hf-internal-testing/transformers_daily_ci
+      commit_sha: ${{ github.sha }}
     secrets: inherit