System Info
- `Accelerate` version: 1.12.0
- Platform: Linux-6.8.0-1029-aws-x86_64-with-glibc2.39
- `accelerate` bash location: /home/ubuntu/shon/remake-site2site/.venv/bin/accelerate
- Python version: 3.12.3
- Numpy version: 2.2.6
- PyTorch version: 2.9.0+cu129
- PyTorch accelerator: CUDA
- System RAM: 372.73 GB
- GPU type: NVIDIA L40S
- `Accelerate` default config: Not found

Information
- The official example scripts
- My own modified scripts
Tasks
- One of the scripts in the examples/ folder of Accelerate, or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- My own task or dataset (give details below)
Reproduction
While calling

```python
mpu = UlyssesSPAttentionHF.register_with_transformers(
    model_name_or_path=model,
    sequence_parallel_size=sp_size,
    seq_length=sp_handler.sp_seq_length,
    seq_length_is_variable=sp_handler.sp_seq_length_is_variable,
    core_attn_implementation=sp_handler.sp_attn_implementation,
    micro_batch_size=batch_size_per_device,
)
```

with a PEFT model, this code snippet (from DeepSpeed) fails:
```python
from transformers import AutoConfig, PreTrainedModel

if isinstance(model_name_or_path, PreTrainedModel):
    # we already have the model
    hf_model_config = model_name_or_path.config
else:
    # if we don't have the model yet at this stage
    hf_model_config = AutoConfig.from_pretrained(model_name_or_path)
```
This fails because a PEFT-wrapped model is not an instance of `PreTrainedModel`, so execution falls into the `else` branch and `AutoConfig.from_pretrained` receives a model object rather than a name or path.
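For illustration, a minimal sketch that reproduces the failing check (the `gpt2` base model and the default `LoraConfig` are illustrative choices, not from this report):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, PreTrainedModel

base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
peft_model = get_peft_model(base, LoraConfig())

print(isinstance(base, PreTrainedModel))        # True
print(isinstance(peft_model, PreTrainedModel))  # False -> takes the AutoConfig branch
```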
Expected behavior
PEFT models should also be accepted, e.g. by unwrapping them to the underlying `PreTrainedModel` before reading the config, as sketched below.
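A minimal sketch of one possible fix, assuming `peft` is installed; `resolve_hf_config` is a hypothetical helper name, not DeepSpeed's actual API:

```python
from transformers import AutoConfig, PreTrainedModel

try:
    from peft import PeftModel
except ImportError:  # peft is an optional dependency
    PeftModel = None

def resolve_hf_config(model_name_or_path):
    # Hypothetical helper: return the HF config for a plain model object,
    # a PEFT-wrapped model, or a model name/path.
    if PeftModel is not None and isinstance(model_name_or_path, PeftModel):
        # PeftModel wraps a PreTrainedModel; unwrap it to read the config
        return model_name_or_path.get_base_model().config
    if isinstance(model_name_or_path, PreTrainedModel):
        return model_name_or_path.config
    return AutoConfig.from_pretrained(model_name_or_path)
```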