Description
Environment
This error occurred while using the Jupyter online programming feature in the hands-on project section of the MindSpore (昇思) Large Model Platform. The selected runtime is the Ascend environment, with the spec 1*ascend-snt9b | ARM: 19 cores, 180 GB, and the image python3.9-ms2.7.0-cann8.2.RC1.
Describe the current behavior
When running the fine-tuning experiment in a JupyterLab notebook, the earlier steps complete normally, but at the post-fine-tuning inference step, calling PeftModel.from_pretrained to load the LoRA weights raises the following error:
AttributeError: 'fast_safe_open' object has no attribute 'offset_keys'
(The same code runs successfully in VS Code without this error.)
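Stripped down to the failing call, the fragment in question is just the standard PEFT loading step (the full reproduction, including how mode_path and lora_path are defined, is given below):
model = AutoModelForCausalLM.from_pretrained(mode_path, ms_dtype=mindspore.bfloat16, trust_remote_code=True).eval()
model = PeftModel.from_pretrained(model, model_id=lora_path)  # <- the AttributeError is raised here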
Steps to reproduce the issue
I configured the environment and ran the experiment following the commands and code in the document below:
DeepSeek-R1-Distill-Qwen-1.5B-LoRA微调实验手册.docx
Specifically, I first ran the following commands in a JupyterLab terminal to set up the experiment environment:
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip install modelscope
modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --local_dir ./DeepSeek-R1-Distill-Qwen-1.5B
git clone https://gitee.com/mindspore-lab/mindnlp.git
pip install peft==0.17.1
pip install accelerate==1.10.1
pip install transformers==4.55.4
cd mindnlp
git checkout 6719e5e60e72df9a1c7b9473029c7ccf62ff8a12
bash scripts/build_and_reinstall.sh
pip install ipykernel
python -m ipykernel install --prefix=/home/mindspore/.local --name=py310 --display-name "Python 3.10"
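(For completeness, the installed package versions can be confirmed afterwards with a quick check like the one below; this is only an illustrative sketch that prints each package's __version__.)
import peft, transformers, accelerate, safetensors, mindspore, mindnlp
for m in (peft, transformers, accelerate, safetensors, mindspore, mindnlp):
    print(m.__name__, getattr(m, "__version__", "unknown"))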
Then I opened the notebook with the Python 3.10 kernel and ran the following code:
import mindnlp
import mindspore
from mindnlp import core
from datasets import Dataset
import pandas as pd
from transformers import AutoTokenizer, AutoModelForCausalLM, DataCollatorForSeq2Seq, TrainingArguments, Trainer, GenerationConfig
from peft import LoraConfig, TaskType, get_peft_model, PeftModel
!wget "https://gh-proxy.com/https://raw.githubusercontent.com/datawhalechina/self-llm/refs/heads/master/dataset/huanhuan.json" -O huanhuan.json --no-check-certificate  # download the dataset
df = pd.read_json('./huanhuan.json')
ds = Dataset.from_pandas(df)  # load the data and convert it to a Dataset
tokenizer = AutoTokenizer.from_pretrained('deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B', use_fast=False, trust_remote_code=True)  # instantiate the tokenizer
def process_func(example):
    MAX_LENGTH = 384  # the Llama tokenizer splits a single Chinese character into several tokens, so the max length is relaxed to keep the data intact
    input_ids, attention_mask, labels = [], [], []
    instruction = tokenizer(f"<|im_start|>system\n现在你要扮演皇帝身边的女人--甄嬛<|im_end|>\n<|im_start|>user\n{example['instruction'] + example['input']}<|im_end|>\n<|im_start|>assistant\n", add_special_tokens=False)  # add_special_tokens=False: do not prepend special tokens
    response = tokenizer(f"{example['output']}", add_special_tokens=False)
    input_ids = instruction["input_ids"] + response["input_ids"] + [tokenizer.pad_token_id]
    attention_mask = instruction["attention_mask"] + response["attention_mask"] + [1]  # the EOS token should also be attended to, so append 1
    labels = [-100] * len(instruction["input_ids"]) + response["input_ids"] + [tokenizer.pad_token_id]
    if len(input_ids) > MAX_LENGTH:  # truncate
        input_ids = input_ids[:MAX_LENGTH]
        attention_mask = attention_mask[:MAX_LENGTH]
        labels = labels[:MAX_LENGTH]
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels
    }
tokenized_id = ds.map(process_func, remove_columns=ds.column_names)
model = AutoModelForCausalLM.from_pretrained('deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B', ms_dtype=mindspore.bfloat16, device_map=0)  # load the base model
model.enable_input_require_grads()  # required when gradient checkpointing is enabled
model = model.npu()  # host to device
prompt = "你是谁?"
inputs = tokenizer.apply_chat_template([{"role": "system", "content": "现在你要扮演皇帝身边的女人--甄嬛"}, {"role": "user", "content": prompt}],
                                       add_generation_prompt=True,
                                       tokenize=True,
                                       return_tensors="ms",
                                       return_dict=True
                                       ).to('cuda')
gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
with core.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    inference_mode=False,  # training mode
    r=8,  # LoRA rank
    lora_alpha=32,  # LoRA alpha; see the LoRA paper for its exact role
    lora_dropout=0.1  # dropout ratio
)  # LoRA configuration
model = get_peft_model(model, config)  # add the LoRA layers to the model according to the config above
args = TrainingArguments(
    output_dir="./output_1.5bf/Qwen2.5_instruct_lora",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=5,
    logging_steps=10,
    num_train_epochs=1,
    save_steps=100,
    learning_rate=1e-4,
    save_on_each_node=True,
)  # training hyperparameters
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_id,
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, padding=True),
)
trainer.train()
mode_path = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B'
lora_path = './output_1.5bf/Qwen2.5_instruct_lora/checkpoint-187'  # change this to the checkpoint path of your LoRA output
tokenizer = AutoTokenizer.from_pretrained(mode_path, trust_remote_code=True)  # load the tokenizer
model = AutoModelForCausalLM.from_pretrained(mode_path, ms_dtype=mindspore.bfloat16, trust_remote_code=True).eval()  # load the model
model = PeftModel.from_pretrained(model, model_id=lora_path)  # load the LoRA weights
When execution reaches this last step, the error appears (as shown in the screenshot below). Following suggestions from an AI assistant, I tried adjusting package versions and modifying the code, but the error keeps occurring.
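As an additional data point, one way to check whether the adapter file itself is readable is to open it directly with the stock safetensors API instead of going through PeftModel (an illustrative sketch only; it assumes the adapter was saved as adapter_model.safetensors inside the checkpoint directory):
from safetensors import safe_open
with safe_open('./output_1.5bf/Qwen2.5_instruct_lora/checkpoint-187/adapter_model.safetensors', framework="np") as f:
    print(list(f.keys())[:5])  # print a few tensor names to confirm the file opens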
Related log / screenshot
(screenshot of the AttributeError traceback)