
LoRA fine-tuning runs fine, but the exported model crashes on the first chat turn with `total bytes of NDArray > 2**32` #10245

@Coomfu

Description


Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.4
  • Platform: macOS-15.1.1-arm64-arm-64bit
  • Python version: 3.11.14
  • PyTorch version: 2.10.0
  • Transformers version: 4.57.1
  • Datasets version: 4.0.0
  • Accelerate version: 1.11.0
  • PEFT version: 0.17.1
  • TRL version: 0.24.0
  • Default data directory: detected
  • Model name: MacBook Pro
  • Chip: Apple M1 Pro
  • Total cores: 10 (8 performance and 2 efficiency)
  • Memory: 16 GB

Reproduction

Assistant: [WARNING|logging.py:328] 2026-03-04 11:29:51,568 >> `generation_config` default values have been modified to match model-specific defaults: {'top_k': 20, 'repetition_penalty': 1.1, 'bos_token_id': 151643}. If this is not desired, please set these values explicitly.
/AppleInternal/Library/BuildRoots/4b66fb3c-7dd0-11ef-b4fb-4a83e32a47e1/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:850: failed assertion `[MPSTemporaryNDArray initWithDevice:descriptor:isTextureBacked:] Error: total bytes of NDArray > 2**32'
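The assertion comes from Metal Performance Shaders: MPS refuses to back any single NDArray larger than 2**32 bytes (4 GiB), regardless of how much RAM the machine has. As an illustrative back-of-envelope check (the shapes below are hypothetical, not taken from this run), the limit is easy to cross with buffers that grow quadratically in sequence length:

```python
MPS_NDARRAY_LIMIT = 2**32  # MPSTemporaryNDArray asserts total bytes <= 4 GiB

def tensor_bytes(shape, bytes_per_element=4):
    """Total bytes of a dense tensor (fp32 by default)."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element

# Hypothetical fp32 attention-score buffer, shape (batch, heads, seq, seq):
print(tensor_bytes((1, 14, 2048, 2048)) > MPS_NDARRAY_LIMIT)    # False (~235 MB)
print(tensor_bytes((1, 14, 18000, 18000)) > MPS_NDARRAY_LIMIT)  # True  (~18 GiB)
```

So even a 0.5B-parameter model can trip the assertion if some intermediate tensor (attention scores, logits, KV cache) is allocated in fp32 at a large sequence length.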

Others

Training arguments:

### model
model_name_or_path: Qwen/Qwen2-0.5B-Instruct
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all

### dataset
dataset: identity_xjf
template: qwen
cutoff_len: 2048
max_samples: 1000
preprocessing_num_workers: 1
dataloader_num_workers: 0

### output
output_dir: saves/qwen2-0.5b/lora/sft
logging_steps: 10
save_steps: 100
plot_loss: true
overwrite_output_dir: true
save_only_model: false
report_to: none  # choices: [none, wandb, tensorboard, swanlab, mlflow]

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 5.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: false
ddp_timeout: 180000000
resume_from_checkpoint: null

Export command:

llamafactory-cli export \
    --model_name_or_path Qwen/Qwen2-0.5B-Instruct \
    --adapter_name_or_path saves/qwen2-0.5b/lora/sft \
    --template qwen \
    --export_dir saves/qwen2-0.5b/lora/sft_merged \
    --export_size 5 \
    --export_device cpu \
    --export_legacy_format false

Then chat with the merged model:

llamafactory-cli chat \
    --model_name_or_path saves/qwen2-0.5b/lora/sft_merged \
    --template qwen
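One workaround sketch (untested on this setup): PyTorch's documented `PYTORCH_ENABLE_MPS_FALLBACK=1` environment variable routes operations the MPS backend cannot handle to the CPU. It may not rescue a pure allocation-size failure, but it is a cheap first thing to try before forcing CPU-only inference:

```shell
# Untested workaround: let PyTorch fall back to CPU for ops that fail on MPS
PYTORCH_ENABLE_MPS_FALLBACK=1 llamafactory-cli chat \
    --model_name_or_path saves/qwen2-0.5b/lora/sft_merged \
    --template qwen
```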
