
terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc #24

@gpww

Description

System Info / 系統信息

========================================
System Information Diagnostic Script

Python Version: 3.10.12 (main, Aug 15 2025, 14:32:43) [GCC 11.4.0]

OS: Linux 6.6.87.2-microsoft-standard-WSL2 (#1 SMP PREEMPT_DYNAMIC Thu Jun 5 18:30:46 UTC 2025)
Platform: Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.35

CPU: x86_64
Physical Cores: 24
Logical Cores: 24
Total RAM: 62.53 GB
Available RAM: 28.10 GB

Transformers Version: 4.51.3

PyTorch Version: 2.9.1+cu128
CUDA Available: True
PyTorch CUDA Version: 12.8
CUDA Device Count: 1
Device 0: NVIDIA GeForce RTX 3090
Total Memory: 24.00 GB
Major/Minor: 8.6

NVCC Version (System):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Apr__9_19:24:57_PDT_2025
Cuda compilation tools, release 12.9, V12.9.41
Build cuda_12.9.r12.9/compiler.35813241_0


nvidia-smi Output:
Tue Dec 23 13:37:25 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.10 Driver Version: 581.29 CUDA Version: 13.0 |
+-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3090 On | 00000000:02:00.0 On | N/A |
| 0% 32C P8 39W / 460W | 2178MiB / 24576MiB | 2% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1 C /python3.10 N/A |
| 0 N/A N/A 1 C /python3.10 N/A |
| 0 N/A N/A 1 C /python3.10 N/A |
| 0 N/A N/A 36 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+

========================================
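For reference, the host details above could be gathered with a minimal stdlib-only diagnostic along these lines (a sketch, not the actual script used for the output above; names are illustrative):

```python
import os
import platform
import sys
from pathlib import Path

def system_info() -> dict:
    """Collect basic host information similar to the diagnostic output above."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "cpu": platform.machine(),
        "logical_cores": os.cpu_count(),
    }
    # Total RAM from /proc/meminfo (Linux/WSL2 only; guarded for portability).
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        for line in meminfo.read_text().splitlines():
            if line.startswith("MemTotal:"):
                kib = int(line.split()[1])
                info["total_ram_gb"] = round(kib / 1024**2, 2)
                break
    return info

if __name__ == "__main__":
    for key, value in system_info().items():
        print(f"{key}: {value}")
```

The GPU side (driver, CUDA version, per-device memory) would additionally need `torch.cuda` or an `nvidia-smi` call, which are omitted here to keep the sketch dependency-free.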

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

  • The official example scripts / 官方的示例脚本
  • My own modified scripts / 我自己修改的脚本和任务

Reproduction / 复现过程

After setting up the environment per requirements.txt, download the model and run: python inference.py --checkpoint_dir GLM-ASR-Nano-2512 --audio examples/example_en.wav --device cuda

The error is thrown immediately. Models downloaded from both ModelScope and Hugging Face give the same result.
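One way to rule out a truncated or corrupted download (a common cause of allocation failures during weight loading) is to list the on-disk weight files and their sizes before running inference. A stdlib-only sketch (the directory name is the one from this report; the helper name is hypothetical):

```python
from pathlib import Path

def list_weight_files(checkpoint_dir: str) -> list[tuple[str, float]]:
    """Return (filename, size in GiB) for weight files in a checkpoint directory."""
    patterns = ("*.safetensors", "*.bin", "*.pt")
    files = []
    for pattern in patterns:
        for path in sorted(Path(checkpoint_dir).glob(pattern)):
            files.append((path.name, path.stat().st_size / 1024**3))
    return files

if __name__ == "__main__":
    # "GLM-ASR-Nano-2512" is the checkpoint directory used in the repro command.
    for name, gib in list_weight_files("GLM-ASR-Nano-2512"):
        print(f"{name}: {gib:.2f} GiB")
```

Comparing these sizes against the ones shown on the ModelScope / Hugging Face model pages would confirm whether both downloads are intact.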

Expected behavior / 期待表现

root@868442d56a08:/app# python inference.py --checkpoint_dir GLM-ASR-Nano-2512 --audio examples/example_en.wav --device cuda
[inference] start transcribe checkpoint_dir=GLM-ASR-Nano-2512 audio=examples/example_en.wav device=cuda max_new_tokens=128
[inference] cuda mem free=24.44GB total=25.77GB
[inference] load tokenizer from GLM-ASR-Nano-2512
[inference] load config from GLM-ASR-Nano-2512
You are using a model of type glmasr to instantiate a model of type Glmasr. This is not supported for all configurations of models and can yield errors.
[inference] load model with dtype=torch.float16
[inference] cuda mem free=24.44GB total=25.77GB
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
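Worth noting for triage: std::bad_alloc is raised by the C++ runtime when a host-side allocation fails, before Python's MemoryError machinery is involved, and the host here is WSL2 with only ~28 GB of 62.5 GB RAM reported available. If the WSL2 VM's memory cap turns out to be the limit, it can be raised in %UserProfile%\.wslconfig on the Windows side (the values below are illustrative, not a recommendation for this model):

```ini
; %UserProfile%\.wslconfig (Windows side); adjust values to the host
[wsl2]
memory=48GB
swap=16GB
```

Running `wsl --shutdown` from Windows is required for the new limits to take effect. This is only a hypothesis to rule out, not a confirmed cause of the crash.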
