Description
OS
Windows
GPU Library
CUDA 12.x
Python version
3.12
Pytorch version
2.7.0
Model
Llama-3.1-8B-Instruct-exl2 (turboderp), but the bug occurs with every model I've tested
Describe the bug
This is almost certainly something on my end, but I can't work out what; other CUDA applications (e.g. the llama.cpp server) still work fine.
The process exits without any error during model load. I first noticed it with TabbyAPI, but it also occurs with minimal_chat.py from the examples directory.
Running under a debugger, it gets as far as line 583 of model.py (module.load()) and then exits; no error is reported.
The module is ExLlamaV2Embedding, and within that it reaches line 49 of embedding.py (w = self.load_weight()), then line 143 of module.py (tensors = self.load_multi(key, ["weight"], cpu = cpu)).
The exit appears to happen at line 96 of module.py (stfile.get_tensor()).
I also noticed that the "Building C++/CUDA extension" progress indicator never goes above 0%, although loading continues past it. I read about an issue with setuptools, so I downgraded to 75.8.2 in case that was the cause, but it made no difference.
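Since the process dies with no Python traceback, one stdlib-only diagnostic (a hedged suggestion on my part, nothing exllamav2-specific) is to enable faulthandler before the loader runs; if a native extension hard-crashes the interpreter, it dumps the Python stack to stderr before the process disappears:

```python
# Enable faulthandler before importing/loading exllamav2, so that a hard
# native crash (e.g. inside the C++/CUDA extension) still prints the
# Python stack to stderr instead of the process silently vanishing.
import faulthandler
import sys

faulthandler.enable(file=sys.stderr, all_threads=True)
print("faulthandler enabled:", faulthandler.is_enabled())

# Then run the loader as usual; equivalently, without editing the script:
#   python -X faulthandler examples/minimal_chat.py
```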
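To rule out a damaged download before blaming stfile.get_tensor(), the .safetensors shards can be validated with the standard library alone. This is a sketch based on the published safetensors file layout (8-byte little-endian header length, JSON header with per-tensor byte offsets, then raw data); check_safetensors and its output strings are my own names, not part of any library:

```python
# Stdlib-only sanity check of a .safetensors shard: verify that every
# tensor's declared byte range actually fits inside the file, which
# catches truncated or corrupt downloads.
import json
import os
import struct

def check_safetensors(path: str) -> bool:
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    data_start = 8 + header_len
    ok = True
    for name, meta in header.items():
        if name == "__metadata__":
            continue
        begin, end = meta["data_offsets"]
        if data_start + end > size:
            print(f"TRUNCATED: {name} needs byte {data_start + end}, file has {size}")
            ok = False
    if ok:
        print(f"OK: {sum(1 for k in header if k != '__metadata__')} tensors, {size} bytes")
    return ok
```

Running this over each shard in the model directory should print OK for every file; a truncated shard could plausibly cause a hard exit deep in the native reader.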
Reproduction steps
Windows 10, NVIDIA A6000. I was on CUDA 12.4 and upgraded to 12.8, which is when the issue appeared (I also upgraded exllamav2 from 0.2.8 to 0.2.9 at the same time, but downgrading it again doesn't help). I've since upgraded to CUDA 12.9.
minimal_chat.py is sufficient to reproduce the behaviour (loading the turboderp quant of Llama-3.1-8B-Instruct-exl2).
Expected behavior
The model loads, or at least an error message is printed
Logs
Package Version
------------------------- -----------
aiofiles 24.1.0
aiohappyeyeballs 2.6.1
aiohttp 3.11.18
aiosignal 1.3.2
annotated-types 0.7.0
anyio 4.9.0
async-lru 2.0.5
attrs 25.3.0
certifi 2025.4.26
charset-normalizer 3.4.2
click 8.1.8
colorama 0.4.6
cramjam 2.10.0
einops 0.8.1
exllamav2 0.2.9
fastapi-slim 0.115.12
fastparquet 2024.11.0
filelock 3.18.0
flash_attn 2.7.4.post1
formatron 0.4.11
frozendict 2.4.6
frozenlist 1.6.0
fsspec 2025.3.2
general_sam 1.0.2
h11 0.16.0
httptools 0.6.4
huggingface-hub 0.30.2
idna 3.10
Jinja2 3.1.6
jsonschema 4.23.0
jsonschema-specifications 2025.4.1
kbnf 0.4.1
loguru 0.7.3
markdown-it-py 3.0.0
MarkupSafe 3.0.2
mdurl 0.1.2
mpmath 1.3.0
multidict 6.4.3
networkx 3.4.2
ninja 1.11.1.4
numpy 2.2.5
packaging 25.0
pandas 2.2.3
pillow 11.2.1
pip 24.0
propcache 0.3.1
psutil 7.0.0
pydantic 2.11.4
pydantic_core 2.33.2
Pygments 2.19.1
python-dateutil 2.9.0.post0
pytz 2025.2
PyYAML 6.0.2
referencing 0.36.2
regex 2024.11.6
requests 2.32.3
rich 14.0.0
rpds-py 0.24.0
ruamel.yaml 0.18.10
ruamel.yaml.clib 0.2.12
safetensors 0.5.3
sentencepiece 0.2.0
setuptools 75.8.2
six 1.17.0
sniffio 1.3.1
sse-starlette 2.3.4
starlette 0.46.2
sympy 1.14.0
tabbyAPI 0.0.1
tokenizers 0.21.1
torch 2.7.0+cu128
tqdm 4.67.1
typing_extensions 4.13.2
typing-inspection 0.4.0
tzdata 2025.2
urllib3 2.4.0
uvicorn 0.34.2
websockets 15.0.1
win32_setctime 1.2.0
winloop 0.1.8
yarl 1.20.0
Additional context
No response
Acknowledgements
- I have looked for similar issues before submitting this one.
- I understand that the developers have lives and my issue will be answered when possible.
- I understand the developers of this program are human, and I will ask my questions politely.