Fix hardcoded CUDA device in api.py to support MPS and CPU fallback#1516

Open
Mr-Neutr0n wants to merge 1 commit into zai-org:main from Mr-Neutr0n:fix/device-handling-api

Conversation

@Mr-Neutr0n

Problem

The API server (api.py) crashes immediately on non-CUDA systems because:

  1. DEVICE is hardcoded to "cuda", causing failures on macOS (Apple Silicon/MPS) and CPU-only machines
  2. Model loading uses .cuda() directly, which raises RuntimeError when CUDA is unavailable
  3. torch_gc() only handles CUDA cleanup, missing MPS cache management
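The failure in (2) is easy to reproduce on any non-CUDA build of PyTorch; a minimal sketch (not the actual api.py code):

```python
import torch

# On a CPU-only or MPS-only build, calling .cuda() fails immediately --
# typically "Torch not compiled with CUDA enabled" (an AssertionError)
# or a RuntimeError, depending on how PyTorch was built.
if not torch.cuda.is_available():
    try:
        torch.zeros(1).cuda()
    except (AssertionError, RuntimeError) as err:
        print(f"reproduced the startup crash: {err}")
```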

Fix

  • Auto-detect the best available device at startup: CUDA > MPS > CPU
  • Replace .cuda() with .to(DEVICE) for portable device placement
  • Update torch_gc() to clear MPS cache when on Apple Silicon and skip CUDA-specific cleanup when not on a CUDA device
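The three bullets above can be sketched as follows. This is an illustration of the approach described in the PR, not the exact api.py code; `torch.mps.empty_cache()` assumes PyTorch >= 2.0:

```python
import gc

import torch


def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Priority order from the PR: CUDA > MPS > CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


# Auto-detect once at startup instead of hardcoding "cuda".
DEVICE = pick_device(
    torch.cuda.is_available(),
    torch.backends.mps.is_available(),
)


def torch_gc() -> None:
    """Free cached device memory on whichever backend is active."""
    gc.collect()
    if DEVICE == "cuda":
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
    elif DEVICE == "mps":
        torch.mps.empty_cache()  # MPS cache API, PyTorch >= 2.0
    # On plain CPU there is no device cache to clear.


# Portable placement: model.to(DEVICE) works on all three backends,
# where model.cuda() raises on non-CUDA machines.
```

On a CUDA machine `pick_device` still resolves to `"cuda"`, which is what makes the change backward compatible.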

Testing

  • Verified the logic is consistent with PyTorch's device detection APIs
  • Backward compatible: behavior is identical on CUDA systems since torch.cuda.is_available() returns True and DEVICE resolves to "cuda" as before

