Description
cmd Log:
(venv) C:\HeartMula\HeartMuLa-Studio>python -m uvicorn backend.app.main:app --host 0.0.0.0 --port 8000
INFO: Started server process [7004]
INFO: Waiting for application startup.
[Models] All models found at C:\HeartMula\HeartMuLa-Studio\backend\models
[Config] Auto-detected: 4-bit quantization = True
[Config] Auto-detected: sequential offload = True
[Quantization] 4-bit quantization ENABLED - model will use ~3GB instead of ~11GB
[12GB GPU Mode] Using lazy codec loading for 12GB GPU
[Quantization] Loading HeartMuLa with 4-bit NF4 quantization...
torch_dtype is deprecated! Use dtype instead!
Failed to load Heartlib model: Torch not compiled with CUDA enabled
ERROR:    Traceback (most recent call last):
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\starlette\routing.py", line 694, in lifespan
    async with self.lifespan_context(app) as maybe_state:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\backend\app\main.py", line 55, in lifespan
    await music_service.initialize()
  File "C:\HeartMula\HeartMuLa-Studio\backend\app\services\music_service.py", line 835, in initialize
    raise e
  File "C:\HeartMula\HeartMuLa-Studio\backend\app\services\music_service.py", line 828, in initialize
    self.pipeline = await loop.run_in_executor(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\backend\app\services\music_service.py", line 830, in <lambda>
    lambda mp=model_path, v=version: self._load_pipeline_multi_gpu(mp, v)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\backend\app\services\music_service.py", line 693, in _load_pipeline_multi_gpu
    pipeline = create_quantized_pipeline(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\backend\app\services\music_service.py", line 409, in create_quantized_pipeline
    heartmula = HeartMuLa.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\transformers\modeling_utils.py", line 277, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\transformers\modeling_utils.py", line 5051, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\transformers\modeling_utils.py", line 5435, in _load_pretrained_model
    caching_allocator_warmup(model, expanded_device_map, hf_quantizer)
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\transformers\modeling_utils.py", line 6092, in caching_allocator_warmup
    index = device.index if device.index is not None else torch_accelerator_module.current_device()
                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\torch\cuda\__init__.py", line 878, in current_device
    _lazy_init()
  File "C:\HeartMula\HeartMuLa-Studio\venv\Lib\site-packages\torch\cuda\__init__.py", line 305, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
ERROR: Application startup failed. Exiting.
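The assertion at the bottom of the traceback ("Torch not compiled with CUDA enabled") is raised by `torch.cuda._lazy_init` whenever the installed torch wheel is the CPU-only build, so the quantized-pipeline load fails the moment it asks for the current CUDA device. A minimal diagnostic sketch (using only the public `torch.cuda` / `torch.version` API, not anything from this repo) that separates "no torch", "CPU-only torch build", and "CUDA build but no usable GPU":

```python
def cuda_status():
    """Report whether the installed torch build can actually use CUDA."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA toolkit the wheel was built against
        return f"CUDA OK: torch {torch.__version__}, CUDA {torch.version.cuda}"
    if torch.version.cuda is None:
        # This is the state that produces the AssertionError above
        return "CPU-only torch build: reinstall torch from a CUDA wheel index"
    return "CUDA-enabled torch build, but no usable GPU/driver was detected"


if __name__ == "__main__":
    print(cuda_status())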