Description
GLM-4.7-Flash shows some pretty impressive benchmarks and could be a viable local model. Trying to run Heretic on it results in an error because `glm4_moe_lite` is not a recognized architecture. I ran `pip install --upgrade` on the transformers package, but it produced the same error output. Will this be resolved by a future transformers release, or will support for GLM-4.7-Flash require changes to Heretic?
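For anyone else hitting this, here is a minimal sketch for checking whether a given transformers install registers the architecture, independently of Heretic. The model ID is taken from the log below; that a source install of transformers already includes `glm4_moe_lite` is an assumption, not verified:

```python
# Minimal check: does the installed transformers version recognize the
# glm4_moe_lite architecture? (Model ID copied from the Heretic log below.)
from transformers import AutoConfig

try:
    config = AutoConfig.from_pretrained("zai-org/GLM-4.7-Flash")
    print(f"Recognized model type: {config.model_type}")
except ValueError as error:
    # transformers raises ValueError for unknown model types; if no release
    # supports it yet, a source install may (an assumption, not verified):
    #   pip install git+https://github.com/huggingface/transformers.git
    print(f"Architecture not recognized: {error}")
```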
```
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.1.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
No GPU or other accelerator detected. Operations will be slow.
Loading model zai-org/GLM-4.7-Flash...
- Trying dtype auto... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
- Trying dtype float16... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
- Trying dtype bfloat16... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
- Trying dtype float32... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/heretic:8 in │
│ │
│ 5 from heretic.main import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.12/dist-packages/heretic/main.py:576 in main │
│ │
│ 573 │ install() │
│ 574 │ │
│ 575 │ try: │
│ ❱ 576 │ │ run() │
│ 577 │ except BaseException as error: │
│ 578 │ │ # Transformers appears to handle KeyboardInterrupt (or BaseExc │
│ 579 │ │ # internally in some places, which can re-raise a different er │
│ │
│ /usr/local/lib/python3.12/dist-packages/heretic/main.py:133 in run │
│ │
│ 130 │ # Silence the warning about multivariate TPE being experimental. │
│ 131 │ warnings.filterwarnings("ignore", category=ExperimentalWarning) │
│ 132 │ │
│ ❱ 133 │ model = Model(settings) │
│ 134 │ │
│ 135 │ print() │
│ 136 │ print(f"Loading good prompts from [bold]{settings.good_prompts.dat │
│ │
│ /usr/local/lib/python3.12/dist-packages/heretic/model.py:92 in __init__ │
│ │
│ 89 │ │ │ break │
│ 90 │ │ │
│ 91 │ │ if self.model is None: │
│ ❱ 92 │ │ │ raise Exception("Failed to load model with all configured │
│ 93 │ │ │
│ 94 │ │ print(f"* Transformer model with [bold]{len(self.get_layers()) │
│ 95 │ │ print("* Abliterable components:") │
╰──────────────────────────────────────────────────────────────────────────────╯
Exception: Failed to load model with all configured dtypes.
```