Description
GLM-4.7-Flash shows some pretty impressive benchmarks and could be a viable local model. Trying to run Heretic on it results in an error because `glm4_moe_lite` is not a recognized architecture. I ran `pip install --upgrade` on the transformers package, but it produced the same error output. Will this be resolved by a future transformers release, or will support for GLM-4.7-Flash require changes to Heretic?
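For anyone else hitting this, here is a minimal sketch for checking whether a given transformers install registers the architecture, independently of Heretic. The model ID is taken from the log below; that a source install of transformers already includes `glm4_moe_lite` is an assumption, not verified:

```python
# Minimal check: does the installed transformers version recognize the
# glm4_moe_lite architecture? (Model ID copied from the Heretic log below.)
from transformers import AutoConfig

try:
    config = AutoConfig.from_pretrained("zai-org/GLM-4.7-Flash")
    print(f"Recognized model type: {config.model_type}")
except ValueError as error:
    # transformers raises ValueError for unknown model types; if no release
    # supports it yet, a source install may (an assumption, not verified):
    #   pip install git+https://github.com/huggingface/transformers.git
    print(f"Architecture not recognized: {error}")
```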
```
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.1.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
No GPU or other accelerator detected. Operations will be slow.
Loading model zai-org/GLM-4.7-Flash...
- Trying dtype auto... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
- Trying dtype float16... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
- Trying dtype bfloat16... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
- Trying dtype float32... Failed (The checkpoint you are trying to load has model type
`glm4_moe_lite` but Transformers does not recognize this architecture. This could be
because of an issue with the checkpoint, or because your version of Transformers is
out of date.
You can update Transformers with the command `pip install --upgrade transformers`.
If this does not work, and the checkpoint is very new, then there may not be a
release version that supports this model yet. In this case, you can get the most
up-to-date code by installing Transformers from source with the command
`pip install git+https://github.com/huggingface/transformers.git`)
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /usr/local/bin/heretic:8 in │
│ │
│ 5 from heretic.main import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /usr/local/lib/python3.12/dist-packages/heretic/main.py:576 in main │
│ │
│ 573 │ install() │
│ 574 │ │
│ 575 │ try: │
│ ❱ 576 │ │ run() │
│ 577 │ except BaseException as error: │
│ 578 │ │ # Transformers appears to handle KeyboardInterrupt (or BaseExc │
│ 579 │ │ # internally in some places, which can re-raise a different er │
│ │
│ /usr/local/lib/python3.12/dist-packages/heretic/main.py:133 in run │
│ │
│ 130 │ # Silence the warning about multivariate TPE being experimental. │
│ 131 │ warnings.filterwarnings("ignore", category=ExperimentalWarning) │
│ 132 │ │
│ ❱ 133 │ model = Model(settings) │
│ 134 │ │
│ 135 │ print() │
│ 136 │ print(f"Loading good prompts from [bold]{settings.good_prompts.dat │
│ │
│ /usr/local/lib/python3.12/dist-packages/heretic/model.py:92 in __init__ │
│ │
│ 89 │ │ │ break │
│ 90 │ │ │
│ 91 │ │ if self.model is None: │
│ ❱ 92 │ │ │ raise Exception("Failed to load model with all configured │
│ 93 │ │ │
│ 94 │ │ print(f"* Transformer model with [bold]{len(self.get_layers()) │
│ 95 │ │ print("* Abliterable components:") │
╰──────────────────────────────────────────────────────────────────────────────╯
Exception: Failed to load model with all configured dtypes.
```