
Support for GLM Models #1701

@ashmalvayani

Description


⚠️ Please check that this feature request hasn't been suggested before.

  • I searched previous Ideas in Discussions and didn't find any similar feature requests.
  • I searched previous Issues and didn't find any similar feature requests.

🔖 Feature description

Could you add support for the recently released GLM models, such as GLM-4-9b-chat, to axolotl?

✔️ Solution

I have tried the following config, but it doesn't run successfully:

base_model: THUDM/glm-4-9b-chat
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

trust_remote_code: true

load_in_8bit: true
load_in_4bit: false
strict: false

datasets:
    - path: Datasets2.json
      ds_type: json
      type: alpaca
dataset_prepared_path: last_run_prepared/glm
val_set_size: 0
output_dir: ./outputs/glm-lora

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 64
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: ArabicGLM-8B
wandb_entity:
wandb_watch:
wandb_name: ArabicGLM-8B
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 4
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: 

warmup_steps: 95
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
save_total_limit: 5
save_steps: 200

debug:
deepspeed: deepspeed_configs/zero2.json
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  pad_token: "<|endoftext|>"
  eos_token: "<|endoftext|>"
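
As a quick sanity check on the special_tokens section above, something like the following can be used to confirm what the GLM-4 tokenizer actually exposes (a minimal sketch, assuming a transformers version that can load the model's custom tokenizer code via trust_remote_code):

from transformers import AutoTokenizer

# Load the GLM-4 tokenizer with its custom code enabled, matching
# trust_remote_code: true in the config above.
tok = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)

# Confirm the tokens referenced under special_tokens exist on the tokenizer.
print(tok.eos_token, tok.pad_token)

# Inspect which keys the tokenizer returns; if it emits position_ids here,
# that would help explain what the collator later fails to pad.
print(tok("hello world").keys())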

Error Trace:

  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/monkeypatch/data/batch_dataset_fetcher.py", line 32, in fetch
    return self.collate_fn(data)
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/utils/collators.py", line 106, in __call__
    features = self.tokenizer.pad(
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3355, in pad
    return BatchEncoding(batch_outputs, tensor_type=return_tensors)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 224, in __init__
    self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 775, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`position_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
Traceback (most recent call last):
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 759, in convert_to_tensors
    tensor = as_tensor(value)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 721, in as_tensor
    return torch.tensor(value)
ValueError: expected sequence of length 773 at dim 1 (got 438) 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/cli/train.py", line 59, in <module>
    fire.Fire(do_cli)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/cli/train.py", line 35, in do_cli
    return do_train(parsed_cfg, parsed_cli_args)
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/cli/train.py", line 55, in do_train
    return train(cfg=cfg, cli_args=cli_args, dataset_meta=dataset_meta)
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/train.py", line 163, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py", line 1859, in train
    return inner_training_loop(
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py", line 2165, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/data_loader.py", line 452, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/monkeypatch/data/batch_dataset_fetcher.py", line 32, in fetch
    return self.collate_fn(data)
  File "/mnt/beegfs/fahad.khan/axolotl/src/axolotl/utils/collators.py", line 106, in __call__
    features = self.tokenizer.pad(
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3355, in pad
    return BatchEncoding(batch_outputs, tensor_type=return_tensors)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 224, in __init__
    self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
  File "/home/ashmal.vayani/anaconda3/envs/axolotl/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 775, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`position_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).

I think the main error lies in this line, since I've already set padding to true in the config:

ValueError: expected sequence of length 773 at dim 1 (got 438)
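
If the underlying problem is that per-example position_ids of different lengths reach the collator (torch.tensor cannot stack ragged lists, which is exactly what the "expected sequence of length 773 ... (got 438)" message describes), one possible workaround is to pad or drop that feature before tokenizer.pad builds the batch. The sketch below only illustrates the idea and is not axolotl's actual collator; the helper names and the choice to extend the position sequence are assumptions:

def pad_position_ids(features, max_len=None):
    """Right-pad the position_ids of each example to a common length.

    features is assumed to be the list of dicts handed to the data collator.
    If the GLM tokenizer emits nested (2D) position ids, they would instead
    need to be flattened or removed entirely.
    """
    if not features or "position_ids" not in features[0]:
        return features
    target = max_len or max(len(f["position_ids"]) for f in features)
    for f in features:
        pos = list(f["position_ids"])
        # Continue the position sequence over the padded region; zeros would
        # also work, since padded tokens are masked out by the attention mask.
        pos.extend(range(len(pos), target))
        f["position_ids"] = pos
    return features

# Alternatively, dropping the feature sidesteps the padding issue entirely,
# at the cost of letting the model recompute positions itself:
def drop_position_ids(features):
    for f in features:
        f.pop("position_ids", None)
    return features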

❓ Alternatives

📝 Additional Context

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this feature has not been requested yet.
  • I have provided enough information for the maintainers to understand and evaluate this request.

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)
Type: No type
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests
