System Info
- `transformers` version: 4.27.0.dev0
- Platform: Linux-5.15.0-1026-aws-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.12.0
- PyTorch version (GPU?): 1.13.1+cu117 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigcode/santacoder-fast-inference")
```
Expected behavior
The model should load without issue, but instead we get a series of errors of the form:
```
RuntimeError: Error(s) in loading state_dict for GPTBigCodeLMHeadModel:
	size mismatch for transformer.h.0.attn.c_attn.weight: copying a param with shape torch.Size([2048, 2304]) from checkpoint, the shape in current model is torch.Size([2304, 2048]).
	size mismatch for transformer.h.0.mlp.c_fc.weight: copying a param with shape torch.Size([2048, 8192]) from checkpoint, the shape in current model is torch.Size([8192, 2048]).
	size mismatch for transformer.h.0.mlp.c_proj.weight: copying a param with shape torch.Size([8192, 2048]) from checkpoint, the shape in current model is torch.Size([2048, 8192]).
```
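Note that every reported checkpoint shape is the exact transpose of the shape the current model expects, which suggests the checkpoint was saved with a transposed weight layout. As a hypothetical workaround (not a confirmed fix — it assumes the mismatch really is a pure transpose and nothing else differs), one could transpose the affected 2-D tensors in the state dict before loading:

```python
# Sketch of a possible workaround: transpose any 2-D checkpoint tensor whose
# shape is the reverse of what the current model expects. This assumes the
# only discrepancy is a transposed weight layout.
import torch


def transpose_mismatched(state_dict, model):
    """Return a copy of state_dict with transposed 2-D tensors fixed up
    to match the shapes in model.state_dict()."""
    model_sd = model.state_dict()
    fixed = {}
    for name, tensor in state_dict.items():
        expected = model_sd.get(name)
        if (
            expected is not None
            and tensor.ndim == 2
            and tensor.shape != expected.shape
            and tuple(tensor.shape) == tuple(expected.shape)[::-1]
        ):
            # Checkpoint stores the transpose of the expected layout.
            tensor = tensor.t().contiguous()
        fixed[name] = tensor
    return fixed
```

One would then call `model.load_state_dict(transpose_mismatched(checkpoint, model))` on a manually loaded checkpoint; whether this is the right fix (versus a config/architecture mismatch) would need confirmation from the model authors.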