Running santacoder-fast-inference throws UserWarning: FALLBACK path has been taken inside

### System Info

- `transformers` version: 4.27.0.dev0
- Platform: Linux-4.18.0 x86_64-with-glibc2.28
- Python version: 3.9.0
- Huggingface_hub version: 0.11.1
- PyTorch version (GPU?): 1.13.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No

### Who can help?

While testing the `bigcode/santacoder-fast-inference` model on the `openai_human_eval` dataset, I get the following warning. Is this something to be concerned about?

```bash
anaconda3/envs/NLPWorkSpace/lib/python3.9/site-packages/transformers-4.27.0.dev0-py3.9.egg/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py:259: UserWarning: FALLBACK path has been taken inside: runCudaFusionGroup. This is an indication that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable `export PYTORCH_NVFUSER_DISABLE=fallback`
 (Triggered internally at /opt/conda/conda-bld/pytorch_1670525551200/work/torch/csrc/jit/codegen/cuda/manager.cpp:331.)
  attn_weights = upcast_masked_softmax(attn_weights, attention_mask, mask_value, unscale, softmax_dtype)
```
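For anyone following the hint in the warning, here is a minimal sketch of setting that variable from Python instead of the shell. The helper name `disable_nvfuser_fallback` is hypothetical, and setting the variable before `torch` is imported is an assumption made to be safe; I have not verified when PyTorch reads it.

```python
import os


def disable_nvfuser_fallback(env=None):
    """Set PYTORCH_NVFUSER_DISABLE=fallback, as the warning text suggests.

    `env` defaults to the real process environment; a plain dict can be
    passed instead, which is convenient for testing.
    """
    env = os.environ if env is None else env
    env["PYTORCH_NVFUSER_DISABLE"] = "fallback"
    return env


# Assumption: do this before importing torch so the setting is picked up.
disable_nvfuser_fallback()
# import torch  # import torch (and transformers) only after this point
```

With the fallback path disabled, the underlying codegen failure should surface as an error rather than being hidden behind the warning, which is what the message recommends for debugging.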

@mayank31398 @joel

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

Running inference on OpenAI's HumanEval dataset triggers this warning, specifically when I pass `temperature = 0.2` and `top_p = 0.2` to the `model.generate` method.
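A rough sketch of the kind of call described above, not the exact script that was run. Only the model name, `temperature`, and `top_p` come from this report. The helper names, the prompt, `do_sample=True`, `max_new_tokens`, and the `cuda` device are assumptions added for illustration, and loading the model may need the same `transformers` build listed under System Info.

```python
def build_generation_kwargs(temperature=0.2, top_p=0.2, max_new_tokens=64):
    """Collect the sampling settings passed to `model.generate`.

    `do_sample=True` and `max_new_tokens` are assumptions; temperature and
    top_p only affect the output when sampling is enabled.
    """
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }


def run_once(prompt, model_name="bigcode/santacoder-fast-inference", device="cuda"):
    """Generate one completion for `prompt` (needs a GPU and network access)."""
    # Imported lazily so the settings helper above works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **build_generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call with a hypothetical prompt standing in for a HumanEval problem:
# print(run_once("def fibonacci(n):\n"))
```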

### Expected behavior

No warning should be emitted.
