Thanks for your great work!

I looked into the prompt_reuse script. It basically first feeds the `INITIAL_PROMPT` through the model:
```python
from transformers import DynamicCache

# Pre-fill the cache with the shared prefix
prompt_cache = DynamicCache()
inputs = tokenizer(INITIAL_PROMPT, return_tensors="pt").to("cuda")
with torch.no_grad():
    prompt_cache = model(**inputs, past_key_values=prompt_cache).past_key_values
```
Then, to use this cache with another prompt suffix, they use:
```python
import copy

prompt = "Why are french people obsessed with french?"
new_inputs = tokenizer(INITIAL_PROMPT + prompt, return_tensors="pt").to("cuda")
# Copy the cache so the original can be reused for other suffixes
past_key_values = copy.deepcopy(prompt_cache)
outputs = model.generate(**new_inputs, past_key_values=past_key_values, max_new_tokens=20)
```
However, I am trying to understand whether the `INITIAL_PROMPT` tokens carry any meaning beyond serving as `position_ids` placeholders. Say we changed `INITIAL_PROMPT` to a random prompt with the same token length: would we expect different results? I assume that since the KV for these tokens is taken from the cache, they act only as placeholders.
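The experiment described above can be sketched with a toy single-head attention layer (all weights, dimensions, and prefixes below are made up for illustration, not the repo's code). The cached K/V tensors are projections of the actual prefix tokens, so two same-length prefixes generally yield different attention outputs for the same suffix query:

```python
import torch

torch.manual_seed(0)
d = 8
# Hypothetical projection weights of one attention head
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

def build_cache(prefix_x):
    # Analogue of pre-filling past_key_values with the prefix
    return (prefix_x @ Wk, prefix_x @ Wv)

def attend(query_x, cache):
    K, V = cache
    # Append the new token's K/V, as generation with a cache does
    K = torch.cat([K, query_x @ Wk])
    V = torch.cat([V, query_x @ Wv])
    attn = torch.softmax((query_x @ Wq) @ K.T / d**0.5, dim=-1)
    return attn @ V

prefix_a = torch.randn(5, d)
prefix_b = torch.randn(5, d)   # same length, different "tokens"
query = torch.randn(1, d)

out_a = attend(query, build_cache(prefix_a))
out_b = attend(query, build_cache(prefix_b))
print(torch.allclose(out_a, out_b))  # False: cached content matters
```

In this toy setting the cache is not just a positional placeholder; swapping the prefix changes the attended values and hence the output.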
Thanks!