Hello,
Can you send me a sample Python inference code snippet for a LLaMA 7B SFT model that shows how to run inference?
I know the special tokens in the code are used so the model can learn the difference between user and assistant prompts. But I want to know: once the user enters a prompt/message, what does the inference pipeline look like? For example, which EOS tokens and stop sequences are used, and where and how are the special tokens applied to frame the user and assistant turns?
I was looking through the inference code, but it's quite complex to understand for a new user.
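For context, here is my rough understanding of the prompt-framing step as a minimal sketch. I'm assuming the LLaMA-2-chat style template here (`<s>`, `</s>`, `[INST]`, `[/INST]`); the exact tokens depend on how the model was fine-tuned, so please correct me if the SFT model uses a different format:

```python
# Sketch of LLaMA-2-chat style prompt framing (template is an assumption;
# check the model card / training code for the exact format used in SFT).

BOS, EOS = "<s>", "</s>"          # sequence start/end markers
B_INST, E_INST = "[INST]", "[/INST]"  # wrap each user turn

def format_prompt(turns):
    """Build a prompt from a list of (user, assistant) pairs.

    The final pair may have assistant=None, meaning the model should
    generate that reply. Each completed assistant turn is closed with
    EOS, which is also the stop sequence at generation time.
    """
    prompt = ""
    for user, assistant in turns:
        prompt += f"{BOS}{B_INST} {user.strip()} {E_INST}"
        if assistant is not None:
            prompt += f" {assistant.strip()} {EOS}"
    return prompt

# Single-turn: the model generates everything after [/INST]
print(format_prompt([("What is SFT?", None)]))
# Multi-turn: previous assistant replies are closed with </s>
print(format_prompt([("Hi", "Hello!"), ("How are you?", None)]))
```

At generation time, I believe you would tokenize this string (with `add_special_tokens=False`, since the BOS/EOS markers are already in the text), call `model.generate(...)` with `eos_token_id` set to the tokenizer's EOS id, and stop decoding once the model emits EOS. Is that the right picture?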