Description
Hi,
I am new to this repo, but it seems like exactly what I am looking for to build training datasets from unstructured text. One thing I am confused about is how to properly manage the token limits of the models I am using. I am running models on an inference server in my lab, but because of hardware constraints I am limited to Llama 3 (8k context) and Llama 3.1 (128k normally, but I only have about 50k available because of GPU memory).
I am getting errors when the pipeline processes documents that exceed the available context length, which makes sense, but I'm not sure whether this is a configuration issue or an input-preparation issue. Is there a setting where I should specify my model's maximum context length, or do I need to break the input texts into smaller pieces to fit the shorter context (as in the sketch below)?
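If the answer is that I should pre-chunk the inputs myself, I was thinking of something along these lines. This is just a rough sketch of what I mean, assuming a Hugging Face tokenizer; the model id, token budget, and headroom numbers are placeholders for my setup, not anything from this repo:

```python
from transformers import AutoTokenizer

# Placeholder values for my setup, not repo settings:
MODEL_NAME = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed local model id
MAX_CONTEXT = 8192    # Llama 3 context window
RESERVED = 2048       # headroom for the prompt template + generated output
CHUNK_TOKENS = MAX_CONTEXT - RESERVED

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def chunk_document(text: str, chunk_tokens: int = CHUNK_TOKENS) -> list[str]:
    """Split text into pieces of at most chunk_tokens tokens each."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    return [
        tokenizer.decode(ids[i : i + chunk_tokens])
        for i in range(0, len(ids), chunk_tokens)
    ]
```

But if there is already a config option that caps the context length and handles chunking internally, I would much rather use that than roll my own.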
Thanks