Replies: 1 comment
Not sure whether you're still interested in this issue, but LlamaDiskCache does the job, even though it currently has a slight bug.
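For reference, a rough sketch of the idea, assuming this is about llama-cpp-python. As far as I know, the built-in route is `llm.set_cache(LlamaDiskCache(cache_dir=...))`. The code below does roughly the same thing by hand with `save_state()` / `load_state()` and `pickle`, so it is easier to see what gets stored. The helper names (`state_path`, `load_or_build`, `warm_llama`), the cache layout, and `n_ctx=4096` are my own assumptions, not part of the library, and I have not run `warm_llama` against a real model.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable


def state_path(prompt: str, cache_dir: Path) -> Path:
    """Map a prompt to a file name derived from its SHA-256 digest."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.pkl"


def load_or_build(prompt: str, build: Callable[[str], Any], cache_dir: Path) -> Any:
    """Return the saved state for `prompt`; call `build` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = state_path(prompt, cache_dir)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    state = build(prompt)
    with path.open("wb") as f:
        pickle.dump(state, f)
    return state


def warm_llama(model_path: str, prompt: str, cache_dir: Path):
    """Hypothetical wrapper: load a model and restore the ingested prompt state."""
    # Imported lazily so the two helpers above work without llama-cpp-python.
    from llama_cpp import Llama

    # n_ctx must be large enough for the prompt; 4096 is an assumed value.
    llm = Llama(model_path=model_path, n_ctx=4096)

    def ingest(text: str):
        # Slow path: evaluate the prompt once and snapshot the model state.
        llm.reset()
        llm.eval(llm.tokenize(text.encode("utf-8")))
        return llm.save_state()

    # Fast path on later runs: the pickled state is read back from disk.
    llm.load_state(load_or_build(prompt, ingest, cache_dir))
    return llm
```

Two caveats: the saved state is only valid for the same model file and context settings, so the cache key should probably include those as well, and the state file for a 2000-token prompt can be large. Also, `pickle` should only be used on files you wrote yourself.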
Is it possible to preprocess a set of prompts, save them to disk, and then reload them during inference to avoid the need to regenerate cached prompts every time? For instance, if I have a 2000-token prompt that I use daily in a memory-intensive Python program, is there a way to pre-process and save it to avoid the delay of ingesting the prompt each time I start the program? What are the options in this scenario?