If the chatbot needs to generate longer answers in the future, you should increase the max_new_tokens limit accordingly.
The setting can be found here:
def generate_response(self, prompt, max_new_tokens=500, num_return_sequences=1):
You can use PR #105 as a reference for how to set it.
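As a rough illustration of the two ways to raise the limit (editing the default in the signature, or overriding it per call), here is a minimal sketch. The ChatBot class and its body are hypothetical stand-ins rather than the project's actual code; only the generate_response signature is taken from the line above. The stub pretends the model wants to emit 800 tokens so that the effect of the cap is visible.

```python
class ChatBot:
    """Hypothetical stand-in for the real chatbot class (assumption, not project code)."""

    def generate_response(self, prompt, max_new_tokens=500, num_return_sequences=1):
        # Stand-in for the real model call: pretend the model wants 800 tokens.
        full_answer = ["tok"] * 800
        # The cap cuts the answer off once max_new_tokens is reached.
        truncated = full_answer[:max_new_tokens]
        return [" ".join(truncated) for _ in range(num_return_sequences)]


bot = ChatBot()

# With the default limit, the answer is cut off at 500 tokens.
default_len = len(bot.generate_response("Explain the setup")[0].split())

# Overriding the limit for one call lets the full answer through.
longer_len = len(bot.generate_response("Explain the setup", max_new_tokens=1000)[0].split())

print(default_len, longer_len)
```

Overriding per call keeps the default conservative for short chats; changing the default in the signature is the simpler option if every answer needs the extra room.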