Replies: 1 comment
🤖 Hello, Thank you for reaching out and providing detailed information about your issue. The gibberish you're seeing when `streaming=True` is enabled is the chain streaming its intermediate steps (its thought process, such as the condensed follow-up question) rather than only the final answer. However, there's a way to only stream the final response and not the thought process. You can achieve this by using a non-streaming LLM for the question generator and a streaming LLM for the document chain:

```python
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT, verbose=False)
doc_chain = load_qa_with_sources_chain(
    llm=streaming_llm, chain_type="map_reduce", verbose=False
)
```

This will only stream the final response and not the thought process. Please note that when you define your streaming model (the `streaming_llm` passed to the document chain above), you need to set `streaming=True` and attach a streaming callback handler:

```python
streaming_llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0.1,
    openai_api_key=OPENAI_KEY,
    streaming=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    verbose=True,
)
```

In this case, the `StreamingStdOutCallbackHandler` prints each token of the final answer to stdout as it is generated, while the question generator's output is never streamed. You can find more information about this in a similar solved issue: Support for streaming when using LLMChain? I hope this helps! If you have any other questions or if the issue persists, please let me know.
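For reference, here is a minimal sketch of how these two pieces might be wired together into a `ConversationalRetrievalChain`. It assumes a legacy (0.0.x) LangChain install; `vectorstore` and `OPENAI_KEY` are placeholders for your own retriever and API key, and import paths vary slightly across 0.0.x versions:

```python
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import ConversationalRetrievalChain, LLMChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.chat_models import ChatOpenAI

# Non-streaming model: condenses the follow-up question, so its
# intermediate output never reaches the console.
llm = ChatOpenAI(
    model_name="gpt-3.5-turbo", temperature=0.1, openai_api_key=OPENAI_KEY
)

# Streaming model: produces the final answer and streams it to stdout.
streaming_llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0.1,
    openai_api_key=OPENAI_KEY,
    streaming=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)

question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_with_sources_chain(llm=streaming_llm, chain_type="map_reduce")

chain = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),  # placeholder: your vector store
    combine_docs_chain=doc_chain,
    question_generator=question_generator,
)

# Only the final answer is streamed token by token; the condensed
# question is generated silently by the non-streaming `llm`.
result = chain({"question": "What does the document say?", "chat_history": []})
```

Because only `streaming_llm` has a callback handler attached, only the document chain's final answer is written to stdout as it is generated.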
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
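As a side note to the reply above: if you want to capture the streamed tokens rather than print them (for example, to forward them to a web client), the same callback interface can be subclassed. A minimal sketch, assuming the legacy LangChain callback API; the class name `TokenCollector` is hypothetical:

```python
from langchain.callbacks.base import BaseCallbackHandler

class TokenCollector(BaseCallbackHandler):
    """Hypothetical handler that collects streamed tokens in a list."""

    def __init__(self):
        self.tokens = []

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # Invoked once per generated token while streaming=True.
        self.tokens.append(token)
```

Passing `CallbackManager([TokenCollector()])` to `streaming_llm` in place of the stdout handler would accumulate the final answer's tokens without echoing anything to the console.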
Hi! I've noticed that when I enable `streaming=True`, the LLM usually streams gibberish to the console and then, a couple of seconds after streaming finishes, returns a coherent response. Is there a way to have it stream the coherent response instead? Here's my current code:

Here's a sample gibberish streaming response outputted to the console:
And here's the coherent response returned after: