Are the tokens in ChatBufferMemory also charged? #10247
Replies: 1 comment
🤖 Yes, the tokens in the ChatBufferMemory are also charged. When you call the OpenAI API, you are billed for the total number of tokens in the API call, both input and output. In your case, when you store the first input and output in a ChatTokenBufferMemory and then make subsequent calls to the OpenAI API, the tokens from the chat history (ChatBufferMemory) are included in the total token count of each call, because the chat history is fed into the API call every time. Token usage is tracked by the LangChain framework itself; the actual number of tokens used in a request is accumulated in the `update_token_usage` function. Here is the relevant code from the LangChain repository that handles token usage:

```python
from typing import Any, Dict, Set


def update_token_usage(
    keys: Set[str], response: Dict[str, Any], token_usage: Dict[str, Any]
) -> None:
    """Update token usage."""
    # Only track the usage keys that are present in the API response.
    _keys_to_use = keys.intersection(response["usage"])
    for _key in _keys_to_use:
        if _key not in token_usage:
            token_usage[_key] = response["usage"][_key]
        else:
            # Accumulate usage across successive API calls.
            token_usage[_key] += response["usage"][_key]
```

So, while your intention to save money by reducing the number of tokens in your prompt is a good one, keep in mind that the tokens from the chat history are also included in the total token count for each API call.
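To see how this accumulates across calls, here is a minimal sketch reusing `update_token_usage` from above, with made-up numbers (the `usage_keys` set and both response dictionaries are hypothetical, not taken from a real API response): because the memory replays the first exchange, the second call's `prompt_tokens` can be larger than the first call's even though the new message itself is shorter.

```python
from typing import Any, Dict

usage_keys = {"prompt_tokens", "completion_tokens", "total_tokens"}
token_usage: Dict[str, Any] = {}

# Call 1: full instruction prompt plus the first text.
resp1 = {"usage": {"prompt_tokens": 900, "completion_tokens": 150, "total_tokens": 1050}}
# Call 2: short "do the same" follow-up, but the whole first exchange is
# re-sent from memory, so prompt_tokens grows instead of shrinking.
resp2 = {"usage": {"prompt_tokens": 1200, "completion_tokens": 150, "total_tokens": 1350}}

update_token_usage(usage_keys, resp1, token_usage)
update_token_usage(usage_keys, resp2, token_usage)
print(token_usage)
# e.g. {'prompt_tokens': 2100, 'completion_tokens': 300, 'total_tokens': 2400}
```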
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
I'm trying to extract info from a bunch of input texts with LangChain and GPT-3.5. Instead of using the same prompt template with the instructions and substituting the input text each time, I'm thinking of storing the first input and output in a ChatTokenBufferMemory, so that in later calls to the OpenAI API I can just say "can you do the same to this text". But do the tokens in ChatBufferMemory also cost money? My original intention is to save money, because "can you do the same to this text" has far fewer tokens than my whole prompt template with the instructions, but I'm wondering whether the chat history is also fed into the API call each time.
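Here is a rough sketch of the setup I have in mind, assuming the LangChain 0.0.x API (the token-buffer memory class I found there is ConversationTokenBufferMemory; the model name, token limit, and texts are placeholders):

```python
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationTokenBufferMemory

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Token-bounded chat history shared across calls.
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=2000)
chain = ConversationChain(llm=llm, memory=memory)

# Call 1: the full instruction template plus the first text.
chain.predict(input="<long instruction template> Text: <text 1>")

# Call 2: a short follow-up, but the buffered first exchange is
# prepended to the prompt, so those tokens are sent (and billed) again.
chain.predict(input="Can you do the same to this text? Text: <text 2>")
```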