tokens and usage cost #3397
I am using Hugging Face endpoints for my LLM and have set up LangChain and Langfuse callbacks to get traces, but I want to know the tokens used and the cost for each trace.
Replies: 2 comments
Hey @Karanaiinfox! I'm here to assist you with any bugs, questions, or contributions you have. Just let me know how I can help! To determine the tokens used and the cost for each trace when using Hugging Face endpoints with LangChain and Langfuse callbacks, you can use explicit cost and token ingestion. Here's how you can do it:
Here is a sketch of how you might implement this, using the Langfuse low-level SDK to report usage explicitly. The model name, API keys, and the token/cost values are placeholders — adapt them to your endpoint:

```python
# Sketch: explicit token and cost ingestion with the Langfuse low-level SDK.
from langchain_huggingface import HuggingFaceEndpoint
from langfuse import Langfuse

langfuse = Langfuse(public_key="pk-...", secret_key="sk-...")

# Initialize your Hugging Face endpoint
llm = HuggingFaceEndpoint(
    repo_id="your_model_name",
    huggingfacehub_api_token="your_hugging_face_api_key",
)
completion = llm.invoke("your prompt")

# Report token usage and cost explicitly on a generation observation
trace = langfuse.trace(name="hf-endpoint-call")
trace.generation(
    name="generation",
    model="your_model_name",
    input="your prompt",
    output=completion,
    usage={
        "input": 120,          # prompt tokens (from your endpoint or a tokenizer)
        "output": 80,          # completion tokens
        "total": 200,
        "unit": "TOKENS",
        "total_cost": 0.0004,  # USD, computed from your endpoint's pricing
    },
)
langfuse.flush()
```

This sketch assumes you can obtain token counts from your endpoint's response, or by tokenizing the prompt and completion yourself, since Hugging Face endpoints do not always return usage information. Additionally, Langfuse offers automated token counts for OpenAI and Anthropic models, which can simplify the process: Langfuse automatically calculates token counts based on the ingested prompts and completions, reducing the integration effort. For more detailed information and examples, please refer to the cost and tokens page in the Langfuse documentation [1][2][3][4][5].
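If your endpoint only returns token counts (or none at all), you can derive the cost yourself before ingesting it into Langfuse. A minimal sketch, assuming hypothetical per-million-token prices — substitute your endpoint's actual rates:

```python
# Hypothetical per-million-token prices -- replace with your endpoint's real rates.
INPUT_PRICE_PER_M = 0.50   # USD per 1M prompt tokens (assumed)
OUTPUT_PRICE_PER_M = 1.50  # USD per 1M completion tokens (assumed)

def compute_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the total USD cost for one generation from its token counts."""
    return (
        input_tokens * INPUT_PRICE_PER_M / 1_000_000
        + output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    )

# Example: 1200 prompt tokens and 400 completion tokens
print(round(compute_cost(1200, 400), 6))  # 0.0012
```

The resulting number is what you would pass as the cost value in the usage object when ingesting the generation.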
If this does not work out of the box via the LangChain integration, please open a bug report issue. You should see token counts in Langfuse.