Python
IntelliNode edited this page Sep 9, 2023 · 3 revisions
You can run evaluations across multiple models and select the most suitable model for your use case based on quantitative methods.
Below is an example that compares the Llama 13b-chat, OpenAI gpt-3.5, and Cohere command models.
```python
import requests
import json

url = "http://localhost/evaluate/llm"

payload = json.dumps({
  "userInput": "User input or question.",
  "targetAnswers": [
    "optimal answer example1.",
    "optimal answer example2.",
    "optimal answer example3."
  ],
  "semantic": {
    "api_key": "",
    "provider": "openai"
  },
  "evaluate": [
    {
      "apiKey": "",
      "provider": "replicate",
      "type": "chat",
      "model": "13b-chat",
      "maxTokens": 50
    },
    {
      "apiKey": "",
      "provider": "cohere",
      "type": "completion",
      "model": "command",
      "maxTokens": 50
    },
    {
      "apiKey": "",
      "provider": "openai",
      "type": "chat",
      "model": "gpt-3.5-turbo",
      "maxTokens": 50,
      "temperature": 0.7
    }
  ]
})

headers = {
  'X-API-KEY': '<microservice-key>',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
```

Below is a snapshot of the expected output format:
```
{
  'openai/gpt-3.5-turbo': [
    {
      prediction: 'Photosynthesis is how plants make food for themselves....',
      score_cosine_similarity: 0.9566836802012463,
      score_euclidean_distance: 0.29175853870023755
    }
  ],
  'cohere/command': [
    {
      prediction: "Photosynthesis is the process by which plants use the energy .....",
      score_cosine_similarity: 0.9378139154300577,
      score_euclidean_distance: 0.3512465738424273
    }
  ],
  'replicate/13b-chat': [
    {
      prediction: "Here's an explanation of photosynthesis in simple terms .....",
      score_cosine_similarity: 0.9096764395396765,
      score_euclidean_distance: 0.4248874961328429
    }
  ],
  lookup: {
    cosine_similarity: 'a value closer to 1 indicates a higher degree of similarity between two vectors',
    euclidean_distance: 'the lower the value, the closer the two points'
  }
}
```
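To make the `lookup` descriptions concrete, here is a minimal sketch of how the two metrics are commonly computed between two embedding vectors. This is an illustration of the standard formulas, not the microservice's internal code; the vectors are made-up examples.

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); a value closer to 1 means the vectors point
    # in more similar directions
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # straight-line distance between the two points; lower means closer
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# made-up embedding vectors for illustration
v1 = [1.0, 2.0, 3.0]
v2 = [1.0, 2.1, 2.9]
print(cosine_similarity(v1, v2))   # close to 1 for near-identical vectors
print(euclidean_distance(v1, v2))  # small for near-identical vectors
```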
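Once the evaluation comes back, you can pick a model programmatically. The sketch below assumes the response body parses as JSON with the same keys as the snapshot above (the `response_text` literal here is a shortened, hypothetical stand-in for `response.text`); it averages each model's cosine-similarity scores and selects the highest.

```python
import json

# hypothetical response body shaped like the snapshot above, assumed to be
# valid JSON; in practice you would use response.text from the request
response_text = '''
{
  "openai/gpt-3.5-turbo": [{"prediction": "...", "score_cosine_similarity": 0.9566, "score_euclidean_distance": 0.2917}],
  "cohere/command": [{"prediction": "...", "score_cosine_similarity": 0.9378, "score_euclidean_distance": 0.3512}],
  "replicate/13b-chat": [{"prediction": "...", "score_cosine_similarity": 0.9096, "score_euclidean_distance": 0.4248}],
  "lookup": {"cosine_similarity": "...", "euclidean_distance": "..."}
}
'''

results = json.loads(response_text)

# average cosine similarity per model, skipping the "lookup" descriptions
scores = {
    model: sum(r["score_cosine_similarity"] for r in rows) / len(rows)
    for model, rows in results.items()
    if model != "lookup"
}

best_model = max(scores, key=scores.get)
print(best_model)  # the model whose answers are most similar to the targets
```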