1 | | -# Benchmark comparison for the models |
2 | | -- Hardware: Macbook pro 14 inches with m1 pro and 16 GB of ram |
| 1 | +# Benchmark analysis |
| 2 | +# Local models |
| 3 | +The three benchmarked websites are:
| 4 | +- Example 1: https://perinim.github.io/projects |
| 5 | +- Example 2: https://www.wired.com |
| 6 | +- Example 3: https://www.amazon.it/s?k=alexa&__mk_it_IT=ÅMÅŽÕÑ&crid=1WWVF1RGDBBSB&sprefix=alex%2Caps%2C114&ref=nb_sb_noss_2 |
3 | 7 |
| 8 | +Times are measured in seconds.
| 9 | + |
| 10 | +The model run for this benchmark is Mistral on Ollama, with nomic-embed-text for embeddings.
| 11 | + |
| 12 | +| Hardware | Example 1 | Example 2 | Example 3 | |
| 13 | +| ----------------------- | --------- | --------- | --------- | |
| 14 | +| Macbook pro 14 inches | 26.10 | 60.915 | 200.77 |
| 15 | +| Ubuntu with Radeon M260 | 296.98 | 1003.56 | / | |
| 16 | +**Note**: the Docker examples were run only on the Macbook because performance is too slow (about 10 times slower than Ollama). The results are the following:
| 17 | + |
| 18 | +| Hardware | Example 1 | Example 2 | Example 3 | |
| 19 | +| --------------------- | --------- | --------- | --------- | |
| 20 | +| Macbook pro 14 inches | 240.22 | 612.48 | 2008.32 | |
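
The "10 times slower" figure in the note can be checked directly from the two Macbook rows. The following is a minimal sketch in Python; the dictionary and function names are illustrative and not part of the benchmark code.

```python
# Execution times in seconds, copied from the two Macbook rows above.
ollama_seconds = {"Example 1": 26.10, "Example 2": 60.915, "Example 3": 200.77}
docker_seconds = {"Example 1": 240.22, "Example 2": 612.48, "Example 3": 2008.32}


def slowdown(baseline, other):
    """Return how many times slower `other` is than `baseline`, per example."""
    return {name: other[name] / baseline[name] for name in baseline}


ratios = slowdown(ollama_seconds, docker_seconds)
for name, ratio in ratios.items():
    print(f"{name}: Docker is {ratio:.1f}x slower than Ollama")
```

With these numbers the ratio is about 9.2 for Example 1 and about 10 for Examples 2 and 3.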
| 21 | +# Performance on API services
4 | 22 | ### Example 1: personal portfolio |
5 | 23 | **URL**: https://perinim.github.io/projects |
6 | 24 | **Task**: List me all the projects with their description. |
7 | 25 |
8 | | -| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD | |
9 | | -| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- | |
10 | | -| gpt-3.5-turbo | 35.98 | 858 | 512 | 346 | 2 | 0.00146 | |
11 | | -| gpt-4-turbo-preview | 13.907 | 866 | 512 | 354 | 2 | 0.01574 | |
12 | | -| Ollama with Mistral and embeddings | 26.10 | 0 | 0 | 0 | 0 | 0 | |
13 | | -| Docker with Mistral and embeddings | 240.22 | 0 | 0 | 0 | 0 | 0 | |
| 26 | +| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD | |
| 27 | +| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- | |
| 28 | +| gpt-3.5-turbo | 35.98 | 858 | 512 | 346 | 2 | 0.00146 | |
| 29 | +| gpt-4-turbo-preview | 13.907 | 866 | 512 | 354 | 2 | 0.01574 | |
| 30 | + |
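The `total_cost_USD` column can be related to the token counts with a simple per-token calculation. The sketch below is in Python and relies on an assumption: the per-1K-token prices are not stated in this document. They were chosen because they reproduce the two Example 1 rows above, and the names `PRICES_PER_1K` and `estimate_cost_usd` are hypothetical.

```python
# Assumed prices in USD per 1,000 tokens (not stated in the document;
# picked because they reproduce the Example 1 rows).
PRICES_PER_1K = {
    "gpt-3.5-turbo": {"prompt": 0.0015, "completion": 0.002},
    "gpt-4-turbo-preview": {"prompt": 0.01, "completion": 0.03},
}


def estimate_cost_usd(model, prompt_tokens, completion_tokens):
    """Estimate the request cost from token counts and the assumed prices."""
    price = PRICES_PER_1K[model]
    total = prompt_tokens * price["prompt"] + completion_tokens * price["completion"]
    return total / 1000


# Token counts taken from the Example 1 table.
gpt35_cost = estimate_cost_usd("gpt-3.5-turbo", 512, 346)
gpt4_cost = estimate_cost_usd("gpt-4-turbo-preview", 512, 354)
print(round(gpt35_cost, 5), round(gpt4_cost, 5))
```

Under these assumed prices, gpt-4-turbo-preview costs roughly ten times more than gpt-3.5-turbo for almost the same token counts, while finishing in less than half the time.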
14 | 31 | ### Example 2: Wired |
15 | 32 | **URL**: https://www.wired.com |
16 | 33 | **Task**: List me all the articles with their description. |
17 | 34 |
18 | | -| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD | |
19 | | -| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- | |
20 | | -| gpt-3.5-turbo | 87.03 | 3780 | 3760 | 3000 | 2 | 0.01319 | |
21 | | -| gpt-4-turbo-preview | 74.90 | 5306 | 3060 | 2246 | 2 | 0.09798 | |
22 | | -| Ollama with Mistral and embeddings | 60.915 | 0 | 0 | 0 | 0 | 0 | |
23 | | -| Docker with Mistral and embeddings | 612.48 | 0<br> | 0<br> | 0<br> | 0<br> | 0<br> | |
| 35 | +| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD | |
| 36 | +| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- | |
| 37 | +| gpt-3.5-turbo | 87.03 | 3780 | 3760 | 3000 | 2 | 0.01319 | |
| 38 | +| gpt-4-turbo-preview | 74.90 | 5306 | 3060 | 2246 | 2 | 0.09798 | |
24 | 39 |
25 | 40 | ### Example 3: Amazon product page |
26 | 41 | **URL**: https://www.amazon.it/s?k=alexa&__mk_it_IT=ÅMÅŽÕÑ&crid=1WWVF1RGDBBSB&sprefix=alex%2Caps%2C114&ref=nb_sb_noss_2 |
27 | 42 | **Task**: List me all the articles with their the costs and image url. |
28 | 43 |
29 | | -| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD | |
30 | | -| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- | |
31 | | -| gpt-3.5-turbo | 145.55 | 26038 | 18091 | 7947 | 5 | 0.04303 | |
32 | | -| gpt-4-turbo-preview | 82.38 | 15640 | 13698 | 1942 | 2 | 0.19524 | |
33 | | -| Ollama with Llama2 and embeddings | 200.77 | 0<br> | 0<br> | 0<br> | 0<br> | 0<br> | |
34 | | -| Docker with Mistral and embeddings | 2008.32 | 0<br> | 0<br> | 0<br> | 0<br> | 0<br> | |
| 44 | +| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD | |
| 45 | +| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- | |
| 46 | +| gpt-3.5-turbo | 145.55 | 26038 | 18091 | 7947 | 5 | 0.04303 | |
| 47 | +| gpt-4-turbo-preview | 82.38 | 15640 | 13698 | 1942 | 2 | 0.19524 | |
| 48 | + |
35 | 49 | ## Hosting services |
36 | 50 | [[💻 Provider costs informations]] |