sensoris/semcache-benchmark

semcache-benchmark

Setup

Symlink the GPTCache benchmark folder into data: ln -s GPTCache/examples/benchmark data/

Extract the data in the GPTCache benchmark folder: tar -xvzf similiar_qqp_full.json.gz

Run the dummy upstream server: uvicorn dummy_server:app --reload --host 0.0.0.0 --port 8081

Run the load test: python similarity_test.py
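
The repository's dummy_server.py is not shown here. As a rough idea of what such an upstream could look like, the sketch below is a bare ASGI app with no framework dependency that waits for a configurable delay and then returns a fixed JSON body. The environment variable UPSTREAM_DELAY_SECONDS and the response payload are assumptions made for illustration, not taken from the repository.

```python
import asyncio
import json
import os


async def app(scope, receive, send):
    """Bare ASGI app: optionally sleep, then answer with a fixed JSON body."""
    if scope["type"] != "http":
        return  # ignore non-HTTP scopes such as lifespan

    # Drain the request body before answering.
    while True:
        message = await receive()
        if message["type"] != "http.request" or not message.get("more_body", False):
            break

    # Hypothetical knob: 0 simulates an instant upstream, 3 a slow one.
    delay = float(os.environ.get("UPSTREAM_DELAY_SECONDS", "0"))
    await asyncio.sleep(delay)

    body = json.dumps({"response": "dummy"}).encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})
```

If this were saved as dummy_server.py, the uvicorn command above should be able to serve it, and setting UPSTREAM_DELAY_SECONDS=3 would approximate the slow-upstream scenario from the notes below.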

Notes from stress test

Cache eviction

The cache eviction seems to work:

  • memory usage stays constant
  • I see plenty of eviction logs

Miss latency

  • Average miss latency is 3.03 seconds when the dummy upstream server sleeps for 3 seconds and the cache size is small and fixed, which leaves roughly 0.03 seconds for everything else on the miss path, eviction included
  • Average total miss latency when the dummy upstream returns instantly is 0.08 ms

Hit latency

  • Hit latency is consistently around 0.03 ms

Memory usage

  • 682 entries: 190 MB
  • 1798 entries: 194 MB
  • 2888 entries: 200 MB
  • 5334 entries: 204 MB
  • 7583 entries: 209 MB
  • 9488 entries: 212 MB
  • 11310 entries: 213 MB

A linear regression on these points shows memory growing by roughly 0.002 MB (about 2 KB) per new entry, with an intercept of about 191 MB at startup (system overhead).
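
The fit can be reproduced from the seven measurements listed above. The sketch below is a plain ordinary-least-squares fit using only the standard library. The helper name fit_line is made up for illustration and is not part of the benchmark code.

```python
# Cache size (entries) and observed memory (MB), copied from the list above.
SAMPLES = [
    (682, 190), (1798, 194), (2888, 200), (5334, 204),
    (7583, 209), (9488, 212), (11310, 213),
]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(SAMPLES)
print(f"slope: {slope:.4f} MB per entry, intercept: {intercept:.1f} MB")
```

On these points the slope comes out at roughly 0.002 MB per entry and the intercept at roughly 191 MB. With only seven samples, and growth that flattens at the larger sizes, the per-entry figure is best read as an estimate.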

Request rate

  • Reached 45 req/s without issues, with everything running locally
