A minimal Python demo showing how to use Redis LangCache with OpenAI to implement semantic caching for LLM queries.
This example caches responses based on semantic similarity, reducing latency and API usage costs.
## Project structure

```
.
├── main.py            # Main script for running the demo
├── requirements.txt   # Python dependencies
├── .env.EXAMPLE       # Example environment variable configuration
└── .env               # Your actual environment variables (not committed)
```
## Prerequisites

- Python 3.10+
- A Redis LangCache instance (Redis Cloud)
- An OpenAI API key
## Setup

1. Clone this repository:

   ```bash
   git clone https://github.com/<your-repo>/gabs-redis-langcache.git
   cd gabs-redis-langcache
   ```

2. Create and activate a virtual environment:

   ```bash
   python3 -m venv .venv
   source .venv/bin/activate  # Mac/Linux
   .venv\Scripts\activate     # Windows
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Configure environment variables:

   - Copy `.env.EXAMPLE` to `.env`
   - Fill in your credentials:

     ```env
     OPENAI_API_KEY=sk-proj-<your-openai-key>
     OPENAI_MODEL=gpt-4o
     LANGCACHE_SERVICE_KEY=<your-langcache-service-key>
     LANGCACHE_CACHE_ID=<your-langcache-cache-id>
     LANGCACHE_BASE_URL=https://gcp-us-east4.langcache.redis.io
     ```
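For reference, here is a minimal sketch of how these variables can be loaded in Python with python-dotenv (an assumption; check requirements.txt and main.py for the project's actual loading code):

```python
# Minimal sketch: load .env into the process environment.
# Assumes the python-dotenv package is installed.
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env in the current directory

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]       # required
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o")  # optional, with a default
LANGCACHE_SERVICE_KEY = os.environ["LANGCACHE_SERVICE_KEY"]
LANGCACHE_CACHE_ID = os.environ["LANGCACHE_CACHE_ID"]
LANGCACHE_BASE_URL = os.environ["LANGCACHE_BASE_URL"]
```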
## Run the demo

```bash
python main.py
```
Example interaction:

```
LangCache Semantic Cache Chat - Type 'exit' to quit.

Ask something: What is Redis LangCache?
[CACHE MISS]
[Latency] Cache miss search took 0.023 seconds
[Latency] OpenAI response took 0.882 seconds
Response: Redis LangCache is a semantic caching solution...
------------------------------------------------------------
Ask something: Tell me about LangCache
[CACHE HIT]
[Latency] Cache hit in 0.002 seconds
Response: Redis LangCache is a semantic caching solution...
------------------------------------------------------------
```
## How it works

For each question you ask, the script will (see the sketch after this list):

1. Search Redis LangCache for a semantically similar question.
2. If a cache hit is found (above the similarity threshold), return the stored response instantly.
3. If a cache miss occurs:
   - Query OpenAI.
   - Store the response in Redis LangCache for future reuse.
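Here is a minimal sketch of that flow using the OpenAI Python client and the LangCache REST API. The endpoint paths and response fields are assumptions about the LangCache service; main.py contains the actual implementation:

```python
# Sketch of the cache-then-LLM flow. LangCache endpoint paths and payload
# fields below are assumptions; adjust them to match the service docs.
import os

import requests
from openai import OpenAI

BASE_URL = os.environ["LANGCACHE_BASE_URL"]
CACHE_ID = os.environ["LANGCACHE_CACHE_ID"]
HEADERS = {"Authorization": f"Bearer {os.environ['LANGCACHE_SERVICE_KEY']}"}
client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(prompt: str) -> str:
    # 1. Search the cache for a semantically similar prompt.
    search = requests.post(
        f"{BASE_URL}/v1/caches/{CACHE_ID}/entries/search",  # assumed endpoint
        headers=HEADERS,
        json={"prompt": prompt},
        timeout=10,
    )
    search.raise_for_status()
    hits = search.json().get("data", [])
    if hits:
        # 2. Cache hit above the similarity threshold: return it instantly.
        return hits[0]["response"]

    # 3. Cache miss: query OpenAI.
    completion = client.chat.completions.create(
        model=os.environ.get("OPENAI_MODEL", "gpt-4o"),
        messages=[{"role": "user", "content": prompt}],
    )
    answer = completion.choices[0].message.content

    # 4. Store the new prompt/response pair for future reuse.
    requests.post(
        f"{BASE_URL}/v1/caches/{CACHE_ID}/entries",  # assumed endpoint
        headers=HEADERS,
        json={"prompt": prompt, "response": answer},
        timeout=10,
    ).raise_for_status()
    return answer
```

The similarity threshold that decides what counts as a hit is configured on the cache, which is why "Tell me about LangCache" can match the earlier "What is Redis LangCache?" entry in the example above.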
## License

MIT - Feel free to fork it!