|
1 |
| -Imagine you are building a smart AI assistant that: |
2 |
| - - Remembers chats, but only temporarily |
3 |
| - - Thinks by meaning, not by matching exact words |
4 |
| - - Cleans up after itself, with no manual scripts |
| 1 | +Build a production-ready AI assistant that remembers conversations semantically and auto-cleans using Redis 8’s new features. |
5 | 2 |
|
6 |
| -Redis 8 has two powerful capabilities to make this happen: |
7 |
| - - Field-level expiration: Let individual chat messages expire on their own |
8 |
| - - Vector similarity search: Find past messages based on meaning, not keywords |
| 3 | +This smart AI assistant offers: |
| 4 | + - **Ephemeral Memory**: Messages expire automatically at different intervals |
| 5 | + - **Semantic Recall**: Search past chats by meaning, not just keywords |
| 6 | + - **Zero Maintenance**: No cleanup scripts or cron jobs needed |
| 7 | + - **Multi-User Support**: Memory isolated by user and session |
| 8 | + - **Hybrid Search**: Combine text and vector search for better results |
9 | 9 |
|
10 |
| -Let’s dive in. |
| 10 | +**Note**: [Redis 8](https://hub.docker.com/_/redis/tags) is required for this tutorial because it introduces `HSETEX`, which lets you set a TTL per hash field, ideal for ephemeral chat memory without complex expiration logic. |
11 | 11 |
|
12 |
| -### Short-Term Memory with Field-Level Expiry |
13 |
| -Each chat session is stored as a Redis Hash. |
14 |
| -Each message is a field in the hash. |
15 |
| -Redis 8’s new HSETEX command allows you to assign a TTL to each field, perfect for building ephemeral, session-level memory. |
| 12 | +### Hierarchical Memory Structure |
| 13 | +Store chat sessions as Redis hashes. Each message is a field with its own TTL. |
16 | 14 |
|
17 | 15 | ```redis:[run_confirmation=true] Upload Session Data
|
18 |
| -// session:42 is the session ID |
19 |
| -// msg:<timestamp> ensures uniqueness and traceability |
20 |
| -HSETEX session:42 msg:1717935301 120 "Hi Chatbot!" |
21 |
| -HSETEX session:42 msg:1717935361 180 "What can you do?" |
22 |
| -HSETEX session:42 msg:1717935440 90 "Can you remind me about my tasks?" |
23 |
| -HSETEX session:42 msg:1717935720 30 "What's the news today?" |
| 16 | +// User-scoped sessions for isolation |
| 17 | +// Pattern: user:{user_id}:session:{session_id} |
| 18 | +HSETEX user:alice:session:morning EX 3600 FIELDS 1 msg:1717935301 "Good morning! What's my schedule today?" |
| 19 | +HSETEX user:alice:session:morning EX 1800 FIELDS 1 msg:1717935361 "Remind me about the team meeting at 2 PM" |
| 20 | +HSETEX user:alice:session:morning EX 900 FIELDS 1 msg:1717935420 "What's the weather forecast?" |
| 21 | +HSETEX user:alice:session:morning EX 300 FIELDS 1 msg:1717935480 "Thanks, that's all for now" |
| 22 | +
|
| 23 | +// Different user, same session pattern |
| 24 | +HSETEX user:bob:session:work EX 7200 FIELDS 1 msg:1717935500 "I need to prepare for the client presentation" |
| 25 | +HSETEX user:bob:session:work EX 3600 FIELDS 1 msg:1717935560 "What are the key points I should cover?" |
| 26 | +
|
24 | 27 | ```
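As a rough sketch of how an application might issue these writes, the helper below assembles the argument list for one message. It assumes the Redis 8 command form `HSETEX key EX seconds FIELDS numfields field value`; the names `session_key` and `hsetex_args` are hypothetical, and no Redis client is imported, so the resulting tuple would still need to be handed to a client call such as redis-py's `execute_command`.

```python
import time


def session_key(user_id: str, session_id: str) -> str:
    """Build the user-scoped key: user:{user_id}:session:{session_id}."""
    return f"user:{user_id}:session:{session_id}"


def hsetex_args(user_id, session_id, text, ttl_seconds, ts=None):
    """Return the HSETEX arguments that store one message with its own TTL.

    Assumed command shape (hypothetical helper, not a client API):
    HSETEX key EX seconds FIELDS 1 msg:<timestamp> <text>
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    # The msg:<timestamp> field name keeps messages unique and traceable.
    ts = int(time.time()) if ts is None else int(ts)
    return (
        "HSETEX",
        session_key(user_id, session_id),
        "EX",
        str(ttl_seconds),
        "FIELDS",
        "1",
        f"msg:{ts}",
        text,
    )
```

With redis-py, for instance, the tuple could be sent as `client.execute_command(*hsetex_args(...))`, where `client` is an already connected client.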
|
25 | 28 |
|
26 |
| -Each field automatically expires after its TTL (in seconds). |
27 |
| -No need for cron jobs or background workers. |
28 |
| -What you get: |
29 |
| - - Clean memory |
30 |
| - - Zero manual cleanup |
31 |
| - - Session-scoped retention, just like short-term memory in humans |
| 29 | +### Memory Tiers for Different Lifetimes |
| 30 | +Control how long messages last depending on their importance. |
| 31 | + |
| 32 | +```redis:[run_confirmation=true] Memory Tiers Strategy |
| 33 | +// Short-term (5 minutes) - Immediate context |
| 34 | +HSETEX user:alice:session:current EX 300 FIELDS 1 msg:1717935301 "Current conversation context" |
| 35 | +
|
| 36 | +// Medium-term (30 minutes) - Session memory |
| 37 | +HSETEX user:alice:session:current EX 1800 FIELDS 1 msg:1717935302 "Important session details" |
| 38 | +
|
| 39 | +// Long-term (2 hours) - Cross-session context |
| 40 | +HSETEX user:alice:session:current EX 7200 FIELDS 1 msg:1717935303 "Key user preferences and facts" |
| 41 | +``` |
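One way to keep the three lifetimes consistent in application code is a small lookup table. The sketch below mirrors the TTLs used in the tiers above (300, 1800, and 7200 seconds); the `MemoryTier` enum and `ttl_for` function are hypothetical names introduced only for illustration.

```python
from enum import Enum


class MemoryTier(Enum):
    """TTL in seconds for each tier, matching the values in this tutorial."""

    SHORT = 300     # 5 minutes: immediate context
    MEDIUM = 1800   # 30 minutes: session memory
    LONG = 7200     # 2 hours: cross-session context


def ttl_for(tier_name: str) -> int:
    """Look up a tier by name (case-insensitive) and return its TTL in seconds."""
    return MemoryTier[tier_name.upper()].value
```

An unknown tier name raises `KeyError`, which makes a misspelled tier fail loudly instead of silently storing a message with the wrong lifetime.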
32 | 42 |
|
| 43 | +### Check Current Session Memory |
| 44 | +No manual cleanup needed; expired messages vanish automatically. |
33 | 45 |
|
34 |
| -Try it: After a few minutes, run `HGETALL session:42` and see what's left. |
| 46 | +```redis:[run_confirmation=true] Monitor Session State Over Time |
| 47 | +// After a few minutes, run this command to see what's left. |
| 48 | +HGETALL user:alice:session:morning |
| 49 | +``` |
35 | 50 |
|
36 |
| -### Vector Search for Semantic Recall |
37 |
| -Now, your assistant needs to “recall” semantically related messages, not just match by words. |
38 |
| -To do that, you’ll: |
39 |
| - - Convert messages to vector embeddings |
40 |
| - - Store them in Redis |
41 |
| - - Use Vector Search with FT.SEARCH for semantic retrieval |
| 51 | +### Vector Search Setup for Semantic Recall |
| 52 | +Create an index to store messages as vectors for semantic search. |
42 | 53 |
|
43 | 54 | ```redis:[run_confirmation=true] Create a Vector Index
|
44 |
| -FT.CREATE idx:memory ON HASH PREFIX 1 memory: SCHEMA |
45 |
| - message TEXT |
46 |
| - embedding VECTOR FLAT // FLAT = exact vector search |
47 |
| - 6 |
48 |
| - TYPE FLOAT32 |
49 |
| - DIM 8 // DIM = embedding size, DIM 8 is just for demo purposes. In real use, embeddings are usually 128–1536 dimensions. |
50 |
| - DISTANCE_METRIC COSINE // COSINE = measures semantic closeness |
| 55 | +FT.CREATE idx:ai_memory |
| 56 | + ON HASH |
| 57 | + PREFIX 1 memory: |
| 58 | + SCHEMA |
| 59 | + user_id TAG SORTABLE |
| 60 | + session_id TAG SORTABLE |
| 61 | + message TEXT WEIGHT 2.0 PHONETIC dm:en |
| 62 | + context TEXT WEIGHT 1.0 |
| 63 | + timestamp NUMERIC SORTABLE |
| 64 | + embedding VECTOR HNSW 12 |
| 65 | + TYPE FLOAT32 |
| 66 | + DIM 6 // DIM = embedding size; it must match the stored vectors (6 floats = 24 bytes in this demo). In real use, embeddings are usually 128–1536 dimensions. |
| 67 | + DISTANCE_METRIC COSINE // COSINE = measures semantic closeness |
| 68 | + INITIAL_CAP 10000 |
| 69 | + M 16 |
| 70 | + EF_CONSTRUCTION 200 |
51 | 71 | ```
|
52 | 72 |
|
53 |
| -Now, let’s add entries for your chatbots: |
| 73 | +Add sample vectorized messages (embedding dims are demo-sized): |
54 | 74 |
|
55 | 75 | ```redis:[run_confirmation=true] Add entries for the chatbot
|
56 |
| -// Embeddings are stored as binary FLOAT32 vectors - this is a compact format required by Redis Vector Serch indexes |
57 |
| -HSET memory:1 message "Book a dentist appointment" embedding "\x00\x00\x80?\x00\x00\x00@\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x00@" |
58 |
| -HSET memory:2 message "Remind me to water plants" embedding "\x00\x00\x80@\x00\x00\x80@\x00\x00\x80@\x00\x00\x80?\x00\x00\x80?\x00\x00@@" |
59 |
| -HSET memory:3 message "What’s the weather like?" embedding "\x00\x00@@\x00\x00\x00@\x00\x00\x00@\x00\x00\x00@\x00\x00\x80?\x00\x00\x80?" |
60 |
| -HSET memory:4 message "Cancel my gym session" embedding "\x00\x00@@\x00\x00\x00@\x00\x00\x80?\x00\x00\x80@\x00\x00\x00@\x00\x00\x00@" |
61 |
| -HSET memory:5 message "Start a new shopping list" embedding "\x00\x00\x00@\x00\x00\x00@\x00\x00\x80?\x00\x00\x80@\x00\x00\x80?\x00\x00@@" |
| 76 | +HSET memory:alice:1 user_id "alice" session_id "morning" message "I have a dentist appointment at 3 PM today" context "healthcare scheduling appointment" timestamp 1717935301 embedding "\x00\x00\x80?\x00\x00\x00@\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x00@" |
| 77 | +HSET memory:alice:2 user_id "alice" session_id "morning" message "Remind me to water the plants in my office" context "task reminder plants office" timestamp 1717935361 embedding "\x00\x00\x80@\x00\x00\x80@\x00\x00\x80@\x00\x00\x80?\x00\x00\x80?\x00\x00@@" |
| 78 | +HSET memory:alice:3 user_id "alice" session_id "work" message "Schedule a meeting with the engineering team" context "work scheduling meeting team" timestamp 1717935420 embedding "\x00\x00@@\x00\x00\x00@\x00\x00\x00@\x00\x00\x00@\x00\x00\x80?\x00\x00\x80?" |
| 79 | +HSET memory:bob:1 user_id "bob" session_id "work" message "I need to review the quarterly sales report" context "business analysis quarterly report" timestamp 1717935480 embedding "\x00\x00@@\x00\x00\x00@\x00\x00\x80?\x00\x00\x80@\x00\x00\x00@\x00\x00\x00@" |
62 | 80 | ```
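The escaped strings above are little-endian FLOAT32 values, four bytes per dimension. A minimal sketch of producing and decoding such a blob with Python's standard `struct` module follows; the function names are hypothetical, and a real application would pack the output of an embedding model rather than hand-written numbers.

```python
import struct


def to_float32_blob(values):
    """Pack a sequence of numbers into little-endian FLOAT32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def from_float32_blob(blob):
    """Decode little-endian FLOAT32 bytes back into a list of floats."""
    if len(blob) % 4 != 0:
        raise ValueError("blob length must be a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


# The embedding stored for memory:alice:1 decodes to six floats (24 bytes).
alice_1 = b"\x00\x00\x80?\x00\x00\x00@\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x00@"
decoded = from_float32_blob(alice_1)
```

Decoding a blob this way is a quick check that its length matches the `DIM` declared in the index, since a vector of the wrong size will not be indexed.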
|
63 | 81 |
|
64 |
| -Now your messages are vectorized and ready for search. |
65 |
| - |
66 | 82 | ### Let Chatbot Think – Semantic Search with Vectors
|
67 |
| -When a user sends a new message, convert it to an embedding and run a KNN search: |
68 |
| - |
69 |
| -```redis:[run_confirmation=true] Search For Similar Messages |
70 |
| -// Returns the top 3 semantically similar messages, even if no words match directly. |
71 |
| -FT.SEARCH idx:memory "*=>[KNN 3 @embedding $vec AS score]" |
72 |
| - PARAMS 2 vec "\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x80?\x00\x00@@\x00\x00\x00@" |
73 |
| - SORTBY score |
74 |
| - DIALECT 2 |
| 83 | +When a user says something new, retrieve the five stored messages closest in meaning, searched across all users and sessions. |
| 84 | + |
| 85 | +```redis:[run_confirmation=false] Find Top 5 Related Messages By Meaning |
| 86 | +FT.SEARCH idx:ai_memory |
| 87 | + "*=>[KNN 5 @embedding $query_vec AS vector_score]" |
| 88 | + PARAMS 2 query_vec "\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x80?\x00\x00@@\x00\x00\x00@" |
| 89 | + RETURN 5 user_id message context vector_score timestamp |
| 90 | + SORTBY vector_score ASC |
| 91 | + DIALECT 2 |
75 | 92 | ```
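To make the score easier to interpret: with `DISTANCE_METRIC COSINE`, the value returned for a KNN match is a cosine distance (1 minus cosine similarity), so lower means closer. The pure-Python sketch below computes that distance exactly and ranks the four demo vectors against the query vector. The helper names are hypothetical, and because an HNSW index is approximate, Redis may not always agree with an exact ranking on larger datasets.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, memories, k):
    """Rank stored vectors by ascending distance, like SORTBY vector_score ASC."""
    scored = [(cosine_distance(query, vec), key) for key, vec in memories.items()]
    return sorted(scored)[:k]


# Demo vectors decoded from the FLOAT32 blobs used in this tutorial.
memories = {
    "memory:alice:1": [1, 2, 3, 4, 2, 2],
    "memory:alice:2": [4, 4, 4, 1, 1, 3],
    "memory:alice:3": [3, 2, 2, 2, 1, 1],
    "memory:bob:1": [3, 2, 1, 4, 2, 2],
}
query_vec = [3, 4, 2, 1, 3, 2]
nearest = top_k(query_vec, memories, 2)
```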
|
76 | 93 |
|
77 | 94 | Now your assistant “remembers” things it’s heard before - by meaning.
|
78 | 95 |
|
79 |
| -### Real-Time Session Cleanup – Redis Handles It |
80 |
| -Want to check what's still in memory? |
| 96 | +### User-Scoped Semantic Search |
| 97 | +Your AI should only recall memories from the specific user it's talking to, not leak information between users. |
| 98 | + |
| 99 | +```redis:[run_confirmation=false] Find Similar Memories For Specific User Only |
| 100 | +FT.SEARCH idx:ai_memory |
| 101 | + "(@user_id:{alice}) => [KNN 3 @embedding $query_vec AS vector_score]" |
| 102 | + PARAMS 2 query_vec "\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x80?\x00\x00@@\x00\x00\x00@" |
| 103 | + RETURN 5 user_id message context vector_score session_id |
| 104 | + SORTBY vector_score ASC |
| 105 | + DIALECT 2 |
| 106 | +``` |
| 107 | + |
| 108 | +### Time-Bounded Semantic Search |
| 109 | +When users ask about "recent" things, limit your search to a specific time window while still using semantic matching. |
| 110 | + |
| 111 | +```redis:[run_confirmation=false] Find recent similar memories (last 24 hours) |
| 112 | +FT.SEARCH idx:ai_memory |
| 113 | + "(@timestamp:[1717849200 +inf]) => [KNN 3 @embedding $query_vec AS vector_score]" |
| 114 | + PARAMS 2 query_vec "\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x80?\x00\x00@@\x00\x00\x00@" |
| 115 | + RETURN 3 message timestamp vector_score |
| 116 | + SORTBY timestamp DESC |
| 117 | + DIALECT 2 |
| 118 | +``` |
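The lower bound in a time-bounded query is a Unix timestamp, so in practice it would be computed at request time and not hard-coded. For example, 1717849200 is exactly 24 hours before 1717935600. The sketch below builds the query string for a rolling window; `recent_knn_query` is a hypothetical helper, and `$query_vec` remains a parameter to be supplied through `PARAMS`.

```python
import time


def recent_knn_query(now_ts=None, window_seconds=86400, k=3):
    """Build a KNN query restricted to messages newer than now - window."""
    if window_seconds <= 0 or k <= 0:
        raise ValueError("window_seconds and k must be positive")
    now_ts = int(time.time()) if now_ts is None else int(now_ts)
    lower_bound = now_ts - window_seconds
    # The numeric range acts as a pre-filter; KNN then runs on the survivors.
    return (
        f"(@timestamp:[{lower_bound} +inf])"
        f"=>[KNN {k} @embedding $query_vec AS vector_score]"
    )
```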
| 119 | + |
| 120 | +### Session-Specific Recall |
| 121 | +When users refer to something from "earlier in our conversation," search only within the current session context. |
81 | 122 |
|
82 |
| -```redis:[run_confirmation=false] Check Sessions |
83 |
| -HGETALL session:42 |
84 | 123 | ```
|
| 124 | +FT.SEARCH idx:ai_memory |
| 125 | + "(@user_id:{alice} @session_id:{morning}) => [KNN 10 @embedding $query_vec AS vector_score]" |
| 126 | + PARAMS 2 query_vec "\x00\x00@@\x00\x00\x80@\x00\x00\x00@\x00\x00\x80?\x00\x00@@\x00\x00\x00@" |
| 127 | + RETURN 4 message context vector_score timestamp |
| 128 | + SORTBY vector_score ASC |
| 129 | + DIALECT 2 |
| 130 | +``` |
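User and session IDs usually come from application data, so the TAG filter is safer to build programmatically than by string pasting. The sketch below is one possible approach: the helper names are hypothetical, and the escaping rule (backslash-escape every character other than letters, digits, and underscore) is a conservative assumption, not a rule taken from this tutorial.

```python
import re

# Assumption: anything outside [A-Za-z0-9_] is escaped with a backslash.
_TAG_SPECIAL = re.compile(r"([^A-Za-z0-9_])")


def escape_tag(value: str) -> str:
    """Backslash-escape characters that could break a TAG filter."""
    return _TAG_SPECIAL.sub(r"\\\1", value)


def scoped_knn_query(user_id, session_id=None, k=3):
    """Build a KNN query pre-filtered by user and, optionally, by session."""
    if not user_id:
        raise ValueError("user_id is required to keep memories isolated")
    filters = [f"@user_id:{{{escape_tag(user_id)}}}"]
    if session_id is not None:
        filters.append(f"@session_id:{{{escape_tag(session_id)}}}")
    # Space-separated filters are combined with AND before the KNN step.
    return f"({' '.join(filters)})=>[KNN {k} @embedding $query_vec AS vector_score]"
```

Requiring `user_id` in the builder means a search can never accidentally run unscoped, which is the isolation property the user-scoped queries rely on.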
| 131 | + |
| 132 | +### Want to check what's still in memory? |
| 133 | + |
| 134 | +Only unexpired fields remain. |
85 | 135 |
|
86 |
| -Only the unexpired fields remain. Redis does the cleanup invisibly in the background. |
87 |
| -Your assistant has a clean, focused mind at all times. |
| 136 | +```redis:[run_confirmation=false] Check Sessions |
| 137 | +HGETALL memory:alice:1 |
| 138 | +HGETALL memory:alice:2 |
| 139 | +HGETALL memory:alice:3 |
| 140 | +HGETALL memory:bob:1 |
| 141 | +HGETALL memory:bob:2 |
| 142 | +``` |
88 | 143 |
|
89 | 144 | ### Next Steps
|
90 | 145 | Now that your assistant has memory and meaning, you can:
|
91 |
| - - Tie session messages to store embeddings for per-session recall |
92 |
| - - Use RAG (Retrieval-Augmented Generation) by combining Redis Vector Search with LLMs |
93 |
| - - Add per-user memory: prefix session keys with a user ID (user:42:session:...) |
94 |
| - - Introduce a fallback to persistent storage for long-term memory using Redis Flex |
| 146 | + - Combine Redis Vector Search with LLMs (RAG) |
| 147 | + - Use OpenAI or sentence-transformers for embeddings |
| 148 | + - Add fallback persistent storage with Redis Flex |
| 149 | + - Manage users with ACLs, quotas, and keyspace notifications |