HackIllinois 2026
Frequent memory recall can slow the progression of dementia symptoms. Clariti makes that recall effortless, emotional, and safe.
Over 55 million people worldwide live with dementia. Research shows that frequent, emotionally positive memory recall can slow cognitive decline and improve quality of life. Yet the existing tools are impersonal and frustrating for the very people they aim to help.
Clariti is a smart memory-sharing and recall application built for individuals experiencing dementia and memory loss. It transforms scattered photos, voice notes, and written memories into a living, searchable memory library that feels like a warm conversation with a trusted friend.
| Feature | Description |
|---|---|
| LLM-Augmented Voice Assistant | Full voice conversation pipeline: speak a question → STT → semantic RAG retrieval → LLM answer generation → TTS → hear the response. Powered by ElevenLabs + Modal. |
| Facial Recognition | Automatic face detection and identification using InsightFace (RetinaFace + ArcFace). Family members enroll their faces once; Clariti recognizes them in every future photo. |
| Semantic Memory Search | Vector embeddings (via Supabase pgvector) enable natural-language queries across all memories in a group. Ask "When did we go to the beach?" and Clariti finds the right photo. |
| AI Image Descriptions | Google Gemini 2.5 Flash generates rich, contextual descriptions of uploaded photos — combining what the AI sees with what the family described. |
| Group Memory Sharing | Create family groups with join codes. All members contribute memories; the person with dementia can browse and query everything in one unified library. |
| Accessibility-First Design | Large touch targets, high-contrast UI, voice-first interaction model, and automatic navigation back to home — designed for users with cognitive impairments. |
Clariti's voice and text Q&A follows a two-stage RAG pipeline:

1. Semantic Retrieval — The user's question is embedded via a Supabase Edge Function, then compared against all memory embeddings in the user's group using cosine similarity (pgvector). The top-matching memory is selected.
2. LLM Generation — The matched memory's content (user description, AI description, identified people) is assembled into a rich prompt and sent to a model running on Modal (A100-80GB). The LLM generates a warm, second-person response grounded exclusively in the retrieved context.
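The two stages above can be sketched in plain Python. This is an illustrative sketch, not Clariti's actual code: the toy 3-D embeddings stand in for the Edge Function embedder's output, and the prompt template is a hypothetical shape, not the project's real prompt.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve_top_memory(question_embedding, memories):
    """Stage 1: pick the memory whose embedding is closest to the question."""
    return max(memories, key=lambda m: cosine_similarity(question_embedding, m["embedding"]))

def build_prompt(question, memory):
    """Stage 2: assemble the retrieved context into an LLM prompt."""
    people = ", ".join(memory["people"]) or "unknown"
    return (
        "Answer warmly, in the second person, using ONLY this memory.\n"
        f"User description: {memory['content']}\n"
        f"AI image description: {memory['ai_description']}\n"
        f"People identified: {people}\n"
        f"Question: {question}"
    )

# Toy example with 3-D embeddings (real embeddings are much higher-dimensional)
memories = [
    {"content": "Beach day", "ai_description": "Two people by the sea",
     "people": ["Avi"], "embedding": [0.9, 0.1, 0.0]},
    {"content": "Birthday party", "ai_description": "Cake with candles",
     "people": ["Akash"], "embedding": [0.0, 0.2, 0.9]},
]
question_embedding = [1.0, 0.0, 0.1]  # pretend output of the embedder
best = retrieve_top_memory(question_embedding, memories)
prompt = build_prompt("When did we go to the beach?", best)
```

In the real pipeline the similarity search happens inside Postgres via pgvector rather than in application code, which keeps the embeddings on the database side.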
The `profiles` table stores user identity and facial recognition data, linked to `auth.users`.
| Column | Type | Description |
|---|---|---|
| `id` | `uuid` (PK, FK → `auth.users`) | User's unique identifier |
| `full_name` | `text` | Display name |
| `avatar_url` | `text` | Profile picture URL |
| `bio` | `text` | User bio |
| `face_embedding` | `vector` | ArcFace 512-D facial embedding for recognition |
| `created_at` | `timestamptz` | Account creation time |
| `updated_at` | `timestamptz` | Last profile update |
The `groups` table stores family or caregiver groups that share memories.
| Column | Type | Description |
|---|---|---|
| `id` | `uuid` (PK) | Group identifier |
| `name` | `text` | Group display name |
| `join_code` | `text` (unique) | Shareable code for joining the group |
| `created_at` | `timestamptz` | Creation time |
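The `join_code` column is just a short unique string. One plausible way to generate such codes, shown here purely as an illustration and not as the project's actual generator, is Python's `secrets` module with an alphabet that avoids lookalike characters:

```python
import secrets

# Alphabet without lookalike characters (no 0/O, 1/I/L)
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"

def generate_join_code(length: int = 6) -> str:
    """Generate a random, hard-to-guess group join code."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

code = generate_join_code()
# Uniqueness is enforced by the UNIQUE constraint on join_code; on the
# rare collision the insert fails and the caller generates a new code.
```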
Many-to-many relationship between users and groups.
| Column | Type | Description |
|---|---|---|
| `id` | `uuid` (PK) | Membership record ID |
| `group_id` | `uuid` (FK → `groups`) | Group reference |
| `user_id` | `uuid` (FK → `auth.users`) | User reference |
| `created_at` | `timestamptz` | Join time |
| `role` | `text` | Member role (e.g., `admin`, `member`) |
`memories` is the core content table; each row is a single memory (photo + metadata).
| Column | Type | Description |
|---|---|---|
| `id` | `uuid` (PK) | Memory identifier |
| `group_id` | `uuid` (FK → `groups`) | Owning group |
| `user_id` | `uuid` (FK → `auth.users`) | Uploader |
| `content` | `text` | Human-written description of the memory |
| `image_url` | `text` | URL to the stored image |
| `ai_description` | `text` | Gemini-generated image description |
| `text_embedding` | `vector` | Semantic embedding for RAG retrieval |
| `users_in_image` | `uuid[]` | Profile IDs of recognized faces |
| `created_at` | `timestamptz` | Upload time |
- `match_profile_face(query_embedding, match_threshold, match_count)` — RPC function that performs cosine similarity search against all `profiles.face_embedding` vectors to identify a face.
- Supabase Edge Function `ragTest` — Embeds a text question and performs vector similarity search against `memories.text_embedding` within a specific group.
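Conceptually, `match_profile_face` is a thresholded nearest-neighbor search over face embeddings. A minimal Python sketch of that logic follows; the profile names, the example threshold, and the assumption that embeddings are L2-normalized (so the dot product equals cosine similarity) are all illustrative, not taken from the actual SQL function.

```python
def match_profile_face(query_embedding, profiles, match_threshold=0.4, match_count=1):
    """Return up to match_count profile IDs whose face embedding is
    similar enough to the query. Assumes L2-normalized embeddings,
    so the dot product equals cosine similarity."""
    scored = [
        (sum(q * p for q, p in zip(query_embedding, emb)), profile_id)
        for profile_id, emb in profiles.items()
    ]
    matches = sorted((s, pid) for s, pid in scored if s >= match_threshold)
    matches.reverse()  # best match first
    return [pid for _, pid in matches[:match_count]]

# Toy 3-D example (real ArcFace embeddings are 512-D)
profiles = {
    "avi":   [1.0, 0.0, 0.0],
    "akash": [0.0, 1.0, 0.0],
}
result = match_profile_face([0.95, 0.31, 0.0], profiles)
```

In production the same comparison runs inside Postgres with a pgvector distance operator, so the embeddings never leave the database.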
The `/voice-chat` endpoint performs a full voice conversation turn: it accepts audio and returns audio.
| Parameter | Type | Location | Description |
|---|---|---|---|
| `audio` | file | multipart | Recorded audio (m4a, wav, mp3, etc.) |
| `group_id` | string | form | Group to search memories in |
| `user_id` | string | form | Current user's profile ID (optional) |
Response: an `audio/mpeg` stream with the headers `X-Transcript` and `X-Answer`.
Same pipeline as /voice-chat, but returns JSON for easier frontend parsing.
Response:

```json
{
  "transcript": "Who was I with at the park?",
  "answer": "You were at the park with Avi and Akash...",
  "audio_base64": "<base64-encoded MP3>"
}
```

Generate an AI description of an image using Google Gemini.
```json
{
  "image_url": "https://...",
  "user_description": "Chad at the beach"
}
```

Detect and store a user's facial embedding from their profile photo.
```json
{
  "bucket": "profile-image-bucket",
  "path": "avatars/user123.jpg",
  "profile_id": "uuid-of-user"
}
```

Detect all faces in a memory image and match them against enrolled profiles.
```json
{
  "bucket": "memory-images",
  "path": "photos/memory456.jpg",
  "memory_id": "uuid-of-memory"
}
```

Test the RAG pipeline with a specific memory ID.
```json
{
  "question": "Who was at this event?",
  "memory_id": "uuid-of-memory",
  "user_id": "uuid-of-current-user"
}
```

Returns server status and current UTC timestamp.
- Python 3.12+
- Node.js 18+ and npm
- Expo CLI (`npx expo --version` should return ≥ 54.0)
- Modal account (for GPU inference)
- Supabase project (with pgvector enabled)
```shell
git clone https://github.com/ObviAvi/Clariti.git
cd Clariti
```

```shell
cd backend

# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
python -m pip install --upgrade pip
pip install -r requirements.txt
```

Create a `backend/.env` file:
```shell
# Google Gemini — Vision LLM for image descriptions
GEMINI_API_KEY=your_gemini_api_key

# ngrok — Public tunnel for mobile device testing (optional)
NGROK_AUTHTOKEN=your_ngrok_authtoken

# Supabase — Database, Auth, and Storage
SUPABASE_URL=your_supabase_project_url
SUPABASE_SERVICE_KEY=your_supabase_service_role_key

# ElevenLabs — Speech-to-Text / Text-to-Speech
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_VOICE_ID=your_elevenlabs_voice_id
```

Modal hosts the two GPU-intensive services: facial recognition and LLM inference. No API keys are needed for the models themselves; they're open-source and run directly on Modal's GPUs.
```shell
# Install the Modal client (if not already in requirements.txt)
pip install modal

# Authenticate with Modal
modal token new
```

This opens a browser window to log in. Once authenticated, your token is saved locally at `~/.modal.toml`.
Both Modal apps share a persistent volume for caching model weights (~65 GB for Qwen 2.5 32B, ~1 GB for InsightFace). The volume is created automatically on first deploy, but you can also create it explicitly:

```shell
modal volume create clariti-model-cache
```

Deploy the facial recognition service:

```shell
modal deploy backend/modal_vision.py
```

This deploys the `clariti-face` app with:
- InsightFace `buffalo_l` (RetinaFace + ArcFace) baked into the container image
- An A100-80GB GPU with 1 warm container
On first deploy, the container image is built (~2–3 min).
```shell
modal deploy backend/modal_rag_output.py
```

This deploys the `clariti-rag-llm` app with:
- Qwen 2.5 32B Instruct loaded in float16
- An A100-80GB GPU with 1 warm container
- Model weights stored in the `clariti-model-cache` volume
> First run note: The very first invocation after deploying `clariti-rag-llm` will download ~65 GB of model weights from Hugging Face. This takes 5–10 minutes. All subsequent calls (even after container restarts) reuse the cached weights from the persistent volume.
```shell
# List running Modal apps
modal app list

# You should see:
#   clariti-face       (deployed)
#   clariti-rag-llm    (deployed)
```

```shell
# Redeploy after code changes
modal deploy backend/modal_vision.py
modal deploy backend/modal_rag_output.py

# Stop running apps (to save costs)
modal app stop clariti-face
modal app stop clariti-rag-llm

# Run a Modal file locally for testing (uses Modal's cloud GPUs)
modal run backend/modal_vision.py
modal run backend/modal_rag_output.py
```

```shell
cd backend
python main.py
```

The server starts on http://localhost:8000. API docs are available at http://localhost:8000/docs.
```shell
cd frontend

# Install dependencies
npm install
```

Create a `frontend/.env` file:
```shell
# Supabase — Client-side connection (use the anon/public key, NOT the service key)
EXPO_PUBLIC_SUPABASE_URL=your_supabase_project_url
EXPO_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key

# Backend API — FastAPI edge server URL (use ngrok URL for physical devices)
EXPO_PUBLIC_BACKEND_API_URL=your_backend_api_url
```

```shell
npx expo start
```

Scan the QR code with Expo Go (iOS/Android), or press `i` for the iOS simulator / `a` for the Android emulator.
Clariti — Because every memory matters.