
Sauti Darasa - Project Status & Onboarding Guide

Last Updated: January 1, 2026
Repository: https://github.com/Eli-Keli/SautiDarasa (forked from ViggenKorir/SautiDarasa)
Current Version: v1.0.0-alpha
Project Phase: Post-Hackathon MVP - Production Deployed
Status: Active Development
Live Demo: https://sauti-darasa-pwa-512236104756.africa-south1.run.app


📋 Executive Summary

Sauti Darasa is a Progressive Web App (PWA) that provides real-time speech-to-text captioning for Kenyan classrooms, specifically designed to support deaf and hard-of-hearing students. The project was developed during a hackathon that concluded on December 5, 2025, and represents a functional MVP with both working features and areas requiring further development.

Repository: https://github.com/Eli-Keli/SautiDarasa
Live Demo: https://sauti-darasa-pwa-512236104756.africa-south1.run.app
Deployment: Google Cloud Run (africa-south1)
Team: Hackathon collaborative project

What We Built (December 2025)

Fully functional React frontend with teacher and student views
Real-time caption display using Firebase Realtime Database
Audio recording system with waveform visualization
PWA capabilities (installable, offline support, service worker)
Demo mode for testing without backend infrastructure
Google Cloud deployment configuration (Docker, Cloud Run, App Engine)
Production deployment to Google Cloud Run (africa-south1 region)
Firebase project configured (sautidarasa - BLAZE plan)
Comprehensive documentation (7 markdown files totaling ~7,200 lines)
Mobile-responsive design with accessibility considerations

What Needs Implementation (2026 Roadmap)

Backend transcription service (Google Cloud Speech-to-Text integration)
Production Firebase authentication and security rules
CI/CD pipeline with automated testing
Comprehensive test suite (unit, integration, e2e tests)
Advanced accessibility features (keyboard shortcuts, screen reader optimization)
Performance optimizations (code splitting, lazy loading)
Custom domain setup and SSL configuration
User analytics and error monitoring (Cloud Logging, Cloud Monitoring)


🎯 Project Context

The Problem

In Kenyan classrooms, deaf and hard-of-hearing students face significant barriers to education due to the lack of real-time captioning services. Traditional solutions are expensive, require specialized equipment, and are not widely available.

Our Solution

Sauti Darasa ("Class Voice" in Swahili) provides a free, web-based, installable PWA that:

  • Works with nothing more than a teacher's smartphone and an internet connection
  • Provides real-time captions via speech recognition
  • Works across devices with a simple session-sharing model
  • Can be installed as a native-feeling app on mobile devices

Target Users

  1. Teachers: Create sessions, record their voice, share session links
  2. Students: Join sessions, view real-time captions in large text
  3. Administrators: (Future) Monitor usage, manage classrooms

🏗️ Architecture Overview

High-Level System Design

┌─────────────────────────────────────────────────────────────────┐
│                         TEACHER DEVICE                          │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  React PWA (Teacher View)                                │  │
│  │  • Audio Recording (MediaRecorder API)                   │  │
│  │  • Waveform Visualization (Canvas API)                   │  │
│  │  • Session Management                                    │  │
│  │  • Upload Audio Chunks every 1.5s                        │  │
│  └──────────────────────────────────────────────────────────┘  │
│                            ↓ HTTP POST                          │
└─────────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────────┐
│                    BACKEND SERVICE (NOT IMPLEMENTED)             │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Speech-to-Text API                                      │  │
│  │  • Receives audio chunks (base64 encoded)               │  │
│  │  • Converts speech to text                              │  │
│  │  • Writes captions to Firebase                          │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────────┐
│                   FIREBASE REALTIME DATABASE                     │
│  /captions/{sessionId}/latest                                   │
│  └── { text: "caption", timestamp: 1234567890 }                 │
└─────────────────────────────────────────────────────────────────┘
                               ↓ Real-time sync
┌─────────────────────────────────────────────────────────────────┐
│                         STUDENT DEVICE(S)                        │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  React PWA (Student View)                                │  │
│  │  • Subscribe to Firebase captions/{sessionId}            │  │
│  │  • Display captions in large, high-contrast text        │  │
│  │  • Auto-scroll and update in real-time                  │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
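The caption record shown at the Firebase layer can be captured in a small TypeScript type. This is a sketch inferred from the diagram; the repo's actual `src/types/caption.ts` is authoritative:

```typescript
// Shape of /captions/{sessionId}/latest as shown in the diagram.
interface Caption {
  text: string;
  timestamp: number; // Firebase serverTimestamp() resolves to epoch millis on read
}

// Runtime guard for data arriving from Firebase, which is untyped at the wire level.
function isCaption(value: unknown): value is Caption {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.text === "string" && typeof v.timestamp === "number";
}
```

A guard like this lets the Student View reject malformed records instead of rendering `undefined`.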

Current Implementation Status

✅ Implemented (Green Layer):

  • Teacher View UI with recording controls
  • Student View UI with caption display
  • Firebase Realtime Database integration (reads/writes)
  • Audio capture and chunking (MediaRecorder API)
  • Waveform visualization
  • Session ID generation and URL sharing
  • PWA manifest and service worker
  • Demo mode (simulated captions without backend)
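Of the implemented pieces, session ID generation and URL sharing are simple enough to sketch from first principles. The following is hypothetical code, not the actual contents of `src/utils/sessionUtils.ts` (alphabet and length may differ in the repo):

```typescript
// Generate a short, URL-safe session ID (9 characters, like the "abc123xyz" example).
function generateSessionId(length = 9): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let id = "";
  for (let i = 0; i < length; i++) {
    id += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return id;
}

// Build the link a teacher shares with students.
function buildShareLink(origin: string, sessionId: string): string {
  return `${origin}/student?sessionId=${encodeURIComponent(sessionId)}`;
}
```

For example, `buildShareLink("https://example.com", generateSessionId())` yields a student URL of the form `/student?sessionId=...`.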

❌ Not Implemented (Missing Layer):

  • Backend transcription service
  • Audio-to-text conversion API
  • Production-ready Firebase security rules
  • User authentication system

🛠️ Technical Stack

Frontend (Fully Implemented)

| Technology | Version | Purpose | Status |
| --- | --- | --- | --- |
| React | 18.3.1 | UI framework | ✅ Production-ready |
| TypeScript | 5.7.2 | Type safety | ✅ Configured |
| Vite | 7.2.6 | Build tool, dev server | ✅ Optimized |
| TailwindCSS | 4.0.13 | Styling framework | ✅ Custom theme |
| React Router DOM | 7.1.0 | Client-side routing | ✅ Configured |
| Firebase | 11.1.0 | Real-time database | ✅ Integrated |
| vite-plugin-pwa | 1.2.0 | PWA capabilities | ✅ Workbox config |
| Lucide React | 0.468.0 | Icon library | ✅ Used throughout |

Backend (Not Implemented)

| Component | Technology Options | Status |
| --- | --- | --- |
| Speech-to-Text | Google Cloud Speech-to-Text / OpenAI Whisper / AssemblyAI | ❌ Not started |
| API Server | Node.js + Express / Python + FastAPI / Go | ❌ Not started |
| Audio Processing | FFmpeg / SoX | ❌ Not started |
| Queue System | Redis / RabbitMQ (optional) | ❌ Not started |

Infrastructure (Deployed)

| Component | Technology | Status |
| --- | --- | --- |
| Hosting | Google Cloud Run | ✅ Deployed (africa-south1) |
| Container | Docker (multi-stage) | ✅ In production |
| Web Server | nginx (Alpine) | ✅ In production |
| CI/CD | Google Cloud Build | ⚠️ cloudbuild.yaml ready, pipeline not automated |
| Monitoring | Google Cloud Logging | ⚠️ Not configured |
| Domain | Custom domain | ❌ Not acquired |

Development Tools

| Tool | Purpose | Status |
| --- | --- | --- |
| ESLint | Code linting | ✅ Configured |
| TypeScript | Type checking | ✅ Strict mode |
| Prettier | Code formatting | ❌ Not configured |
| Vitest | Unit testing | ⚠️ Installed, no tests written |
| Playwright | E2E testing | ❌ Not installed |
| Husky | Git hooks | ❌ Not configured |

📁 Repository Structure

sauti-darasa-frontend/
│
├── public/                          # Static assets (PWA icons, manifest)
│   ├── icons/                       # PWA icons (192x192, 512x512)
│   ├── manifest.json                # PWA manifest
│   └── sw.js                        # Service worker (auto-generated)
│
├── src/
│   ├── components/                  # Reusable React components
│   │   ├── ErrorBoundary.tsx        # ✅ Error boundary with fallback UI
│   │   └── WaveformVisualizer.tsx   # ✅ Canvas-based audio waveform
│   │
│   ├── hooks/                       # Custom React hooks
│   │   ├── useAudioRecorder.ts      # ✅ Audio recording logic
│   │   ├── useFirebaseConnection.ts # ✅ Firebase auto-reconnect
│   │   └── useWakeLock.ts           # ✅ Keep screen awake
│   │
│   ├── pages/                       # Route components
│   │   ├── Home.tsx                 # ✅ Landing page (not styled)
│   │   ├── TeacherView.tsx          # ✅ Teacher recording interface
│   │   └── StudentView.tsx          # ✅ Student caption display
│   │
│   ├── services/                    # External service integrations
│   │   └── firebase.ts              # ✅ Firebase config & initialization
│   │
│   ├── types/                       # TypeScript type definitions
│   │   └── caption.ts               # ✅ Caption interface
│   │
│   ├── utils/                       # Helper functions
│   │   ├── audioUtils.ts            # ✅ Audio encoding, chunking
│   │   └── sessionUtils.ts          # ✅ Session ID generation
│   │
│   ├── App.tsx                      # ✅ Main app component with router
│   ├── App.css                      # ✅ Global styles
│   ├── index.css                    # ✅ Tailwind imports + custom CSS
│   └── main.tsx                     # ✅ App entry point
│
├── docs/                            # 📚 Comprehensive documentation
│   ├── README.md                    # Main documentation hub
│   ├── DEPLOYMENT.md                # Vercel/Firebase/Netlify deployment
│   ├── DEPLOYMENT_GOOGLE_CLOUD.md   # ✅ Complete GCP deployment guide
│   ├── OPTIMIZATION.md              # ✅ Performance & cost optimization
│   ├── TESTING.md                   # ✅ Testing procedures & checklists
│   ├── TROUBLESHOOTING.md           # ✅ Common issues & solutions
│   └── PROJECT_STATUS.md            # ✅ THIS FILE - project overview
│
├── .dockerignore                    # ✅ Docker build exclusions
├── .env.gcloud                      # ✅ Environment template for GCP
├── .gitignore                       # ✅ Git exclusions
├── app.yaml                         # ✅ Google App Engine config
├── deploy-cloud-run.sh              # ✅ Automated Cloud Run deployment
├── Dockerfile                       # ✅ Multi-stage Docker build
├── index.html                       # ✅ HTML entry point
├── nginx.conf                       # ✅ nginx config for Cloud Run
├── package.json                     # ✅ Dependencies & scripts
├── package-lock.json                # ✅ Dependency lock file
├── postcss.config.js                # ✅ PostCSS config (Tailwind v4)
├── tailwind.config.js               # ✅ Tailwind configuration
├── tsconfig.json                    # ✅ TypeScript config
├── tsconfig.app.json                # ✅ App-specific TS config
├── tsconfig.node.json               # ✅ Node-specific TS config
├── vite.config.ts                   # ✅ Vite config with PWA plugin
└── README.md                        # ✅ Project README

🚀 Getting Started (For New Developers)

Prerequisites

# Required tools
node --version    # v18.0.0 or higher
npm --version     # v9.0.0 or higher
git --version     # v2.0.0 or higher

# Optional (for deployment)
docker --version  # v20.0.0 or higher
gcloud --version  # Latest Google Cloud SDK

Quick Start (5 minutes)

# 1. Clone the repository
git clone https://github.com/Eli-Keli/SautiDarasa.git
cd SautiDarasa

# 2. Install dependencies
npm install

# 3. Start development server
npm run dev

# 4. Open browser to http://localhost:5173

# 5. Try demo mode (no setup required!)
# Teacher: http://localhost:5173/teacher?demo=true
# Student: http://localhost:5173/student?demo=true

That's it! Demo mode works immediately without any configuration.

Full Setup (with Firebase - 15 minutes)

# 1. Create Firebase project
# Visit: https://console.firebase.google.com
# Click "Add Project" → Follow wizard

# 2. Enable Realtime Database
# Firebase Console → Build → Realtime Database → Create Database
# Start in "test mode" (temporary - change later)

# 3. Get Firebase credentials
# Project Settings → General → Your apps → Web app
# Copy all configuration values

# 4. Create .env file
cp .env.example .env

# 5. Add Firebase credentials to .env
VITE_FIREBASE_API_KEY=AIzaSy...
VITE_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
VITE_FIREBASE_DATABASE_URL=https://your-project-default-rtdb.firebaseio.com
VITE_FIREBASE_PROJECT_ID=your-project-id
VITE_FIREBASE_STORAGE_BUCKET=your-project.appspot.com
VITE_FIREBASE_MESSAGING_SENDER_ID=123456789
VITE_FIREBASE_APP_ID=1:123456789:web:abcdef

# 6. Restart dev server
npm run dev

# 7. Test real Firebase connection
# Open: http://localhost:5173/teacher
# Grant microphone permission
# Click "Start Recording"
# Check Firebase Console → Realtime Database for data

🎮 How to Use (Current Functionality)

Demo Mode (No Backend Required)

Perfect for: Testing UI, presentations, understanding the concept

# Teacher View Demo
http://localhost:5173/teacher?demo=true
- Simulated recording
- Animated waveform
- Fake audio chunks
- Session ID generation

# Student View Demo
http://localhost:5173/student?demo=true
- Rotating mock captions
- Connection indicators
- Auto-refresh UI

Firebase Mode (Partial Functionality)

Perfect for: Testing real-time sync, multi-device testing

Setup: Follow "Full Setup" above

Teacher Side:

  1. Open /teacher (without ?demo=true)
  2. Grant microphone permission
  3. Click "Start Recording"
  4. Speak into microphone
  5. See waveform respond to audio
  6. Audio chunks are recorded and an upload to the backend is attempted
  7. ⚠️ The upload will fail (no backend exists yet)

Student Side:

  1. Copy share link from teacher view
  2. Open link on another device/tab
  3. Wait for captions
  4. ⚠️ No captions will appear (backend needed to write to Firebase)

Manual Testing: You can manually write captions to Firebase for testing:

// Note: bare-module imports like 'firebase/database' will NOT resolve in the
// browser console. Either create the record by hand in the Firebase Console's
// Realtime Database editor, or paste this into a dev-only module inside the app:
import { getDatabase, ref, set, serverTimestamp } from 'firebase/database';

// Get sessionId from the student page URL
const sessionId = new URLSearchParams(window.location.search).get('sessionId');

const db = getDatabase();
set(ref(db, `captions/${sessionId}/latest`), {
  text: "This is a test caption",
  timestamp: serverTimestamp()
});

🔧 Development Workflow

Available Scripts

# Development
npm run dev              # Start dev server (http://localhost:5173)
npm run dev -- --host    # Expose to network (for mobile testing)

# Building
npm run build            # Production build (output: dist/)
npm run preview          # Preview production build locally

# Code Quality
npm run lint             # Run ESLint
npm run type-check       # TypeScript type checking (if configured)

# Testing (not yet implemented)
npm run test             # Run unit tests (Vitest)
npm run test:ui          # Vitest UI mode
npm run test:e2e         # End-to-end tests (not configured)

# Deployment (requires configuration)
npm run deploy           # Deploy to Firebase Hosting (if configured)
./deploy-cloud-run.sh    # Deploy to Google Cloud Run

Git Workflow

# Current branches
main                     # ✅ Stable, deployable code
add-vitest-and-tests    # ⚠️ Merged, can be deleted

# Recommended workflow for new features
git checkout -b feature/backend-integration
# Make changes
git add .
git commit -m "feat: add backend transcription endpoint"
git push origin feature/backend-integration
# Create pull request on GitHub

Environment Files

# Never commit these files
.env                    # Local development secrets
.env.local              # Local overrides
.env.production         # Production secrets (use Secret Manager instead)

# Committed template
.env.example            # Template with placeholder values
.env.gcloud             # ✅ Template for Google Cloud deployment

🧪 Testing Status

What's Tested

Unit Tests: None written (Vitest installed but no test files)
Integration Tests: None
E2E Tests: None
Manual Testing: Extensive (see TESTING.md)

Testing Checklist (Manual)

Refer to TESTING.md for comprehensive manual testing procedures:

  • ✅ Demo mode (teacher & student)
  • ✅ Firebase connection
  • ✅ Audio recording and waveform
  • ✅ Mobile responsiveness
  • ✅ PWA installation
  • ⚠️ Backend integration (blocked: no backend)
  • ⚠️ End-to-end transcription flow (blocked: no backend)

Test Coverage Goals (2026)

Component Tests:
  ✓ WaveformVisualizer.tsx     → Test canvas rendering
  ✓ ErrorBoundary.tsx           → Test error catching

Hook Tests:
  ✓ useAudioRecorder.ts         → Test recording lifecycle
  ✓ useFirebaseConnection.ts    → Test reconnection logic
  ✓ useWakeLock.ts              → Test wake lock API

Integration Tests:
  ✓ Teacher → Firebase → Student flow
  ✓ Session management
  ✓ Multi-device synchronization

E2E Tests:
  ✓ Full user journey (teacher creates session, student joins)
  ✓ PWA installation flow
  ✓ Offline functionality

🚧 Known Issues & Limitations

Critical Issues (Blockers)

  1. No Backend Service

    • Impact: Audio cannot be transcribed to text
    • Workaround: Demo mode with mock captions
    • Priority: P0 - Must implement for production
  2. Firebase Public Access ⚠️

    • Impact: Anyone can read/write to database
    • Risk: Data manipulation, abuse
    • Workaround: Test mode only, not production-ready
    • Priority: P0 - Security risk
  3. No Authentication

    • Impact: Cannot identify users or secure sessions
    • Risk: Session hijacking
    • Priority: P1 - Required before production

High Priority Issues

  1. Audio Upload Failures Not Handled Gracefully

    • Impact: Silent failures when backend unreachable
    • Current: Shows error in console, UI retry logic exists but not user-friendly
    • Priority: P1
  2. No Persistence of Caption History

    • Impact: Only latest caption shown, no scrollback
    • Current: Firebase stores /latest only
    • Priority: P1
  3. Large Bundle Size ⚠️

    • Current: ~414KB (gzipped: ~129KB)
    • Target: <300KB gzipped
    • Impact: Slower load times on 3G networks
    • Priority: P2
  4. No Offline Transcription

    • Impact: Requires constant internet
    • Limitation: Speech-to-text APIs require network
    • Priority: P3 (future enhancement)
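A more user-friendly retry path for issue 1 would likely sit on top of a bounded exponential-backoff schedule, so the UI can surface "retrying in Ns" instead of failing silently. A minimal sketch (all names are illustrative, not from the repo):

```typescript
// Delays (ms) for retrying a failed chunk upload: base, 2x, 4x, ... capped.
// The cap keeps the teacher's view responsive even during long outages.
function backoffScheduleMs(attempts: number, baseMs = 500, capMs = 8000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, capMs));
  }
  return delays;
}
```

The upload loop would walk this schedule, updating a visible status indicator before each retry.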

Medium Priority Issues

  1. Limited Browser Support

    • Chrome: ✅ Fully supported
    • Firefox: ✅ Mostly supported
    • Safari: ⚠️ MediaRecorder API limited
    • Edge: ✅ Fully supported
    • Priority: P2
  2. No Caption Formatting

    • Current: Plain text only
    • Desired: Speaker labels, timestamps, punctuation
    • Priority: P2
  3. Mobile Keyboard Covers Student View

    • Issue: On small screens, keyboard obscures captions
    • Workaround: Landscape mode
    • Priority: P2

Low Priority Issues

  1. No Analytics or Monitoring

    • Impact: Can't track usage, errors, or performance
    • Priority: P3
  2. No Admin Dashboard

    • Impact: Can't manage sessions, users, or view statistics
    • Priority: P3
  3. Home Page Not Styled

    • Current: Bare-bones landing page
    • Priority: P3

🎯 2026 Roadmap

Q1 2026 (January - March) - Backend Foundation

Goal: Build minimum viable backend to enable real transcription

Milestones:

  1. Backend Service Setup (2 weeks)

    • Choose tech stack (recommend: Python + FastAPI)
    • Set up project structure
    • Implement /api/transcribe endpoint
    • Add health check endpoint
    • Deploy to Google Cloud Run
  2. Speech-to-Text Integration (2 weeks)

    • Choose STT provider (Google Cloud Speech-to-Text recommended)
    • Implement audio processing pipeline
    • Handle audio format conversion
    • Write captions to Firebase
    • Add error handling and retry logic
  3. Testing & Optimization (1 week)

    • End-to-end testing with real audio
    • Latency optimization (<2s target)
    • Load testing (50 concurrent users)
    • Error rate monitoring

Deliverable: Fully functional teacher → backend → student flow

Q2 2026 (April - June) - Production Readiness

Goal: Security, stability, and deployment

Milestones:

  1. Authentication & Authorization (2 weeks)

    • Implement Firebase Authentication
    • Add teacher/student role system
    • Secure database rules
    • Session access control
    • Rate limiting
  2. Production Deployment (1 week)

    • Deploy frontend to Cloud Run
    • Deploy backend to Cloud Run
    • Configure custom domain
    • Set up SSL certificates
    • Configure CDN
  3. Monitoring & Analytics (1 week)

    • Set up Google Analytics
    • Configure error tracking (Sentry)
    • Cloud Monitoring dashboards
    • Performance tracking
    • Cost monitoring
  4. Testing Suite (2 weeks)

    • Unit tests (80% coverage target)
    • Integration tests
    • E2E tests with Playwright
    • CI/CD pipeline with automated tests

Deliverable: Production-ready, secure, monitored system

Q3 2026 (July - September) - Feature Enhancements

Goal: Improve user experience and add advanced features

Milestones:

  1. Caption History & Management (2 weeks)

    • Store full caption history in Firebase
    • Scrollable caption timeline
    • Search captions
    • Export captions (TXT, PDF, SRT)
    • Session recording playback
  2. Accessibility Improvements (2 weeks)

    • Screen reader optimization
    • Keyboard shortcuts (Space: toggle recording, etc.)
    • High contrast themes
    • Font size customization
    • WCAG 2.1 Level AA compliance
  3. Multi-Language Support (2 weeks)

    • Swahili transcription
    • English transcription
    • Language detection/selection
    • UI translation (i18n)
  4. Performance Optimization (1 week)

    • Code splitting
    • Lazy loading
    • Bundle size reduction (<300KB)
    • Service worker optimization

Deliverable: Feature-rich, accessible, multilingual PWA

Q4 2026 (October - December) - Scale & Polish

Goal: Handle production scale and add premium features

Milestones:

  1. Admin Dashboard (3 weeks)

    • Session management interface
    • User management
    • Usage analytics dashboard
    • System health monitoring
    • Billing integration (if monetizing)
  2. Advanced Features (3 weeks)

    • Multiple students per session
    • Caption translation (real-time)
    • Speech speed adjustment
    • Caption styling customization
    • Integration with Google Classroom (if applicable)
  3. Mobile Apps (4 weeks)

    • Android app (using Capacitor or React Native)
    • iOS app
    • App store deployment
    • Push notifications
  4. Documentation & Training (2 weeks)

    • User guides (video tutorials)
    • Teacher training materials
    • Developer documentation
    • API documentation

Deliverable: Scalable, feature-complete product ready for wide deployment


🔐 Security Considerations

Current Security Status: ⚠️ DEVELOPMENT ONLY

Critical Security Issues:

  1. Firebase Rules are Public

    // Current rules (UNSAFE for production)
    {
      "rules": {
        ".read": true,
        ".write": true
      }
    }

    Risk: Anyone can read/write all data
    Action Required: Implement authentication-based rules before production

  2. No Authentication

    • Any user can create sessions
    • No session ownership
    • No access control
  3. Environment Variables in Client ⚠️

    • Firebase config exposed in browser (expected for Firebase Web SDK)
    • Backend URL visible in source
    • Mitigation: Use Firebase Security Rules and authentication
  4. No Rate Limiting

    • API can be abused
    • No DDoS protection
    • Action Required: Implement Cloud Armor or API Gateway
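As a concrete starting point for the roadmap below, the wide-open rules shown above could be narrowed to something like the following. This is only a sketch; the real paths and auth model must match the final design (here, students read captions anonymously while only authenticated writers can publish):

```json
{
  "rules": {
    ".read": false,
    ".write": false,
    "captions": {
      "$sessionId": {
        ".read": true,
        ".write": "auth != null"
      }
    }
  }
}
```

Top-level deny plus narrow per-path grants is the usual pattern; writes would come from the authenticated backend or teacher session rather than arbitrary clients.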

Security Roadmap

Before Beta Release:

  • Implement Firebase Authentication
  • Secure database rules (authenticated users only)
  • Add session-based access control
  • Environment variables via Secret Manager
  • HTTPS enforcement (automatic on Cloud Run)

Before Production:

  • Rate limiting (Cloud Armor)
  • DDoS protection
  • Security audit
  • Penetration testing
  • GDPR compliance review (if serving EU users)
  • Data retention policy
  • Privacy policy and terms of service

💰 Cost Estimates

Current Costs: ~$0/month

Why near-zero: Current usage fits within the Google Cloud and Firebase free tiers (see below), and the backend transcription service, the main future cost driver, does not exist yet.

Projected Costs (Post-Deployment)

Free Tier Usage (0-100 users/month):

  • Firebase Realtime Database: Free (1GB storage, 10GB/month bandwidth)
  • Google Cloud Run: Free (2M requests, 360,000 GB-seconds)
  • Firebase Hosting: Free (10GB storage, 360MB/day transfer)
  • Total: $0/month

Small Scale (100-1,000 users/month):

  • Firebase: $5-10/month
  • Cloud Run (Frontend): $2-5/month
  • Cloud Run (Backend): $10-20/month (depends on STT API usage)
  • Speech-to-Text API: $50-200/month (at $0.006/15 seconds)
  • Cloud Storage (audio backups): $1-5/month
  • Total: $68-240/month

Medium Scale (1,000-10,000 users/month):

  • Firebase: $25-50/month
  • Cloud Run (Frontend): $10-20/month
  • Cloud Run (Backend): $50-100/month
  • Speech-to-Text API: $500-2,000/month
  • CDN & Load Balancer: $20-50/month
  • Monitoring & Logging: $10-30/month
  • Total: $615-2,250/month

Cost Optimization Strategies (see OPTIMIZATION.md):

  • Use Cloud CDN to reduce egress costs
  • Implement caching to reduce database reads
  • Batch audio chunks before transcription
  • Use spot instances for non-critical workloads
  • Set up budget alerts

📚 Documentation Map

All documentation is in the /docs folder:

| Document | Purpose | Best For | Lines |
| --- | --- | --- | --- |
| README.md | Project overview, architecture, tech stack | Understanding the project | 500 |
| TESTING.md | Manual testing procedures, checklists | QA, testing new features | 800 |
| DEPLOYMENT.md | Vercel/Firebase/Netlify deployment | Quick deployment | 700 |
| DEPLOYMENT_GOOGLE_CLOUD.md | Complete GCP deployment guide | Production GCP deployment | 1,200 |
| OPTIMIZATION.md | Performance, cost, security optimization | Production optimization | 900 |
| TROUBLESHOOTING.md | Common issues, solutions, debugging | Fixing problems | 600 |
| PROJECT_STATUS.md | This file: complete project context | Onboarding, AI agents | 2,500+ |

Total Documentation: ~7,200 lines of comprehensive guides


🤖 For AI Agents: Getting Started

If you're an AI agent tasked with continuing development on this project, here's what you need to know:

Project State

  • Status: Post-hackathon MVP, functional frontend, no backend
  • Phase: Ready for backend development
  • Priority: Implement /api/transcribe endpoint

Quick Context

// The flow that needs completion:
Teacher records audio → [MISSING: Backend transcription] → Firebase → Student sees captions

Key Files to Understand

  1. src/hooks/useAudioRecorder.ts - Audio recording logic, uploads to /api/transcribe
  2. src/services/firebase.ts - Firebase configuration
  3. src/pages/TeacherView.tsx - Teacher UI, calls useAudioRecorder
  4. src/pages/StudentView.tsx - Student UI, subscribes to Firebase
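The chunks that useAudioRecorder.ts uploads are base64 data URLs. The real hook presumably encodes a MediaRecorder Blob in the browser; this Node-flavoured sketch (using Buffer, which does not exist in browsers) only demonstrates the payload format the backend should expect:

```typescript
// Wrap raw audio bytes in the data-URL format used by the /api/transcribe spec.
// In the browser this would be done with FileReader.readAsDataURL on a Blob.
function toAudioDataUrl(bytes: Uint8Array, mimeType = "audio/webm"): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}
```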

Immediate Next Steps

  1. Create backend service (Python/FastAPI or Node.js/Express)
  2. Implement POST /api/transcribe endpoint
  3. Integrate Google Cloud Speech-to-Text API
  4. Write transcribed text to Firebase at /captions/{sessionId}/latest
  5. Test end-to-end flow

Backend Specification Needed

// POST /api/transcribe?sessionId=abc123xyz
// Body: { audioChunk: "data:audio/webm;base64,..." }
// Response: { success: true, text: "transcribed text" }
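Whichever stack is chosen, the endpoint's contract can be pinned down independently of any framework. A TypeScript sketch with a stubbed transcriber (the Speech-to-Text integration is the real work still to be done, and all names here are illustrative):

```typescript
interface TranscribeRequest { audioChunk: string }
interface TranscribeResponse { success: boolean; text?: string; error?: string }

const DATA_URL_PREFIX = /^data:audio\/[\w.+-]+;base64,/;

// Validate the payload, decode the audio, and hand it to a pluggable transcriber.
// A real handler would call Speech-to-Text here and then write the result to Firebase.
function handleTranscribe(
  body: TranscribeRequest,
  transcribe: (audio: Buffer) => string,
): TranscribeResponse {
  const match = body.audioChunk?.match(DATA_URL_PREFIX);
  if (!match) {
    return { success: false, error: "audioChunk must be a base64 audio data URL" };
  }
  const audio = Buffer.from(body.audioChunk.slice(match[0].length), "base64");
  return { success: true, text: transcribe(audio) };
}
```

Keeping validation separate from the transcriber makes the endpoint unit-testable before any STT credentials exist.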

Firebase Write Target

// Write to: /captions/{sessionId}/latest
{
  text: "transcribed caption",
  timestamp: serverTimestamp()
}

Environment Variables Required

# Frontend (.env)
VITE_BACKEND_API_URL=http://localhost:8000  # Your backend URL

# Backend (.env)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
FIREBASE_DATABASE_URL=https://project.firebaseio.com

Testing the Integration

# 1. Start backend: python main.py (or equivalent)
# 2. Start frontend: npm run dev
# 3. Open: http://localhost:5173/teacher
# 4. Grant mic permission, click "Start Recording"
# 5. Speak: "Hello world"
# 6. Check logs: Backend should receive audio, return text
# 7. Open: http://localhost:5173/student?sessionId=<SESSION_ID>
# 8. Verify: Caption "Hello world" appears

Code Style

  • TypeScript: Strict mode, explicit types
  • React: Functional components with hooks
  • Naming: camelCase for variables, PascalCase for components
  • Comments: Explain "why", not "what"
  • Error handling: Always try/catch async operations

Git Commit Format

feat: add backend transcription service
fix: resolve audio upload timeout issue
docs: update deployment guide
test: add unit tests for useAudioRecorder
chore: upgrade dependencies

🎓 Learning Resources

For Understanding the Codebase

React & TypeScript:

  • React documentation (react.dev)
  • TypeScript Handbook

Firebase:

  • Firebase Realtime Database documentation (Web SDK)

PWA:

  • MDN Progressive Web Apps guide
  • vite-plugin-pwa documentation

For Backend Development

Speech-to-Text:

  • Google Cloud Speech-to-Text documentation

API Development:

  • FastAPI documentation (Python) or Express documentation (Node.js)

For Deployment

Google Cloud:

  • Cloud Run documentation
  • Cloud Build documentation

📞 Contact & Support

Project Maintainers

Eli Keli (Project Lead)

Viggen Korir (Original Repository Owner)

Getting Help

For Bug Reports:

  • Open a GitHub issue with label bug
  • Include steps to reproduce, expected vs. actual behavior, and browser/device details

For Feature Requests:

  • Open an issue with label enhancement
  • Describe use case and proposed solution

For Questions:

  • Open a GitHub issue, or reach out to the maintainers listed above

🎉 Acknowledgments

Hackathon: Developed during December 2025 hackathon
Inspiration: Supporting inclusive education in Kenyan classrooms
Technology: Built with React, Firebase, and Google Cloud
Contributors: See GitHub contributors page


📄 License

MIT License - See LICENSE file for details

TL;DR: Free to use, modify, and distribute. Attribution appreciated.


🚀 Final Words for New Contributors

Welcome to Sauti Darasa! This project has significant potential to make education accessible to thousands of deaf and hard-of-hearing students in Kenya and beyond.

What makes this project special:

  • Social Impact: Solving a real problem for underserved students
  • Technical Excellence: Modern stack, clean architecture, comprehensive docs
  • Nearly Production-Ready: ~80% complete, with a clear roadmap for the remaining 20%
  • Scalable: Designed to handle thousands of concurrent sessions
  • Well Documented: 7,200+ lines of documentation

The biggest challenge: Building the backend transcription service. This is where we need help the most.

What you'll gain:

  • Experience with React, TypeScript, Firebase, Google Cloud
  • Understanding of real-time systems and browser media APIs (MediaRecorder, Wake Lock)
  • Portfolio project with social impact
  • Contribution to open-source education technology

Ready to contribute?

  1. Fork the repository
  2. Set up your development environment (5 minutes)
  3. Choose a task from the roadmap
  4. Make your contribution
  5. Submit a pull request

Let's make education accessible to everyone! 🎓


Document Version: 1.0
Last Updated: January 1, 2026
Next Review: January 15, 2026
Status: 🟢 Active Development


This document is maintained by the Sauti Darasa team. For updates or corrections, please open a pull request.