SiddharthKarmokar/SpeakSnap-core

SpeakSnap 🧠💬

SpeakSnap is an AI-powered meeting companion that enhances real-time conversations by identifying and summarizing domain-specific terms as you speak, helping everyone stay on the same page.

Perfect for technical discussions, onboarding sessions, or interdisciplinary meetings, SpeakSnap provides live contextual explanations of complex terms right inside your video call.


✨ Features

  • πŸ—£οΈ Real-time Audio Transcription using Azure Speech-to-Text
  • 🧠 Contextual Term Detection with Google's Gemini API via LangChain
  • πŸ’‘ Dynamic Popups in the frontend to display term summaries live during meetings
  • πŸŽ₯ Jitsi Meet Integration for live video/audio conferencing
  • πŸ“¦ Modular Architecture split into Core (AI) and Suite (App)

🧱 Architecture Overview

Component   Description
Core        Python module that uses Gemini + LangChain to process domain-specific terms
Suite       JavaScript backend and frontend with Azure STT, Jitsi, and term popup UI

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/your-org/speaksnap.git
cd speaksnap

🧠 Core (Gemini + LangChain) – core/

The Core handles all AI-based processing, such as calling the Gemini API to detect and summarize domain-specific terms during the meeting.

Setup

  1. Navigate to the core/ directory:

    cd core
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate        # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt

Configure .env

Create a .env file in the core/ directory with your Google Gemini API key:

GOOGLE_API_KEY=your_google_gemini_api_key_here
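Once the `.env` file has been loaded into the process environment (for example via python-dotenv or the shell), the key can be read at startup. A minimal sketch, assuming the hypothetical helper name `load_gemini_key` (not part of the repository):

```python
import os

def load_gemini_key() -> str:
    """Read the Gemini API key from the environment.

    Assumes core/.env has already been loaded into the process
    environment (e.g. by python-dotenv or the shell).
    """
    key = os.environ.get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; check core/.env")
    return key
```

Failing fast here gives a clearer error than letting a later Gemini call fail with an opaque authentication message.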

Run

Start the core service, which will handle term detection and summarization:

python main.py
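As a rough illustration of what the Core does per transcript chunk, here is a sketch of assembling a term-detection prompt. The function name and prompt wording are hypothetical; the actual prompt lives inside core/ and is sent to Gemini through LangChain:

```python
# Hypothetical sketch only: illustrates the shape of a term-detection
# request, not the repository's actual prompt.
def build_term_prompt(transcript: str) -> str:
    """Wrap a transcript chunk in instructions asking the model to
    return term/explanation pairs as JSON."""
    instructions = (
        "Identify domain-specific terms in the transcript below and "
        "explain each one concisely for an unfamiliar listener. "
        'Respond as a JSON list of {"term": ..., "explanation": ...} objects.'
    )
    return f"{instructions}\n\nTranscript:\n{transcript}"
```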

💻 Suite (Backend + Frontend) – suite/

The Suite is responsible for the frontend UI and the backend WebSocket server that connects to the core service.

🔧 Backend Setup

  1. Navigate to the suite/backend directory:

    cd suite/backend
  2. Install dependencies:

    npm install

    If you face any issues, try:

    npm install vite@4.0.0

Configure .env

Create a .env file in suite/backend/ with the following environment variables:

MONGO_URI=your_mongodb_connection_string
AZURE_SPEECH_KEY=your_azure_speech_key
AZURE_REGION=your_azure_region

Start Backend

Start the backend WebSocket server, which will handle real-time speech-to-text data and interact with the Core:

node server.js

🎨 Frontend Setup

  1. Navigate to the suite/frontend/SpeakSuit directory:

    cd suite/frontend/SpeakSuit
  2. Install dependencies:

    npm install

    If any issues arise, try:

    npm install vite@4.0.0

Run Frontend

Start the frontend React app, which will display live term summaries in the meeting:

npm run dev

The app will be available at http://localhost:5173.


🌍 Hosted API Server

Due to technical difficulties, only the API server is currently hosted, at the following URL:

Request JSON Schema

The API accepts the following JSON request schema:

{
  "text": "string",
  "userid": "string",
  "sessionid": "string"
}
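Building a request body that matches this schema takes only the standard library. A minimal sketch (the helper name `build_request` is illustrative; the hosted endpoint URL is not shown here, so sending the body is left to whichever HTTP client you use):

```python
import json

def build_request(text: str, userid: str, sessionid: str) -> str:
    """Serialize a request body matching the API's request schema."""
    payload = {"text": text, "userid": userid, "sessionid": sessionid}
    return json.dumps(payload)
```

For example, `build_request("We moved the index to a B-tree.", "u1", "s1")` produces a JSON string ready to POST with `Content-Type: application/json`.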

Response JSON Schema

The response schema is as follows:

{
  "title": "Summary",
  "type": "object",
  "properties": {
    "summary": {
      "type": "string",
      "description": "An overall summary of the entire chat history until the most recent query, in as few lines as possible but make sure to the major components of old text as well"
    },
    "sentiment": {
      "type": "string",
      "enum": ["pos", "neu", "neg"],
      "description": "Return the sentiment of the conversation as positive, neutral, or negative"
    },
    "name": {
      "type": ["string", "null"],
      "description": "The speaker's name, if available. Use null if the speaker is unidentified or not mentioned in the text."
    },
    "contextual_explanations": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "term": {
            "type": "string",
            "description": "A term or phrase used in the conversation that might require explanationβ€”this includes pop culture references (e.g., TV shows, movies), scientific terms, financial or economic concepts, historical or political references, technical jargon, or any other potentially unclear or domain-specific expression."
          },
          "explanation": {
            "type": "string",
            "description": "A concise explanation of the term in the context it was used, aimed at someone who may not be familiar with it."
          }
        },
        "required": ["term", "explanation"]
      },
      "description": "List of all terms or phrases in the conversation that could benefit from contextual explanation, regardless of their domain."
    }
  },
  "required": ["summary", "sentiment"]
}
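A caller can sanity-check a response against the constraints stated in this schema before using it. A minimal stdlib sketch (the helper name `validate_summary` is illustrative):

```python
def validate_summary(resp: dict) -> list[str]:
    """Return a list of problems found in an API response.

    An empty list means the response satisfies the schema's stated
    constraints: required fields, the sentiment enum, and the shape
    of contextual explanations.
    """
    errors = []
    for field in ("summary", "sentiment"):
        if field not in resp:
            errors.append(f"missing required field: {field}")
    if "sentiment" in resp and resp["sentiment"] not in ("pos", "neu", "neg"):
        errors.append("sentiment must be one of pos/neu/neg")
    for item in resp.get("contextual_explanations", []):
        if "term" not in item or "explanation" not in item:
            errors.append("each contextual explanation needs term and explanation")
    return errors
```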

✅ Workflow Summary

  1. Start the Core service (python core/main.py)
  2. Start the Suite backend (node suite/backend/server.js)
  3. Start the Suite frontend (npm run dev inside suite/frontend/SpeakSuit)
  4. Join a Jitsi meeting, start speaking, and watch contextual definitions appear live!

📄 Repositories


🖼 Preview:

GIF Slider

📄 License

MIT License – See individual folders for details.


👥 Contributors
