From 470074cbbbb192c4960a840b94431d583ae493a3 Mon Sep 17 00:00:00 2001 From: Ivan-Wang-tech <157322972+Ivan-Wang-tech@users.noreply.github.com> Date: Wed, 19 Nov 2025 02:49:09 -0500 Subject: [PATCH] finalize readme.md --- README.md | 346 +++++++++++++++++++++++++++++++-------------- docker-compose.yml | 2 +- 2 files changed, 243 insertions(+), 105 deletions(-) diff --git a/README.md b/README.md index 2c7cc203..3820217a 100644 --- a/README.md +++ b/README.md @@ -1,52 +1,82 @@ +# โœ‹ HandSense โ€” Containerized Machine Learning + Web Dashboard System + +![ML Client CI](https://github.com/swe-students-fall2025/4-containers-nov/actions/workflows/ml-client-ci.yml/badge.svg) +![Web App CI](https://github.com/swe-students-fall2025/4-containers-nov/actions/workflows/web-app-ci.yml/badge.svg) ![Lint-free](https://github.com/nyu-software-engineering/containerized-app-exercise/actions/workflows/lint.yml/badge.svg) -# Containerized App Exercise +A fully containerized, three-service application that performs **real-time hand gesture recognition** using a MediaPipe + PyTorch machine-learning client, stores gesture events inside **MongoDB**, and visualizes them through a **Flask-based web dashboard**. -Build a containerized app that uses machine learning. See [instructions](./instructions.md) for details. +This project demonstrates how separate services communicate inside a Dockerized micro-service architecture. 
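At the core of that communication is one small record: each detected gesture becomes a single MongoDB document. A minimal sketch of its shape (the `make_gesture_event` helper and the example values are illustrative, not code from the repo; the fields match the event document shown in the MongoDB integration section later in this README):

```python
from datetime import datetime, timezone

def make_gesture_event(gesture, confidence, handedness):
    """Build one document for the handsense.gesture_events collection."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO-8601
        "gesture": gesture,          # predicted class label, e.g. "palm"
        "confidence": confidence,    # model confidence in [0, 1]
        "handedness": handedness,    # "Left" or "Right" from MediaPipe
    }

event = make_gesture_event("palm", 0.97, "Right")
print(event["gesture"], event["confidence"])  # prints: palm 0.97
```

The web app only ever reads documents of this shape, which is what keeps the three services decoupled.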
-# Teammates

-Ivan Wang, [Harrison Gao](https://github.com/HTK-G), [Sina Liu](https://github.com/SinaL0123), Serena, [Hanqi Gui](https://github.com/hanqigui)

-# Machine Learning Client โ€” Hand Gesture Recognition

+---
+
+## ๐Ÿ‘ฅ Teammates
+
+- [Ivan Wang](https://github.com/Ivan-Wang-tech)
+- [Harrison Gao](https://github.com/HTK-G)
+- [Sina Liu](https://github.com/SinaL0123)
+- [Serena Wang](https://github.com/serena0615)
+- [Hanqi Gui](https://github.com/hanqigui)

-This folder contains the **machine-learning-client** subsystem of our 3-container project:

-- **Machine Learning Client** โ†’ collects sensor data (webcam), performs gesture recognition with MediaPipe + PyTorch, and later sends results to MongoDB.
-- **Web App** โ†’ visualizes gesture events stored in the database.
-- **MongoDB** โ†’ central datastore for gesture metadata.

-The ML client runs entirely as a _backend service_ (no user-facing UI).
-It processes camera input, performs ML inference, and will later communicate with the database once integrated with the web app.

+---
+
+## ๐Ÿงฑ System Overview
+
+The system consists of **three Dockerized services**:
+
+```
++------------------------+       +-----------------------+       +------------------------+
+| Machine Learning       |       | MongoDB               |       | Web App                |
+| Client                 |  -->  | handsense database    |  -->  | Dashboard (Flask)      |
+| (MediaPipe + PyTorch)  |       | gesture_events        |       | Visualize gestures     |
++------------------------+       +-----------------------+       +------------------------+
+```
+
+### ๐Ÿ”น Machine-Learning Client
+Runs locally or inside Docker. It reads webcam frames, detects hands with MediaPipe, predicts gestures with a PyTorch MLP, and inserts events into the `handsense.gesture_events` collection.
+
+### ๐Ÿ”น MongoDB
+Stores gesture logs, statistics, and the capture-state toggle.

----

-# 1. 
Project Structure +### ๐Ÿ”น Web App +Reads gesture events from MongoDB and presents a dashboard showing: -## Project Structure +- Live latest gesture +- Gesture distribution +- Recent event timeline +- Toggle capture control (`/api/control`) + +After all services run, you can visit: + +๐Ÿ‘‰ **http://localhost:5000** + +--- + +## ๐Ÿ“ Project Structure ```text โ”œโ”€โ”€ docker-compose.yml โ”œโ”€โ”€ instructions.md โ”œโ”€โ”€ LICENSE โ”œโ”€โ”€ machine-learning-client -โ”‚ย ย  โ”œโ”€โ”€ data -โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ hagrid_keypoints_X.npy -โ”‚ย ย  โ”‚ย ย  โ””โ”€โ”€ hagrid_keypoints_y.npy -โ”‚ย ย  โ”œโ”€โ”€ Dockerfile -โ”‚ย ย  โ”œโ”€โ”€ models -โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ gesture_mlp.pt -โ”‚ย ย  โ”‚ย ย  โ””โ”€โ”€ train_mlp.py -โ”‚ย ย  โ”œโ”€โ”€ Pipfile -โ”‚ย ย  โ”œโ”€โ”€ Pipfile.lock -โ”‚ย ย  โ”œโ”€โ”€ src -โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ __init__.py -โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ extract_keypoints_from_hagrid.py -โ”‚ย ย  โ”‚ย ย  โ””โ”€โ”€ live_mediapipe_mlp.py -โ”‚ย ย  โ””โ”€โ”€ tests -โ”‚ย ย  โ”œโ”€โ”€ __init__.py -โ”‚ย ย  โ”œโ”€โ”€ test_extract_keypoints_from_hagrid.py -โ”‚ย ย  โ””โ”€โ”€ test_live_mediapipe_mlp.py +โ”‚ โ”œโ”€โ”€ data +โ”‚ โ”‚ โ”œโ”€โ”€ hagrid_keypoints_X.npy +โ”‚ โ”‚ โ””โ”€โ”€ hagrid_keypoints_y.npy +โ”‚ โ”œโ”€โ”€ Dockerfile +โ”‚ โ”œโ”€โ”€ models +โ”‚ โ”‚ โ”œโ”€โ”€ gesture_mlp.pt +โ”‚ โ”‚ โ””โ”€โ”€ train_mlp.py +โ”‚ โ”œโ”€โ”€ Pipfile +โ”‚ โ”œโ”€โ”€ Pipfile.lock +โ”‚ โ”œโ”€โ”€ src +โ”‚ โ”‚ โ”œโ”€โ”€ __init__.py +โ”‚ โ”‚ โ”œโ”€โ”€ extract_keypoints_from_hagrid.py +โ”‚ โ”‚ โ””โ”€โ”€ live_mediapipe_mlp.py +โ”‚ โ””โ”€โ”€ tests +โ”‚ โ”œโ”€โ”€ __init__.py +โ”‚ โ”œโ”€โ”€ test_extract_keypoints_from_hagrid.py +โ”‚ โ””โ”€โ”€ test_live_mediapipe_mlp.py โ”œโ”€โ”€ README.md โ””โ”€โ”€ web-app โ”œโ”€โ”€ app.py @@ -55,131 +85,239 @@ It processes camera input, performs ML inference, and will later communicate wit โ”œโ”€โ”€ Pipfile.lock โ”œโ”€โ”€ readme.txt โ”œโ”€โ”€ static - โ”‚ย ย  โ”œโ”€โ”€ audios - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ among_us.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ 
android_beep.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ bom.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ error.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ playme.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ rick_roll.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ rizz.mp3 - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ sponge_bob.mp3 - โ”‚ย ย  โ”‚ย ย  โ””โ”€โ”€ uwu.mp3 - โ”‚ย ย  โ”œโ”€โ”€ hagrid_classes.json - โ”‚ย ย  โ”œโ”€โ”€ images - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ fist.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ like.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ ok.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ one.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ palm.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ stop.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ thinking.png - โ”‚ย ย  โ”‚ย ย  โ”œโ”€โ”€ three.png - โ”‚ย ย  โ”‚ย ย  โ””โ”€โ”€ two_up.png - โ”‚ย ย  โ”œโ”€โ”€ script.js - โ”‚ย ย  โ””โ”€โ”€ style.css + โ”‚ โ”œโ”€โ”€ audios + โ”‚ โ”‚ โ”œโ”€โ”€ among_us.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ android_beep.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ bom.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ error.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ playme.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ rick_roll.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ rizz.mp3 + โ”‚ โ”‚ โ”œโ”€โ”€ sponge_bob.mp3 + โ”‚ โ”‚ โ””โ”€โ”€ uwu.mp3 + โ”‚ โ”œโ”€โ”€ hagrid_classes.json + โ”‚ โ”œโ”€โ”€ images + โ”‚ โ”‚ โ”œโ”€โ”€ fist.png + โ”‚ โ”‚ โ”œโ”€โ”€ like.png + โ”‚ โ”‚ โ”œโ”€โ”€ ok.png + โ”‚ โ”‚ โ”œโ”€โ”€ one.png + โ”‚ โ”‚ โ”œโ”€โ”€ palm.png + โ”‚ โ”‚ โ”œโ”€โ”€ stop.png + โ”‚ โ”‚ โ”œโ”€โ”€ thinking.png + โ”‚ โ”‚ โ”œโ”€โ”€ three.png + โ”‚ โ”‚ โ””โ”€โ”€ two_up.png + โ”‚ โ”œโ”€โ”€ script.js + โ”‚ โ””โ”€โ”€ style.css โ”œโ”€โ”€ templates - โ”‚ย ย  โ””โ”€โ”€ index.html + โ”‚ โ””โ”€โ”€ index.html โ””โ”€โ”€ tests - โ”œโ”€โ”€ __init__.py + โ”œโ”€โ”€ __init__.py โ”œโ”€โ”€ conftest.py โ””โ”€โ”€ test_app.py ``` -# 2. Environment Setup (macOS, M-series) +--- + +## โš™๏ธ 1. Environment Setup (Any Platform) -## **1. Install pipenv (if not installed)** +The recommended workflow uses **pipenv** for dependency management. +### macOS / Linux / Windows (WSL) + +#### Install pipenv ```bash pip install pipenv ``` -## 2. Install all ML client dependencies +--- + +## โš™๏ธ 2. 
Running the System (Docker)

-From the repository root:
+From the project root:

```bash
-cd machine-learning-client
-pipenv install --dev
+docker compose up --build
```

-This installs all dependencies, including:
-
-- mediapipe
-- opencv-python
-- numpy
-- torch (with MPS acceleration for Apple Silicon)
-- pylint + black
-- pytest (required later for unit testing)
+This starts three services:
+
+| Service | URL | Purpose |
+|---------|-----|---------|
+| web-app | http://localhost:5000 | Dashboard UI |
+| mongodb | localhost:27017 | Database |
+| ml-client | headless, no UI | Captures gestures and inserts them into the DB |
+
+To stop everything:
+
+```bash
+docker compose down
+```
+
+---

-# 3. Run Live Gesture Recognition (MediaPipe + PyTorch)
+## ๐Ÿ‘๏ธ Running the ML Client With Webcam (macOS/Windows/Linux Host)

-Make sure your webcam is connected, then run:
+Since Docker on macOS cannot expose the host webcam as `/dev/video0`, run the ML client on the host machine:

```bash
cd machine-learning-client
+pipenv install --dev
pipenv run python src/live_mediapipe_mlp.py
```

-You should see:
-
-- a live webcam preview window
-
-- detected hand skeletons
-
-- predicted gesture label displayed on the frame
-
-Press **q** to exit.
+Features:
+
+- Live webcam feed
+- MediaPipe hand tracking
+- PyTorch gesture inference
+- Inserts gesture records into `handsense.gesture_events`
+- Press `q` to quit
+
+---
+
+## ๐Ÿ—„๏ธ 3. MongoDB Configuration + Starter Data
+
+The database name is:
+
+```
+handsense
+```
+
+Collections created automatically:

-## Note About Running the ML Client in Docker on macOS

-On macOS, Docker containers cannot easily access the host webcam, because the macOS camera is not exposed as a Linux-style /dev/video0 device inside containers.
-For this reason, the live gesture recognition demo cannot run inside the Docker container on macOS.
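One way to surface that limitation early (a minimal sketch; `camera_device_present` is a hypothetical helper, not code from `live_mediapipe_mlp.py`, and the device path is the Linux V4L2 convention) is to probe for the camera node before opening a capture:

```python
import os

def camera_device_present(device="/dev/video0"):
    """True if a Linux-style V4L2 camera node exists; always False on macOS hosts."""
    return os.path.exists(device)

if not camera_device_present():
    print("No /dev/video0 found: run the ML client on the host, not in Docker.")
```

With a check like this, the containerized client can exit with a clear message instead of failing inside OpenCV.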
+
-During development and demo, we run the ML client directly on the host machine, where the webcam works normally:
-
-```bash
-pipenv run python src/live_mediapipe_mlp.py
-```
-
-The Docker image of the ML client is still fully functional for:
-
-- CI / GitHub Actions
-
-- dependency isolation
-
-- database integration tests
-
-- running without a webcam (e.g., headless mode)
-
-This behavior is expected on macOS and does not affect the overall functionality of the 3-container system.

-# 4. MongoDB Integration
-
-The ML client is already connected to MongoDB using pymongo.
-
-In src/live_mediapipe_mlp.py, a MongoDB database named handsense is created:
-
-```bash
-mongo_client = MongoClient("mongodb://localhost:27017")
-mongo_db = mongo_client["handsense"]
-gesture_collection = mongo_db["gesture_events"]
-```
-
-For every detected hand gesture, an event document is inserted:
-
-```bash
-event = {
-    "timestamp": datetime.now(timezone.utc).isoformat(),
-    "gesture": pred_label,
-    "confidence": confidence,
-    "handedness": handedness,
-}
-gesture_collection.insert_one(event)
-```
-
-This allows the Web App subsystem to read and visualize real-time gesture activity from:
-
-```bash
-handsense.gesture_events
-```

+| Collection | Purpose |
+|------------|---------|
+| gesture_events | Gesture events inserted by the ML client |
+| controls | Stores the capture toggle state |
+
+On first run, the ML client ensures this control document exists:
+
+```json
+{
+  "_id": "capture_control",
+  "enabled": false
+}
+```
+
+---
+
+## ๐Ÿ” 4. Environment Variables
+
+Both the ml-client and the web-app read these variables:
+
+| Variable | Description |
+|----------|-------------|
+| MONGO_URI | MongoDB connection string (default: `mongodb://mongodb:27017`) |
+| MONGO_DB_NAME | Database name (default: `handsense`) |
+| SECRET_KEY | Flask session secret |
+
+See `.env.example` below.
+
+---
+
+## ๐Ÿ“„ 5. .env.example (Required for TA Submission)
+
+Place this file in the project root:
+
+```env
+# MongoDB configuration
+MONGO_URI=mongodb://mongodb:27017
+MONGO_DB_NAME=handsense
+
+# Flask secret
+SECRET_KEY=dev-secret
+```
+
+Then create an actual `.env` from it:
+
+```bash
+cp .env.example .env
+```
+
+---
+
+## ๐Ÿ” 6. 
Web App (Flask) โ€” Running Locally ```bash -event = { - "timestamp": datetime.now(timezone.utc).isoformat(), - "gesture": pred_label, - "confidence": confidence, - "handedness": handedness, -} -gesture_collection.insert_one(event) +cd web-app +pipenv install --dev +pipenv run flask run --host=0.0.0.0 --port=5000 ``` -This allows the Web App subsystem to read and visualize real-time gesture activity from: +Navigate to: + +๐Ÿ‘‰ **http://localhost:5000** +### Endpoints: + +| Route | Description | +|-------|-------------| +| `/` | Dashboard UI | +| `/api/latest` | Latest gesture | +| `/api/latest_full` | Latest gesture (detailed) | +| `/api/control` | POST toggle capture | +| `/api/control/status` | GET capture control | + +--- + +## ๐Ÿงช 7. Testing + Linting + Coverage + +### Run ML Client Tests +```bash +cd machine-learning-client +pipenv run pytest --cov=src +pipenv run pylint src +``` + +### Run Web App Tests ```bash -handsense.gesture_events +cd web-app +pipenv run pytest --cov=. +pipenv run pylint app.py +``` + +Coverage must be โ‰ฅ 80%. + +--- + +## ๐Ÿงฐ 8. Docker Compose + +```yaml +version: "3.9" + +services: + mongodb: + image: mongo:6 + container_name: mongodb + ports: + - "27017:27017" + volumes: + - mongo-data:/data/db + + web-app: + build: + context: ./web-app + container_name: web-app + depends_on: + - mongodb + environment: + MONGO_URI: "mongodb://mongodb:27017" + MONGO_DB_NAME: "handsense" + FLASK_APP: "app.py" + FLASK_RUN_HOST: "0.0.0.0" + ports: + - "5000:5000" + + ml-client: + build: + context: ./machine-learning-client + container_name: ml-client + depends_on: + - mongodb + environment: + MONGO_URI: "mongodb://mongodb:27017" + MONGO_DB_NAME: "handsense" + +volumes: + mongo-data: ``` diff --git a/docker-compose.yml b/docker-compose.yml index 76e857d7..3a7bb532 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -21,7 +21,7 @@ services: FLASK_APP: "app.py" FLASK_RUN_HOST: "0.0.0.0" ports: - - "5001:5000" + - "5000:5000" ml-client: build: