A full-stack web application for manga translation with AI-powered features including OCR, balloon detection, and inpainting.
- 🖼️ Image Upload - Drag & drop or click to upload manga images
- 💬 Balloon Detection - AI-powered detection of speech balloons and text regions
- 🔍 OCR - Extract text from manga images (supports Japanese, English, Korean, Chinese)
- 🎨 Inpainting - Clean/remove text from images using AI
- 🖌️ Canvas Editor - Zoom, pan, and select regions
- 🔄 Real-time Updates - WebSocket connection for progress updates
- 📱 Responsive UI - Modern interface with Tailwind CSS
- React 18 + TypeScript
- Vite (build tool)
- Tailwind CSS (styling)
- FastAPI (Python)
- WebSocket support
- PIL/Pillow (image processing)
- CORS enabled
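Since the backend lists Pillow for image processing, its helpers likely resemble the following sketch. The function names `load_image` and `crop_region` are illustrative, not the project's actual API:

```python
from io import BytesIO

from PIL import Image


def load_image(data: bytes) -> Image.Image:
    """Decode uploaded image bytes into an RGB Pillow image."""
    return Image.open(BytesIO(data)).convert("RGB")


def crop_region(img: Image.Image, x: int, y: int, w: int, h: int) -> Image.Image:
    """Crop a selected region, clamping the box to the image bounds."""
    left = max(0, x)
    top = max(0, y)
    right = min(img.width, x + w)
    bottom = min(img.height, y + h)
    return img.crop((left, top, right, bottom))
```

Converting to RGB up front avoids surprises with palette or CMYK inputs before the crop is handed to OCR or inpainting.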
```
web_app/
├── backend/                  # FastAPI backend
│   ├── api/                  # API endpoints
│   │   ├── upload.py         # Image upload endpoint
│   │   ├── ocr.py            # OCR endpoint
│   │   ├── inpaint.py        # Inpainting endpoint
│   │   └── detect.py         # Detection endpoint
│   ├── core/                 # Core utilities
│   │   ├── config.py         # Configuration settings
│   │   ├── websocket_manager.py  # WebSocket manager
│   │   └── utils.py          # Utility functions
│   ├── services/             # AI service implementations
│   │   ├── ocr_service.py
│   │   ├── inpaint_service.py
│   │   └── detection_service.py
│   ├── uploads/              # Uploaded images (created at runtime)
│   ├── results/              # Processing results (created at runtime)
│   ├── temp/                 # Temporary files (created at runtime)
│   ├── main.py               # FastAPI entry point
│   └── requirements.txt      # Python dependencies
├── src/                      # Frontend source
│   ├── components/           # React components
│   ├── hooks/                # Custom React hooks
│   ├── services/             # API client
│   └── types/                # TypeScript types
├── start-dev.bat             # Windows startup script
├── start-dev.sh              # Linux/Mac startup script
└── package.json              # Node.js dependencies
```
- Python 3.8+
- Node.js 18+
- npm or yarn
1. Clone the repository (or navigate to the project folder)

2. Copy the environment file:

   ```bash
   cp .env.example .env
   ```

3. Run the startup script.

   On Windows:

   ```bash
   start-dev.bat
   ```

   On Linux/Mac:

   ```bash
   chmod +x start-dev.sh
   ./start-dev.sh
   ```

   This will:
   - Create a Python virtual environment
   - Install backend dependencies
   - Install frontend dependencies
   - Start both servers

4. Access the application:
   - Frontend: http://localhost:5173
   - Backend API: http://localhost:8000
   - API Docs: http://localhost:8000/docs
If you prefer to run servers manually:
Terminal 1 - Backend:
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python -m uvicorn main:app --reload --host 0.0.0.0 --port 8000Terminal 2 - Frontend:
npm install
npm run devPOST /api/upload- Upload an image fileGET /api/uploads/{file_id}- Get uploaded imageDELETE /api/upload/{file_id}- Delete uploaded image
POST /api/ocr- Run OCR (async)POST /api/ocr/sync- Run OCR (sync)GET /api/ocr/status/{task_id}- Get OCR task statusGET /api/ocr/languages- Get supported languages
POST /api/detect- Run detection (async)POST /api/detect/sync- Run detection (sync)GET /api/detect/status/{task_id}- Get detection statusGET /api/detect/models- Get available modelsGET /api/detect/types- Get detection types
POST /api/inpaint- Run inpainting (async)POST /api/inpaint/sync- Run inpainting (sync)GET /api/inpaint/status/{task_id}- Get inpainting statusGET /api/inpaint/methods- Get available methods
WS /ws- Real-time updates and progress notifications
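As a sketch of how a client might drive the async endpoints (submit a task, then poll its status URL), assuming a `{"status": ...}` response shape — the real payloads may differ:

```python
import time
from typing import Callable


def status_url(base: str, task: str, task_id: str) -> str:
    """Build the polling URL for an async task, e.g. /api/ocr/status/{task_id}."""
    return f"{base}/api/{task}/status/{task_id}"


def poll_status(fetch: Callable[[str], dict], url: str,
                interval: float = 0.0, max_tries: int = 10) -> dict:
    """Poll a status URL until the task leaves the 'pending' state.

    `fetch` is injected (e.g. a wrapper around requests.get) so the
    polling logic stays testable without a running server.
    """
    for _ in range(max_tries):
        result = fetch(url)
        if result.get("status") != "pending":
            return result
        time.sleep(interval)
    raise TimeoutError(f"task at {url} did not finish in {max_tries} polls")
```

With a real HTTP client this might be called as `poll_status(lambda u: requests.get(u).json(), status_url("http://localhost:8000", "ocr", task_id), interval=1.0)`; for immediate results, the `/sync` variants skip polling entirely.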
```bash
npm run dev      # Start dev server
npm run build    # Build for production
npm run preview  # Preview production build
npm run lint     # Run ESLint
```

```bash
cd backend
source venv/bin/activate

# Run with auto-reload
python -m uvicorn main:app --reload

# Run tests
pytest
```

| Variable | Description | Default |
|---|---|---|
| `VITE_API_URL` | Backend API URL | `http://localhost:8000` |
| `VITE_WS_URL` | WebSocket URL | `ws://localhost:8000/ws` |
| `HOST` | Backend host | `0.0.0.0` |
| `PORT` | Backend port | `8000` |
| `DEBUG` | Debug mode | `true` |
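The backend rows of the table map directly onto a small settings loader; a sketch of how `backend/core/config.py` might read them (the actual file may differ):

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    host: str
    port: int
    debug: bool


def load_settings(env: Optional[dict] = None) -> Settings:
    """Read backend settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        debug=env.get("DEBUG", "true").lower() == "true",
    )
```

The `VITE_*` variables are consumed by the frontend build instead; Vite only exposes variables with that prefix to client code.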
1. Upload an Image
   - Click "Open..." or drag & drop a manga image
   - Supported formats: JPG, PNG, WebP, BMP, TIFF

2. Detect Balloons
   - Click "Detect Balloons" to find speech balloons
   - Detections will appear as overlays on the image

3. Run OCR
   - Select a region with the rectangle tool, or run on the entire image
   - Click "Run OCR" to extract text
   - Results will show the extracted text with confidence scores

4. Clean/Inpaint
   - Select a region containing text
   - Click "Clean/Inpaint" to remove the text
   - The result will replace the original image

5. Canvas Tools
   - Pan: Move around the image
   - Rectangle Select: Select regions for processing
   - Zoom: Zoom in/out
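The rectangle-select tool has to cope with drags in any direction and with selections that spill past the image edge. A sketch of the kind of normalization done before a region is sent to the OCR or inpaint endpoints (`normalize_region` is a hypothetical helper, not the project's actual code):

```python
from typing import Tuple


def normalize_region(x1: float, y1: float, x2: float, y2: float,
                     width: int, height: int) -> Tuple[int, int, int, int]:
    """Turn two drag corners (in any order) into a clamped (x, y, w, h) box.

    The corners may come from a right-to-left or bottom-to-top drag, so
    min/max sorts them; clamping keeps the box inside the image.
    """
    left = max(0.0, min(x1, x2))
    top = max(0.0, min(y1, y2))
    right = min(float(width), max(x1, x2))
    bottom = min(float(height), max(y1, y2))
    return (int(left), int(top),
            int(max(0.0, right - left)), int(max(0.0, bottom - top)))
```

A fully out-of-bounds drag collapses to a zero-area box, which the caller can reject before making an API request.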
Backend:

```bash
cd backend
pip install -r requirements.txt

# Use a production ASGI server like gunicorn
pip install gunicorn
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```

Frontend:

```bash
npm run build
# Serve the dist/ folder with your web server (nginx, Apache, etc.)
```

- The current implementation uses mock AI services for demonstration
- To use real AI models, implement the actual model loading in:
  - `backend/services/ocr_service.py`
  - `backend/services/inpaint_service.py`
  - `backend/services/detection_service.py`
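One way to keep the swap from mock to real models painless is a shared service interface. This is a hypothetical sketch — the class and method names are illustrative, not the project's actual code:

```python
from abc import ABC, abstractmethod
from typing import List


class OCRService(ABC):
    """Interface that both the mock and a real model-backed service implement."""

    @abstractmethod
    def recognize(self, image_bytes: bytes, lang: str = "ja") -> List[dict]:
        """Return a list like [{'text': ..., 'confidence': ..., 'bbox': ...}]."""


class MockOCRService(OCRService):
    """Demonstration placeholder that returns a canned result."""

    def recognize(self, image_bytes: bytes, lang: str = "ja") -> List[dict]:
        return [{"text": "サンプル", "confidence": 0.99, "bbox": (0, 0, 10, 10)}]


# A real implementation would subclass OCRService, load its model once in
# __init__, and run inference inside recognize(); the API endpoints that
# call the service would not need to change.
```

The same pattern applies to the inpainting and detection services: keep the endpoint code programmed against the interface, and swap the concrete class via configuration.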
MIT License