A web application to transcribe Japanese audio and translate it to Vietnamese. The backend uses FastAPI and Whisper, and the frontend is built with React, TypeScript, and Vite.
- Upload Japanese audio files (`.mp3`, `.wav`)
- Transcribe audio to Japanese text using Whisper
- Translate Japanese text to Vietnamese
- View and navigate segments with timestamps
- Modern, responsive UI
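
The transcribe-then-translate flow listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `backend/main.py`: the function names and the segment dictionary layout are hypothetical, and the use of `GoogleTranslator` is an assumption (deep-translator ships several translator backends). The library imports are deferred so that the pure helpers can be exercised without Whisper installed.

```python
from typing import Callable, Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as MM:SS, or H:MM:SS past one hour."""
    total = int(seconds)  # fractional seconds are dropped
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def build_segments(
    raw_segments: List[Dict],
    translate: Callable[[str], str],
) -> List[Dict]:
    """Pair each Japanese segment with a translation and a display label.

    `raw_segments` is expected to look like Whisper's "segments" list:
    dictionaries with "start", "end" (seconds) and "text" keys.
    """
    result = []
    for seg in raw_segments:
        japanese = seg["text"].strip()
        if not japanese:
            continue  # skip segments with no recognized text
        result.append({
            "start": seg["start"],
            "end": seg["end"],
            "label": format_timestamp(seg["start"]),
            "japanese": japanese,
            "vietnamese": translate(japanese),
        })
    return result


def transcribe_and_translate(audio_path: str, model_name: str = "small") -> List[Dict]:
    """Wire the helpers to the real libraries (hypothetical glue code)."""
    import whisper  # provided by the openai-whisper package
    from deep_translator import GoogleTranslator  # assumed translator backend

    model = whisper.load_model(model_name)
    output = model.transcribe(audio_path, language="ja")
    translator = GoogleTranslator(source="ja", target="vi")
    return build_segments(output["segments"], translator.translate)
```

Passing the translator in as a plain callable keeps `build_segments` independent of any particular translation service, so it can be checked with a stub function.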
japanese-transcript/
│
├── backend/ # FastAPI backend (Python)
│ └── main.py
│
├── frontend/ # React frontend (TypeScript, Vite)
│ ├── src/
│ ├── public/
│ ├── package.json
│ └── ...
│
└── README.md
- Python 3.8+
- Node.js 18+
- (Recommended) `ffmpeg` installed for Whisper
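
As a quick sanity check before installing, the prerequisites above could be verified with a small script like this one. It is only a sketch: the function name is hypothetical, and it checks that `ffmpeg` and `node` are on the `PATH` without verifying the Node.js version.

```python
import shutil
import sys
from typing import List, Optional, Tuple


def check_prerequisites(
    min_python: Tuple[int, int] = (3, 8),
    tools: Tuple[str, ...] = ("ffmpeg", "node"),
    python_version: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means nothing is missing."""
    # Allow the version to be injected so the check is easy to test.
    current = tuple(python_version or sys.version_info[:2])
    problems = []
    if current < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {current[0]}.{current[1]}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```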
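- Install dependencies (OpenAI's Whisper is published on PyPI as `openai-whisper`; the PyPI package named `whisper` is an unrelated project):
  `pip install fastapi uvicorn openai-whisper deep-translator`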
- (Optional) Run the backend server manually:
  `uvicorn backend.main:app --reload`
  The API will be available at `http://127.0.0.1:8000`.
- Install dependencies:
  `cd frontend && npm install`
- Run both backend and frontend together (recommended):
  `npm run dev`
  This will start both the FastAPI backend and the Vite frontend concurrently. The app will be available at `http://localhost:5173` (default Vite port).
  - The backend runs at `http://127.0.0.1:8000`.
  - The frontend runs at `http://localhost:5173`.
- (Optional) Run only the frontend:
  `npm run preview`
- Start both backend and frontend servers.
- Open the frontend in your browser.
- Click "Load Audio" and select a Japanese `.mp3` or `.wav` file.
- Click "Transcribe" to process the audio.
- View Japanese and Vietnamese segments, and play audio by segment.
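
Playing audio by segment comes down to mapping a playback position to the segment that contains it. The lookup can be sketched as below. It is written in Python for consistency with the backend (the actual frontend is TypeScript), the function name is hypothetical, and it assumes segments are sorted by start time and do not overlap.

```python
import bisect
from typing import Dict, List, Optional


def find_active_segment(segments: List[Dict], position: float) -> Optional[int]:
    """Return the index of the segment containing `position` (seconds), or None.

    Each segment is a dictionary with "start" and "end" keys; a segment
    covers the half-open interval [start, end).
    """
    starts = [seg["start"] for seg in segments]
    # Index of the last segment whose start is <= position.
    index = bisect.bisect_right(starts, position) - 1
    if index < 0:
        return None  # position is before the first segment
    if position >= segments[index]["end"]:
        return None  # position falls in a gap or after the last segment
    return index
```

A binary search keeps the lookup cheap even when it runs on every playback time update for a long recording.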
- CORS: The backend allows all origins by default. For production, restrict `allow_origins` in `backend/main.py`.
- Model: The backend loads the Whisper "small" model by default. You can change this in `main.py`.
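
One way to make both settings configurable without editing code is to read them from environment variables. The sketch below is an assumption about how that might look, not what `main.py` currently does: the variable names `ALLOWED_ORIGINS` and `WHISPER_MODEL`, the helper names, and the fallback to the Vite dev origin are all hypothetical. `CORSMiddleware` and its `allow_origins` parameter are part of FastAPI.

```python
import os
from typing import Dict, List, Mapping, Optional


def parse_allowed_origins(raw: str) -> List[str]:
    """Split a comma-separated origin list, ignoring blanks and whitespace."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict:
    """Read CORS origins and the Whisper model name from the environment."""
    env = os.environ if env is None else env
    # Hypothetical variable names; the defaults are assumptions for this sketch.
    origins = parse_allowed_origins(
        env.get("ALLOWED_ORIGINS", "http://localhost:5173")
    )
    model_name = env.get("WHISPER_MODEL", "small")
    return {"allow_origins": origins, "model": model_name}


def configure_cors(app, origins: List[str]) -> None:
    """Attach FastAPI's CORS middleware with an explicit origin list."""
    from fastapi.middleware.cors import CORSMiddleware  # imported lazily

    app.add_middleware(
        CORSMiddleware,
        allow_origins=origins,
        allow_methods=["*"],
        allow_headers=["*"],
    )
```

With something like this in place, a production deployment could set `ALLOWED_ORIGINS` to the real frontend URL, and the value of `WHISPER_MODEL` could be passed to `whisper.load_model` to trade accuracy against speed and memory.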
Backend:
- fastapi
- uvicorn
- whisper (PyPI package `openai-whisper`)
- deep-translator
Frontend:
- react
- react-dom
- axios
- vite
- tailwindcss
- concurrently
- typescript
- eslint & plugins
MIT License