Audio Transcription API

Features

Bonus Features

Download links for uploaded audio and generated transcript
Support for .flac file format

API Architecture

flowchart TD
    subgraph Client Layer
        Client[Client Application]
    end

    subgraph Backend
        A[HTTP Server]
        db[(PostgreSQL DB)]
        queue["BullMQ - Redis Queue"]
        webhook[Webhook Endpoint /sns-callback]
    end

    subgraph Cloud Services
        s3[(Amazon S3 Bucket)]
        SNS[Amazon SNS]
        transcribe[Amazon Transcribe]
    end

    subgraph Worker Pool
        worker1[Worker 1]
        worker2[Worker 2]
        worker3[Worker 3]
        worker4[Worker 4]
        worker5[Worker 5]
    end

    %% Client flow
    Client --> A
    A --> db
    A --> queue
    A --> s3

    %% Workers
    queue --> worker1
    queue --> worker2
    queue --> worker3
    queue --> worker4
    queue --> worker5

    worker1 --> transcribe
    worker2 --> transcribe
    worker3 --> transcribe
    worker4 --> transcribe
    worker5 --> transcribe

    transcribe --> s3

    %% Notification & callback
    s3 --> SNS
    SNS --> webhook
    webhook --> db
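
The notification path in the diagram (S3 to SNS to the `/sns-callback` webhook) means the webhook receives an SNS envelope whose `Message` field is itself a JSON string holding the S3 event. Below is a minimal TypeScript sketch of unpacking that payload. It assumes SNS raw message delivery is disabled and that the topic is fed by S3 event notifications; `extractS3ObjectKeys` is a hypothetical helper, not a function from this repository.

```typescript
// Fields of the SNS HTTP(S) envelope that this sketch relies on.
interface SnsEnvelope {
  Type: string; // e.g. "SubscriptionConfirmation" or "Notification"
  Message: string; // for S3 events this is a JSON string
  SubscribeURL?: string; // present on subscription confirmation messages
}

// Subset of one record inside an S3 event notification.
interface S3EventRecord {
  s3: { bucket: { name: string }; object: { key: string } };
}

// Hypothetical helper: returns the decoded object keys carried by an
// SNS "Notification"; any other message type yields an empty list.
function extractS3ObjectKeys(rawBody: string): string[] {
  const envelope = JSON.parse(rawBody) as SnsEnvelope;
  if (envelope.Type !== "Notification") return [];

  const event = JSON.parse(envelope.Message) as { Records?: S3EventRecord[] };
  // S3 URL-encodes keys in event payloads and writes spaces as "+".
  return (event.Records ?? []).map((record) =>
    decodeURIComponent(record.s3.object.key.replace(/\+/g, " "))
  );
}
```

A handler built on this would still need to confirm the subscription when `Type` is `SubscriptionConfirmation` and verify the SNS message signature before trusting the payload; how this project maps an object key back to a transcription job is not shown here.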

📚 Documentation

API Reference

🔧 AWS Setup Guide

See docs/aws-setup.md for full instructions on how to:

Configure S3 bucket with policy
Set up IAM user and permissions
Configure SNS topic and connect it to your backend
Use ngrok for local webhook testing

Setup Guide

Prerequisites

Node.js v18+
Redis 7+
PostgreSQL 16+
AWS credentials with access to:
- S3 (read/write)
- Transcribe
- SNS

1. Clone the Repository

git clone https://github.com/vaidik-bajpai/Audio-Transcription-API.git
cd Audio-Transcription-API

2. Install Dependencies

npm install

3. Set Up Environment Variables

Create a .env file in the root directory of your project with the following variables:

# Server Configuration
PORT=8080                         # Port your server will run on

# AWS Credentials
AWS_ACCESS_KEY_ID=your_access_key_id
AWS_SECRET_ACCESS_KEY=your_secret_access_key
AWS_REGION=your_aws_region        # e.g., us-east-1
AWS_S3_BUCKET=your_bucket_name    # e.g., audio-transcription-files

# Database Connection
DATABASE_URL=postgresql://user:password@localhost:5432/transcriptiondb

# File Upload Limit
MAX_FILE_BYTES=10485760           # 10 MB in bytes

# JWT Authentication Secrets
ACCESS_TOKEN_SECRET=your_access_token_secret
REFRESH_TOKEN_SECRET=your_refresh_token_secret
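
A small startup check can catch a missing or malformed variable before the server accepts requests. The sketch below is one way such validation could look, not code from this repository: `loadConfig`, `AppConfig`, and the helper functions are hypothetical names, and only the variable names come from the example above.

```typescript
// The environment is passed in explicitly so the function is easy to test;
// a server could call it as loadConfig(process.env), for example.
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  awsRegion: string;
  s3Bucket: string;
  databaseUrl: string;
  maxFileBytes: number;
}

// Secrets that must be present but are not copied into AppConfig.
const REQUIRED_SECRETS = [
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "ACCESS_TOKEN_SECRET",
  "REFRESH_TOKEN_SECRET",
];

// Returns the trimmed value or throws if it is unset or blank.
function required(env: Env, name: string): string {
  const value = (env[name] ?? "").trim();
  if (value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parses a required variable as a positive integer.
function positiveInt(env: Env, name: string): number {
  const parsed = Number(required(env, name));
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer`);
  }
  return parsed;
}

function loadConfig(env: Env): AppConfig {
  for (const name of REQUIRED_SECRETS) required(env, name);
  return {
    port: positiveInt(env, "PORT"),
    awsRegion: required(env, "AWS_REGION"),
    s3Bucket: required(env, "AWS_S3_BUCKET"),
    databaseUrl: required(env, "DATABASE_URL"),
    maxFileBytes: positiveInt(env, "MAX_FILE_BYTES"), // 10485760 = 10 * 1024 * 1024
  };
}
```

Failing fast here turns a typo such as `MAX_FILE_BYTES=10MB` into a clear error at startup rather than a confusing failure on the first upload.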

4. Set Up the Database (via Docker)

Make sure PostgreSQL and Redis are running in Docker containers. You can spin them up using the following commands:

# PostgreSQL
docker run --name postgres-transcription \
  -e POSTGRES_USER=transcriber \
  -e POSTGRES_PASSWORD=secret123 \
  -e POSTGRES_DB=transcriptiondb \
  -p 5432:5432 \
  -d postgres:16

# Redis
docker run --name redis-transcription \
  -p 6379:6379 \
  -d redis:7
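
The `DATABASE_URL` in your .env file has to match the credentials passed to the PostgreSQL container. As an illustration, the connection string for the `docker run` values above can be assembled as follows; `buildDatabaseUrl` and `PgSettings` are hypothetical names, not part of the repository.

```typescript
interface PgSettings {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Hypothetical helper: percent-encodes the parts that may contain
// reserved characters (for example "@" or ":" in a password).
function buildDatabaseUrl(s: PgSettings): string {
  const user = encodeURIComponent(s.user);
  const password = encodeURIComponent(s.password);
  const database = encodeURIComponent(s.database);
  return `postgresql://${user}:${password}@${s.host}:${s.port}/${database}`;
}

// Values taken from the PostgreSQL docker command above.
const databaseUrl = buildDatabaseUrl({
  user: "transcriber",
  password: "secret123",
  host: "localhost",
  port: 5432,
  database: "transcriptiondb",
});

console.log(databaseUrl);
// postgresql://transcriber:secret123@localhost:5432/transcriptiondb
```

If you change `POSTGRES_USER`, `POSTGRES_PASSWORD`, or `POSTGRES_DB` in the container command, update `DATABASE_URL` to match.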

5. Initialize Prisma

Run the following Prisma commands to set up your database schema and generate the Prisma client:

npx prisma generate
npx prisma db push

6. Start the Development Server

Start your API server using:

npm run dev

Ensure the PostgreSQL container is running before starting the server. If successful, you should see something like:

> audio-transcription-api@1.0.0 dev
> tsx watch src/index.ts

Server running on port 8080

Start your worker processes using:

npm run worker

Ensure the Redis container is running before starting the worker. If successful, you should see something like:

> audio-transcription-api@1.0.0 worker
> tsx src/jobs/worker.ts

worker started

7. Test the API

Use Postman, Hoppscotch, or curl to test the following API endpoints:

| Method | Endpoint | Description |
| ------ | -------- | ----------- |
| POST | `/api/users/signup` | Register a new user |
| POST | `/api/users/login` | Log in and receive access/refresh tokens |
| POST | `/api/users/logout` | Log out and invalidate the refresh token |
| POST | `/api/users/refresh` | Refresh the access token using the refresh token |
| POST | `/api/transcription/upload` | Upload an audio file for transcription |
| GET | `/api/transcription/status/:id` | Check the status of a transcription job |
| GET | `/api/transcription/result/:id` | Retrieve the transcription result |
| GET | `/api/transcription/links/:id` | Retrieve the presigned download URLs |
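
Because transcription runs asynchronously, a client typically uploads a file and then polls the status endpoint until the job finishes. The sketch below shows one way to structure that loop in TypeScript. The status strings `"completed"` and `"failed"` are assumptions, as is everything in the commented wiring example (Bearer token header, a `status` field in the JSON body); check the API reference for the actual values. `waitForTranscription` is a hypothetical helper.

```typescript
interface PollOptions {
  intervalMs: number; // delay between checks
  maxAttempts: number; // give up after this many checks
}

// Hypothetical helper: calls getStatus until it returns a terminal
// status or the attempt budget is exhausted.
async function waitForTranscription(
  getStatus: () => Promise<string>,
  opts: PollOptions,
  terminalStatuses: string[] = ["completed", "failed"] // assumed values
): Promise<string> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const status = await getStatus();
    if (terminalStatuses.includes(status)) return status;
    if (attempt < opts.maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
    }
  }
  throw new Error(`Job still pending after ${opts.maxAttempts} checks`);
}

// Example wiring (assumed request and response shapes, not confirmed here):
// const getStatus = async () => {
//   const res = await fetch(
//     `http://localhost:8080/api/transcription/status/${id}`,
//     { headers: { Authorization: `Bearer ${accessToken}` } }
//   );
//   return (await res.json()).status;
// };
// const finalStatus = await waitForTranscription(getStatus, { intervalMs: 2000, maxAttempts: 30 });
```

Passing `getStatus` in as a function keeps the polling logic independent of the HTTP client, so the same loop works with `fetch`, curl wrappers, or a test double.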
