This project has migrated to https://code.swecha.org/corpus/corpus-backend
A FastAPI-based backend service for managing Telugu corpus collections. It supports text, audio, video, and image submissions, with a PostgreSQL database and JWT authentication.
- Multi-media Support: Handle text, audio, video, and image submissions
- User Management: Many-to-many role-based user system (admin/user/reviewer)
- Authentication: JWT-based authentication and authorization
- OTP Authentication: SMS-based OTP verification for secure phone number authentication
- Category Management: Organize submissions by categories
- Record Review System: Support for content review workflows
- Geolocation & PostGIS: Advanced geographic data handling with spatial queries and indexing
- PostgreSQL Database: Robust database with proper foreign key constraints
- File Storage: Support for local and MinIO/S3 storage
- RESTful API: Full CRUD operations with OpenAPI documentation
corpus-te/
├── app/
│   ├── main.py # FastAPI application
│   ├── core/
│   │   ├── config.py # Settings and configuration
│   │   ├── auth.py # JWT authentication utilities
│   │   ├── exceptions.py # Custom exceptions
│   │   ├── logging_config.py # Logging setup
│   │   └── rbac_fastapi.py # Role-based access control utilities
│   ├── db/
│   │   └── session.py # Database session
│   ├── models/
│   │   ├── __init__.py # Database models
│   │   ├── associations.py # Many-to-many association tables
│   │   ├── user.py # User model
│   │   ├── role.py # Role model
│   │   ├── category.py # Category model
│   │   ├── record.py # Record model
│   │   └── otp.py # OTP model for authentication
│   ├── schemas/
│   │   ├── __init__.py # Pydantic schemas
│   │   ├── geo_schemas.py # Geographic coordinate schemas
│   │   └── otp.py # OTP request/response schemas
│   ├── api/
│   │   ├── __init__.py
│   │   ├── auth.py # Legacy auth endpoints
│   │   └── v1/
│   │       ├── api.py # API router
│   │       ├── __init__.py
│   │       └── endpoints/
│   │           ├── __init__.py
│   │           ├── auth.py # Authentication endpoints (with OTP)
│   │           ├── users.py # User management endpoints
│   │           ├── roles.py # Role management endpoints
│   │           ├── categories.py # Category endpoints
│   │           ├── records.py # Record endpoints
│   │           └── system_rbac.py # RBAC system endpoints
│   ├── services/ # Business logic services
│   │   └── otp_service.py # OTP authentication service
│   └── utils/ # Utility modules
│       ├── __init__.py
│       ├── cleanup_storage.py # Storage cleanup utilities
│       ├── hetzner_storage.py # Hetzner object storage integration
│       ├── postgis_utils.py # PostGIS geographic utilities
│       └── record_file_generator.py # Record file generation utilities
├── alembic/ # Database migrations
│   ├── versions/ # Migration files
│   ├── alembic.ini # Alembic configuration
│   └── env.py # Migration environment
├── docs/ # Documentation and guides
│   ├── demo_rbac_optimization.py # RBAC demo script
│   ├── example_hetzner_storage.py # Hetzner storage examples
│   ├── example_record_file_generator.py # Record generator examples
│   ├── generate_record_files.py # File generation script
│   ├── otp_demo.py # OTP demo script
│   ├── HETZNER_STORAGE_GUIDE.md # Hetzner storage setup guide
│   ├── OTP_AUTHENTICATION_GUIDE.md # OTP authentication guide
│   ├── OTP_IMPLEMENTATION_SUMMARY.md # OTP implementation summary
│   ├── OTP_TESTING_RESULTS.md # OTP testing results
│   ├── Plan.md # Project development plan
│   ├── POSTGIS_INTEGRATION_SUMMARY.md # PostGIS integration summary
│   ├── POSTGRESQL_SETUP.md # PostgreSQL setup guide
│   ├── RBAC_GUIDE.md # Role-based access control guide
│   ├── RBAC_OPTIMIZATION_SUMMARY.md # RBAC optimization summary
│   ├── RECORD_FILE_GENERATOR_GUIDE.md # Record file generator guide
│   └── RECORD_FILE_GENERATOR_COMPLETION_SUMMARY.md # Generator completion summary
├── tests/ # Test files
│   ├── create_test_data.py # Test data creation script
│   ├── test_hetzner_storage.py # Hetzner storage tests
│   ├── test_otp_api.py # OTP API tests
│   ├── test_postgis_api.py # PostGIS API tests
│   ├── test_postgis_integration.py # PostGIS integration tests
│   ├── test_updated_api_endpoints.py # Updated API endpoint tests
│   └── verify_test_data.py # Test data verification
├── logs/ # Application logs
├── main.py # Application entry point
├── setup_postgresql.py # PostgreSQL setup automation script
├── pyproject.toml # Project dependencies
├── uv.lock # UV dependency lock file
├── LICENSE # License file
└── README.md # This file
Complete SMS-based One-Time Password authentication system for secure phone number verification:
- SMS Integration: Real SMS delivery via Ozonetel API
- Security: HMAC-SHA256 OTP hashing with salts and time-based expiry
- Rate Limiting: Built-in protection against spam and abuse
- Phone Validation: International phone number format validation
- JWT Integration: Seamless token generation after successful verification
📖 Detailed OTP Authentication Guide
Advanced spatial data handling with PostGIS for location-based features:
- Spatial Queries: Efficient geographic data operations and indexing
- Location Services: Precise coordinate handling and validation
- Performance Optimization: Specialized indexes for geographic queries
- Data Integrity: Robust validation for coordinate formats and ranges
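As an illustration of the coordinate validation mentioned above, here is a small sketch of a range check plus conversion to WKT text. The `Coordinates` class and its method name are hypothetical and are not taken from `geo_schemas.py` or `postgis_utils.py`.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinates:
    """Hypothetical value object for a WGS 84 point."""
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        # Reject NaN and infinity before range checks.
        if not (math.isfinite(self.latitude) and math.isfinite(self.longitude)):
            raise ValueError("coordinates must be finite numbers")
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be between -90 and 90")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be between -180 and 180")

    def to_wkt(self) -> str:
        # WKT orders a point as x y, i.e. longitude before latitude.
        return f"POINT({self.longitude} {self.latitude})"
```

For example, `Coordinates(17.385, 78.4867).to_wkt()` yields `POINT(78.4867 17.385)`, while a latitude of 95 raises `ValueError`.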
Comprehensive user management system with flexible permissions:
- Multi-Role Support: Admin, User, and Reviewer roles with granular permissions
- Performance Optimized: Efficient user-role queries and caching strategies
- Scalable Architecture: Designed for large-scale user management
📖 RBAC Performance Optimization Guide
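A role check in a many-to-many user/role model reduces to a set intersection. The sketch below shows that idea with plain Python objects; the `User` class and helper names are hypothetical, and the project's own logic lives in `rbac_fastapi.py`.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical in-memory user holding its role names."""
    name: str
    roles: set[str] = field(default_factory=set)


def has_any_role(user: User, *required: str) -> bool:
    """True if the user holds at least one of the required roles."""
    return not user.roles.isdisjoint(required)


def require_roles(user: User, *required: str) -> None:
    """Raise PermissionError unless the user holds one of the roles."""
    if not has_any_role(user, *required):
        raise PermissionError(f"{user.name} needs one of: {', '.join(required)}")
```

A reviewer-only endpoint would call `require_roles(current_user, "admin", "reviewer")` before doing any work.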
- Clone and navigate to the project:
  cd corpus-te
- Create a virtual environment:
  uv venv
  source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
  uv pip install -e .
- Install development dependencies:
  uv pip install -e ".[dev]"
  Alternative: Install all dependencies with uv sync (if using uv.lock):
  uv sync --dev
- Set up PostgreSQL database:
  See POSTGRESQL_SETUP.md for detailed PostgreSQL installation and setup instructions.
- Set up environment variables:
  cp .env.example .env
  # Edit .env with your configuration (see Configuration section below)
- Generate a secure secret key for JWT, either with Bash:
  openssl rand -hex 32
  or with Python:
  import secrets
  print(secrets.token_urlsafe(32))
  Update APP_SECRET_KEY in your .env file with the generated key. Example:
  APP_SECRET_KEY="your-generated-secret-key"
- Set up PostgreSQL database with automated script (Recommended):
  Use the provided setup script to automatically create the database and run initial setup:
  # Run complete database setup (recommended for first-time setup)
  python setup_postgresql.py --all
  Or run individual steps:
  # Test PostgreSQL connection
  python setup_postgresql.py --test-connection
  # Create database if it doesn't exist
  python setup_postgresql.py --create-db
  # Run database migrations
  python setup_postgresql.py --migrate
  # Seed initial data (roles)
  python setup_postgresql.py --seed
  Alternative: Manual database setup:
  If you prefer manual setup, see POSTGRESQL_SETUP.md for detailed PostgreSQL installation and setup instructions, then run:
  alembic upgrade head
- Run the application:
  python main.py
- Access the API documentation:
  - Swagger UI: http://localhost:8000/docs
  - ReDoc: http://localhost:8000/redoc
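# Add a new dependency
uv pip install package-name
# Install development dependencies
uv pip install -e ".[dev]"
# Run the development server with auto-reload
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
The setup_postgresql.py script provides automated database setup and testing functionality: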
# Show current database configuration
python setup_postgresql.py
# Run complete setup (creates DB, runs migrations, seeds data)
python setup_postgresql.py --all
# Individual operations:
python setup_postgresql.py --test-connection # Test PostgreSQL server connection
python setup_postgresql.py --create-db # Create database if missing
python setup_postgresql.py --migrate # Run Alembic migrations
python setup_postgresql.py --seed # Seed initial roles data
What the script does:
- Connection Testing: Verifies PostgreSQL server accessibility
- Database Creation: Creates the target database if it doesn't exist
- Migration Execution: Runs all pending Alembic migrations
- Data Seeding: Creates initial roles (admin, user, reviewer)
- Error Handling: Provides clear feedback on setup status
Prerequisites:
- PostgreSQL server running and accessible
- Correct database credentials in .env file
- psycopg2-binary installed (included in project dependencies)
Before running the migrations, check POSTGRESQL_SETUP.md for PostgreSQL setup.
# Create new migration
alembic revision --autogenerate -m "description"
# Apply migrations
alembic upgrade head
# Check migration status
alembic current
# Downgrade to previous migration
alembic downgrade -1
# Run tests
pytest
# Format code
uv pip install black isort
black .
isort .
# Update a dependency
uv pip install --upgrade package-name
Key environment variables in .env file:
- DATABASE_URL: PostgreSQL connection string
- DB_HOST: PostgreSQL host (default: localhost)
- DB_PORT: PostgreSQL port (default: 5432)
- DB_NAME: Database name
- DB_USER: PostgreSQL username
- DB_PASSWORD: PostgreSQL password
- PROJECT_NAME: Application name
- LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR)
- APP_SECRET_KEY: JWT secret key (change in production)
- ALGORITHM: JWT algorithm (default: HS256)
- ACCESS_TOKEN_EXPIRE_MINUTES: Token expiration time (default: 30)
- BACKEND_CORS_ORIGINS: Comma-separated list of allowed origins
- HZ_OBJ_ACCESS_KEY: Object storage access key
- HZ_OBJ_SECRET_KEY: Object storage secret key
- HZ_OBJ_API_TOKEN: Object storage API token
- HZ_OBJ_ENDPOINT: Object storage endpoint
- HZ_OBJ_BUCKET_NAME: Object storage bucket name
- MAX_FILE_SIZE: Maximum file size in bytes (default: 104857600 = 100 MB)
Example .env file:
# Database Configuration
DB_HOST=localhost
DB_PORT=5432
DB_NAME=corpus_te
DB_USER=postgres
DB_PASSWORD=your_password_here
DATABASE_URL="postgresql://postgres:your_password_here@localhost:5432/corpus_te"
# Application Settings
PROJECT_NAME="Telugu Corpus Collections API"
LOG_LEVEL="WARNING"
# JWT Configuration
APP_SECRET_KEY="your-secret-key-here"
ALGORITHM="HS256"
ACCESS_TOKEN_EXPIRE_MINUTES=30
# CORS Origins
BACKEND_CORS_ORIGINS="http://localhost:3000,http://localhost:8080"
# File Upload Settings
MAX_FILE_SIZE=104857600
- GET /: Welcome message
- GET /health: Health check
- GET /docs: Swagger UI API documentation
- GET /redoc: ReDoc API documentation
- POST /api/v1/auth/login: User login (returns JWT token)
- POST /api/v1/auth/register: User registration
- GET /api/v1/users/: List all users (with pagination)
- POST /api/v1/users/: Create a new user
- GET /api/v1/users/{user_id}: Get user by ID
- PUT /api/v1/users/{user_id}: Update user
- DELETE /api/v1/users/{user_id}: Delete user
- GET /api/v1/users/{user_id}/with-roles: Get user with roles populated
- GET /api/v1/users/phone/{phone}: Get user by phone number
- GET /api/v1/users/{user_id}/roles: Get user's roles
- POST /api/v1/users/{user_id}/roles: Assign roles to user (replace all)
- PUT /api/v1/users/{user_id}/roles/add: Add a role to user
- DELETE /api/v1/users/{user_id}/roles/{role_id}: Remove role from user
- GET /api/v1/roles/: List all roles
- POST /api/v1/roles/: Create a new role
- GET /api/v1/roles/{role_id}: Get role by ID
- PUT /api/v1/roles/{role_id}: Update role
- DELETE /api/v1/roles/{role_id}: Delete role
- GET /api/v1/categories/: List all categories
- POST /api/v1/categories/: Create a new category
- GET /api/v1/categories/{category_id}: Get category by ID
- PUT /api/v1/categories/{category_id}: Update category
- DELETE /api/v1/categories/{category_id}: Delete category
- GET /api/v1/records/: List all records
- POST /api/v1/records/: Create a new record
- GET /api/v1/records/{record_id}: Get record by ID
- PUT /api/v1/records/{record_id}: Update record
- DELETE /api/v1/records/{record_id}: Delete record
This application uses PostgreSQL as the primary database. For detailed setup instructions, see POSTGRESQL_SETUP.md.
- Install PostgreSQL on your system
- Create a database and user:
  CREATE DATABASE corpus_te;
  CREATE USER corpus_user WITH PASSWORD 'your_password';
  GRANT ALL PRIVILEGES ON DATABASE corpus_te TO corpus_user;
- Update your .env file with the database credentials
- Run migrations:
  alembic upgrade head
The application includes:
- Users: User accounts with many-to-many role relationships
- Roles: System roles (admin, user, reviewer)
- Categories: Content organization categories
- Records: Media submissions with metadata
- User-Role Association: Many-to-many relationship table
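The many-to-many relationship between users and roles can be sketched with an association table. The example below uses an in-memory SQLite database as a stand-in for PostgreSQL so it runs anywhere; the table and column names are hypothetical and do not come from the project's models or migrations.

```python
import sqlite3

# Hypothetical schema: one association row per (user, role) pair.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT UNIQUE NOT NULL, name TEXT NOT NULL);
CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE user_roles (
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    role_id INTEGER NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
    PRIMARY KEY (user_id, role_id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO roles (id, name) VALUES (?, ?)",
                     [(1, "admin"), (2, "user"), (3, "reviewer")])
    conn.execute("INSERT INTO users (id, phone, name) VALUES (1, '1234567890', 'John')")
    conn.executemany("INSERT INTO user_roles (user_id, role_id) VALUES (?, ?)",
                     [(1, 1), (1, 3)])
    conn.commit()
    return conn


def roles_for_user(conn: sqlite3.Connection, user_id: int) -> list[str]:
    """Join through the association table to list a user's role names."""
    rows = conn.execute(
        "SELECT r.name FROM roles r "
        "JOIN user_roles ur ON ur.role_id = r.id "
        "WHERE ur.user_id = ? ORDER BY r.name",
        (user_id,),
    )
    return [name for (name,) in rows]
```

The composite primary key prevents assigning the same role twice, and the foreign keys reject rows that point at a missing user or role.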
The API uses JWT (JSON Web Tokens) for authentication:
- Register/Login: Use /api/v1/auth/register or /api/v1/auth/login
- Get Token: Login returns an access token
- Use Token: Include in Authorization header: Bearer <token>
- Protected Endpoints: Most endpoints require authentication
Example authentication flow:
# Register a new user
# role_ids 1 for admin
curl -X 'POST' \
'http://localhost:8000/api/v1/users/' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"phone": "1234567890",
"name": "John",
"email": "john@example.com",
"gender": "male",
"date_of_birth": "2025-07-01",
"place": "Telangana",
"password": "password",
"role_ids": [
1
],
"has_given_consent": true
}'
# Login to get token
curl -X POST "http://localhost:8000/api/v1/auth/login" \
-H "Content-Type: application/json" \
-d '{"phone": "1234567890", "password": "password"}'
# Use token in requests
curl -X GET "http://localhost:8000/api/v1/users/" \
-H "Authorization: Bearer <your-token-here>"
Database Connection Fails:
# Check if PostgreSQL is running
sudo systemctl status postgresql
# Test connection manually
psql -h localhost -U postgres -d postgres
Setup Script Issues:
# Check database configuration
python setup_postgresql.py
# Run with verbose output
python setup_postgresql.py --test-connection
Common Error Solutions:
- "PostgreSQL server connection failed"
  - Ensure PostgreSQL is installed and running
  - Check credentials in .env file
  - Verify host and port settings
- "Database connection failed"
  - Run python setup_postgresql.py --create-db first
  - Check if database name matches .env configuration
- "Migration failed"
  - Ensure database exists and is accessible
  - Check for conflicting migrations with alembic current
  - Reset migrations if needed: alembic downgrade base
- "Permission denied"
  - Ensure PostgreSQL user has CREATE DATABASE privileges
  - Check PostgreSQL authentication settings in pg_hba.conf
- "password authentication failed"
  - Verify your database password
  - On Unix-based systems such as Linux, switch peer authentication to scram-sha-256 and restart PostgreSQL (adjust the version in the path to match your installation):
    sudo sed -i 's/peer/scram-sha-256/g' /etc/postgresql/17/main/pg_hba.conf
    sudo systemctl restart postgresql
Database Reset (Development Only):
# Drop and recreate database
psql -h localhost -U postgres -c "DROP DATABASE IF EXISTS corpus_te;"
python setup_postgresql.py --all
Import Errors:
- Ensure virtual environment is activated
- Install dependencies:
uv sync --dev
Port Already in Use:
# Find process using port 8000
lsof -i :8000
# Kill process
kill -9 <PID>
For more detailed troubleshooting, see POSTGRESQL_SETUP.md.
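Where `lsof` is not available, the port check from the "Port Already in Use" step can be approximated in Python. This is a small sketch; the helper name `port_in_use` is hypothetical and is not part of the project.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8000 in use:", port_in_use(8000))
```

Unlike `lsof`, this only reports whether the port answers; it does not identify the process that holds it.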
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
This project is licensed under the terms specified in the LICENSE file.