govtechmy/hansards-back

Hansards Backend

Digital Parliament Hansard Backend for Government of Malaysia

Overview

This is the backend system for the Digital Parliament Hansard project, providing APIs for searching, analyzing, and summarizing parliamentary proceedings in Malaysia.

Features

  • Parliamentary Data Management: Store and manage hansard data from Dewan Rakyat, Dewan Negara, and Kamar Khas
  • Advanced Search: Full-text search across parliamentary speeches with filtering options
  • AI Summarization: Automatic generation of bilingual summaries (English & Bahasa Malaysia) using OpenAI
  • Analytics: Attendance tracking, speaker analysis, and parliamentary statistics
  • RESTful APIs: Comprehensive API endpoints for frontend integration

Quick Start

Prerequisites

  • Python 3.11+
  • PostgreSQL
  • OpenAI API key (for AI summarization)

Installation

  1. Clone the repository

    git clone <repository-url>
    cd hansards-back
  2. Set up virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    make init
  4. Configure environment

    cp env.example .env
    # Edit .env with your configuration
  5. Run database migrations

    python src/manage.py migrate
  6. Start development server

    python src/manage.py runserver

AI Summarization

The system includes AI-powered summarization of parliamentary sessions:

  • Bilingual Support: Summaries in English and Bahasa Malaysia
  • Cost Optimization: Generated summaries are stored in the database, so repeated requests do not trigger new OpenAI API calls
  • Async Processing: Non-blocking summary generation
  • Status Tracking: Monitor generation progress
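
The cost-optimization idea above (store once, reuse afterwards) can be sketched roughly as follows. This is a minimal illustration and not the project's actual implementation: the in-memory dictionary stands in for the database table, and `get_or_generate`, the key layout, and the `generate` callback are all hypothetical names.

```python
from typing import Callable

# Stand-in for the database table (assumption); keys are (house, date, language).
_summary_store: dict[tuple[str, str, str], str] = {}


def get_or_generate(
    house: str,
    date: str,
    language: str,
    generate: Callable[[str, str, str], str],
) -> str:
    """Return a stored summary, calling `generate` only on a cache miss."""
    key = (house, date, language)
    if key not in _summary_store:
        # Only this branch would reach the paid API in a real system.
        _summary_store[key] = generate(house, date, language)
    return _summary_store[key]
```

Calling `get_or_generate` twice with the same house, date, and language invokes `generate` once; a different language creates a separate entry, which fits the bilingual summaries described above.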

Set Up AI Summarization

  1. Add OpenAI API key to environment

    OPENAI_API_KEY=your-openai-api-key-here
  2. Generate summaries

    # Generate for specific sitting
    python src/manage.py generate_summaries --house dewan-rakyat --date 2024-01-15
    
    # Bulk generation
    python src/manage.py generate_summaries --house dewan-rakyat --term 15 --limit 50

For detailed documentation, see AI_SUMMARIZATION_README.md.

API Documentation

Core Endpoints

  • GET /api/catalogue/ - List parliamentary sittings
  • GET /api/search/ - Search speeches
  • GET /api/sitting/ - Get specific sitting data
  • GET /api/attendance/ - Attendance statistics

AI Summarization Endpoints

  • GET /api/summary/ - Retrieve existing summary
  • POST /api/summary/ - Generate new summary
  • GET /api/summary/status/ - Check generation status
  • POST /api/summary/bulk/ - Bulk summary generation
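
Because generation is asynchronous, a client would typically start a summary and then poll the status endpoint. The sketch below shows one way to write that loop. It is an assumption-heavy illustration: the status strings `completed` and `failed` are guesses, and `fetch_status` is a hypothetical callable that would wrap a `GET /api/summary/status/` request.

```python
import time
from typing import Callable


def wait_for_summary(
    fetch_status: Callable[[], str],
    interval: float = 2.0,
    max_attempts: int = 30,
) -> str:
    """Poll until the status is terminal or the attempts run out."""
    terminal = {"completed", "failed"}  # assumed status values
    status = "unknown"
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in terminal:
            return status
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval)
    return status
```

Injecting `fetch_status` keeps the loop independent of any particular HTTP library and makes it easy to exercise with a fake status source.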

Development

Running Tests

python src/manage.py test

Code Quality

# Format code
black src/

# Sort imports
isort src/

# Lint code
flake8 src/

Database Management

# Create migration
python src/manage.py makemigrations

# Apply migrations
python src/manage.py migrate

# Delete all data (tables and migration history are kept)
python src/manage.py flush

Deployment

Docker

# Build image
make build

# Run with Docker Compose
docker-compose up -d

Environment Variables

Required environment variables:

  • DATABASE_URL - PostgreSQL connection string
  • SECRET_KEY - Django secret key
  • ALLOWED_HOSTS - Comma-separated list of allowed hosts
  • CSRF_TRUSTED_ORIGINS - Comma-separated list of trusted origins
  • OPENAI_API_KEY - OpenAI API key (for AI summarization)
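
A small pre-flight check can catch missing configuration before the server starts. The sketch below is illustrative only and is not part of the project: `missing_vars` and `split_csv` are hypothetical helpers, and how the project's settings module actually reads these variables is not shown here.

```python
import os
from typing import Mapping

REQUIRED_VARS = (
    "DATABASE_URL",
    "SECRET_KEY",
    "ALLOWED_HOSTS",
    "CSRF_TRUSTED_ORIGINS",
    "OPENAI_API_KEY",
)


def missing_vars(env: Mapping[str, str]) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def split_csv(value: str) -> list[str]:
    """Split a comma-separated value, dropping blanks and stray whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


if __name__ == "__main__":
    print("Missing:", missing_vars(os.environ))
```

`split_csv` shows how comma-separated values such as `ALLOWED_HOSTS` and `CSRF_TRUSTED_ORIGINS` could be turned into lists.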

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Support

For support and questions, contact: support@mydigital.gov.my
