This repository provides a modular backend that converts natural language into SQL and executes it on an SQLite database (school.db
).
The backend is built with FastAPI and uses Groq’s OpenAI-compatible API to generate SQL from English questions. A separate modern UI will be added by the frontend team.
- Features
- Technologies Used
- Getting Started
- Example Queries
- Database Schema
- Architecture
- Demo
- API Endpoints (v1)
- Testing & Coverage
- Roadmap
- Contributing
- Convert natural language questions into SQL queries (Groq).
- Execute generated SQL queries against the SQLite database.
- Clean SQL responses (strip markdown, enforce semicolon).
- Modern FastAPI backend with auto docs at
/docs
. - Full test suite with coverage and temporary DB isolation.
- FastAPI for the backend API
- Groq for NL→SQL (OpenAI-compatible API)
- SQLite for the demo database
- Pydantic for validation
- Pytest for testing and coverage
Backend is located in backend/
. You can use Conda for a reproducible setup.
- Create environment (Option A: from file)
conda env create -f backend/environment.yml
conda activate text2sql-backend
Or Option B (manual):
conda create -n text2sql-backend python=3.11 -y
conda activate text2sql-backend
pip install -e backend[dev]
- Configure environment variables
cp backend/.env.example backend/.env
# edit backend/.env and set GROQ_API_KEY and (optionally) GROQ_MODEL
2.5) Seed the database (creates/refreshes school.db
)
python backend/scripts/seed_db.py
- Run the API (from repo root)
uvicorn app.main:app --reload --port 8000 --app-dir backend
Open API docs at: http://127.0.0.1:8000/docs
Here are some example questions you can ask:
- "How many entries of records are present?"
- "Tell me all the students studying in Data Science class?"
- "What is the average marks of students?"
The database school.db has the following schema:
Column | Type | Description |
---|---|---|
NAME | VARCHAR(25) | Name of the student |
CLASS | VARCHAR(25) | Class of the student |
SECTION | VARCHAR(25) | Section of the student |
MARKS | INT | Marks obtained by the student |
┌──────────────────────────────────────────────────────────────┐
│ Client (future SPA) │
└──────────────▲───────────────────────────────────────▲───────┘
│ │
│ HTTP (JSON) │
│ │
┌──────┴───────────────────────────────────────┴──────┐
│ FastAPI Backend │
│ `backend/app/main.py` │
├───────────────┬───────────────────────────┬──────────┤
│ API (v1) │ Services │ Utils │
│ routes.py │ - nl2sql.py (Groq) │ - sql_cleaner.py
│ │ - db.py (SQLite ops) │ │
└───────────────┴───────────────┬───────────┴──────────┘
│
│ SQL
▼
SQLite: school.db
- Start server from repo root:
uvicorn app.main:app --reload --port 8000 --app-dir backend
- Run the demo script:
python backend/api_demo.py
- GET
/api/v1/health
→ health check - GET
/api/v1/students
→ list all students - POST
/api/v1/sql
→ execute provided SQL - POST
/api/v1/nl2sql
→ convert NL to SQL using Groq
Example curl (after starting the server):
curl http://127.0.0.1:8000/api/v1/health
curl http://127.0.0.1:8000/api/v1/students
curl -X POST http://127.0.0.1:8000/api/v1/sql \
-H 'Content-Type: application/json' \
-d '{"sql":"SELECT COUNT(*) FROM STUDENT;"}'
curl -X POST http://127.0.0.1:8000/api/v1/nl2sql \
-H 'Content-Type: application/json' \
-d '{"question":"How many entries of records are present?"}'
cd backend
pytest -q --cov=app --cov-report=term-missing
Tests use a temporary SQLite DB and do not touch your real school.db
.
- Frontend UI integration (modern SPA)
- AuthN/Z and rate limiting
- Schema introspection and multi-table support
- Safer SQL generation and validation guardrails
Contributions are welcome! Please open an issue or submit a pull request if you'd like to contribute.
- Fork the repository.
- Create your feature branch (
git checkout -b feature/AmazingFeature
). - Commit your changes (
git commit -m 'Add some AmazingFeature'
). - Push to the branch (
git push origin feature/AmazingFeature
). - Open a pull request.
This project is licensed under the MIT License.