Text Analysis API
A small FastAPI project that exposes text analysis utilities through a REST API. It takes plain text (or a .txt file) and returns useful statistics such as word counts, hapax legomena, sentence length, and lexical diversity.
Project Structure:
backend/
├─ api/
│  └─ main.py
├─ app/
│  ├─ io_utils.py
│  └─ text_utils.py
└─ tests/
Features:
Upload a text file and receive structured JSON analysis.
Metrics include:
Most frequent words (top_n)
Hapax legomena (words that appear only once)
Sentence count
Average sentence length
Lexical diversity
Unique words and total tokens
Interactive documentation with Swagger UI and ReDoc, auto-generated by FastAPI (served at /docs and /redoc by default).
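The metrics listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function name analyze_text, the regex-based tokenizer, the sentence splitter, and the result keys are all assumptions, and lexical diversity is taken here to mean the type-token ratio (unique words divided by total tokens).

```python
import re
from collections import Counter


def analyze_text(text: str, top_n: int = 5) -> dict:
    """Compute basic text statistics (hypothetical helper, for illustration only)."""
    # Assumption: a token is a lowercase run of letters, digits, or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    counts = Counter(tokens)
    total_tokens = len(tokens)
    unique_words = len(counts)

    return {
        "total_tokens": total_tokens,
        "unique_words": unique_words,
        # The top_n most frequent words as (word, count) pairs.
        "most_frequent": counts.most_common(top_n),
        # Hapax legomena: words that occur exactly once.
        "hapax_legomena": sorted(w for w, c in counts.items() if c == 1),
        "sentence_count": len(sentences),
        # Assumption: sentence length is measured in tokens per sentence.
        "average_sentence_length": total_tokens / len(sentences) if sentences else 0.0,
        # Assumption: lexical diversity is the type-token ratio.
        "lexical_diversity": unique_words / total_tokens if total_tokens else 0.0,
    }


result = analyze_text("The cat sat. The dog sat too!", top_n=2)
print(result)
```

The empty-input guards avoid a division by zero when the text contains no tokens or no sentences.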
1 Create a virtual environment:
python -m venv .venv
Then activate it:
.venv\Scripts\Activate.ps1 (Windows PowerShell)
source .venv/Scripts/activate (Windows Bash)
source .venv/bin/activate (Linux, macOS)
2 Install dependencies
pip install -r requirements.txt
3 Start the development server from the backend/ directory:
fastapi dev api/main.py
Tests:
You can run unit tests with:
pytest
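As a rough idea of what a test in tests/ might look like, here is a small pytest-style sketch. The lexical_diversity function is a hypothetical stand-in defined inline so the example is self-contained; it is not the project's real function, and an actual test would import from app/ instead.

```python
# Hypothetical stand-in for a function that would normally be imported from app/.
def lexical_diversity(tokens: list[str]) -> float:
    """Return unique tokens divided by total tokens, or 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# pytest collects functions whose names start with "test_".
def test_lexical_diversity_counts_unique_tokens():
    # Three unique tokens out of four in total.
    assert lexical_diversity(["a", "b", "a", "c"]) == 0.75


def test_lexical_diversity_of_empty_input_is_zero():
    # Empty input must not raise ZeroDivisionError.
    assert lexical_diversity([]) == 0.0
```

Saved in a file whose name starts with test_, these functions are discovered and run by the pytest command above.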
Notes:
Built with FastAPI, Pydantic, and pandas.
Intended as a learning project for experimenting with backend development and text processing.
Modular layout: the analysis logic lives in app/ and the API layer in api/.
Guillermo Marin