
A small backend project where I’m learning how to build and expose a text analysis service with FastAPI. It takes plain text files and returns simple metrics like word frequencies, sentence length, and lexical diversity. This is mainly a learning exercise — the code isn’t perfect, but I’m trying to keep it clear, structured, and easy to extend.


GMarindev/text-analysis-api


Text Analysis API

A small FastAPI project that exposes text analysis utilities through a REST API. It takes plain text (or a .txt file) and returns useful statistics such as word counts, hapax legomena, sentence length, and lexical diversity.

Project Structure:

backend/
├─ api/
│  └─ main.py
├─ app/
│  ├─ io_utils.py
│  └─ text_utils.py
└─ tests/

Features:

Upload a text file and receive structured JSON analysis.

Metrics include:

Most frequent words (top_n)

Hapax legomena (words that appear only once)

Sentence count

Average sentence length

Lexical diversity

Unique words and total tokens

Interactive documentation with Swagger UI and ReDoc (auto-generated by FastAPI).
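
As a rough illustration of the metrics listed above, here is a minimal standard-library sketch. It is not the code in app/text_utils.py: the function name analyze_text, the regex-based tokenizer, and the naive sentence splitter are all assumptions made for this example, and the real project may tokenize differently or use pandas.

```python
import re
from collections import Counter


def analyze_text(text: str, top_n: int = 5) -> dict:
    """Compute simple text metrics (hypothetical helper, illustrative only)."""
    # Lowercase the text and treat runs of letters, digits, or apostrophes as tokens.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)

    # Naive sentence split on runs of ., ! or ? plus trailing whitespace;
    # abbreviations such as "e.g." would be split incorrectly.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]

    total = len(tokens)
    unique = len(counts)
    return {
        "total_tokens": total,
        "unique_words": unique,
        # Lexical diversity as the type/token ratio: unique words / total tokens.
        "lexical_diversity": unique / total if total else 0.0,
        "most_frequent": counts.most_common(top_n),
        # Hapax legomena: words that occur exactly once.
        "hapax_legomena": sorted(w for w, c in counts.items() if c == 1),
        "sentence_count": len(sentences),
        # Average sentence length measured in tokens per sentence.
        "avg_sentence_length": total / len(sentences) if sentences else 0.0,
    }


if __name__ == "__main__":
    print(analyze_text("The cat sat. The dog sat too!", top_n=2))
```

For the sample sentence pair, this sketch counts 7 tokens, 5 unique words, and 2 sentences, so the lexical diversity is 5/7 and the average sentence length is 3.5 tokens.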

1 Create a virtual environment:

python -m venv .venv

Then activate it with the command for your shell:

Windows PowerShell: .venv\Scripts\Activate.ps1

Windows Bash: source .venv/Scripts/activate

Linux, macOS: source .venv/bin/activate

2 Install the dependencies:

pip install -r requirements.txt

3 Start the development server from the backend/ directory:

fastapi dev api/main.py

Tests:

You can run unit tests with:

  pytest
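
For readers new to pytest, a test file might look roughly like the sketch below. The lexical_diversity function is a hypothetical stand-in defined inline so the example runs on its own; it is not taken from the project's tests/ directory or from app/text_utils.py.

```python
def lexical_diversity(tokens: list[str]) -> float:
    """Hypothetical helper: ratio of unique tokens to total tokens."""
    # Return 0.0 for empty input to avoid dividing by zero.
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def test_lexical_diversity_counts_unique_ratio():
    # Three unique tokens out of four gives 0.75.
    assert lexical_diversity(["a", "b", "a", "c"]) == 0.75


def test_lexical_diversity_handles_empty_input():
    assert lexical_diversity([]) == 0.0
```

pytest collects functions whose names start with test_ from files named test_*.py or *_test.py, so saving this under tests/ with such a name would let the pytest command above pick it up.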

Notes:

Built with FastAPI, pydantic, and pandas.

Intended as a learning project for experimenting with backend development and text processing.

Clean and modular: logic is in app/, API layer is in api/.

Guillermo Marin
