This application is an internal decision-support web product for a flooring contractor who reviews construction bid documents. It centralizes project PDFs, extracts flooring and epoxy scope details, highlights risk signals, and produces a structured recommendation (BID, REVIEW, or PASS).
The business problem is practical: bid packages are large, inconsistent, and time-sensitive. Important scope details can be buried across multiple plans and specifications. This tool reduces manual scanning time and makes decisions more consistent by turning unstructured documents into a traceable, review-ready output.
Why it matters for this domain:
- Flooring and epoxy scope often appears across scattered sections.
- Missed exclusions or ambiguous language can create margin risk.
- Fast, evidence-backed summaries improve bid/no-bid judgment.
In scope:
- Authenticated single-admin access.
- Project creation and editing with bid metadata.
- Upload of 1–10 project PDFs.
- Two-pass extraction focused on flooring/epoxy scope.
- Risk flag detection and recommendation output.
- Downloadable project summary PDF.
- Searchable/filterable project library.
- Admin deletion and basic usage metrics.
Out of scope:
- Multi-user roles or team workflows.
- External platform scraping/automation.
- Quantity takeoff, geometry, or plan measurement.
- Mobile application support.
The product sits between document intake and final bid strategy: ingest documents, synthesize scope/risk, then support a go/no-go decision with references.
- Project: one bid opportunity with metadata (type, source, date, notes) and attached documents.
- Construction documents (PDFs): plans/specs/addenda uploaded for analysis.
- Scope extraction: conversion of document text into structured, decision-ready fields.
- Flooring / epoxy scope: explicit requirements, inclusions, exclusions, materials, prep/coating expectations.
- Risk flags: warnings about ambiguity, exclusions, assumptions, or potential commercial risk.
- Recommendation (BID/REVIEW/PASS):
  - BID: generally clear and actionable scope.
  - REVIEW: potentially viable but requires manual clarification.
  - PASS: insufficient fit or low-confidence scope value.
- References: evidence objects attached to extracted items (file, page, excerpt) for verification.
- Two-pass extraction:
- Pass 1 narrows relevant pages.
- Pass 2 performs detailed structured extraction only on shortlisted pages.
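The two-pass flow above can be sketched in a few lines. This is an illustrative stand-in, not the project's actual API: `find_relevant_pages` and `extract_structured_scope` are hypothetical names, and the keyword screen stands in for the real pass-1 relevance model.

```python
from dataclasses import dataclass, field

# Hypothetical keyword screen standing in for the real pass-1 model call.
FLOORING_TERMS = ("flooring", "epoxy", "coating", "resinous")

@dataclass
class ExtractionResult:
    pages: list = field(default_factory=list)   # shortlisted page numbers
    items: list = field(default_factory=list)   # structured scope items

def find_relevant_pages(pages: dict) -> list:
    """Pass 1: cheap screen that shortlists likely flooring/epoxy pages."""
    return [n for n, text in pages.items()
            if any(term in text.lower() for term in FLOORING_TERMS)]

def extract_structured_scope(pages: dict, shortlist: list) -> ExtractionResult:
    """Pass 2: detailed extraction runs only on the shortlisted pages."""
    result = ExtractionResult(pages=shortlist)
    for n in shortlist:
        # A real implementation would call the model with a structured-output
        # schema here; this sketch just records the page as evidence.
        result.items.append({"page": n, "excerpt": pages[n][:80]})
    return result

doc = {1: "General conditions...",
       2: "Section 09 67 23: resinous EPOXY flooring",
       3: "Plumbing fixtures"}
shortlist = find_relevant_pages(doc)
result = extract_structured_scope(doc, shortlist)
```

Keeping pass 2 restricted to the shortlist is what bounds token cost on large bid packages.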
- Authentication: Admin signs in to access protected workflows.
- Project creation: A new bid project is created with context data.
- PDF upload: Relevant bid documents are attached to the project.
- AI extraction: System runs two-pass analysis and saves run status/results.
- Review output: User inspects extracted scope, risk flags, recommendation, and references.
- Decision support: User uses evidence-backed output to decide bid posture.
- Report generation: A structured PDF summary is generated and downloaded.
- Project library reuse: Past projects are searched/filtered for comparison and historical recall.
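The decision-support step can be illustrated with a toy mapping from extraction signals to a bid posture. The signal names and thresholds below are assumptions for illustration only, not the product's actual rules.

```python
def recommend(clear_scope: bool, open_risk_flags: int, confidence: float) -> str:
    """Map extraction signals to BID/REVIEW/PASS.

    All thresholds are illustrative assumptions, not production logic.
    """
    # Low confidence, or unclear scope with middling confidence -> PASS.
    if confidence < 0.4 or (not clear_scope and confidence < 0.6):
        return "PASS"
    # Anything unresolved needs a human look before bidding.
    if open_risk_flags > 0 or not clear_scope:
        return "REVIEW"
    return "BID"
```

The key property is that the function is a suggestion generator; the human reviewer still owns the final bid decision.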
Projects as durable entities:
- Why: bidding requires stable project context, not loose files.
- Approach: model each bid as a durable entity with standardized fields.
- Principle: metadata-first organization improves retrieval and reporting.
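Modeling each bid as a durable entity with standardized fields might look like the following sketch; the field names are illustrative assumptions, since the real schema lives in the backend models.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Project:
    """One bid opportunity with standardized metadata.

    Field names are illustrative, not the production schema.
    """
    name: str
    project_type: str          # e.g. "commercial", "industrial"
    source: str                # where the bid opportunity came from
    bid_date: date
    notes: str = ""
    document_ids: list = field(default_factory=list)  # attached PDFs

p = Project("Warehouse 12", "industrial", "GC invite", date(2025, 3, 1))
```

Standardized fields like these are what make the library's status/type/source/date filters possible later.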
Controlled document intake:
- Why: the uploaded documents are the source of truth.
- Approach: enforce file type/count/size constraints and project association.
- Principle: controlled intake prevents invalid analysis and runtime drift.
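The intake constraints described above can be sketched as a small validator. The size limit here is an assumed value for illustration; the file-count range follows the 1–10 PDF rule stated earlier.

```python
MAX_FILES = 10
MAX_SIZE_MB = 25  # assumed limit for illustration; the real cap may differ

def validate_upload(filenames: list, sizes_mb: list) -> list:
    """Return human-readable intake errors; an empty list means accepted."""
    errors = []
    if not 1 <= len(filenames) <= MAX_FILES:
        errors.append(f"expected 1-{MAX_FILES} files, got {len(filenames)}")
    for name, size in zip(filenames, sizes_mb):
        if not name.lower().endswith(".pdf"):
            errors.append(f"{name}: only PDF files are accepted")
        if size > MAX_SIZE_MB:
            errors.append(f"{name}: exceeds {MAX_SIZE_MB} MB limit")
    return errors
```

Rejecting bad uploads at the boundary is what keeps the extraction pipeline from ever seeing invalid input.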
Two-pass extraction:
- Why: reduce noise and computational cost while improving relevance.
- Approach: identify likely pages first, then extract structured details.
- Principle: scoped context yields more reliable outputs than whole-document prompting.
Evidence references:
- Why: business users must trust and verify outputs quickly.
- Approach: attach file/page/excerpt evidence to extracted elements.
- Principle: traceability is mandatory for operational trust.
Deterministic report generation:
- Why: decisions and handoffs need a shareable artifact.
- Approach: deterministic summary layout with scope, risk, recommendation, references.
- Principle: consistency improves communication quality.
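A deterministic layout means the summary sections always appear in the same order regardless of input. A minimal sketch, with section names assumed from the scope/risk/recommendation/references structure described above:

```python
# Fixed section order for the summary PDF. These names are assumed for
# illustration and may not match the production template exactly.
REPORT_SECTIONS = ("Project Overview", "Flooring/Epoxy Scope",
                   "Risk Flags", "Recommendation", "References")

def assemble_report(data: dict) -> list:
    """Emit (heading, body) pairs in a stable order, regardless of the
    order in which extraction produced the data."""
    return [(section, data.get(section, "None noted"))
            for section in REPORT_SECTIONS]

rows = assemble_report({"Recommendation": "REVIEW"})
```

Missing sections still render (with a placeholder), so two reports are always structurally comparable.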
Searchable project library:
- Why: historical bids are strategic assets.
- Approach: keyword search and domain filters (status/type/source/date).
- Principle: retrieval speed enables learning across projects.
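The keyword-plus-filters retrieval above can be sketched as a simple in-memory filter; in production this would be a database query, and the record field names here are assumptions mirroring the filters described.

```python
def filter_projects(projects: list, keyword: str = "",
                    status: str = None, project_type: str = None) -> list:
    """Keyword search over name/notes plus exact-match domain filters.

    Field names ("name", "notes", "status", "type") are illustrative.
    """
    kw = keyword.lower()
    out = []
    for p in projects:
        searchable = (p["name"] + " " + p.get("notes", "")).lower()
        if kw and kw not in searchable:
            continue
        if status and p.get("status") != status:
            continue
        if project_type and p.get("type") != project_type:
            continue
        out.append(p)
    return out

projects = [
    {"name": "Warehouse epoxy", "status": "BID", "type": "industrial"},
    {"name": "Office carpet", "status": "PASS", "type": "commercial"},
]
```

Combining a free-text keyword with structured filters covers both "find that epoxy job" recall and systematic comparisons by status or type.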
Lightweight admin controls:
- Why: internal owner needs operational control and visibility.
- Approach: secure destructive actions + minimal usage counters.
- Principle: lightweight governance without enterprise overhead.
The system uses a web frontend + API backend + Postgres persistence.
- Frontend (React): handles user workflows, forms, filtering, and output presentation.
- Backend (FastAPI): owns auth, business rules, extraction orchestration, report generation, and access control.
- Database (Postgres): stores projects, files, extraction runs, reports, and counters.
Persistence strategy is intentional: uploaded PDFs and generated reports are stored in the database so data survives container restarts and free-tier hosting limitations.
Containerization is used for reproducible environments:
- Local: Docker Compose runs API + Postgres.
- Production (Render): Docker image + managed Postgres, without Compose.
AI is used as an accelerator for document interpretation, not as an autonomous decision-maker. The system automates extraction and structuring, then presents evidence for human judgment.
Reliability and trust are handled by:
- Structured output contracts (schema-driven payloads).
- Explicit run statuses (PENDING, RUNNING, SUCCESS, FAILED).
- Evidence references on extracted items.
- Clear distinction between AI suggestion and business decision.
References and structured JSON matter because they make outputs auditable, debuggable, and reusable across reporting and search.
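A schema-driven payload with run statuses and file/page/excerpt references can be sketched with stdlib dataclasses; the concrete field and class names are assumptions (the real contracts are the backend's schemas).

```python
import json
from dataclasses import dataclass, asdict, field
from enum import Enum

class RunStatus(str, Enum):
    """Explicit lifecycle states for an extraction run."""
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"

@dataclass
class Reference:
    """Evidence pointer attached to an extracted item."""
    file: str
    page: int
    excerpt: str

@dataclass
class ExtractedItem:
    label: str
    references: list = field(default_factory=list)  # list of Reference

item = ExtractedItem(
    label="Epoxy coating excluded in stairwells",
    references=[Reference("specs.pdf", 42, "...stairwells excluded...")],
)
payload = json.dumps(asdict(item))  # auditable, schema-shaped output
```

Because every item serializes with its evidence attached, the same payload feeds the PDF report, the search index, and manual verification.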
- Single-admin model only.
- Synchronous extraction execution.
- Basic usage metrics (not full observability suite).
- No external bidding platform integrations.
- Multi-user roles and permissions.
- Background job queue for heavy extraction workloads.
- Richer analytics and operational dashboards.
- Enhanced model strategy and extraction confidence scoring.
- Broader document intelligence workflows beyond flooring/epoxy.
The architecture is designed to evolve incrementally: clear service boundaries, explicit schemas, and migration-backed data contracts reduce rewrite risk.
Local quick start:

```shell
cd /home/vant/Documents/business/ai-flooring-pdf-analyzer
docker compose up --build -d
```

Open: http://localhost:8000
Default admin credentials:
- Username: admin
- Password: admin123
Optional quick validation:
```shell
python scripts/create_sample_pdfs.py
bash scripts/smoke.sh
```

Stop services:

```shell
docker compose down
```

Notes:
- Local DB host port is 55432 (mapped to container 5432) to avoid conflicts.
- Render deployment configuration is provided in render/render.yaml.
- Environment variable baseline is documented in .env.example.