IDFC GenAI Hackathon | Convolve 4.0
This project is an end-to-end Document AI pipeline designed to extract structured information from invoice images and PDFs.
OCR-based text extraction (Image + PDF)
Rule + fuzzy matching extractors
Model-based Horse Power inference
Stamp & Signature detection (YOLO – optional)
Confidence score recalibration
Streamlit-based UI demo
Batch & single-document inference support
Dealer Name
Model Name
Horse Power
Asset Cost
Stamp Presence + Bounding Box
Signature Presence + Bounding Box
Document Confidence Score
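As a rough illustration, the output fields listed above could be carried in a single record like the one below. This is only a sketch: the class names, key names, bounding-box layout, and sample values are all assumptions made for the example, not the project's actual schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Detection:
    """Presence flag plus an optional box for a stamp or signature."""
    present: bool
    # Assumed layout: [x_min, y_min, x_max, y_max] in pixels.
    bbox: Optional[List[int]] = None


@dataclass
class InvoiceResult:
    """One extracted document. Field names are hypothetical."""
    dealer_name: Optional[str]
    model_name: Optional[str]
    horse_power: Optional[float]
    asset_cost: Optional[float]
    stamp: Detection
    signature: Detection
    confidence: float


# Made-up values, purely to show the shape of the record.
example = InvoiceResult(
    dealer_name="Acme Tractors",
    model_name="AT-450",
    horse_power=45.0,
    asset_cost=525000.0,
    stamp=Detection(present=True, bbox=[120, 840, 310, 1010]),
    signature=Detection(present=False),
    confidence=0.87,
)

# asdict() converts nested dataclasses too, so the result is JSON-ready.
print(json.dumps(asdict(example), indent=2))
```

Keeping presence and bounding box together in one nested object means a missing detection is represented by `present: false` with a null box rather than by an absent key.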
The system follows an end-to-end, modular pipeline that ingests invoice images or PDFs, extracts multilingual text using OCR, converts it into structured JSON, and applies rule-based, fuzzy, and model-driven logic for accurate field extraction. Vision models detect stamps and signatures, while EDA and confidence calibration ensure reliable outputs. A Streamlit UI enables real-time single-document inference, with batch processing supported for offline evaluation.
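The rule-based and fuzzy steps described above might look roughly like the following sketch. It uses only the standard library (`difflib` and `re`) as a stand-in; the function names, the 0.8 cutoff, the candidate list, and the amount pattern are assumptions for illustration and may differ from what the project actually uses.

```python
import difflib
import re
from typing import List, Optional, Tuple


def fuzzy_match(text: str, candidates: List[str],
                cutoff: float = 0.8) -> Tuple[Optional[str], float]:
    """Return the closest candidate and its similarity ratio.

    The name is None when the best ratio falls below the cutoff.
    """
    normalized = text.lower().strip()
    best, best_score = None, 0.0
    for cand in candidates:
        score = difflib.SequenceMatcher(None, normalized, cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score < cutoff:
        return None, best_score
    return best, best_score


def extract_amount(line: str) -> Optional[float]:
    """Rule-based step: take the first number in the line, ignoring commas."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", line)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


# Hypothetical dealer list and OCR output containing a typo.
dealers = ["Acme Tractors", "Zenith Motors"]
name, score = fuzzy_match("Acme Tractrs", dealers)
print(name, round(score, 2))
print(extract_amount("Total: Rs. 5,25,000.00"))
```

The similarity ratio returned by the fuzzy step is also a natural input to a per-field confidence value, since a near-exact match deserves more trust than one that barely clears the cutoff.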
Screenshot shows Streamlit UI with extracted fields & confidence score.
Upload invoice image or PDF
Click Run Extraction
View extracted fields and confidence score
Output JSON generated at:
sample_output/result.json
For an image: python src/executable.py invoice.png
For a PDF: python src/executable.py invoice.pdf
Note: Processes only the given document
python src/executable.py --batch data/train
Important: Processes all images in the folder and generates a combined JSON output.
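Conceptually, batch mode amounts to looping over a folder and collecting one record per image into a single JSON file. The sketch below shows that idea under stated assumptions: `run_batch`, the `extract` callable, the accepted file suffixes, and the `file` key are all hypothetical and are not taken from `src/executable.py`.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

# Assumed set of image extensions; the real pipeline may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def run_batch(folder: Path,
              extract: Callable[[Path], Dict],
              output_path: Path) -> List[Dict]:
    """Run `extract` on every image in `folder` and write one combined JSON file."""
    results = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        record = extract(path)
        # Tag each record with its source file so results stay traceable.
        results.append({"file": path.name, **record})
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return results
```

Passing the extractor in as a callable keeps the loop independent of the OCR and detection code, so it can be exercised with a stub function during testing.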

