This n8n workflow automates the extraction of structured information from PDFs and image files using the Mistral OCR API. It's designed for use cases such as extracting details from invoices or restaurant receipts and storing them in Google Sheets.
- Upload PDF invoices through an n8n form
- Provide image URLs (e.g., receipts) for OCR
- Uses Mistral OCR for high-quality document text extraction
- Extracts specific fields using the LangChain Information Extractor:
  - For PDFs:
    - Invoice Number
    - Date
    - Gross Amount
    - Customer ID
  - For images:
    - Restaurant Name
    - Date
    - Total Bill Amount
- Appends extracted data to a Google Sheets document
- Includes an optional OpenAI Chat node for further enrichment or validation
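
The field lists above can be pictured as two small record types plus a step that flattens each record into a spreadsheet row. The sketch below is illustrative only: the class names, attribute names, column order, and sample values are assumptions, not the workflow's actual schema, and in n8n the Information Extractor and Google Sheets nodes handle this without custom code.

```python
from dataclasses import astuple, dataclass


@dataclass
class InvoiceFields:
    """Fields extracted from a PDF invoice (attribute names are hypothetical)."""
    invoice_number: str
    date: str
    gross_amount: float
    customer_id: str


@dataclass
class ReceiptFields:
    """Fields extracted from a receipt image (attribute names are hypothetical)."""
    restaurant_name: str
    date: str
    total_bill_amount: float


def to_sheet_row(record) -> list:
    """Flatten a record into a list of cell values, in field order."""
    return list(astuple(record))


# Placeholder values, made up for illustration.
invoice = InvoiceFields("INV-001", "2024-01-31", 119.0, "C-42")
receipt = ReceiptFields("Example Bistro", "2024-02-01", 37.5)

print(to_sheet_row(invoice))
print(to_sheet_row(receipt))
```

Keeping one record type per document kind mirrors the two extraction branches (PDFs and images), so each branch can append to its own sheet or tab with a fixed column order.
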
- Form Trigger: Collects PDF invoices from users
- Set Node: Allows testing with static image URLs
- Mistral API Integration, which handles:
  - File upload
  - Signed URL generation
  - OCR processing
- LangChain Extractors: Convert OCR'd text into structured fields
- Google Sheets Node: Writes the extracted information to a live spreadsheet
- OpenAI Chat Node: Optionally reviews or interprets the data
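
The three Mistral steps listed above (file upload, signed URL generation, OCR processing) can be sketched as plain HTTP request descriptions. This is a rough outline, assuming Mistral's public REST endpoints (`/v1/files`, `/v1/files/{id}/url`, `/v1/ocr`), the `mistral-ocr-latest` model name, and a response containing a `pages` list with per-page `markdown`; check these against the current Mistral documentation. The helper names are hypothetical, and the workflow itself makes these calls from n8n nodes rather than Python. The functions only build request descriptions and send nothing.

```python
API_BASE = "https://api.mistral.ai/v1"  # assumed base URL


def upload_request(filename: str) -> dict:
    """Step 1: describe the multipart upload of a PDF intended for OCR."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/files",
        "form": {"purpose": "ocr"},  # assumed form field marking the file for OCR
        "file_field": "file",
        "filename": filename,
    }


def signed_url_request(file_id: str, expiry_hours: int = 24) -> dict:
    """Step 2: describe the request for a temporary signed URL to the uploaded file."""
    return {
        "method": "GET",
        "url": f"{API_BASE}/files/{file_id}/url",
        "params": {"expiry": expiry_hours},
    }


def ocr_request(url: str, is_image: bool = False) -> dict:
    """Step 3: describe the OCR call for a document URL or an image URL."""
    kind = "image_url" if is_image else "document_url"
    return {
        "method": "POST",
        "url": f"{API_BASE}/ocr",
        "json": {
            "model": "mistral-ocr-latest",  # assumed model name
            "document": {"type": kind, kind: url},
        },
    }


def pages_to_text(ocr_response: dict) -> str:
    """Join the per-page markdown of an OCR response into one string."""
    pages = ocr_response.get("pages", [])
    return "\n\n".join(page.get("markdown", "") for page in pages)


# Made-up response in the assumed shape, for illustration only.
sample = {"pages": [{"index": 0, "markdown": "# Invoice"},
                    {"index": 1, "markdown": "Total: 119.00"}]}
print(ocr_request("https://example.com/receipt.jpg", is_image=True)["json"])
print(pages_to_text(sample))
```

Under these assumptions, receipt images supplied by URL skip the first two steps and go straight to the OCR call, while uploaded PDFs pass through all three. The joined text is what the extractor step would receive.
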
- n8n setup (local or cloud)
- Mistral API key with OCR access
- Google Sheets API credentials
- OpenAI API key
- Automating expense reports
- Extracting invoice metadata for accounting
- Digitizing restaurant receipts for tax documentation
Built with n8n and Mistral OCR.