This workflow automates the extraction of structured information from PDFs and image files using the Mistral OCR API.

sushant1827/Mistral-OCR-PDF-Image

🧠 Mistral OCR with n8n

This n8n workflow automates the extraction of structured information from PDFs and image files using the Mistral OCR API. It is designed for use cases such as extracting invoice details or restaurant receipt data and storing the results in Google Sheets.


🔧 Features

  • 📄 Upload PDF invoices through an n8n form
  • 🖼️ Provide image URLs (e.g., receipts) for OCR
  • 🔍 Uses Mistral OCR for high-quality document text extraction
  • 🧠 Extracts specific fields using LangChain Information Extractor:
    • For PDFs:
      • Invoice Number
      • Date
      • Gross Amount
      • Customer ID
    • For images:
      • Restaurant Name
      • Date
      • Total Bill Amount
  • 📤 Appends extracted data to a Google Sheets document
  • 💬 Integrated OpenAI Chat node (optional) for further enrichment or validation
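
As a rough illustration of the extracted fields listed above, the two record shapes could be modelled as follows. This is a minimal sketch in Python, not part of the n8n workflow itself: the class names, attribute names, the `to_sheet_row` helper, and the sample values are all hypothetical, and the column order simply follows the order of the feature list.

```python
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class InvoiceFields:
    """Fields extracted from a PDF invoice (names are illustrative)."""
    invoice_number: Optional[str] = None
    date: Optional[str] = None
    gross_amount: Optional[float] = None
    customer_id: Optional[str] = None


@dataclass
class ReceiptFields:
    """Fields extracted from a receipt image (names are illustrative)."""
    restaurant_name: Optional[str] = None
    date: Optional[str] = None
    total_bill_amount: Optional[float] = None


def to_sheet_row(fields) -> list:
    """Flatten a record into one spreadsheet row, using "" for missing values."""
    return ["" if value is None else value for value in astuple(fields)]


# Made-up sample values, only to show the resulting row shape.
invoice_row = to_sheet_row(InvoiceFields("INV-1", "2024-01-31", 119.0, "C-42"))
receipt_row = to_sheet_row(ReceiptFields(restaurant_name="Cafe Example"))
```

Keeping missing values as empty cells rather than dropping them keeps every appended row aligned with the sheet's header columns.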

🧩 Workflow Nodes Overview

  • Form Trigger – Collects PDF invoices from users
  • Set Node – Allows testing with static image URLs
  • Mistral API Integration – Handles:
    • File upload
    • Signed URL generation
    • OCR processing
  • LangChain Extractors – Convert OCR'd text into structured fields
  • Google Sheets Node – Writes the extracted information to a live spreadsheet
  • OpenAI Chat Node – Optionally reviews or interprets the data
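
The OCR processing step in the list above can be sketched outside n8n as a plain HTTP call. This is a hedged Python sketch, not the workflow's actual node configuration: the endpoint path, the `mistral-ocr-latest` model name, and the request and response shapes are assumptions based on Mistral's public API documentation and should be checked against the current docs. The helper names are hypothetical, and the file upload and signed URL steps are omitted.

```python
import json
import urllib.request

# Assumptions: endpoint and model name as described in Mistral's public docs.
MISTRAL_BASE_URL = "https://api.mistral.ai/v1"
OCR_MODEL = "mistral-ocr-latest"


def build_ocr_payload(url: str, is_image: bool) -> dict:
    """Build the JSON body for an OCR request on a PDF URL or an image URL."""
    if is_image:
        document = {"type": "image_url", "image_url": url}
    else:
        # For uploaded PDFs this would be the signed URL from the previous step.
        document = {"type": "document_url", "document_url": url}
    return {"model": OCR_MODEL, "document": document}


def pages_to_text(ocr_response: dict) -> str:
    """Join the per-page markdown of an OCR response into one text blob."""
    pages = ocr_response.get("pages", [])
    return "\n\n".join(page.get("markdown", "") for page in pages)


def run_ocr(api_key: str, url: str, is_image: bool) -> dict:
    """Send the OCR request and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{MISTRAL_BASE_URL}/ocr",
        data=json.dumps(build_ocr_payload(url, is_image)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The text returned by `pages_to_text` is what would then be handed to the extractor step to pull out the structured fields.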

🧩 PDF Workflow

[Image: PDF workflow]


🧩 Image Workflow

[Image: image workflow]


🛠 Requirements

  • n8n setup (local or cloud)
  • Mistral API key with OCR access
  • Google Sheets API credentials
  • OpenAI API key

📂 Example Use Cases

  • Automating expense reports
  • Extracting invoice metadata for accounting
  • Digitizing restaurant receipts for tax documentation

Built with 💛 using n8n and Mistral OCR.
