@@ -31,6 +31,55 @@ Go from PDF files to this:
 {'date': (2014, 8, 3), 'invoice_number': '42183017', 'amount': 4.11, 'desc': 'Invoice 42183017 from Amazon Web Services'}
 {'date': (2015, 1, 28), 'invoice_number': '12429647', 'amount': 101.0, 'desc': 'Invoice 12429647 from Envato'}
 
+``` mermaid
+flowchart LR
+
+InvoiceFile[fa:fa-file-invoice Invoicefile\n\npdf\nimage\ntext] --> Input-module(Input Module\n\npdftotext\ntext\npdfminer\npdfplumber\ntesseract\ngvision)
+
+Input-module --> |Extracted Text| C{keyword\nmatching}
+
+Invoice-Templates[(fa:fa-file-lines Invoice Templates)] --> C{keyword\nmatching}
+
+C --> |Extracted Text + fa:fa-file-circle-check Template| E(Template Processing\n apply options from template\nremove accents, replaces etc...)
+
+E --> |Optimized String|Plugins&Parsers(Call plugins + parsers)
+
+subgraph Plugins&Parsers
+
+direction BT
+
+tables[fa:fa-table tables] ~~~ lines[fa:fa-grip-lines lines]
+
+lines ~~~ regex[fa:fa-code regex]
+
+regex ~~~ static[fa:fa-check static]
+
+
+
+end
+
+Plugins&Parsers --> |output| result[result\nfa:fa-file-csv,\njson,\nXML]
+
+
+
+click Invoice-Templates https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md
+
+click result https://github.com/invoice-x/invoice2data#usage
+
+click Input-module https://github.com/invoice-x/invoice2data#installation-of-input-modules
+
+click E https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md#options
+
+click tables https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md#tables
+
+click lines https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md#lines
+
+click regex https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md#regex
+
+click static https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md#parser-static
+
+```
+
 ## Installation
 
 1. Install pdftotext
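The flowchart added in this change describes a pipeline: extracted text is matched to a template by keywords, the template's options are applied to the text, and the parsers then pull out fields. The following is a minimal Python sketch of that flow, not invoice2data's actual implementation. The template layout, the `remove_accents` option key, and the helper names (`match_template`, `prepare`, `parse_fields`, `extract`) are all assumptions made for illustration; only a regex-style parser is shown, and the input module step (PDF to text) is skipped by starting from a plain string.

```python
import re
import unicodedata

# Hypothetical template list. The keys used here are illustrative only and
# are not invoice2data's real template schema.
TEMPLATES = [
    {
        "issuer": "Amazon Web Services",
        "keywords": ["Amazon Web Services"],
        "options": {"remove_accents": True},
        "fields": {
            "invoice_number": r"Invoice Number:\s+(\d+)",
            "amount": r"TOTAL AMOUNT DUE.*\$(\d+\.\d+)",
        },
    }
]


def match_template(text, templates):
    """Keyword matching: return the first template whose keywords all occur in the text."""
    for template in templates:
        if all(keyword in text for keyword in template["keywords"]):
            return template
    return None


def prepare(text, options):
    """Template processing: apply the template's options to the extracted text."""
    if options.get("remove_accents"):
        # Decompose accented characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", text)
        text = "".join(c for c in decomposed if not unicodedata.combining(c))
    return text


def parse_fields(text, template):
    """Regex parser: capture the first group of each field pattern that matches."""
    result = {"issuer": template["issuer"]}
    for name, pattern in template["fields"].items():
        match = re.search(pattern, text)
        if match:
            result[name] = match.group(1)
    return result


def extract(text, templates):
    """Run the whole sketch: match a template, prepare the text, parse the fields."""
    template = match_template(text, templates)
    if template is None:
        return None  # no template recognised this document
    optimized = prepare(text, template["options"])
    return parse_fields(optimized, template)


if __name__ == "__main__":
    sample = (
        "Amazon Web Services\n"
        "Invoice Number: 42183017\n"
        "TOTAL AMOUNT DUE ON August 3, 2014 $4.11"
    )
    print(extract(sample, TEMPLATES))
```

Captured values stay as strings in this sketch; converting them to numbers or dates, and the other parser types named in the diagram (tables, lines, static), are left out to keep the example short.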