ai/generative-ai-service/smart-invoice-extraction/README.md
21 additions & 7 deletions
@@ -2,11 +2,16 @@
An intelligent invoice data extractor built with **OCI Generative AI**, **LangChain**, and **Streamlit**. Upload any invoice PDF and this app will extract structured data like REF. NO., POLICY NO., DATES, etc. using multimodal LLMs.
- 🔍 Automatically identifies key invoice headers using OCI Vision LLM (LLaMA 3.2 90B Vision or Llama 4 Maverick)
- 🤖 Lets you choose what elements to extract (with type selection)
- 🧠 Leverages a text-based LLM (Cohere Command R+) for context-aware value extraction
- 🧪 Outputs data in clean **JSON** and saves to **CSV**
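As a rough illustration of the last bullet, the sketch below appends one extracted JSON record to a CSV file using only the standard library. The function name `append_record`, the column list, and the sample values are hypothetical and not taken from the app's source; the app itself may write its CSV differently.

```python
import csv
import json
import os


def append_record(record_json: str, csv_path: str, columns: list[str]) -> dict:
    """Parse one JSON record and append it as a row to csv_path.

    Hypothetical helper: `columns` fixes the CSV column order, and keys
    missing from the record are written as empty strings.
    """
    record = json.loads(record_json)
    row = {col: record.get(col, "") for col in columns}

    # Write the header only when the file is new or empty.
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Fixing the column order up front keeps rows from different invoices aligned even when the model omits a field.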
@@ -32,16 +37,16 @@ An intelligent invoice data extractor built with **OCI Generative AI**, **LangCh
1. **User Uploads Invoice PDF**
The file is uploaded and converted into an image using `pdf2image` (Ensure you upload one page documents ONLY)
-2. **Initial Header Detection (LLaMA-3.2 Vision)**
+2. **Initial Header Detection (LLaMA-3.2 Vision or Llama 4 Maverick)**
The first page is passed to the multimodal LLM which returns a list of fields that are likely to be useful (e.g., "Policy No.", "Amount", "Underwriter").
3. **User Selects Fields and Types**
A UI allows the user to pick 3 fields from the detected list, and specify their data types (Text, Number, etc.).
-4. **Prompt Generation (Cohere Command R+)**
+4. **Prompt Generation (Cohere Command A)**
The second LLM generates a custom system prompt to extract those fields as JSON.
-5. **Full Invoice Extraction (LLaMA-3.2 Vision)**
+5. **Full Invoice Extraction (LLaMA-3.2 Vision or Llama 4 Maverick)**
Each page image is passed into the multimodal LLM using the custom prompt, returning JSON values for the requested fields.
6. **Data Saving & Display**
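Steps 3 to 5 can be pictured with a small, self-contained sketch: given the fields and types the user picked, parse the JSON returned by the model and coerce each value. Everything here (the `coerce_fields` name, the `"Text"` and `"Number"` labels used as dictionary values, the sample reply) is an assumption for illustration. It does not call OCI Generative AI or LangChain and is not the app's actual code.

```python
import json


def coerce_fields(reply: str, selected: dict[str, str]) -> dict:
    """Keep only the selected fields from a JSON reply and coerce their types.

    `selected` maps field name -> "Text" or "Number" (hypothetical labels).
    Missing or unparseable values become None instead of raising.
    """
    # Models sometimes wrap JSON in prose or code fences; take the outermost {...}.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start:end + 1])

    result = {}
    for name, kind in selected.items():
        value = data.get(name)
        if value is None:
            result[name] = None
        elif kind == "Number":
            try:
                # Strip thousands separators such as "1,250.50".
                result[name] = float(str(value).replace(",", ""))
            except ValueError:
                result[name] = None
        else:
            result[name] = str(value)
    return result
```

For a reply such as `'Here is the data: {"Policy No.": "PN-1", "Amount": "1,250.50"}'` and a selection of `{"Policy No.": "Text", "Amount": "Number"}`, this returns `{"Policy No.": "PN-1", "Amount": 1250.5}`. Returning `None` for missing values, rather than raising, lets a multi-page run continue when one page lacks a field.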
@@ -86,8 +91,8 @@ streamlit run app.py
> - Replace all instances of `<YOUR_COMPARTMENT_OCID_HERE>` with your actual **OCI Compartment OCID**
> - Ensure you have access to **OCI Generative AI Services** with correct permissions
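A forgotten placeholder is an easy mistake to make, so a start-up check along the following lines may help. This is a minimal sketch: `check_compartment_id` is a hypothetical helper that is not part of the app, and the only format assumption is that OCIDs begin with `ocid1.`.

```python
PLACEHOLDER = "<YOUR_COMPARTMENT_OCID_HERE>"


def check_compartment_id(value: str) -> str:
    """Return the stripped value, or raise if it still looks unset."""
    cleaned = value.strip()
    if not cleaned or cleaned == PLACEHOLDER:
        raise ValueError("compartment OCID placeholder has not been replaced")
    if not cleaned.startswith("ocid1."):
        raise ValueError("value does not look like an OCID (expected 'ocid1.' prefix)")
    return cleaned
```

Calling this once before any request to the service surfaces the configuration problem with a clear message instead of a later authorization error.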
Licensed under the Universal Permissive License (UPL), Version 1.0.
+
+See [LICENSE](LICENSE.txt) for more details.
+
+ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.