Commit fbd09b6

added OCR for image text reading
1 parent ecf3781 commit fbd09b6

5 files changed, 17 insertions(+), 7 deletions(-)

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ wandb/
 
 # datasets
 data/
+old-data/
 
 # outputs
 outputs/

README.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# available file formats
+text files, images (with text through OCR)
+
+# needed for image OCR
+sudo apt update
+sudo apt install -y tesseract-ocr libtesseract-dev
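
Note on the README change: the apt packages above supply the `tesseract` binary that image OCR relies on. A minimal sketch of a pre-flight check, assuming the binary is exposed on PATH under the name `tesseract`; the helper name `tesseract_available` is hypothetical and not part of this commit:

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if an executable with the given name is found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if not tesseract_available():
        # Assumption: without the binary, OCR on image files is likely to fail.
        print("tesseract not found on PATH; install tesseract-ocr before indexing images")
```

Running a check like this before indexing would surface a missing install early instead of partway through loading.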

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -7,4 +7,4 @@ pypdf
 python-dotenv
 pinecone
 langgraph
-unstructured[pdf,docx,pptx,md]
+unstructured[pdf,docx,pptx,md,image]

src/any_chatbot/agent.py

Lines changed: 3 additions & 4 deletions
@@ -46,10 +46,9 @@
 config = {"configurable": {"thread_id": random.random()}}
 
 input_message = (
-"First retrieve what the revenue for Nike in 2023 was using the functional call.\n\n"
-"Once you get the answer, do a second retrieve to tell me which distribution centers nike have.\n\n"
-"Once you get the second answer,, tell me how many employees nike has. You can retreive MULTIPLE TIMES\n\n"
-"Base your answers only on the retrieved information thorugh the functional call you have."
+"What is the content of the image?\n\n"
+"When you don't know while files the user is talking about, use the functional call to retrieve what data is available with a general prompt.\n\n"
+"Base your answers only on the retrieved information thorugh the functional call you have. You can retreive MULTIPLE TIMES"
 )
 
 for event in agent_executor.stream(

src/any_chatbot/indexing.py

Lines changed: 6 additions & 2 deletions
@@ -27,12 +27,16 @@ def index_text_docs(
 "**/*.md",
 "**/*.html",
 "**/*.txt",
+"**/*.png",
+"**/*.jpg",
+"**/*.jpeg",
+"**/*.tiff",
 ],
 loader_cls=UnstructuredFileLoader
 )
-print(f"Loading docs from {data_pth}")
+print(f"Loading files from {data_pth}")
 docs = loader.load()
-print(f"Loaded {len(docs)} docs")
+print(f"Loaded {len(docs)} files")
 
 # Split the texts
 text_splitter = RecursiveCharacterTextSplitter(
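
Note on the indexing change: the hunk adds four image glob patterns to the loader. A rough way to preview which files such patterns would pick up, using only `pathlib`. This is a sketch under assumptions: `PATTERNS` lists only the patterns visible in this hunk (the full list in `indexing.py` may contain more), `matching_files` is a hypothetical helper, and the project's loader may resolve globs differently.

```python
from pathlib import Path

# Only the patterns visible in the hunk above; the real list may be longer.
PATTERNS = [
    "**/*.md",
    "**/*.html",
    "**/*.txt",
    "**/*.png",
    "**/*.jpg",
    "**/*.jpeg",
    "**/*.tiff",
]


def matching_files(root, patterns=PATTERNS):
    """Return a sorted list of files under root that match any glob pattern."""
    root = Path(root)
    found = set()
    for pattern in patterns:
        # "**" also matches zero directories, so files directly in root count.
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)
```

For example, `matching_files("data")` would list the text and image files under `data/`. On case-sensitive filesystems an upper-case extension such as `.PNG` would not match these patterns.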
