Commit 609dce0

improve instructions for using existing OCR results
1 parent a183dce commit 609dce0

1 file changed: 3 additions, 0 deletions

notebooks/field_extraction_pro_mode.ipynb

Lines changed: 3 additions & 0 deletions
@@ -171,6 +171,9 @@
 "REFERENCE_DOC_SAS_URL = os.getenv(\"REFERENCE_DOC_SAS_URL\")\n",
 "REFERENCE_DOC_PATH = os.getenv(\"REFERENCE_DOC_PATH\")\n",
 "\n",
+"# Set skip_analyze to True if you already have OCR results for the documents in the reference_docs folder\n",
+"# Please name the OCR result files with the same name as the original document files including its extension, and add the suffix \".result.json\"\n",
+"# For example, if the original document is \"invoice.pdf\", the OCR result file should be named \"invoice.pdf.result.json\"\n",
 "await client.generate_knowledge_base_on_blob(reference_docs, REFERENCE_DOC_SAS_URL, REFERENCE_DOC_PATH, skip_analyze=False)"
 ]
 },
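
The naming rule described in the added comments can be checked before setting `skip_analyze=True`. The following is a minimal sketch, not part of the notebook or the client library: the helper names `ocr_result_name` and `documents_missing_ocr` are hypothetical, and it assumes the documents and their OCR result files sit side by side in one local folder.

```python
from pathlib import Path
from typing import List

# Suffix stated in the notebook comment: "<original file name>.result.json"
OCR_RESULT_SUFFIX = ".result.json"


def ocr_result_name(document_name: str) -> str:
    """Return the expected OCR result file name for a document.

    The full original name, including its extension, is kept and the
    suffix is appended, e.g. "invoice.pdf" -> "invoice.pdf.result.json".
    """
    return document_name + OCR_RESULT_SUFFIX


def documents_missing_ocr(folder: Path) -> List[str]:
    """List documents in `folder` that have no matching OCR result file.

    Hypothetical helper: any file not ending in ".result.json" is treated
    as a document; subfolders are ignored.
    """
    names = {p.name for p in folder.iterdir() if p.is_file()}
    documents = sorted(n for n in names if not n.endswith(OCR_RESULT_SUFFIX))
    return [d for d in documents if ocr_result_name(d) not in names]
```

One possible use is to call `documents_missing_ocr(Path("reference_docs"))` (the folder name here is taken from the comment and may differ in your setup) and pass `skip_analyze=True` only when the returned list is empty.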
