create readme.md

sunilemanjee · sunilemanjee · commit 5ee8ee9342c9 · 2024-10-15T19:28:36.000-07:00
created readme.md
diff --git a/supporting-blog-content/alternative-approach-for-parsing-pdfs-in-rag/README.md b/supporting-blog-content/alternative-approach-for-parsing-pdfs-in-rag/README.md
@@ -0,0 +1,19 @@
+
+# PDF Parsing - Table Extraction
+
+ Python notebook demonstrates an alternative approach to parsing PDFs, particularly focusing on extracting and converting tables into a format suitable for search applications such as Retrieval-Augmented Generation (RAG). The notebook leverages Azure OpenAI to process and convert table data from PDFs into plain text for better searchability and indexing.
+
+## Features
+- **PDF Table Extraction**: The notebook identifies and parses tables from PDFs.
+- **LLM Integration**: Calls Azure OpenAI models to provide a text representation of the extracted tables.
+- **Search Optimization**: The parsed table data is processed into a format that can be more easily indexed and searched in Elasticsearch or other vector-based search systems.
+  
+## Getting Started
+
+### Prerequisites
+- Python 3.x
+
+
+## Example Use Case
+This notebook is ideal for use cases where PDFs contain structured tables that need to be converted into plain text for indexing and search applications in environments like Elasticsearch or similar search systems.
+