
Commit 247da1c

removed colab link
Removed the Colab link to check whether it is why the build is failing.
1 parent 649d811 commit 247da1c


1 file changed: +2 -3 lines changed


supporting-blog-content/alternative-approach-for-parsing-pdfs-in-rag/alternative-approach-for-parsing-pdfs-in-rag.ipynb

Lines changed: 2 additions & 3 deletions
@@ -6,8 +6,7 @@
 "id": "e9-GuDRKCz_1"
 },
 "source": [
-"# PDF Parsing - Table Extraction\n",
-"<a target=\"_blank\" href=\"https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/supporting-blog-content/alternative-approach-for-parsing-pdfs-in-rag/alternative-approach-for-parsing-pdfs-in-rag.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n"
+"# PDF Parsing - Table Extraction\n"
 ]
 },
 {
@@ -16,7 +15,7 @@
 "id": "MBdflc9G0ICc"
 },
 "source": [
-"##Objective\n",
+"## Objective\n",
 "This Python script extracts text and tables from a PDF file, converts the tables into a human-readable text format using Azure OpenAI, and writes the processed content to a text file. The script uses pdfplumber to extract text and table data from each page of the PDF. For tables, it sends a cleaned version (handling any missing or None values) to Azure OpenAI, which generates a natural language summary of the table. The extracted non-table text and the summarized table text are then saved to a text file for easy search and readability."
 ]
 },
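
For readers without the notebook open, the flow described in the Objective cell can be sketched roughly as follows. This is not the notebook's code. The helper names (`clean_table`, `table_to_text`, `process_pdf`) and the `summarize` callback are hypothetical, and the callback stands in for the Azure OpenAI call. Only `pdfplumber.open`, `page.extract_text()` and `page.extract_tables()` are real pdfplumber calls.

```python
from typing import Callable, List, Optional

# A table as returned by pdfplumber: rows of cells, where a cell may be None.
Table = List[List[Optional[str]]]


def clean_table(table: Table) -> List[List[str]]:
    """Replace None cells with empty strings and strip surrounding whitespace."""
    return [[(cell or "").strip() for cell in row] for row in table]


def table_to_text(table: Table) -> str:
    """Render a cleaned table as pipe-separated rows, ready to place in a prompt."""
    return "\n".join(" | ".join(row) for row in clean_table(table))


def process_pdf(pdf_path: str, out_path: str, summarize: Callable[[str], str]) -> None:
    """Write page text plus a summary of each table to out_path.

    `summarize` is a hypothetical callback; in the notebook's setting it would
    wrap a request to an Azure OpenAI deployment.
    """
    import pdfplumber  # third-party dependency, imported lazily

    with pdfplumber.open(pdf_path) as pdf, open(out_path, "w", encoding="utf-8") as out:
        for page in pdf.pages:
            # Assumption: page text is written as-is; this sketch does not
            # strip table text out of it.
            out.write((page.extract_text() or "") + "\n")
            for table in page.extract_tables():
                out.write(summarize(table_to_text(table)) + "\n")
```

Passing the summarizer in as a callback keeps the extraction and cleaning steps testable without credentials or network access.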
