Commit 1418b04

author
KB Bot
committed
Added new kb article convert-pdf-table-to-datatable
1 parent 8a1532f commit 1418b04

1 file changed

+62
-0
lines changed

@@ -0,0 +1,62 @@
1+
---
2+
title: Converting PDF Table Content to DataTable
3+
description: Learn how to transform a table from a PDF file into a DataTable object using the Telerik Document Processing libraries.
4+
type: how-to
5+
page_title: How to Convert PDF Table to DataTable with Telerik Document Processing
6+
slug: convert-pdf-table-to-datatable
7+
tags: document, processing, table, datatable, convert
8+
res_type: kb
9+
ticketid: 1675626
10+
---
11+
12+
## Environment
13+
14+
| Version | Product | Author |
15+
| ---- | ---- | ---- |
16+
| 2024.4.1106 | Telerik Document Processing Libraries | [Desislava Yordanova](https://www.telerik.com/blogs/author/desislava-yordanova) |
17+
18+
## Description
19+
20+
Learn how to convert a specific table from a PDF file into a DataTable object using Telerik Document Processing libraries.
21+
22+
## Solution
23+
24+
The Telerik Document Processing libraries do not offer a direct method for converting a PDF table to a DataTable object. As a workaround, use MS Excel or [RadSpreadsheet](https://docs.telerik.com/devtools/winforms/controls/spreadsheet/overview) as an intermediate step:
25+
26+
1. Select and copy the desired table's content from the PDF file.
27+
2. Paste the copied content into MS Excel or RadSpreadsheet. This step places the table data into worksheet cells.
28+
3. Save the document as an XLSX file, either from MS Excel or with [RadSpreadProcessing]({%slug radspreadprocessing-overview%}).
29+
4. Import the XLSX document with RadSpreadProcessing and convert the worksheet to a DataTable with the [DataTableFormatProvider]({%slug radspreadprocessing-formats-and-conversion-using-data-table-format-provider%}).
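
As an illustration of step 3, the following is a minimal sketch of saving a workbook to XLSX with the `XlsxFormatProvider`. The `XlsxSaver` class, its method names, the sample cell values, and the output path are hypothetical and only stand in for the content pasted in step 2. The sketch assumes a project reference to the RadSpreadProcessing assemblies.

```csharp
using System.IO;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders.OpenXml.Xlsx;
using Telerik.Windows.Documents.Spreadsheet.Model;

public static class XlsxSaver
{
    // Hypothetical stand-in for the workbook that holds the pasted table.
    public static Workbook BuildSampleWorkbook()
    {
        Workbook workbook = new Workbook();
        Worksheet worksheet = workbook.Worksheets.Add();

        // Header row followed by one data row (illustrative values only).
        worksheet.Cells[0, 0].SetValue("Product");
        worksheet.Cells[0, 1].SetValue("Quantity");
        worksheet.Cells[1, 0].SetValue("Widget");
        worksheet.Cells[1, 1].SetValue(3.0);

        return workbook;
    }

    // Writes the workbook to the given path in XLSX format.
    public static void Save(Workbook workbook, string path)
    {
        XlsxFormatProvider formatProvider = new XlsxFormatProvider();
        using (Stream output = File.Create(path))
        {
            formatProvider.Export(workbook, output);
        }
    }
}
```

A call such as `XlsxSaver.Save(XlsxSaver.BuildSampleWorkbook(), "table.xlsx");` would produce the file consumed in step 4. Recent releases may also offer an `Export` overload that accepts a timeout, so check the API reference for the version you use.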
30+
31+
The following code snippet converts an XLSX document to a DataTable using RadSpreadProcessing:
32+
33+
```csharp
using System.Data;
using System.IO;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders;
using Telerik.Windows.Documents.Spreadsheet.FormatProviders.OpenXml.Xlsx;
using Telerik.Windows.Documents.Spreadsheet.Model;

// Load the XLSX file (replace the placeholder path with your own).
Workbook workbook;
using (FileStream input = new FileStream("path_to_your_xlsx_file.xlsx", FileMode.Open, FileAccess.Read))
{
    IWorkbookFormatProvider formatProvider = new XlsxFormatProvider();
    workbook = formatProvider.Import(input);
}

// Convert the first worksheet to a DataTable.
Worksheet worksheet = workbook.Worksheets[0];
DataTableFormatProvider dataTableFormatProvider = new DataTableFormatProvider();
DataTable dataTable = dataTableFormatProvider.Export(worksheet);
```
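
To check the result of the conversion, it can help to print the DataTable. The helper below is a minimal sketch that uses only `System.Data`. The `DataTableInspector` class and the hand-built sample table are hypothetical and stand in for the `dataTable` produced by the snippet above. The sketch makes no assumption about how the provider names the columns.

```csharp
using System;
using System.Data;
using System.Text;

public static class DataTableInspector
{
    // Renders a DataTable as tab-separated text: column names first, then one line per row.
    public static string Dump(DataTable table)
    {
        StringBuilder builder = new StringBuilder();

        string[] columnNames = new string[table.Columns.Count];
        for (int i = 0; i < columnNames.Length; i++)
        {
            columnNames[i] = table.Columns[i].ColumnName;
        }
        builder.AppendLine(string.Join("\t", columnNames));

        foreach (DataRow row in table.Rows)
        {
            // DBNull values render as empty strings.
            builder.AppendLine(string.Join("\t", row.ItemArray));
        }

        return builder.ToString();
    }

    // Hypothetical sample data, used here only so the sketch runs without an XLSX file.
    public static DataTable CreateSampleTable()
    {
        DataTable table = new DataTable("Sample");
        table.Columns.Add("Product", typeof(string));
        table.Columns.Add("Quantity", typeof(double));
        table.Rows.Add("Widget", 3.0);
        table.Rows.Add("Gadget", DBNull.Value);
        return table;
    }
}
```

With the converted table in hand, `Console.Write(DataTableInspector.Dump(dataTable));` prints its contents so you can confirm that the rows and columns match the table in the PDF.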
54+
55+
This workaround brings PDF table content into a DataTable by way of a spreadsheet. The copy-and-paste step is manual; the import of the XLSX file and the conversion to a DataTable are handled by RadSpreadProcessing.
56+
57+
## See Also
58+
59+
- [RadWordsProcessing Overview](https://docs.telerik.com/devtools/document-processing/libraries/radwordsprocessing/overview)
60+
- [RadSpreadProcessing Overview](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/overview)
61+
- [Using DataTable Format Provider](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/formats-and-conversion/data-table/using-data-table-format-provider)
62+
- [Import and Export to Excel File Formats](https://docs.telerik.com/devtools/document-processing/libraries/radspreadprocessing/formats-and-conversion/import-and-export-to-excel-file-formats/xlsx/xlsx)
