Commit 90c13aa

refactor: rename the whole project into pdf2table
1 parent fe28ed2 commit 90c13aa

28 files changed: +56 -56 lines changed
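
The changes in this commit are mechanical: each `table_rag` import path becomes `pdf2table`. The commit does not show how the rename was carried out, so the following is only a minimal sketch of one way such a rewrite could be automated. The helper name `rename_package_imports` is hypothetical and not part of the project, and the sketch only handles plain `import x` / `from x import y` lines, not strings, docs, or comma-separated imports.

```python
import re


def rename_package_imports(source: str, old: str = "table_rag", new: str = "pdf2table") -> str:
    """Rewrite `import old...` and `from old... import ...` lines to use `new`.

    Hypothetical helper for illustration only; it is not taken from the project.
    """
    # Match the package name only when it is the first component of the module
    # path, i.e. followed by ".", whitespace, or the end of the line.
    pattern = re.compile(
        rf"^([ \t]*(?:from|import)[ \t]+){re.escape(old)}(?=[.\s]|$)",
        flags=re.MULTILINE,
    )
    # A function replacement avoids any escape handling in the new name.
    return pattern.sub(lambda m: m.group(1) + new, source)


if __name__ == "__main__":
    before = (
        "from table_rag.usecases.dtos import TableExtractionRequest\n"
        "import table_rag_extra\n"  # different package, must stay untouched
    )
    print(rename_package_imports(before))
```

The lookahead keeps similarly named packages (such as the made-up `table_rag_extra` above) untouched. Documentation and directory names, which this commit also changes, would need a separate pass.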

README.md

Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
-# TableRag
+# Pdf2Table

 A RAG (Retrieval-Augmented Generation) application for detecting, extracting, and indexing tables from PDF documents and finally inferring on them.

@@ -24,7 +24,7 @@ This project aims to provide a robust solution for extracting tabular data from

 ## Project Structure

-- `table_rag/`: Main package
+- `pdf2table/`: Main package
 - `adaptors/`: Interface with external systems (Elasticsearch, PDF reader, Table Transformer)
 - `entities/`: Domain models
 - `usecases/`: Application logic
@@ -42,8 +42,8 @@ pip install -e .

 ### Usage
 ```python
-from table_rag.frameworks.table_extraction_factory import TableExtractionFactory
-from table_rag.usecases.dtos import TableExtractionRequest
+from pdf2table.frameworks.table_extraction_factory import TableExtractionFactory
+from pdf2table.usecases.dtos import TableExtractionRequest

 # Initialize the factory
 factory = TableExtractionFactory()

docs/architecture_guide.md

Lines changed: 8 additions & 8 deletions
@@ -2,7 +2,7 @@

 ## Directory Structure
 ```
-table_rag/
+pdf2table/
 ├── entities/
 │ └── table_entities.py
 ├── usecases/
@@ -22,7 +22,7 @@ table_rag/

 ## Architecture Layers

-### 1. Entities Layer (`table_rag/entities/`)
+### 1. Entities Layer (`pdf2table/entities/`)
 - **table_entities.py**: Core business entities and domain services
 - `BoundingBox`: Value object for coordinates
 - `DetectedCell`: Detected table cell entity
@@ -31,7 +31,7 @@ table_rag/
 - `DetectedTable`: Detected table with metadata
 - `PageImage`: PDF page image entity

-### 2. Use Cases Layer (`table_rag/usecases/`)
+### 2. Use Cases Layer (`pdf2table/usecases/`)
 - **table_extraction_use_case.py**: Application business logic
 - `TableExtractionUseCase`: Orchestrates table extraction workflow
 - `TableGridBuilder`: Builds structured grids from detected cells
@@ -43,12 +43,12 @@ table_rag/
 - `TableExtractionRequest`: Request DTO for table extraction
 - `TableExtractionResponse`: Response DTO for table extraction

-### 3. Interface Adapters Layer (`table_rag/adaptors/`)
+### 3. Interface Adapters Layer (`pdf2table/adaptors/`)
 - **table_extraction_ports.py**: Abstract interfaces and DTOs
 - Port interfaces: `PDFImageExtractorPort`, `TableDetectorPort`, etc.
 - `TableExtractionAdapter`: Coordinates between use cases and external interfaces

-### 4. Frameworks & Drivers Layer (`table_rag/frameworks/`)
+### 4. Frameworks & Drivers Layer (`pdf2table/frameworks/`)
 - **pdf_image_extractor.py**: PyMuPDF implementation
 - **table_transformer_detector.py**: Table detection using Transformer models
 - **table_structure_recognizer.py**: Structure recognition using Transformer models
@@ -58,7 +58,7 @@ table_rag/

 ### Usage (Simple)
 ```python
-from table_rag.frameworks.table_extraction_factory import TableExtractionService
+from pdf2table.frameworks.table_extraction_factory import TableExtractionService

 service = TableExtractionService(device="cpu")
 result = service.extract_tables_from_page(pdf_path, page_number)
@@ -67,8 +67,8 @@ tables = result["tables"]

 ### Usage (Advanced)
 ```python
-from table_rag.frameworks.table_extraction_factory import TableExtractionFactory
-from table_rag.usecases.dtos import TableExtractionRequest
+from pdf2table.frameworks.table_extraction_factory import TableExtractionFactory
+from pdf2table.usecases.dtos import TableExtractionRequest

 # Create with custom configuration
 adapter = TableExtractionFactory.create_table_extraction_adapter(
File renamed without changes.
File renamed without changes.

table_rag/adaptors/table_extraction_adaptor.py renamed to pdf2table/adaptors/table_extraction_adaptor.py

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-from table_rag.usecases.dtos import TableExtractionRequest, TableExtractionResponse
-from table_rag.usecases.table_extraction_use_case import TableExtractionUseCase
+from pdf2table.usecases.dtos import TableExtractionRequest, TableExtractionResponse
+from pdf2table.usecases.table_extraction_use_case import TableExtractionUseCase


 class TableExtractionAdapter:
File renamed without changes.
File renamed without changes.
File renamed without changes.

table_rag/frameworks/ocr_service.py renamed to pdf2table/frameworks/ocr_service.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 from PIL import Image
 from transformers import TrOCRProcessor, VisionEncoderDecoderModel

-from table_rag.usecases.interfaces.framework_interfaces import OCRInterface
+from pdf2table.usecases.interfaces.framework_interfaces import OCRInterface


 class TrOCRService(OCRInterface):

table_rag/frameworks/pdf_image_extractor.py renamed to pdf2table/frameworks/pdf_image_extractor.py

Lines changed: 2 additions & 2 deletions
@@ -2,8 +2,8 @@
 import numpy as np
 import fitz

-from table_rag.entities.table_entities import PageImage
-from table_rag.usecases.interfaces.framework_interfaces import (
+from pdf2table.entities.table_entities import PageImage
+from pdf2table.usecases.interfaces.framework_interfaces import (
     PDFImageExtractorInterface,
 )
