document_text_extractor

Overview

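document_text_extractor is a Python package for extracting text from documents. It relies on Tesseract OCR (through pytesseract and Pillow) for images and on PyMuPDF for PDF files. FastAPI, uvicorn, and python-multipart are also among its dependencies, which points to an HTTP upload interface.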
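
To give a sense of what the OCR and PDF dependencies listed in this README do, here is a minimal sketch that calls pytesseract and PyMuPDF directly. It is not this package's own API, which this README does not document. The function name `extract_text` and the set of image suffixes are assumptions made for illustration, and the PDF branch reads only the embedded text layer, so scanned PDFs would need an extra rasterize-then-OCR step.

```python
from pathlib import Path

# Hypothetical list of image types to send to OCR (an assumption, not from the package).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def extract_text(path):
    """Return text from a PDF (embedded text layer) or an image (Tesseract OCR)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import fitz  # PyMuPDF; imported lazily so it is only required for PDFs

        with fitz.open(path) as doc:
            # Concatenate the plain text of every page.
            return "\n".join(page.get_text() for page in doc)
    if suffix in IMAGE_SUFFIXES:
        import pytesseract  # requires the Tesseract binary on PATH
        from PIL import Image

        with Image.open(path) as img:
            return pytesseract.image_to_string(img, lang="eng")
    raise ValueError(f"Unsupported file type: {suffix or path}")
```

Calling `extract_text("scan.png")` would run OCR on the image, while `extract_text("report.pdf")` would read the PDF text layer; both file names are placeholders.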

Installation

pip install document_text_extractor==1.0.0


Repository layout:

├── document_text_extractor
│ ├── document_text_extractor
│ │ ├── data
│ │ ├── __init__.py
│ │ ├── CommonOperations.py
│ │ ├── Configuration.py
│ │ ├── GetTextFromImage.py
│ │ ├── DocumentTextExtractor.py
│ │ ├── PDFFileDplitter.py
│ ├── MANIFEST.in
│ ├── README.md
│ ├── setup.py

Tesseract OCR must also be installed on the system. On Ubuntu/Debian:

sudo apt-get update
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-eng

Current Version: v1.0.0


How to Run

Follow these steps to set up and run the Document Text Extractor on your local machine.

Prerequisites

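Python 3.8 or higher (the pinned Pillow 10.2.0 does not support older versions)
Tesseract OCR (see the platform-specific steps under Installation)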

Installation

1. Clone the repository:

 git clone https://github.com/harshad208/document_text_extractor.git

2. Navigate to the project directory:

 cd document_text_extractor/document_text_extractor

3. Create and activate a virtual environment:

 python -m venv venv
 source venv/bin/activate

4. Install the Python dependencies:

 pip install -r requirements.txt

Install Tesseract OCR for your platform.

For Linux (Ubuntu/Debian)

sudo apt-get update
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-eng # Optional: English language pack

For Windows

1. Download and install Tesseract OCR from https://github.com/tesseract-ocr/tesseract.
2. Add Tesseract to your system's PATH.
3. Verify the installation by running tesseract --version in the command prompt.

For macOS
```
brew install tesseract
```
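
Whichever platform you are on, a short Python check can confirm that the Tesseract binary is reachable before you run the extractor. This is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are hypothetical and not part of this package.

```python
import shutil
import subprocess


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract binary, or None if it is not on PATH."""
    return shutil.which(binary)


def tesseract_version(path):
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract not found on PATH; install it using the steps above.")
    else:
        print(f"Found {path}: {tesseract_version(path)}")
```

If the check reports that Tesseract is missing right after installation on Windows, the PATH change may not have reached the current terminal yet; opening a new one usually resolves that.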

Dependencies

The following Python packages are required to run this project. You can install them using the provided requirements.txt file:

 uvicorn==0.25.0
 fastapi==0.109.0
 python-multipart==0.0.6
 pytesseract==0.3.10
 Pillow==10.2.0
 PyMuPDF==1.23.12
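
After installing, it can be useful to confirm that the environment matches these pins. The sketch below compares installed versions against the list above using `importlib.metadata` from the standard library (Python 3.8+). The `PINNED` dictionary and the `check_pins` helper are illustrative names, not part of this package.

```python
from importlib import metadata

# Versions copied from the dependency list above.
PINNED = {
    "uvicorn": "0.25.0",
    "fastapi": "0.109.0",
    "python-multipart": "0.0.6",
    "pytesseract": "0.3.10",
    "Pillow": "10.2.0",
    "PyMuPDF": "1.23.12",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    if not problems:
        print("All pinned dependencies match.")
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

The `get_version` parameter exists only so the lookup can be replaced in tests; in normal use the default is enough.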

# v1.0.0
1. initial phase
