Quipo-Oscar Donut Transformer

This project demonstrates how to process PDF files by splitting them into pages, extracting data using a transformer model (Donut Transformer), and saving the extracted data as JSON files in the same directory as the original PDFs.

Features

Split PDF files into individual pages (PNG images).
Extract data from each page using a transformer model (e.g., Donut Transformer).
Save extracted data in JSON format alongside the original PDF files.
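The extraction step listed above could be sketched with the Hugging Face `transformers` implementation of Donut. This is only an illustration under stated assumptions: the function names `extract_page` and `clean_donut_sequence`, the public checkpoint `naver-clova-ix/donut-base-finetuned-cord-v2`, and its task prompt `<s_cord-v2>` are not taken from this repository, whose `Modelclass` code may load a different model or prompt.

```python
import re


def clean_donut_sequence(sequence: str, eos_token: str, pad_token: str) -> str:
    """Strip end/pad tokens and the leading task-prompt tag from decoder output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # The first <...> tag is the task start token; remove only that one.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def extract_page(
    image_path: str,
    checkpoint: str = "naver-clova-ix/donut-base-finetuned-cord-v2",  # assumed checkpoint
    task_prompt: str = "<s_cord-v2>",  # assumed prompt matching that checkpoint
) -> dict:
    """Run Donut on one page image and return the parsed fields as a dict."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    # Loading on every call keeps the sketch short; a real script would load once.
    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = clean_donut_sequence(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts Donut's tag-style output into a nested dict.
    return processor.token2json(sequence)
```

Calling `extract_page("page_1.png")` would need `torch`, `Pillow`, and `transformers` installed and downloads the checkpoint on first use. Which fields come back depends on the checkpoint the model was fine-tuned on.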

Installation

Clone the repository:

git clone https://github.com/Adarsh-aot/quipo-oscar.git

cd quipo-oscar

Install the required Python packages:

pip install -r requirements.txt

Run the script:

python main.py
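After a run, each PDF is expected to have a JSON file next to it. The flow could look roughly like the sketch below, which is not the repository's actual `main.py`. The names `process_pdf` and `process_directory`, the `<pdf name>.json` output naming, and the JSON layout are assumptions made for illustration; the page-splitting and extraction steps are passed in as callables so the sketch does not depend on any particular PDF or model library.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable, List

# Hypothetical signatures: a splitter turns a PDF into page images,
# an extractor turns one page image into a JSON-serializable value.
Splitter = Callable[[Path], Iterable[Path]]
Extractor = Callable[[Path], Any]


def process_pdf(pdf_path: Path, split_pages: Splitter, extract_page: Extractor) -> Path:
    """Extract every page of one PDF and write the result beside the PDF."""
    pages = [extract_page(image) for image in split_pages(pdf_path)]
    # Assumed naming: report.pdf -> report.json in the same directory.
    out_path = pdf_path.with_suffix(".json")
    payload = {"source": pdf_path.name, "pages": pages}
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out_path


def process_directory(root: Path, split_pages: Splitter, extract_page: Extractor) -> List[Path]:
    """Process every *.pdf under root (recursively) and return the JSON paths."""
    return [
        process_pdf(pdf, split_pages, extract_page)
        for pdf in sorted(root.rglob("*.pdf"))
    ]
```

Keeping the splitter and extractor as parameters makes it possible to exercise the file handling with stand-in functions before plugging in the real page renderer and model.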

License

This project is licensed under the MIT License; see the LICENSE file for details.

Project Overview: The README provides an overview of the project, explaining its purpose and key features related to PDF processing and data extraction using a transformer model.
Installation: Instructions for cloning the repository and installing the required Python packages using pip.
Usage: Detailed instructions on how to run the main.py script to process PDF files and convert them into JSON format. It includes information about input/output directories and running the script.
Configuration: Guidance on customizing configuration settings (e.g., model parameters, output directories).
Requirements: List of software requirements and dependencies needed to run the project, including Python version and Donut Transformer model.
Directory Structure: Description of the directory structure used in the project, including directories for input PDF files, output PNG images, output JSON files, and main script files.
Contributing: Encouragement for contributions from the community, with instructions on how to submit issues or pull requests on GitHub.
License: Information about the project's license (in this case, the MIT License) for users and contributors.
