Indian Address Parser

📍 Overview

The Indian Address Parser is a Natural Language Processing (NLP) tool that extracts structured address information from unstructured text and PDF documents. It combines spaCy, regex-based pattern matching, and custom entity recognition to identify and extract addresses.

🚀 Features

📄 Extracts addresses from PDF files and raw text
🔍 Uses NLP & Named Entity Recognition (NER) for accurate parsing
🗺️ Identifies cities, states, PIN codes, and localities
⚡ Optimized for large-scale documents
📥 Download extracted addresses in a structured format
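
As a rough illustration of the regex side of the features above, the sketch below pulls PIN codes out of free text. It is not taken from this project's code: the pattern and the `find_pin_codes` helper are assumptions made for this example. It relies only on the fact that Indian PIN codes are six digits and do not start with 0.

```python
import re

# Six digits, first digit 1-9; an optional space after the third digit
# (e.g. "110 001") is tolerated. Lookarounds reject longer digit runs
# such as phone numbers.
PIN_RE = re.compile(r"(?<!\d)([1-9]\d{2})\s?(\d{3})(?!\d)")

def find_pin_codes(text: str) -> list[str]:
    """Return every PIN-code-like token in text, normalized to six digits."""
    return [head + tail for head, tail in PIN_RE.findall(text)]

sample = (
    "Flat 4B, MG Road, Bengaluru, Karnataka 560001. "
    "Office: Connaught Place, New Delhi 110 001"
)
print(find_pin_codes(sample))  # ['560001', '110001']
```

A pattern like this can still match unrelated six-digit numbers, which is one reason to pair it with NER and locality lookups rather than use it alone.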

🔧 Installation

To use this project locally, follow these steps:

Clone this repository:

```
git clone https://github.com/Adityagupta-dev/Indian-Address-Parser.git
cd Indian-Address-Parser
```

Create a virtual environment (optional but recommended):

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```
Run the Streamlit app:
```
streamlit run streamlit_app.py
```

📂 Usage

1️⃣ Upload a PDF File

Click Upload a PDF to extract addresses automatically.
The extracted addresses will be displayed along with confidence scores and structured components.

2️⃣ Enter Text Manually

Paste text containing addresses in the text box.
The extracted addresses will be displayed along with confidence scores and structured components.
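
To make "structured components" and "confidence scores" more concrete, here is a minimal sketch of one way such a result could be represented. The `ParsedAddress` class, its field names, the comma-splitting `parse_simple` function, and the scoring rule are all hypothetical and chosen for illustration; the app's actual data model and scoring may differ.

```python
import re
from dataclasses import dataclass
from typing import Optional

PIN = re.compile(r"(?<!\d)[1-9]\d{5}(?!\d)")

@dataclass
class ParsedAddress:
    """Hypothetical container for one extracted address."""
    raw: str
    locality: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    pin_code: Optional[str] = None

    def confidence(self) -> float:
        """Toy score: fraction of the four components that were found."""
        parts = [self.locality, self.city, self.state, self.pin_code]
        return sum(p is not None for p in parts) / len(parts)

def parse_simple(raw: str) -> ParsedAddress:
    """Naive parse of '..., locality, city, state PIN' (illustration only)."""
    parsed = ParsedAddress(raw=raw)
    text = raw
    match = PIN.search(text)
    if match:
        parsed.pin_code = match.group()
        text = text[:match.start()] + text[match.end():]
    fields = [f.strip(" .-") for f in text.split(",")]
    fields = [f for f in fields if f]
    # Assign from the right: state, then city, then locality.
    if fields:
        parsed.state = fields.pop()
    if fields:
        parsed.city = fields.pop()
    if fields:
        parsed.locality = fields.pop()
    return parsed

result = parse_simple("MG Road, Bengaluru, Karnataka 560001")
print(result, result.confidence())
```

A real parser would validate each component against known cities and states instead of trusting position, so this positional assignment should be read only as a picture of the output shape.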

3️⃣ Download Extracted Addresses

The extracted addresses can be downloaded as a structured text file.
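
One plausible shape for such a file is one JSON object per line. The sketch below is an assumption about the format, not a description of what the app writes; `to_download_text` and the dictionary keys are hypothetical names.

```python
import json

def to_download_text(addresses: list[dict]) -> str:
    """Serialize parsed addresses as one JSON object per line."""
    return "\n".join(
        json.dumps(address, ensure_ascii=False, sort_keys=True)
        for address in addresses
    )

# Hypothetical parsed results used only to show the output shape.
addresses = [
    {"city": "Bengaluru", "state": "Karnataka", "pin_code": "560001"},
    {"city": "New Delhi", "state": "Delhi", "pin_code": "110001"},
]
payload = to_download_text(addresses)
print(payload)
```

In a Streamlit app, a string like `payload` could be handed to `st.download_button` as its `data` argument; whether this project does so is not stated here.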

🏗️ Work in Progress

🚧 Version 2 is coming soon! 🚧

Improved address extraction accuracy
Support for additional document formats
More robust NLP models
Customization options for user-specific needs

🤝 Contributing

Contributions are welcome! If you find any issues or have suggestions, feel free to open an issue or submit a pull request.

📞 Contact

For any queries, feel free to connect with me on LinkedIn.

📜 License

This project is licensed under the MIT License. You are free to use, modify, and distribute it, but attribution is required. See the LICENSE file for more details.

⭐ If you find this project useful, don't forget to star the repo! ⭐
