Patent-Innovation-Predictor is a Python-based project designed to assist with patent analysis and research. It provides tools for collecting, analyzing, and managing patent-related data, making it easier for researchers and professionals to work with large datasets.
- Patent Search Tools: Utilities for searching and retrieving patent data.
- Data Ingestion: Scripts to process and ingest patent data into the system.
- Embedding Generation: Tools for creating embeddings for patent data.
- Information Collection: Collect and organize relevant information for analysis.
- Agentic RAG: Implements retrieval-augmented generation for advanced data processing.
- Docker Support: Includes a docker-compose.yml file for containerized deployment.
┌───────────────────────────────────────────────────────────────┐
│                     User Interface Layer                      │
└───────────────────────────────────────────────────────────────┘
            │                   │                   │
            ▼                   ▼                   ▼
┌───────────────────────────────────────────────────────────────┐
│                   Agent Orchestration Layer                   │
│  ┌──────────────────┐  ┌────────────────┐  ┌───────────────┐  │
│  │ Research Director│  │Patent Retriever│  │ Data Analyst  │  │
│  └──────────────────┘  └────────────────┘  └───────────────┘  │
│                                                               │
│  ┌──────────────────┐                                         │
│  │ Innovation       │                                         │
│  │ Forecaster       │                                         │
│  └──────────────────┘                                         │
└───────────────────────────────────────────────────────────────┘
            │                   │                   │
            ▼                   ▼                   ▼
┌───────────────────────────────────────────────────────────────┐
│                  Knowledge Processing Layer                   │
│   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐   │
│   │   Semantic    │   │    Hybrid     │   │   Iterative   │   │
│   │    Search     │   │    Search     │   │    Search     │   │
│   └───────────────┘   └───────────────┘   └───────────────┘   │
└───────────────────────────────────────────────────────────────┘
            │                   │                   │
            ▼                   ▼                   ▼
┌───────────────────────────────────────────────────────────────┐
│                      Data Storage Layer                       │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │                        OpenSearch                         │ │
│ └───────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────┘
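The diagram names a Hybrid Search component in the Knowledge Processing Layer but does not say how it merges results. As a minimal sketch, assuming it combines a keyword ranking with a semantic ranking, reciprocal rank fusion is one common way to do that. The function name, the constant `k=60`, and the patent IDs below are illustrative assumptions, not taken from the project's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Higher fused scores rank first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical patent IDs, for illustration only.
keyword_hits = ["US-001", "US-002", "US-003"]
semantic_hits = ["US-002", "US-004", "US-001"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

With these inputs, `fused` is `["US-002", "US-001", "US-004", "US-003"]`: documents found by both searches rise above those found by only one.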
RAG-based-Patent-Innovation-Predictor/
├── .env # Environment variables
├── agent.ipynb # Jupyter Notebook for interactive exploration
├── agentic_rag.py # Retrieval-augmented generation implementation
├── client.py # Client-side utilities
├── crew.py # Crew management utilities
├── docker-compose.yml # Docker Compose configuration
├── embedding.py # Embedding generation tools
├── files/ # Directory for input files
│ ├── patent_details.json # Detailed patent data
│ └── patents.json # Raw patent data
├── helper.py # Helper functions
├── information_collector.py # Information collection utilities
├── ingestion.py # Data ingestion scripts
├── patent_analysis_*.txt # Analysis results
├── patent_search_tools.py # Patent search utilities
├── requirements.txt # Python dependencies
└── results/ # Directory for output results
    ├── citation_*.json # Citation data
    └── patent_data_*.json # Processed patent data
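The tree shows two filename patterns under `results/`. As a small sketch of how those outputs could be located from Python, the helper below groups them by pattern. The function name `find_result_files` is hypothetical, and the contents of the JSON files are not described in this README, so the sketch only lists paths.

```python
from pathlib import Path


def find_result_files(results_dir="results"):
    """Group files in the results directory by the two patterns in the tree."""
    root = Path(results_dir)
    return {
        # Matches citation_*.json (citation data).
        "citations": sorted(root.glob("citation_*.json")),
        # Matches patent_data_*.json (processed patent data).
        "patent_data": sorted(root.glob("patent_data_*.json")),
    }
```

Calling `find_result_files()` from the repository root returns a dictionary of sorted `Path` lists, one per pattern.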
- Clone the repository:

      git clone https://github.com/extremecoder-rgb/RAG-based-Patent-Innovation-Predictor
      cd RAG-based-Patent-Innovation-Predictor

- Create a virtual environment and activate it:

      python -m venv venv
      # On Windows
      .\venv\Scripts\activate
      # On macOS/Linux
      source venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Set up environment variables:
  - Create a .env file in the root directory.
  - Add necessary environment variables as key-value pairs.
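This README does not list which variables the scripts expect. As a minimal sketch of the key-value format, the parser below reads a `.env` file using only the standard library. The function name `load_env_file` and the variable names `OPENSEARCH_HOST` and `OPENSEARCH_PORT` are hypothetical examples, and the project itself may load the file differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Variables already set in the environment are left untouched.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes.
        key, value = key.strip(), value.strip().strip("'\"")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

A file containing, for example, `OPENSEARCH_HOST=localhost` and `OPENSEARCH_PORT=9200` on separate lines would yield a dictionary with those two entries.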
- Interactive Exploration: Open the agent.ipynb notebook in Jupyter to explore the functionality interactively.
- Command Line Execution: Run the Python scripts directly from the command line. For example:

      python ingestion.py

- Docker Deployment: Use Docker Compose to deploy the project:

      docker-compose up

Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Commit your changes and push them to your fork.
- Submit a pull request with a detailed description of your changes.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
For any inquiries or support, please contact Subhranil Mondal at [email protected].