extremecoder-rgb/RAG-based-Patent-Innovation-Researcher

RAG-based-Patent-Innovation-Predictor

Overview

Patent-Innovation-Predictor is a Python-based project designed to assist with patent analysis and research. It provides tools for collecting, analyzing, and managing patent-related data, making it easier for researchers and professionals to work with large datasets.

Features

  • Patent Search Tools: Utilities for searching and retrieving patent data.
  • Data Ingestion: Scripts to process and ingest patent data into the system.
  • Embedding Generation: Tools for creating embeddings for patent data.
  • Information Collection: Collect and organize relevant information for analysis.
  • Agentic RAG: Implements retrieval-augmented generation for advanced data processing.
  • Docker Support: Includes a docker-compose.yml file for containerized deployment.
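The embedding and search features rest on comparing vectors. The following is a minimal sketch of that idea, assuming toy 3-dimensional vectors in place of real model embeddings. The names `cosine_similarity` and `rank_by_similarity` and the patent IDs are hypothetical and are not taken from `embedding.py` or `patent_search_tools.py`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document IDs sorted from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]


# Toy embeddings (hypothetical data, for illustration only)
toy_embeddings = {
    "patent-A": [1.0, 0.0, 0.0],
    "patent-B": [0.0, 1.0, 0.0],
    "patent-C": [0.7, 0.7, 0.0],
}
ranking = rank_by_similarity([1.0, 0.1, 0.0], toy_embeddings)
print(ranking)
```

In a real deployment the vectors would come from an embedding model and the nearest-neighbour lookup would typically be delegated to the vector store rather than computed in Python.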

System Architecture

┌───────────────────────────────────────────────────────────────┐
│                     User Interface Layer                      │
└───────────────────────────────────────────────────────────────┘
                │                │                │
                ▼                ▼                ▼
┌───────────────────────────────────────────────────────────────┐
│                 Agent Orchestration Layer                     │
│  ┌──────────────────┐   ┌────────────────┐  ┌───────────────┐│
│  │ Research Director│   │Patent Retriever│  │Data Analyst   ││
│  └──────────────────┘   └────────────────┘  └───────────────┘│
│                                                               │
│  ┌──────────────────┐                                         │
│  │Innovation        │                                         │
│  │Forecaster        │                                         │
│  └──────────────────┘                                         │
└───────────────────────────────────────────────────────────────┘
                │                │                │
                ▼                ▼                ▼
┌───────────────────────────────────────────────────────────────┐
│                Knowledge Processing Layer                     │
│  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐ │
│  │ Semantic      │    │ Hybrid        │    │ Iterative     │ │
│  │ Search        │    │ Search        │    │ Search        │ │
│  └───────────────┘    └───────────────┘    └───────────────┘ │
└───────────────────────────────────────────────────────────────┘
                │                │                │
                ▼                ▼                ▼
┌───────────────────────────────────────────────────────────────┐
│                     Data Storage Layer                        │
│  ┌───────────────────────────────────────────────────────────┐│
│  │                       OpenSearch                          ││
│  └───────────────────────────────────────────────────────────┘│
└───────────────────────────────────────────────────────────────┘
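The diagram names a Hybrid Search component, but this README does not say how keyword and vector results are merged. One common option is reciprocal rank fusion (RRF). The sketch below assumes that technique and uses hypothetical ranked lists; it is not taken from the project's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in.
    k=60 is a conventional default; ties are broken by document ID.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword query and a vector query
keyword_hits = ["patent-A", "patent-B", "patent-C"]
semantic_hits = ["patent-C", "patent-A", "patent-D"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one method are kept but ranked lower.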

Project Structure

RAG-based-Patent-Innovation-Predictor/
├── .env                        # Environment variables
├── agent.ipynb                 # Jupyter Notebook for interactive exploration
├── agentic_rag.py              # Retrieval-augmented generation implementation
├── client.py                   # Client-side utilities
├── crew.py                     # Crew management utilities
├── docker-compose.yml          # Docker Compose configuration
├── embedding.py                # Embedding generation tools
├── files/                      # Directory for input files
│   ├── patent_details.json     # Detailed patent data
│   └── patents.json            # Raw patent data
├── helper.py                   # Helper functions
├── information_collector.py    # Information collection utilities
├── ingestion.py                # Data ingestion scripts
├── patent_analysis_*.txt       # Analysis results
├── patent_search_tools.py      # Patent search utilities
├── requirements.txt            # Python dependencies
└── results/                    # Directory for output results
    ├── citation_*.json         # Citation data
    └── patent_data_*.json      # Processed patent data
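The wildcard names in `results/` suggest several output files per run. A minimal sketch for loading them, assuming each file is valid JSON; the helper name `load_result_files` is hypothetical, and the schema of the files is not documented here, so the contents are returned as-is.

```python
import json
from pathlib import Path


def load_result_files(results_dir, pattern):
    """Load every JSON file in results_dir that matches pattern.

    Returns a dict mapping file name to parsed content.
    """
    loaded = {}
    for path in sorted(Path(results_dir).glob(pattern)):
        with path.open(encoding="utf-8") as handle:
            loaded[path.name] = json.load(handle)
    return loaded


if __name__ == "__main__":
    citations = load_result_files("results", "citation_*.json")
    print(f"Loaded {len(citations)} citation file(s)")
```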

Installation

  1. Clone the repository:

    git clone https://github.com/extremecoder-rgb/RAG-based-Patent-Innovation-Predictor
    cd RAG-based-Patent-Innovation-Predictor

  2. Create a virtual environment and activate it:

    python -m venv venv
    # On Windows
    .\venv\Scripts\activate
    # On macOS/Linux
    source venv/bin/activate

  3. Install dependencies:

    pip install -r requirements.txt

  4. Set up environment variables:

    • Create a .env file in the repository root.
    • Add the environment variables the scripts need as KEY=value pairs, one per line.
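The KEY=value format can be read with a few lines of standard-library Python. This is a minimal sketch, not the project's own loader (which may rely on a dedicated package instead); the function names and the variable names `EXAMPLE_API_KEY` and `EXAMPLE_HOST` are hypothetical, since this README does not list the variables the scripts expect.

```python
import os


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read a .env file and export its values without overriding existing ones."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


# Hypothetical variable names, for illustration only
example = """
# Service credentials
EXAMPLE_API_KEY=abc123
EXAMPLE_HOST="localhost"
"""
parsed = parse_env_text(example)
print(parsed)
```

Because `load_env_file` uses `setdefault`, variables already set in the shell take precedence over the file.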

Usage

  • Interactive Exploration: Open the agent.ipynb notebook in Jupyter to explore the functionalities interactively.

  • Command Line Execution: Run the Python scripts directly from the command line. For example:

    python ingestion.py

  • Docker Deployment: Use Docker Compose to deploy the project:

    docker-compose up

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Commit your changes and push them to your fork.
  4. Submit a pull request with a detailed description of your changes.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Contact

For any inquiries or support, please contact Subhranil Mondal at [email protected].
