CopyChecker-AI

CopyChecker-AI automates grading by comparing an original answer key with a student's handwritten answer script. It combines OCR, NLP preprocessing, and text-similarity metrics to score each script. The application has a web frontend and is deployed on Azure.

Workflow

Step-by-Step Process

Input:
- Users upload two files:
  - The original answer key in PDF format.
  - The student's handwritten notes (image or scanned PDF).
Text Extraction:
- From PDFs (Typed Text):
  - Text is extracted using the PyPDF2 library.
  - Typed PDFs usually yield cleaner, more structured text than OCR output, which makes the answer key a reliable reference.
- From Handwritten Notes:
  - Handwritten text is extracted using the Gemini OCR API.
  - The OCR API processes the image or scanned notes and converts them into machine-readable text.
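
The typed-PDF branch of the extraction step could look roughly like the sketch below. It assumes a recent PyPDF2 release that provides `PdfReader`; the function names `extract_pdf_text` and `normalize_whitespace` are hypothetical and are not taken from this repository. The handwritten branch is omitted because it depends on the OCR API's own client.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def extract_pdf_text(pdf_path: str) -> str:
    """Return the text of every page in a typed PDF as one cleaned string."""
    # Imported lazily so normalize_whitespace works without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() may return an empty value for image-only pages.
    pages = [page.extract_text() or "" for page in reader.pages]
    return normalize_whitespace(" ".join(pages))
```

Normalizing whitespace before comparison keeps line breaks introduced by the PDF layout from affecting token-based metrics.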
Text Comparison:
- Naive Similarity:
  - Basic word overlap and matching techniques are applied.
- Context-Based Similarity:
  - Word2Vec embeddings, trained or loaded through Gensim, are used to measure the semantic similarity between the extracted texts.
- Evaluation Metrics:
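  - BLEU (Bilingual Evaluation Understudy): Precision-oriented; measures how many of the student's n-grams also appear in the answer key.
  - ROUGE-N: Recall-oriented; measures how many of the answer key's n-grams appear in the student's answer.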
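
To make the comparison step concrete, here is a minimal pure-Python sketch of word overlap, clipped n-gram precision (the core component of BLEU, without the brevity penalty or multi-order averaging of the full metric), and ROUGE-N recall. The function names and whitespace tokenization are assumptions for illustration; the project itself may rely on library implementations instead.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count the n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def word_overlap(reference: str, candidate: str) -> float:
    """Naive similarity: Jaccard overlap of the two vocabularies."""
    ref, cand = set(reference.lower().split()), set(candidate.lower().split())
    union = ref | cand
    return len(ref & cand) / len(union) if union else 0.0


def ngram_precision(reference: str, candidate: str, n: int = 1) -> float:
    """BLEU-style clipped precision: matched n-grams / candidate n-grams."""
    ref, cand = ngram_counts(reference, n), ngram_counts(candidate, n)
    total = sum(cand.values())
    # Counter intersection keeps the minimum count, which clips repeats.
    return sum((ref & cand).values()) / total if total else 0.0


def rouge_n_recall(reference: str, candidate: str, n: int = 1) -> float:
    """ROUGE-N recall: matched n-grams / reference n-grams."""
    ref, cand = ngram_counts(reference, n), ngram_counts(candidate, n)
    total = sum(ref.values())
    return sum((ref & cand).values()) / total if total else 0.0


answer_key = "the cat sat on the mat"
student_answer = "the cat sat"
print(word_overlap(answer_key, student_answer))
print(ngram_precision(answer_key, student_answer))
print(rouge_n_recall(answer_key, student_answer, n=2))
```

A short but correct student answer scores high on precision and low on recall, which is why the two metrics are reported together.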
Grading System:
- A grading algorithm assigns scores based on threshold values of BLEU, ROUGE-N, and other metrics.
- These thresholds can be adjusted for different grading criteria.
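
One way such a threshold-based grader might be written is sketched below. The grade bands, the `assign_grade` name, and the choice to combine BLEU and ROUGE-N with a harmonic mean are all assumptions made for illustration; the README does not specify how the project combines its metrics.

```python
# Hypothetical grade bands: (minimum combined score, grade), highest first.
DEFAULT_THRESHOLDS = [(0.75, "A"), (0.50, "B"), (0.25, "C")]


def assign_grade(bleu, rouge_n, thresholds=DEFAULT_THRESHOLDS, fail_grade="F"):
    """Map a BLEU score and a ROUGE-N score (both in [0, 1]) to a grade."""
    # Assumption: combine precision-like and recall-like scores as an F1.
    if bleu + rouge_n == 0:
        combined = 0.0
    else:
        combined = 2 * bleu * rouge_n / (bleu + rouge_n)
    for minimum, grade in thresholds:
        if combined >= minimum:
            return grade
    return fail_grade


print(assign_grade(0.9, 0.9))
print(assign_grade(1.0, 0.5))
```

Passing a different `thresholds` list is enough to apply stricter or more lenient grading criteria without changing the function.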
Frontend:
- A user-friendly interface allows:
  - File uploads.
  - Viewing similarity scores and grades.
- Built with Gradio; the app can also be hosted on Hugging Face Spaces.
Deployment:
- The entire application is hosted on Azure for scalability and reliability.
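
The frontend step above could be wired up with Gradio along the following lines. This is only a sketch: `handle_upload` is a hypothetical stub that reports the uploaded file names, and a real callback would run extraction, comparison, and grading instead. Depending on the Gradio version, `gr.File` hands the callback either a path string or a temp-file object, so the helper accepts both.

```python
import os


def uploaded_path(uploaded):
    """Return a file path from either a path string or a temp-file object."""
    return getattr(uploaded, "name", uploaded)


def handle_upload(answer_key, student_script):
    """Callback stub: the grading pipeline would be called from here."""
    key_name = os.path.basename(uploaded_path(answer_key))
    script_name = os.path.basename(uploaded_path(student_script))
    return f"Received answer key '{key_name}' and student script '{script_name}'."


def build_interface():
    """Create the upload form; requires `pip install gradio`."""
    import gradio as gr  # imported lazily so the stub can be tested alone

    return gr.Interface(
        fn=handle_upload,
        inputs=[
            gr.File(label="Answer key (PDF)"),
            gr.File(label="Student script (image or scanned PDF)"),
        ],
        outputs=gr.Textbox(label="Result"),
        title="CopyChecker-AI",
    )
```

Calling `build_interface().launch()` starts a local server and prints the URL to open in a browser.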

Tech Stack

Libraries and Tools

NLP Operations:
- NLTK: For basic text processing and tokenization.
- spaCy: For advanced NLP tasks like named entity recognition and dependency parsing.
Word Similarity Mapping by Context:
- Gensim: For topic modeling and semantic similarity.
- Word2Vec: For word embeddings and context-based similarity.
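
As an illustration of context-based similarity, the sketch below averages word vectors for each text and compares the averages with cosine similarity. The two-dimensional `TOY_VECTORS` table is made up for the example. With Gensim, the `wv` attribute of a trained `Word2Vec` model supports the same `in` checks and key lookups, so it should be usable in place of the toy dictionary.

```python
import math

# Made-up 2-D vectors for illustration; real Word2Vec vectors are much larger.
TOY_VECTORS = {
    "plant": (1.0, 0.0),
    "leaf": (0.8, 0.6),
    "engine": (0.0, 1.0),
}


def average_vector(tokens, vectors):
    """Average the vectors of known tokens; return None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(v[i] for v in known) / len(known) for i in range(len(known[0]))]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_similarity(tokens_a, tokens_b, vectors):
    """Similarity of two token lists; 0.0 when either has no known words."""
    vec_a = average_vector(tokens_a, vectors)
    vec_b = average_vector(tokens_b, vectors)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine_similarity(vec_a, vec_b)


print(semantic_similarity(["plant"], ["leaf"], TOY_VECTORS))
print(semantic_similarity(["plant"], ["engine"], TOY_VECTORS))
```

Unlike word overlap, this approach can give partial credit when a student uses a related word that the answer key does not contain.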
Text Extraction:
- Typed Text from PDFs: PyPDF2: For extracting text from PDF files.
- Handwritten Notes: Gemini OCR API: For converting handwritten content into text. (Paid API; ensure you have access.)
Frontend:
- Gradio: For building interactive user interfaces.
- Hugging Face Spaces: Alternative for hosting simple apps.
Backend:
- Flask: Lightweight web framework for handling backend operations.
- FastAPI: For building APIs quickly and efficiently.
Deployment:
- Azure: For hosting and scaling the application.

Installation and Setup

Prerequisites

Install Python 3.8+.
Create an Azure account.
Obtain a subscription for the Gemini OCR API (if needed).

Steps

Clone the Repository:

```
git clone https://github.com/yourusername/ai-copychecking.git
cd ai-copychecking
```

Set Up a Virtual Environment:

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install Dependencies:
```
pip install -r requirements.txt
```
Configure API Keys:
- Create a config.py file.
- Add your API keys:
```
GEMINI_OCR_API_KEY = "your_api_key_here"
```
Run the Application:
```
python app.py
```
Access the Application:
- Open your browser and navigate to http://localhost:5000.

Features

Automated Text Extraction:
- Extracts text from PDFs and handwritten notes seamlessly.
Advanced Comparison:
- Uses both naive and context-based similarity techniques.
Customizable Grading System:
- Adjust thresholds for BLEU, ROUGE-N, and other metrics.
Interactive Frontend:
- Simple interface for uploading files and viewing results.
Cloud Deployment:
- Hosted on Azure for high availability and scalability.

Resources

BLEU Metric: BLEU Explained
ROUGE Metric: ROUGE Explained
Gradio: Documentation
Azure Deployment: Getting Started
Gemini OCR API: API Details
FastAPI: Documentation

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository.

Create a new branch:

```
git checkout -b feature/your-feature-name
```

Commit your changes:
```
git commit -m "Add your message here"
```

Push to the branch:

```
git push origin feature/your-feature-name
```

Open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
