SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches [NeurIPS 2025 🔥]
- Sep-18-25: SketchMind accepted at NeurIPS 2025! 🔥🥳
- Jul-18-25: Released complete multi-agent framework with both GPT-4 and LLaMA-4 implementations 🔥
- May-15-25: Published comprehensive evaluation on 3,500+ scientific sketches across 6 NGSS-aligned assessment items 🔥
SketchMind introduces a cognitively grounded, multi-agent framework for assessing student-drawn scientific sketches using semantic structures called Sketch Reasoning Graphs (SRGs). Each SRG is annotated with Bloom's Taxonomy levels and constructed by multimodal agents that collaboratively parse rubrics, analyze student sketches, evaluate conceptual understanding, and generate pedagogical feedback.
- Novel Multi-Agent Framework: First cognitively-grounded multi-agent system for scientific sketch assessment
- Sketch Reasoning Graphs (SRGs): New semantic representation combining visual elements with Bloom's taxonomy levels
- Comprehensive Evaluation: Extensive validation on 3,500+ student sketches across 6 NGSS-aligned assessment items
- Dual Model Implementation: Complete pipelines for both proprietary (GPT-4) and open-source (LLaMA-4) models
- Interactive Visualization: Web-based tools for exploring and understanding SRG structures
- Inputs: Rubric, question text, gold-standard sketches
- Outputs: Gold-Standard Reference SRG with Bloom's taxonomy levels and reverse mapping
- Function: Establishes cognitive benchmarks from the expert-designed rubric and 3 gold-standard reference scientific sketches for each assessment task
- Inputs: Student sketch image, reference SRG
- Outputs: Student SRG constructed from visible sketch content
- Function: Extracts semantic elements and relationships from hand-drawn sketches
- Inputs: Reference SRG, student SRG
- Outputs: Cognitive alignment score, proficiency classification, concept gaps
- Function: Compares graphs using edit distance and Bloom-level analysis
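The comparison step can be sketched as a minimal graph-alignment score. The SRG structure below (node label → Bloom level, plus edge tuples), the overlap metric, and the Bloom-level weighting are hypothetical stand-ins for the paper's actual edit-distance computation, not the repository's API:

```python
# Minimal sketch of comparing a student SRG against a reference SRG.
# The SRG representation and the overlap-based score are illustrative
# assumptions, not SketchMind's exact edit-distance method.

def alignment_score(reference, student):
    """Return a 0..1 cognitive alignment score and the missing concepts."""
    ref_nodes, ref_edges = reference
    stu_nodes, stu_edges = student

    matched_nodes = set(ref_nodes) & set(stu_nodes)
    matched_edges = set(ref_edges) & set(stu_edges)
    concept_gaps = sorted(set(ref_nodes) - matched_nodes)

    # Weight each matched node by its Bloom level, so missing a
    # higher-order concept costs more than missing a basic one.
    ref_weight = sum(ref_nodes.values())
    node_score = sum(ref_nodes[n] for n in matched_nodes) / ref_weight
    edge_score = len(matched_edges) / max(len(ref_edges), 1)
    return 0.5 * node_score + 0.5 * edge_score, concept_gaps

reference = (
    {"particle": 1, "motion": 2, "energy_transfer": 4},  # node -> Bloom level
    [("particle", "motion"), ("motion", "energy_transfer")],
)
student = (
    {"particle": 1, "motion": 2},
    [("particle", "motion")],
)
score, gaps = alignment_score(reference, student)
print(round(score, 3), gaps)  # -> 0.464 ['energy_transfer']
```

The returned `concept_gaps` list is what a feedback stage could turn into targeted revision hints.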
- Inputs: Evaluation results, original sketch, reference standards
- Outputs: Next-step sketch revision plan with Bloom-guided visual hints
- Function: Generates pedagogical feedback with visual overlays
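The four agents above form a linear dataflow: rubric parsing → sketch perception → evaluation → feedback. The sketch below illustrates that flow only; every function name, signature, and stubbed body is hypothetical and does not reflect the actual classes in `GPT_SRG_Agents.py` or `Llama4_SRG_Agents.py`:

```python
# Hypothetical end-to-end dataflow for the four agents; names and
# signatures are illustrative, not the repository's actual API.

def parse_rubric(rubric, question, gold_sketches):
    """Agent 1: build a reference SRG with Bloom levels (stubbed)."""
    return {"nodes": {"particle": 1, "motion": 2}, "source": rubric}

def perceive_sketch(image_path, reference_srg):
    """Agent 2: extract a student SRG from visible sketch content (stubbed)."""
    return {"nodes": {"particle": 1}, "image": image_path}

def evaluate(reference_srg, student_srg):
    """Agent 3: compare graphs; stubbed here to simple node overlap."""
    gaps = set(reference_srg["nodes"]) - set(student_srg["nodes"])
    return {"gaps": sorted(gaps)}

def give_feedback(evaluation, image_path):
    """Agent 4: next-step revision plan with Bloom-guided hints (stubbed)."""
    return [f"Add the missing concept: {g}" for g in evaluation["gaps"]]

ref = parse_rubric("rubric.txt", "question.txt", ["g1.png", "g2.png", "g3.png"])
stu = perceive_sketch("student1_sketch.jpg", ref)
hints = give_feedback(evaluate(ref, stu), "student1_sketch.jpg")
print(hints)  # -> ['Add the missing concept: motion']
```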
SketchMind/
├── config/ # Configuration files
│   ├── task_config.yaml # Task-specific configuration
├── data/ # Task-specific data
│ ├── README.md
│ └── {task_id}/ # Per-task directories
│ ├── student_images/ # Student submissions
│ ├── golden_standard_images/ # 3 reference sketches
│ ├── question.txt # Task question (optional)
│ └── rubric.txt # Evaluation rubric (optional)
├── outputs/
│ └── {task_id}/
│ ├── logs/
│ ├── cache/ # SRG cache files
│ └── results/ # Visual hints and textual feedback
├── scripts/ # Core implementation
│ ├── config_loader.py # Configuration management
│ ├── GPT_SRG_Agents.py # GPT-based agents
│ ├── GPT_SRG_MAS.py # GPT pipeline
│ ├── Llama4_SRG_Agents.py # Llama4-based agents
│ ├── Llama4_SRG_MAS.py # Llama4 pipeline
│ └── requirements.txt
├── .env.example # API key template
├── .gitignore
├── run.py # Unified entry point
└── README.md # This file
- Python 3.8+
- pip package manager
- OpenAI API key (for GPT models)
- OpenRouter API key (for Llama4 models - free tier available)
We recommend setting up a conda environment for the project:
# Create and activate environment
conda create --name sketchmind python=3.9
conda activate sketchmind
# Clone repository
git clone https://github.com/ehsanlatif/SketchMind.git
cd SketchMind
# Install dependencies
pip install -r requirements.txt
## Config API Keys in .env
# For GPT models
OPENAI_API_KEY=your_openai_api_key_here
# For Llama4 models via OpenRouter
OPENROUTER_API_KEY=your_openrouter_api_key_here

For a complete list of dependencies, see requirements.txt.
Set the specific model names and data paths in the task's .yaml file before running an evaluation.
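A minimal task configuration might look like the following; the field names and model identifiers below are assumptions based on the pipeline's inputs, so consult `config/task_config.yaml` in the repository for the authoritative schema:

```yaml
# Hypothetical task configuration; keys and model IDs are illustrative.
task_id: Task1
model:
  gpt_model: gpt-4o                          # used with --model-type gpt
  llama_model: meta-llama/llama-4-maverick   # used with --model-type llama4
paths:
  student_images: data/Task1/student_images
  golden_standard_images: data/Task1/golden_standard_images
  rubric: data/Task1/rubric.txt
  question: data/Task1/question.txt
output_dir: outputs/Task1
```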
# GPT pipeline
python run.py \
  --config config/example_task.yaml \
  --model-type gpt \
  --student-image data/Task{id}/student_images/student1_sketch.jpg

# Llama4 pipeline
python run.py \
  --config config/example_task.yaml \
  --model-type llama4 \
  --student-image data/Task{id}/student_images/student1_sketch.jpg

SketchMind is evaluated on a comprehensive dataset of 3,500+ student-drawn scientific sketches across 6 NGSS-aligned assessment items:
Each assessment item includes:
- ✅ Detailed textual rubric
- ✅ 3 gold-standard scientific sketches (Beginning, Developing, Proficient)
- ✅ Student scientific sketch images
- ✅ Expert-assigned proficiency labels
Note: The full dataset release is pending approval.
- OpenAI for GPT API access and multimodal capabilities
- Meta AI for open-sourcing multimodal models like LLaMA-4
- OpenRouter for making LLaMA models available via GPT-like API calls for easy reproducibility
Thanks to Dr. Xiaoming Zhai for his unwavering support throughout the project. Special thanks to our educators at AI4STEM Education Center at University of Georgia who provided domain expertise for rubric development.
@misc{latif2025sketchmindmultiagentcognitiveframework,
title={SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches},
author={Ehsan Latif and Zirak Khan and Xiaoming Zhai},
year={2025},
eprint={2507.22904},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2507.22904},
}

For questions, collaborations, or support:
- 📧 Email: Zirak.khan@uga.edu || Ehsan.Latif@uga.edu
- 🐛 Issues: GitHub Issues
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Looking forward to your feedback, contributions, and stars! 🌟
