An interactive educational game designed to teach web scraping fundamentals through hands-on challenges using BeautifulSoup4.
Scrape & Guess is an educational web scraping game that helps students and developers learn data extraction techniques in a fun, competitive environment. Players scrape HTML files or live websites to answer progressively challenging questions.
- Master HTML parsing with BeautifulSoup4
- Understand CSS selectors and DOM navigation
- Practice data extraction and transformation
- Learn web scraping best practices
- Develop problem-solving skills
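As a rough illustration of the first few objectives, the sketch below parses a small HTML snippet with BeautifulSoup4 and shows tag finding, CSS selectors, attribute extraction, and DOM navigation. The markup, class names, and `data-year` attribute are invented for this example and are not taken from the challenge files.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real challenge files may differ.
SAMPLE_HTML = """
<div class="movie" data-year="1999">
  <h2 class="title">The Matrix</h2>
  <a href="/movies/matrix">Details</a>
</div>
<div class="movie" data-year="2010">
  <h2 class="title">Inception</h2>
  <a href="/movies/inception">Details</a>
</div>
"""

# "html.parser" is Python's built-in parser, so no extra dependency is needed.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Tag finding: the first <h2> element in the document.
first_title = soup.find("h2").get_text(strip=True)

# CSS selectors: every title that is a direct child of a .movie container.
titles = [h2.get_text(strip=True) for h2 in soup.select("div.movie > h2.title")]

# Attribute extraction: read data-year and href from each block.
years = [int(div["data-year"]) for div in soup.select("div.movie")]
links = [a["href"] for a in soup.select("div.movie a[href]")]

# DOM navigation: walk from a title up to its enclosing container.
parent_year = soup.find("h2", class_="title").find_parent("div")["data-year"]

print(first_title, titles, years, links, parent_year)
```

Note that attribute values come back as strings, which is why the years are converted with `int()` before use.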
- Offline Mode: Practice with static HTML files (no internet required)
- Progressive Difficulty: Easy → Medium → Hard → Expert challenges
- Auto-Validation: Automated answer checking system
- Real-World Scenarios: Movie databases, news sites, e-commerce layouts
- Educational: Includes detailed solutions and explanations
- Extensible: Easy to add custom challenges
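To give a feel for what automated answer checking might involve, here is a minimal sketch that compares a submitted answer with an expected one. The function names `normalize` and `check_answer` and the tolerance value are assumptions made for this example; the project's actual validator may work differently.

```python
def normalize(answer):
    """Lowercase, trim, and collapse internal whitespace."""
    return " ".join(str(answer).strip().lower().split())


def check_answer(submitted, expected, tolerance=1e-6):
    """Return True if the submitted answer matches the expected one.

    Hypothetical helper: not the project's real validation API.
    """
    # Compare numerically when both values parse as numbers.
    try:
        return abs(float(submitted) - float(expected)) <= tolerance
    except (TypeError, ValueError):
        pass
    # Otherwise fall back to a case- and whitespace-insensitive text match.
    return normalize(submitted) == normalize(expected)


print(check_answer("  The   Matrix ", "the matrix"))
print(check_answer("42.0", 42))
print(check_answer("Inception", "The Matrix"))
```

Normalizing before comparing keeps the check forgiving about stray whitespace and capitalization, which are common in scraped text.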
- Python 3.8 or higher
- pip package manager
- Text editor or IDE
# Clone the repository
git clone https://github.com/Shree2604/ScrapNSearch.git
cd ScrapNSearch
# Create virtual environment (recommended)
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
- Choose a challenge level from the challenges/ directory
- Read the HTML file in the data/ directory to understand the structure
- Write your scraping script to extract data and answer questions
- Record your answer and update the corresponding .py file in your fork
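The steps above could look roughly like the following script. It is only a sketch: the question ("which movie has the highest rating?"), the `li.movie` markup, and the `data-rating` attribute are invented, and a temporary file stands in for a real file under data/ so the example runs on its own.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup


def load_soup(path):
    """Parse a local HTML file with Python's built-in parser."""
    html = Path(path).read_text(encoding="utf-8")
    return BeautifulSoup(html, "html.parser")


def highest_rated_title(soup):
    """Answer a hypothetical question: which movie has the highest rating?"""
    rows = soup.select("li.movie")
    best = max(rows, key=lambda li: float(li["data-rating"]))
    return best.get_text(strip=True)


# Stand-in for a file under data/; the markup is invented for this sketch.
sample = """
<ul>
  <li class="movie" data-rating="8.7">The Matrix</li>
  <li class="movie" data-rating="8.8">Inception</li>
  <li class="movie" data-rating="7.9">Tron</li>
</ul>
"""

with tempfile.TemporaryDirectory() as tmp:
    html_path = Path(tmp) / "movies.html"
    html_path.write_text(sample, encoding="utf-8")
    answer = highest_rated_title(load_soup(html_path))

print(answer)
```

In a real challenge you would pass the path of the chosen HTML file to `load_soup` and replace `highest_rated_title` with logic that matches the question being asked.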
| Level | File | Difficulty | Time | Skills Required |
|---|---|---|---|---|
| 1 | movies.html | ⭐ Easy | 15-20 min | Basic tag finding, attribute extraction |
| 2 | news.html | ⭐⭐ Medium | 20-30 min | Text processing, data aggregation |
| 3 | ecommerce.html | ⭐⭐⭐ Hard | 30-45 min | Complex selectors, nested data |
| 4 | social_media.html | ⭐⭐⭐⭐ Expert | 45-60 min | Dynamic content, edge cases |
Detailed challenge instructions: See individual files in the challenges/ directory.
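For a sense of what the text processing, data aggregation, and edge case skills in the table can mean in practice, the sketch below counts articles per category and words per summary. The `<article>` markup and class names are assumptions for illustration and do not reflect the actual contents of news.html.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical markup; note the inconsistent casing, stray whitespace,
# and the final article with no category at all.
SAMPLE_HTML = """
<article><span class="category"> Tech </span><p class="summary">New  chip
 announced today</p></article>
<article><span class="category">sports</span><p class="summary">Local team wins</p></article>
<article><span class="category">TECH</span><p class="summary">Browser update ships</p></article>
<article><p class="summary">Untagged story</p></article>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

category_counts = Counter()
word_counts = []
for article in soup.find_all("article"):
    # Edge case: an article may lack a category element, so find() returns None.
    tag = article.find("span", class_="category")
    category = tag.get_text(strip=True).lower() if tag else "uncategorized"
    category_counts[category] += 1

    # Text processing: split() collapses runs of whitespace and newlines.
    summary = article.find("p", class_="summary").get_text()
    word_counts.append(len(summary.split()))

# Aggregation: the category that appears most often, with its count.
most_common_category, count = category_counts.most_common(1)[0]
print(most_common_category, count, word_counts)
```

Checking for `None` before calling `get_text()` avoids the `AttributeError` that missing elements would otherwise raise.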
This project is licensed under the MIT License - see the LICENSE file for details.