This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Install dependencies
pip install -r app/requirements.txt
# Run the Flask app locally
cd app
gunicorn --workers 1 --threads 4 sw:app
# Access at http://127.0.0.1:8000

# Build and run with Docker
docker build -t smallweb .
docker run -p 8080:8080 smallweb

# Crawl all feeds (expensive operation)
cd maintenance
./crawl.sh
# Process crawl results and clean up feeds
./process.sh

Kagi Small Web is a feed aggregation platform that curates and displays content from the "small web": personal blogs, independent YouTube channels, and webcomics. The system operates as a Flask web application with background feed processing.
Main Application (app/sw.py)
- Flask web server serving random posts from curated feeds
- Background feed updates every 5 minutes using APScheduler
- User interaction features: emoji reactions, notes, content flagging
- Iframe embedding for seamless content viewing
- Multiple content modes: blogs, YouTube videos, GitHub projects, comics
Feed Management System
- smallweb.txt: Personal blog RSS/Atom feeds (thousands of entries)
- smallyt.txt: YouTube channel feeds with subscriber/frequency limits
- smallcomic.txt: Independent webcomic feeds
- yt_rejected.txt: Rejected YouTube channels for reference
Data Persistence
- data/favorites.pkl: User emoji reactions stored as OrderedDict per URL
- data/notes.pkl: User notes with timestamps per URL
- data/flagged_content.pkl: Content flagging counts
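
The pickle-backed storage above can be sketched roughly as follows. This is a minimal illustration, not the code in app/sw.py: the helper names (`load_reactions`, `save_reactions`, `add_reaction`) are hypothetical, and the assumption that each URL maps to an OrderedDict of emoji to count is a guess at the layout of data/favorites.pkl.

```python
import pickle
from collections import OrderedDict
from pathlib import Path


def load_reactions(path):
    """Return the stored {url: OrderedDict} mapping, or an empty dict if no file exists."""
    path = Path(path)
    if not path.exists():
        return {}
    with path.open("rb") as fh:
        return pickle.load(fh)


def save_reactions(path, reactions):
    """Write to a temporary file first, then rename it over the target file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    with tmp.open("wb") as fh:
        pickle.dump(reactions, fh)
    tmp.replace(path)


def add_reaction(reactions, url, emoji):
    """Increment one emoji's count for a URL (assumed layout: emoji -> count)."""
    per_url = reactions.setdefault(url, OrderedDict())
    per_url[emoji] = per_url.get(emoji, 0) + 1
    return per_url
```

Writing to a temporary file and renaming it means an interrupted `pickle.dump` leaves the previous file in place. Only load pickle files the application wrote itself, since unpickling untrusted data can execute arbitrary code.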
- Ingestion: Fetches from Kagi's Small Web API (/api/v1/smallweb/feed/)
- Filtering: YouTube Shorts removal, image detection for comics
- Caching: In-memory storage with periodic updates
- Generation: Creates appreciated feed and OPML export
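
As an illustration of the filtering step, here is a minimal sketch of Shorts removal. It assumes feed entries are dicts with a `link` key and that a Short can be recognized by a `/shorts/` path on a youtube.com host. Both are assumptions for the example, and the real detection in app/sw.py may work differently. The function names are hypothetical.

```python
from urllib.parse import urlparse


def is_youtube_short(link):
    """Return True if the link looks like a YouTube Shorts URL (assumed heuristic)."""
    parsed = urlparse(link)
    host = parsed.netloc.lower()
    on_youtube = host == "youtube.com" or host.endswith(".youtube.com")
    return on_youtube and parsed.path.startswith("/shorts/")


def drop_shorts(entries):
    """Keep only the entries whose link is not a Shorts URL."""
    return [entry for entry in entries if not is_youtube_short(entry.get("link", ""))]
```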
- Random Discovery: Algorithmic selection from curated feeds
- Content Types: Blogs (?mode=0), YouTube (?yt), Appreciated (?app), GitHub (?gh), Comics (?comic)
- Search: Full-text search across titles, authors, descriptions
- Reactions: 14 emoji types with max 3 per URL, automatic feed inclusion
- Personal Notes: Timestamped annotations per URL
- Content Moderation: Community flagging system
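
The query-string flags listed under Content Types could be resolved to a mode along these lines. This is a sketch under assumptions: the function name `resolve_mode`, the returned labels, the precedence among flags, and the fallback to blogs when no flag is present are all hypothetical and not taken from app/sw.py.

```python
from urllib.parse import parse_qs

# Flag name in the query string -> mode label (labels are illustrative).
MODE_FLAGS = {
    "yt": "youtube",
    "app": "appreciated",
    "gh": "github",
    "comic": "comics",
}


def resolve_mode(query_string):
    """Map a raw query string such as '?yt' or 'mode=0' to a content mode label."""
    # keep_blank_values=True keeps bare flags like 'yt' that carry no '=value' part.
    params = parse_qs(query_string.lstrip("?"), keep_blank_values=True)
    for flag, label in MODE_FLAGS.items():
        if flag in params:
            return label
    # Assumption: '?mode=0' and an empty query both mean the blog mode.
    return "blogs"
```

For example, `resolve_mode("?comic")` returns `"comics"` and `resolve_mode("?mode=0")` returns `"blogs"`.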
The application deploys to Google Cloud Run with:
- GCS bucket mounting via gcsfuse for persistent data
- Cloud Build pipeline (cloudbuild.yaml)
- Service account with appropriate IAM permissions
- Auto-scaling with 2-4 instances