Skills and automation tactics for spider rust projects.
Pre-built skill definitions for solving common web challenges and interacting with the spider.cloud API. Skills are markdown prompt fragments with trigger conditions that get dynamically injected into the LLM context when the page state matches.
Note: The Rust crate is optional — it provides a typed integration layer for the spider.rs ecosystem. The skill definitions in
skills/are standalone markdown files usable with any LLM-based automation system.
skills/
automation/ 69 web challenge skills (CAPTCHAs, puzzles, forms, security, data extraction)
api/ 8 spider.cloud API reference skills (crawl, scrape, search, screenshot, etc.)
core/ agent-agnostic skill specs
codex/ Codex adapter skills (SKILL.md format)
claude/ Claude adapter skills
This repository includes a generic core skill spec plus platform adapters.
Current Spider CLI extraction skill:
- Core:
skills/core/spider-cli-extraction.md - Codex adapter:
skills/codex/spider-cli-extraction/ - Claude adapter:
skills/claude/spider-cli-extraction.md
To use with Codex, copy skills/codex/spider-cli-extraction/ to $CODEX_HOME/skills/spider-cli-extraction.
Pre-built tactics for common web challenges encountered during crawling and browser automation. Each .md file contains YAML frontmatter (trigger conditions, priority) and prompt content for LLM-driven solving.
Categories:
| Category | Count | Examples |
|---|---|---|
| CAPTCHAs | 20 | reCAPTCHA v2/v3, hCaptcha, Turnstile, GeeTest, FunCaptcha, audio, math, puzzle piece |
| Interactive Puzzles | 19 | Image grids, tic-tac-toe, word search, sliding tiles, mazes, sudoku, crosswords, memory games |
| Access Barriers | 10 | Cookie consent, login walls, age verification, paywalls, popups, redirect chains, iframes |
| Form Automation | 8 | Multi-step forms, file uploads, OTP inputs, payment forms, address forms |
| Anti-Bot / Security | 6 | Bot detection, rate limiting, JS challenges, proof-of-work, fingerprinting, device verification |
| Data Extraction | 6 | Tables, product listings, contact info, pricing, search results, charts |
Reference skills for the spider.cloud API — endpoint documentation, parameters, and usage examples.
| Skill | Endpoint | Description |
|---|---|---|
crawl |
POST /crawl |
Multi-page website crawling |
scrape |
POST /scrape |
Single-page data extraction |
search |
POST /search |
SERP queries with optional content fetch |
links |
POST /links |
Link discovery and extraction |
screenshot |
POST /screenshot |
Visual page capture |
transform |
POST /transform |
HTML-to-markdown/text conversion |
unblocker |
POST /unblocker |
Anti-bot bypass (10-40 extra credits) |
ai |
POST /ai/crawl, /ai/scrape, /ai/search, /ai/browser, /ai/links |
AI-powered Spider routes (subscription required) |
[dependencies]
spider_skills = "0.1"| Feature | Default | Description |
|---|---|---|
web_challenges |
Yes | 69 built-in web challenge skills |
fetch |
No | Load skills from remote URLs at runtime |
s3 |
No | Load skills from AWS S3 buckets |
# All features
spider_skills = { version = "0.1", features = ["web_challenges", "fetch", "s3"] }
# Minimal (just core types, no built-in skills)
spider_skills = { version = "0.1", default-features = false }use spider_skills::web_challenges;
// Get a registry with all 69 built-in skills
let registry = web_challenges::registry();
// Or pick specific skill categories
let mut registry = spider_skills::new_registry();
web_challenges::add_image_grid(&mut registry);
web_challenges::add_text_captcha(&mut registry);
web_challenges::add_tic_tac_toe(&mut registry);let registry = spider_skills::web_challenges::registry();
// Returns combined prompt context for matching skills
let context = registry.match_context(
"https://example.com/login", // url
"Sign In", // title
"<div class='g-recaptcha'>", // html
);
// context now contains the login-wall and recaptcha-v2 skill promptslet mut registry = spider_skills::new_registry();
registry.load_markdown(r#"---
name: my-skill
description: Custom challenge solver
triggers:
- title_contains: "my challenge"
- html_contains: "challenge-widget"
---
Strategy for solving my custom challenge...
"#);# async fn example() {
let mut registry = spider_skills::new_registry();
spider_skills::fetch::fetch_skill(&mut registry, "https://example.com/skills/my-skill.md").await.unwrap();
# }# async fn example() -> Result<(), Box<dyn std::error::Error>> {
use spider_skills::s3::S3SkillSource;
let source = S3SkillSource::new("my-skills-bucket").await;
let mut registry = spider_skills::new_registry();
source.load_into(&mut registry, "skills/").await?;
# Ok(())
# }┌───────────────────────────┐
│ spider_skills │ ← This crate: types + skill content
│ ┌──────────────────────┐ │
│ │ Skill, SkillTrigger, │ │ ← Core types (defined here)
│ │ SkillRegistry │ │
│ ├──────────────────────┤ │
│ │ web_challenges │ │ ← 69 built-in automation skills
│ │ fetch │ │ ← Optional: fetch from URLs
│ │ s3 │ │ ← Optional: load from S3
│ └──────────────────────┘ │
└────────────┬──────────────┘
│ used by
┌─────────┼─────────┐
▼ ▼ ▼
spider spider your
_agent _worker project
Each skill is a markdown file with YAML frontmatter:
---
name: my-challenge-solver
description: Solves a specific type of web challenge
triggers:
- title_contains: "challenge keyword"
- html_contains: "challenge-css-class"
- url_contains: "/challenge/"
priority: 5
---
# Strategy
Step-by-step instructions for the LLM to follow when this
challenge type is detected...Trigger types:
title_contains— case-insensitive match on page titleurl_contains— case-insensitive match on page URLhtml_contains— case-insensitive match on page HTML
Priority: Higher values are injected first. Use 1-3 for low priority, 4-5 for medium, 6+ for high.
MIT