An MCP (Model Context Protocol) server and CLI tool for interacting with the Internet Archive's Wayback Machine without requiring API keys.
Built with: MCP TypeScript Template
This tool can be used in two ways:
- As an MCP server - Integrate with Claude Desktop for AI-powered interactions
- As a CLI tool - Use directly from the command line with npx or a global installation
Features:
- Save web pages to the Wayback Machine
- Retrieve archived versions of web pages
- Check archive status and statistics
- Search the Wayback Machine CDX API for available snapshots
- 🔐 No API keys required - Uses public Wayback Machine endpoints
- 💾 Save pages - Archive any publicly accessible URL
- 🔄 Retrieve archives - Get archived versions with optional timestamps
- 📊 Archive statistics - Get capture counts and yearly statistics
- 🔍 Search archives - Query available snapshots with date filtering
- ⏱️ Rate limiting - Built-in rate limiting to respect service limits
- 💻 Dual mode - Use as MCP server or standalone CLI tool
- 🎨 Rich CLI output - Colorized output with progress indicators
- 🔒 TypeScript - Full type safety with Zod validation
save_url: Archive a URL to the Wayback Machine.
- Input: url (required) - The URL to save
- Output: Success status, archived URL, and timestamp
- Handles rate limiting automatically
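The save flow described above can be sketched as two small helpers. This is an illustrative sketch, not the project's actual code: `buildSaveUrl`, `parseSnapshotLocation`, and `SaveResult` are hypothetical names, the URL check is deliberately simplified, and it assumes the save response exposes a snapshot path of the form `/web/{timestamp}/{url}` (for example through a redirect or header).

```typescript
// Hypothetical result shape mirroring the documented output:
// success status, archived URL, and timestamp.
interface SaveResult {
  success: boolean;
  archivedUrl?: string;
  timestamp?: string;
}

const SAVE_ENDPOINT = "https://web.archive.org/save/";

// Build the Save Page Now request URL for a target page.
// Simplified check: only http(s) URLs without whitespace are accepted.
function buildSaveUrl(target: string): string {
  if (!/^https?:\/\/\S+$/i.test(target)) {
    throw new Error(`Not an http(s) URL: ${target}`);
  }
  return SAVE_ENDPOINT + target;
}

// Interpret a snapshot path such as "/web/20231225120000/https://example.com".
// Assumption: the save response makes such a path available to the caller.
function parseSnapshotLocation(location: string): SaveResult {
  const match = /\/web\/(\d{14})\/(.+)$/.exec(location);
  if (!match) return { success: false };
  const [, timestamp = "", original = ""] = match;
  return {
    success: true,
    archivedUrl: `https://web.archive.org/web/${timestamp}/${original}`,
    timestamp,
  };
}
```

A caller would request `buildSaveUrl(url)` through the rate limiter and pass the resulting snapshot path to `parseSnapshotLocation`.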
get_archived_url: Retrieve an archived version of a URL.
- Input:
  - url (required) - The URL to retrieve
  - timestamp (optional) - Specific timestamp (YYYYMMDDhhmmss) or "latest"
- Output: Archived URL, timestamp, and availability status
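To make the timestamp rule concrete, here is a minimal sketch of validating the `timestamp` input and turning it into a Wayback Machine URL. The function names are hypothetical, and the sketch assumes that a `/web/{url}` address with no timestamp resolves to the most recent capture.

```typescript
// Accept "latest" or a 14-digit YYYYMMDDhhmmss timestamp with plausible field ranges.
function isValidTimestamp(ts: string): boolean {
  if (ts === "latest") return true;
  if (!/^\d{14}$/.test(ts)) return false;
  const field = (start: number, end: number): number => Number(ts.slice(start, end));
  const month = field(4, 6);
  const day = field(6, 8);
  const hour = field(8, 10);
  const minute = field(10, 12);
  const second = field(12, 14);
  return (
    month >= 1 && month <= 12 &&
    day >= 1 && day <= 31 &&
    hour <= 23 && minute <= 59 && second <= 59
  );
}

// Build the archived URL for a page.
// Assumption: omitting the timestamp lets the Wayback Machine pick the latest capture.
function buildArchivedUrl(url: string, timestamp: string = "latest"): string {
  if (!isValidTimestamp(timestamp)) {
    throw new Error(`Invalid timestamp: ${timestamp} (expected YYYYMMDDhhmmss or "latest")`);
  }
  const base = "https://web.archive.org/web/";
  return timestamp === "latest" ? base + url : `${base}${timestamp}/${url}`;
}
```

The range check is intentionally loose (it does not reject 31 February, for example); it only catches obviously malformed input before a request is made.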
search_archives: Search for all archived versions of a URL.
- Input:
  - url (required) - The URL to search for
  - from (optional) - Start date (YYYY-MM-DD)
  - to (optional) - End date (YYYY-MM-DD)
  - limit (optional) - Maximum results (default: 10)
- Output: List of snapshots with dates, URLs, status codes, and mime types
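As a rough illustration of how these inputs could map onto a CDX query, the sketch below builds a request URL and converts the CDX JSON response (an array of rows whose first row is the header) into snapshot objects. `buildCdxUrl`, `parseCdxRows`, `SearchOptions`, and `Snapshot` are hypothetical names; the project's real implementation may differ.

```typescript
interface SearchOptions {
  from?: string;  // YYYY-MM-DD
  to?: string;    // YYYY-MM-DD
  limit?: number; // defaults to 10, as documented above
}

interface Snapshot {
  timestamp: string;
  original: string;
  mimetype: string;
  statuscode: string;
  archivedUrl: string;
}

// CDX expects compact dates, so "2023-01-01" becomes "20230101".
function toCdxDate(date: string): string {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error(`Expected YYYY-MM-DD, got: ${date}`);
  }
  return date.replace(/-/g, "");
}

function buildCdxUrl(url: string, opts: SearchOptions = {}): string {
  const params: Array<[string, string]> = [
    ["url", url],
    ["output", "json"],
    ["limit", String(opts.limit ?? 10)],
  ];
  if (opts.from) params.push(["from", toCdxDate(opts.from)]);
  if (opts.to) params.push(["to", toCdxDate(opts.to)]);
  const query = params.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&");
  return `http://web.archive.org/cdx/search/cdx?${query}`;
}

// The JSON output is an array of rows; the first row names the columns.
function parseCdxRows(rows: string[][]): Snapshot[] {
  const header = rows[0] ?? [];
  const cell = (row: string[], name: string): string => row[header.indexOf(name)] ?? "";
  return rows.slice(1).map((row) => {
    const timestamp = cell(row, "timestamp");
    const original = cell(row, "original");
    return {
      timestamp,
      original,
      mimetype: cell(row, "mimetype"),
      statuscode: cell(row, "statuscode"),
      archivedUrl: `https://web.archive.org/web/${timestamp}/${original}`,
    };
  });
}
```

Looking columns up by header name rather than by fixed position keeps the parser working if a different field list is requested.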
check_archive_status: Check archival statistics for a URL.
- Input: url (required) - The URL to check
- Output: Archive status, first/last capture dates, total captures, yearly statistics
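One way such statistics could be derived is by aggregating a list of capture timestamps, for instance ones already returned by a CDX query. The sketch below is only an assumption about the approach; `summarizeCaptures` and `ArchiveStats` are hypothetical names and not the project's API.

```typescript
interface ArchiveStats {
  isArchived: boolean;
  firstCapture: string | undefined;
  lastCapture: string | undefined;
  totalCaptures: number;
  yearlyCaptures: Record<string, number>;
}

// Summarize YYYYMMDDhhmmss capture timestamps.
// Fixed-width digit strings sort chronologically with a plain string sort.
function summarizeCaptures(timestamps: string[]): ArchiveStats {
  const sorted = [...timestamps].sort();
  const yearlyCaptures: Record<string, number> = {};
  for (const ts of sorted) {
    const year = ts.slice(0, 4);
    yearlyCaptures[year] = (yearlyCaptures[year] ?? 0) + 1;
  }
  return {
    isArchived: sorted.length > 0,
    firstCapture: sorted[0],
    lastCapture: sorted[sorted.length - 1],
    totalCaptures: sorted.length,
    yearlyCaptures,
  };
}
```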
- Transport: Stdio (for Claude Desktop integration)
- HTTP Client: Built-in fetch with timeout support
- Rate Limiting: 15 requests per minute (conservative limit)
- Error Handling: Graceful handling with detailed error messages
- Validation: URL and timestamp validation
- TypeScript: Full type safety with Zod schema validation
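The 15 requests per minute limit could be enforced with a sliding-window limiter along these lines. This is a sketch under assumptions, not the contents of `rate-limit.ts`: the class name, method names, and the injectable clock (added here to make the behavior testable) are all hypothetical.

```typescript
class SlidingWindowLimiter {
  private stamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxRequests = 15, windowMs = 60_000, now: () => number = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Milliseconds to wait before the next request is allowed (0 means go now).
  msUntilAvailable(): number {
    const t = this.now();
    // Forget requests that have left the window.
    this.stamps = this.stamps.filter((s) => t - s < this.windowMs);
    if (this.stamps.length < this.maxRequests) return 0;
    const oldest = this.stamps[0] ?? t;
    return oldest + this.windowMs - t;
  }

  // Record a request if the window has room; otherwise refuse.
  tryAcquire(): boolean {
    if (this.msUntilAvailable() > 0) return false;
    this.stamps.push(this.now());
    return true;
  }
}
```

A caller that prefers waiting over failing can sleep for `msUntilAvailable()` milliseconds and then call `tryAcquire()` again.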
- Save Page Now: https://web.archive.org/save/{url} - Archive pages on demand
- Availability API: http://archive.org/wayback/available?url={url} - Check archive status
- CDX Server API: http://web.archive.org/cdx/search/cdx?url={url} - Advanced search and filtering
- TimeMap API: http://web.archive.org/web/timemap/link/{url} - Get all timestamps for a URL
- Metadata API: https://archive.org/metadata/{identifier} - Get Internet Archive item metadata
- Search API: https://archive.org/advancedsearch.php?q={query}&output=json - Search collections
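Of the endpoints above, the TimeMap API returns a link-format text body rather than JSON. A minimal parser might look like the following; it assumes each capture appears as an entry of the form `<url>; rel="... memento ..."; datetime="..."`, and `parseTimeMap` and `Memento` are hypothetical names.

```typescript
interface Memento {
  url: string;
  rel: string;
  datetime: string;
}

// Extract capture entries from a link-format TimeMap body.
// Entries without a "memento" relation (original, self, timegate) are skipped.
function parseTimeMap(body: string): Memento[] {
  const pattern = /<([^>]+)>;\s*rel="([^"]*\bmemento\b[^"]*)";\s*datetime="([^"]+)"/g;
  const mementos: Memento[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    const [, url = "", rel = "", datetime = ""] = match;
    mementos.push({ url, rel, datetime });
  }
  return mementos;
}
```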
mcp-wayback-machine/
├── src/
│ ├── index.ts # MCP server entry point
│ ├── tools/ # Tool implementations
│ │ ├── save.ts # save_url tool
│ │ ├── retrieve.ts # get_archived_url tool
│ │ ├── search.ts # search_archives tool
│ │ └── status.ts # check_archive_status tool
│ ├── utils/ # Utilities
│ │ ├── http.ts # HTTP client with timeout
│ │ ├── validation.ts # URL/timestamp validation
│ │ └── rate-limit.ts # Rate limiting implementation
│ └── *.test.ts # Test files (alongside source)
├── dist/ # Built JavaScript files
├── package.json
├── tsconfig.json
└── README.md
Use directly with npx (no installation needed):
npx mcp-wayback-machine save https://example.com
Or install globally:
npm install -g mcp-wayback-machine
wayback save https://example.com
Install for use with Claude Desktop:
npm install -g mcp-wayback-machine
Or build from source:
git clone https://github.com/Mearman/mcp-wayback-machine.git
cd mcp-wayback-machine
yarn install
yarn build
The tool provides a wayback command (or use npx mcp-wayback-machine):
wayback save https://example.com
# or
npx mcp-wayback-machine save https://example.com
wayback get https://example.com
wayback get https://example.com --timestamp 20231225120000
wayback get https://example.com --timestamp latest
wayback search https://example.com
wayback search https://example.com --limit 20
wayback search https://example.com --from 2023-01-01 --to 2023-12-31
wayback status https://example.com
wayback --help
wayback save --help
Add to your Claude Desktop settings:
{
  "mcpServers": {
    "wayback-machine": {
      "command": "npx",
      "args": ["mcp-wayback-machine"]
    }
  }
}
If you built from source, point to the compiled entry file:
{
  "mcpServers": {
    "wayback-machine": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-wayback-machine/dist/index.js"]
    }
  }
}
For development, run the TypeScript source directly with tsx:
{
  "mcpServers": {
    "wayback-machine": {
      "command": "npx",
      "args": ["tsx", "/absolute/path/to/mcp-wayback-machine/src/index.ts"]
    }
  }
}
yarn dev # Run in development mode with hot reload
yarn test # Run tests with coverage
yarn test:watch # Run tests in watch mode
yarn build # Build for production
yarn start # Run production build
yarn lint # Check code style
yarn lint:fix # Auto-fix code style issues
yarn format # Format code with Biome
The project uses Vitest for testing with the following features:
- Unit tests for all tools and utilities
- Integration tests for CLI commands
- Coverage reporting with c8
- Tests located alongside source files (.test.ts)
Run tests:
# Run all tests with coverage
yarn test
# Run tests in watch mode during development
yarn test:watch
# Run CI tests with JSON reporter
yarn test:ci
Once configured, you can ask Claude to:
- "Save https://example.com to the Wayback Machine"
- "Find archived versions of https://example.com from 2023"
- "Check if https://example.com has been archived"
- "Get the latest archived version of https://example.com"
# Archive multiple URLs
for url in "https://example.com" "https://example.org"; do
wayback save "$url"
sleep 5 # Be respectful with rate limiting
done
# Check if a URL was archived today
wayback search "https://example.com" --from $(date +%Y-%m-%d) --to $(date +%Y-%m-%d)
# Export archive data
wayback search "https://example.com" --limit 100 > archives.txt
- "URL not found in archive": The URL may not have been archived yet. Try saving it first.
- Rate limit errors: Add delays between requests or reduce request frequency.
- Connection timeouts: Check your internet connection and try again.
- Invalid timestamp format: Use YYYYMMDDhhmmss format (e.g., 20231225120000).
# Enable debug output
DEBUG=* wayback save https://example.com
# Check MCP server logs
DEBUG=* node dist/index.js
- Wayback Machine APIs Overview
- Internet Archive API Documentation
- CDX Server Documentation
- Save Page Now 2 (SPN2) API
- Memento Protocol Guide
- No hard rate limits for public APIs
- Be respectful - add delays between requests
- Use specific date ranges to reduce CDX result sets
- Cache responses when possible
- Include descriptive User-Agent header
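Two of the practices above, caching responses and sending a descriptive User-Agent, can be sketched as follows. The `TtlCache` class, the injectable clock, and the User-Agent string are illustrative assumptions rather than part of this project.

```typescript
// A small in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it so the caller refetches
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Hypothetical header value; replace it with your own tool name and contact details.
const REQUEST_HEADERS = {
  "User-Agent": "example-wayback-client/0.1 (+https://example.com/contact)",
};
```

Keying the cache by the full request URL means repeated CDX or availability lookups within the TTL never reach the network, and `REQUEST_HEADERS` can be passed as the `headers` option of each `fetch` call.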
- MCP Discord - Get help and share your experience
- Internet Archive Forum - Wayback Machine discussions
For completeness, here are Internet Archive APIs that require authentication but are not included in this MCP server:
- Authentication: S3-style access keys from https://archive.org/account/s3.php
- Features: Upload files, modify metadata, create items, manage collections
- Documentation:
- Authentication: S3 credentials
- Features: Advanced search capabilities, higher rate limits
- Access: Requires Internet Archive account
- Documentation:
- Authentication: Partnership agreement typically required
- Features: Bulk captures, priority processing, higher rate limits
- Documentation:
- Authentication: Special partnership agreement
- Features: Bulk downloads, custom data exports, direct database access
- Access: Contact Internet Archive directly
- Documentation:
- Create account at archive.org
- Visit S3 API page (requires login)
- Generate Access Key and Secret Key pair
- Configure using the ia configure command or manual configuration
Note: This MCP server focuses on public, keyless APIs to maintain simplicity and avoid credential management.
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made
- NonCommercial — You may not use the material for commercial purposes
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license
For commercial use or licensing inquiries, please contact the copyright holder.