@@ -10,64 +10,56 @@ An MCP (Model Context Protocol) server for interacting with the Internet Archive

 This MCP server provides tools to:
 - Save web pages to the Wayback Machine
-- Retrieve archived versions of web pages
-- Check archive status and availability
-- Search the Wayback Machine CDX API
+- Retrieve archived versions of web pages
+- Check archive status and statistics
+- Search the Wayback Machine CDX API for available snapshots

 ## Features

 - **No API keys required** - Uses public Wayback Machine endpoints
 - **Save pages** - Archive any publicly accessible URL
-- **Retrieve archives** - Get archived versions with timestamps
-- **Verify archives** - Check if saves were successful
-- **Search archives** - Query available snapshots for a URL
-
-## Architecture Plan
-
-### Core Tools
-
-1. **save_url**
-   - Triggers archiving of a URL
-   - Returns the archive timestamp and URL
-   - Handles rate limiting and retries
-
-2. **get_archived_url**
-   - Retrieves the most recent archived version
-   - Option to specify a specific timestamp
-   - Returns the wayback URL
-
-3. **check_archive_status**
-   - Verifies if an archive request completed
-   - Returns status and final archive URL
-
-4. **search_archives**
-   - Query CDX API for available snapshots
-   - Filter by date range, status code, mimetype
-   - Support different match types (exact, prefix, host, domain)
-   - Return list of available versions with metadata
-
-5. **get_archive_availability**
-   - Check if a URL has been archived
-   - Return closest snapshot to a given timestamp
-   - Return summary of archive coverage
-
-6. **get_timemap**
-   - Retrieve TimeMap for a URL (all available timestamps)
-   - Returns list of all archived versions
-   - Implements Memento Protocol
-
-7. **search_internet_archive**
-   - Search across Internet Archive collections
-   - Not limited to Wayback Machine
-   - Find related archived content
-
-### Technical Implementation
+- **Retrieve archives** - Get archived versions with optional timestamps
+- **Archive statistics** - Get capture counts and yearly statistics
+- **Search archives** - Query available snapshots with date filtering
+- **Rate limiting** - Built-in rate limiting to respect service limits
+
+## Tools
+
+### 1. **save_url**
+Archive a URL to the Wayback Machine.
+- **Input**: `url` (required) - The URL to save
+- **Output**: Success status, archived URL, and timestamp
+- Handles rate limiting automatically
+
+### 2. **get_archived_url**
+Retrieve an archived version of a URL.
+- **Input**:
+  - `url` (required) - The URL to retrieve
+  - `timestamp` (optional) - Specific timestamp (YYYYMMDDhhmmss) or "latest"
+- **Output**: Archived URL, timestamp, and availability status
+
+### 3. **search_archives**
+Search for all archived versions of a URL.
+- **Input**:
+  - `url` (required) - The URL to search for
+  - `from` (optional) - Start date (YYYY-MM-DD)
+  - `to` (optional) - End date (YYYY-MM-DD)
+  - `limit` (optional) - Maximum results (default: 10)
+- **Output**: List of snapshots with dates, URLs, status codes, and mime types
+
+### 4. **check_archive_status**
+Check archival statistics for a URL.
+- **Input**: `url` (required) - The URL to check
+- **Output**: Archive status, first/last capture dates, total captures, yearly statistics
+
+### Technical Details

 - **Transport**: Stdio (for Claude Desktop integration)
-- **HTTP Client**: Built-in fetch for API calls
-- **Rate Limiting**: Respect Wayback Machine limits
-- **Error Handling**: Graceful handling of failed saves
-- **Validation**: URL validation before operations
+- **HTTP Client**: Built-in fetch with timeout support
+- **Rate Limiting**: 15 requests per minute (conservative limit)
+- **Error Handling**: Graceful handling with detailed error messages
+- **Validation**: URL and timestamp validation
+- **TypeScript**: Full type safety with Zod schema validation

 ### API Endpoints (No Keys Required)

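The `search_archives` inputs added in the hunk above (`url`, `from`, `to`, `limit`) map fairly directly onto a Wayback Machine CDX query. The following is a minimal sketch of that mapping, not the code in `src/tools/search.ts`. It assumes the public CDX endpoint at `https://web.archive.org/cdx/search/cdx` with its `url`, `from`, `to`, `limit`, and `output` parameters; the names `SearchArchivesInput`, `toCdxDate`, and `buildCdxUrl` are hypothetical and exist only for illustration.

```typescript
// Assumption: the public Wayback Machine CDX endpoint is the one being queried.
const CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx";

// Hypothetical input shape mirroring the documented search_archives parameters.
interface SearchArchivesInput {
  url: string;
  from?: string; // YYYY-MM-DD
  to?: string; // YYYY-MM-DD
  limit?: number; // defaults to 10, as documented
}

// Convert a YYYY-MM-DD date into the YYYYMMDD form used by CDX timestamps.
function toCdxDate(date: string): string {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error(`Expected a date in YYYY-MM-DD form, got "${date}"`);
  }
  return date.replace(/-/g, "");
}

// Build the query URL; URLSearchParams takes care of percent-encoding.
function buildCdxUrl(input: SearchArchivesInput): string {
  const params = new URLSearchParams({ url: input.url, output: "json" });
  if (input.from !== undefined) params.set("from", toCdxDate(input.from));
  if (input.to !== undefined) params.set("to", toCdxDate(input.to));
  params.set("limit", String(input.limit ?? 10));
  return `${CDX_ENDPOINT}?${params.toString()}`;
}
```

A call such as `buildCdxUrl({ url: "example.com", from: "2020-01-01", limit: 5 })` yields a URL that could then be passed to `fetch`. Rejecting malformed dates before the request is one way to get the "detailed error messages" the README mentions, though the project may validate differently (for example, with Zod).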
@@ -89,40 +81,63 @@ This MCP server provides tools to:
 ```
 mcp-wayback-machine/
 ├── src/
-│   ├── index.ts              # Main server entry point
+│   ├── index.ts              # MCP server entry point
 │   ├── tools/                # Tool implementations
-│   │   ├── save.ts
-│   │   ├── retrieve.ts
-│   │   ├── search.ts
-│   │   └── status.ts
+│   │   ├── save.ts           # save_url tool
+│   │   ├── retrieve.ts       # get_archived_url tool
+│   │   ├── search.ts         # search_archives tool
+│   │   └── status.ts         # check_archive_status tool
 │   ├── utils/                # Utilities
-│   │   ├── http.ts           # HTTP client wrapper
-│   │   ├── validation.ts     # URL validation
-│   │   └── rate-limit.ts     # Rate limiting
-│   └── types.ts              # TypeScript types
-├── tests/                    # Test files
+│   │   ├── http.ts           # HTTP client with timeout
+│   │   ├── validation.ts     # URL/timestamp validation
+│   │   └── rate-limit.ts     # Rate limiting implementation
+│   └── *.test.ts             # Test files (alongside source)
+├── dist/                     # Built JavaScript files
 ├── package.json
 ├── tsconfig.json
 └── README.md
 ```

 ## Installation

+### From npm
+```bash
+npm install -g mcp-wayback-machine
+```
+
+### From source
 ```bash
-npm install
-npm run build
+git clone https://github.com/Mearman/mcp-wayback-machine.git
+cd mcp-wayback-machine
+yarn install
+yarn build
 ```

 ## Usage

-Configure in Claude Desktop settings:
+### Claude Desktop Configuration
+
+Add to your Claude Desktop settings:
+
+#### Using npm installation
+```json
+{
+  "mcpServers": {
+    "wayback-machine": {
+      "command": "npx",
+      "args": ["mcp-wayback-machine"]
+    }
+  }
+}
+```

+#### Using local installation
 ```json
 {
   "mcpServers": {
     "wayback-machine": {
       "command": "node",
-      "args": ["/path/to/mcp-wayback-machine/dist/index.js"]
+      "args": ["/absolute/path/to/mcp-wayback-machine/dist/index.js"]
     }
   }
 }
@@ -131,11 +146,16 @@ Configure in Claude Desktop settings:
 ## Development

 ```bash
-npm run dev    # Run in development mode
-npm test       # Run tests
-npm run build  # Build for production
+yarn dev         # Run in development mode with hot reload
+yarn test        # Run tests
+yarn test:watch  # Run tests in watch mode
+yarn build       # Build for production
+yarn start       # Run production build
 ```

+### Testing
+The project uses Vitest for testing. Tests are located alongside source files with `.test.ts` extensions.
+
 ## Resources

 ### Official Documentation
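The first hunk documents a limit of 15 requests per minute but the diff does not show how `src/utils/rate-limit.ts` enforces it. One plausible approach is a sliding-window limiter, sketched below under that assumption. The class name `SlidingWindowRateLimiter` and its methods are hypothetical, and the real implementation may use a different technique (a token bucket, for instance).

```typescript
// Hypothetical sliding-window limiter; not taken from src/utils/rate-limit.ts.
class SlidingWindowRateLimiter {
  private timestamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;

  // Defaults mirror the documented "15 requests per minute".
  constructor(maxRequests: number = 15, windowMs: number = 60_000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Milliseconds the caller must wait before another request fits (0 if it fits now).
  msUntilAllowed(now: number = Date.now()): number {
    // Keep only requests that are still inside the window.
    this.timestamps = this.timestamps.filter((t) => t > now - this.windowMs);
    if (this.timestamps.length < this.maxRequests) return 0;
    const oldest = this.timestamps[0] ?? now;
    return oldest + this.windowMs - now;
  }

  // Record a request if one is allowed; return false without recording otherwise.
  tryAcquire(now: number = Date.now()): boolean {
    if (this.msUntilAllowed(now) > 0) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

A tool handler could call `tryAcquire()` before each outbound request and, when it returns `false`, either wait for `msUntilAllowed()` milliseconds or surface an error to the client. Accepting `now` as a parameter keeps the class testable without fake timers.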