Skip to content

fix: prevent path traversal via user_id in playground Flask app (CWE-22)#68

Open
sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
sebastiondev:fix/cwe22-app-flask-d5e2
Open

fix: prevent path traversal via user_id in playground Flask app (CWE-22)#68
sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
sebastiondev:fix/cwe22-app-flask-d5e2

Conversation

@sebastiondev
Copy link
Copy Markdown

Summary

The /init_memory endpoint in memoryos-playground/memdemo/app.py accepts a user_id from the JSON request body and passes it directly into os.path.join(data_storage_path, "users", user_id) (and into several derived file paths for short_term.json, mid_term.json, long_term_user.json, and the per-assistant directory). Because the value is never validated, a request with user_id="../something" escapes the intended users/ directory and causes Memoryos to create/overwrite files at attacker-controlled locations on disk.

  • CWE: CWE-22 (Improper Limitation of a Pathname to a Restricted Directory)
  • File / function: memoryos-playground/memdemo/app.pyinit_memory()
  • Severity: High — unauthenticated, network-reachable filesystem write outside the data directory
  • Data flow: request.get_json()['user_id']Memoryos(user_id=…, data_storage_path=…)os.path.join(data_storage_path, "users", user_id) and subsequent file opens.

The demo app binds 0.0.0.0:5019 with debug=True and has no authentication on this endpoint, so any host that can reach the port can trigger the sink.

Fix

Added an _is_safe_id() helper and called it at the top of init_memory() before the value is used to construct any path. The helper:

  • Requires the value to match ^[A-Za-z0-9][A-Za-z0-9._-]*$ (alphanumerics, dot, hyphen, underscore; must start with an alphanumeric).
  • Explicitly rejects empty strings, null bytes (\x00), /, \, and any ./.. component.

If validation fails the endpoint returns 400 with a clear error message; nothing else in the request handler changes. The check sits in front of the only place user_id enters the system from the network — other endpoints look up an already-initialised entry in memory_systems keyed by session, so there is no parallel unfixed sink to worry about.

Tests

Added tests/test_cwe22_app_flask.py, which uses Flask's test client and a stub Memoryos to exercise /init_memory directly without pulling in the ML stack. It asserts that:

  • Traversal payloads are rejected with HTTP 400: ../etc/cron.d, ..\windows\system32, foo/../../etc/passwd, foo/../../../tmp/evil, ....//....//etc, /absolute/path, ., ...
  • Reasonable identifiers are still accepted with HTTP 200: alice, bob_123, user-2024, CamelCaseUser.

All 12 cases pass locally.

Why this is exploitable

The endpoint is reachable without authentication, the input is taken verbatim from JSON, and os.path.join does not collapse .. segments — so users/../../../tmp/evil/short_term.json resolves outside data_storage_path. Memoryos then creates the parent directory (os.makedirs(..., exist_ok=True)) and writes JSON files there, giving an attacker arbitrary file creation/overwrite within whatever the demo process can write to. With debug=True and a network-bound listener this is realistic for anyone who runs the playground on a shared network or container.

Before submitting, we tried to disprove this: we checked whether the surrounding framework, an upstream proxy, or the Memoryos constructor itself normalises or rejects traversal in user_id. None of them do — Memoryos happily uses whatever path it is given, and there is no auth/middleware in front of /init_memory. The validation has to live at the request handler, which is what this PR adds.

cc @lewiswigmore

…-22)

Add _is_safe_id() validation to the /init_memory endpoint to reject
user_id values containing path separators, ".." components, null bytes,
or other characters that could escape the intended data directory.

Without this check, an attacker can POST user_id="../../etc/cron.d" to
create arbitrary directories (via os.makedirs in the Memoryos constructor)
or delete arbitrary directory trees (via shutil.rmtree in /clear_memory).

The fix uses an allowlist regex (alphanumeric, hyphens, underscores, dots)
applied at the HTTP boundary before the value reaches any filesystem operation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant