fix: prevent path traversal via user_id in playground /clear_memory (CWE-22)#67

Open
sebastiondev wants to merge 1 commit into BAI-LAB:main from sebastiondev:fix/cwe22-app-directory-7b14

@sebastiondev

Summary

Fixes a path traversal vulnerability (CWE-22) in memoryos-playground/memdemo/app.py where a user-supplied user_id flows unsanitized into filesystem path construction and is later passed to shutil.rmtree(), allowing arbitrary directory deletion as the Flask process user.

Vulnerability details

  • File: memoryos-playground/memdemo/app.py
  • CWE: CWE-22 (Improper Limitation of a Pathname to a Restricted Directory)
  • Endpoints: POST /init_memory (entry point) → POST /clear_memory (sink)
  • Auth required: None — the playground app binds to 0.0.0.0:5019 with no authentication
  • Severity: High (unauthenticated arbitrary directory deletion on the host)

Data flow

  1. POST /init_memory accepts a JSON body and reads user_id directly from request.json.
  2. The user_id is passed to Memoryos(...), which builds data_storage_path/users/<user_id> (and a parallel assistants/... path) without sanitization.
  3. The resulting user_data_dir / assistant_data_dir are stored on the in-memory memory_systems dict keyed by user_id.
  4. POST /clear_memory looks up the entry and calls shutil.rmtree(user_data_dir) and shutil.rmtree(assistant_data_dir).

A payload such as {"user_id": "../../etc", ...} (along with the required api_key/base_url/model fields) escapes the intended data/users/ directory. The subsequent clear_memory call then deletes the traversed path recursively.
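As a minimal illustration of the escape, the sketch below joins an attacker-style user_id onto a storage root and normalises the result. The root /srv/app/data and the helper name build_user_dir are hypothetical, chosen only for the example; posixpath is used so the result does not depend on the host platform. The real Memoryos constructor is described only as calling os.path.join, so this mirrors the join step and nothing else.

```python
import posixpath


def build_user_dir(data_storage_path, user_id):
    """Join user_id under <root>/users the way an unsanitized join would.

    Hypothetical helper for illustration; it is not the Memoryos code.
    """
    joined = posixpath.join(data_storage_path, "users", user_id)
    # normpath collapses the ".." segments lexically, which shows where
    # the path ends up when no symlinks are involved.
    return posixpath.normpath(joined)


root = "/srv/app/data"  # assumed storage root, not taken from the PR

print(build_user_dir(root, "alice"))      # /srv/app/data/users/alice
print(build_user_dir(root, "../../etc"))  # /srv/app/etc
```

The second result no longer starts with the storage root, which is the condition the containment check in the fix is meant to catch.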

Fix

Two layers of protection in app.py:

  1. Strict input validation at the entry point. A new validate_identifier() helper enforces an allowlist regex (^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$), rejects non-strings, empty/whitespace-only values, and null bytes. init_memory() returns HTTP 400 for any user_id that fails validation, before a Memoryos instance is created.
  2. Defense-in-depth containment check before deletion. clear_memory() resolves both the user and assistant directories with os.path.realpath() and verifies they live under realpath(memory_system.data_storage_path) + os.sep before invoking shutil.rmtree(). If a path escapes the storage root (e.g. via an unexpected symlink or a future regression), the request is rejected with HTTP 400 instead of deleting files.
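The two layers could look roughly like the sketch below. It is not the code in the diff: the regex is the one quoted above, but the boolean return value of validate_identifier() and the helper name is_within_root() are assumptions made for the example.

```python
import os
import re

# Allowlist pattern as quoted in the PR description.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$")


def validate_identifier(value):
    """Return True only for a string that matches the allowlist.

    Assumed signature; the helper in the PR may raise or return
    something else.
    """
    if not isinstance(value, str):
        return False
    if "\x00" in value or not value.strip():
        return False
    # fullmatch is used so that "$" cannot match before a trailing newline.
    return _IDENTIFIER_RE.fullmatch(value) is not None


def is_within_root(path, root):
    """Return True if path resolves to a location strictly inside root.

    Hypothetical helper name. Both sides are passed through realpath so
    that symlinks and ".." segments are resolved before comparing.
    """
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(path)
    # os.path.join(real_root, "") appends exactly one trailing separator,
    # so "/data-other" is not mistaken for a child of "/data".
    return real_path.startswith(os.path.join(real_root, ""))
```

In this sketch a caller would run validate_identifier(user_id) before constructing anything, and is_within_root(user_data_dir, data_storage_path) immediately before shutil.rmtree(). One detail worth noting: with re.match and a "$" anchor, a value such as "alice\n" also matches, because "$" matches just before a trailing newline; the description does not say which matching call the PR's helper uses.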

Also moves import re to module top-level (it was previously imported inside an unrelated helper).

Tests

A new tests/test_cwe22_path_traversal.py exercises validate_identifier() against representative inputs:

  • Rejected: "../../etc", "foo/bar", "/absolute", "ab\x00c", "", " ", non-string values.
  • Accepted: "alice", "user.1", "user-1", "user_1".
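A table-driven test over these inputs might be shaped like the sketch below. The contents of tests/test_cwe22_path_traversal.py are not shown in this description, so the class name, the use of unittest, and the compact stand-in validator are all assumptions; only the input values come from the lists above.

```python
import re
import unittest


def validate_identifier(value):
    """Stand-in validator so the example runs on its own (assumed shape)."""
    return isinstance(value, str) and re.fullmatch(
        r"[A-Za-z0-9][A-Za-z0-9._-]{0,127}", value
    ) is not None


# Inputs taken from the lists above; None and 123 represent "non-string values".
REJECTED = ["../../etc", "foo/bar", "/absolute", "ab\x00c", "", " ", None, 123]
ACCEPTED = ["alice", "user.1", "user-1", "user_1"]


class ValidateIdentifierTests(unittest.TestCase):
    def test_rejected_inputs(self):
        for value in REJECTED:
            # subTest reports each failing input separately.
            with self.subTest(value=value):
                self.assertFalse(validate_identifier(value))

    def test_accepted_inputs(self):
        for value in ACCEPTED:
            with self.subTest(value=value):
                self.assertTrue(validate_identifier(value))
```

Saved as a module, the cases can be run with python -m unittest.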

I ran the validator manually against the same payloads and all behave as expected. The change is self-contained to app.py plus a new test file; no existing functionality is removed.

Why this is exploitable in practice

The playground app has no authentication and listens on all interfaces by default (app.run(host='0.0.0.0', port=5019)). Any user with network reachability to the port can send the malicious JSON. Flask does not sanitize JSON body fields used in filesystem operations, and there is no other layer (no reverse-proxy validation in the bundled config, no secure_filename-style call) between the request and shutil.rmtree(). The only constraint on the deletion target is the OS-level permission of the Flask process.

Adversarial review

Before submitting I tried to disprove this finding. I checked whether the Memoryos constructor or any helper between init_memory and clear_memory already normalises or rejects traversal segments — it doesn't; it just calls os.path.join and os.makedirs(..., exist_ok=True), which happily accept .. segments. I also considered whether clear_memory requires prior state that an attacker couldn't reach — but the same unauthenticated init_memory call seeds that state, so a single attacker controls both ends of the flow. Finally, I verified that the playground is intended to be runnable as-is (the README documents the python app.py entrypoint), so this isn't dead/example-only code that maintainers would expect to harden separately before deployment.

Diff stat

 memoryos-playground/memdemo/app.py |  34 ++++++-
 tests/test_cwe22_path_traversal.py | 149 +++++++++++++++++++++++++++++++++++++
 2 files changed, 180 insertions(+), 3 deletions(-)

Happy to adjust the allowlist regex (e.g. tighten dot handling further, since the leading-character anchor already rejects a leading dot, or widen it to support Unicode usernames) if that better matches your intended user model.

cc @lewiswigmore

…-22)

User-supplied user_id was used unsanitized in filesystem path construction
(os.path.join(data_path, "users", user_id)). An attacker could set
user_id to "../../etc" via POST /init_memory, then POST /clear_memory
would call shutil.rmtree() on the traversed path, deleting arbitrary
directories.

Changes:
- Add validate_identifier() with strict allowlist regex that permits
  only alphanumeric chars, hyphens, underscores, and dots (1-128 chars).
- Call validation on user_id in /init_memory before it reaches Memoryos.
- Add defense-in-depth check in /clear_memory: verify resolved paths
  are within the expected data_storage_path before calling shutil.rmtree.
- Move "import re" to module top-level (was previously inside function).