Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 4 additions & 21 deletions .jules/sentinel.md
Original file line number Diff line number Diff line change
@@ -1,21 +1,4 @@
## 2024-05-30 - Local File Inclusion and Argument Injection in File Reading API

**Vulnerability:** The API endpoints `/api/open-file`, `/api/files/content`, and `/api/files/preview-text` accepted unvalidated absolute file paths directly from API requests, allowing Local File Inclusion (LFI) via paths like `/etc/passwd`. Additionally, the `/api/open-file` endpoint passed raw strings to `subprocess.run(['open', path])`, enabling argument injection if a filename started with `-`.

**Learning:** When APIs expose raw file system operations (like reading or passing paths to system commands), depending on clients to send "valid" paths is insufficient. Python's `Path.resolve()` combined with `is_relative_to` provides a robust mechanism to evaluate the *final* destination of a path, neutralizing relative traversal (`../`). However, for endpoints supporting custom folders (like an organizer app), hardcoded whitelists can break functionality. Instead, limiting access to a broad safe boundary (like the user's home directory `Path.home()`) strikes a balance. Furthermore, treating strings as arguments requires strict validation; resolving local paths to absolute strings inherently prefixes them with root (`/`) or drive letters, naturally neutralizing argument injection (`-rf`).

**Prevention:**
1. Always resolve paths to absolute destinations using `.resolve()` before operating on them.
2. Verify path containment within allowed boundaries using `is_relative_to` (or `security_utils.validate_path_within_base`).
3. For endpoints invoking system commands with user-provided paths, ensure paths are absolute to prevent them from being parsed as options (flags starting with `-`), or explicitly block paths where `.name.startswith('-')`.
4. Special care must be given to URL support to prevent bypasses like `file:///etc/passwd` when filtering `http`/`https`.

## 2024-05-30 - Argument Injection Vulnerability in Video Processing Tools

**Vulnerability:** The `vision_content_extractor.py` module passed unvalidated string file paths directly to `subprocess.run` calls for `ffprobe` and `ffmpeg` when preparing video samples. If an attacker controlled the filename, they could name a file starting with `-` (e.g., `-someflag`), leading to argument injection where the command-line tool interprets the filename as an option.

**Learning:** When invoking external command-line tools (like `ffmpeg` or `ffprobe`) using `subprocess.run` with user-controlled file paths, using `str(file_path)` is insufficient to prevent argument injection. If a path string happens to be a relative filename like `-v`, it can alter the tool's behavior, potentially leading to unauthorized operations or command execution depending on the tool's supported flags.

**Prevention:**
1. Always convert `pathlib.Path` objects to absolute strings using `str(path.absolute())` before passing them as arguments to `subprocess.run`.
2. Absolute paths always begin with a directory separator (`/` on Unix) or a drive letter (`C:\` on Windows), guaranteeing the command-line tool parses them as file paths rather than flags or options.
## 2024-05-18 - Prevent SQL Injection in Dynamic Column Names
**Vulnerability:** The `save_file_metadata` method in `metadata_generator.py` dynamically constructed SQL `INSERT OR REPLACE INTO` statements using unvalidated dictionary keys (`metadata.keys()`) as column names. This exposed the application to SQL injection if an attacker could control the keys in the `metadata` dictionary, as standard `?` parameterization does not protect column identifiers.
**Learning:** When constructing dynamic SQL queries that require variable column names, standard parameterization is insufficient. You must use an explicit schema allowlist to filter user-provided keys.
**Prevention:** Fetch the valid column names directly from the database schema (e.g., using `PRAGMA table_info(table_name)`) to create a strict allowlist. Filter the incoming dictionary keys against this allowlist before constructing the SQL query string.
14 changes: 12 additions & 2 deletions metadata_generator.py
Original file line number Diff line number Diff line change
Expand Up @@ -467,9 +467,19 @@ def save_file_metadata(self, metadata: Dict[str, Any]) -> bool:

try:
with sqlite3.connect(self.db_path) as conn:
# Get valid columns to prevent SQL injection
cursor = conn.execute("PRAGMA table_info(file_metadata)")
valid_columns = {row[1] for row in cursor.fetchall()}

# Filter metadata to only include valid columns
filtered_metadata = {k: v for k, v in metadata.items() if k in valid_columns}

if not filtered_metadata:
return False

# Convert to database format
columns = list(metadata.keys())
values = list(metadata.values())
columns = list(filtered_metadata.keys())
values = list(filtered_metadata.values())
placeholders = ', '.join(['?' for _ in values])
column_names = ', '.join(columns)

Expand Down
Loading