Commit 457584d

🛡️ Sentinel: [CRITICAL] Fix SQL injection in metadata generation
Identified and resolved a critical SQL injection vulnerability in `metadata_generator.py`. The `save_file_metadata` method was using user-provided metadata keys directly as column names in a dynamic `INSERT OR REPLACE INTO file_metadata` query. Since parameterization (`?`) only works for values, not identifiers like column names, an attacker supplying crafted dictionary keys could execute arbitrary SQL commands. Fixed by querying `PRAGMA table_info` to fetch the actual safe schema and filtering the input metadata dictionary against this whitelist.

Co-authored-by: thebearwithabite <216692431+thebearwithabite@users.noreply.github.com>
1 parent 613e4ba commit 457584d
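
The point that `?` placeholders protect values but not column names can be reproduced with a small sketch. The table schema, the `build_insert_unsafe` helper, and the exact query layout below are assumptions made for illustration; the real schema and pre-fix query text are not shown in this commit. Because Python's `sqlite3` `execute` accepts only one statement per call, the sketch uses a single-statement payload that rewrites the `VALUES` clause rather than a stacked `DROP TABLE`.

```python
import sqlite3


def build_insert_unsafe(metadata):
    """Mimic the pre-fix pattern: dict keys are joined straight into the SQL text."""
    columns = list(metadata.keys())
    values = list(metadata.values())
    placeholders = ", ".join("?" for _ in values)
    column_names = ", ".join(columns)
    query = f"INSERT OR REPLACE INTO file_metadata ({column_names}) VALUES ({placeholders})"
    return query, values


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE file_metadata ("
    "file_path TEXT PRIMARY KEY, title TEXT, reviewed INTEGER DEFAULT 0)"
)

# The second key closes the column list, supplies its own VALUES clause,
# and comments out the rest of the generated statement.
crafted = {
    "file_path": "/tmp/report.txt",
    "title, reviewed) VALUES (?, ?, 1) --": "Quarterly report",
}
query, values = build_insert_unsafe(crafted)
conn.execute(query, values)

# The caller never supplied a legitimate "reviewed" key, yet the column is set.
row = conn.execute("SELECT file_path, title, reviewed FROM file_metadata").fetchone()
```

Both values are still bound through `?`, so parameterization worked as designed; the damage comes entirely from the key text becoming part of the statement.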

2 files changed

Lines changed: 24 additions & 2 deletions

.jules/sentinel.md

Lines changed: 11 additions & 0 deletions
@@ -19,3 +19,14 @@
 **Prevention:**
 1. Always convert `pathlib.Path` objects to absolute strings using `str(path.absolute())` before passing them as arguments to `subprocess.run`.
 2. Absolute paths always begin with a directory separator (`/` on Unix) or a drive letter (`C:\` on Windows), guaranteeing the command-line tool parses them as file paths rather than flags or options.
+
+## 2024-05-30 - SQL Injection via Dictionary Keys in Dynamic Queries
+
+**Vulnerability:** In `metadata_generator.py`, the `save_file_metadata` method dynamically constructed an `INSERT OR REPLACE INTO` query using unvalidated keys from the `metadata` dictionary as column names. This exposed the application to SQL injection, as an attacker controlling the dictionary keys could inject arbitrary SQL into the query string (e.g., `key) VALUES (...); DROP TABLE --`).
+
+**Learning:** When building dynamic SQL queries based on dictionary keys (e.g., for inserts or updates), standard parameterization (`?` placeholders) only protects the values, not the structural elements like column names. You cannot rely on the source of the dictionary being safe.
+
+**Prevention:**
+1. Explicitly fetch the allowed schema (e.g., via `PRAGMA table_info(table_name)`) to establish a whitelist of safe column names.
+2. Filter the input dictionary to strictly include only keys that match the whitelisted columns before generating the query string.
+3. This neutralizes injection vectors and gracefully ignores unrecognized metadata keys.

metadata_generator.py

Lines changed: 13 additions & 2 deletions
@@ -467,9 +467,20 @@ def save_file_metadata(self, metadata: Dict[str, Any]) -> bool:
 
         try:
             with sqlite3.connect(self.db_path) as conn:
+                # Fetch safe column names to prevent SQL injection via keys
+                cursor = conn.execute("PRAGMA table_info(file_metadata)")
+                safe_columns = {row[1] for row in cursor.fetchall()}
+
+                # Filter metadata to only include safe columns
+                safe_metadata = {k: v for k, v in metadata.items() if k in safe_columns}
+
+                if not safe_metadata:
+                    print("Error saving metadata: No valid columns provided")
+                    return False
+
                 # Convert to database format
-                columns = list(metadata.keys())
-                values = list(metadata.values())
+                columns = list(safe_metadata.keys())
+                values = list(safe_metadata.values())
                 placeholders = ', '.join(['?' for _ in values])
                 column_names = ', '.join(columns)
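
The hunk above ends before the query is executed, so the following is a minimal standalone sketch of the patched logic, not the repository's actual method. It assumes a free function that takes an open connection instead of `self.db_path`, a hypothetical three-column schema, and a query layout inferred from the commit message.

```python
import sqlite3
from typing import Any, Dict


def save_file_metadata(conn: sqlite3.Connection, metadata: Dict[str, Any]) -> bool:
    """Insert only the metadata keys that match real columns of file_metadata."""
    # The table name is a fixed literal here, never caller input.
    cursor = conn.execute("PRAGMA table_info(file_metadata)")
    safe_columns = {row[1] for row in cursor.fetchall()}  # row[1] is the column name

    # Keys that are not real column names are dropped before any SQL is built.
    safe_metadata = {k: v for k, v in metadata.items() if k in safe_columns}
    if not safe_metadata:
        return False

    columns = list(safe_metadata.keys())
    values = list(safe_metadata.values())
    placeholders = ", ".join("?" for _ in values)
    column_names = ", ".join(columns)
    conn.execute(
        f"INSERT OR REPLACE INTO file_metadata ({column_names}) VALUES ({placeholders})",
        values,
    )
    return True


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE file_metadata ("
    "file_path TEXT PRIMARY KEY, title TEXT, reviewed INTEGER DEFAULT 0)"
)

crafted = {
    "file_path": "/tmp/report.txt",
    "title, reviewed) VALUES (?, ?, 1) --": "Quarterly report",
}
saved = save_file_metadata(conn, crafted)
row = conn.execute("SELECT file_path, title, reviewed FROM file_metadata").fetchone()

# A dictionary with no recognized keys is rejected outright.
only_bad = save_file_metadata(conn, {"nope) --": 1})
```

With the filter in place, the crafted key never reaches the query string: only `file_path` is written, and `reviewed` keeps its default. Silently dropping unknown keys matches the behavior described in the commit; a caller that needs to know about dropped keys would have to compare the input and filtered dictionaries itself.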
