
Port sqlite operations to APSW #1076

Merged
securestep9 merged 33 commits into OWASP:master from pUrGe12:apsw-with-sqlalchemy on Feb 3, 2026

Conversation


@pUrGe12 pUrGe12 commented Jun 4, 2025

Proposed change

This PR moves SQLite database operations from SQLAlchemy's ORM to raw APSW SQL queries to reduce computational overhead. It includes the necessary supporting changes in other files and unit tests for db.py.

Type of change

  • New core framework functionality
  • Bugfix (non-breaking change which fixes an issue)
  • Code refactoring without any functionality changes
  • New or existing module/payload change
  • Localization improvement
  • Dependency upgrade
  • Documentation improvement

Checklist

  • I've followed the contributing guidelines
  • I've run make pre-commit, it didn't generate any changes
  • I've run make test, all tests passed locally

@securestep9 added the gsoc (GSoC work) label on Jun 25, 2025
@securestep9 (Collaborator) commented:

@CodeRabbit review


coderabbitai bot commented Aug 1, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


coderabbitai bot commented Aug 1, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters; for example, adding **/dist/** removes the default block on the dist directory.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

Adds optional APSW-backed SQLite support with PRAGMA and retry controls, changes several places to parse raw JSON event/service/row strings, adds DB/time/retry settings and deps, updates utilities/docs, and adds extensive database tests and related test adjustments.

Changes

Cohort / File(s) Summary
Configuration
nettacker/config.py
Added DB tuning and retry settings: journal_mode, synchronous_mode on DbConfig; use_apsw_for_sqlite, max_retries, max_submit_query_retry, retry_delay on DefaultSettings.
Database core (APSW + ORM)
nettacker/database/db.py
Added optional APSW-backed SQLite path (APSW import guard); create_connection() may return (connection, cursor) for APSW or an ORM Session; DB functions now branch on backend, using explicit SQL/transactions, retry/backoff, busy timeout, and PRAGMA application; logging variable standardized to logger.
Event/service/row JSON parsing
nettacker/core/app.py, nettacker/core/graph.py, nettacker/core/module.py, nettacker/core/lib/base.py
Switched from reading .json_event attributes to parsing raw JSON strings (e.g., json.loads(row/service/event)); report HTML builds escape from json.dumps(event).
Log merging utility
nettacker/core/utils/common.py
Imported json and added preprocessing in merge_logs_to_list() to handle dicts with a non-dict "json_event" value by parsing it before merging logs.
Docs & config examples
docs/Usage.md
Updated DB configuration examples and prose to reflect unified DBConfig, APSW toggle, and new SQLite fields (journal_mode, synchronous_mode) and default DB path.
Dependencies & packaging
pyproject.toml
Added runtime deps: apsw, impacket, pymysql; added dev tools: ruff, pytest-asyncio; test marker/config updated.
Tests added / updated
tests/database/test_db.py, tests/core/test_exclude_ports.py, plus many test import reorderings (tests/**)
Added comprehensive DB tests covering APSW and ORM flows, transactions, retries; updated tests to use json.dumps(...) where events are now raw JSON strings; many test files had import-order tweaks.
Minor import reorderings
multiple modules (nettacker/core/arg_parser.py, nettacker/api/engine.py, nettacker/core/lib/*, nettacker/api/core.py, nettacker/core/graph.py, nettacker/core/app.py, nettacker/core/lib/ssh.py, nettacker/database/models.py, etc.)
Numerous import ordering adjustments and small refactors with no API-signature changes.
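
The dual-backend create_connection() summarized above can be sketched as follows. This is a minimal illustration, not the PR's exact code: the function signature, defaults, and the "orm-session" placeholder are assumptions; only the PRAGMA application and millisecond busy timeout mirror what the review discusses.

```python
try:
    import apsw  # optional dependency, used only for the SQLite fast path
except ImportError:
    apsw = None


def create_connection(engine, db_path="nettacker.db", journal_mode="WAL",
                      synchronous_mode="NORMAL", timeout=3):
    """Return (connection, cursor) for APSW, else fall back to the ORM path."""
    if engine.startswith("sqlite") and apsw is not None:
        connection = apsw.Connection(db_path)
        connection.setbusytimeout(int(timeout) * 1000)  # APSW expects milliseconds
        cursor = connection.cursor()
        # Per-connection PRAGMAs; WAL helps with concurrent readers
        cursor.execute(f"PRAGMA journal_mode={journal_mode}")
        cursor.execute(f"PRAGMA synchronous={synchronous_mode}")
        return connection, cursor
    return "orm-session"  # placeholder: the real code returns a SQLAlchemy Session
```

Callers can then branch on isinstance(session, tuple) to pick the raw-SQL or ORM code path, which is the branching pattern the DB functions in this PR use.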

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — docstring coverage is 45.45%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (2)
  • Title check ✅ — the title 'Port sqlite operations to APSW' directly and clearly describes the main change: migrating SQLite database operations from SQLAlchemy ORM to APSW for reduced computational overhead.
  • Description check ✅ — the description is directly related to the changeset, explaining the goal of moving SQLite operations to APSW and mentioning supporting changes and unit tests.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🔭 Outside diff range comments (1)
nettacker/database/db.py (1)

297-384: Significant code duplication with submit_logs_to_db

This function duplicates most of the logic from submit_logs_to_db. Consider refactoring to extract the common retry logic into a shared helper function.

Create a generic helper function:

def _submit_with_retry(session, insert_func, log_data):
    """Generic retry logic for database insertions."""
    if isinstance(session, tuple):
        connection, cursor = session
        try:
            for _ in range(Config.settings.max_retries):
                try:
                    if not connection.in_transaction:
                        connection.execute("BEGIN")
                    insert_func(cursor, log_data)
                    return send_submit_query(session)
                except apsw.BusyError as e:
                    # ... retry logic ...
            return False
        finally:
            cursor.close()
    else:
        # SQLAlchemy logic

Then simplify both functions to use this helper.

🧹 Nitpick comments (9)
tests/database/test_db.py (3)

170-181: Consider adding assertion for sleep calls in retry test.

The test mocks time.sleep but doesn't verify it was called. Add an assertion to ensure the retry delay is properly applied:

-        with patch("time.sleep"):
+        with patch("time.sleep") as mock_sleep:
             result = send_submit_query(session)
+        
+        # Verify sleep was called during retry
+        mock_sleep.assert_called()

403-431: Improve Config mocking consistency.

The test patches individual Config attributes separately. Consider using a more consistent mocking approach:

-    @patch("nettacker.database.db.Config.settings.retry_delay", 0)
-    @patch("nettacker.database.db.Config.settings.max_retries", 1)
+    @patch("nettacker.database.db.Config")
     @patch("nettacker.database.db.logging.warn")
     @patch("nettacker.database.db.create_connection")
-    def test_apsw_busy_error(self, mock_create_conn, mock_warn):
+    def test_apsw_busy_error(self, mock_create_conn, mock_warn, mock_config):
+        mock_config.settings.retry_delay = 0
+        mock_config.settings.max_retries = 1

This approach is cleaner and prevents potential issues with partial Config mocking.


725-745: Remove duplicate test for find_temp_events.

This test duplicates the logic already covered in test_sqlite_successful_lookup (lines 656-668). Consider removing this redundant test or differentiating it with a unique scenario.

nettacker/database/db.py (6)

100-112: Simplify error handling in SQLAlchemy branch

The nested try-except blocks with identical exception handling are redundant.

 else:
-    try:
-        for _ in range(1, 100):
-            try:
-                session.commit()
-                return True
-            except Exception:
-                time.sleep(0.1)
-        logging.warn(messages("database_connect_fail"))
-        return False
-    except Exception:
-        logging.warn(messages("database_connect_fail"))
-        return False
+    for _ in range(Config.settings.max_retries):
+        try:
+            session.commit()
+            return True
+        except Exception:
+            session.rollback()
+            time.sleep(Config.settings.retry_delay)
+    logging.warn(messages("database_connect_fail"))
     return False

148-149: Use consistent error messaging

The error message should use the messages() function for consistency with the rest of the codebase.

 except Exception:
     cursor.execute("ROLLBACK")
-    logging.warn("Could not insert report...")
+    logging.warn(messages("database_error"))
     return False

200-200: Use consistent error messaging

-logging.warn("Could not remove old logs...")
+logging.warn(messages("database_error"))

625-658: Consider optimizing with fewer queries

The current implementation executes 3 queries per target. Consider using a single query with GROUP BY or window functions for better performance.

Example optimization:

SELECT 
    target,
    GROUP_CONCAT(DISTINCT module_name) as module_names,
    MAX(date) as latest_date,
    GROUP_CONCAT(event) as events
FROM scan_events
WHERE target IN (
    SELECT DISTINCT target 
    FROM scan_events 
    ORDER BY id DESC 
    LIMIT 10 OFFSET ?
)
GROUP BY target

868-869: Consider moving imports to module level

Internal imports can impact performance when the function is called multiple times.

Move these imports to the top of the file:

+from nettacker.core.graph import build_graph
+from nettacker.lib.html_log import log_data

 def logs_to_report_html(target):
     """
     generate HTML report with d3_tree_v2_graph for a host
     
     Args:
         target: the target
     
     Returns:
         HTML report
     """
-    from nettacker.core.graph import build_graph
-    from nettacker.lib.html_log import log_data

1084-1105: Simplify the capture variable logic

The current implementation with capture variable and nested loops is hard to follow. Consider using a dictionary for O(1) lookup.

# Use a dictionary for efficient lookup
target_index = {item["target"]: idx for idx, item in enumerate(selected)}

if data.target in target_index:
    idx = target_index[data.target]
    # Update existing entry
else:
    # Create new entry
    tmp = {
        "target": data.target,
        "info": {
            "module_name": [],
            "port": [],
            "date": [],
            "event": [],
            "json_event": [],
        },
    }
    selected.append(tmp)
    target_index[data.target] = len(selected) - 1
    idx = target_index[data.target]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 958e1bc and 8fca377.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/graph.py (2 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (1 hunks)
  • tests/database/test_db.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/database/test_db.py (3)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (18-35)
  • create_connection (38-70)
  • send_submit_query (73-112)
  • submit_report_to_db (115-162)
  • remove_old_logs (165-212)
  • submit_logs_to_db (215-294)
  • submit_temp_logs_to_db (297-383)
  • find_temp_events (386-446)
  • find_events (449-492)
  • select_reports (495-557)
  • get_scan_result (560-591)
  • last_host_logs (594-715)
  • get_logs_by_scan_id (718-769)
  • get_options_by_scan_id (772-800)
  • logs_to_report_json (803-855)
  • logs_to_report_html (858-964)
  • search_logs (967-1123)
nettacker/config.py (1)
  • as_dict (26-27)
nettacker/database/db.py (4)
nettacker/config.py (1)
  • Config (169-173)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
🔇 Additional comments (14)
pyproject.toml (1)

67-67: APSW dependency version validated

The caret constraint ^3.50.0.0 will include the latest 3.50.4.0 release, and our security check against PyPI/GitHub showed no known advisories for APSW. No changes required.

nettacker/config.py (2)

92-93: Excellent SQLite configuration choices.

The WAL journal mode and NORMAL synchronous mode are optimal defaults for SQLite performance and concurrency. WAL mode allows concurrent readers with writers, and NORMAL synchronous provides good durability without significant performance impact.


139-140: Well-designed retry configuration.

The retry parameters (max_retries = 3 and retry_delay = 0.1) are reasonable defaults for handling SQLite database locks and transient errors without excessive overhead.

nettacker/core/module.py (1)

81-81: Event data access pattern improved.

The change from json.loads(service.json_event) to json.loads(service) aligns with the broader event data standardization refactor. This simplifies data access by treating the service directly as a JSON string rather than accessing a nested attribute.

nettacker/core/utils/common.py (2)

6-6: JSON import properly added.

The json import is correctly added to support the new JSON parsing functionality in merge_logs_to_list.


36-39: Excellent backward compatibility handling.

The enhancement to handle json_event keys is well-designed:

  • Maintains backward compatibility during the event data refactor transition
  • Idempotent operation (only parses if not already a dict)
  • Non-intrusive to existing functionality as noted in the comment
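
A minimal sketch of that idempotent preprocessing (the helper name normalize_json_event is hypothetical; merge_logs_to_list applies this inline):

```python
import json


def normalize_json_event(log_entry):
    # Parse a string-valued "json_event" in place; leave dicts untouched,
    # so the operation is safe to apply repeatedly during the transition.
    value = log_entry.get("json_event")
    if value is not None and not isinstance(value, dict):
        log_entry["json_event"] = json.loads(value)
    return log_entry
```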
nettacker/core/lib/base.py (1)

55-55: Event data access streamlined.

The change from json.loads(event.event) to direct dictionary access event["response"]["conditions_results"] is consistent with the event data handling refactor. This eliminates unnecessary JSON parsing overhead while maintaining the same functionality.

nettacker/core/app.py (1)

155-155: LGTM! Correctly handles the new return format from find_events.

The change from json.loads(row.json_event) to json.loads(row) aligns with the database module refactoring where find_events now returns JSON strings directly instead of ORM objects with a json_event attribute.

nettacker/core/graph.py (2)

87-87: Good refactoring - passes complete event context to merge_logs_to_list.

This change improves the design by passing the entire event dictionary instead of pre-extracting the json_event field. This gives merge_logs_to_list more flexibility to handle different event structures.


160-168: JavaScript JSON parsing supports full event objects

The change to use json.dumps(event) is compatible with the existing JS in:

  • nettacker/web/static/report/json_parse.js
  • nettacker/web/static/index.html

Both files call JSON.parse(...) on the escaped JSON inserted into elements with IDs like json_event_<index>, so they will correctly deserialize the full event dictionary.

No further fixes are needed.

tests/database/test_db.py (2)

1082-1122: Good test coverage for HTML report generation.

The HTML report tests properly mock the template components and verify the integration with the graph builder. The tests cover both SQLite and SQLAlchemy backends effectively.


1-1275: Comprehensive test suite with excellent coverage.

This test file provides thorough coverage of the dual database backend functionality introduced in the PR. The tests effectively verify:

  • Both APSW (SQLite) and SQLAlchemy ORM paths
  • Retry logic and transaction management
  • Error handling and edge cases
  • All major database operations

The test structure is clear and maintainable. Great work on ensuring the new APSW implementation is well-tested!

nettacker/database/db.py (2)

233-278: Well-implemented retry logic for database locks

The retry mechanism with specific handling for BusyError and transaction state checking is well implemented. Good use of configuration parameters for max_retries and retry_delay.


509-557: Well-implemented pagination logic

Both APSW and SQLAlchemy branches correctly implement pagination with proper offset calculations and consistent return format.


@arkid15r arkid15r left a comment


This looks like a good performance improvement contribution.
Could you address coderabbit's suggestions first @pUrGe12 ?


pUrGe12 commented Aug 30, 2025

sure I'll do that


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
nettacker/core/module.py (2)

80-85: Harden JSON loading: handle dict/bytes rows and decode failures.

json.loads(service) assumes service is always a JSON string. If APSW returns bytes or ORM returns a dict, this will raise. Add type checks and guarded parsing.

Apply:

-                service_event = json.loads(service)
-                port = service_event["port"]
-                protocols = service_event["response"]["conditions_results"].keys()
+                try:
+                    raw = service
+                    if isinstance(raw, (bytes, bytearray)):
+                        raw = raw.decode("utf-8", "replace")
+                    service_event = json.loads(raw) if isinstance(raw, str) else raw
+                except (TypeError, json.JSONDecodeError) as e:
+                    log.debug(f"Skipping unexpected service row: {service!r} ({e})")
+                    continue
+                port = service_event.get("port")
+                protocols = ((service_event.get("response") or {}).get("conditions_results") or {}).keys()
+                if port is None:
+                    continue

100-107: Use safe dict lookup for service ports
Replace self.discovered_services[payload["library"]] with:

ports = self.discovered_services.get(payload["library"], [])
step = TemplateLoader.parse(step, {"port": ports})

and add a test for the branch where payload["library"] isn’t in discovered_services.

♻️ Duplicate comments (1)
nettacker/config.py (1)

85-90: Fix doc typos, stray text, and correct the default SQLite path.

There’s a typo (“deafault”), stray UI text (“Add commentMore actions”), and the documented default path .data/nettacker.db doesn’t match the actual default .nettacker/data/nettacker.db.

Apply:

-        Set the journal_mode (default="WAL") and
-        synchronous_mode (deafault="NORMAL"). Rest
-        of the fields can be left emptyAdd commentMore actions
-        This is the default database:
-        str(CWD / ".data/nettacker.db")
+        Set the journal_mode (default="WAL") and
+        synchronous_mode (default="NORMAL"). The rest
+        of the fields can be left empty.
+        Default SQLite database path:
+        str(CWD / ".nettacker/data/nettacker.db")
🧹 Nitpick comments (1)
pyproject.toml (1)

47-69: Consider making APSW an optional extra to avoid unnecessary installs.

For users on MySQL/PostgreSQL only, installing APSW can be skipped via an extra.

Add:

 [tool.poetry.dependencies]
 # ...
 apsw = "^3.50.0.0"

+[tool.poetry.extras]
+sqlite = ["apsw"]

Then document poetry install --extras sqlite for SQLite users.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8fca377 and c26a9c2.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • nettacker/core/lib/base.py
  • nettacker/core/app.py
  • nettacker/core/utils/common.py
🧰 Additional context used
🪛 GitHub Actions: CI/CD
pyproject.toml

[error] 1-1: poetry install --with test failed: pyproject.toml changed significantly since poetry.lock was last generated. Run 'poetry lock' to fix the lock file.

🔇 Additional comments (3)
pyproject.toml (1)

67-67: Verify APSW version availability on PyPI
Ensure that the ^3.50.0.0 constraint matches a release published on PyPI and that binary wheels exist for Python 3.9, 3.10, 3.11, and 3.12. If version 3.50.0.0 is not available or missing wheels for any supported interpreter, pin to a published release that does.

nettacker/config.py (2)

111-113: Confirm PRAGMA usage in DB layer.

Ensure journal_mode and synchronous_mode are applied on every SQLite connection (both initial and pooled) via PRAGMA statements in the APSW path.
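
One way to confirm the PRAGMAs take effect is to read them back on the same connection. This sketch uses the stdlib sqlite3 module (APSW's cursor.execute accepts the same statements) and a temporary file, since in-memory databases cannot enter WAL mode:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "check.db")
conn = sqlite3.connect(path)
cur = conn.cursor()

# Setting journal_mode returns the mode actually in effect
mode = cur.execute("PRAGMA journal_mode=WAL").fetchone()[0]
cur.execute("PRAGMA synchronous=NORMAL")
sync = cur.execute("PRAGMA synchronous").fetchone()[0]  # NORMAL maps to 1
conn.close()
```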


160-161: LGTM: sensible retry defaults for transient DB errors.

Values look reasonable; please ensure db.py reads these and uses capped backoff.
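
A capped-backoff commit helper using those settings might look like this; the function and parameter names are illustrative, not the PR's actual API:

```python
import time


def commit_with_retry(commit, max_retries=3, retry_delay=0.1, cap=2.0):
    # Retry a failing commit with exponential backoff, capped at `cap` seconds.
    delay = retry_delay
    for _ in range(max_retries):
        try:
            commit()
            return True
        except Exception:
            time.sleep(min(delay, cap))
            delay *= 2
    return False
```

The cap keeps the worst-case wait bounded even if max_retries is raised later in config.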

@pUrGe12 force-pushed the apsw-with-sqlalchemy branch from c26a9c2 to b1b9851 on August 31, 2025 at 02:52

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (10)
nettacker/database/db.py (10)

15-15: Avoid shadowing the stdlib logging module.

Rename variable for clarity and to prevent confusion in imports.

-logging = logger.get_logger()
+log = logger.get_logger()

Update usages accordingly.


4-4: Handle optional APSW dependency (import guard).

Avoid import errors on non-SQLite backends.

-import apsw
+try:
+    import apsw
+except ImportError:
+    apsw = None

46-47: Guard SQLite path when APSW is missing.

Fail fast with a clear message.

 if Config.db.engine.startswith("sqlite"):
+    if apsw is None:
+        raise ImportError("APSW is required for SQLite backend. Install with: pip install apsw")

50-50: Fix busy-timeout units (should be milliseconds).

Multiply by 1000, not 100.

-connection.setbusytimeout(int(config.settings.timeout) * 100)
+# seconds -> milliseconds
+connection.setbusytimeout(int(config.settings.timeout) * 1000)

46-57: Wrap APSW connection setup with error handling.

Prevent unhandled exceptions and log context.

-        DB_PATH = config.db.as_dict()["name"]
-        connection = apsw.Connection(DB_PATH)
-        connection.setbusytimeout(int(config.settings.timeout) * 100)
-        cursor = connection.cursor()
-
-        # Performance enhancing configurations. Put WAL cause that helps with concurrency
-        cursor.execute(f"PRAGMA journal_mode={Config.db.journal_mode}")
-        cursor.execute(f"PRAGMA synchronous={Config.db.synchronous_mode}")
-
-        return connection, cursor
+        try:
+            DB_PATH = config.db.as_dict()["name"]
+            connection = apsw.Connection(DB_PATH)
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)
+            cursor = connection.cursor()
+            # Performance enhancing configurations. WAL helps with concurrency
+            cursor.execute(f"PRAGMA journal_mode={Config.db.journal_mode}")
+            cursor.execute(f"PRAGMA synchronous={Config.db.synchronous_mode}")
+            return connection, cursor
+        except Exception as e:
+            logging.error(f"Failed to create APSW connection: {e}")
+            raise

85-98: APSW commit loop closes connection every iteration and before retries.

This causes failures on subsequent attempts and double-closing. Also hardcoded retries/sleeps.

-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logging.warn(messages("database_connect_fail"))
-        return False
+        connection, cursor = session
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                connection.execute("COMMIT")
+                success = True
+                break
+            except Exception:
+                connection.execute("ROLLBACK")
+                time.sleep(Config.settings.retry_delay)
+        connection.close()
+        if not success:
+            logging.warn(messages("database_connect_fail"))
+        return success

399-447: Remove unnecessary retry loop; unify return type.

SELECTs don’t need retry loops; also return parsed JSON for both backends.

-        try:
-            for _ in range(100):
-                try:
-                    cursor.execute(
-                        """
-                        SELECT event
-                        FROM temp_events
-                        WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
-                        LIMIT 1
-                    """,
-                        (target, module_name, scan_id, event_name),
-                    )
-
-                    row = cursor.fetchone()
-                    cursor.close()
-                    if row:
-                        return json.loads(row[0])
-                    return []
-                except Exception:
-                    logging.warn("Database query failed...")
-                    return []
-        except Exception:
-            logging.warn(messages("database_connect_fail"))
-            return []
-        return []
+        try:
+            cursor.execute(
+                """
+                SELECT event
+                FROM temp_events
+                WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
+                LIMIT 1
+            """,
+                (target, module_name, scan_id, event_name),
+            )
+            row = cursor.fetchone()
+            cursor.close()
+            return json.loads(row[0]) if row else []
+        except Exception:
+            logging.warn(messages("database_connect_fail"))
+            return []
@@
-        try:
-            for _ in range(1, 100):
-                try:
-                    return (
-                        session.query(TempEvents)
-                        .filter(
-                            TempEvents.target == target,
-                            TempEvents.module_name == module_name,
-                            TempEvents.scan_unique_id == scan_id,
-                            TempEvents.event_name == event_name,
-                        )
-                        .first()
-                    )
-                except Exception:
-                    time.sleep(0.1)
-        except Exception:
-            logging.warn(messages("database_connect_fail"))
-            return []
-        return []
+        try:
+            result = (
+                session.query(TempEvents)
+                .filter(
+                    TempEvents.target == target,
+                    TempEvents.module_name == module_name,
+                    TempEvents.scan_unique_id == scan_id,
+                    TempEvents.event_name == event_name,
+                )
+                .first()
+            )
+            return json.loads(result.event) if result else []
+        except Exception:
+            logging.warn(messages("database_connect_fail"))
+            return []

474-492: Avoid redundant JSON load→dump; unify return type.

Return parsed JSON for both backends.

-            if rows:
-                return [json.dumps((json.loads(row[0]))) for row in rows]
+            if rows:
+                return [json.loads(row[0]) for row in rows]
             return []
@@
-        return [
-            row.json_event
+        return [
+            json.loads(row.json_event)
             for row in session.query(HostsLog)

581-592: Add robust handling for missing reports and file I/O errors.

Prevents attribute errors and uncaught IOErrors.

-        if row:
-            filename = row[0]
-            return filename, open(str(filename), "rb").read()
-        else:
-            return structure(status="error", msg="database error!")
+        if row:
+            filename = row[0]
+            try:
+                with open(str(filename), "rb") as fh:
+                    return filename, fh.read()
+            except IOError as e:
+                logging.error(f"Failed to read report file: {e}")
+                return structure(status="error", msg="file read error!")
+        return structure(status="error", msg="database error!")
@@
-        filename = session.query(Report).filter_by(id=id).first().report_path_filename
-
-        return filename, open(str(filename), "rb").read()
+        report = session.query(Report).filter_by(id=id).first()
+        if not report:
+            return structure(status="error", msg="database error!")
+        try:
+            with open(str(report.report_path_filename), "rb") as fh:
+                return report.report_path_filename, fh.read()
+        except IOError as e:
+            logging.error(f"Failed to read report file: {e}")
+            return structure(status="error", msg="file read error!")

659-671: Cursor is closed inside the loop but reused later.

Close after the loop.

-                cursor.close()
                 hosts.append(
                     {
                         "target": target,
                         "info": {
                             "module_name": module_names,
                             "date": latest_date,
                             "events": events,
                         },
                     }
                 )
 
-            return hosts
+            cursor.close()
+            return hosts
🧹 Nitpick comments (6)
nettacker/database/db.py (6)

268-271: Use connection for ROLLBACK to keep transaction handling consistent.

Minor consistency fix; safer if cursor lifecycle changes.

-                            if connection.in_transaction:
-                                cursor.execute("ROLLBACK")
+                            if connection.in_transaction:
+                                connection.execute("ROLLBACK")

318-319: Standardize transaction control to use the connection.

Keep BEGIN/ROLLBACK consistently on the connection.

-                            cursor.execute("BEGIN")
+                            connection.execute("BEGIN")
@@
-                                if connection.in_transaction:
-                                    connection.execute("ROLLBACK")
+                                if connection.in_transaction:
+                                    connection.execute("ROLLBACK")
@@
-                            if connection.in_transaction:
-                                cursor.execute("ROLLBACK")
+                            if connection.in_transaction:
+                                connection.execute("ROLLBACK")

Also applies to: 351-352, 358-359


611-617: DISTINCT with ORDER BY id is ambiguous.

Because id is not in the SELECT DISTINCT list, which row’s id orders each distinct target is undefined. Group by target and order by MAX(id) for a deterministic “latest per target”.

Example:

SELECT target
FROM scan_events
GROUP BY target
ORDER BY MAX(id) DESC
LIMIT 10 OFFSET ?
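A quick sqlite3 sketch of the grouped query; the scan_events schema here is a minimal stand-in for Nettacker’s real table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (id INTEGER PRIMARY KEY, target TEXT)")
conn.executemany(
    "INSERT INTO scan_events (target) VALUES (?)",
    [("a.com",), ("b.com",), ("a.com",), ("c.com",)],
)

# Deterministic "latest per target": group first, then order by MAX(id).
rows = conn.execute(
    """
    SELECT target
    FROM scan_events
    GROUP BY target
    ORDER BY MAX(id) DESC
    LIMIT 10 OFFSET ?
    """,
    (0,),
).fetchall()
print([r[0] for r in rows])  # → ['c.com', 'a.com', 'b.com']
```

a.com was last seen at id 3, so it sorts between c.com (id 4) and b.com (id 2), regardless of which duplicate row DISTINCT would have kept.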

791-801: Return parsed options JSON for both backends.

Keeps API consistent.

-        if rows:
-            return [{"options": row[0]} for row in rows]
+        if rows:
+            return [{"options": json.loads(row[0])} for row in rows]
@@
-        return [
-            {"options": log.options}
+        return [
+            {"options": json.loads(log.options)}
             for log in session.query(Report).filter(Report.scan_unique_id == scan_id).all()
         ]

883-897: Consider JSON-decoding fields for graph building and HTML output.

Verify build_graph expectations; if it expects dicts, decode below.

-                "port": log[4],
-                "event": log[5],
-                "json_event": log[6],
+                "port": json.loads(log[4]),
+                "event": json.loads(log[5]),
+                "json_event": json.loads(log[6]),

If build_graph expects strings, skip this change.


735-740: Add supporting DB indexes for frequent filters.

Improve search/query performance at scale.

  • scan_events: indexes on (scan_unique_id) and (target), plus composite indexes on (target, module_name) and (target, module_name, scan_unique_id); optionally a full-text or JSON index if supported.
  • temp_events: a composite index on (target, module_name, scan_unique_id, event_name).
  • reports: indexes on (scan_unique_id) and (date).

Also applies to: 986-1001, 1010-1014
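As a sanity check, EXPLAIN QUERY PLAN shows SQLite picking one of the suggested composite indexes for a typical filter; the table and column names below mirror the review comment, not the actual migration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE scan_events (
        id INTEGER PRIMARY KEY, target TEXT, module_name TEXT,
        scan_unique_id TEXT, date TEXT)"""
)
conn.execute(
    "CREATE INDEX idx_scan_events_target_module "
    "ON scan_events (target, module_name)"
)

# The plan should report a SEARCH using the composite index,
# not a full-table SCAN.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM scan_events "
    "WHERE target = ? AND module_name = ?",
    ("a.com", "port_scan"),
).fetchall()
detail = " ".join(str(row) for row in plan)
print(detail)  # mentions idx_scan_events_target_module
```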

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c26a9c2 and b1b9851.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/graph.py (2 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (1 hunks)
  • tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • pyproject.toml
  • nettacker/core/utils/common.py
  • tests/database/test_db.py
  • nettacker/core/lib/base.py
  • nettacker/core/graph.py
  • nettacker/core/app.py
  • nettacker/core/module.py
  • nettacker/config.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
tests/database/test_models.py (1)
  • session (11-18)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
🔇 Additional comments (1)
nettacker/database/db.py (1)

275-276: Ack: returning True after exhausting retries is intentional.

Not flagging as an error per your stated design to preserve scan continuity.

Also applies to: 363-364

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tests/core/test_exclude_ports.py (4)

100-105: Same issue: return JSON strings in test_sort_loops.

-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 80, "response": {"conditions_results": {"http": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+    ]

125-130: Same issue: return JSON strings in test_start_unsupported_library.

-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 1234, "response": {"conditions_results": {"unsupported_lib": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps(
+            {"port": 1234, "response": {"conditions_results": {"unsupported_lib": True}}}
+        )
+    ]

185-190: Same issue: return JSON strings in test_sort_loops_behavior.

-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 80, "response": {"conditions_results": {"http": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+    ]

313-319: Same issue: return JSON strings for test_load_appends_port_to_existing_protocol.

-    mock_find_events.return_value = [
-        MagicMock(
-            json_event=json.dumps({"port": 80, "response": {"conditions_results": {"http": {}}}})
-        ),
-        MagicMock(
-            json_event=json.dumps({"port": 443, "response": {"conditions_results": {"http": {}}}})
-        ),
-    ]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": {}}}}),
+        json.dumps({"port": 443, "response": {"conditions_results": {"http": {}}}}),
+    ]
♻️ Duplicate comments (1)
nettacker/database/db.py (1)

19-19: Avoid shadowing imported symbol; use a distinct logger variable.

Rename the local logger instance to avoid shadowing the imported module symbol and improve readability.

-logger = logger.get_logger()
+log = logger.get_logger()

Apply the same rename for all logger usages in this file.

🧹 Nitpick comments (8)
nettacker/database/db.py (6)

244-264: Minor: BEGIN/ROLLBACK use is inconsistent between functions.

submit_logs_to_db uses connection.execute("BEGIN") while submit_temp_logs_to_db uses cursor.execute("BEGIN"). Same for ROLLBACK. Prefer one style for consistency (APSW allows either). Suggest using connection.execute in both places.

-                            cursor.execute("BEGIN")
+                            connection.execute("BEGIN")
@@
-                                cursor.execute("ROLLBACK")
+                                connection.execute("ROLLBACK")

Also applies to: 326-347
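The intent can be illustrated with sqlite3 in autocommit mode (isolation_level=None), which mirrors APSW’s explicit transaction handling; the table is a stand-in:

```python
import sqlite3

# isolation_level=None disables the driver's implicit transactions, so
# BEGIN/COMMIT/ROLLBACK are issued explicitly, and always on the
# connection, as the review suggests for both submit functions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE scan_events (id INTEGER PRIMARY KEY, target TEXT)")

conn.execute("BEGIN")
try:
    conn.execute("INSERT INTO scan_events (target) VALUES (?)", ("a.com",))
    conn.execute("COMMIT")
except sqlite3.Error:
    conn.execute("ROLLBACK")  # one style of rollback, on the connection

count = conn.execute("SELECT COUNT(*) FROM scan_events").fetchone()[0]
print(count)  # → 1
```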


424-451: Return type inconsistency in find_temp_events (APSW dict vs ORM object).

APSW branch returns a dict (json.loads), ORM branch returns an ORM row (or None). Downstream code can easily break on mixed types. Choose a single contract (e.g., JSON string or parsed dict) and make both branches conform; update tests accordingly.

Example (return JSON string in both branches):

-            if row:
-                return json.loads(row[0])
+            if row:
+                return row[0]  # raw JSON string
@@
-                    return (
-                        session.query(TempEvents)
-                        .filter(...)
-                        .first()
-                    )
+                    result = session.query(TempEvents).filter(...).first()
+                    return result.event if result else []

If the intended contract is dicts, invert accordingly in both branches.


589-607: Unify error return shape in get_scan_result.

Docstring says this returns result content or an error in JSON type. On file I/O errors you currently return None; consider returning structure(status="error", msg="file read error!") for consistency across branches.

-            except IOError as e:
-                logger.error(f"Failed to read report file: {e}")
-                return None
+            except IOError as e:
+                logger.error(f"Failed to read report file: {e}")
+                return structure(status="error", msg="file read error!")
@@
-        except IOError as e:
-            logger.error(f"Failed to read report file: {e}")
-            return None
+        except IOError as e:
+            logger.error(f"Failed to read report file: {e}")
+            return structure(status="error", msg="file read error!")

626-635: DISTINCT with ORDER BY a non-selected column is fragile.

SQLite allows ORDER BY columns not in the SELECT list, but it’s ambiguous for DISTINCT. To show the latest per target deterministically, order by MAX(id) per target.

-            cursor.execute(
-                """
-                SELECT DISTINCT target 
-                FROM scan_events
-                ORDER BY id DESC 
-                LIMIT 10 OFFSET ?
-                """,
-                [(page - 1) * 10],
-            )
+            cursor.execute(
+                """
+                SELECT target
+                FROM (
+                    SELECT target, MAX(id) AS max_id
+                    FROM scan_events
+                    GROUP BY target
+                )
+                ORDER BY max_id DESC
+                LIMIT 10 OFFSET ?
+                """,
+                [(page - 1) * 10],
+            )

773-784: json_event type mismatch between APSW and ORM paths.

APSW branch returns dicts for json_event; ORM branch returns raw strings. Normalize ORM path to return dicts for parity (or switch APSW to strings, but keep it consistent).

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},

1000-1059: GROUP BY without aggregating non-grouped columns can yield nondeterministic values.

In the APSW search, date is selected but not grouped/aggregated. If determinism matters, consider selecting MAX(date) or removing date from SELECT and appending via another query.

tests/database/test_db.py (2)

706-714: Clarify find_temp_events return type in ORM path.

This test asserts the ORM object is returned. APSW path returns a dict. Consider unifying the contract (string or dict) across both paths and updating tests to match.


871-894: select_reports SQLite exception test mocks the wrong method.

You set mock_cursor.query.side_effect, but the implementation uses cursor.execute. To meaningfully exercise the error path, set execute.side_effect to raise.

-        mock_cursor.query.side_effect = Exception("DB Error")
+        mock_cursor.execute.side_effect = Exception("DB Error")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b1b9851 and 6a9c6cb.

📒 Files selected for processing (5)
  • nettacker/config.py (3 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (2 hunks)
  • tests/core/test_exclude_ports.py (1 hunks)
  • tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pyproject.toml
  • nettacker/config.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (2)
tests/database/test_db.py (3)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (22-39)
  • create_connection (42-81)
  • send_submit_query (84-123)
  • submit_report_to_db (126-173)
  • remove_old_logs (176-223)
  • submit_logs_to_db (226-305)
  • submit_temp_logs_to_db (308-394)
  • find_temp_events (397-452)
  • find_events (455-498)
  • select_reports (501-563)
  • get_scan_result (566-607)
  • last_host_logs (610-730)
  • get_logs_by_scan_id (733-784)
  • get_options_by_scan_id (787-815)
  • logs_to_report_json (818-870)
  • logs_to_report_html (873-979)
  • search_logs (982-1138)
nettacker/config.py (1)
  • as_dict (26-27)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
🪛 GitHub Actions: CI/CD
tests/core/test_exclude_ports.py

[error] 70-70: Test 'test_load_with_service_discovery' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 108-108: Test 'test_sort_loops' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 135-135: Test 'test_start_unsupported_library' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 193-193: Test 'test_sort_loops_behavior' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 324-324: Test 'test_load_appends_port_to_existing_protocol' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().

🔇 Additional comments (3)
nettacker/database/db.py (2)

55-59: Busy-timeout unit may be incorrect; confirm seconds→milliseconds conversion.

APSW’s setbusytimeout expects milliseconds. If Config.settings.timeout is in seconds, multiply by 1000 (not 100). Current tests assert 3000 for 30s, which suggests a unit mismatch. Please verify the intended unit and adjust code and/or tests accordingly.

-connection.setbusytimeout(int(config.settings.timeout) * 100)
+# Convert seconds to milliseconds for APSW busy timeout
+connection.setbusytimeout(int(config.settings.timeout) * 1000)

Also applies to: 61-65
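A minimal illustration of the unit bug; the helper name and the string-typed config value are assumptions for the sketch, not Nettacker’s API:

```python
def busy_timeout_ms(timeout_seconds):
    """APSW's setbusytimeout() expects milliseconds, so convert from seconds."""
    return int(timeout_seconds) * 1000

# Multiplying by 100 yields 3000 ms for a 30 s timeout, a tenth of
# what was intended; * 1000 gives the correct value.
print(busy_timeout_ms(30))    # → 30000
print(busy_timeout_ms("30"))  # config values parsed from files are often strings
```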


483-484: LGTM: normalized JSON string output for find_events (APSW).

Keeping the return type as JSON strings aligns with callers using json.loads on each item.

tests/database/test_db.py (1)

124-131: Confirm busy-timeout expectation in tests.

The test expects setbusytimeout(3000) for a 30 second timeout. If timeout is seconds, APSW needs milliseconds (30000). Please confirm intended units and align code or test accordingly.

@pUrGe12 pUrGe12 requested a review from arkid15r September 2, 2025 13:04
@pUrGe12 pUrGe12 force-pushed the apsw-with-sqlalchemy branch from 37cfb4b to 4cd2bc1 Compare September 3, 2025 04:14
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (4)
nettacker/database/db.py (4)

742-784: json_event type differs across backends.

SQLite branch returns parsed dict; ORM branch returns raw string. Pick one format (prefer dicts here) for callers’ sanity.

Apply:

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},

56-66: Busy-timeout unit conversion is wrong (should be ms).

APSW expects milliseconds; multiplying by 100 undercuts the intended timeout by 10x. Use 1000 and keep the try/except.

Apply:

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # Convert seconds to milliseconds for APSW busy timeout
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

96-109: APSW commit retry closes connection every attempt; breaks retries and leaks cursor.

Connection is closed in finally inside the loop; cursor is never closed here. Move cleanup to a single outer finally, add rollback between attempts, and use configured retry params.

Apply:

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

111-123: ORM commit retry lacks rollback and uses hardcoded policy.

Add rollback, use Config.settings.{max_retries,retry_delay}, and return success consistently.

Apply:

-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
🧹 Nitpick comments (6)
nettacker/core/app.py (1)

159-163: Harden subdomain_scan event parsing to be backend-agnostic and key-safe

json.loads(row) assumes row is a JSON string. With dual backends (APSW/ORM), find_events(...) may yield a dict or an ORM row exposing .json_event. Also, direct indexing into ["response"]["conditions_results"]["content"] can raise on missing keys. Parse defensively and tolerate string/dict/object rows.

Apply this diff within the loop:

-                for row in find_events(target, "subdomain_scan", scan_id):
-                    for sub_domain in json.loads(row)["response"]["conditions_results"]["content"]:
+                for row in find_events(target, "subdomain_scan", scan_id):
+                    event = self._event_to_dict(row)
+                    content = (
+                        event.get("response", {})
+                        .get("conditions_results", {})
+                        .get("content", [])
+                    )
+                    if not isinstance(content, (list, tuple)):
+                        continue
+                    for sub_domain in content:

Add this helper (inside the class or module) to normalize rows:

@staticmethod
def _event_to_dict(row):
    if isinstance(row, dict):
        return row
    # ORM row with `json_event`
    if hasattr(row, "json_event") and isinstance(row.json_event, (str, bytes)):
        try:
            return json.loads(row.json_event)
        except Exception:
            return {}
    # Raw JSON string from APSW
    if isinstance(row, (str, bytes)):
        try:
            return json.loads(row)
        except Exception:
            return {}
    return {}

Verification suggestion:

  • Confirm find_events return type(s) for sqlite/mysql/postgres paths to ensure the helper covers all cases, and that subdomain events consistently populate response.conditions_results.content.
nettacker/database/db.py (5)

46-49: Docstring inaccurate about returns.

For SQLite you return (connection, cursor); for others you return a Session, never False. Update to reflect actual behavior.


799-815: Consider returning parsed options not raw JSON.

Minor, but most functions parse JSON fields before returning.

Apply:

-            return [{"options": row[0]} for row in rows]
+            return [{"options": json.loads(row[0])} for row in rows]

982-1018: Potential DISTINCT/ORDER BY mismatch in search_logs targets query.

Same pattern as last_host_logs; switch to subquery with MAX(id) ordering to avoid undefined behavior.

Apply:

-            cursor.execute(
-                """
-                SELECT DISTINCT target FROM scan_events
-                WHERE target LIKE ? OR date LIKE ? OR module_name LIKE ?
-                OR port LIKE ? OR event LIKE ? OR scan_unique_id LIKE ?
-                ORDER BY id DESC
-                LIMIT 10 OFFSET ?
-                """,
-                (
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    (page * 10) - 10,
-                ),
-            )
+            cursor.execute(
+                """
+                WITH hits AS (
+                  SELECT id, target
+                  FROM scan_events
+                  WHERE target LIKE ? OR date LIKE ? OR module_name LIKE ?
+                        OR port LIKE ? OR event LIKE ? OR scan_unique_id LIKE ?
+                ),
+                grouped AS (
+                  SELECT target, MAX(id) AS max_id
+                  FROM hits
+                  GROUP BY target
+                )
+                SELECT target FROM grouped
+                ORDER BY max_id DESC
+                LIMIT 10 OFFSET ?
+                """,
+                (*(f"%{query}%",)*6, (page * 10) - 10),
+            )

268-276: Minor: BusyError branch string check is unnecessary.

apsw.BusyError already indicates busy/locked; the message check is redundant. Handle BusyError uniformly and let other exceptions fall through.

Apply (example for one function; mirror in the other):

-                    except apsw.BusyError as e:
-                        if "database is locked" in str(e).lower():
+                    except apsw.BusyError:
                             logger.warn(
                                 f"[Retry {_ + 1}/{Config.settings.max_retries}] Database is locked. Retrying..."
                             )
                             if connection.in_transaction:
                                 connection.execute("ROLLBACK")
                             time.sleep(Config.settings.retry_delay)
                             continue
-                        else:
-                            if connection.in_transaction:
-                                connection.execute("ROLLBACK")
-                            return False

Also applies to: 347-365


688-691: Prefer logger.warning over warn (deprecated in stdlib).

If your logger mirrors stdlib, warning() is preferred. If warn() is intentional in your wrapper, ignore.

Also applies to: 1063-1065, 1134-1135

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 37cfb4b and 4cd2bc1.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/graph.py (2 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (2 hunks)
  • tests/core/test_exclude_ports.py (5 hunks)
  • tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • nettacker/core/module.py
  • tests/database/test_db.py
  • pyproject.toml
  • nettacker/core/utils/common.py
  • nettacker/core/lib/base.py
  • nettacker/core/graph.py
  • nettacker/config.py
  • tests/core/test_exclude_ports.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: CodeQL (python)
  • GitHub Check: CodeQL (javascript)
  • GitHub Check: Run tests
🔇 Additional comments (4)
nettacker/database/db.py (4)

4-8: Good: Optional APSW import guard.

Prevents hard failures when non-SQLite engines are used.


471-488: find_events APSW/ORM return strings consistently.

Good consistency; APSW normalizes with dumps(loads(.)) to match ORM string output.


137-173: Nice: Parameterized SQL and consistent JSON handling.

Inserts/Deletes and report selection use parameter binding and parse JSON fields consistently. Good.

Also applies to: 187-224, 513-563


625-634: Discard DISTINCT/ORDER BY compatibility concern
SQLite permits ordering by non-projected columns in a simple SELECT with DISTINCT, so SELECT DISTINCT target … ORDER BY id DESC is valid. (www2.sqlite.org, sqlite.org)

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (2)
nettacker/database/db.py (2)

111-122: SQLAlchemy commit retry lacks rollback and uses hardcoded policy

Add rollback between attempts and use configured retry count/delay. This was raised earlier; reapplying here.

-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

96-109: Critical: COMMIT retry loop closes the connection on every attempt

Closing in finally inside the loop guarantees subsequent retries operate on a closed connection and also leaks the cursor. Move cleanup outside the loop; add rollback and configurable retry/backoff. Keep a finally (per tests).

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
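The retry-with-rollback shape suggested in these diffs looks roughly like this when sketched against the stdlib sqlite3 driver (`MAX_RETRIES`/`RETRY_DELAY` stand in for the `Config.settings` values; not the actual Nettacker implementation):

```python
import sqlite3
import time

MAX_RETRIES = 5    # stand-in for Config.settings.max_retries
RETRY_DELAY = 0.1  # stand-in for Config.settings.retry_delay


def commit_with_retry(conn):
    """Retry the commit, rolling back between attempts.

    Closing the connection is the caller's job and happens exactly once,
    after the loop - never inside it.
    """
    for _ in range(MAX_RETRIES):
        try:
            conn.commit()
            return True
        except sqlite3.OperationalError:
            conn.rollback()
            time.sleep(RETRY_DELAY)
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
ok = commit_with_retry(conn)
conn.close()  # single close, outside the retry loop
print(ok)  # True on the first attempt here
```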
🧹 Nitpick comments (4)
nettacker/database/db.py (4)

55-63: Busy-timeout unit mismatch (seconds→milliseconds)

APSW expects milliseconds. Multiplying by 100 likely under-waits. Use 1000.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # Convert seconds to milliseconds
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)
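The milliseconds unit is easy to confirm via SQLite's `busy_timeout` pragma (shown with the stdlib sqlite3 module, since apsw may not be installed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA busy_timeout = 3000")  # value is in milliseconds
ms = conn.execute("PRAGMA busy_timeout").fetchone()[0]
print(ms)  # 3000 -> a 3-second wait on a locked database
conn.close()
```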

285-287: Typo in log message ("retires" → "retries")

Minor but user-facing.

-                # All retires exhausted but we want to continue operation
+                # All retries exhausted but we want to continue operation

329-336: Use connection.execute('BEGIN') for consistency

Elsewhere you use connection-level BEGIN/ROLLBACK. Align here too.

-                        if not connection.in_transaction:
-                            cursor.execute("BEGIN")
+                        if not connection.in_transaction:
+                            connection.execute("BEGIN")

770-782: Unify json_event type across backends (get_logs_by_scan_id)

ORM branch returns a raw JSON string while APSW returns a dict. Prefer consistent dicts here (you already do so in logs_to_report_json).

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4cd2bc1 and 776d6e7.

📒 Files selected for processing (1)
  • nettacker/database/db.py (17 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL (javascript)
  • GitHub Check: CodeQL (python)
🔇 Additional comments (2)
nettacker/database/db.py (2)

4-8: APSW import guard — good change

Optional dependency handled cleanly; avoids hard-fail on non-SQLite backends.


884-977: Minor: parsing vs raw in HTML report

You pass raw text fields (as intended for HTML). No changes needed beyond connection close handled above.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (8)
docs/Usage.md (8)

635-641: Tighten "Database" section wording and list formatting

  • Remove the comma after “Nettacker” and avoid blank lines breaking the list.
  • Keep database names consistent with later sections.
-OWASP Nettacker, currently supports three databases:
-
-- SQLite
-- MySQL
-- PostgreSQL
+OWASP Nettacker currently supports three databases:
+
+- SQLite
+- MySQL
+- PostgreSQL

649-658: Use fenced code block with language; avoid indented block

Switch to fenced code with language for better readability and lint compliance (MD046).

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

659-659: Polish grammar and add minor durability caveat

  • Grammar: “according to need” -> “according to your needs”; “use SQLite database” -> “use the SQLite database”.
  • Optional note: NORMAL improves performance but reduces durability on sudden power loss vs FULL.
-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. Feel free to adjust them according to your needs. To use the SQLite database, ensure that `engine` is set to `sqlite` and `name` points to your database file. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations (note: `synchronous=NORMAL` trades some durability vs `FULL`).

661-661: Remove trailing colon from heading

Complies with MD026.

-## MySQL configuration:
+## MySQL configuration

666-675: Add language to fenced code; remove SQLite-only fields from MySQL example

  • Add “python” on the fence (MD040).
  • Drop journal_mode and synchronous_mode to avoid implying they matter for MySQL. Keep ssl_mode only if the code actually honors it for MySQL.
-```
-    engine = "mysql"
-    name = "nettacker"
-    host = "localhost"
-    port = 3306
-    username = "root"
-    password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
-```
+```python
+engine = "mysql"
+name = "nettacker"
+host = "localhost"
+port = 3306
+username = "root"
+password = "some-password"
+# ssl_mode: use if supported by your setup (e.g., "require")
+```

Also update the note just below to reflect actual support for ssl_mode on MySQL in this codebase.

Would you like me to scan the repository to confirm which fields are honored for MySQL before we finalize the doc text?

Also applies to: 677-677


681-681: Use consistent product name in heading

Align with earlier “PostgreSQL”.

-## Postgres Configuration
+## PostgreSQL configuration

650-650: Clarify path creation for default SQLite location

If .nettacker/data/ isn’t auto-created, add a note to create it; if it is auto-created by the app, state that explicitly to prevent user confusion.

You can confirm by grepping for mkdir or Path(...).mkdir in database initialization code.


697-697: SSL guidance consistency

You recommend ssl_mode="require" for PostgreSQL. If MySQL also supports SSL in this codebase, mirror the guidance in the MySQL section for consistency.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 776d6e7 and a92c6fd.

📒 Files selected for processing (2)
  • docs/Usage.md (1 hunks)
  • nettacker/config.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/config.py
🧰 Additional context used
🪛 LanguageTool
docs/Usage.md

[grammar] ~638-~638: There might be a mistake here.
Context: ...ntly supports three databases: - SQLite - MySQL - PostgreSQL The default database...

(QB_NEW_EN)


[grammar] ~639-~639: There might be a mistake here.
Context: ...ports three databases: - SQLite - MySQL - PostgreSQL The default database is SQLi...

(QB_NEW_EN)


[grammar] ~646-~646: Ensure spelling is correct
Context: ...r liking. ## SQLite configuration The configruations are for a SQLite wrapper called *APSW...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~659-~659: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...

(FEEL_FREE_TO_STYLE_ME)


[grammar] ~659-~659: There might be a mistake here.
Context: ...d and change them according to need. To use SQLite database, ensure that the `engin...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

649-649: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


661-661: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


665-665: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


683-683: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test Docker image

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/Usage.md (1)

681-691: Add language to the PostgreSQL fenced block (keep engine = "postgres").

Same as earlier feedback: add a language tag; keep engine identifier as "postgres" to match the codebase/tests.

-```
+```python
     engine = "postgres"
     name = "nettacker"
     host = "localhost"
     port = 5432
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

🧹 Nitpick comments (6)
docs/Usage.md (6)

634-641: Tighten wording and fix minor grammar (remove comma after “Nettacker”, avoid “db”)

  • Remove the comma after “OWASP Nettacker”.
  • Prefer “database” over “db” and simplify phrasing.

-OWASP Nettacker, currently supports three databases:
+OWASP Nettacker currently supports three databases:
...
-The default database is SQLite. You can, however, configure the db to your liking.
+The default database is SQLite. You can configure the database to your needs.

642-656: Use a fenced code block with a language tag (MD046/MD040).

The SQLite snippet is an indented block; convert to a fenced block with a language for better rendering and lint compliance.

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

659-659: Remove trailing colon in heading (MD026).

Trailing punctuation in headings is flagged by markdownlint; drop the colon.

-## MySQL configuration:
+## MySQL configuration

663-673: Add language to the fenced MySQL block (MD040).

Specify the language to improve syntax highlighting and satisfy linters.

-```
+```python
     engine = "mysql"
     name = "nettacker"
     host = "localhost"
     port = 3306
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

693-695: Minor grammar nits (comma after introductory phrase; streamline note)

Add a comma after “In this case,” and simplify the note sentence.

-In this case the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update/change/remove them.
+In this case, the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update, change, or remove them.
-**Note**: If you want encryption, then set `ssl_mode` to `require`.
+**Note**: For encryption, set `ssl_mode` to `require`.

642-695: Apply markdownlint fixes for MD026/MD040/MD046 in docs/Usage.md

  • Remove the trailing colon from the “## MySQL configuration:” header (line 659) to satisfy MD026.
  • Add explicit language identifiers (e.g. ini, python, or bash) to all fences in this file to satisfy MD040.
  • Convert any remaining indented (4-space) code examples to fenced code blocks to satisfy MD046.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between a92c6fd and cb79437.

📒 Files selected for processing (1)
  • docs/Usage.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
🪛 LanguageTool
docs/Usage.md

[grammar] ~636-~636: There might be a mistake here.
Context: ...ntly supports three databases: - SQLite - MySQL - PostgreSQL The default database...

(QB_NEW_EN)


[grammar] ~637-~637: There might be a mistake here.
Context: ...ports three databases: - SQLite - MySQL - PostgreSQL The default database is SQLi...

(QB_NEW_EN)


[style] ~657-~657: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...

(FEEL_FREE_TO_STYLE_ME)


[grammar] ~657-~657: There might be a mistake here.
Context: ...d and change them according to need. To use SQLite database, ensure that the `engin...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

647-647: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


659-659: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


663-663: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


681-681: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (10)
docs/Usage.md (2)

683-693: Add language to fenced code block (MD040)

Specify a language for the Postgres config example.

-```
+```ini
     engine = "postgres"
     name = "nettacker"
     host = "localhost"
     port = 5432
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

657-660: Tighten tone and grammar; “SQLite” capitalization; minor wording

Reword for concision and formality. Also “SQLite” and “lightweight”.

-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. Adjust as needed. To use the SQLite database, set `engine` to `sqlite` and `name` to your database path. `journal_mode` and `synchronous_mode` are tuned for multithreaded I/O.

Optional:

-> Note: You can choose to use a lite wrapper for Sqlite called APSW by setting the `use_apsw_for_sqlite` parameter inside config to True for performance enhancements.
+> Note: You can enable the lightweight SQLite wrapper APSW by setting `use_apsw_for_sqlite` to `True` in the config for performance gains.
nettacker/database/db.py (8)

56-61: APSW busy-timeout uses 100 ms-per-second; should be 1000

setbusytimeout expects milliseconds. Convert seconds → ms.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # Convert seconds to milliseconds
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

100-113: Critical: connection closed inside retry loop; retries operate on a closed handle

Move close(s) out of the loop, add rollback between attempts, and use configurable retry params.

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
@@
-    else:
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+    else:
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

Also applies to: 115-126


147-168: Close APSW connection as well in finally (avoid leak on both success/error paths)

Currently only the cursor is closed here.

-        finally:
-            cursor.close()
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()
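The nested try/finally suggested here guarantees the connection closes even when `cursor.close()` raises; a toy demonstration with stub objects (the `Handle` class is purely illustrative):

```python
class Handle:
    """Stub standing in for an APSW cursor or connection."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False

    def close(self):
        if self.fail:
            raise RuntimeError("close failed")
        self.closed = True


cursor, connection = Handle(fail=True), Handle()
try:
    try:
        cursor.close()      # raises
    finally:
        connection.close()  # still runs despite the cursor failure
except RuntimeError:
    pass

print(connection.closed)  # True
```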

195-219: Close APSW connection in remove_old_logs finally block

Mirror the fix from submit_report_to_db.

-        finally:
-            cursor.close()
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()

419-419: Contract regression: find_temp_events should return a JSON string; also close connection; fix ORM branch

Align with callers and with find_events; update docstring; close APSW connection.

-        an array with JSON events or an empty array
+        a JSON string or an empty array
@@
-            row = cursor.fetchone()
-            cursor.close()
-            if row:
-                return json.loads(row[0])
+            row = cursor.fetchone()
+            cursor.close()
+            connection.close()
+            if row:
+                return row[0]
             return []
@@
-        return (
-            session.query(TempEvents)
-            .filter(
-                TempEvents.target == target,
-                TempEvents.module_name == module_name,
-                TempEvents.scan_unique_id == scan_id,
-                TempEvents.event_name == event_name,
-            )
-            .first()
-        )
+        result = (
+            session.query(TempEvents)
+            .filter(
+                TempEvents.target == target,
+                TempEvents.module_name == module_name,
+                TempEvents.scan_unique_id == scan_id,
+                TempEvents.event_name == event_name,
+            )
+            .first()
+        )
+        return result.event if result else []

Also applies to: 435-439, 445-454


482-486: Close APSW connection after reads (find_events)

Prevent leaked connections on read-path success and error.

-            rows = cursor.fetchall()
-            cursor.close()
+            rows = cursor.fetchall()
+            cursor.close()
+            connection.close()

638-640: Close APSW connection in last_host_logs on both early-return and success paths

Avoid leaking the connection.

-            if not targets:
-                return structure(status="finished", msg="No more search results")
+            if not targets:
+                cursor.close()
+                connection.close()
+                return structure(status="finished", msg="No more search results")
@@
-            cursor.close()
+            cursor.close()
+            connection.close()
             return hosts

Also applies to: 687-689


534-545: Close APSW connection after select_reports fetch

Add connection.close() after cursor.close().

-            cursor.close()
+            cursor.close()
+            connection.close()
             for row in rows:
🧹 Nitpick comments (4)
docs/Usage.md (3)

647-656: Fix MD046: switch indented block to fenced code block with a language

Use a fenced block for the SQLite snippet to satisfy markdownlint and improve readability.

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```ini
+engine = "sqlite"
+name = ".nettacker/data/nettacker.db"
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

661-661: Remove trailing colon from heading (MD026)

Drop the colon at the end of the MySQL heading.

-## MySQL configuration:
+## MySQL configuration

665-675: Add language to fenced code block (MD040)

Specify a language for the MySQL config example.

-```
+```ini
     engine = "mysql"
     name = "nettacker"
     host = "localhost"
     port = 3306
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

nettacker/database/db.py (1)

589-599: Use logger.exception for failures; also close connection in APSW branch

Prefer exception-level logging to capture tracebacks; also close APSW connection after cursor.close().


-        cursor.close()
+        cursor.close()
+        connection.close()
@@
-            except IOError as e:
-                logger.error(f"Failed to read report file: {e}")
+            except IOError:
+                logger.exception("Failed to read report file")
                 return None
@@
-        except IOError as e:
-            logger.error(f"Failed to read report file: {e}")
+        except IOError:
+            logger.exception("Failed to read report file")
             return None

And earlier:

-            logger.error(f"Failed to create APSW connection: {e}")
+            logger.exception("Failed to create APSW connection")

Also applies to: 605-610, 68-69

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e03522 and 92e0404.

📒 Files selected for processing (3)
  • docs/Usage.md (1 hunks)
  • nettacker/config.py (4 hunks)
  • nettacker/database/db.py (17 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/config.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (194-198)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

647-647: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


661-661: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


665-665: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


683-683: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.12.2)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


213-213: Do not catch blind exception: Exception

(BLE001)


282-282: Do not catch blind exception: Exception

(BLE001)


286-287: try-except-pass detected, consider logging the exception

(S110)


286-286: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


439-439: Consider moving this statement to an else block

(TRY300)


440-440: Do not catch blind exception: Exception

(BLE001)


486-486: Consider moving this statement to an else block

(TRY300)


487-487: Do not catch blind exception: Exception

(BLE001)


544-544: Consider moving this statement to an else block

(TRY300)


546-546: Do not catch blind exception: Exception

(BLE001)


563-563: Do not catch blind exception: Exception

(BLE001)


596-596: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


608-608: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


688-688: Consider moving this statement to an else block

(TRY300)


690-690: Do not catch blind exception: Exception

(BLE001)


1065-1065: Do not catch blind exception: Exception

(BLE001)


1136-1136: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image
  • GitHub Check: Build package

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/Usage.md (1)

673-673: Make tone concise and add the missing article.

Previous feedback still applies; rephrase to a formal tone.

-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. To use the SQLite database, set `engine` to `sqlite` and `name` to your database path. `journal_mode` and `synchronous_mode` are selected for multithreaded I/O.
🧹 Nitpick comments (6)
docs/Usage.md (6)

650-657: Tighten grammar and tone in DB intro.

Remove the comma after “Nettacker” and avoid “db” slang; keep it concise.

-OWASP Nettacker, currently supports three databases:
+OWASP Nettacker currently supports three databases:
@@
-The default database is SQLite. You can, however, configure the db to your liking.
+The default database is SQLite. You can configure the database to your needs.

660-672: Fix MD046/MD040: use fenced code block with language for SQLite sample.

Convert the indented block to a fenced block and annotate with a language (use ini for consistency with other samples).

-The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found inside `nettacker/config.py` file under the `DBConfig` class. 
-
-
-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found in `nettacker/config.py` under the `DBConfig` class.
+
+```ini
+engine = "sqlite"
+name = ".nettacker/data/nettacker.db"
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```
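As a quick sanity check of these two SQLite settings, the sketch below uses Python's stdlib `sqlite3` rather than APSW so it runs anywhere (the paths and variable names are illustrative, not from the PR). Note that WAL mode requires a file-backed database; in-memory databases report `journal_mode` as `memory`.

```python
import os
import sqlite3
import tempfile

# WAL requires a file-backed database; ":memory:" would ignore it.
path = os.path.join(tempfile.mkdtemp(), "nettacker_demo.db")
conn = sqlite3.connect(path)

# PRAGMA journal_mode returns the mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# PRAGMA synchronous reads back as an integer; NORMAL == 1.
conn.execute("PRAGMA synchronous=NORMAL")
sync = conn.execute("PRAGMA synchronous").fetchone()[0]
conn.close()
```

Reading the PRAGMAs back after setting them is a cheap way to catch a silently ignored setting (for example, requesting WAL on an in-memory database).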

678-678: Remove trailing colon in heading (MD026).

-## MySQL configuration:
+## MySQL configuration

681-691: Add language to fenced block (MD040) and drop irrelevant fields to avoid confusion.

Keep the sample minimal; you already note irrelevant fields aren't required.

-```
+```ini
     engine = "mysql"
     name = "nettacker"
     host = "localhost"
     port = 3306
     username = "root"
     password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"

699-709: Add language to fenced block (MD040) and retain engine string "postgres" per codebase; trim irrelevant fields.

Align with earlier guidance to keep engine "postgres" and remove SQLite-only knobs from the sample.

-```
+```ini
     engine = "postgres"
     name = "nettacker"
     host = "localhost"
     port = 5432
     username = "root"
     password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"

711-714: Clarify which fields are ignored and tighten SSL note; optionally align heading to "PostgreSQL".

Be explicit and consistent.

-In this case the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update/change/remove them.
-
-**Note**: If you want encryption, then set `ssl_mode` to `require`.
+In this backend, `journal_mode` and `synchronous_mode` are ignored.
+
+Note: To require TLS, set `ssl_mode = "require"` (if supported by your driver/environment).

Optional: Rename the section heading at Line 695 to "## PostgreSQL configuration" for consistency with common naming.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92e0404 and 6d91f9d.

📒 Files selected for processing (2)
  • docs/Usage.md (1 hunks)
  • nettacker/core/graph.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/core/graph.py
🧰 Additional context used
📓 Path-based instructions (1)
docs/**

📄 CodeRabbit inference engine (AGENTS.md)

Place documentation under docs/

Files:

  • docs/Usage.md
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

663-663: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


677-677: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


681-681: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


699-699: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Build package
  • GitHub Check: Test Docker image
🔇 Additional comments (1)
docs/Usage.md (1)

675-675: Capitalize SQLite and tighten APSW note — flag verified.

Replace the docs line with:

Note: To use APSW for SQLite, set use_apsw_for_sqlite = True in the config for potential performance gains.

Flag defined: nettacker/config.py:151-153. Flag checked: nettacker/database/db.py:51-53.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nettacker/core/lib/base.py (1)

49-58: Avoid infinite wait; add safe JSON handling

Polling without a timeout risks hanging if the event never arrives; also JSON parsing can throw. Add a max-wait (or attempt cap) and handle decode/KeyError gracefully before appending.

-            while True:
-                event = find_temp_events(target, module_name, scan_id, event_name)
-                if event:
-                    events.append(json.loads(event)["response"]["conditions_results"])
-                    break
-                time.sleep(0.1)
+            max_attempts = 300  # ~30s at 0.1s intervals
+            attempts = 0
+            while attempts < max_attempts:
+                event = find_temp_events(target, module_name, scan_id, event_name)
+                if event:
+                    try:
+                        payload = json.loads(event)
+                        events.append(payload.get("response", {}).get("conditions_results", []))
+                        break
+                    except (ValueError, TypeError, KeyError):
+                        # malformed/partial event; keep waiting briefly
+                        pass
+                time.sleep(0.1)
+                attempts += 1
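The bounded-wait pattern above can be factored into a small reusable helper. This is a sketch under illustrative names (`poll_until`, `fake_find_event` are not from the PR): it calls a fetch function until it returns a truthy value or the attempt budget runs out, instead of looping forever.

```python
import time


def poll_until(fetch, max_attempts=300, interval=0.1):
    """Call fetch() until it returns a truthy value or attempts run out.

    Returns the fetched value, or None on timeout.
    """
    for _ in range(max_attempts):
        value = fetch()
        if value:
            return value
        time.sleep(interval)
    return None


# Simulate an event that only becomes available on the third poll.
calls = {"n": 0}


def fake_find_event():
    calls["n"] += 1
    if calls["n"] >= 3:
        return '{"response": {"conditions_results": [true]}}'
    return None


result = poll_until(fake_find_event, max_attempts=10, interval=0)
```

Keeping the timeout policy in one helper also makes it easy to drive the attempt cap and interval from configuration rather than hardcoding them at each call site.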
♻️ Duplicate comments (5)
nettacker/database/db.py (5)

56-60: Use milliseconds for APSW busy timeout

APSW’s setbusytimeout expects milliseconds; multiplying by 100 under-waits by 10x for second-based configs.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # seconds -> milliseconds
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)
APSW setbusytimeout units (ms vs seconds)?
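To make the unit mismatch concrete: per the APSW documentation, `Connection.setbusytimeout` takes milliseconds, while a seconds-based config value multiplied by 100 yields a wait ten times shorter than intended. A minimal sketch of the conversion (the helper name is illustrative):

```python
def busy_timeout_ms(timeout_seconds):
    # APSW's Connection.setbusytimeout expects milliseconds,
    # so a seconds-based config value must be scaled by 1000.
    return int(timeout_seconds) * 1000


correct = busy_timeout_ms(3)  # 3000 ms, i.e. the intended 3 seconds
buggy = int(3) * 100          # 300 ms: 10x shorter than intended
```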

100-114: APSW commit retry closes connection inside the loop (breaks retries) and can leak cursor

Closing per-iteration invalidates subsequent attempts. Close once after the loop; keep the finally for tests but outside the retry. Also use configured retry/delay.

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

642-644: Close APSW cursor/connection on early return

This path leaks resources when no targets are found.

-            if not targets:
-                return structure(status="finished", msg="No more search results")
+            if not targets:
+                cursor.close()
+                connection.close()
+                return structure(status="finished", msg="No more search results")

115-126: ORM commit retry lacks rollback and uses hardcoded policy

Add rollback between attempts and read retry config.

-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                return True
+            except Exception:
+                try:
+                    session.rollback()
+                except Exception:
+                    pass
+                time.sleep(Config.settings.retry_delay)
+        logger.warn(messages("database_connect_fail"))
+        return False

780-792: Unify json_event type across backends (parse to dict in ORM path)

APSW branch returns parsed dicts; ORM path returns raw string. Parse for consistency.

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
🧹 Nitpick comments (3)
report.html (1)

1-1: Make the report template valid HTML and easier to populate

This one-line blob is invalid HTML and hard to maintain. Replace with a minimal, valid skeleton with placeholders the renderer fills.

-<table><graph_html>/*css*/</table>datetargetmodule_nameportlogsjson_event<tr>nowx</tr></table><div id="json_length">1</div><p class="footer">Software Details: OWASP Nettacker version 1.0 [beta] in now ScanID: scan-id</p><script>/*js*/</script>
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <style>/*css*/</style>
+    <title>Nettacker Report</title>
+  </head>
+  <body>
+    <div id="graph"><graph_html></graph_html></div>
+    <table id="events">
+      <thead><tr><th>date</th><th>target</th><th>module_name</th><th>port</th><th>logs</th><th>json_event</th></tr></thead>
+      <tbody><!-- rows injected here --></tbody>
+    </table>
+    <div id="json_length">1</div>
+    <p class="footer">Software Details: OWASP Nettacker version 1.0 [beta] in now ScanID: scan-id</p>
+    <script>/*js*/</script>
+  </body>
+</html>
tests/database/test_db.py (2)

706-722: Align expected return type for SQLAlchemy path of find_temp_events (JSON string, not dict)

The implementation returns a JSON string (result.event). Update the mock and assertion accordingly.

-        fake_result = MagicMock()
-        fake_result.event = {"foo": "bar"}
+        fake_result = MagicMock()
+        fake_result.event = '{"foo":"bar"}'
@@
-        assert result == {"foo": "bar"}
+        assert result == '{"foo":"bar"}'
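The "both branches return JSON strings" convention from the learnings can be checked with a round-trip. A sketch (helper name and sample values are illustrative): re-serializing via `json.dumps(json.loads(...))` canonicalizes whitespace so the APSW and ORM branches compare equal while callers still receive a `str`.

```python
import json


def normalize_event(raw):
    """Re-serialize a JSON string into a canonical form, keeping it a str."""
    return json.dumps(json.loads(raw))


apsw_row = '{"foo": "bar"}'   # raw JSON string straight from the database
orm_event = '{"foo":"bar"}'   # JSON string held by the ORM column

# Both normalize to the same canonical string, and the type stays str.
same = normalize_event(apsw_row) == normalize_event(orm_event)
kind = type(normalize_event(apsw_row)).__name__
```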

860-869: The exception is set on an unused attribute

select_reports uses cursor.execute(...), not cursor.query. Patch execute.side_effect so the test actually exercises the error path.

-        mock_cursor.query.side_effect = Exception("DB Error")
+        mock_cursor.execute.side_effect = Exception("DB Error")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d91f9d and 5a02e5f.

📒 Files selected for processing (4)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/database/db.py (17 hunks)
  • report.html (1 hunks)
  • tests/database/test_db.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
  • tests/database/test_db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
nettacker/core/**

📄 CodeRabbit inference engine (AGENTS.md)

Place core libraries under nettacker/core/

Files:

  • nettacker/core/lib/base.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/test_*.py: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests

Files:

  • tests/database/test_db.py
tests/**

📄 CodeRabbit inference engine (AGENTS.md)

Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)

Files:

  • tests/database/test_db.py
🧠 Learnings (4)
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
  • tests/database/test_db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (2)
nettacker/database/db.py (4)
nettacker/config.py (1)
  • Config (194-198)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (28-51)
tests/database/test_db.py (4)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (22-40)
  • create_connection (43-85)
  • send_submit_query (88-127)
  • submit_report_to_db (130-178)
  • remove_old_logs (181-229)
  • submit_logs_to_db (232-313)
  • submit_temp_logs_to_db (316-403)
  • find_temp_events (406-455)
  • find_events (458-502)
  • select_reports (505-568)
  • get_scan_result (571-613)
  • last_host_logs (616-737)
  • get_logs_by_scan_id (740-792)
  • get_options_by_scan_id (795-824)
  • logs_to_report_json (827-880)
  • logs_to_report_html (883-990)
  • search_logs (993-1150)
tests/database/test_sqlite.py (1)
  • mock_config (12-14)
nettacker/config.py (1)
  • as_dict (26-27)
🪛 Ruff (0.12.2)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


214-214: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


488-488: Consider moving this statement to an else block

(TRY300)


489-489: Do not catch blind exception: Exception

(BLE001)


547-547: Consider moving this statement to an else block

(TRY300)


549-549: Do not catch blind exception: Exception

(BLE001)


566-566: Do not catch blind exception: Exception

(BLE001)


600-600: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


693-693: Consider moving this statement to an else block

(TRY300)


695-695: Do not catch blind exception: Exception

(BLE001)


1075-1075: Do not catch blind exception: Exception

(BLE001)


1146-1146: Do not catch blind exception: Exception

(BLE001)

tests/database/test_db.py

34-34: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


116-116: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


126-126: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


185-185: Unused method argument: mock_messages

(ARG002)


188-188: Create your own exception

(TRY002)


188-188: Avoid specifying long messages outside the exception class

(TRY003)


214-214: Unused method argument: mock_messages

(ARG002)


248-248: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


376-376: Unused method argument: mock_messages

(ARG002)


383-383: Unused method argument: mock_submit

(ARG002)


476-476: Unused method argument: mock_submit

(ARG002)


495-495: Unused method argument: mock_submit

(ARG002)


548-548: Unused method argument: mock_messages

(ARG002)


598-598: Unused method argument: mock_temp

(ARG002)


693-693: Unused method argument: mock_messages

(ARG002)


832-832: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


852-852: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


879-879: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


892-892: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


917-917: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


933-933: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


943-943: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


948-948: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


951-951: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


1150-1150: Unused method argument: mock_messages

(ARG002)

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
nettacker/database/db.py (4)

56-66: APSW busy-timeout uses 100×; APSW expects milliseconds.

If config.settings.timeout is seconds, multiply by 1000, not 100. Current value shortens the wait and can inflate BusyError retries.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

Also applies to: 59-59


19-19: Rename local logger to avoid shadowing the imported module.

logger = logger.get_logger() overwrites the imported logger module symbol. Use a distinct name (e.g., log) and update references in this file.

-logger = logger.get_logger()
+log = logger.get_logger()

100-114: Retry loop closes the connection each iteration; breaks retries.

finally: connection.close() runs on every loop pass, so subsequent retries operate on a closed connection. Close once after the loop; use configured retry params.

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

115-126: SQLAlchemy commit loop lacks rollback and uses hardcoded policy.

Add rollback() and use Config.settings.{max_retries,retry_delay}.

-    else:
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+    else:
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
🧹 Nitpick comments (6)
nettacker/database/db.py (6)

291-293: Typo in log message (“retires”).

Fix the spelling to “retries” for clarity.

-                # All retires exhausted but we want to continue operation
+                # All retries exhausted but we want to continue operation

337-338: Use connection for BEGIN to match other APSW paths.

Be consistent with submit_logs_to_db which uses connection.execute("BEGIN").

-                        if not connection.in_transaction:
-                            cursor.execute("BEGIN")
+                        if not connection.in_transaction:
+                            connection.execute("BEGIN")

406-418: Docstring return type is inaccurate for find_temp_events.

Function returns a single JSON string (or []) due to LIMIT 1, not “an array”.

-    Returns:
-        an array with JSON events or an empty array
+    Returns:
+        a JSON string (event) or [] if not found

780-792: Inconsistent json_event type between APSW and ORM branches.

APSW returns dict via json.loads, ORM returns raw string. Normalize to dict.

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},

68-68: Prefer logger.exception() to capture stack traces.

Improves diagnosability over plain error logging.

-            logger.error(f"Failed to create APSW connection: {e}")
+            logger.exception(f"Failed to create APSW connection: {e}")
@@
-                logger.error(f"Failed to read report file: {e}")
+                logger.exception(f"Failed to read report file: {e}")
@@
-            logger.error(f"Failed to read report file: {e}")
+            logger.exception(f"Failed to read report file: {e}")

Also applies to: 600-600, 612-612


1033-1040: GROUP BY with non-aggregated date is undefined in SQLite; select MAX(date).

Prevents arbitrary date values per group.

-                    SELECT date, module_name, port, event, json_event FROM scan_events
+                    SELECT MAX(date) AS date, module_name, port, event, json_event FROM scan_events
                     WHERE target = ?
-                    GROUP BY module_name, port, scan_unique_id, event
+                    GROUP BY module_name, port, scan_unique_id, event
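The hazard here is that SQLite allows bare, non-aggregated columns in a `GROUP BY` query but leaves their values unspecified; with a `MAX()` (or `MIN()`) aggregate present, SQLite documents that bare columns come from the row holding the aggregate value. A runnable sketch with stdlib `sqlite3` and a toy schema (table and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (target TEXT, module_name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO scan_events VALUES (?, ?, ?)",
    [
        ("host1", "port_scan", "2025-01-01"),
        ("host1", "port_scan", "2025-06-01"),
        ("host1", "port_scan", "2025-03-01"),
    ],
)

# MAX(date) deterministically yields the latest date per group; a bare
# `date` column would return an arbitrary row's value instead.
row = conn.execute(
    "SELECT MAX(date), module_name FROM scan_events "
    "WHERE target = ? GROUP BY module_name",
    ("host1",),
).fetchone()
conn.close()
```

ISO-8601 date strings sort lexicographically, which is why `MAX()` over the text column picks the chronologically latest row here.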
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a02e5f and 5c7c1fb.

📒 Files selected for processing (1)
  • nettacker/database/db.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/database/db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (194-198)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (28-51)
🪛 Ruff (0.12.2)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


214-214: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


488-488: Consider moving this statement to an else block

(TRY300)


489-489: Do not catch blind exception: Exception

(BLE001)


547-547: Consider moving this statement to an else block

(TRY300)


549-549: Do not catch blind exception: Exception

(BLE001)


566-566: Do not catch blind exception: Exception

(BLE001)


600-600: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


693-693: Consider moving this statement to an else block

(TRY300)


695-695: Do not catch blind exception: Exception

(BLE001)


1075-1075: Do not catch blind exception: Exception

(BLE001)


1079-1080: try-except-pass detected, consider logging the exception

(S110)


1079-1079: Do not catch blind exception: Exception

(BLE001)


1151-1151: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (4)
nettacker/database/db.py (4)

141-179: Report insertion path looks solid.

Transaction boundaries, parameter binding, and cleanup are correct.


192-230: Old logs removal path looks good.

Uses parameterized DELETE and shared commit helper.


470-503: Consistent JSON-string returns across backends.

find_events aligns APSW and ORM to return JSON strings; matches calling code expectations.


803-825: Keep returning raw JSON strings — no change required.

Call sites (nettacker/core/graph.py:340–344) and tests (tests/database/test_db.py:1036–1040) call json.loads(item["options"]), so the current implementation returning raw JSON strings is correct.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
nettacker/database/db.py (3)

115-126: Add rollback and use configured retry parameters.

The SQLAlchemy branch should call session.rollback() after a failed commit and use configured retry parameters instead of hardcoded values.

🔎 Proposed fix
     else:
+        success = False
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
             logger.warn(messages("database_connect_fail"))
-            return False
-        return False
+        return success

767-794: Inconsistent json_event types between backends.

The APSW branch returns json.loads(row[6]) (a dict) for json_event at line 777, while the SQLAlchemy branch returns log.json_event (likely a string) at line 791. This creates inconsistent return types.

Both branches should return the same type - either both parse to dicts or both return strings. Based on the APSW pattern, parsing to dicts appears to be the intent.

🔎 Proposed fix
             {
                 "scan_id": scan_id,
                 "target": log.target,
                 "module_name": log.module_name,
                 "date": str(log.date),
                 "port": json.loads(log.port),
                 "event": json.loads(log.event),
-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
             }
             for log in session.query(HostsLog).filter(HostsLog.scan_unique_id == scan_id).all()
         ]

59-59: Fix busy timeout millisecond conversion on line 59.

The setbusytimeout() method expects milliseconds. Since config.settings.timeout is in seconds (as evidenced by socket library usage and the default 3.0 value), multiply by 1000 instead of 100 to correctly convert seconds to milliseconds.

Current: connection.setbusytimeout(int(config.settings.timeout) * 100)
Should be: connection.setbusytimeout(int(config.settings.timeout) * 1000)
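A quick sanity check on the conversion (the helper name is hypothetical; 3.0 mirrors the default timeout mentioned above):

```python
def busy_timeout_ms(timeout_seconds: float) -> int:
    """Convert a seconds-based config timeout to the milliseconds that
    APSW's Connection.setbusytimeout() expects.

    Hypothetical helper; timeout_seconds stands in for config.settings.timeout.
    """
    return int(timeout_seconds * 1000)

# With the default 3.0 s timeout:
assert busy_timeout_ms(3.0) == 3000  # correct: a full 3 seconds
assert int(3.0) * 100 == 300         # the buggy * 100 yields only 0.3 s
```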

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b09bc9 and bb3e1f9.

📒 Files selected for processing (1)
  • nettacker/database/db.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/database/db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🪛 Ruff (0.14.10)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


101-101: Unpacked variable cursor is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


214-214: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


488-488: Consider moving this statement to an else block

(TRY300)


489-489: Do not catch blind exception: Exception

(BLE001)


547-547: Consider moving this statement to an else block

(TRY300)


549-549: Do not catch blind exception: Exception

(BLE001)


566-566: Do not catch blind exception: Exception

(BLE001)


600-600: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


695-695: Consider moving this statement to an else block

(TRY300)


697-697: Do not catch blind exception: Exception

(BLE001)


1077-1077: Do not catch blind exception: Exception

(BLE001)


1081-1082: try-except-pass detected, consider logging the exception

(S110)


1081-1081: Do not catch blind exception: Exception

(BLE001)


1153-1153: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image
  • GitHub Check: Build package

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
nettacker/database/db.py (1)

791-791: Inconsistent data types across backends for json_event.

The APSW branch (line 777) returns json.loads(row[6]) (parsed dict), but the SQLAlchemy branch returns log.json_event (raw JSON string). This type inconsistency will break calling code depending on which backend is active.

🔎 Proposed fix
             {
                 "scan_id": scan_id,
                 "target": log.target,
                 "module_name": log.module_name,
                 "date": str(log.date),
                 "port": json.loads(log.port),
                 "event": json.loads(log.event),
-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
             }
             for log in session.query(HostsLog).filter(HostsLog.scan_unique_id == scan_id).all()
🧹 Nitpick comments (4)
nettacker/database/db.py (4)

115-126: Use configured retry parameters instead of hardcoded values.

The SQLAlchemy branch uses hardcoded retry count (100) and delay (0.1s). For consistency with the APSW branch and configurability, use Config.settings.max_retries and Config.settings.retry_delay.

🔎 Proposed refactor
     else:
         try:
-            for _ in range(1, 100):
+            for _ in range(Config.settings.max_retries):
                 try:
                     session.commit()
                     return True
                 except Exception:
-                    time.sleep(0.1)
+                    session.rollback()
+                    time.sleep(Config.settings.retry_delay)
             logger.warn(messages("database_connect_fail"))
             return False
         except Exception:
             logger.warn(messages("database_connect_fail"))
             return False

Also added session.rollback() for proper transaction cleanup between retries.
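The corrected retry shape can be sketched independently of SQLAlchemy (the fake session and the `max_retries`/`retry_delay` parameters are illustrative stand-ins for the configured values):

```python
import time

class FlakySession:
    """Stand-in for a SQLAlchemy session that fails twice before committing."""
    def __init__(self):
        self.failures_left = 2
        self.rolled_back = 0

    def commit(self):
        if self.failures_left:
            self.failures_left -= 1
            raise RuntimeError("database is locked")

    def rollback(self):
        self.rolled_back += 1

def commit_with_retry(session, max_retries=5, retry_delay=0.0):
    for _ in range(max_retries):       # full configured attempt count
        try:
            session.commit()
            return True
        except Exception:
            session.rollback()         # clear the failed transaction
            time.sleep(retry_delay)
    return False

session = FlakySession()
result = commit_with_retry(session)
```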


101-101: Unused variable: prefix cursor with underscore.

The cursor variable is unpacked but never used in the APSW branch of send_submit_query. Prefix it with an underscore to indicate it's intentionally unused.

🔎 Proposed fix
     if isinstance(session, tuple):
-        connection, cursor = session
+        connection, _cursor = session

68-68: Use logger.exception() for automatic traceback logging.

When logging errors in exception handlers, logger.exception() automatically includes the stack trace, which aids debugging. Replace logger.error() with logger.exception() here and at lines 600, 612.

🔎 Proposed fix
         except Exception as e:
-            logger.error(f"Failed to create APSW connection: {e}")
+            logger.exception(f"Failed to create APSW connection: {e}")
             raise

Apply similarly at lines 600 and 612.


420-442: Consider using finally block for guaranteed resource cleanup.

The APSW resources (cursor and connection) are closed only on success paths. If an exception occurs between cursor operations and the close calls, resources may leak. Wrap in a try-finally for guaranteed cleanup.

🔎 Suggested pattern
     if isinstance(session, tuple):
         connection, cursor = session
         try:
             cursor.execute(
                 """
                 SELECT event
                 FROM temp_events
                 WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
                 LIMIT 1
             """,
                 (target, module_name, scan_id, event_name),
             )
 
             row = cursor.fetchone()
-            cursor.close()
-            connection.close()
             if row:
                 return row[0]
             return []
         except Exception:
             logger.warn(messages("database_connect_fail"))
             return []
-        return []
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass

Apply similar pattern to other read functions for robustness.
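The finally-based cleanup can be sketched with stdlib `sqlite3` in place of APSW (`conn_factory`, the table, and the data are illustrative):

```python
import sqlite3

def fetch_one_event(conn_factory, target):
    """Read one row, guaranteeing cursor/connection cleanup even on errors.

    Sketch of the finally-based pattern suggested above.
    """
    connection = conn_factory()
    cursor = connection.cursor()
    try:
        cursor.execute(
            "SELECT event FROM temp_events WHERE target = ? LIMIT 1", (target,)
        )
        row = cursor.fetchone()
        return row[0] if row else []
    except Exception:
        return []
    finally:
        for resource in (cursor, connection):
            try:
                resource.close()
            except Exception:
                pass  # best-effort cleanup; never mask the real result

def make_conn():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp_events (target TEXT, event TEXT)")
    conn.execute(
        "INSERT INTO temp_events VALUES ('example.com', '{\"open\": true}')"
    )
    return conn
```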

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb3e1f9 and 84aa33c.

📒 Files selected for processing (1)
  • nettacker/database/db.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/database/db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (2)
nettacker/config.py (2)
  • Config (194-198)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
🪛 Ruff (0.14.10)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


101-101: Unpacked variable cursor is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


214-214: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


488-488: Consider moving this statement to an else block

(TRY300)


489-489: Do not catch blind exception: Exception

(BLE001)


547-547: Consider moving this statement to an else block

(TRY300)


549-549: Do not catch blind exception: Exception

(BLE001)


566-566: Do not catch blind exception: Exception

(BLE001)


600-600: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


695-695: Consider moving this statement to an else block

(TRY300)


697-697: Do not catch blind exception: Exception

(BLE001)


1077-1077: Do not catch blind exception: Exception

(BLE001)


1081-1082: try-except-pass detected, consider logging the exception

(S110)


1081-1081: Do not catch blind exception: Exception

(BLE001)


1153-1153: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (3)
nettacker/database/db.py (3)

420-455: LGTM: Consistent return types across backends.

Both APSW and SQLAlchemy branches correctly return JSON strings. The APSW branch returns row[0] (raw JSON string from DB), and the SQLAlchemy branch returns result.event (JSON string from ORM). This maintains consistency for calling code.

Based on learnings, this pattern is correct.


471-502: LGTM: JSON string consistency maintained.

The APSW branch uses json.dumps(json.loads()) to normalize JSON strings, matching the SQLAlchemy branch which returns row.json_event. Both backends return lists of JSON strings, ensuring calling code receives consistent data types.

Based on learnings, the seemingly redundant serialization is necessary for consistency.


337-337: Inconsistent transaction management: use connection.execute("BEGIN").

Line 337 uses cursor.execute("BEGIN") while the similar function submit_logs_to_db at line 253 uses connection.execute("BEGIN"). For consistency and APSW best practices, transaction commands should be issued via the connection object.

🔎 Proposed fix
                     try:
                         if not connection.in_transaction:
-                            cursor.execute("BEGIN")
+                            connection.execute("BEGIN")

Likely an incorrect or invalid review comment.

@pUrGe12 pUrGe12 force-pushed the apsw-with-sqlalchemy branch from 84aa33c to bb3e1f9 on December 29, 2025 06:56
@pUrGe12
Contributor Author

pUrGe12 commented Dec 29, 2025

Note that using cursor.execute("BEGIN") in some functions is intentional, because it explicitly binds the statement to the earlier opened connection. The cursor is bound to the connection since it's derived via Connection.cursor() for each connection we open.

auto-merge was automatically disabled January 10, 2026 15:55

Head branch was pushed to by a user without write access

@pUrGe12 pUrGe12 force-pushed the apsw-with-sqlalchemy branch from 6199d1a to cbc02e2 on January 10, 2026 15:55
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/core/test_exclude_ports.py (1)

251-255: Inconsistent mock pattern - use JSON strings directly.

This test still uses the old MagicMock with json_event attribute pattern, while all other tests in this file have been updated to return JSON strings directly. This inconsistency means the test won't correctly validate the APSW-migrated find_events behavior.

Based on learnings, find_events now returns JSON strings, not objects with a json_event attribute.

🐛 Proposed fix to align with the new pattern
-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 80, "response": {"conditions_results": {"http": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+    ]
🤖 Fix all issues with AI agents
In @nettacker/database/db.py:
- Around line 797-809: The SQLAlchemy branch returns log.json_event as a raw
string while the APSW branch uses json.loads(row[6]), causing inconsistent types
for json_event; update the SQLAlchemy return to parse json_event (e.g.,
json.loads(log.json_event)) so json_event is consistently a dict/list across
backends—modify the list comprehension that builds the dict for each HostsLog in
the session.query(...) result to call json.loads on log.json_event.
- Around line 115-128: The SQLAlchemy commit retry loop in the else branch uses
range(1, Config.settings.max_submit_query_retry) causing an off-by-one retry
count and wraps the loop in a redundant outer try/except; update the loop to
iterate the correct number of attempts (e.g.,
range(Config.settings.max_submit_query_retry) or range(1,
Config.settings.max_submit_query_retry + 1)) so it actually performs the
configured number of retries, remove the outer try/except that surrounds the
for-loop, keep the inner try/except to rollback and sleep on failures, and after
the loop log the failure (logger.warn/messages("database_connect_fail")) and
return False as before.
- Around line 603-627: Both code paths open files without closing them
(open(str(filename), "rb").read() and open(str(report.report_path_filename),
"rb").read()), causing resource leaks; change each to use a context manager:
with open(str(...), "rb") as f: data = f.read(); then return the filename and
data (i.e., replace the direct open(...).read() calls in the APSW branch that
returns filename and in the SQLAlchemy branch that returns
report.report_path_filename). Ensure exceptions are still caught and logged as
before.
- Around line 859-884: The APSW branch that handles a tuple session (connection,
cursor) can fall through and return None when the SELECT yields no rows; ensure
you explicitly return an empty list (or the accumulated return_logs) after the
finally block so the function always returns a list. Locate the branch that
checks isinstance(session, tuple) and uses variables connection, cursor and
return_logs, and add a final explicit return (e.g., return return_logs or return
[]) immediately after the try/finally so the APSW path never implicitly returns
None.
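The shape of the fix, sketched with the standard library's `sqlite3` as a stand-in for APSW (table and column names are illustrative, not Nettacker's schema):

```python
import sqlite3


def fetch_events(cursor, scan_id):
    """Initialize return_logs up front and return it explicitly after the
    try/finally, so an empty SELECT yields [] instead of an implicit None."""
    return_logs = []
    try:
        cursor.execute("SELECT event FROM logs WHERE scan_id = ?", (scan_id,))
        for (event,) in cursor.fetchall():
            return_logs.append(event)
    finally:
        pass  # the real code closes the APSW connection here
    return return_logs  # always a list, never None
```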
- Around line 100-114: The retry loop in the session-handling branch has an
off-by-one error and unsafe resource handling: change the retry range to include
Config.settings.max_submit_query_retry (e.g., use range(1,
Config.settings.max_submit_query_retry + 1) or a retry counter that runs that
many attempts), remove the unused unpacked variable cursor to silence RUF059,
and do not close the provided connection in the finally block since the session
(connection, cursor) was supplied by the caller—remove the unconditional
connection.close() so the caller remains responsible for closing the session;
keep the commit/rollback/retry logic and the failure logger as-is.
- Around line 56-69: The busy timeout is being set incorrectly: in the DB
connection code where apsw.Connection is created and
connection.setbusytimeout(int(config.settings.timeout) * 100) is called, change
the multiplier to 1000 so setbusytimeout receives milliseconds (e.g.,
connection.setbusytimeout(int(config.settings.timeout) * 1000)); keep the int
conversion and existing error handling around DB_PATH/apsw.Connection and cursor
creation intact.
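The unit conversion is worth isolating; `setbusytimeout` takes milliseconds (mirroring SQLite's `sqlite3_busy_timeout`), so a helper like this hypothetical one makes the intent explicit:

```python
def busy_timeout_ms(timeout_seconds):
    """Convert a timeout configured in seconds to the milliseconds that
    apsw.Connection.setbusytimeout() expects. Multiplying by 100 would
    wait only a tenth of the configured time."""
    return int(timeout_seconds) * 1000
```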
🧹 Nitpick comments (11)
tests/database/test_db.py (2)

71-101: Consider adding SQLite test for db_inputs.

The tests cover PostgreSQL and MySQL URL construction but don't test the SQLite path. Consider adding a test for db_inputs("sqlite") to ensure complete coverage of the function.

🧪 Proposed additional test
@patch("nettacker.database.db.Config")
def test_db_inputs_sqlite(self, mock_config):
    mock_config.db.as_dict.return_value = {
        "name": "/path/to/nettacker.db"
    }
    
    result = db_inputs("sqlite")
    expected = "sqlite:////path/to/nettacker.db"
    assert result == expected

729-732: Clarify empty result handling.

Lines 730-732 normalize an empty list to None before asserting. This pattern is unusual—if the function can return either [] or None for "no results," consider standardizing the return value in the function itself rather than in tests.

nettacker/config.py (1)

151-165: Review the max_submit_query_retry value.

The retry configuration looks reasonable overall:

  • use_apsw_for_sqlite = False is a safe default (APSW is opt-in)
  • max_retries = 3 is appropriate for transient failures
  • retry_delay = 0.1 (100ms) provides good balance

However, max_submit_query_retry = 100 seems excessive. With the 0.1s delay, this could result in 10 seconds of retry attempts. Consider whether a lower value (e.g., 10-20 retries) would be more appropriate for database submission queries.

docs/Usage.md (3)

660-675: Use fenced code block with language specifier for consistency.

The SQLite configuration example uses an indented code block (lines 663-671), which is inconsistent with the rest of the document that uses fenced code blocks. Also, note that use_apsw_for_sqlite is mentioned in line 675 but not shown in the configuration example above.

📝 Suggested fix
-The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found inside `nettacker/config.py` file under the `DBConfig` class. 
-
+The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found inside `nettacker/config.py` file under the `DBConfig` class.

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+use_apsw_for_sqlite = True  # Set to True to enable APSW
+```

681-691: Add language specifier to fenced code block.

The code block is missing a language specifier, which affects syntax highlighting and violates markdown linting rules (MD040).

📝 Suggested fix
-```
+```python
     engine = "mysql"

699-709: Add language specifier to fenced code block.

Same issue as the MySQL section - the code block is missing a language specifier.

📝 Suggested fix
-```
+```python
     engine = "postgres"
nettacker/database/db.py (5)

12-19: Variable logger shadows the imported module.

At line 12, logger is imported as a module, but at line 19, it's reassigned to the result of logger.get_logger(). While this works, it's confusing and could cause issues if someone tries to use other functions from the logger module later.

📝 Suggested fix
-from nettacker import logger
+from nettacker import logger as logger_module
 from nettacker.api.helpers import structure
 from nettacker.config import Config
 from nettacker.core.messages import messages
 from nettacker.database.models import HostsLog, Report, TempEvents

 config = Config()
-logger = logger.get_logger()
+logger = logger_module.get_logger()

275-287: Potential NameError when referencing apsw.BusyError if APSW failed to import.

While create_connection() guards against apsw=None, if this guard ever fails or if the code path changes, accessing apsw.BusyError at line 275 when apsw is None will raise a NameError.

Consider using a defensive check or importing the exception type at the top:

🔧 Suggested defensive approach
# At top of file, after the apsw import guard:
if apsw is not None:
    APSWBusyError = apsw.BusyError
else:
    APSWBusyError = type(None)  # Dummy that never matches

# Then in the except clause:
except APSWBusyError as e:

Or catch by string comparison on the exception type name if truly needed.


340-341: Inconsistent use of cursor.execute("BEGIN") vs connection.execute("BEGIN").

Line 341 uses cursor.execute("BEGIN") but submit_logs_to_db at line 257 uses connection.execute("BEGIN"). This inconsistency could lead to subtle transaction handling differences.

For consistency, use the same approach across both functions. Using connection.execute("BEGIN") is generally preferred as it makes the transaction binding explicit at the connection level.

🔧 Suggested fix
                         if not connection.in_transaction:
-                            cursor.execute("BEGIN")
+                            connection.execute("BEGIN")

642-717: Consider query optimization for APSW path.

The APSW implementation makes 4 separate queries per target (distinct targets, modules, latest date, events), which could be inefficient for many targets. Consider using JOINs or subqueries to reduce round-trips.

Also, line 652 uses a list [(page - 1) * 10] for the parameter binding, while other places use tuples. While both work, tuples are more conventional for SQLite parameter binding.


903-1015: Consider extracting common HTML generation logic to reduce duplication.

Both APSW and SQLAlchemy paths duplicate the HTML generation code (lines 949-975 and 989-1014). Consider extracting this into a helper function that takes the logs list:

♻️ Refactoring suggestion
def _generate_html_from_logs(logs):
    """Helper to generate HTML content from logs list."""
    html_graph = build_graph("d3_tree_v2_graph", logs)
    html_content = log_data.table_title.format(
        html_graph, log_data.css_1,
        "date", "target", "module_name", "scan_id", "port", "event", "json_event",
    )
    for event in logs:
        html_content += log_data.table_items.format(
            event["date"], event["target"], event["module_name"],
            event["scan_id"], event["port"], event["event"], event["json_event"],
        )
    html_content += log_data.table_end + '<p class="footer">' + messages("nettacker_report") + "</p>"
    return html_content

Then both paths can simply call return _generate_html_from_logs(logs).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b9ff851 and cbc02e2.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • docs/Usage.md
  • nettacker/config.py
  • nettacker/core/app.py
  • nettacker/core/graph.py
  • nettacker/core/lib/base.py
  • nettacker/core/module.py
  • nettacker/core/utils/common.py
  • nettacker/database/db.py
  • pyproject.toml
  • tests/core/test_exclude_ports.py
  • tests/database/test_db.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • nettacker/core/graph.py
  • nettacker/core/module.py
  • nettacker/core/utils/common.py
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • tests/core/test_exclude_ports.py
  • tests/database/test_db.py
  • nettacker/core/lib/base.py
  • nettacker/database/db.py
  • nettacker/core/app.py
  • nettacker/config.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/test_*.py: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests

Files:

  • tests/core/test_exclude_ports.py
  • tests/database/test_db.py
tests/**

📄 CodeRabbit inference engine (AGENTS.md)

Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)

Files:

  • tests/core/test_exclude_ports.py
  • tests/database/test_db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
  • nettacker/core/app.py
  • nettacker/config.py
nettacker/core/**

📄 CodeRabbit inference engine (AGENTS.md)

Place core libraries under nettacker/core/

Files:

  • nettacker/core/lib/base.py
  • nettacker/core/app.py
docs/**

📄 CodeRabbit inference engine (AGENTS.md)

Place documentation under docs/

Files:

  • docs/Usage.md
nettacker/config.py

📄 CodeRabbit inference engine (AGENTS.md)

Manage defaults (API key, DB path, paths) in nettacker/config.py and review sensitive headers list before logging

Files:

  • nettacker/config.py
🧠 Learnings (10)
📓 Common learnings
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • tests/core/test_exclude_ports.py
  • nettacker/core/lib/base.py
  • nettacker/database/db.py
  • docs/Usage.md
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • tests/core/test_exclude_ports.py
  • tests/database/test_db.py
  • nettacker/core/lib/base.py
  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Maintain coverage (pytest configured with --cov=nettacker); add tests for new features and bug fixes

Applied to files:

  • tests/database/test_db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Run tests with `make test` or `poetry run pytest` and ensure they pass before pushing

Applied to files:

  • pyproject.toml
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to tests/**/test_*.py : Use pytest (with pytest-asyncio, xdist) for tests

Applied to files:

  • pyproject.toml
📚 Learning: 2025-10-25T11:31:26.498Z
Learnt from: Aarush289
Repo: OWASP/Nettacker PR: 1155
File: nettacker/core/hostcheck.py:121-122
Timestamp: 2025-10-25T11:31:26.498Z
Learning: In nettacker/core/hostcheck.py, the resolve_quick() function intentionally applies timeout_sec per resolution pass (not cumulatively), allowing up to 2× timeout_sec total execution time across both AI_ADDRCONFIG passes by design.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker/config.py : Manage defaults (API key, DB path, paths) in nettacker/config.py and review sensitive headers list before logging

Applied to files:

  • nettacker/database/db.py
  • docs/Usage.md
  • nettacker/config.py
🧬 Code graph analysis (2)
tests/database/test_db.py (3)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (2)
  • db_inputs (22-40)
  • create_connection (43-85)
nettacker/config.py (1)
  • as_dict (26-27)
nettacker/database/db.py (3)
nettacker/config.py (2)
  • Config (195-199)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
🪛 LanguageTool
docs/Usage.md

[style] ~673-~673: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...

(FEEL_FREE_TO_STYLE_ME)

🪛 markdownlint-cli2 (0.18.1)
docs/Usage.md

663-663: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


681-681: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


699-699: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.10)
tests/database/test_db.py

34-34: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


116-116: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


126-126: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


185-185: Unused method argument: mock_messages

(ARG002)


188-188: Create your own exception

(TRY002)


188-188: Avoid specifying long messages outside the exception class

(TRY003)


214-214: Unused method argument: mock_messages

(ARG002)


248-248: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


376-376: Unused method argument: mock_messages

(ARG002)


383-383: Unused method argument: mock_submit

(ARG002)


476-476: Unused method argument: mock_submit

(ARG002)


495-495: Unused method argument: mock_submit

(ARG002)


548-548: Unused method argument: mock_messages

(ARG002)


598-598: Unused method argument: mock_temp

(ARG002)


693-693: Unused method argument: mock_messages

(ARG002)


832-832: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


852-852: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


879-879: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


892-892: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


917-917: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


931-931: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


941-941: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


946-946: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


949-949: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


1138-1138: Unused method argument: mock_messages

(ARG002)

nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


101-101: Unpacked variable cursor is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


106-106: Consider moving this statement to an else block

(TRY300)


107-107: Do not catch blind exception: Exception

(BLE001)


120-120: Consider moving this statement to an else block

(TRY300)


121-121: Do not catch blind exception: Exception

(BLE001)


125-125: Consider moving this statement to an else block

(TRY300)


126-126: Do not catch blind exception: Exception

(BLE001)


163-163: Do not catch blind exception: Exception

(BLE001)


171-172: try-except-pass detected, consider logging the exception

(S110)


171-171: Do not catch blind exception: Exception

(BLE001)


218-218: Do not catch blind exception: Exception

(BLE001)


288-288: Do not catch blind exception: Exception

(BLE001)


292-293: try-except-pass detected, consider logging the exception

(S110)


292-292: Do not catch blind exception: Exception

(BLE001)


367-368: try-except-pass detected, consider logging the exception

(S110)


367-367: Do not catch blind exception: Exception

(BLE001)


375-376: try-except-pass detected, consider logging the exception

(S110)


375-375: Do not catch blind exception: Exception

(BLE001)


378-378: Do not catch blind exception: Exception

(BLE001)


382-383: try-except-pass detected, consider logging the exception

(S110)


382-382: Do not catch blind exception: Exception

(BLE001)


440-440: Consider moving this statement to an else block

(TRY300)


441-441: Do not catch blind exception: Exception

(BLE001)


448-449: try-except-pass detected, consider logging the exception

(S110)


448-448: Do not catch blind exception: Exception

(BLE001)


493-493: Consider moving this statement to an else block

(TRY300)


494-494: Do not catch blind exception: Exception

(BLE001)


501-502: try-except-pass detected, consider logging the exception

(S110)


501-501: Do not catch blind exception: Exception

(BLE001)


556-556: Consider moving this statement to an else block

(TRY300)


558-558: Do not catch blind exception: Exception

(BLE001)


578-578: Do not catch blind exception: Exception

(BLE001)


608-608: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


616-617: try-except-pass detected, consider logging the exception

(S110)


616-616: Do not catch blind exception: Exception

(BLE001)


626-626: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


707-707: Consider moving this statement to an else block

(TRY300)


709-709: Do not catch blind exception: Exception

(BLE001)


716-717: try-except-pass detected, consider logging the exception

(S110)


716-716: Do not catch blind exception: Exception

(BLE001)


795-796: try-except-pass detected, consider logging the exception

(S110)


795-795: Do not catch blind exception: Exception

(BLE001)


837-838: try-except-pass detected, consider logging the exception

(S110)


837-837: Do not catch blind exception: Exception

(BLE001)


883-884: try-except-pass detected, consider logging the exception

(S110)


883-883: Do not catch blind exception: Exception

(BLE001)


946-947: try-except-pass detected, consider logging the exception

(S110)


946-946: Do not catch blind exception: Exception

(BLE001)


1098-1098: Do not catch blind exception: Exception

(BLE001)


1104-1105: try-except-pass detected, consider logging the exception

(S110)


1104-1104: Do not catch blind exception: Exception

(BLE001)


1174-1174: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image
🔇 Additional comments (21)
tests/core/test_exclude_ports.py (5)

62-64: LGTM!

The mock correctly returns a JSON string, aligning with the updated find_events behavior after the APSW migration.

Based on learnings, find_events now returns JSON strings rather than objects with a json_event attribute.


97-99: LGTM!

Correctly updated to return JSON strings matching the new find_events interface.


120-122: LGTM!

Mock return value correctly updated to JSON string format.


178-180: LGTM!

Properly updated to use JSON strings.


303-306: LGTM!

Correctly returns multiple JSON string events, properly testing the multi-event scenario.

pyproject.toml (2)

97-99: LGTM!

The pytest markers configuration is correctly set up to support async test execution with pytest-asyncio.


68-68: Consider moving APSW to an optional dependency group.

APSW is imported with a try/except block and only used when Config.settings.use_apsw_for_sqlite is explicitly enabled (defaults to False). Since APSW is truly optional, it should be moved to a Poetry extras group to avoid unnecessary installation overhead for users who don't need SQLite with APSW support.

nettacker/core/app.py (1)

161-161: LGTM! Event deserialization updated correctly.

The change from json.loads(row.json_event) to json.loads(row) correctly reflects that find_events now returns JSON strings directly rather than objects with a json_event attribute. This aligns with the dual-backend (APSW/SQLAlchemy) design where both paths return consistent string representations.

Based on learnings, this pattern ensures consistency across different database backends.

nettacker/core/lib/base.py (1)

55-55: LGTM! Temporary event deserialization updated correctly.

The change from json.loads(event.event) to json.loads(event) correctly reflects that find_temp_events now returns JSON strings directly. This maintains consistency with the dual-backend implementation where both APSW and SQLAlchemy paths return string representations.

Based on learnings, this ensures consistent behavior across database backends.

tests/database/test_db.py (6)

1-66: LGTM! Test class setup is well-structured.

The test class setup properly initializes sample data for various test scenarios. The use of mocking is appropriate for unit testing database operations without requiring actual database instances.


107-148: LGTM! Connection tests cover both backends.

The tests properly verify:

  • APSW connection setup with PRAGMA configurations
  • Timeout settings
  • SQLAlchemy session creation for non-SQLite engines

154-222: LGTM! Query submission tests are comprehensive.

The tests cover:

  • Successful commits for both SQLite and SQLAlchemy
  • Retry logic with transient failures
  • Final failure after exhausting retries
  • Proper warning logging

Note: The static analysis warnings about unused mock_messages parameters are false positives—these are used by the @patch decorators to mock the function, even if not directly referenced in test bodies.


340-509: Excellent test coverage for log submission!

The tests thoroughly cover:

  • SQLite (APSW) and SQLAlchemy paths
  • Transaction management (BEGIN statements)
  • APSW BusyError retry logic with proper warning messages
  • Invalid input handling
  • Generic exceptions
  • Successful and failed submissions

This comprehensive coverage ensures the dual-backend implementation works correctly under various conditions.


515-650: LGTM! Temporary log tests mirror production log tests.

The tests properly verify:

  • Insertion into temp_events table
  • Retry logic with BusyError
  • Operational continuity (returning True after retry exhaustion to allow scans to continue)

The design decision to continue operations even when temp log insertion fails is properly tested here, based on learnings.


766-1277: LGTM! Comprehensive test coverage for all database operations.

The remaining tests thoroughly cover:

  • Event finding and filtering
  • Report selection with pagination
  • Scan result retrieval
  • Host log queries
  • Scan ID-based queries
  • Report generation (JSON and HTML)
  • Log searching with error handling

All tests properly verify both SQLite (APSW) and SQLAlchemy code paths, ensuring the dual-backend implementation works correctly. The use of /tmp paths in tests (flagged by static analysis) is acceptable for test fixtures.

nettacker/config.py (1)

111-112: LGTM! SQLite tuning parameters are well-chosen.

Adding journal_mode = "WAL" and synchronous_mode = "NORMAL" as defaults provides:

  • WAL mode enables better concurrency for reads/writes
  • NORMAL synchronous mode balances data safety with performance

These are appropriate defaults for an application that handles concurrent scans.

nettacker/database/db.py (5)

410-462: LGTM - JSON string return type maintained for consistency.

Both APSW and SQLAlchemy branches correctly return JSON strings (or empty array if not found), maintaining consistency with calling code expectations per prior learnings.


465-513: LGTM - Correct JSON handling per prior learnings.

The json.dumps(json.loads(row[0])) pattern at line 492 is intentional to ensure consistent JSON string format matching the SQLAlchemy branch.


516-580: LGTM - Pagination logic is correct and consistent across both backends.

Both APSW and SQLAlchemy paths correctly implement pagination with equivalent offset calculations.


812-844: LGTM - Both paths return options as raw JSON strings consistently.


1018-1178: LGTM - Search functionality implemented correctly for both backends.

Both paths implement equivalent search and pagination logic, aggregating results by target with module names, ports, dates, events, and json_events.

@arkid15r arkid15r changed the title porting sqlite operations to APSW along with unittests for the same Porting sqlite operations to APSW Jan 10, 2026
@arkid15r arkid15r changed the title Porting sqlite operations to APSW Port sqlite operations to APSW Jan 10, 2026
Copy link
Collaborator

@arkid15r arkid15r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have a chance to test it live. But here are some obvious items to address:


Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @nettacker/core/utils/common.py:
- Around line 36-39: The code assumes result["json_event"] is a JSON string when
not a dict; add a type check and safe parse: if "json_event" exists and is not a
dict, only attempt json.loads when isinstance(result["json_event"], str), wrap
json.loads(...) in a try/except catching json.JSONDecodeError (and fallback to
ValueError for older Pythons) and on failure leave the original value or set
result["json_event"] = None (choose consistent behavior), and handle non-string
types by not calling json.loads and leaving or normalizing them explicitly.
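A hypothetical helper following the check described above (the normalize-to-`None` behavior on failure is one of the two consistent choices the instruction allows):

```python
import json


def normalize_json_event(result):
    """Parse result['json_event'] only when it is a string; fall back to
    None on invalid JSON. Dicts (already parsed) are left untouched."""
    value = result.get("json_event")
    if isinstance(value, dict) or value is None:
        return result
    if isinstance(value, str):
        try:
            result["json_event"] = json.loads(value)
        except (json.JSONDecodeError, ValueError):
            result["json_event"] = None
    else:
        # Non-string, non-dict values are normalized explicitly.
        result["json_event"] = None
    return result
```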

In @nettacker/database/db.py:
- Line 59: The busy timeout conversion is using the wrong multiplier; change the
APSW call in db.py where connection.setbusytimeout(int(config.settings.timeout)
* 100) is used to multiply by 1000 instead so seconds -> milliseconds conversion
is correct (e.g., update the line using connection.setbusytimeout(...) in the
database initialization code referenced by connection and
config.settings.timeout).
🧹 Nitpick comments (6)
docs/Usage.md (2)

676-676: Clarify the APSW description.

The documentation refers to APSW as a "lite wrapper for Sqlite," but APSW is actually a more comprehensive, lower-level wrapper that provides direct access to SQLite's C API, often offering better performance than the standard library's sqlite3 module. Consider rephrasing to something like:

"You can use APSW (Another Python SQLite Wrapper), a performance-oriented SQLite wrapper, by setting the use_apsw_for_sqlite parameter to True."


689-689: Consider noting security implications of ssl_mode = "disable".

The example shows ssl_mode = "disable", which transmits credentials and data in plaintext. While this may be acceptable for local development, consider adding a note recommending ssl_mode = "require" or ssl_mode = "verify-ca" for production deployments to protect sensitive data in transit.

nettacker/database/db.py (4)

53-53: Consider using a custom exception class.

Hardcoded exception messages should be moved to exception class definitions for better maintainability.

♻️ Proposed refactor
+class APSWNotAvailableError(ImportError):
+    """Raised when APSW is required but not installed."""
+    def __init__(self):
+        super().__init__("APSW is required for SQLite backend.")
+
 def create_connection():
     ...
     if Config.db.engine.startswith("sqlite") and Config.settings.use_apsw_for_sqlite:
         if apsw is None:
-            raise ImportError("APSW is required for SQLite backend.")
+            raise APSWNotAvailableError()

68-68: Use logger.exception for better error reporting.

When logging from an exception handler, logger.exception() automatically includes the stack trace, making debugging easier.

♻️ Proposed refactor
         except Exception as e:
-            logger.error(f"Failed to create APSW connection: {e}")
+            logger.exception("Failed to create APSW connection")
             raise
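An illustration (not Nettacker code) of why the suggestion matters: `logging.exception()` records the active traceback automatically, while `logging.error()` logs only the message.

```python
import io
import logging

# Capture log output in memory so we can inspect it.
stream = io.StringIO()
logger = logging.getLogger("demo")
logger.addHandler(logging.StreamHandler(stream))

try:
    raise ValueError("boom")
except ValueError:
    # Logs at ERROR level and appends the full traceback.
    logger.exception("Failed to create APSW connection")

output = stream.getvalue()
# output contains the message followed by a "Traceback (most recent call last):" block
```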

170-171: Consider logging cleanup exceptions.

The silent exception handling in the finally block might hide issues during resource cleanup. Consider logging at debug level.

♻️ Proposed refactor
             finally:
                 try:
                     cursor.close()
                     connection.close()
                 except Exception:
-                    pass
+                    logger.debug("Exception during resource cleanup in submit_report_to_db")

514-1176: LGTM: Comprehensive dual-backend support.

All query functions (select_reports, get_scan_result, last_host_logs, get_logs_by_scan_id, get_options_by_scan_id, logs_to_report_json, logs_to_report_html, search_logs) correctly implement both APSW and SQLAlchemy paths with consistent:

  • Data structure returns
  • JSON parsing/serialization
  • Resource cleanup in finally blocks
  • Error handling

The implementation maintains backward compatibility while adding APSW performance benefits for SQLite.

Optionally, consider logging exceptions in the try-except-pass cleanup blocks throughout these functions at debug level to aid troubleshooting, similar to the suggestion for submit_report_to_db.
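A hedged sketch of the dual-backend shape described above, with hypothetical table and column names; `sqlite3` stands in for APSW here, since both expose `cursor()`/`execute()` with `?` placeholders. The key contract is that the raw-SQL path returns the same data structure as the ORM path (here, a list of raw JSON strings):

```python
import json
import sqlite3

def fetch_events(connection, scan_id):
    """Raw-SQL path: return events as raw JSON strings, matching the ORM path."""
    cursor = connection.cursor()
    try:
        cursor.execute(
            "SELECT json_event FROM scan_events WHERE scan_unique_id = ?",
            (scan_id,),
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE scan_events (scan_unique_id TEXT, json_event TEXT)")
connection.execute(
    "INSERT INTO scan_events VALUES (?, ?)", ("scan1", json.dumps({"port": 80}))
)
events = fetch_events(connection, "scan1")
# events holds JSON strings, not parsed dicts
```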

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbc02e2 and 238bf27.

📒 Files selected for processing (6)
  • docs/Usage.md
  • nettacker/config.py
  • nettacker/core/utils/common.py
  • nettacker/database/db.py
  • pyproject.toml
  • tests/database/test_db.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/config.py
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • tests/database/test_db.py
  • nettacker/database/db.py
  • nettacker/core/utils/common.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/test_*.py: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests

Files:

  • tests/database/test_db.py
tests/**

📄 CodeRabbit inference engine (AGENTS.md)

Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)

Files:

  • tests/database/test_db.py
docs/**

📄 CodeRabbit inference engine (AGENTS.md)

Place documentation under docs/

Files:

  • docs/Usage.md
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
  • nettacker/core/utils/common.py
nettacker/core/**

📄 CodeRabbit inference engine (AGENTS.md)

Place core libraries under nettacker/core/

Files:

  • nettacker/core/utils/common.py
🧠 Learnings (14)
📓 Common learnings
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Maintain coverage (pytest configured with --cov=nettacker); add tests for new features and bug fixes

Applied to files:

  • tests/database/test_db.py
  • docs/Usage.md
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker/**/*.py : Add docstrings for public APIs in the nettacker package

Applied to files:

  • docs/Usage.md
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker/config.py : Manage defaults (API key, DB path, paths) in nettacker/config.py and review sensitive headers list before logging

Applied to files:

  • docs/Usage.md
  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • docs/Usage.md
  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Run tests with `make test` or `poetry run pytest` and ensure they pass before pushing

Applied to files:

  • pyproject.toml
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker.py : Repository provides nettacker.py as an entry script (use as Python entry point)

Applied to files:

  • pyproject.toml
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to tests/**/test_*.py : Use pytest (with pytest-asyncio, xdist) for tests

Applied to files:

  • pyproject.toml
📚 Learning: 2026-01-10T16:07:16.834Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:100-114
Timestamp: 2026-01-10T16:07:16.834Z
Learning: In nettacker/database/db.py, the send_submit_query function intentionally closes the connection in its finally block. This is by design because send_submit_query is meant to be a one-off operation that commits data and performs complete cleanup. The pattern is: create connection → pass to send_submit_query → send_submit_query commits and closes the connection.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-10-25T11:31:26.498Z
Learnt from: Aarush289
Repo: OWASP/Nettacker PR: 1155
File: nettacker/core/hostcheck.py:121-122
Timestamp: 2025-10-25T11:31:26.498Z
Learning: In nettacker/core/hostcheck.py, the resolve_quick() function intentionally applies timeout_sec per resolution pass (not cumulatively), allowing up to 2× timeout_sec total execution time across both AI_ADDRCONFIG passes by design.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Parametrize tests where useful

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
tests/database/test_db.py (1)
nettacker/database/db.py (17)
  • db_inputs (22-40)
  • create_connection (43-84)
  • send_submit_query (87-127)
  • submit_report_to_db (130-181)
  • remove_old_logs (184-232)
  • submit_logs_to_db (235-315)
  • submit_temp_logs_to_db (318-405)
  • find_temp_events (408-460)
  • find_events (463-511)
  • select_reports (514-578)
  • get_scan_result (581-625)
  • last_host_logs (628-754)
  • get_logs_by_scan_id (757-807)
  • get_options_by_scan_id (810-842)
  • logs_to_report_json (845-898)
  • logs_to_report_html (901-1013)
  • search_logs (1016-1176)
🪛 LanguageTool
docs/Usage.md

[style] ~674-~674: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...

(FEEL_FREE_TO_STYLE_ME)

🪛 Ruff (0.14.10)
tests/database/test_db.py

34-34: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


116-116: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


126-126: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


185-185: Unused method argument: mock_messages

(ARG002)


188-188: Create your own exception

(TRY002)


188-188: Avoid specifying long messages outside the exception class

(TRY003)


214-214: Unused method argument: mock_messages

(ARG002)


248-248: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


376-376: Unused method argument: mock_messages

(ARG002)


383-383: Unused method argument: mock_submit

(ARG002)


476-476: Unused method argument: mock_submit

(ARG002)


495-495: Unused method argument: mock_submit

(ARG002)


548-548: Unused method argument: mock_messages

(ARG002)


598-598: Unused method argument: mock_temp

(ARG002)


693-693: Unused method argument: mock_messages

(ARG002)


832-832: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


852-852: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


879-879: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


892-892: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


917-917: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


931-931: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


941-941: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


946-946: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


949-949: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


1138-1138: Unused method argument: mock_messages

(ARG002)

nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


124-124: Consider moving this statement to an else block

(TRY300)


125-125: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


170-171: try-except-pass detected, consider logging the exception

(S110)


170-170: Do not catch blind exception: Exception

(BLE001)


217-217: Do not catch blind exception: Exception

(BLE001)


286-286: Do not catch blind exception: Exception

(BLE001)


290-291: try-except-pass detected, consider logging the exception

(S110)


290-290: Do not catch blind exception: Exception

(BLE001)


365-366: try-except-pass detected, consider logging the exception

(S110)


365-365: Do not catch blind exception: Exception

(BLE001)


373-374: try-except-pass detected, consider logging the exception

(S110)


373-373: Do not catch blind exception: Exception

(BLE001)


376-376: Do not catch blind exception: Exception

(BLE001)


380-381: try-except-pass detected, consider logging the exception

(S110)


380-380: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


446-447: try-except-pass detected, consider logging the exception

(S110)


446-446: Do not catch blind exception: Exception

(BLE001)


491-491: Consider moving this statement to an else block

(TRY300)


492-492: Do not catch blind exception: Exception

(BLE001)


499-500: try-except-pass detected, consider logging the exception

(S110)


499-499: Do not catch blind exception: Exception

(BLE001)


554-554: Consider moving this statement to an else block

(TRY300)


556-556: Do not catch blind exception: Exception

(BLE001)


576-576: Do not catch blind exception: Exception

(BLE001)


606-606: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


614-615: try-except-pass detected, consider logging the exception

(S110)


614-614: Do not catch blind exception: Exception

(BLE001)


624-624: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


705-705: Consider moving this statement to an else block

(TRY300)


707-707: Do not catch blind exception: Exception

(BLE001)


714-715: try-except-pass detected, consider logging the exception

(S110)


714-714: Do not catch blind exception: Exception

(BLE001)


793-794: try-except-pass detected, consider logging the exception

(S110)


793-793: Do not catch blind exception: Exception

(BLE001)


835-836: try-except-pass detected, consider logging the exception

(S110)


835-835: Do not catch blind exception: Exception

(BLE001)


881-882: try-except-pass detected, consider logging the exception

(S110)


881-881: Do not catch blind exception: Exception

(BLE001)


944-945: try-except-pass detected, consider logging the exception

(S110)


944-944: Do not catch blind exception: Exception

(BLE001)


1096-1096: Do not catch blind exception: Exception

(BLE001)


1102-1103: try-except-pass detected, consider logging the exception

(S110)


1102-1102: Do not catch blind exception: Exception

(BLE001)


1172-1172: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test build on macos-latest
  • GitHub Check: Test build on ubuntu-24.04
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image
🔇 Additional comments (20)
nettacker/core/utils/common.py (1)

6-6: LGTM! Import properly placed.

The json import is required for the new deserialization logic below and follows proper alphabetical ordering.

pyproject.toml (3)

62-62: Verify the Python version upper bound constraint.

The constraint ^3.9, <3.13 excludes Python 3.13 and later versions. While this may be intentional due to compatibility concerns, it could limit adoption of newer Python releases. Confirm whether this upper bound is necessary or if it should be relaxed to support future Python versions.


73-73: LGTM! Async testing infrastructure added.

The addition of pytest-asyncio as a dev dependency and the asyncio marker configuration properly support asynchronous testing, which aligns with the async patterns likely introduced in the database operations refactor.

Also applies to: 98-100


49-49: Remove the concern about impacket's relevance; the dependency versions are secure and current.

The versions pinned in pyproject.toml are secure and appropriately chosen:

  • apsw ^3.50.0.0: No known CVEs; latest available is 3.51.2.0.
  • pymysql 1.1.1: Properly fixes CVE-2024-36039 (SQL injection via untrusted JSON keys); earlier versions are vulnerable.
  • impacket ^0.11.0: No vulnerabilities; CVE-2020-17049 was fixed in this release. The dependency is actively used in the codebase for SMB protocol support (nettacker/core/lib/smb.py), so it is directly relevant to the project's scope.

All three versions can optionally be updated to their latest releases (3.51.2.0, 1.1.2, and 0.13.0 respectively) within the semantic versioning constraints if desired.

docs/Usage.md (4)

649-657: LGTM! Clear database overview.

The documentation clearly introduces the three supported database backends (MySQL, PostgreSQL, SQLite) and notes that SQLite is the default, setting appropriate context for the configuration sections that follow.


658-675: LGTM! Comprehensive SQLite/APSW configuration documented.

The SQLite configuration section clearly documents:

  • The default database path (.nettacker/data/nettacker.db)
  • Performance-optimized settings (journal_mode="WAL" and synchronous_mode="NORMAL")
  • All configuration fields with appropriate defaults

These settings are well-suited for multithreaded I/O operations as mentioned.


678-695: LGTM! MySQL configuration clearly documented.

The MySQL configuration section provides a complete example with all necessary fields (engine, name, host, port, credentials) and helpfully notes which fields are irrelevant for MySQL. The structure is clear and easy to follow.


696-715: LGTM! PostgreSQL configuration well documented.

The PostgreSQL configuration section is comprehensive and includes a helpful security note about enabling encryption with ssl_mode = "require". The documentation clearly identifies which fields are irrelevant for PostgreSQL, making it easy for users to understand what to configure.

nettacker/database/db.py (7)

4-8: LGTM: Clean optional import pattern.

The optional APSW import with fallback to None is well-implemented and allows graceful degradation when APSW is not available.


99-127: LGTM: Dual backend submit logic is well-implemented.

The function correctly handles both APSW (tuple) and SQLAlchemy (session) backends with retry logic. The connection close in the finally block is intentional for the APSW path (as per learnings) since this is a one-off commit operation.

Based on learnings, the broad exception handling is necessary for the retry mechanism to work across different database error types.


144-171: LGTM: Dual backend report submission is correctly implemented.

The APSW path with explicit BEGIN and SQL INSERT is well-structured. The PR author confirmed that BEGIN is intentional to bind transactions to the connection. Resource cleanup in the finally block handles both success and failure cases appropriately.


249-298: LGTM: Robust retry logic for APSW path.

The APSW implementation includes comprehensive retry handling with specific BusyError detection for database locks. The transaction state checks (in_transaction) prevent double-rollback errors.

Based on learnings, returning True after exhausting retries (line 295) is an intentional design decision to ensure operational continuity—allowing scans to continue even if individual log entries fail.


332-388: LGTM: Consistent implementation with submit_logs_to_db.

The temporary logs submission follows the same robust patterns as the main log submission, including retry logic, BusyError handling, and intentional return of True after retry exhaustion for operational continuity.


422-460: LGTM: Consistent JSON string returns across backends.

Both APSW and SQLAlchemy paths correctly return raw JSON strings (not parsed dicts) for consistency with calling code expectations. Based on learnings, this pattern is intentional—APSW returns row[0] directly and SQLAlchemy returns result.event.


476-511: LGTM: Correct JSON string handling per learnings.

The APSW path uses json.dumps(json.loads(row[0])) pattern which, per learnings, is required to maintain consistency with the SQLAlchemy branch that returns raw JSON strings. Both backends correctly return lists of JSON strings.
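A small illustration of the consistency rule: callers expect JSON strings, and round-tripping through `json.loads`/`json.dumps` normalizes the value while keeping it a string, whereas returning the parsed dict would change the type.

```python
import json

row_value = '{"port": 80, "state": "open"}'  # raw JSON string from the database

# The pattern used in the APSW branch: normalize but stay a string.
normalized = json.dumps(json.loads(row_value))

# Returning this instead would break callers that expect str:
parsed = json.loads(row_value)
```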

tests/database/test_db.py (5)

1-66: LGTM: Well-structured test suite.

The test class setup is clean with comprehensive sample data fixtures. The imports are appropriate for testing both APSW and SQLAlchemy paths with proper mocking.


67-149: LGTM: Comprehensive connection creation tests.

The tests properly verify both APSW SQLite path (with PRAGMA settings and busy timeout) and SQLAlchemy path for other databases. Mocking is correctly applied to isolate the functionality under test.


150-334: LGTM: Thorough testing of submit operations.

The tests cover both success and failure scenarios for both backends. The retry logic is properly tested including the case where retries are exhausted.

Note: Static analysis flags unused method arguments (e.g., mock_messages, mock_submit) but these are necessary for @patch decorators and are not actual issues.


336-651: LGTM: Excellent coverage of log submission scenarios.

The tests thoroughly validate:

  • Both APSW and SQLAlchemy paths
  • Retry logic with BusyError handling
  • Transaction state management
  • Error cases (invalid input, database errors, submission failures)
  • Operational continuity (returning True after retry exhaustion)

The test structure mirrors the complexity of the implementation and validates all critical paths.


653-1278: LGTM: Comprehensive query and report generation tests.

Excellent test coverage for all database query operations:

  • find_temp_events and find_events with both backends
  • Report selection and retrieval
  • Log retrieval by scan ID
  • HTML and JSON report generation
  • Search functionality with pagination

Tests validate both successful operations and error handling across APSW and SQLAlchemy paths. The mock data structures accurately simulate real database results.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🤖 Fix all issues with AI agents
In @nettacker/database/db.py:
- Around line 603-607: The try block in the function that returns (filename,
data) opens the file with open(str(filename), "rb").read() without closing the
handle; change it to use a context manager (with open(str(filename), "rb") as
f:) and read into a variable, then return filename and the data, ensuring the
file is closed even on exceptions; keep the existing except IOError as e handler
and return None on error.
- Around line 728-732: The list comprehension accesses
session.query(HostsLog).filter(...).order_by(...).first().date which can raise
AttributeError if .first() returns None; change it to fetch the row into a
temporary variable (e.g., last_log =
session.query(HostsLog).filter(HostsLog.target ==
host.target).order_by(HostsLog.id.desc()).first()) and then use a conditional
like last_log.date if last_log else None (or another appropriate default) when
building the dict so it matches the APSW path's None-safe behavior.
- Around line 621-625: The try block in the function that returns
report.report_path_filename opens the file with
open(str(report.report_path_filename), "rb") but never closes it, causing a
resource leak; change the logic to open the file using a context manager (with
open(...) as f:) to read the bytes into a variable and then return
(report.report_path_filename, data) so the file is always closed, while keeping
the existing except IOError as e handler that logs the error and returns None.
- Around line 99-113: The retry loop and session handling are wrong: change the
loop to attempt exactly Config.settings.max_submit_query_retry times (use
range(Config.settings.max_submit_query_retry) semantics) and stop discarding the
cursor; when session is a tuple extract both connection and cursor (e.g.,
connection, cursor = session) and call cursor.execute("COMMIT") /
cursor.execute("ROLLBACK") rather than connection.execute so behavior matches
the rest of the codebase; keep the same sleep/close/fallback logging
(logger.warn(messages("database_connect_fail"))) but ensure connection.close()
is still called in the finally block.
- Around line 1159-1166: The comparisons are using the raw JSON string
(data.port, data.event, data.json_event) but the lists in
selected[capture]["info"] store parsed objects, so duplicates are always added;
fix by parsing the JSON first and comparing the parsed value before appending
(e.g., compute parsed_port = json.loads(data.port) and check parsed_port not in
selected[capture]["info"]["port"] before appending), and do the same for
data.event and data.json_event to mirror the APSW logic.
- Around line 161-171: The outer finally redundantly closes cursor/connection
that send_submit_query(session) already commits and closes; remove the
cursor.close() and connection.close() calls from this finally block so cleanup
is handled only inside send_submit_query, or alternatively guard the outer close
to run only if connection/cursor exist and are not already closed; focus changes
around the finally that references cursor and connection and the
send_submit_query(session) call to avoid double-closing the same DB resources.
- Around line 850-877: The APSW branch in the code that uses
create_connection()/cursor executes a SELECT into rows but only returns
return_logs inside the if rows: block, causing an implicit None when no rows are
found; update the APSW path (the code around create_connection, cursor.execute,
rows, and return_logs) so that after the rows handling completes (but before
exiting the try/finally) it always returns return_logs (an empty list when no
rows), mirroring the ORM path and ensuring callers can safely iterate the
result.
- Around line 790-802: The ORM branch returns raw JSON string for
HostsLog.json_event causing a type mismatch with the APSW branch; update the
list comprehension that builds the dict (the branch using HostsLog and keys
"port","event","json_event") to parse json_event the same way as APSW (e.g., use
json.loads(log.json_event) and handle null/empty safely), so json_event is a
parsed object consistently across backends.
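Two of the fixes above concern unclosed file handles. A minimal sketch of the context-manager pattern being asked for (the helper name is hypothetical); the handle is closed even if `.read()` raises:

```python
import logging
import tempfile

logger = logging.getLogger(__name__)

def read_report_bytes(filename):
    """Return (filename, data) or None on I/O error; never leaks the handle."""
    try:
        with open(str(filename), "rb") as report_file:
            return filename, report_file.read()
    except IOError as e:
        logger.error(f"Failed to read report file: {e}")
        return None

# Demonstration with a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"report body")
    path = tmp.name

result = read_report_bytes(path)
missing = read_report_bytes(path + ".does-not-exist")
```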
🧹 Nitpick comments (1)
nettacker/database/db.py (1)

942-968: Consider extracting duplicated HTML generation logic.

The HTML generation code (table building, formatting, footer) is duplicated between the APSW and ORM paths. Extracting this into a helper function would improve maintainability.

Also applies to: 982-1008
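A hedged sketch of the extraction (hypothetical names, escaping omitted for brevity): both backend branches call one row-formatting helper instead of duplicating the table markup.

```python
def build_html_table(rows, columns):
    """Render a list of dicts as a simple HTML table (no escaping; sketch only)."""
    header = "".join(f"<th>{col}</th>" for col in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{row[col]}</td>" for col in columns) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"

# Both the APSW and ORM paths would feed their rows through the same helper.
rows = [{"target": "example.com", "port": 80}]
table = build_html_table(rows, ["target", "port"])
```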

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 238bf27 and 99f0fcc.

📒 Files selected for processing (2)
  • nettacker/core/utils/common.py
  • nettacker/database/db.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/core/utils/common.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/database/db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2026-01-10T16:07:16.834Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:100-114
Timestamp: 2026-01-10T16:07:16.834Z
Learning: In nettacker/database/db.py, the send_submit_query function intentionally closes the connection in its finally block. This is by design because send_submit_query is meant to be a one-off operation that commits data and performs complete cleanup. The pattern is: create connection → pass to send_submit_query → send_submit_query commits and closes the connection.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-10-25T11:31:26.498Z
Learnt from: Aarush289
Repo: OWASP/Nettacker PR: 1155
File: nettacker/core/hostcheck.py:121-122
Timestamp: 2025-10-25T11:31:26.498Z
Learning: In nettacker/core/hostcheck.py, the resolve_quick() function intentionally applies timeout_sec per resolution pass (not cumulatively), allowing up to 2× timeout_sec total execution time across both AI_ADDRCONFIG passes by design.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker/config.py : Manage defaults (API key, DB path, paths) in nettacker/config.py and review sensitive headers list before logging

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Parametrize tests where useful

Applied to files:

  • nettacker/database/db.py
🪛 Ruff (0.14.10)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


124-124: Consider moving this statement to an else block

(TRY300)


125-125: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


170-171: try-except-pass detected, consider logging the exception

(S110)


170-170: Do not catch blind exception: Exception

(BLE001)


217-217: Do not catch blind exception: Exception

(BLE001)


286-286: Do not catch blind exception: Exception

(BLE001)


290-291: try-except-pass detected, consider logging the exception

(S110)


290-290: Do not catch blind exception: Exception

(BLE001)


365-366: try-except-pass detected, consider logging the exception

(S110)


365-365: Do not catch blind exception: Exception

(BLE001)


373-374: try-except-pass detected, consider logging the exception

(S110)


373-373: Do not catch blind exception: Exception

(BLE001)


376-376: Do not catch blind exception: Exception

(BLE001)


380-381: try-except-pass detected, consider logging the exception

(S110)


380-380: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


446-447: try-except-pass detected, consider logging the exception

(S110)


446-446: Do not catch blind exception: Exception

(BLE001)


491-491: Consider moving this statement to an else block

(TRY300)


492-492: Do not catch blind exception: Exception

(BLE001)


499-500: try-except-pass detected, consider logging the exception

(S110)


499-499: Do not catch blind exception: Exception

(BLE001)


554-554: Consider moving this statement to an else block

(TRY300)


556-556: Do not catch blind exception: Exception

(BLE001)


576-576: Do not catch blind exception: Exception

(BLE001)


606-606: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


614-615: try-except-pass detected, consider logging the exception

(S110)


614-614: Do not catch blind exception: Exception

(BLE001)


624-624: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


705-705: Consider moving this statement to an else block

(TRY300)


707-707: Do not catch blind exception: Exception

(BLE001)


714-715: try-except-pass detected, consider logging the exception

(S110)


714-714: Do not catch blind exception: Exception

(BLE001)


788-789: try-except-pass detected, consider logging the exception

(S110)


788-788: Do not catch blind exception: Exception

(BLE001)


830-831: try-except-pass detected, consider logging the exception

(S110)


830-830: Do not catch blind exception: Exception

(BLE001)


876-877: try-except-pass detected, consider logging the exception

(S110)


876-876: Do not catch blind exception: Exception

(BLE001)


939-940: try-except-pass detected, consider logging the exception

(S110)


939-939: Do not catch blind exception: Exception

(BLE001)


1091-1091: Do not catch blind exception: Exception

(BLE001)


1097-1098: try-except-pass detected, consider logging the exception

(S110)


1097-1097: Do not catch blind exception: Exception

(BLE001)


1167-1167: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Build package
  • GitHub Check: Test Docker image
🔇 Additional comments (1)
nettacker/database/db.py (1)

436-438: Return type is consistent with expectations.

Both APSW and ORM branches return JSON strings (or empty list), which matches the documented expectation that calling code expects JSON strings. Based on learnings, this pattern is intentional.

…dle __enter__ (as earlier we didn't have to)
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
nettacker/database/db.py (3)

130-176: APSW branch of submit_report_to_db() needs a finally cleanup.

On success it returns after send_submit_query(session), and on exception it returns after ROLLBACK, but neither path closes cursor/connection here. If you adopt the send_submit_query() fix above (closing cursor+connection), this becomes safe; otherwise add a local finally.


43-85: Fix busy-timeout multiplier: should be * 1000 (milliseconds), not * 100.

The config.settings.timeout defaults to 3.0 (seconds). APSW's setbusytimeout() expects milliseconds, so the current * 100 multiplier results in 300ms instead of the intended 3000ms timeout. Change to * 1000 to correctly convert seconds to milliseconds.

The PRAGMA f-strings (f"PRAGMA journal_mode={Config.db.journal_mode}") are currently safe since values are hardcoded config defaults, not user input. However, if these become user-configurable in the future, add validation against an allowlist to prevent PRAGMA/statement injection via APSW's multi-statement support.
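The unit mismatch above comes down to one multiplication; `timeout_seconds` is a hypothetical stand-in for `config.settings.timeout`, and the millisecond values correspond to what APSW's `setbusytimeout()` would receive:

```python
# Hypothetical stand-in for config.settings.timeout (seconds).
timeout_seconds = 3.0

# setbusytimeout() takes milliseconds, so seconds must be
# multiplied by 1000, not 100.
wrong_ms = int(timeout_seconds * 100)   # 300 ms: waits far too briefly
right_ms = int(timeout_seconds * 1000)  # 3000 ms: the intended timeout

print(wrong_ms, right_ms)  # 300 3000
```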


750-801: get_logs_by_scan_id() returns inconsistent json_event types across backends.

APSW path returns json.loads(row[6]) (dict), while ORM path returns log.json_event (JSON string). Both should return the same type for consistency.

Caution: The suggested fix to align both branches on dict type aligns with the test expectation and matches the pattern in get_logs_by_target(). However, this will cause graph.py:134 to fail, which calls .strip() on json_event—a method that only exists on strings, not dicts. Verify that this code path is tested or reachable before applying the fix, or adjust graph.py:134 to handle dict type.

Suggested fix (align both to dict)
         return [
             {
@@
-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
             }
             for log in session.query(HostsLog).filter(HostsLog.scan_unique_id == scan_id).all()
         ]
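A minimal sketch of the normalization the suggested fix aims for, assuming `json_event` arrives either as a raw JSON string (as both backends return it) or is absent; the helper name is hypothetical:

```python
import json

def normalize_json_event(raw):
    """Return json_event as a dict in both backends; empty dict when absent."""
    if not raw:
        return {}
    # APSW rows hand back the raw JSON text of the column; the ORM
    # attribute is also a JSON string, so one json.loads covers both.
    return json.loads(raw) if isinstance(raw, str) else raw

print(normalize_json_event('{"port": 80}'))  # {'port': 80}
print(normalize_json_event(None))            # {}
```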
🤖 Fix all issues with AI agents
In @nettacker/database/db.py:
- Around line 99-127: In submit_report_to_db() and send_submit_query() ensure
APSW cursors and connections are always closed and fix the retry off-by-one:
change the retry loops from range(1, Config.settings.max_submit_query_retry)
(which makes max-1 attempts) to either
range(Config.settings.max_submit_query_retry) or range(1,
Config.settings.max_submit_query_retry + 1) and wrap the commit/rollback logic
in try/finally that closes cursor.close() and connection.close() (for the
tuple/APS W branch use the cursor variable returned from the APSW session and
close it before closing the connection; for the SQLAlchemy/session branch ensure
session is properly closed or removed in the finally block). Ensure error paths
also close resources and preserve existing logging and sleep(retry_delay).
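The off-by-one called out above follows from Python's half-open `range`; with a hypothetical `max_retry = 3` standing in for `Config.settings.max_submit_query_retry`:

```python
max_retry = 3  # hypothetical stand-in for Config.settings.max_submit_query_retry

# range(1, max_retry) yields only max_retry - 1 attempts.
print(list(range(1, max_retry)))      # [1, 2]

# Either form below performs the full max_retry attempts.
print(list(range(max_retry)))         # [0, 1, 2]
print(list(range(1, max_retry + 1)))  # [1, 2, 3]
```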

In @tests/database/test_db.py:
- Around line 29-66: The test hard-codes "/tmp/test_report.json" in
TestDatabase.setup_method (sample_event["options"]["report_path_filename"]),
triggering ruff S108; fix by either replacing that literal with a pytest
tmp_path-based value (e.g., use tmp_path / "test_report.json" converted to str
and pass tmp_path into the test that constructs sample_event, or build
sample_event inside a fixture that receives tmp_path) or, if you intend to keep
a constant, annotate the literal with "# noqa: S108" next to
sample_event["options"]["report_path_filename"]; update references to
sample_event accordingly so tests use the tmp_path-derived path or the
noqa-marked constant.
- Around line 183-223: Tests patch decorators inject args that are unused (e.g.,
in test_send_submit_query_sqlite_failure and
test_send_submit_query_sqlalchemy_failure); rename those unused parameters to
start with an underscore (for example _mock_warn, _mock_messages) to satisfy
ruff ARG002; apply the same _-prefix convention to other affected test functions
mentioned (around lines 374-380, 546-552, 909-945, 1132-1134) so patched but
unused fixtures are clearly marked as intentionally unused.
- Around line 705-733: The tests for find_temp_events currently use a dict and
expect a dict; change them to match the DB contract which returns JSON strings
by setting fake_result.event to a JSON string (e.g., '{"foo":"bar"}') and update
the assertion to expect that string; also change the "no result" test to assert
that find_temp_events(...) returns [] directly instead of coercing to None.
Ensure you modify the mocks used in test_sqlalchemy_successful_lookup
(fake_result.event) and the assertions in both test_sqlalchemy_successful_lookup
and test_sqlalchemy_no_result to match the string/empty-list contract.
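The adjusted test expectation can be sketched as follows; the mock name mirrors the one in the comment, and the key point is that both the mocked `event` attribute and the assertion use a JSON string rather than a dict:

```python
import json
from unittest.mock import MagicMock

# Hypothetical mock mirroring the ORM result object.
fake_result = MagicMock()
fake_result.event = '{"foo": "bar"}'  # JSON string, not a dict

# The contract: callers receive the JSON string and parse it themselves.
returned = fake_result.event
assert returned == '{"foo": "bar"}'
assert json.loads(returned) == {"foo": "bar"}

# The "no result" case returns an empty list, not None.
no_result = []
assert no_result == []
```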
🧹 Nitpick comments (1)
tests/database/test_db.py (1)

340-513: Nice coverage across dual backends; consider pytest parametrization to cut repetition.

Also applies to: 515-652

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99f0fcc and 6924186.

📒 Files selected for processing (2)
  • nettacker/database/db.py
  • tests/database/test_db.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/database/db.py
  • tests/database/test_db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/test_*.py: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests

Files:

  • tests/database/test_db.py
tests/**

📄 CodeRabbit inference engine (AGENTS.md)

Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)

Files:

  • tests/database/test_db.py
🧠 Learnings (10)
📓 Common learnings
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
  • tests/database/test_db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
  • tests/database/test_db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
  • tests/database/test_db.py
📚 Learning: 2026-01-10T16:07:16.834Z
Learnt from: pUrGe12
Repo: OWASP/Nettacker PR: 1076
File: nettacker/database/db.py:100-114
Timestamp: 2026-01-10T16:07:16.834Z
Learning: In nettacker/database/db.py, the send_submit_query function intentionally closes the connection in its finally block. This is by design because send_submit_query is meant to be a one-off operation that commits data and performs complete cleanup. The pattern is: create connection → pass to send_submit_query → send_submit_query commits and closes the connection.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-10-25T11:31:26.498Z
Learnt from: Aarush289
Repo: OWASP/Nettacker PR: 1155
File: nettacker/core/hostcheck.py:121-122
Timestamp: 2025-10-25T11:31:26.498Z
Learning: In nettacker/core/hostcheck.py, the resolve_quick() function intentionally applies timeout_sec per resolution pass (not cumulatively), allowing up to 2× timeout_sec total execution time across both AI_ADDRCONFIG passes by design.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Applies to nettacker/config.py : Manage defaults (API key, DB path, paths) in nettacker/config.py and review sensitive headers list before logging

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Parametrize tests where useful

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-09-07T19:20:58.332Z
Learnt from: CR
Repo: OWASP/Nettacker PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-09-07T19:20:58.332Z
Learning: Maintain coverage (pytest configured with --cov=nettacker); add tests for new features and bug fixes

Applied to files:

  • nettacker/database/db.py
  • tests/database/test_db.py
🧬 Code graph analysis (2)
nettacker/database/db.py (2)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
tests/database/test_db.py (2)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (22-40)
  • create_connection (43-84)
  • send_submit_query (87-127)
  • submit_report_to_db (130-175)
  • remove_old_logs (178-226)
  • submit_logs_to_db (229-309)
  • submit_temp_logs_to_db (312-399)
  • find_temp_events (402-454)
  • find_events (457-505)
  • select_reports (508-572)
  • get_scan_result (575-623)
  • last_host_logs (626-747)
  • get_logs_by_scan_id (750-800)
  • get_options_by_scan_id (803-835)
  • logs_to_report_json (838-891)
  • logs_to_report_html (894-1006)
  • search_logs (1009-1169)
🪛 Ruff (0.14.10)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


124-124: Consider moving this statement to an else block

(TRY300)


125-125: Do not catch blind exception: Exception

(BLE001)


145-145: Unpacked variable connection is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Do not catch blind exception: Exception

(BLE001)


211-211: Do not catch blind exception: Exception

(BLE001)


280-280: Do not catch blind exception: Exception

(BLE001)


284-285: try-except-pass detected, consider logging the exception

(S110)


284-284: Do not catch blind exception: Exception

(BLE001)


359-360: try-except-pass detected, consider logging the exception

(S110)


359-359: Do not catch blind exception: Exception

(BLE001)


367-368: try-except-pass detected, consider logging the exception

(S110)


367-367: Do not catch blind exception: Exception

(BLE001)


370-370: Do not catch blind exception: Exception

(BLE001)


374-375: try-except-pass detected, consider logging the exception

(S110)


374-374: Do not catch blind exception: Exception

(BLE001)


432-432: Consider moving this statement to an else block

(TRY300)


433-433: Do not catch blind exception: Exception

(BLE001)


440-441: try-except-pass detected, consider logging the exception

(S110)


440-440: Do not catch blind exception: Exception

(BLE001)


485-485: Consider moving this statement to an else block

(TRY300)


486-486: Do not catch blind exception: Exception

(BLE001)


493-494: try-except-pass detected, consider logging the exception

(S110)


493-493: Do not catch blind exception: Exception

(BLE001)


548-548: Consider moving this statement to an else block

(TRY300)


550-550: Do not catch blind exception: Exception

(BLE001)


570-570: Do not catch blind exception: Exception

(BLE001)


600-600: Consider moving this statement to an else block

(TRY300)


602-602: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


610-611: try-except-pass detected, consider logging the exception

(S110)


610-610: Do not catch blind exception: Exception

(BLE001)


620-620: Consider moving this statement to an else block

(TRY300)


622-622: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


703-703: Consider moving this statement to an else block

(TRY300)


705-705: Do not catch blind exception: Exception

(BLE001)


712-713: try-except-pass detected, consider logging the exception

(S110)


712-712: Do not catch blind exception: Exception

(BLE001)


786-787: try-except-pass detected, consider logging the exception

(S110)


786-786: Do not catch blind exception: Exception

(BLE001)


828-829: try-except-pass detected, consider logging the exception

(S110)


828-828: Do not catch blind exception: Exception

(BLE001)


874-875: try-except-pass detected, consider logging the exception

(S110)


874-874: Do not catch blind exception: Exception

(BLE001)


937-938: try-except-pass detected, consider logging the exception

(S110)


937-937: Do not catch blind exception: Exception

(BLE001)


1089-1089: Do not catch blind exception: Exception

(BLE001)


1095-1096: try-except-pass detected, consider logging the exception

(S110)


1095-1095: Do not catch blind exception: Exception

(BLE001)


1165-1165: Do not catch blind exception: Exception

(BLE001)

tests/database/test_db.py

34-34: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


116-116: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


126-126: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


185-185: Unused method argument: mock_messages

(ARG002)


188-188: Create your own exception

(TRY002)


188-188: Avoid specifying long messages outside the exception class

(TRY003)


214-214: Unused method argument: mock_messages

(ARG002)


248-248: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


376-376: Unused method argument: mock_messages

(ARG002)


383-383: Unused method argument: mock_submit

(ARG002)


476-476: Unused method argument: mock_submit

(ARG002)


495-495: Unused method argument: mock_submit

(ARG002)


548-548: Unused method argument: mock_messages

(ARG002)


598-598: Unused method argument: mock_temp

(ARG002)


693-693: Unused method argument: mock_messages

(ARG002)


832-832: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


852-852: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


879-879: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


892-892: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


911-911: Unused method argument: mock_open

(ARG002)


916-916: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


926-926: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


936-936: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


941-941: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


944-944: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


1133-1133: Unused method argument: mock_messages

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test build on ubuntu-24.04
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image
🔇 Additional comments (1)
nettacker/database/db.py (1)

402-506: find_temp_events() / find_events() return JSON strings consistently — keep tests aligned.

The APSW branches returning raw JSON strings (or json round-trip in find_events) matches the established calling expectations. Based on learnings, avoid changing these to parsed dicts.

arkid15r
arkid15r previously approved these changes Jan 13, 2026
Signed-off-by: Achintya Jai <153343775+pUrGe12@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@nettacker/core/app.py`:
- Around line 159-163: The loop that calls find_events and does json.loads(row)
can raise on malformed rows; guard parsing and structure access by wrapping
json.loads(row) and the ["response"]["conditions_results"]["content"] lookup in
a try/except, validate that the parsed object has the expected keys and that
content is an iterable before iterating, and on any exception or missing/invalid
shape log a warning (including the error and the offending row) and continue to
the next row so the scan does not abort; target symbols to change: the loop in
nettacker/core/app.py that uses find_events(target, "subdomain_scan", scan_id),
json.loads(row), and self.arguments.targets.
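A defensive version of that loop might look like the following sketch; the row shape and key path come from the comment above, while the function and logger names are assumptions:

```python
import json
import logging

logger = logging.getLogger(__name__)

def extract_subdomains(rows):
    """Collect subdomain content from event rows, skipping malformed ones."""
    found = []
    for row in rows:
        try:
            event = json.loads(row)
            content = event["response"]["conditions_results"]["content"]
        except (json.JSONDecodeError, KeyError, TypeError) as error:
            # Log and move on so one bad row cannot abort the scan.
            logger.warning("Skipping malformed event row %r: %s", row, error)
            continue
        if isinstance(content, (list, tuple)):
            found.extend(content)
    return found

rows = [
    '{"response": {"conditions_results": {"content": ["a.example.com"]}}}',
    "not json",          # malformed JSON: skipped with a warning
    '{"response": {}}',  # missing keys: skipped with a warning
]
print(extract_subdomains(rows))  # ['a.example.com']
```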
🧹 Nitpick comments (1)
nettacker/core/graph.py (1)

85-101: Avoid shadowing the module logger in build_text_table.
Rename the local log variable to keep the logger reference clear.

♻️ Suggested refactor
-        log = merge_logs_to_list(event, [])
+        log_list = merge_logs_to_list(event, [])
         _table.add_rows(
             [
                 table_headers,
                 [
                     event["date"],
                     event["target"],
                     event["module_name"],
                     str(event["port"]),
-                    "\n".join(log) if log else "Detected",
+                    "\n".join(log_list) if log_list else "Detected",
                 ],
             ]
         )

@securestep9 securestep9 self-assigned this Feb 3, 2026
@securestep9 securestep9 added this pull request to the merge queue Feb 3, 2026
Merged via the queue into OWASP:master with commit 29d09c7 Feb 3, 2026
18 checks passed

Labels

gsoc GSoC work