Fix crawler results pagination in web UI#1294
kwei-zhang wants to merge 2 commits into OWASP:master
Conversation
Summary by CodeRabbit
Walkthrough: Adjusted pagination handling and response parsing; the database offset changed to a zero-based calculation.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Pull request overview
This pull request fixes two critical bugs in the crawler results pagination feature of the web UI:
- Wrong page count: The JavaScript code was calculating the total number of pages from the string length of the JSON response instead of the array length, causing it to show hundreds of phantom pages (e.g., "Page 1 of 500" when there were only 5 results); see the sketch after this list.
- Duplicate results: The database pagination offset calculation was incorrect for 0-based page indexing, causing pages 1 and 2 to show the same results.
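To make the first bug concrete, here is a minimal Python illustration of the string-length-versus-array-length mistake; the literal response body and the ceiling math are invented for this sketch and are not Nettacker's actual code:

```python
import json

body = '[{"host": "a"}, {"host": "b"}, {"host": "c"}]'  # 3 results, 45 characters

# Wrong: paginating on the character count of the raw response text.
phantom_pages = -(-len(body) // 10)   # ceil(45 / 10) -> 5 phantom pages

# Right: parse the JSON first, then count array items.
items = json.loads(body)
total_pages = -(-len(items) // 10)    # ceil(3 / 10) -> 1 page

print(phantom_pages, total_pages)     # 5 1
```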
Changes:
- Fixed the JavaScript pagination logic to parse the JSON response before counting items, and to calculate the total page count with a heuristic based on the number of items returned
- Corrected the database offset calculation from `(page * 10) - 10` to `page * 10` to properly handle 0-based page indexing from the API; see the arithmetic sketch after this list
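To spell out the off-by-one, a small sketch in plain Python (not the project's code) of the two page conventions and why the old formula served the first page twice:

```python
PAGE_SIZE = 10

# Both conventions produce the same offsets; the bug was mixing them.
def offset_zero_based(page: int) -> int:   # page = 0, 1, 2, ...
    return page * PAGE_SIZE

def offset_one_based(page: int) -> int:    # page = 1, 2, 3, ...
    return (page - 1) * PAGE_SIZE

assert offset_zero_based(0) == offset_one_based(1) == 0
assert offset_zero_based(1) == offset_one_based(2) == 10

# The old code applied the 1-based formula to the API's 0-based page:
assert (0 * PAGE_SIZE) - 10 == -10  # page 0 -> offset -10, which SQLite treats as 0
assert (1 * PAGE_SIZE) - 10 == 0    # page 1 -> offset 0: the same rows twice
```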
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| nettacker/web/static/js/main.js | Fixed pagination by parsing JSON response correctly and counting array items instead of string length; implemented heuristic for total pages calculation |
| nettacker/database/db.py | Corrected offset calculation for 0-based page indexing in both SQLite and SQLAlchemy query paths |
```js
const totalPages =
  itemCount < 10
    ? crawler_page
    : Math.max(crawler_page, crawler_page + 1);
```
The pagination logic uses a heuristic approach: if fewer than 10 items are returned, it assumes this is the last page. However, this will show an unnecessary extra page when there are exactly N*10 items (e.g., if there are exactly 20 results, page 2 will show 10 items and indicate "Page 2 of 3", and clicking page 3 will show an empty page).
This is a common limitation when the API doesn't return a total count. Consider one of these improvements: (1) Have the API return the total count alongside results, or (2) Fetch one extra item (limit 11) and only display 10, using the 11th item's presence to determine if there's a next page. Option 2 would avoid the "empty last page" issue without requiring API changes.
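As a rough illustration of option 2, a minimal, self-contained Python sketch; `fetch_page` and `fetch` are hypothetical stand-ins, not Nettacker's actual `search_logs`:

```python
PAGE_SIZE = 10

def fetch_page(fetch, page):
    """Over-fetch by one row so the caller can tell whether a next page exists.

    `fetch(offset, limit)` stands in for the real database query; `page` is 1-based.
    """
    rows = fetch((page - 1) * PAGE_SIZE, PAGE_SIZE + 1)   # ask for 11 rows
    return rows[:PAGE_SIZE], len(rows) > PAGE_SIZE        # show 10, flag the 11th

# Usage against a fake 20-row result set:
data = list(range(20))
fake_fetch = lambda offset, limit: data[offset:offset + limit]
assert fetch_page(fake_fetch, 2) == (data[10:20], False)  # exactly 20 rows: no empty page 3
```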
nettacker/database/db.py (Outdated)
| f"%{query}%", | ||
| f"%{query}%", | ||
| (page * 10) - 10, | ||
| page * 10, |
The offset calculation has been changed from (page * 10) - 10 to page * 10. This is correct for the actual usage because the API endpoint at nettacker/api/engine.py:546-547 converts 1-based page numbers to 0-based before calling this function.
However, this breaks the existing unit tests in tests/database/test_db.py which call search_logs() directly with page=1 (1-based), expecting it to return the first page of results. After this change, page=1 will return the second page (offset=10). The tests at lines 1193, 1225, 1249, 1261, and 1271 will need to be updated to pass page=0 instead of page=1, or the test setup needs to account for the API's conversion logic.
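A hedged sketch of what that test update might look like; the real fixtures, assertions, and the exact `search_logs` signature in `tests/database/test_db.py` may differ:

```python
from nettacker.database.db import search_logs

def test_search_logs_first_page():
    # Hypothetical shape: under the 0-based convention described above,
    # the first page of results is requested with page=0, not page=1.
    results = search_logs(page=0, query="owasp")  # keyword names are assumed
    assert len(results) <= 10  # one page holds at most 10 rows
```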
Actionable comments posted: 1
🧹 Nitpick comments (1)
nettacker/web/static/js/main.js (1)
978-981: `Math.max(crawler_page, crawler_page + 1)` is always `crawler_page + 1`; simplify.
`Math.max(x, x + 1)` unconditionally returns `x + 1`, so the `Math.max` call is a no-op that adds noise. The whole expression reduces to a straightforward ternary:
♻️ Suggested simplification
```diff
- const totalPages =
-   itemCount < 10
-     ? crawler_page
-     : Math.max(crawler_page, crawler_page + 1);
+ const totalPages = itemCount < 10 ? crawler_page : crawler_page + 1;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `nettacker/web/static/js/main.js` around lines 978-981: the ternary computing totalPages uses Math.max(crawler_page, crawler_page + 1), which is always crawler_page + 1; simplify the expression by replacing the Math.max call so totalPages is set to crawler_page when itemCount < 10 and to crawler_page + 1 otherwise (look for the const totalPages declaration and the variables itemCount and crawler_page).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `nettacker/database/db.py`:
- Line 1046: search_logs currently computes the SQL offset as page * 10 which
treats page as 0-based and causes the API route go_for_search_logs (which passes
1-based page values) to skip the first results; update the offset calculation in
search_logs to use the same 1-based convention as select_reports and
last_host_logs by changing the offset expression to (page - 1) * 10 (ensure this
uses the existing page parameter and preserves current limit of 10).
---
Nitpick comments:
In `nettacker/web/static/js/main.js`:
- Around lines 978-981: the ternary computing totalPages uses
Math.max(crawler_page, crawler_page + 1) which is always crawler_page + 1;
simplify the expression by replacing the Math.max call so totalPages is set to
crawler_page when itemCount < 10 and to crawler_page + 1 otherwise (look for the
const totalPages declaration and the variables itemCount and crawler_page).
🧹 Nitpick comments (1)
nettacker/api/engine.py (1)
544-554: Pagination fix is correct; consider aligning the same pattern in sibling endpoints. The three-part fix (catching both `TypeError`/`ValueError`, defaulting to `1`, and clamping `page < 1`) correctly ensures a 1-based page value is always forwarded to `search_logs`, which now computes the offset as `(page - 1) * 10`.
The analogous handlers `get_results()` (line 367) and `get_last_host_logs()` (line 461) use a weaker guard pattern (`if not page: page = 1` then a bare `int(page)`) that would raise an unhandled `ValueError` on a non-numeric `page` parameter and has no negative-value clamp. Consider harmonising them:
♻️ Suggested alignment for sibling endpoints
```diff
 @app.route("/results/get_list", methods=["GET"])
 def get_results():
     api_key_is_valid(app, flask_request)
-    page = get_value(flask_request, "page")
-    if not page:
+    try:
+        page = int(get_value(flask_request, "page"))
+    except (TypeError, ValueError):
         page = 1
-    return jsonify(select_reports(int(page))), 200
+    if page < 1:
+        page = 1
+    return jsonify(select_reports(page)), 200

 @app.route("/logs/get_list", methods=["GET"])
 def get_last_host_logs():
     api_key_is_valid(app, flask_request)
-    page = get_value(flask_request, "page")
-    if not page:
+    try:
+        page = int(get_value(flask_request, "page"))
+    except (TypeError, ValueError):
         page = 1
-    return jsonify(last_host_logs(int(page))), 200
+    if page < 1:
+        page = 1
+    return jsonify(last_host_logs(page)), 200
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `nettacker/api/engine.py` around lines 544-554: the pagination parsing in search_logs was hardened; apply the same pattern to get_results and get_last_host_logs by replacing their current fragile page handling (where they do if not page: page = 1 and then int(page)) with the robust sequence: fetch page via get_value(flask_request, "page"), wrap int(...) in a try/except catching TypeError and ValueError to default page to 1, then clamp any page < 1 to 1 before passing it on (refer to the page parsing logic used around search_logs and the get_value call to locate the exact lines to change).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `nettacker/api/engine.py`:
- Around lines 544-554: the pagination parsing in search_logs was hardened; apply
the same pattern to get_results and get_last_host_logs by replacing their
current fragile page handling (where they do if not page: page = 1 and then
int(page)) with the robust sequence: fetch page via get_value(flask_request,
"page"), wrap int(...) in a try/except catching TypeError and ValueError to
default page to 1, then clamp any page < 1 to 1 before passing it on (refer to
the page parsing logic used around search_logs and the get_value call to locate
the exact lines to change).
LGTM! You might have to resolve the test_db.py case as Copilot suggested.
Proposed change
Problem:
Crawler results pagination in the web UI was broken: the page count was inflated with phantom pages, and consecutive pages showed duplicate rows.
Type of change
Solve:
`search_logs()` in db.py used offset `(page * 10) - 10` (1-based), while the API sends a 0-based page: the API's page 0 yields offset -10 (treated as 0 by the database) and page 1 yields offset 0, so UI pages 1 and 2 returned the same rows.

Checklist
- `make pre-commit`, it didn't generate any changes
- `make test`, all tests passed locally