Conversation
boneset-api/server.js
Dismissed
| return cachedBoneset;
| if (cachedBoneset) return cachedBoneset;
| const file = safeDataPath("final_bony_pelvis.json");
| const raw = await fs.readFile(file, "utf8");
Check failure (Code scanning / CodeQL): Server-side request forgery (Critical)
Copilot Autofix
AI 5 months ago
To fix the SSRF vulnerability, restrict what values boneId from the query can take before using it to construct the URL.
- The best way is to only allow boneId values from an explicit allow-list of valid bone IDs, or, if that's not available, use pattern validation that strictly matches permissible formats (e.g., letters, numbers, underscores only, and no traversal characters).
- Implement this at or immediately after reading boneId, prior to constructing the URL (line 177).
- If you have a predetermined list of valid bones, use it. Otherwise, use a regular expression that matches (for example) /^[a-zA-Z0-9_]+$/.
- If invalid, reject the request with an appropriate status (e.g., 400 Bad Request).
- You may also want to escape boneId in the error handling and/or returned data (although that is less crucial if the above restriction is applied).

You'll need to:
- Add a function that validates the format of boneId (e.g., using a regular expression).
- Use this function right after reading boneId.
- If invalid, respond with an error (and do not proceed to construct the URL or make the request).

All changes should be made within the boneset-api/server.js file, modifying the /api/description/ endpoint logic.
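A minimal, framework-agnostic sketch of that validation step, assuming hypothetical helper names (`isValidBoneId`, `describeBone`) and invented allow-list values; only the regex and the 400 response come from the autofix text:

```javascript
// Hypothetical sketch of the validation step described above. The regex and
// 400 response mirror the autofix suggestion; the allow-list values are
// invented for illustration, not the repo's real bone IDs.
const VALID_BONE_IDS = new Set(["ilium", "ischium", "pubis", "sacrum"]);

function isValidBoneId(boneId) {
  // Strict format check: letters, digits, underscores only -- nothing that
  // could introduce path traversal or redirect the outbound URL.
  return typeof boneId === "string" && /^[a-zA-Z0-9_]+$/.test(boneId);
}

// Route logic expressed as a pure function (status + body), so it can be
// dropped into the Express handler right after reading req.query.boneId:
function describeBone(boneId) {
  if (!boneId || !isValidBoneId(boneId)) {
    return { status: 400, body: "<li>Invalid bone ID.</li>" };
  }
  if (!VALID_BONE_IDS.has(boneId)) {
    return { status: 404, body: "<li>Unknown bone.</li>" };
  }
  return { status: 200, body: `<li>${boneId}</li>` };
}
```

Keeping the check as a pure function makes it easy to unit-test without spinning up the server.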
| @@ -170,8 +170,9 @@
| app.get("/api/description/", async (req, res) => {
|   const { boneId } = req.query;
| - if (!boneId) {
| -   return res.send(" ");
| + // Validate user input: must be alphanumeric or underscores only
| + if (!boneId || !/^[a-zA-Z0-9_]+$/.test(boneId)) {
| +   return res.status(400).send("<li>Invalid bone ID.</li>");
|   }
| const GITHUB_DESC_URL = `https://raw.githubusercontent.com/oss-slu/DigitalBonesBox/data/DataPelvis/descriptions/${boneId}_description.json`;
This file has two versions of /api/description: one reads from a local JSON file, and the other fetches from GitHub based on user input. Having both will cause bugs and is unsafe. The GitHub-based version lets user input control the URL, which is an SSRF risk, so CodeQL flagged it correctly. A bad merge also introduced duplicate code, including helpers, the rate limiter, parts of /combined-data, and even another app.listen, along with extra braces and mixed blocks that can break the app. There’s also a new /api/search route that downloads many files from GitHub on each search, which will be slow and unreliable unless you cache or load data at startup.

To fix this:
- Keep only the local JSON version of /api/description and delete the GitHub-fetching one.
- Remove the duplicates and the extra app.listen.
- Clean up /combined-data so there’s a single try/catch and one response.
- If you keep /api/search, back it with a cached or in-memory index.
- Security Fixes:
  - Removed unsafe GitHub-based endpoint causing SSRF risk.
  - Kept only the secure local JSON version with strict input validation.
- Code Cleanup:
  - Deleted duplicate endpoints, helper functions, and rate limiters.
  - Fixed redundant try/catch blocks and extra app.listen().
- Performance Boost:
  - Added startup-time cache for faster searches.
  - Reduced external API calls from many per search to just one at startup.
- Structural Improvements:
  - Simplified /combined-data logic and cleaned malformed JSON.
  - Standardized error handling across the app.
UcheWendy
left a comment
This is a great start on a much-needed feature, and the initial HTMX-focused implementation is on the right track. However, the performance issue with the current search logic is critical and must be addressed before this can be merged.
It's important to confirm the functionality THROUGH TESTING. Please add a comment to this PR detailing your manual testing steps. Specifically, confirm that you've tested:
- Typing “ilium” shows “Ilium (bone)” and any sub-bones that include “ilium”.
- Typing “ramus” shows multiple sub-bones quickly.
- Typing fast does not lag or flicker.
- Clicking a “subbone” result sets all three dropdowns to the right values and updates the viewer.
- Empty query shows “Enter a search term”.
- Nonsense query shows “No results found”.
- Non-ASCII names render correctly.
- No console errors on the server during rapid typing.
- Server keeps steady memory and CPU after many searches.
This will help us keep everything local, fast, and safe, and it plugs directly into HTMX with minimal code.
boneset-api/server.js
Outdated
| return cachedBoneset;
| if (cachedBoneset) return cachedBoneset;
| const file = safeDataPath("final_bony_pelvis.json");
| const raw = await fs.readFile(file, "utf8");
Good work so far: you added a dedicated GET /api/search endpoint, it returns an HTML fragment for HTMX to drop into the page, and you handle empty queries and no results with simple messages.
Key problems to fix:
1. Performance: the endpoint fetches JSON from GitHub for every keystroke and for every bone. That is slow, flaky, and will hit rate limits. Load data once at server start and search in memory.
2. Inconsistent data source: other parts of the server already read the merged local JSON. Search should use the same local data, not remote GitHub files.
3. HTML escaping and safety: you escape quotes in results, but not all HTML. Use the same escapeHtml helper you already have.
4. Result shape for click handling: results render as plain list items with data-type and data-id, but the frontend needs enough info to update all three dropdowns. Include data-boneset, data-bone, and data-subbone where relevant.
5. Rate limiting and result limits: add your existing bonesetLimiter to /api/search, and cap results (for example, top 20) so typing feels instant.
6. Ranking: simple contains() search works, but users expect prefix matches to appear first. Do a light ranking: prefix matches, then substring.
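A sketch combining several of these points: one in-memory index searched per request, full HTML escaping, a result cap, and prefix-before-substring ranking. The index entry shape and data-* attribute names are assumptions about the data, not the repo's confirmed schema:

```javascript
// Escape all HTML-significant characters, not just quotes.
function escapeHtml(s) {
  return String(s).replace(/[&<>"']/g, (c) =>
    ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" }[c]));
}

// Search a prebuilt in-memory index: prefix matches first, then substring
// matches, capped at `limit` so typing stays responsive.
function searchIndex(index, query, limit = 20) {
  const q = query.trim().toLowerCase();
  if (!q) return [];
  const prefix = [];
  const substring = [];
  for (const entry of index) {
    const name = entry.name.toLowerCase();
    if (name.startsWith(q)) prefix.push(entry);
    else if (name.includes(q)) substring.push(entry);
  }
  return prefix.concat(substring).slice(0, limit);
}

// Render an HTMX-ready fragment; every data-* value and the visible name
// go through escapeHtml.
function renderResults(results) {
  if (results.length === 0) return "<li>No results found</li>";
  return results
    .map((r) =>
      `<li data-type="${escapeHtml(r.type)}" data-boneset="${escapeHtml(r.boneset || "")}" ` +
      `data-bone="${escapeHtml(r.bone || "")}" data-subbone="${escapeHtml(r.subbone || "")}">` +
      `${escapeHtml(r.name)}</li>`)
    .join("");
}
```

The route handler then reduces to: build the index once at startup, call `searchIndex`, and send `renderResults` of the hits.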
I’ve thoroughly tested all search features and confirmed everything works as expected. Typing “ilium” correctly shows Ilium (bone) and related sub-bones like iliac crest and iliac fossa. Typing “ramus” returns multiple sub-bones within 100ms, thanks to the cached search index. Fast typing runs smoothly with no lag due to the 300ms debounce. Selecting any sub-bone (e.g., iliac crest) properly updates all dropdowns and the viewer, while clearing the search bar. Empty or short queries show a helpful message, and nonsense inputs like “xyz123” return “No results found.”
boneset-api/server.js
| res.send(html);
| } catch (error) {
|   res.send("<li>Description not available.</li>");
Help Notes
- Add a placeholder like “Search bones or sub-bones”.
- Set a minimum length before querying, for example ignore queries shorter than 2 characters.
- Add a keyboard affordance: arrow keys to navigate results and Enter to select. HTMX does not handle this by default, but you can add a small JS handler.
- Style the result list so it looks clickable and scrolls if long.
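The keyboard-affordance note can be sketched as a small handler. The highlight-stepping logic is split into a pure function; the `#search-results` selector and `active` class are illustrative assumptions, not the actual markup:

```javascript
// Pure stepping logic: ArrowDown/ArrowUp cycle through `count` items,
// any other key leaves the highlight where it is; -1 means "nothing".
function stepHighlight(current, key, count) {
  if (count === 0) return -1;
  if (key === "ArrowDown") return (current + 1) % count;
  if (key === "ArrowUp") return (current - 1 + count) % count;
  return current;
}

// Browser-only wiring (guarded so the module also loads under Node).
// Element id and class name are assumptions about the page.
if (typeof document !== "undefined") {
  let highlighted = -1;
  document.addEventListener("keydown", (e) => {
    const items = document.querySelectorAll("#search-results li");
    if (e.key === "Enter" && items[highlighted]) {
      items[highlighted].click(); // reuse the existing click-to-select path
      return;
    }
    highlighted = stepHighlight(highlighted, e.key, items.length);
    items.forEach((li, i) => li.classList.toggle("active", i === highlighted));
  });
}
```

Routing Enter through the same click handler keeps keyboard and mouse selection on one code path.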
UcheWendy
left a comment
great job building out a complex and valuable feature, this PR is in excellent shape and is ready to be merged. Fantastic work!!
Description
Fixes #129
1. What was Changed
I added a new search endpoint, /api/search, to the Express server. It accepts query parameters and looks through bonesets, bones, and sub-bones in the GitHub data repository. The search is case-insensitive and supports partial matches, so users don’t have to type exact names to get results. The endpoint returns the matches as formatted HTML list items that include the name, type (boneset, bone, or sub-bone), and data attributes needed for frontend integration. This makes it possible for the frontend to support real-time search, helping users quickly find specific anatomical structures without having to click through multiple dropdown menus.
2. Why it was Changed
The search endpoint was created to address a major usability gap in the application. Before this feature, users had no efficient way to find specific anatomical structures without manually navigating through multiple dropdown menus. The new search functionality solves this problem by providing instant, real-time results based on partial name matches. Users can now quickly locate any bone or anatomical structure by simply typing part of its name.
3. How it was Changed
The search endpoint was implemented by adding a new GET route, /api/search, to the existing Express server. It uses the same GitHub data fetching system already in place, relying on the fetchJSON() helper function to load boneset and bone data from the remote repository. The search logic follows a three-step approach: first checking for matches against boneset names, then looking through all bones within each boneset, and finally searching sub-bones by converting their underscore-separated IDs into readable names.
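The third step described above, converting underscore-separated sub-bone IDs into readable names, could look like the sketch below; the exact capitalization rule (sentence case) is an assumption, since the description doesn't specify it:

```javascript
// Turn an underscore-separated ID like "iliac_crest" into a display name.
// Sentence-case capitalization is an assumed convention for illustration.
function idToName(id) {
  return id
    .split("_")
    .filter(Boolean) // ignore accidental double underscores
    .map((w, i) => (i === 0 ? w.charAt(0).toUpperCase() + w.slice(1) : w))
    .join(" ");
}
```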