Skip to content

Submitting#132

Merged
UcheWendy merged 2 commits intomainfrom
129-feature---implement-search-functionality-with-htmx
Oct 6, 2025
Merged

Submitting#132
UcheWendy merged 2 commits intomainfrom
129-feature---implement-search-functionality-with-htmx

Conversation

@Joishee05
Copy link
Collaborator

@Joishee05 Joishee05 commented Sep 29, 2025

Description

Fixes #129

1. What was Changed

I added a new search endpoint, /api/search, to the Express server. It accepts query parameters and looks through bonesets, bones, and sub-bones in the GitHub data repository. The search is case-insensitive and supports partial matches, so users don’t have to type exact names to get results. The endpoint returns the matches as formatted HTML list items that include the name, type (boneset, bone, or sub-bone), and data attributes needed for frontend integration. This makes it possible for the frontend to support real-time search, helping users quickly find specific anatomical structures without having to click through multiple dropdown menus.

2. Why it was Changed

The search endpoint was created to address a major usability gap in the application. Before this feature, users had no efficient way to find specific anatomical structures without manually navigating through multiple dropdown menus. The new search functionality solves this problem by providing instant, real-time results based on partial name matches. Users can now quickly locate any bone or anatomical structure by simply typing part of its name.

3. How it was Changed

The search endpoint was implemented by adding a new GET route, /api/search, to the existing Express server. It uses the same GitHub data fetching system already in place, relying on the fetchJSON() helper function to load boneset and bone data from the remote repository. The search logic follows a three-step approach: first checking for matches against boneset names, then looking through all bones within each boneset, and finally searching sub-bones by converting their underscore-separated IDs into readable names.

@Joishee05 Joishee05 linked an issue Sep 29, 2025 that may be closed by this pull request
return cachedBoneset;
if (cachedBoneset) return cachedBoneset;
const file = safeDataPath("final_bony_pelvis.json");
const raw = await fs.readFile(file, "utf8");

Check failure

Code scanning / CodeQL

Server-side request forgery Critical

The
URL
of this request depends on a
user-provided value
.

Copilot Autofix

AI 5 months ago

To fix the SSRF vulnerability, restrict what values boneId from the query can take before using it to construct the URL.

  • The best way is to only allow boneId values from an explicit allow-list of valid bone IDs, or, if that's not available, use pattern validation that strictly matches permissible formats (e.g., letters, numbers, underscores only, and no traversal characters).
  • Implement this at or immediately after reading boneId, prior to constructing the URL (line 177).
  • If you have a predetermined list of valid bones, use it. Otherwise, use a regular expression that matches (for example) /^[a-zA-Z0-9_]+$/.
  • If invalid, reject the request with an appropriate status (e.g., 400 Bad Request).
  • You may also want to escape boneId in the error handling and/or returned data (although that is less crucial if the above restriction is applied).

You'll need to:

  • Add a function that validates the format of boneId (e.g., using a regular expression).
  • Use this function right after reading boneId.
  • If invalid, respond with an error (and do not proceed to construct the URL or make the request).

All changes should be made within the boneset-api/server.js file, modifying the /api/description/ endpoint logic.


Suggested changeset 1
boneset-api/server.js

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/boneset-api/server.js b/boneset-api/server.js
--- a/boneset-api/server.js
+++ b/boneset-api/server.js
@@ -170,8 +170,9 @@
 
 app.get("/api/description/", async (req, res) => {
     const { boneId } = req.query;
-    if (!boneId) {
-        return res.send(" ");
+    // Validate user input: must be alphanumeric or underscores only
+    if (!boneId || !/^[a-zA-Z0-9_]+$/.test(boneId)) {
+        return res.status(400).send("<li>Invalid bone ID.</li>");
     }
     
     const GITHUB_DESC_URL = `https://raw.githubusercontent.com/oss-slu/DigitalBonesBox/data/DataPelvis/descriptions/${boneId}_description.json`;
EOF
@@ -170,8 +170,9 @@

app.get("/api/description/", async (req, res) => {
const { boneId } = req.query;
if (!boneId) {
return res.send(" ");
// Validate user input: must be alphanumeric or underscores only
if (!boneId || !/^[a-zA-Z0-9_]+$/.test(boneId)) {
return res.status(400).send("<li>Invalid bone ID.</li>");
}

const GITHUB_DESC_URL = `https://raw.githubusercontent.com/oss-slu/DigitalBonesBox/data/DataPelvis/descriptions/${boneId}_description.json`;
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file has two versions of /api/description: one reads from a local JSON file, and the other fetches from GitHub based on user input. Having both will cause bugs and is unsafe. The GitHub-based version lets user input control the URL, which is an SSRF risk, so CodeQL flagged it correctly. A bad merge also introduced duplicate code, including helpers, the rate limiter, parts of /combined-data, and even another app.listen, along with extra braces and mixed blocks that can break the app. There’s also a new /api/search route that downloads many files from GitHub on each search, which will be slow and unreliable unless you cache or load data at startup. To fix this, keep only the local JSON version of /api/description, delete the GitHub-fetching one, remove the duplicates and the extra app.listen, clean up /combined-data so there’s a single try/catch and one response, and if you keep /api/search, back it with a cached or in-memory index.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Security Fixes:
    Removed unsafe GitHub-based endpoint causing SSRF risk.
    Kept only the secure local JSON version with strict input validation.
  2. Code Cleanup:
    Deleted duplicate endpoints, helper functions, and rate limiters.
    Fixed redundant try/catch blocks and extra app.listen().
  3. Performance Boost:
    Added startup-time cache for faster searches.
    Reduced external API calls from many per search to just one at startup.
  4. Structural Improvements:
    Simplified /combined-data logic and cleaned malformed JSON.
    Standardized error handling across the app.

Copy link
Collaborator

@UcheWendy UcheWendy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start on a much-needed feature, and the initial HTMX-focused implementation is on the right track. However, the performance issue with the current search logic is critical and must be addressed before this can be merged.

It's important to confirm the functionality THROUGH TESTING. Please add a comment to this PR detailing your manual testing steps. Specifically, confirm that you've tested:

  1. Typing “ilium” shows “Ilium (bone)” and any sub-bones that include “ilium”.
  2. Typing “ramus” shows multiple sub-bones quickly.
  3. Typing fast does not lag or flicker.
  4. Clicking a “subbone” result sets all three dropdowns to the right values and updates the viewer.
  5. Empty query shows “Enter a search term”.
  6. Nonsense query shows “No results found”.
  7. Non-ASCII names render correctly.
  8. No console errors on the server during rapid typing.
  9. Server keeps steady memory and CPU after many searches.

This will help us to keep everything local, fast, and safe, and it plugs directly into HTMX with the least code.

return cachedBoneset;
if (cachedBoneset) return cachedBoneset;
const file = safeDataPath("final_bony_pelvis.json");
const raw = await fs.readFile(file, "utf8");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work so far, you added a dedicated GET /api/search endpoint,It returns an HTML fragment for HTMX to drop into the page and you You handle empty query and no results with simple messages.
*Key problems to fix

1.Performance
The endpoint fetches JSON from GitHub for every keystroke and for every bone. That is slow, flaky, and will hit rate limits. Load data once at server start and search in memory.

  1. Inconsistent data source
    Other parts of the server already read the merged local JSON. Search should use the same local data, not remote GitHub files.

  2. HTML escaping and safety
    You escape quotes in results, but not all HTML. Use the same escapeHtml helper you already have.

  3. Result shape for click handling
    Results render as plain

  4. with data-type and data-id, but the frontend needs enough info to update all three dropdowns. Include data-boneset, data-bone, and data-subbone where relevant.

  5. Rate limiting and result limits
    Add your existing bonesetLimiter to /api/search, and cap results (for example, top 20) so typing feels instant.

  6. Ranking
    Simple contains() search works, but users expect prefix matches to appear first. Do a light ranking: prefix matches, then substring.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve thoroughly tested all search features and confirmed everything works as expected. Typing “ilium” correctly shows Ilium (bone) and related sub-bones like iliac crest and iliac fossa. Typing “ramus” returns multiple sub-bones within 100ms, thanks to the cached search index. Fast typing runs smoothly with no lag due to the 300ms debounce. Selecting any sub-bone (e.g., iliac crest) properly updates all dropdowns and the viewer, while clearing the search bar. Empty or short queries show a helpful message, and nonsense inputs like “xyz123” return “No results found."

return cachedBoneset;
if (cachedBoneset) return cachedBoneset;
const file = safeDataPath("final_bony_pelvis.json");
const raw = await fs.readFile(file, "utf8");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file has two versions of /api/description: one reads from a local JSON file, and the other fetches from GitHub based on user input. Having both will cause bugs and is unsafe. The GitHub-based version lets user input control the URL, which is an SSRF risk, so CodeQL flagged it correctly. A bad merge also introduced duplicate code, including helpers, the rate limiter, parts of /combined-data, and even another app.listen, along with extra braces and mixed blocks that can break the app. There’s also a new /api/search route that downloads many files from GitHub on each search, which will be slow and unreliable unless you cache or load data at startup. To fix this, keep only the local JSON version of /api/description, delete the GitHub-fetching one, remove the duplicates and the extra app.listen, clean up /combined-data so there’s a single try/catch and one response, and if you keep /api/search, back it with a cached or in-memory index.

res.send(html);

} catch (error) {
res.send("<li>Description not available.</li>");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help Notes
Add a placeholder like “Search bones or sub-bones”.
Set a min length before querying, for example ignore queries shorter than 2 characters.
Add a keyboard affordance: arrow keys to navigate results and Enter to select. HTMX does not handle this by default, but you can add a small JS handler.
Style the result list so it looks clickable and scrolls if long.

Copy link
Collaborator

@UcheWendy UcheWendy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great job building out a complex and valuable feature, this PR is in excellent shape and is ready to be merged. Fantastic work!!

@UcheWendy UcheWendy merged commit b129079 into main Oct 6, 2025
5 checks passed
@UcheWendy UcheWendy deleted the 129-feature---implement-search-functionality-with-htmx branch December 8, 2025 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature - Implement Search Functionality with HTMX

2 participants