
issue127-skull-boneset#133

Closed
Taktar wants to merge 5 commits into main from issue127-skull-boneset

Conversation


@Taktar Taktar commented Sep 29, 2025

Description

Partially addresses #127 – Content Pipeline: Process and Integrate a New Boneset (Skull).
This PR sets up the Skull extraction pipeline and checks in the raw, per-slide outputs we’ll use to build the final skull.json. It adds the folder scaffolding, unzips the PPTX into XML, and runs our extraction scripts to produce images plus JSON annotations/descriptions.

Dependencies: Python 3 stdlib only (xml.etree, json, os). No new npm deps.


1. What was Changed

  • Created data_extraction/skull/{ppt,annotations,images}/.
  • Added the Skull deck locally and unzipped to data_extraction/skull/ppt/unzipped/….
  • Ran/parameterized:
    • data_extraction/extract_ppt_annotations.py → per-slide *_annotations.json + extracted slide images.
    • data_extraction/Extract_Bone_Descriptions.py → per-slide *_Descriptions.json.
  • Outputs committed:
    • 32 *_annotations.json files (slide 1 has no image/labels).
    • 33 *_Descriptions.json files (one per slide).
    • Extracted slide images under data_extraction/skull/images/slide*/….
  • Kept the PPTX and unzipped XML out of git history (local only).

2. Why it was Changed

Kick off the content pipeline for the Skull boneset and produce clean, machine-readable labels and descriptions. This unblocks creation of a single skull.json and later API/UI integration alongside Bony Pelvis.


3. How it was Changed

  • Staging & unzip
    • Copied Bone Box (Skull).pptx → data_extraction/skull/ppt/skull.pptx.
    • Unzipped to data_extraction/skull/ppt/unzipped/ to parse ppt/slides/*.xml, relationships, and media.
  • Image/annotation extraction (extract_ppt_annotations.py)
    • Configured paths:
      • slides: …/ppt/unzipped/ppt/slides
      • rels: …/ppt/unzipped/ppt/slides/_rels
      • media: …/ppt/unzipped/ppt/media
      • out: data_extraction/skull/images & data_extraction/skull/annotations
    • For each slide, resolved r:embed → media, wrote images to images/slideN/…, saved slideN_annotations.json.
    • Slide 1 has no image/labels — script logs No images found… Skipping.
  • Description extraction (Extract_Bone_Descriptions.py)
    • Parses each slide’s shapes, locates the right-panel text box by x/y/width/height, collects bullet text.
    • First token becomes bone name; remaining tokens become the description list.
    • Emits { id, name, description: [...] } to slideN_Descriptions.json.
  • Verification
    • Counts:
      • *_annotations.json: 32
      • *_Descriptions.json: 33
    • Spot-checked files (e.g., slide3_annotations.json) for sensible label coordinates/text.
  • Notes / challenges
    • Slide 1 is a title slide; handled gracefully.
    • Position heuristics tuned to this deck; can be adjusted if later slides differ.
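For orientation, a per-slide description file has roughly the following shape. The field names follow the { id, name, description: [...] } structure the script emits per the notes above; the values here are invented for illustration:

```json
{
  "id": "slide3",
  "name": "Frontal Bone",
  "description": [
    "Forms the forehead",
    "Contributes to the roof of the orbits"
  ]
}
```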

4. Screenshots (if applicable)

N/A (no UI/API changes in this PR). Folder tree and terminal logs available upon request.


Next steps (tracked in #127)

  • Merge per-slide outputs into a single skull.json with full hierarchy (bones, sub-bones, descriptions, annotations, image paths).
  • Add a GitHub-hosted data directory for Skull; update backend server.js to load from the new raw URL.
  • Update UI dropdown to include “Skull” alongside “Bony Pelvis.”

@Taktar Taktar requested a review from UcheWendy as a code owner September 29, 2025 05:07

Taktar commented Sep 29, 2025

(Screenshots attached: 2025-09-29 at 12:48:11 AM, 12:48:22 AM, and 12:49:02 AM.)

@Taktar Taktar self-assigned this Sep 29, 2025

Taktar commented Sep 29, 2025

(Screenshot attached: 2025-09-29 at 1:36:40 AM.) Got the descriptions to pop up for skull.


Taktar commented Oct 1, 2025

(Screenshot attached: 2025-09-30 at 11:02:00 PM.)

@UcheWendy UcheWendy left a comment

The issue asks for a new GitHub directory for Skull (e.g., DataSkull) with skull.json and images accessible via raw GitHub URL. In this PR, Skull data lives under boneset-api/data/final_skull.json. I don’t see a DataSkull/ folder or a raw-URL fetch path added. This misses the “store in GitHub for raw access” requirement and makes prod dependent on bundling data in the server image.
Let's maintain consistent image paths. Because Skull is served from a local JSON file, the image paths inside final_skull.json must be valid from the app’s point of view (either local static paths you serve, or raw GitHub URLs). The PR doesn’t show that wiring (a static Express folder or new GitHub paths). Verify these resolve in the UI.

Tips to note
To fix this, clean /combined-data by replacing the handler with a single flow: load the pelvis from the existing GitHub JSONs, push its boneset/bones/subbones, load the skull with await loadLocalBoneset("skull"), push its boneset/bones/subbones, and return { bonesets, bones, subbones } once; this removes the duplicate bonesets declaration and the nested “fetch pelvis inside a pelvis loop” bug.
Next, unify the cache layer by keeping bonesetCache = new Map() with loadLocalBoneset(id) and deleting the old cachedBoneset and loadBoneset() so all local reads go through the new function. Fix the /api/description duplication by removing the second for (const line of lines) so descriptions render once, and keep validation on boneId and ALLOWED_BONESETS.has(bonesetId).

To meet the “DataSkull” acceptance criteria, create DataSkull/ in the repo (parallel to DataPelvis/), commit boneset/skull.json and all images there, and add Skull GitHub base URLs like:

const SKULL_REPO = "https://raw.githubusercontent.com/oss-slu/DigitalBonesBox/data/DataSkull/";
const SKULL_JSON_URL = `${SKULL_REPO}boneset/skull.json`;
const SKULL_BONES_DIR = `${SKULL_REPO}bones/`;

For consistency, have /combined-data pull both pelvis and skull from raw GitHub URLs. Finally, verify the dropdown by confirming /combined-data returns `bonesets` containing both `{ id: "bony_pelvis" }` and `{ id: "skull" }`, so the UI shows both options.


// Unchanged: used by the dropdowns in the current UI
// Unchanged pelvis aggregation + add Skull from local final_skull.json
app.get("/combined-data", async (_req, res) => {

The route re-declares bonesets and mixes old pelvis logic with the new flow. You initialize bonesets twice and start a loop on bonesetData.bones while also fetching pelvis again inside that loop. This will either throw or produce wrong results. Clean this up to one clear flow: build pelvis, then skull, then return once.
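The one-flow handler this comment asks for could be sketched as follows. `buildCombinedData` is a hypothetical helper introduced here so the merge step is easy to test, and `loadPelvisFromGitHub` / `loadLocalBoneset` stand in for the repo's real loaders:

```javascript
// Merge any number of boneset payloads into the single response shape.
function buildCombinedData(...sets) {
  const bonesets = [];
  const bones = [];
  const subbones = [];
  for (const set of sets) {
    bonesets.push(set.boneset);             // one entry per boneset
    bones.push(...(set.bones || []));       // flatten bones
    subbones.push(...(set.subbones || [])); // flatten sub-bones
  }
  return { bonesets, bones, subbones };
}

// Sketch of the cleaned route (helper names are assumptions):
// app.get("/combined-data", async (_req, res) => {
//   try {
//     const pelvis = await loadPelvisFromGitHub();   // existing GitHub fetch
//     const skull = await loadLocalBoneset("skull"); // new local loader
//     res.json(buildCombinedData(pelvis, skull));    // return exactly once
//   } catch (err) {
//     res.status(500).json({ error: "Failed to build combined data" });
//   }
// });
```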


try {
const set = await loadBoneset();
const set = await loadLocalBoneset(bonesetId); // <- IMPORTANT
You introduce a bonesetCache and a new loadLocalBoneset(id) while also keeping the old single-file cache fields (cachedBoneset, old loadBoneset() pattern). Pick one approach and remove the leftovers so readers aren’t confused.

// Safe path + allowlist + rate limit to fetch the full local JSON
app.get("/api/boneset/:bonesetId", bonesetLimiter, async (req, res) => {
const { bonesetId } = req.params;

@UcheWendy UcheWendy commented Oct 1, 2025

/api/description renders lines twice: the code appends the description lines in a loop, then immediately appends them again in a second loop, so you’ll get duplicated <li> items.
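The fix is simply to append each line in a single pass. A minimal sketch, with `renderDescription` as a hypothetical helper mirroring the handler's list-building step:

```javascript
// Render description lines as a <ul>, appending each line exactly once.
function renderDescription(lines) {
  const items = [];
  for (const line of lines) {
    items.push(`<li>${line}</li>`); // single loop: no duplicate <li> items
  }
  return `<ul>${items.join("")}</ul>`;
}
```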


Taktar commented Oct 1, 2025

(Screenshots attached: 2025-10-01 at 12:15:04 AM and 12:17:43 AM.)


Taktar commented Oct 3, 2025

(Screenshot attached: 2025-10-02 at 11:31:24 PM.) Images still not loading.


Taktar commented Oct 3, 2025

Update for @UcheWendy:

Done (per acceptance criteria & review):

  • Created DataSkull/ in the repo with boneset/skull.json and all images checked in.
  • Updated skull.json image_url fields to Raw GitHub URLs (currently pointing to the feature branch: https://raw.githubusercontent.com/oss-slu/DigitalBonesBox/issue127-skull-boneset/DataSkull/...).
  • Server updated:
    /combined-data now merges Pelvis (DataPelvis) + Skull (DataSkull) from raw GitHub.
  • Branch-aware skull source via SKULL_BRANCH/SKULL_REPO env vars (so this PR works before the data lands on the data branch).
  • /api/description serves skull from GitHub and pelvis from local file (as discussed).
  • UI: dropdown now shows Bony Pelvis and Skull options (the dropdown disappeared at one point during the process but is back).
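The branch-aware source above could be wired roughly like this. The env var names follow this comment; the default branch and fallback URL construction are assumptions:

```javascript
// Resolve the Skull raw-GitHub base URL, letting env vars override the
// defaults so a feature branch can be tested before data lands on `data`.
function skullBaseUrl(env = process.env) {
  const branch = env.SKULL_BRANCH || "data"; // assumed default branch
  return (
    env.SKULL_REPO || // full override wins if set
    `https://raw.githubusercontent.com/oss-slu/DigitalBonesBox/${branch}/DataSkull/`
  );
}

const SKULL_JSON_URL = `${skullBaseUrl()}boneset/skull.json`;
```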

Current blocker I’m fixing: images not rendering

  • Skull images are reachable on the web (HTTP 200 from Raw GitHub), but the browser isn’t showing them because our front-end helper prefixes even absolute URLs with http://127.0.0.1:8000.
  • Pelvis images still point to local /images/... paths which we’re not serving yet (or they’re not present), so those show 404s.
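The prefixing bug described above has a small fix: only prepend the API origin to relative paths, and pass absolute http(s) URLs (e.g. raw GitHub links) through untouched. A sketch, where `resolveImageUrl` and the hard-coded `API_BASE` are stand-ins for the real front-end helper:

```javascript
const API_BASE = "http://127.0.0.1:8000"; // assumed current hard-coded origin

// Return src unchanged when it is already absolute; otherwise prefix it.
function resolveImageUrl(src) {
  if (/^https?:\/\//i.test(src)) return src;        // absolute: pass through
  return `${API_BASE}/${src.replace(/^\/+/, "")}`;  // relative: prefix origin
}
```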

@Taktar Taktar requested a review from UcheWendy October 3, 2025 04:48
@UcheWendy UcheWendy closed this Oct 7, 2025
@UcheWendy

Scope too large

@UcheWendy UcheWendy deleted the issue127-skull-boneset branch December 8, 2025 22:04