Jekyll static site showcasing GitHub repos with semantic search. Live: https://tom-doerr.github.io/repo_posts/
- tools/generate_related.py - Generates embeddings and related.json
- tools/generate_search_index.py - Builds client-side search index
- tools/export_embeddings_bin.py - Exports embeddings for browser semantic search
- tools/export_3d_coords.py - UMAP reduction to 3D for visualization
- docs/assets/js/sem.js - Browser-based semantic search (WebGPU/ONNX)
- docs/assets/js/map3d.js - Three.js 3D semantic map visualization
- generate_related.py creates docs/_data/embeddings.npz (float16)
- export_embeddings_bin.py converts to docs/assets/embeddings.f32 + .meta.json
- sem.js loads binary embeddings in browser, uses Xenova/transformers for query embedding
- Cosine similarity computed client-side via dot product (pre-normalized vectors)
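The export and lookup steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code in tools/export_embeddings_bin.py or sem.js: the function names and the `count`/`dim` metadata fields are hypothetical, and the real .meta.json layout may differ. It shows why pre-normalizing vectors lets the client rank results with a plain dot product.

```python
import json
import math
import struct


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def pack_embeddings(vectors):
    """Return (little-endian float32 bytes, metadata JSON) for row-major vectors."""
    dim = len(vectors[0])
    flat = [x for vec in vectors for x in normalize(vec)]
    blob = struct.pack(f"<{len(flat)}f", *flat)
    # Hypothetical metadata fields; the real .meta.json may use other names.
    meta = json.dumps({"count": len(vectors), "dim": dim})
    return blob, meta


def unpack_embeddings(blob, meta):
    """Rebuild the list of rows from the binary blob and its metadata."""
    info = json.loads(meta)
    count, dim = info["count"], info["dim"]
    flat = struct.unpack(f"<{count * dim}f", blob)
    return [flat[i * dim:(i + 1) * dim] for i in range(count)]


def top_k(query, rows, k=3):
    """Rank rows by dot product with the normalized query, best first."""
    q = normalize(query)
    scores = [(sum(a * b for a, b in zip(q, row)), i) for i, row in enumerate(rows)]
    return sorted(scores, reverse=True)[:k]
```

Because every stored row has unit length, the search loop needs no per-row norm computation, which keeps the client-side work to one multiply-add pass per post.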
- generate-related-min.yml - push to docs/_posts or tools → embeddings, search index
- pages-min.yml - push to docs → build and deploy Jekyll
- rss-smoke.yml - after pages deploy → validate site health
- image-compress.yml - PR to docs/assets → optimize images
- docs/_includes/meta-description.html - Extracts post description for meta tag
- docs/_includes/schema-software.html - SoftwareSourceCode JSON-LD schema
- Both included in docs/_layouts/default.html
- Search clear button (X), result count, empty state ("NO DATA" in EVA style)
- Skip link for accessibility, mobile responsive at <640px
- ARIA: role="listbox" on results, role="option" on items
Run tests: python -m pytest tests/ -q
Export embeddings for the browser locally: python tools/export_embeddings_bin.py
Generate 3D coords: python tools/export_3d_coords.py (requires umap-learn)
Embedding model: sentence-transformers/all-MiniLM-L6-v2 (384 dims)
Use Jekyll for proper CSS/images: cd docs && jekyll serve --host 0.0.0.0 --port 4000
Access at http://localhost:4000/repo_posts/ (first build ~17min for 13K posts)
Gotcha: Do NOT create docs/Gemfile or commit docs/vendor/ - breaks GitHub Pages
Branding: "MAGI//ARCHIVE" title, "// SURVEILLANCE FEED" tagline, "NERV // OPEN SOURCE DIVISION" footer
Fonts: Orbitron (headings), Roboto Mono (body/data)
Colors: --eva-purple-deep:#1a0a2e, --eva-orange:#ff6611, --eva-green:#00ff41, --eva-red:#ff0040
Effects: Scan lines, hexagon grid bg, glitch text on hover, pulsing status dot
Cards: "REC ●" indicator (red), date YYYY.MM.DD, entrance animation with rotation
Accessibility: All animations respect prefers-reduced-motion
Interactive visualization at /map.html - posts float in 3D space arranged by semantic similarity.
- Generation: tools/export_3d_coords.py uses UMAP to reduce 384D embeddings → 3D
- Output: docs/assets/embeddings.3d.json (~855KB, 13K posts)
- Rendering: Three.js with InstancedMesh, OrbitControls
- Interaction: Hover for tooltip, click opens info box with "VIEW →" link
- Highlight: ?hl=<url> param highlights point green and centers camera
- URLs: All tools generate URLs with /repo_posts/ prefix for GitHub Pages
- Image lightbox: Click post image to view fullscreen (above scan lines overlay)
- 3D Map link: "View in 3D Map" opens map with that repo highlighted
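
As an illustration of how the ?hl=<url> parameter and the /repo_posts/ prefix fit together, here is a small sketch of building a "View in 3D Map" link. The helper name `map_highlight_url`, the example post path, and the choice to percent-encode the URL are assumptions for illustration; the site's actual templates may build the link differently.

```python
from urllib.parse import quote

BASE_PATH = "/repo_posts"  # GitHub Pages prefix used by all generated URLs


def map_highlight_url(post_url):
    """Build a 3D-map link that highlights the given post (hypothetical helper)."""
    # Make sure the post URL carries the GitHub Pages prefix.
    if not post_url.startswith(BASE_PATH + "/"):
        post_url = BASE_PATH + "/" + post_url.lstrip("/")
    # Assumption: the value is percent-encoded before being placed in ?hl=.
    return f"{BASE_PATH}/map.html?hl={quote(post_url, safe='')}"


# Example with a made-up post path:
print(map_highlight_url("/2024/01/01/example.html"))
```

Normalizing the prefix in one place avoids links that work under jekyll serve but miss the /repo_posts/ base path once deployed.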