Skip to content

Add recent activity listings to generated index page#63

Merged
simonw merged 1 commit intomainfrom
codex/add-recently-added-and-updated-lists-to-index.html
Oct 18, 2025
Merged

Add recent activity listings to generated index page#63
simonw merged 1 commit intomainfrom
codex/add-recently-added-and-updated-lists-to-index.html

Conversation

@simonw
Copy link
Owner

@simonw simonw commented Oct 18, 2025

Summary

  • add a dedicated build_index.py script that enriches README output with recent additions and updates
  • run the new script from build.sh and the Pages workflow so the generated index includes recent activity columns

Refs:

Testing

  • python -m compileall build_index.py
  • ./build.sh (fails in this environment: proxy prevents installing markdown)

https://chatgpt.com/s/cd_68f319603cbc8191805768d3b9ec837b

@claude
Copy link
Contributor

claude bot commented Oct 18, 2025

Claude finished @simonw's task —— View job


PR Review: Add recent activity listings to generated index page

Todo List:

  • Read repository guidelines (CLAUDE.md) - No CLAUDE.md found
  • Analyze new build_index.py script
  • Review changes to build.sh
  • Review changes to .github/workflows/pages.yml
  • Check for code quality issues
  • Identify potential bugs
  • Assess performance considerations
  • Review security concerns
  • Evaluate test coverage needs
  • Provide comprehensive feedback

Code Review Summary

This PR introduces a well-structured solution for adding recent activity sections to the generated index page. The implementation is generally solid with good separation of concerns and defensive programming practices.

✅ Strengths

Code Quality

  • Clean, readable code with good function decomposition
  • Proper type hints throughout (build_index.py:5-9)
  • Defensive programming with error handling (build_index.py:11-17, build_index.py:33-40)
  • Good use of constants for file paths (build_index.py:19-21)

Implementation Design

  • Smart injection strategy using a marker in HTML (build_index.py:141-147)
  • Elegant date formatting with ordinal suffixes (build_index.py:24-30)
  • Proper exclusion logic to avoid duplicates between recent lists (build_index.py:134-137)

⚠️ Issues & Recommendations

1. Hardcoded HTML Injection Point

injection_marker = '<h2 id="image-and-media">Image and media</h2>'  # build_index.py:141

This creates a brittle dependency on the README structure. If this heading changes or is removed, the feature silently fails.

Recommendation: Add logging when injection fails or use a more flexible approach:

if injection_marker in body_html:
    body_html = body_html.replace(injection_marker, recent_section_html + injection_marker, 1)
else:
    print("Warning: Injection marker not found, prepending to content")
    body_html = recent_section_html + body_html

2. Missing Error Handling for JSON Loading
The _load_tools() function (build_index.py:47-51) doesn't handle JSON parsing errors, which could cause the entire build to fail.

Recommendation:

def _load_tools() -> List[dict]:
    if not TOOLS_JSON_PATH.exists():
        return []
    try:
        with TOOLS_JSON_PATH.open("r", encoding="utf-8") as fp:
            return json.load(fp)
    except (json.JSONDecodeError, OSError) as e:
        print(f"Warning: Failed to load tools.json: {e}")
        return []

3. No Input Validation
The code assumes tools.json contains well-formed data but doesn't validate the structure (build_index.py:62-79).

4. Potential XSS Risk
URLs and slugs are inserted directly into HTML without sanitization (build_index.py:88-89, build_index.py:101). While this data comes from internal sources, it's a security best practice to sanitize.

Recommendation: Use html.escape() for user-facing content:

items.append(
    f'<li><a href="{html.escape(url)}">{html.escape(slug)}</a>{date_html}</li>'
)

5. Build Process Concerns

  • The workflow installs markdown twice (build.sh:8 and .github/workflows/pages.yml:37)
  • No validation that build_index.py succeeded before continuing

🔍 Performance Considerations

  • The script processes all tools for each run - for large datasets, consider caching
  • Multiple file I/O operations could be optimized with batch processing
  • The markdown conversion happens on every build (appropriate for this use case)

🧪 Test Coverage

Missing Test Coverage:

  • No tests for the new build_index.py script
  • Date parsing edge cases (_parse_iso_datetime)
  • HTML injection logic
  • Error handling paths

Recommendation: Add basic unit tests:

def test_parse_iso_datetime():
    assert _parse_iso_datetime("2023-01-01T00:00:00Z") is not None
    assert _parse_iso_datetime("invalid") is None
    assert _parse_iso_datetime(None) is None

📁 File-Specific Comments

build.sh

  • Good simplification removing complex logic
  • Clear, focused build steps

.github/workflows/pages.yml

  • Proper addition of build_index.py step
  • Maintains existing workflow structure

🎯 Priority Recommendations

  1. High: Add JSON parsing error handling
  2. Medium: Add logging for injection marker failures
  3. Medium: Add basic unit tests
  4. Low: Add HTML sanitization for external-facing content

📋 Security Assessment

No significant security concerns detected. The code handles local files and generates static HTML. The XSS risk mentioned above is low-priority given the controlled input sources.

Overall, this is a well-implemented feature that adds valuable functionality with minimal risk. The suggested improvements would make it more robust and maintainable.


@cloudflare-workers-and-pages
Copy link

Deploying tools with  Cloudflare Pages  Cloudflare Pages

Latest commit: 7678780
Status: ✅  Deploy successful!
Preview URL: https://99e8f1f8.tools-b1q.pages.dev
Branch Preview URL: https://codex-add-recently-added-and.tools-b1q.pages.dev

View logs

@simonw
Copy link
Owner Author

simonw commented Oct 18, 2025

From the Cloudflare preview: https://99e8f1f8.tools-b1q.pages.dev/

CleanShot 2025-10-17 at 21 35 33@2x

It's in the wrong place, and the dates are all today.

@simonw
Copy link
Owner Author

simonw commented Oct 18, 2025

@copilot take over and fix this.

@simonw
Copy link
Owner Author

simonw commented Oct 18, 2025

Looks like asking Copilot to take over this one didn't work. I'm going to merge it and start a new session with Copilot.

@simonw simonw merged commit 18063c9 into main Oct 18, 2025
4 checks passed
@simonw simonw deleted the codex/add-recently-added-and-updated-lists-to-index.html branch October 18, 2025 04:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants