Skip to content

fix: address review findings in qyl-continuation stop-judge.py (#159)#159

Merged
ANcpLua merged 1 commit intomainfrom
fix/qyl-continuation-review
Mar 6, 2026
Merged

fix: address review findings in qyl-continuation stop-judge.py (#159)#159
ANcpLua merged 1 commit intomainfrom
fix/qyl-continuation-review

Conversation

@ANcpLua
Copy link
Owner

@ANcpLua ANcpLua commented Mar 6, 2026

Fixes all findings from PR #158 reviews (Gemini, Copilot, Codex):

Security (Gemini — High)

  • Throttle file moved from /tmp to ~/.claude/qyl-continuation/ — eliminates symlink attack vector

Bugs (Copilot + Gemini)

  • Removed stop_hook_active gate — field doesn't exist in Stop hook payload, throttle guard was never firing
  • Read msg.message.content in addition to msg.content — transcript entries nest content, so H1-H4 heuristics were inoperative
  • Pure Python transcript reading — replaces tail subprocess (platform-independent)
  • Byte-aware truncation for MAX_CONTEXT_BYTES — no more mid-codepoint slicing
  • Handle structured_output: null — prevents AttributeError in judge response parsing

Logic (Codex)

  • H3/H4 check NEXT_STEP_RX before stopping — don't stop when assistant explicitly states pending next action

Summary by CodeRabbit

  • Bug Fixes

    • Improved transcript loading and context handling reliability
    • Enhanced JSON parsing robustness with improved fallback handling
    • Better evaluation decision logic for edge cases
    • Fixed rate-limiting and throttle mechanism behavior
  • Refactor

    • Streamlined response evaluation heuristics and flow
    • Improved working directory and file management
    • Optimized transcript context truncation logic

- Security: move throttle file from /tmp to ~/.claude/qyl-continuation/ (symlink attack)
- Bug: remove stop_hook_active gate — field doesn't exist in Stop payload
- Bug: read msg.message.content for transcript entries (H1-H4 were inoperative)
- Bug: read transcript in Python instead of shelling out to tail
- Bug: byte-aware truncation for MAX_CONTEXT_BYTES
- Logic: H3/H4 now check for NEXT_STEP_RX before stopping
- Robustness: handle structured_output being null
Copilot AI review requested due to automatic review settings March 6, 2026 19:28
@ANcpLua ANcpLua merged commit 236cf8f into main Mar 6, 2026
6 of 8 checks passed
@ANcpLua ANcpLua deleted the fix/qyl-continuation-review branch March 6, 2026 19:28
@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses several critical findings from previous reviews, significantly enhancing the robustness, security, and logical accuracy of the qyl-continuation stop-judge. The changes ensure more reliable operation by fixing bugs related to content parsing and throttling, improving security by relocating temporary files, and refining decision-making logic to better interpret the assistant's intent regarding continuation.

Highlights

  • Security Improvement: The throttle file, previously stored in /tmp, has been relocated to ~/.claude/qyl-continuation/ to mitigate potential symlink attack vectors.
  • Bug Fixes: The stop_hook_active gate has been removed, ensuring the throttle guard consistently fires. The system now correctly reads nested content (msg.message.content) in addition to msg.content, resolving issues with H1-H4 heuristics. The transcript reading mechanism has been rewritten in pure Python, replacing the tail subprocess for improved platform independence. Byte-aware truncation for MAX_CONTEXT_BYTES prevents mid-codepoint slicing, and structured_output: null is now properly handled to avoid AttributeError in judge response parsing.
  • Logic Refinement: Heuristics H3 and H4 now check for NEXT_STEP_RX before stopping, preventing premature stops when the assistant explicitly indicates pending next actions.
Changelog
  • plugins/qyl-continuation/hooks/stop-judge.py
    • Relocated the throttle file from /tmp to a user-specific directory (~/.claude/qyl-continuation/) to enhance security.
    • Removed the stop_hook_active condition, making the throttle guard always active.
    • Implemented a pure Python solution for reading transcript files, replacing the external tail command.
    • Introduced a new helper function _get_content to correctly extract content from messages, including nested structures.
    • Modified H3 and H4 heuristics to incorporate a check for NEXT_STEP_RX, preventing premature stops when the assistant indicates further steps.
    • Ensured byte-aware truncation for the transcript JSON to prevent errors with multi-byte characters.
    • Improved the parsing of the judge's response to gracefully handle structured_output: null.
Activity
  • No human activity has been recorded on this pull request since its creation.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Mar 6, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7926660b-16b8-4a88-8246-b2f4e293aa9c

📥 Commits

Reviewing files that changed from the base of the PR and between 57f8db0 and 3cfd5b4.

📒 Files selected for processing (1)
  • plugins/qyl-continuation/hooks/stop-judge.py

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting


📝 Walkthrough

Walkthrough

Modified the stop-judge hook to introduce persistent working directory storage, refactor transcript loading from subprocess-based to direct disk reading, update throttle and heuristic evaluation logic, and enhance judge phase robustness with improved JSON parsing and context truncation.

Changes

Cohort / File(s) Summary
Infrastructure & Persistence
plugins/qyl-continuation/hooks/stop-judge.py
Added WORK_DIR constant for persistent storage, replaced hard-coded /tmp throttle path with persistent throttle file in WORK_DIR, integrated WORK_DIR as cwd for judge process.
Transcript & Content Handling
plugins/qyl-continuation/hooks/stop-judge.py
Switched transcript loading from tail subprocess to direct disk reading with fixed-size byte-based tail, added pre-checks for transcript path existence, introduced private _get_content helper, refactored text_of and has_block_type to use new helper.
Decision Logic & Evaluation
plugins/qyl-continuation/hooks/stop-judge.py
Introduced NEXT_STEP_RX regex to detect explicit user-next-step signals, updated heuristic conditions (H1-H4) to incorporate next-step detection, enhanced judge output parsing to tolerate missing structured_output fields, updated decision logic to default to stopping when evaluation is unparseable.
Control Flow & Execution
plugins/qyl-continuation/hooks/stop-judge.py
Modified approve and block functions to exit immediately after writing JSON output, removed throttle logic gating (now always executed), refactored UTF-8 byte-based context truncation for MAX_CONTEXT_BYTES compliance.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/qyl-continuation-review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3cfd5b46e8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +83 to +84
tail_bytes = raw[-(TAIL_LINES * 8192):] if len(raw) > TAIL_LINES * 8192 else raw
lines = [l.strip() for l in tail_bytes.decode("utf-8", errors="ignore").split("\n") if l.strip()]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Tail transcript by lines instead of fixed-size byte window

Using raw[-(TAIL_LINES * 8192):] assumes each of the last transcript lines is <=8KB, so a valid but large final JSON line gets truncated and fails json.loads, which can leave messages empty and force approve("Empty transcript") even when there is actionable context to continue. This is a behavioral regression from line-based tailing and will misclassify long tool outputs or assistant messages.

Useful? React with 👍 / 👎.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses previous review findings, enhancing security, fixing bugs, and refining logic, specifically by relocating the throttle file and improving message content parsing. However, it introduces two security issues: a potential path traversal vulnerability on Windows due to incomplete sanitization of the session_id, and a denial-of-service risk caused by reading the entire transcript file into memory. This memory-intensive approach also poses a significant performance issue for large files. These issues should be addressed to ensure the robustness and security of the hook across all platforms.

lines = [l.strip() for l in tail.stdout.strip().split("\n") if l.strip()]
except (subprocess.TimeoutExpired, OSError):
p = Path(transcript_path)
raw = p.read_bytes()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The current implementation reads the entire transcript file into memory using p.read_bytes(). This poses a denial-of-service risk, as extremely large files or named pipes could cause the script to crash or hang indefinitely due to excessive memory consumption. This also represents a performance regression from the previous tail-based approach. A more memory-efficient approach would be to seek near the end of the file and read only the last chunk.

transcript_path = event.get("transcript_path", "")
session_id = event.get("session_id", "unknown")
throttle_file = Path(f"/tmp/.qyl-continue-{session_id.replace('/', '_')}")
throttle_file = WORK_DIR / f"throttle-{session_id.replace('/', '_')}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The session_id is sanitized by replacing forward slashes with underscores, but backslashes (which are also path separators on Windows) are not replaced. This allows for potential path traversal out of the intended WORK_DIR on Windows systems. An attacker who can control the session_id could potentially point the throttle_file to a sensitive location, leading to unauthorized file deletion or modification when write_text() or unlink() is called.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the qyl-continuation stop hook judge to address prior review findings, primarily around safer throttling, correct transcript parsing, and more robust judge-response handling.

Changes:

  • Moves the throttle state file into ~/.claude/qyl-continuation/ and removes the non-functional stop_hook_active gating so throttling actually applies.
  • Fixes transcript parsing by reading msg.message.content in addition to msg.content, and replaces tail with a Python-based tail implementation.
  • Improves continuation logic and robustness (NEXT_STEP heuristic checks, byte-aware prompt truncation, and structured_output: null handling).

Comment on lines +25 to +26
WORK_DIR = Path.home() / ".claude" / "qyl-continuation"
WORK_DIR.mkdir(parents=True, exist_ok=True)
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WORK_DIR.mkdir(...) runs at import time and can raise (e.g., HOME not set, permission issues), which would crash the hook without emitting the required JSON decision. Consider wrapping directory creation in a try/except and falling back to a safe behavior (e.g., disable throttling / use a temporary per-run dir) while still calling approve/block.

Copilot uses AI. Check for mistakes.
Comment on lines +81 to +85
p = Path(transcript_path)
raw = p.read_bytes()
tail_bytes = raw[-(TAIL_LINES * 8192):] if len(raw) > TAIL_LINES * 8192 else raw
lines = [l.strip() for l in tail_bytes.decode("utf-8", errors="ignore").split("\n") if l.strip()]
lines = lines[-TAIL_LINES:]
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transcript tailing logic reads the entire transcript into memory (read_bytes()), and then slices a fixed byte window (TAIL_LINES * 8192). Besides being costly for large transcripts, it can also miss/partial the last JSONL entries when a single line is larger than the byte window, causing json.loads to fail and heuristics to operate on an incomplete/empty message set. Consider reading from the end of the file via seek/backward scan until you have N newlines (and avoid loading the whole file).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants