fix: add compliance rule for MAX_TOKENS model registry #2281

Open
naorpeled wants to merge 4 commits into main from fix/compliance-max-tokens-registry

Conversation

@naorpeled
Collaborator

Summary

  • Adds a compliance checklist rule clarifying that MAX_TOKENS dictionary entries in pr_agent/algo/__init__.py are the designated model registry pattern, not hardcoded configuration
  • Prevents Qodo agentic review from flagging new model entries as "rule violations" (see Add OpenAI's latest GPT-5.4-mini and GPT-5.4-nano model #2266)

Test plan

The Qodo agentic review was flagging new model entries in the
MAX_TOKENS dictionary as "hardcoded config" rule violations. This
adds a compliance rule clarifying that MAX_TOKENS is the designated
model registry, not hardcoded configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qodo-free-for-open-source-projects
Contributor

Review Summary by Qodo

Add compliance rule for MAX_TOKENS model registry pattern

📝 Documentation

Walkthroughs

Description
• Adds compliance rule clarifying MAX_TOKENS dictionary pattern
• Prevents false positive violations in model registry entries
• Documents expected model token limit registration approach
Diagram
flowchart LR
  A["Compliance Checklist"] -->|"adds new rule"| B["Model Registry Rule"]
  B -->|"clarifies"| C["MAX_TOKENS is designated registry"]
  C -->|"prevents"| D["False positive violations"]

File Changes

1. pr_compliance_checklist.yaml 📝 Documentation +6/-0

Add model registry compliance rule to checklist

• Adds new compliance rule "Model Registry Entries Are Not Hardcoded Config"
• Clarifies that MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry
• Defines success criteria as following existing MAX_TOKENS patterns with comments
• Defines failure criteria as adding model tokens outside registry or without proper format

pr_compliance_checklist.yaml



@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Mar 21, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (2) 📎 Requirement gaps (0) 📐 Spec deviations (0)


Action required

1. Unsupported config key 🐞 Bug ✓ Correctness ⭐ New
Description
.pr_agent.toml adds extra_instructions under [review_agent], but the PR review flow reads
get_settings().pr_reviewer.extra_instructions, so this new instruction will not be included in the
reviewer prompt. As a result, the intended behavior change (guidance about MAX_TOKENS) will not
take effect for /review runs in this repository.
Code

.pr_agent.toml[21]

+extra_instructions = "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry. Adding new model entries there is the expected pattern, not a hardcoded configuration violation."
Evidence
The repo-level config adds instructions under the review_agent section, but the actual prompt
variable wired into the reviewer tool is sourced from the pr_reviewer section
(get_settings().pr_reviewer.extra_instructions). The default configuration also defines
extra_instructions under [pr_reviewer], reinforcing that this is the supported knob for review
prompt customization in this codebase.

.pr_agent.toml[18-21]
pr_agent/tools/pr_reviewer.py[77-102]
pr_agent/settings/configuration.toml[74-90]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The added setting `review_agent.extra_instructions` is not used by the PR review tool implementation in this repo; the reviewer prompt consumes `pr_reviewer.extra_instructions` instead. This makes the new instruction effectively a no-op for `/review`.

## Issue Context
`PRReviewer` constructs prompt variables and explicitly sets `extra_instructions` from `get_settings().pr_reviewer.extra_instructions`.

## Fix Focus Areas
- .pr_agent.toml[18-21]
- pr_agent/tools/pr_reviewer.py[77-102]
- pr_agent/settings/configuration.toml[74-90]

## Proposed fix
Either:
1) Move/duplicate the new `extra_instructions` entry under `[pr_reviewer]` in `.pr_agent.toml` (so it is actually consumed by `PRReviewer`),

or:
2) If `review_agent` is intended to be a real tool/command in this repo, implement consumption of `get_settings().review_agent.extra_instructions` in the relevant execution path and add a corresponding defaults section in `pr_agent/settings/configuration.toml`.



2. extra_instructions exceeds 120 chars 📘 Rule violation ⚙ Maintainability
Description
The newly added extra_instructions line in .pr_agent.toml exceeds the project’s 120-character
line-length convention, creating an avoidable noisy formatting deviation. This violates the
requirement to preserve existing formatting and keep 120-character lines.
Code

.pr_agent.toml[21]

+extra_instructions = "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry. Adding new model entries there is the expected pattern, not a hardcoded configuration violation."
Evidence
PR Compliance ID 8 requires preserving existing formatting conventions including 120-character line
length. The added extra_instructions assignment is a single long line that exceeds 120 characters.

AGENTS.md
.pr_agent.toml[21-21]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The added `.pr_agent.toml` `extra_instructions` value is on a single line exceeding the repository’s 120-character line-length convention.
## Issue Context
This repository’s formatting guidance expects 120-character lines; long single-line config strings should be wrapped (e.g., via TOML multiline basic strings).
## Fix Focus Areas
- .pr_agent.toml[21-21]



3. AGENTS.md line exceeds 120 📘 Rule violation ⚙ Maintainability
Description
The modified bullet line in AGENTS.md exceeds the project’s 120-character line-length convention,
reducing readability and diverging from established formatting. This violates the requirement to
preserve existing formatting and keep 120-character lines.
Code

AGENTS.md[7]

+- **Do** adjust configuration through `.pr_agent.toml` or files under `pr_agent/settings/` instead of hard-coding values. The `MAX_TOKENS` dictionary in `pr_agent/algo/__init__.py` is the designated model token registry — new model entries there are expected, not hardcoded configuration.
Evidence
PR Compliance ID 8 requires preserving existing formatting conventions including 120-character line
length. The updated list item is a single long line that exceeds 120 characters instead of being
wrapped appropriately for Markdown lists.

AGENTS.md
AGENTS.md[7-7]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The updated `AGENTS.md` bullet exceeds the repository’s 120-character line-length convention.
## Issue Context
Markdown list items should be wrapped onto subsequent indented lines to preserve readability and match the project’s formatting standard.
## Fix Focus Areas
- AGENTS.md[7-7]
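
The wrapping described here can be done by hand or with a small script. The following is a minimal sketch using the standard-library `textwrap` module; the bullet text is a shortened version of the one in the PR, and `wrap_bullet` is a hypothetical helper, not part of the repository's tooling.

```python
import textwrap

MAX_WIDTH = 120

BULLET = (
    "- **Do** adjust configuration through `.pr_agent.toml` or files under "
    "`pr_agent/settings/` instead of hard-coding values. The `MAX_TOKENS` "
    "dictionary in `pr_agent/algo/__init__.py` is the designated model token registry."
)


def wrap_bullet(text: str, width: int = MAX_WIDTH) -> str:
    """Wrap one Markdown list item, indenting continuation lines by two spaces."""
    return textwrap.fill(
        text,
        width=width,
        subsequent_indent="  ",
        # Keep inline code spans and hyphenated words such as "hard-coding" intact.
        break_long_words=False,
        break_on_hyphens=False,
    )


wrapped = wrap_bullet(BULLET)
print(wrapped)
```

The two-space continuation indent keeps the wrapped lines inside the same list item when the Markdown is rendered.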




Remediation recommended

4. Ambiguous comment requirement 🐞 Bug ⚙ Maintainability
Description
The new compliance rule’s success_criteria requires “appropriate comments” for new MAX_TOKENS
entries, but the existing MAX_TOKENS registry contains many entries without comments, so this does
not reflect the “existing patterns” it references. This inconsistency can lead to unpredictable or
confusing compliance expectations for future model additions.
Code

pr_compliance_checklist.yaml[R32-36]

+  - title: "Model Registry Entries Are Not Hardcoded Config"
+    compliance_label: false
+    objective: "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry. Adding new model entries there is the expected pattern, not a hardcoded configuration violation"
+    success_criteria: "New model entries follow existing patterns in the MAX_TOKENS dictionary with appropriate comments"
+    failure_criteria: "Model token limits added outside of the MAX_TOKENS registry or without following the existing entry format"
Evidence
The newly added rule explicitly requires comments for new MAX_TOKENS entries. However, the
referenced registry file shows many entries in MAX_TOKENS without comments (while only some have
them), so “existing patterns” do not consistently include comments.

pr_compliance_checklist.yaml[32-36]
pr_agent/algo/__init__.py[1-11]
pr_agent/algo/__init__.py[12-18]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The compliance rule’s `success_criteria` says new `MAX_TOKENS` entries should include “appropriate comments,” but the existing `MAX_TOKENS` registry does not consistently include comments. This makes the rule internally inconsistent with its own stated reference point (“existing patterns”).
### Issue Context
`MAX_TOKENS` in `pr_agent/algo/__init__.py` contains many entries with no inline comment, and some entries with comments.
### Fix Focus Areas
- pr_compliance_checklist.yaml[32-36]


Previous review results

Review updated until commit 7f42313

Results up to commit N/A


🐞 Bugs (0) 📘 Rule violations (2) 📎 Requirement gaps (0) 📐 Spec deviations (0)

Copilot AI left a comment


Pull request overview

Adds a new PR compliance checklist rule to clarify that MAX_TOKENS entries in pr_agent/algo/__init__.py are the intended model registry, aiming to prevent automated reviews from incorrectly flagging new model additions as “hardcoded config”.

Changes:

  • Introduces a compliance checklist item explicitly designating MAX_TOKENS as the approved model token registry location.
  • Defines success/failure criteria to guide where model token limits should be added.


  - title: "Model Registry Entries Are Not Hardcoded Config"
    compliance_label: false
    objective: "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry. Adding new model entries there is the expected pattern, not a hardcoded configuration violation"
    success_criteria: "New model entries follow existing patterns in the MAX_TOKENS dictionary with appropriate comments"

Copilot AI Mar 21, 2026


The success criteria currently require "appropriate comments" for new MAX_TOKENS entries, but many existing registry entries in pr_agent/algo/__init__.py have no inline comment. If this checklist is meant to prevent false positives, consider relaxing this to "follow the existing MAX_TOKENS entry format" (and only require comments when the limit is non-obvious); otherwise the rule may introduce new false positives for otherwise consistent entries.

Suggested change
-    success_criteria: "New model entries follow existing patterns in the MAX_TOKENS dictionary with appropriate comments"
+    success_criteria: "New model entries follow the existing entry format in the MAX_TOKENS dictionary; add inline comments when the token limit is non-obvious"
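
To make "follow the existing entry format" concrete, here is a rough sketch of what such a check could look like. It assumes, as the thread implies but does not show, that the registry maps model-name strings to integer token limits. The model names and limits are invented for illustration and are not taken from `pr_agent/algo/__init__.py`; `follows_entry_format` is a hypothetical helper.

```python
# Illustrative registry; the model names and limits below are made up and are
# NOT taken from pr_agent/algo/__init__.py.
MAX_TOKENS = {
    "hypothetical-model-small": 8_000,
    "hypothetical-model-large": 128_000,  # comment only where the limit needs explaining
}


def follows_entry_format(name: object, limit: object) -> bool:
    """Return True when an entry looks like `"model-name": positive_int`."""
    is_name_ok = isinstance(name, str) and name.strip() != ""
    # bool is a subclass of int, so exclude it explicitly.
    is_limit_ok = isinstance(limit, int) and not isinstance(limit, bool) and limit > 0
    return is_name_ok and is_limit_ok


violations = [name for name, limit in MAX_TOKENS.items() if not follows_entry_format(name, limit)]
print(violations)
```

A check of this shape depends only on the key and value, so entries with and without inline comments pass equally, which matches the relaxed wording in the suggested change.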

Move the MAX_TOKENS registry guidance to review_agent extra_instructions
in .pr_agent.toml so the agentic review actually sees and respects it.
Remove the compliance checklist entry which doesn't affect review comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Mar 21, 2026

Code Review by Qodo

New Review Started

This review has been superseded by a new analysis


The Qodo agentic review cites AGENTS.md rules as evidence for its
findings. Clarify that the MAX_TOKENS dictionary is the designated
model registry so it stops flagging new entries as rule violations.
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Mar 21, 2026

Persistent review updated to latest commit cf8fb2f

@naorpeled
Collaborator Author

@ofir-frd what's the best way to keep Qodo from reporting the hardcoded-values finding (like in #2266 (comment))?

[review_agent]
enabled = true
publish_output = true
extra_instructions = "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry. Adding new model entries there is the expected pattern, not a hardcoded configuration violation."
Contributor


Action required

1. extra_instructions exceeds 120 chars 📘 Rule violation ⚙ Maintainability

The newly added extra_instructions line in .pr_agent.toml exceeds the project’s 120-character
line-length convention, creating an avoidable noisy formatting deviation. This violates the
requirement to preserve existing formatting and keep 120-character lines.
Agent Prompt
## Issue description
The added `.pr_agent.toml` `extra_instructions` value is on a single line exceeding the repository’s 120-character line-length convention.

## Issue Context
This repository’s formatting guidance expects 120-character lines; long single-line config strings should be wrapped (e.g., via TOML multiline basic strings).

## Fix Focus Areas
- .pr_agent.toml[21-21]


AGENTS.md Outdated
- **Do** match the interpreter requirement declared in `pyproject.toml` (Python ≥ 3.12) and install `requirements.txt` plus `requirements-dev.txt` before running tools.
- **Do** run tests with `PYTHONPATH=.` set to keep imports functional (for example `PYTHONPATH=. ./.venv/bin/pytest tests/unittest/test_fix_json_escape_char.py -q`).
-- **Do** adjust configuration through `.pr_agent.toml` or files under `pr_agent/settings/` instead of hard-coding values.
+- **Do** adjust configuration through `.pr_agent.toml` or files under `pr_agent/settings/` instead of hard-coding values. The `MAX_TOKENS` dictionary in `pr_agent/algo/__init__.py` is the designated model token registry — new model entries there are expected, not hardcoded configuration.
Contributor


Action required

2. AGENTS.md line exceeds 120 📘 Rule violation ⚙ Maintainability

The modified bullet line in AGENTS.md exceeds the project’s 120-character line-length convention,
reducing readability and diverging from established formatting. This violates the requirement to
preserve existing formatting and keep 120-character lines.
Agent Prompt
## Issue description
The updated `AGENTS.md` bullet exceeds the repository’s 120-character line-length convention.

## Issue Context
Markdown list items should be wrapped onto subsequent indented lines to preserve readability and match the project’s formatting standard.

## Fix Focus Areas
- AGENTS.md[7-7]


@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Mar 22, 2026

Persistent review updated to latest commit 7f42313

[review_agent]
enabled = true
publish_output = true
extra_instructions = "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated model token registry. Adding new model entries there is the expected pattern, not a hardcoded configuration violation."
Contributor


Action required

1. Unsupported config key 🐞 Bug ✓ Correctness

.pr_agent.toml adds extra_instructions under [review_agent], but the PR review flow reads
get_settings().pr_reviewer.extra_instructions, so this new instruction will not be included in the
reviewer prompt. As a result, the intended behavior change (guidance about MAX_TOKENS) will not
take effect for /review runs in this repository.
Agent Prompt
## Issue description
The added setting `review_agent.extra_instructions` is not used by the PR review tool implementation in this repo; the reviewer prompt consumes `pr_reviewer.extra_instructions` instead. This makes the new instruction effectively a no-op for `/review`.

## Issue Context
`PRReviewer` constructs prompt variables and explicitly sets `extra_instructions` from `get_settings().pr_reviewer.extra_instructions`.

## Fix Focus Areas
- .pr_agent.toml[18-21]
- pr_agent/tools/pr_reviewer.py[77-102]
- pr_agent/settings/configuration.toml[74-90]

## Proposed fix
Either:
1) Move/duplicate the new `extra_instructions` entry under `[pr_reviewer]` in `.pr_agent.toml` (so it is actually consumed by `PRReviewer`),

or:
2) If `review_agent` is intended to be a real tool/command in this repo, implement consumption of `get_settings().review_agent.extra_instructions` in the relevant execution path and add a corresponding defaults section in `pr_agent/settings/configuration.toml`.


@JiwaniZakir JiwaniZakir left a comment


The extra_instructions string in .pr_agent.toml hardcodes a specific file path (pr_agent/algo/__init__.py) to suppress a false-positive compliance flag. If that file is ever renamed or relocated, this instruction silently becomes stale and either stops working or misleads the review agent. It would be worth adding a brief comment nearby (or in a CHANGELOG/DECISIONS doc) explaining why this override was necessary and what the original compliance rule was triggering on, so future maintainers understand the intent rather than treating it as dead config. Additionally, if other token/model registries exist elsewhere in the codebase with similar patterns, this instruction only exempts the one file — consider whether the scope is broad enough or if the underlying compliance rule itself should be tuned to recognize registry patterns more generally.
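
One possible guard against the stale-path risk raised here is a small check that every file path named in the instruction still exists. The sketch below is only an illustration of that idea and is not something the PR includes; `missing_paths` and the path-matching regex are hypothetical, and the instruction text is shortened. The demonstration runs against a temporary directory standing in for the repository root.

```python
import re
import tempfile
from pathlib import Path

INSTRUCTION = (
    "The MAX_TOKENS dictionary in pr_agent/algo/__init__.py is the designated "
    "model token registry."
)

# Matches slash-separated tokens ending in ".py", e.g. "pkg/module/__init__.py".
PY_PATH_PATTERN = re.compile(r"[\w.]+(?:/[\w.]+)*\.py")


def missing_paths(instruction: str, repo_root: Path) -> list[str]:
    """Return the .py paths named in `instruction` that do not exist under `repo_root`."""
    referenced = PY_PATH_PATTERN.findall(instruction)
    return [path for path in referenced if not (repo_root / path).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = missing_paths(INSTRUCTION, root)  # file not created yet
    target = root / "pr_agent" / "algo" / "__init__.py"
    target.parent.mkdir(parents=True)
    target.touch()
    after = missing_paths(INSTRUCTION, root)  # file now present

print(before, after)
```

Run as part of a test suite, a check like this would fail as soon as the referenced file is renamed or moved, rather than letting the instruction go stale silently.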
