handle variable length args in tag validation #351

joshuadavidthomas · 2025-11-04T06:52:30Z

closes #339

Summary by CodeRabbit

Bug Fixes
- Fixed false-positive "accepts at most N arguments" errors for expressions with operators and for quoted strings containing spaces in template tags.
Tests
- Added comprehensive tests covering multi-token expressions, quoted/escaped strings, varargs, assignments, zero-arg tags, and edge cases.
Documentation
- Added a "Fixed" subsection under Unreleased in the changelog.
Refactor
- Improved tag-argument parsing and validation to correctly handle tokenization and per-argument consumption rules.

codspeed-hq · 2025-11-04T07:13:41Z

CodSpeed Performance Report

Merging #351 will improve performance by 16.37%

Comparing `fix-variable-args-in-tag` (2d70bb8) with `main` (2da341b)

Summary

⚡ 1 improvement
✅ 19 untouched

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`parse_template[django/medium/admin_login.html]`	112.1 µs	96.3 µs	+16.37%

coderabbitai · 2025-11-04T13:05:01Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2025-11-04T13:05:04Z

Walkthrough

Replaces naive whitespace splitting with a quote-aware tag-argument tokenizer and refactors argument validation to consume tokens per TagArg variant, preventing false-positive "accepts at most N arguments" errors for operator-containing expressions and quoted strings with spaces.

Changes

Cohort / File(s)	Summary
Changelog `CHANGELOG.md`	Added "Fixed" entries under Unreleased for false positives when tag arguments contain operators (e.g., `{% if x > 0 %}`) and for quoted strings with spaces (e.g., `{% translate "Contact the owner" %}`).
Parser — tag tokenization `crates/djls-templates/src/parser.rs`	Introduced a private `parse_tag_args` quote-aware tokenizer handling quoted strings and escapes; `parse_block` now calls it to produce `(name, tokens[])` instead of naive whitespace splitting.
Semantic — argument validation `crates/djls-semantic/src/semantic/args.rs`	Removed unconditional TooManyArguments check; refactored validation to consume tokens per `TagArg` variant (Var/String: one token; Expr: greedy until next literal; Assignment: up to literal with "as"/"=" handling; VarArgs: remaining tokens). Added `find_next_literal` helper and extensive tests covering expressions, quoted strings, assignments, varargs, and extra-argument cases.

Sequence Diagram(s)

sequenceDiagram
    participant Parser
    participant Tokenizer as parse_tag_args()
    participant Semantic as validate_args()
    participant Validator as validate_choices_and_order()

    Parser->>Tokenizer: tag content string
    activate Tokenizer
    Tokenizer-->>Parser: (name, tokens[]) 
    deactivate Tokenizer

    Parser->>Semantic: pass tokens[]
    activate Semantic
    Semantic->>Semantic: extract literal specs
    Semantic->>Validator: validate order & token consumption
    activate Validator
    alt Token is Expr
        Validator->>Validator: consume tokens greedily until next literal or end
        Note right of Validator: multi-token expressions (e.g., "message.input_tokens > 0")
    else Token is String or Var
        Validator->>Validator: consume exactly one token
    else Token is Assignment
        Validator->>Validator: consume up to next literal with "as"/"=" handling
    else Token is VarArgs
        Validator->>Validator: consume remaining tokens
    end
    Validator-->>Semantic: validation result / errors
    deactivate Validator
    Semantic-->>Parser: report errors or success
    deactivate Semantic

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Inspect crates/djls-semantic/src/semantic/args.rs for correct token-consumption logic (Expr greedy behavior and Assignment edge cases).
Verify crates/djls-templates/src/parser.rs parse_tag_args handles quotes, escapes, and both quote types to align with semantic expectations.
Review added tests for coverage of operator expressions, quoted strings with spaces, varargs, and extra-argument scenarios.

Poem

🐇 I nibble tokens in quotes and leap past signs,
No more stray counts when an expression aligns.
Spaces in strings stay snug and one,
Assignments and varargs hopped home, job done. 🎉

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'handle variable length args in tag validation' clearly and specifically summarizes the main change addressing argument count validation in Django template tags.
Linked Issues check	✅ Passed	The PR successfully addresses issue #339 by fixing false positive S105 errors for expressions with operators like comparisons in template tags.
Out of Scope Changes check	✅ Passed	All changes focus on fixing argument validation in tag handling; no unrelated modifications were introduced outside the scope of issue #339.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e67012 and 2d70bb8.

📒 Files selected for processing (1)

crates/djls-semantic/src/semantic/args.rs (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

crates/djls-semantic/src/semantic/args.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: Python 3.13, Django 5.1 (ubuntu-latest)
GitHub Check: Python 3.14, Django main (ubuntu-latest)
GitHub Check: Python 3.14, Django 4.2 (ubuntu-latest)
GitHub Check: Python 3.13, Django 6.0a1 (ubuntu-latest)
GitHub Check: Python 3.14, Django 6.0a1 (ubuntu-latest)
GitHub Check: Python 3.14, Django 5.2 (ubuntu-latest)
GitHub Check: Python 3.12, Django 5.2 (ubuntu-latest)
GitHub Check: Python 3.13, Django 5.2 (ubuntu-latest)
GitHub Check: Python 3.10, Django 5.1 (ubuntu-latest)
GitHub Check: Python 3.13, Django 4.2 (ubuntu-latest)
GitHub Check: Python 3.12, Django 5.1 (ubuntu-latest)
GitHub Check: Python 3.10, Django 5.2 (ubuntu-latest)
GitHub Check: Python 3.11, Django 4.2 (ubuntu-latest)
GitHub Check: Python 3.10, Django 4.2 (ubuntu-latest)
GitHub Check: Python 3.12, Django main (ubuntu-latest)
GitHub Check: Python 3.12, Django 6.0a1 (ubuntu-latest)
GitHub Check: Python 3.12, Django 4.2 (ubuntu-latest)
GitHub Check: Python 3.11, Django 5.1 (ubuntu-latest)
GitHub Check: Python 3.11, Django 5.2 (ubuntu-latest)
GitHub Check: benchmarks

Comment `@coderabbitai help` to get the list of available commands and usage tips.

joshuadavidthomas · 2025-11-04T13:06:07Z

@coderabbitai review

coderabbitai · 2025-11-04T13:06:21Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

crates/djls-semantic/src/semantic/args.rs (1)

318-750: Excellent test coverage for the core scenarios.

The test suite comprehensively covers:

Multi-token expressions (the original issue in #339)

Quoted strings as single tokens

Optional and required literals

VarArgs handling

Assignment patterns

Zero-argument tags rejecting extras

Complex tag structures

The check_validation_errors helper (lines 390-431) properly constructs templates with appropriate closing tags for validation.

Consider adding tests for the edge cases noted in previous comments:

An Expr followed by a Literal when only the literal token is provided

An Assignment followed by a Literal where the assignment consumes the literal (verifying the fix for the major issue flagged earlier)

These would help prevent regressions and clarify the intended behavior.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 7375b18 and 399072a.

📒 Files selected for processing (3)

CHANGELOG.md (1 hunks)
crates/djls-semantic/src/semantic/args.rs (4 hunks)
crates/djls-templates/src/parser.rs (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

CHANGELOG.md

🧰 Additional context used

🧬 Code graph analysis (2)

crates/djls-templates/src/parser.rs (1)

crates/djls-templates/src/nodelist.rs (1)

span (41-49)

crates/djls-semantic/src/semantic/args.rs (4)

crates/djls-semantic/src/templatetags/builtins.rs (1)

django_builtin_specs (716-718)

crates/djls-semantic/src/templatetags/specs.rs (1)

name (264-274)

crates/djls-semantic/src/db.rs (4)

tag_specs (12-12)

tag_index (14-14)

template_dirs (16-16)

diagnostics_config (19-19)

crates/djls-semantic/src/lib.rs (1)

validate_nodelist (41-50)

🔇 Additional comments (8)

crates/djls-templates/src/parser.rs (2)

88-88: LGTM: Clean delegation to quote-aware parser.

The refactor delegates tag argument parsing to the new parse_tag_args helper, which implements quote-aware tokenization to fix the variable-length argument issue.

94-130: Quote-aware tokenization looks solid.

The implementation correctly:

Preserves quoted strings as single tokens (lines 108-114)

Handles escape sequences within quotes (lines 102-107)

Splits on whitespace only outside quotes (lines 116-120)

Handles edge cases like trailing tokens (lines 124-126) and empty tags (line 128)

One minor note: Unclosed quotes will cause the remainder of the content to be treated as a single token (the in_quote state remains Some). This is reasonable for error recovery, though it means malformed quotes won't be explicitly flagged here.
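
To make the behaviour above concrete, here is a minimal sketch of a quote-aware tokenizer. It is not the PR's `parse_tag_args`: the name `split_tag_args`, the `(String, Vec<String>)` return shape, and the choice to keep the surrounding quotes inside each token are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the PR's private `parse_tag_args`.
/// Splits tag content into a tag name and argument tokens, treating a
/// quoted string (single or double quotes) as one token.
fn split_tag_args(content: &str) -> (String, Vec<String>) {
    let mut tokens: Vec<String> = Vec::new();
    let mut current = String::new();
    // `Some(q)` while we are inside a string opened with quote character `q`.
    let mut in_quote: Option<char> = None;
    let mut chars = content.chars();

    while let Some(ch) = chars.next() {
        match in_quote {
            Some(quote) => {
                current.push(ch);
                if ch == '\\' {
                    // Copy the escaped character verbatim so that an
                    // escaped quote does not close the string.
                    if let Some(escaped) = chars.next() {
                        current.push(escaped);
                    }
                } else if ch == quote {
                    in_quote = None;
                }
            }
            None => {
                if ch == '"' || ch == '\'' {
                    in_quote = Some(ch);
                    current.push(ch);
                } else if ch.is_whitespace() {
                    // Whitespace outside quotes ends the current token, if any.
                    if !current.is_empty() {
                        tokens.push(std::mem::take(&mut current));
                    }
                } else {
                    current.push(ch);
                }
            }
        }
    }

    // Flush the trailing token. With an unclosed quote, everything after
    // the opening quote ends up here as a single token.
    if !current.is_empty() {
        tokens.push(current);
    }

    // The first token is the tag name; empty content yields an empty name.
    let mut iter = tokens.into_iter();
    let name = iter.next().unwrap_or_default();
    (name, iter.collect())
}
```

Under these assumptions, `translate "Contact the owner"` produces the name `translate` and one argument token, while `if x > 0` produces three argument tokens that the semantic layer has to group.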

crates/djls-semantic/src/semantic/args.rs (6)

147-157: Appropriate setup for enhanced validation logic.

The clippy::too_many_lines allow is justified given the complexity of per-argument-type validation. Using enumerate to track arg_index enables the subsequent find_next_literal lookups.

204-207: LGTM: Explicit single-token consumption for atomic arguments.

Var and String arguments correctly consume exactly one token each, making the validation behavior explicit and clear.

208-231: Expression consumption logic handles the core issue correctly.

The greedy consumption until the next literal (lines 218-225) correctly solves the problem where {% if message.input_tokens > 0 %} was incorrectly flagged as too many arguments.

Minor edge case: If an Expr is immediately followed by a Literal in the spec, and the only remaining token is that literal, lines 228-230 will consume the literal as part of the expression. This creates an error state where both the expression and literal end up incorrect. However, this is an unlikely scenario in practice since well-formed templates would provide tokens for both required arguments.

Consider adding a test for this edge case:
// {% mytag literal %} where spec expects [Expr(required), Literal("literal", required)]
let bits = vec!["literal".to_string()];
let args = vec![
    TagArg::Expr { name: "expr".into(), required: true },
    TagArg::Literal { lit: "literal".into(), required: true },
];
// What error should this produce?

263-267: LGTM: VarArgs correctly consumes all remaining tokens.

Setting bit_index = bits.len() ensures that variable-length argument tags (like url and load) never trigger the "too many arguments" check at line 271.

270-281: Correct fix: Check for unconsumed tokens after processing all arguments.

This is the core fix for issue #339. After processing all argument specs, any remaining tokens indicate too many arguments were provided. The max: args.len() represents the number of argument positions in the spec (not the number of tokens), which correctly reflects how Django counts "arguments."

Note that VarArgs sets bit_index = bits.len() (line 265), so this check is never reached for tags with variable-length arguments.
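
The interaction between VarArgs and the leftover-token check can be sketched as follows. This is a reduced illustration, not the code in `args.rs`: `ArgSpec`, `ArgError`, and `check_arity` are hypothetical names, only two spec kinds are modelled, and missing-argument errors are ignored.

```rust
/// Reduced stand-in for the PR's `TagArg` (hypothetical, two kinds only).
#[derive(Debug, PartialEq)]
enum ArgSpec {
    /// Like Var or String: consumes exactly one token.
    Single,
    /// Consumes every remaining token.
    VarArgs,
}

/// Hypothetical error type mirroring the `max` field mentioned above.
#[derive(Debug, PartialEq)]
enum ArgError {
    TooManyArguments { max: usize },
}

/// Walks the specs, advancing `bit_index` per spec, then reports any
/// tokens that no spec consumed.
fn check_arity(bits: &[&str], args: &[ArgSpec]) -> Result<(), ArgError> {
    let mut bit_index = 0;
    for arg in args {
        match arg {
            ArgSpec::Single => {
                if bit_index < bits.len() {
                    bit_index += 1;
                }
            }
            // Jumping to the end means the check below cannot fire.
            ArgSpec::VarArgs => bit_index = bits.len(),
        }
    }
    // `max` counts spec positions, not tokens.
    if bit_index < bits.len() {
        return Err(ArgError::TooManyArguments { max: args.len() });
    }
    Ok(())
}
```

In this sketch a zero-argument spec with one token yields `TooManyArguments { max: 0 }`, while any spec list containing `VarArgs` accepts arbitrarily many tokens.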

295-316: LGTM: Helper functions are straightforward and correct.

argument_name (lines 295-305): Extracts the name from any TagArg variant for error messages.

find_next_literal (lines 307-316): Locates the next Literal keyword in the remaining arguments, enabling Expr and Assignment arguments to know when to stop consuming tokens.

Both helpers are implemented correctly and serve their purposes well.
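
For readers who have not opened the diff, the lookahead idea can be sketched like this. The names `Spec`, `next_literal_after`, and `consume_expr` are hypothetical and the enum is cut down to two variants; the sketch also leaves out any minimum-token rule, so it should not be read as the behaviour of the real `find_next_literal` call sites.

```rust
/// Reduced stand-in for the PR's `TagArg` (hypothetical, two variants only).
enum Spec {
    Expr,
    Literal(&'static str),
}

/// Returns the first literal keyword among the specs after `arg_index`.
fn next_literal_after(args: &[Spec], arg_index: usize) -> Option<&'static str> {
    args.iter().skip(arg_index + 1).find_map(|arg| match arg {
        Spec::Literal(lit) => Some(*lit),
        Spec::Expr => None,
    })
}

/// Greedily consumes expression tokens starting at `bit_index`, stopping
/// before `stop_at` if it appears. Returns the index of the first token
/// that was not consumed.
fn consume_expr(bits: &[&str], mut bit_index: usize, stop_at: Option<&str>) -> usize {
    while bit_index < bits.len() {
        if stop_at == Some(bits[bit_index]) {
            break;
        }
        bit_index += 1;
    }
    bit_index
}
```

With no later literal in the spec, as for a plain `if`, the expression takes every remaining token, which is why `message.input_tokens > 0` counts as one argument rather than three.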

crates/djls-semantic/src/semantic/args.rs

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

crates/djls-semantic/src/semantic/args.rs (1)

232-263: Assignment "as" handling can still consume the next literal keyword.

While the initial check (lines 241-245) correctly validates before consuming (addressing part of the previous review), the "as" handling (lines 255-261) still consumes the variable name without checking if it matches next_literal. This can swallow the literal keyword, making it unavailable for subsequent Literal arguments.

Apply this diff to check before consuming the variable name after "as":
             // If we hit "as", consume one more token (the variable name)
             if token == "as" {
                 if bit_index < bits.len() {
-                    bit_index += 1;
+                    // Don't consume if it's the next literal keyword
+                    if let Some(lit) = next_literal {
+                        if bits[bit_index] != lit {
+                            bit_index += 1;
+                        }
+                    } else {
+                        bit_index += 1;
+                    }
                 }
                 break;
             }
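
As a self-contained illustration of the guard this diff proposes, the sketch below consumes tokens for an assignment-style argument and refuses to take a token that matches the next literal keyword. `consume_assignment` and its signature are hypothetical and not taken from `args.rs`; "=" handling is left out entirely.

```rust
/// Hypothetical helper: consumes tokens for an assignment argument
/// starting at `bit_index` and returns the index of the first token it
/// did not consume. `next_literal` is the keyword a later spec expects.
fn consume_assignment(bits: &[&str], mut bit_index: usize, next_literal: Option<&str>) -> usize {
    while bit_index < bits.len() {
        let token = bits[bit_index];
        // Stop before a keyword that belongs to a later Literal spec.
        if next_literal == Some(token) {
            break;
        }
        bit_index += 1;
        if token == "as" {
            // Take the variable name after "as" unless that token is
            // actually the next literal keyword.
            if bit_index < bits.len() && next_literal != Some(bits[bit_index]) {
                bit_index += 1;
            }
            break;
        }
    }
    bit_index
}
```

For the tokens `value as only` with `only` as the next literal, this sketch stops at index 2, leaving `only` available for the Literal spec that follows.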

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 2d1c020 and 6e67012.

📒 Files selected for processing (1)

crates/djls-semantic/src/semantic/args.rs (4 hunks)

🔇 Additional comments (3)

crates/djls-semantic/src/semantic/args.rs (3)

271-282: Correct handling of excess arguments.

The logic correctly identifies unconsumed tokens after processing all argument specs as a TooManyArguments error. The VarArgs case is properly handled by consuming all remaining tokens, preventing false positives.

308-317: Clean helper implementation.

The find_next_literal helper is straightforward and correctly identifies the next literal keyword in the argument list, enabling expression and assignment arguments to know when to stop token consumption.

319-751: Excellent test coverage addressing the core issue.

The test suite comprehensively validates the new token consumption logic:

Core fix verification: test_if_tag_with_comparison_operator (lines 435-454) directly tests issue #339, ensuring multi-token expressions with operators no longer trigger false-positive S105 errors.

Edge cases: Tests cover quoted strings, optional literals, complex expressions, assignments with filters, VarArgs, and literal boundary conditions.

Regression prevention: Tests for no-arg tags (csrf_token, debug) and fixed-arg tags (autoescape, now, regroup) ensure the refactored logic doesn't incorrectly accept extra arguments.

The test infrastructure using TestDatabase and check_validation_errors properly exercises the validation path through real Django builtin specs.

crates/djls-semantic/src/semantic/args.rs

joshuadavidthomas force-pushed the fix-variable-args-in-tag branch from d00a407 to bfdd17d on November 4, 2025 07:04

Base automatically changed from tagspecs-v0.5.0 to main November 4, 2025 16:15

joshuadavidthomas force-pushed the fix-variable-args-in-tag branch from 7375b18 to 90d6ac8 on November 4, 2025 16:17

joshuadavidthomas and others added 7 commits November 4, 2025 10:18

handle variable length args in tag validation

db2f956

update changelog

fcc7788

[pre-commit.ci] auto fixes from pre-commit.com hooks

438821b

for more information, see https://pre-commit.ci

lippy

7644f6b

clippy

8502e23

delete

5288c64

remove

e0bc4ec

joshuadavidthomas force-pushed the fix-variable-args-in-tag branch from 90d6ac8 to e0bc4ec on November 4, 2025 16:18

joshuadavidthomas added 3 commits November 4, 2025 10:48

update logic and tests to validate regression and negative tests

dffd0a9

clippy fmt

6d024f8

move to helper method

399072a

joshuadavidthomas marked this pull request as ready for review November 4, 2025 19:38

coderabbitai bot reviewed Nov 4, 2025

crates/djls-semantic/src/semantic/args.rs (resolved)

joshuadavidthomas added 2 commits November 4, 2025 13:46

index based

2d1c020

fix

6e67012

coderabbitai bot reviewed Nov 4, 2025

crates/djls-semantic/src/semantic/args.rs (outdated, resolved)

joshuadavidthomas added 5 commits November 4, 2025 14:28

fix

d4fe058

clippy

c91dabd

move

3e52e04

oops

31c9dc0

move

2d70bb8

joshuadavidthomas merged commit 26847c2 into main Nov 4, 2025
34 checks passed

joshuadavidthomas deleted the fix-variable-args-in-tag branch November 4, 2025 20:52
