Enhance code quality and robustness: fixes for pattern context, parser loop, namespace validation, week validation, and bitflags #287

Closed
mundanevision20 wants to merge 5 commits into facelessuser:main from mundanevision20:mundanevision20/enhance-code-quality

@mundanevision20

Hi @facelessuser ,
thank you for maintaining soupsieve; I appreciate your work.

Suggested changes

  • Fixes for pattern-context offsets, parser loop termination, namespace validation, week validation, and bitflags assignment.

These targeted fixes are informed by reproduced edge cases and investigation notes. They address incorrect offsets, potential parser non-termination on malformed input, namespace validation gaps, week validation edge cases, and an incorrect bitflag assignment. The changes are minimal and focused on root causes.

I'm happy to split this PR into smaller PRs or add a changelog entry if preferred.

@gir-bot added the labels "S: needs-review" (Needs to be reviewed and/or approved), "C: css-matching" (Related to CSS matching), "C: css-parsing" (Related to CSS parsing), and "C: source" (Related to source code) on Jan 14, 2026
@facelessuser
Owner

I probably would like to see these broken up, but additionally, tests defined that show the use cases you are fixing to help justify the changes.

@facelessuser
Owner

Just taking a quick look:

  1. The change related to calendar week makes sense and was a missed case; this should be easily testable, demonstrating edge cases.
  2. The CustomSelector and Namespaces validator change seems to be a valid change (honestly, I don't know why I wasn't validating the keys of the dictionary). Should be testable.
  3. I'm curious if you were able to reproduce the infinite loop case, but the idea of handling a failed case makes sense. I'm curious if the issue can be produced or if you are just protecting against a theoretical case. Through development, I've never run into such a case as things "should" always be valid, but, obviously, the real-world may prove different 🙂.
  4. I'm not sure if the flag case is fixing a real case or a perceived case. I'd like to see a test that fails before the fix that is now working after.
  5. The index case, I'm also curious to see if there is a real case. I hadn't noticed issues with the indexing, but that doesn't mean there isn't one.
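
For illustration, the key validation mentioned in item 2 could look roughly like the sketch below. This is not soupsieve's actual implementation: `validate_str_mapping` and its `name` parameter are hypothetical, and the sketch only assumes that both keys and values of a namespace or custom-selector dictionary are expected to be strings.

```python
from collections.abc import Mapping


def validate_str_mapping(mapping, name="Namespaces"):
    """Return a plain dict copy, raising TypeError unless all keys and values are str.

    Hypothetical helper for illustration; not part of soupsieve's API.
    """
    if not isinstance(mapping, Mapping):
        raise TypeError(f"{name} must be a mapping, got {type(mapping).__name__}")
    for key, value in mapping.items():
        # Checking the key as well as the value is the point of the fix discussed above.
        if not isinstance(key, str):
            raise TypeError(f"{name} keys must be str, got {type(key).__name__}")
        if not isinstance(value, str):
            raise TypeError(f"{name} values must be str, got {type(value).__name__}")
    return dict(mapping)
```

A test for such a change would pass a dictionary with a non-string key, for example `{1: "http://www.w3.org/2000/svg"}`, and assert that a `TypeError` is raised.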

@facelessuser
Owner

Also, changelog entries would be appreciated.

@facelessuser
Owner

It looks like the changes break a few of the week-related :in-range and :out-of-range tests. They also seem to break a debug output case.

These changes need to be fully tested, and if an existing test needs to be changed, we need to be very sure the original case was wrong. From what I can tell, none of the failing tests should be failing. It seems the proposed changes actually introduce some regressions. So breaking these changes up and validating separately is definitely my preferred approach. I think some of these changes may not be desired.

@facelessuser
Owner

It seems our :in-range and :out-of-range tests were wrong. The tests break because your change fixes the issue. It's likely I implemented this before browsers supported these inputs, as right now I can only get Chrome to show results.
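
As a rough illustration of the kind of week edge case involved, the sketch below validates a week number against the number of ISO 8601 weeks in a given year. It assumes ISO week numbering is what matters here; the function names are hypothetical and this is not necessarily how the fix was implemented.

```python
from datetime import date


def weeks_in_iso_year(year):
    """Return the number of ISO 8601 weeks (52 or 53) in the given year."""
    # December 28 always falls in the last ISO week of its own year,
    # whereas December 31 can already belong to week 1 of the next year.
    return date(year, 12, 28).isocalendar()[1]


def is_valid_week(year, week):
    """Return True if week is a valid ISO week number for year."""
    return 1 <= week <= weeks_in_iso_year(year)
```

For example, 2020 has 53 ISO weeks while 2021 and 2024 have 52, so `is_valid_week(2021, 53)` is false. A check based on December 31 can misjudge a year like 2024, where that date falls in week 1 of 2025.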

@facelessuser
Owner

I'm fairly certain the failure TestSyntaxErrorReporting.test_syntax_error_with_multiple_lines is a regression, but the week tests are actually an improvement.

@facelessuser
Owner

The infinite loop, I suspect, doesn't actually occur in the wild. I'm thinking, if it does, it should likely throw an exception.
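
A minimal sketch of that idea, using a toy tokenizer rather than soupsieve's parser: if no pattern consumes input at the current position, the loop raises instead of spinning. The token patterns and the `TokenizeError` name are hypothetical and exist only for this example.

```python
import re

# Hypothetical token patterns for illustration only; not soupsieve's grammar.
TOKEN_PATTERNS = (
    ("ident", re.compile(r"[A-Za-z_][\w-]*")),
    ("combinator", re.compile(r"\s*[>+~]\s*|\s+")),
)


class TokenizeError(ValueError):
    """Raised when no pattern can consume input at the current position."""


def tokenize(pattern):
    """Split pattern into (name, text) tokens, raising if no progress can be made."""
    index = 0
    tokens = []
    while index < len(pattern):
        for name, regex in TOKEN_PATTERNS:
            m = regex.match(pattern, index)
            # Require a non-empty match so the index always advances.
            if m and m.end() > index:
                tokens.append((name, m.group(0)))
                index = m.end()
                break
        else:
            # Nothing matched: fail loudly rather than loop forever.
            raise TokenizeError(
                f"Invalid character {pattern[index]!r} at position {index}"
            )
    return tokens
```

With these patterns, `tokenize("div > p")` yields three tokens, and `tokenize("div$")` raises `TokenizeError` at position 3 instead of looping.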

@facelessuser
Owner

facelessuser left a comment

Since I started digging into this, I think we can address everything in this PR without splitting. I hadn't planned to do a deep dive on this, but I did.

# Some patterns require additional logic, such as default. We try to make these the
# last pattern, and append the appropriate flag to that selector which communicates
# to the matcher what additional logic is required.
# Preserve any flags that were set during parsing (e.g. :empty, :root)
Owner

I would need to see evidence via new test(s), showing why this is needed.

    col = index - last + 1
elif last <= index < m.end(0):
    indent = '--> '
    offset = (-1 if index > m.start(0) else 0) + 3
Owner

I'm not convinced this fixes anything, but tests show it regresses one of our tests. I'm open to making a change if a real case can be demonstrated, but it seems like the logic would have to be altered to keep the existing tests working and solve the new test case, if one was shown to need fixing.
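
One way to demonstrate a real case would be to compare the reported position against an independent reference. The sketch below is such a reference under simplifying assumptions: it treats only "\n" as a line break and uses a hypothetical `line_and_column` helper, which is not soupsieve code. Its column value agrees with `col = index - last + 1` when `last` is the index of the first character of the line.

```python
def line_and_column(pattern, index):
    """Return the 1-based (line, column) of a 0-based index into pattern.

    Assumes lines are separated by "\n" only.
    """
    if not 0 <= index <= len(pattern):
        raise ValueError(f"index {index} is outside the pattern")
    line = pattern.count("\n", 0, index) + 1
    # rfind returns -1 when there is no earlier newline, which makes
    # the column on the first line come out as index + 1.
    last_newline = pattern.rfind("\n", 0, index)
    column = index - last_newline
    return line, column
```

A failing test built on this would pick a multi-line selector with a known bad character and assert that the reported line and column match the reference.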

@facelessuser
Owner

I've actually picked out the verified fixes from this PR in #288. I've added appropriate tests and such and will merge them separately.

That leaves us just with two things:

  1. Proving there is an issue that requires the flag fix.
  2. Proving there is an issue that requires an indexing fix, and if so, providing a proper fix that doesn't break current tests.

@mundanevision20
Author

mundanevision20 commented Jan 16, 2026

Thanks @facelessuser for your valuable feedback on this PR. I'm very happy to see that you already merged some of the suggested changes. I'm very grateful for your time spent on this project. I'll need a few days for a deep dive to answer your questions.

Thanks again for your support!

@facelessuser
Owner

@mundanevision20 I'm going to close this PR as it is now out of date. If your investigations into the other two issues end up turning up real-world problems, feel free to open up a new PR or issue so we can evaluate a path forward. I will probably go ahead and publish a release just to ensure the fix for the infinite loop problem gets out there, as it is the more serious issue.

3 participants