[pysrc2cpg] Fix **kwargs handling, add match pattern AST nodes, and test coverage #5910

Open
allsmog wants to merge 1 commit into joernio:master from allsmog:py/fix-kwargs-and-match-patterns

allsmog commented Mar 30, 2026

Fixes Python structural/AST issues and adds test coverage.

  • Fix **kwargs unpacking so that the unpacked dict is preserved as a call argument in the CPG for taint tracking (it was previously dropped silently)
  • Convert match statement patterns to proper AST nodes inside the case body blocks, while keeping the JumpTarget strings for CfgCreator compatibility
  • Add walrus operator tests verifying expression semantics in conditions
  • Add comprehensive comprehension tests (list, set, dict, generator)
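
For orientation, the four comprehension forms named in the last bullet are distinct node types in CPython's own `ast` module. The sketch below only inspects Python's parser output; it makes no claim about how pysrc2cpg lowers these nodes, and the sample expressions are illustrative assumptions rather than the PR's test inputs.

```python
import ast

# One illustrative sample per comprehension form (not taken from the PR's tests).
SAMPLES = {
    "list": "[x * 2 for x in xs]",
    "set": "{x for x in xs}",
    "dict": "{k: v for k, v in pairs}",
    "generator": "(x for x in xs)",
}


def comprehension_node_type(source: str) -> str:
    """Return the ast node class name of a single Python expression."""
    return type(ast.parse(source, mode="eval").body).__name__


# Map each form to the node type CPython's parser produces for it.
node_types = {kind: comprehension_node_type(src) for kind, src in SAMPLES.items()}

for kind, node_type in node_types.items():
    print(f"{kind}: {node_type}")
```

Each form parses to its own node class, which is one reason to test all four separately instead of treating list comprehensions as representative.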

Test plan

  • New kwargs tests in CallCpgTests
  • Updated MatchCpgTests for pattern AST nodes + new literal pattern test
  • New walrus operator tests in AssignCpgTests
  • New ComprehensionCpgTests
  • All existing pysrc2cpg tests pass

…est coverage

- Fix **kwargs unpacking to preserve dict argument for taint tracking
- Convert match statement patterns to proper AST nodes in case body blocks
- Add walrus operator tests verifying expression semantics
- Add comprehensive comprehension tests (list, set, dict, generator)
}
}

"walrus operator (named expression)" should {
Contributor commented:

Use "(assignment expression)" here rather than "(named expression)".
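
Both names are in circulation, which may explain the original wording: the Python language reference and PEP 572 call the construct an assignment expression, while CPython's `ast` module names the node `NamedExpr`. A minimal sketch using only the standard library (the sample condition is an assumption, not the PR's test input):

```python
import ast

# A walrus used inside a condition; the sample source is illustrative only.
SOURCE = "if (n := len(items)) > 3:\n    pass\n"

if_stmt = ast.parse(SOURCE).body[0]
comparison = if_stmt.test      # the `(n := len(items)) > 3` comparison
walrus = comparison.left       # the parenthesised `n := len(items)`

node_name = type(walrus).__name__
target_name = walrus.target.id
# The walrus is an expression node, not a statement node.
is_expression = isinstance(walrus, ast.expr)
is_statement = isinstance(walrus, ast.stmt)

print(node_name, target_name, is_expression, is_statement)
```

Because the node is an expression, it can sit in a condition, which is the property the walrus tests in this PR are described as verifying.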

}

"call with **kwargs unpacking" should {
lazy val cpg = code("""func(a, **my_dict)""".stripMargin, "test.py")
Contributor commented:

Remove lazy. The cpgs returned by code are always created lazily.
There is also no need to explicitly specify the filename of the test code.
There are multiple other instances of this. Fix those as well.


blockFirstCase.label shouldBe NodeTypes.BLOCK
blockFirstCase.code shouldBe "print(1)"
blockFirstCase.astChildren.code.l should contain allOf ("a", "b", "print(1)")
Contributor commented:

Having the case values a, b as individual statements in the first case block does not seem like a good representation. It does not encode the semantics, and I fail to see in which scenarios it helps. Please explain.

None
// We use a synthetic argument name to preserve the unpacked dict as an argument
// in the CPG so that data flow tracking can follow through it.
("**", convert(keyword.value))
Contributor commented:

Use a val instead of a string constant, and use <keyword_dict> instead of **.
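
For context on why a synthetic name is needed at all: in CPython's `ast` module, a `**` unpacking shows up as a `keyword` node whose `arg` is `None`, so there is no real argument name to reuse. The sketch below illustrates that, plus the naming idea from this comment. `KEYWORD_DICT_ARG_NAME` is a hypothetical constant written for this example in Python; it is not the Scala identifier used in pysrc2cpg.

```python
import ast

# Hypothetical constant mirroring the reviewer's suggested placeholder name.
KEYWORD_DICT_ARG_NAME = "<keyword_dict>"

call = ast.parse("func(a, **my_dict)", mode="eval").body

positional = [ast.unparse(arg) for arg in call.args]
# For `**my_dict`, keyword.arg is None and keyword.value is the dict expression.
keywords = [(kw.arg, ast.unparse(kw.value)) for kw in call.keywords]

# Substitute the placeholder wherever the parser reports no argument name.
named = [
    (name if name is not None else KEYWORD_DICT_ARG_NAME, value)
    for name, value in keywords
]

print(positional, keywords, named)
```

Keeping the placeholder in one named constant means a data-flow query can match on it without repeating a string literal.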
