wsimonw commented Jan 21, 2026

This PR is based on the java-cpg module proposed in PR#1659.

The goal of this PR is to reduce false-positive token matches by using characteristic vectors. A characteristic vector describes the structural elements of the AST subtree below a node that is then transformed into a token. In addition, variable flow analysis is used to construct a semantic vector, so that variable dependencies can be considered during plagiarism detection.
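
To illustrate the idea, here is a minimal sketch of how a characteristic vector could filter token matches. It is not the code from this PR: the class and method names are hypothetical, and both the choice of counting node kinds per subtree and the use of cosine similarity with a threshold are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are taken from JPlag or this PR. */
final class CharacteristicVectorSketch {

    /** Minimal stand-in for an AST node: a kind label plus child nodes. */
    static final class Node {
        final String kind;
        final List<Node> children;

        Node(String kind, List<Node> children) {
            this.kind = kind;
            this.children = children;
        }
    }

    /** Counts how often each node kind occurs in the subtree rooted at root. */
    static Map<String, Integer> characteristicVector(Node root) {
        Map<String, Integer> counts = new HashMap<>();
        collect(root, counts);
        return counts;
    }

    private static void collect(Node node, Map<String, Integer> counts) {
        counts.merge(node.kind, 1, Integer::sum);
        for (Node child : node.children) {
            collect(child, counts);
        }
    }

    /** Cosine similarity of two count vectors; returns 0 if either vector is empty. */
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Tokens match only if their types are equal and their subtrees are similar enough. */
    static boolean tokensMatch(String typeA, Map<String, Integer> vecA,
                               String typeB, Map<String, Integer> vecB,
                               double threshold) {
        return typeA.equals(typeB) && cosineSimilarity(vecA, vecB) >= threshold;
    }
}
```

With such a filter, two loop tokens of the same type would no longer match if one loop body contains an assignment and a call while the other contains only a return statement, because their vectors differ too much.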

robinmaisch and others added 30 commits December 29, 2023 11:55
Content:
 - A new CPG language frontend for JPlag
 - An interface to transform submissions into CPGs
 - An interface to transform CPGs into token lists
 - A Graph Transformation Engine (to be extended)
   . interfaces representing node and graph patterns, matches of these patterns, transformations
   . an isomorphism detector
   . a transformation algorithm
 - Some graph transformations (to be extended)
- implemented multi-root graph patterns
- implemented searching for "all matches at once"
- GraphOperations should leave EOG intact
- implemented new kinds of edges and properties for graph patterns
- tokenization works well
- implemented DFG sort pass
-- this requires specialized treatment for all kinds of language features. Surely the considered feature set is incomplete.
It was designed for local use.
Add many comments and put a file in the 'passes' package. When JavaDoc cannot find a Java file in there, it quits.
…dingly,

refactored NodeRegistry to only contain SemanticVectors
…ined as equivalent, added these models for java-cpg and created corresponding options
…d vector calculation, especially to mark e.g. i++ as an assignment, split semantic tokens and included variable dependencies with their own option
…shment between semantic analysis and characteristic vectors
… variable flow analysis. Added different comparison metrics

2 participants