Commit 1bbf88f

author
Stephan Brandauer
committed
Java: basic version of automodel extraction queries
1 parent 59c43c7 commit 1bbf88f

1 file changed

+58
-0
lines changed

java/ql/automodel/src/README.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
# Automodel Java Extraction Queries

This pack contains the automodel extraction queries for Java.

## Extraction Queries in `java/ql/automodel/src`

This pack contains extraction queries for application mode and framework mode.

| Kind | Mode | Query File |
|------|------|------------|
| Candidates | Application Mode | [AutomodelApplicationModeExtractCandidates.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractCandidates.ql) |
| + Examples | Application Mode | [AutomodelApplicationModeExtractPositiveExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractPositiveExamples.ql) |
| - Examples | Application Mode | [AutomodelApplicationModeExtractNegativeExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractNegativeExamples.ql) |
| Candidates | Framework Mode | [AutomodelFrameworkModeExtractCandidates.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelFrameworkModeExtractCandidates.ql) |
| + Examples | Framework Mode | [AutomodelFrameworkModeExtractPositiveExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelFrameworkModeExtractPositiveExamples.ql) |
| - Examples | Framework Mode | [AutomodelFrameworkModeExtractNegativeExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelFrameworkModeExtractNegativeExamples.ql) |

## Running the Queries

The extraction queries are part of a separate query pack, `java-automodel-queries`. Use this pack to run them. The queries are tagged appropriately, so you can use the tags ([example](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractNegativeExamples.ql#L8)) to construct query suites.

For example, a query suite selecting all example extraction queries (positive and negative) for application mode looks like this:

```
# File: automodel-application-mode-extraction-examples.qls
# ---
# Query suite for extracting examples for automodel

- description: Automodel application mode examples extraction.
- queries: .
  from: codeql/java-automodel-queries
- include:
    tags contain all:
      - automodel
      - extract
      - application-mode
      - examples
```

## Important Software Design Concepts and Goals

### Concept: `Endpoint`

Endpoints are source code locations of interest. All positive and negative examples and all candidates are endpoints, but not all endpoints are examples or candidates. Each mode decides which endpoints are relevant. For instance, if the Java application mode wants to support candidates for sinks that are arguments passed to unknown method calls, then the Java application mode implementation needs to make sure that method arguments are endpoints. If you look at the `TApplicationModeEndpoint` implementation in [AutomodelApplicationModeCharacteristics.qll](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeCharacteristics.qll), you can see that this is the case: `TExplicitArgument` implements this behavior.

### Concept: `EndpointCharacteristics`

In the file [AutomodelSharedCharacteristics.qll](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelSharedCharacteristics.qll), you will find the definition of the QL class `EndpointCharacteristic`.

An endpoint characteristic is a QL class that "tags" all endpoints for which the characteristic's `appliesToEndpoint` predicate holds. The characteristic defines a `hasImplications` predicate that declares whether all of these endpoints should be considered sinks, sources, or negatives, and with what confidence.

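The tagging idea can be illustrated with a small toy model. This is a Python sketch, not the QL implementation: the `Endpoint` fields, the `(endpoint_type, is_positive, confidence)` tuple shape, the example characteristic, and the `is_candidate` threshold rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical shape of an implication: (endpoint type, is positive indicator, confidence).
Implication = Tuple[str, bool, float]


@dataclass(frozen=True)
class Endpoint:
    """Toy stand-in for a source code location of interest."""
    location: str
    callee_is_modeled: bool  # hypothetical property, for illustration only


@dataclass(frozen=True)
class EndpointCharacteristic:
    """Toy stand-in for the QL class: tags endpoints and carries one implication."""
    name: str
    applies_to_endpoint: Callable[[Endpoint], bool]
    implication: Implication


def classify(endpoint: Endpoint, characteristics: List[EndpointCharacteristic]) -> List[Implication]:
    """Collect the implications of every characteristic that applies to the endpoint."""
    return [c.implication for c in characteristics if c.applies_to_endpoint(endpoint)]


def is_candidate(endpoint: Endpoint, characteristics: List[EndpointCharacteristic],
                 threshold: float = 0.9) -> bool:
    """Assumed rule: an endpoint stays a candidate if nothing classifies it with high confidence."""
    return not any(conf >= threshold for (_, _, conf) in classify(endpoint, characteristics))


# A hypothetical characteristic: endpoints whose callee is already modeled are negatives.
already_modeled = EndpointCharacteristic(
    name="callee is already modeled",
    applies_to_endpoint=lambda e: e.callee_is_modeled,
    implication=("negative", True, 1.0),
)

known = Endpoint("Foo.java:10", callee_is_modeled=True)
unknown = Endpoint("Bar.java:3", callee_is_modeled=False)
```

With these definitions, `classify(known, [already_modeled])` yields the single negative implication, while `unknown` matches no characteristic and therefore remains a candidate under the assumed rule.
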
#### :warning: Warning

Do not try to "fix" shortcomings that could be fixed by a better prompt or better example selection by adding language- or mode-specific characteristics. Those "fixes" tend to cause confusion downstream, because questions like "why wasn't this location selected as a candidate?" become harder and harder to answer. It's best to rely on characteristics in the code that is shared across all languages and modes (see [Shared Code](#shared-code)).

## Shared Code

A significant part of the behavior of the extraction queries is implemented in shared modules. When we add support for new languages, we expect to move the shared code to a separate QL pack. In the meantime, shared code modules must not import any Java libraries.
