| 1 | +# Automodel Java Extraction Queries |
| 2 | + |
| 3 | +This pack contains the automodel extraction queries for Java. |
| 4 | + |
| 5 | +## Extraction Queries in `java/ql/automodel/src` |
| 6 | + |
| 7 | +This pack contains extraction queries for application mode and framework mode. |
| 8 | + |
| 9 | +| Kind | Mode | Query File | |
| 10 | +|------|------|------------| |
| 11 | +| Candidates | Application Mode | [AutomodelApplicationModeExtractCandidates.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractCandidates.ql) | |
| 12 | +| + Examples | Application Mode | [AutomodelApplicationModeExtractPositiveExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractPositiveExamples.ql) | |
| 13 | +| - Examples | Application Mode | [AutomodelApplicationModeExtractNegativeExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractNegativeExamples.ql) | |
| 14 | +| Candidates | Framework Mode | [AutomodelFrameworkModeExtractCandidates.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelFrameworkModeExtractCandidates.ql) | |
| 15 | +| + Examples | Framework Mode | [AutomodelFrameworkModeExtractPositiveExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelFrameworkModeExtractPositiveExamples.ql) | |
| 16 | +| - Examples | Framework Mode | [AutomodelFrameworkModeExtractNegativeExamples.ql](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelFrameworkModeExtractNegativeExamples.ql) | |
| 17 | + |
| 18 | +## Running the Queries |
| 19 | + |
| 20 | +The extraction queries are part of a separate query pack, `java-automodel-queries`. Use this pack to run them. The queries are tagged, so you can use the tags (for an example, see https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeExtractNegativeExamples.ql#L8) to construct query suites. |
| 21 | + |
| 22 | +For example, a query suite selecting all example extraction queries (positive and negative) for application mode looks like this: |
| 23 | + |
| 24 | +``` |
| 25 | +# File: automodel-application-mode-extraction-examples.qls |
| 26 | +# --- |
| 27 | +# Query suite for extracting examples for automodel |
| 28 | + |
| 29 | +- description: Automodel application mode examples extraction. |
| 30 | +- queries: . |
| 31 | +  from: codeql/java-automodel-queries |
| 32 | +- include: |
| 33 | +    tags contain all: |
| 34 | +      - automodel |
| 35 | +      - extract |
| 36 | +      - application-mode |
| 37 | +      - examples |
| 38 | +``` |
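
The same pattern should carry over to the framework mode queries. The following is a minimal sketch rather than a file from the pack: the file name is hypothetical, and the `framework-mode` tag is assumed by analogy with `application-mode`, so check the `@tags` line of the framework mode queries before relying on it.

```yaml
# File: automodel-framework-mode-extraction-examples.qls (hypothetical file name)
# ---
# Query suite for extracting framework mode examples for automodel

- description: Automodel framework mode examples extraction.
- queries: .
  from: codeql/java-automodel-queries
- include:
    tags contain all:
      - automodel
      - extract
      - framework-mode   # assumed tag, by analogy with application-mode
      - examples
```

Because `tags contain all` requires every listed tag to be present, this suite would select both the positive and the negative example queries of one mode while leaving out that mode's candidate query.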
| 39 | + |
| 40 | +## Important Software Design Concepts and Goals |
| 41 | + |
| 42 | +### Concept: `Endpoint` |
| 43 | + |
| 44 | +Endpoints are source code locations of interest. All +/- examples and all candidates are endpoints, but not all endpoints are examples or candidates. Each mode decides which endpoints are relevant. For instance, if the Java application mode wants to support candidates for sinks that are arguments passed to unknown method calls, then the Java application mode implementation needs to make sure that method arguments are endpoints. If you look at the `TApplicationModeEndpoint` implementation in [AutomodelApplicationModeCharacteristics.qll](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelApplicationModeCharacteristics.qll), you can see that this is the case: the `TExplicitArgument` branch implements this behavior. |
| 45 | + |
| 46 | +### Concept: `EndpointCharacteristics` |
| 47 | + |
| 48 | +In the file [AutomodelSharedCharacteristics.qll](https://github.com/github/codeql/blob/main/java/ql/automodel/src/AutomodelSharedCharacteristics.qll), you will find the definition of the QL class `EndpointCharacteristic`. |
| 49 | + |
| 50 | +An endpoint characteristic is a QL class that "tags" all endpoints for which the characteristic's `appliesToEndpoint` predicate holds. The characteristic defines a `hasImplications` predicate that declares whether the tagged endpoints should be considered sinks, sources, or negatives, and with what confidence. |
| 51 | + |
| 52 | +#### :warning: Warning |
| 53 | + |
| 54 | +Do not try to "fix" shortcomings that could be fixed by a better prompt or better example selection by adding language- or mode-specific characteristics. Those "fixes" tend to be confusing downstream, when questions like "why wasn't this location selected as a candidate?" become harder and harder to answer. It's best to rely on characteristics in the code that is shared across all languages and modes (see [Shared Code](#shared-code)). |
| 55 | + |
| 56 | +## Shared Code |
| 57 | + |
| 58 | +A significant part of the behavior of extraction queries is implemented in shared modules. When we add support for new languages, we expect to move the shared code to a separate QL pack. In the meantime, shared code modules must not import any Java libraries. |