docs/codeql/codeql-for-visual-studio-code/using-the-codeql-model-editor.rst (9 additions, 9 deletions)
@@ -12,15 +12,15 @@ You can view, write, and edit all types of CodeQL packs in Visual Studio Code us
 About the CodeQL model editor
 -----------------------------

-The CodeQL model editor guides you through modeling the calls to external dependencies in your application or fully modeling all the public entry and exit points in an external dependency
+The CodeQL model editor guides you through modeling the calls to external dependencies in your application or fully modeling all the public entry and exit points in an external dependency.

 When you open the model editor, it analyzes the currently selected CodeQL database and identifies where the application uses external APIs and all public methods. An external (or third party) API is any API that is not part of the CodeQL database you have selected.

 The model editor has two different modes:

-- Application mode (default view): The editor lists each external framework used by the seelcted CodeQL database. When you expand a framework, a list of all calls to and from the external API is shown with the options available to model dataflow through each call. This mode is most useful for improving the CodeQL results for the specific codebase.
+- Application mode (default view): The editor lists each external framework used by the selected CodeQL database. When you expand a framework, a list of all calls to and from the external API is shown with the options available to model dataflow through each call. This mode is most useful for improving the CodeQL results for the specific codebase.

-- Dependency mode: The editor identifies the all publicly accessible APIs in the selected CodeQL database. This view guides you through modeling each public API that the codebase makes available. When you have finished modeling the entire API, you can save the model and use it to improve the CodeQL analysis for all codebases that use the dependency.
+- Dependency mode: The editor identifies all of the publicly accessible APIs in the selected CodeQL database. This view guides you through modeling each public API that the codebase makes available. When you have finished modeling the entire API, you can save the model and use it to improve the CodeQL analysis for all codebases that use the dependency.

 Displaying the CodeQL model editor
 ----------------------------------
@@ -38,7 +38,7 @@ Modeling the calls your codebase makes to external APIs
 You typically use this method when you are looking at a specific codebase where you want to improve the precision of CodeQL results. This is usually when the codebase uses frameworks or libraries that are not supported by CodeQL but they are not used by other teams in your organization.

 #. Select the CodeQL database that you want to improve CodeQL coverage for.
-#. Display the CodeQL model editor, by default the editor runs in application mode, so the list of external APIs used by the selected codebase is shown.
+#. Display the CodeQL model editor. By default the editor runs in application mode, so the list of external APIs used by the selected codebase is shown.
@@ -58,10 +58,10 @@ You typically use this method when you are looking at a specific codebase where
 - **Sink**: choose the **Input** element to model.
 - **Flow summary**: choose the **Input** and **Output** elements to model.

-#. Define the **Kind** of data flow for the model.
+#. Define the **Kind** of dataflow for the model.
 #. When you have finished modeling, click **Save all** or **Save** (shown at the bottom right of each expanded list of calls). The percentage of calls modeled in the editor is updated.

-The models are stored in your workspace at ``.github/codeql/extensions/<codeql-model-pack>``, where ``<codeql-model-packe>`` is the name of the CodeQL database that you selected. That is, the name of the repository, hyphen, the language analyzed by CodeQL.
+The models are stored in your workspace at ``.github/codeql/extensions/<codeql-model-pack>``, where ``<codeql-model-pack>`` is the name of the CodeQL database that you selected. That is, the name of the repository, hyphen, the language analyzed by CodeQL.

 The models are stored in a series of YAML data extension files, one for each external API. For example:

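The naming rule in this hunk (repository name, hyphen, language) can be sketched as a small helper. This is only an illustration of the path layout described in the text: the function name ``model_pack_dir`` and the repository name ``my-app`` are hypothetical, not part of CodeQL or the extension.

```python
from pathlib import PurePosixPath

# Workspace-relative root taken from the documentation text above.
EXTENSIONS_ROOT = PurePosixPath(".github/codeql/extensions")


def model_pack_dir(repository: str, language: str) -> PurePosixPath:
    """Return the directory for a model pack named <repository>-<language>.

    Hypothetical helper: it only mirrors the naming rule stated in the docs.
    """
    return EXTENSIONS_ROOT / f"{repository}-{language}"


# "my-app" is a made-up repository name used for illustration.
example_dir = model_pack_dir("my-app", "java")
print(example_dir)
```

For a repository called ``my-app`` analyzed as Java, the helper yields a directory ending in ``my-app-java`` under the extensions root.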
@@ -76,7 +76,7 @@ Modeling the public API of a codebase
 You typically use this method when you want to model a framework or library that your organization uses in more than one codebase. Once you have finished creating and testing the model, you can publish the CodeQL model pack to the GitHub Container Registry for your whole organization to use.

 #. Select the CodeQL database that you want to model.
-#. Display the CodeQL model editor, by default the editor runs in application mode. Click **Model as dependency** to display dependency mode. The screen changes to show the public API of the framework or library.
+#. Display the CodeQL model editor. By default the editor runs in application mode. Click **Model as dependency** to display dependency mode. The screen changes to show the public API of the framework or library.
@@ -96,10 +96,10 @@ You typically use this method when you want to model a framework or library that
 - **Sink**: choose the **Input** element to model.
 - **Flow summary**: choose the **Input** and **Output** elements to model.

-#. Define the **Kind** of data flow for the model.
+#. Define the **Kind** of dataflow for the model.
 #. When you have finished modeling, click **Save all** or **Save** (shown at the bottom right of each expanded list of calls). The percentage of calls modeled in the editor is updated.

-The models are stored in your workspace at ``.github/codeql/extensions/<codeql-model-pack>``, where ``<codeql-model-packe>`` is the name of the CodeQL database that you selected. That is, the name of the repository, hyphen, the language analyzed by CodeQL.
+The models are stored in your workspace at ``.github/codeql/extensions/<codeql-model-pack>``, where ``<codeql-model-pack>`` is the name of the CodeQL database that you selected. That is, the name of the repository, hyphen, the language analyzed by CodeQL.

 The models are stored in a series of YAML data extension files, one for each public method. For example:
docs/codeql/codeql-language-guides/data-extensions-to-model-java-dependencies.rst (3 additions, 3 deletions)
@@ -21,9 +21,9 @@ For more information, see ":ref:`Using the CodeQL model editor <using-the-codeql
 About data extensions
 ---------------------

-You can customize analysis by defining models (summaries, sinks, and sources) of your code's dependencies in data extension files. Each model defines the behavior of one or more elements of your library or framework, such as a methods and callables. When you run data flow analysis, these models expand the potential sources and sinks tracked by data flow analysis and improve the precision of results.
+You can customize analysis by defining models (summaries, sinks, and sources) of your code's dependencies in data extension files. Each model defines the behavior of one or more elements of your library or framework, such as a methods and callables. When you run dataflow analysis, these models expand the potential sources and sinks tracked by data flow analysis and improve the precision of results.

-Most of the security queries search for paths from a source of untrusted input to a sink that represents a vulnerability, this is known as taint tracking. Each source is a starting point for data flow analysis to track tainted data and each sink is an end point.
+Most of the security queries search for paths from a source of untrusted input to a sink that represents a vulnerability. This is known as taint tracking. Each source is a starting point for dataflow analysis to track tainted data and each sink is an end point.

 Taint tracking queries also need to know how data can flow through elements that are not included in the source code. These are modeled as summaries. A summary model enables queries to synthesize the flow behavior through elements in dependency code that is not stored in your repository.

@@ -63,7 +63,7 @@ The CodeQL library for Java and Kotlin analysis exposes the following extensible

 - ``sourceModel(package, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model sources of potentially tainted data.
 - ``sinkModel(package, type, subtypes, name, signature, ext, input, kind, provenance)``. This is used to model sinks where tainted data maybe used in a way that makes the code vulnerable.
-- ``summaryModel(package, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to summarize how data values from a source flow outside the repository in a dependency of the main code base.
+- ``summaryModel(package, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to summarize how data values from a source flow outside the repository in a dependency of the main codebase.
 - ``neutralModel(package, type, name, signature, kind, provenance)``. This is similar to a summary model but used to model the flow of values that have only a minor impact on the data flow analysis.

 The extensible predicates are populated using data extensions specified in YAML files. For more information about extensible predicates, see ":doc:`extensible-predicates`."
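Because each data extension row is a positional tuple, a quick arity check against the predicate signatures listed in this hunk can catch misplaced columns. The sketch below is illustrative only: the column names are copied from the signatures above, while ``row_to_fields`` and the example row are hypothetical and not taken from the CodeQL libraries.

```python
# Column names copied from the predicate signatures shown in the hunk above.
PREDICATE_COLUMNS = {
    "sourceModel": ["package", "type", "subtypes", "name", "signature",
                    "ext", "output", "kind", "provenance"],
    "sinkModel": ["package", "type", "subtypes", "name", "signature",
                  "ext", "input", "kind", "provenance"],
    "summaryModel": ["package", "type", "subtypes", "name", "signature",
                     "ext", "input", "output", "kind", "provenance"],
    "neutralModel": ["package", "type", "name", "signature",
                     "kind", "provenance"],
}


def row_to_fields(predicate: str, row: list) -> dict:
    """Pair a positional row with the column names of its predicate.

    Hypothetical helper: raises ValueError when the row length does not
    match the number of columns in the predicate signature.
    """
    columns = PREDICATE_COLUMNS[predicate]
    if len(row) != len(columns):
        raise ValueError(
            f"{predicate} expects {len(columns)} columns, got {len(row)}"
        )
    return dict(zip(columns, row))


# Illustrative row only; the values are assumptions, not a shipped model.
example_sink_row = ["java.sql", "Statement", True, "executeQuery",
                    "(String)", "", "Argument[0]", "sql-injection", "manual"]
sink_fields = row_to_fields("sinkModel", example_sink_row)
print(sink_fields["kind"], sink_fields["input"])
```

Naming the columns this way makes it easier to see that ``sinkModel`` has an ``input`` column where ``sourceModel`` has ``output``, and that ``summaryModel`` carries both.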
-You can use data extensions to model the methods and callables that control data flow in any framework or library. This is especially useful for custom frameworks or niche libraries, that are not supported by the standard CodeQL libraries.
+You can use data extensions to model the methods and callables that control dataflow in any framework or library. This is especially useful for custom frameworks or niche libraries, that are not supported by the standard CodeQL libraries.
@@ -20,7 +20,7 @@ Sources, sinks, summaries, and neutrals are commonly known as models. These mode
 About extensible predicates
 ---------------------------

-At a high level, there are two main components to using data extensions. The query writer defines one or more extensible predicates in their query libraries. CLI and code scanning users who want to augment these predicates supply one or more extension files whose data gets injected into the extensible predicate during evaluation. The extension files are either stored directly in the repository where the code base to be analyzed is hosted, or downloaded as CodeQL model packs.
+At a high level, there are two main components to using data extensions. The query writer defines one or more extensible predicates in their query libraries. CLI and code scanning users who want to augment these predicates supply one or more extension files whose data gets injected into the extensible predicate during evaluation. The extension files are either stored directly in the repository where the codebase to be analyzed is hosted, or downloaded as CodeQL model packs.

 This example of an extensible predicate for a source is taken from the core Java libraries https://github.com/github/codeql/blob/main/java/ql/lib/semmle/code/java/dataflow/ExternalFlowExtensions.qll#L8-L11

@@ -99,7 +99,7 @@ The following sink kinds are supported:
 - ``request-forgery``: A sink that controls the URL of a request, such as in an ``HttpRequest.newBuilder`` call.
 - ``response-splitting``: A sink that can be used for HTTP response splitting, such as in calls to ``HttpServletResponse.setHeader``.
 - ``sql-injection``: A sink that can be used for SQL injection, such as in a ``Statement.executeQuery`` call.
-- ``template-injection``: A sink that can be used for serverside template injection, such as in a ``Velocity.evaluate`` call.
+- ``template-injection``: A sink that can be used for server-side template injection, such as in a ``Velocity.evaluate`` call.
 - ``trust-boundary-violation``: A sink that can be used to cross a trust boundary, such as in a ``HttpSession.setAttribute`` call.
 - ``url-redirection``: A sink that can be used to redirect the user to a malicious URL, such as in a ``Response.temporaryRedirect`` call.
 - ``xpath-injection``: A sink that can be used for XPath injection, such as in a ``XPath.evaluate`` call.
 This extensible predicate is not typically needed externally, but is included here for completeness.
-It has limited impact on data flow analysis.
-Manual neutrals are considered highconfidence dispatch call targets and can reduce the number of dispatch call targets during data flow analysis (a performance optimization).
+It has limited impact on dataflow analysis.
+Manual neutrals are considered high-confidence dispatch call targets and can reduce the number of dispatch call targets during data flow analysis (a performance optimization).

 - ``kind``: Kind of the neutral. For neutrals the kind can be ``summary``, ``source``, or ``sink`` to indicate that the callable is neutral with respect to flow (no summary), source (is not a source) or sink (is not a sink).

@@ -175,7 +175,7 @@ and verification is one of:
 - ``generated``: The model was generated, but not verified by a human.

 The provenance is used to distinguish between models that are manually added (or verified) to the extensible predicate and models that are automatically generated.
-Furthermore, it impacts the data flow analysis in the following way:
+Furthermore, it impacts the dataflow analysis in the following way:

 - A ``manual`` model takes precedence over ``generated`` models. If a ``manual`` model exists for an element then all ``generated`` models are ignored.
 - A ``generated`` model is ignored during analysis, if the source code of the element it is modeling is available.
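The two precedence rules in this hunk can be read as a simple filter. The sketch below is a simplified illustration, not how the CodeQL engine implements it: it assumes each model is a dictionary with hypothetical ``element`` and ``verification`` keys, and that the caller knows which elements have source code available.

```python
def select_models(models: list, elements_with_source: set) -> list:
    """Apply the two provenance rules from the text to a list of models.

    Hypothetical data shape: each model is a dict with an "element" name
    and a "verification" value of "manual" or "generated".
    """
    # Elements that have at least one manual model.
    manual_elements = {
        m["element"] for m in models if m["verification"] == "manual"
    }
    selected = []
    for model in models:
        if model["verification"] == "generated":
            # Rule 1: a manual model for the element hides generated ones.
            if model["element"] in manual_elements:
                continue
            # Rule 2: generated models are ignored when source is available.
            if model["element"] in elements_with_source:
                continue
        selected.append(model)
    return selected


# Made-up element names, used only to exercise both rules.
example_models = [
    {"element": "Lib.parse", "verification": "manual"},
    {"element": "Lib.parse", "verification": "generated"},
    {"element": "Lib.render", "verification": "generated"},
    {"element": "App.helper", "verification": "generated"},
]
kept = select_models(example_models, elements_with_source={"App.helper"})
print([(m["element"], m["verification"]) for m in kept])
```

Here the generated model for ``Lib.parse`` is dropped because a manual one exists, and the generated model for ``App.helper`` is dropped because its source is treated as available.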