Commit 690b394

Java: Add initial documentation for MaD using data extensions for Java.
1 parent c395779 commit 690b394

2 files changed: +101 -0 lines changed

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
1+
.. _customizing-library-models-for-java:
2+
3+
Customizing Library Models for Java
4+
===================================
5+
6+
.. include:: ../reusables/beta-note-customizing-library-models.rst
7+
8+
The Java analysis can be customized by adding library models (summaries, sinks, and sources) in data extension files.
9+
10+
A data extension file for Java is a YAML file of the form:
11+
12+
.. code-block:: yaml
13+
14+
   extensions:
15+
     - addsTo:
16+
         pack: codeql/java-all
17+
         extensible: <name of extension point>
18+
       data:
19+
         - <tuple1>
20+
         - <tuple2>
21+
         - ...
22+
23+
The data extension can contribute to the following extension points:
24+
25+
- **sourceModel**\(package, type, subtypes, name, signature, ext, output, kind, provenance)
26+
- **sinkModel**\(package, type, subtypes, name, signature, ext, input, kind, provenance)
27+
- **summaryModel**\(package, type, subtypes, name, signature, ext, input, output, kind, provenance)
28+
- **neutralModel**\(package, type, name, signature, provenance)
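As a rough sketch of how these signatures map onto YAML, each tuple is written as one sequence with a value per column, in the order listed above. The angle-bracket placeholders below stand in for real values, in the same spirit as ``<tuple1>`` in the template, so the fragment is a template rather than a working model. Placing two ``addsTo`` entries in one file is an assumption that follows from ``extensions`` being a list.

```yaml
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      # Columns: package, type, subtypes, name, signature, ext, input, kind, provenance
      - ["<package>", "<type>", <subtypes>, "<name>", "<signature>", "<ext>", "<input>", "<kind>", "<provenance>"]
  - addsTo:
      pack: codeql/java-all
      extensible: neutralModel
    data:
      # Columns: package, type, name, signature, provenance
      - ["<package>", "<type>", "<name>", "<signature>", "<provenance>"]
```

Note that **neutralModel** rows have five columns while **sinkModel** rows have nine, so the number of values in a tuple must match the extension point it is added to.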
29+
30+
TODO: Link or inline documentation on how to add data extensions.
31+
Are we going for extension packs as the recommended default?
32+
If yes, then we probably need to elaborate with a concrete example.
33+
34+
In the sections below, we will go through the different extension points using concrete examples.
35+
36+
Example: Taint sink in the **java.sql** package.
37+
------------------------------------------------
38+
39+
In this example we will see how to define the argument passed to the **execute** method as a SQL injection sink.
40+
This is the **execute** method in the **Statement** class, which is located in the **java.sql** package.
41+
Note that this sink is already included in the CodeQL Java analysis.
42+
43+
.. code-block:: java
44+
45+
   public static void tainted(Connection conn, String query) throws SQLException {
46+
       Statement stmt = conn.createStatement();
47+
       stmt.execute(query); // The argument passed to execute is the SQL injection sink.
48+
   }
49+
50+
The sink can be defined by adding the following data extension.
51+
52+
.. code-block:: yaml
53+
54+
   extensions:
55+
     - addsTo:
56+
         pack: codeql/java-all
57+
         extensible: sinkModel
58+
       data:
59+
         - ["java.sql", "Statement", True, "execute", "(String)", "", "Argument[0]", "sql", "manual"]
60+
61+
Reasoning:
62+
63+
Since we are adding a new sink, we need to add a tuple to the **sinkModel** extension point.
64+
The first five values identify the callable (in this case, a method) on which the sink is defined.
65+
66+
- The first value **java.sql** is the package name.
67+
- The second value **Statement** is the class (type) name.
68+
- The third value **True** is a flag indicating whether the sink also applies to all overrides of the method.
69+
- The fourth value **execute** is the method name.
70+
- The fifth value **(String)** is the method input type signature.
71+
72+
For most practical purposes the sixth value is not relevant.
73+
The remaining values are used to define the **access path**, the **kind**, and the **provenance** (origin) of the sink.
74+
75+
- The seventh value **Argument[0]** is the access path to the first argument passed to the method, which means that this is the location of the sink.
76+
- The eighth value **sql** is the kind of the sink. The sink kind determines which queries the sink is in scope for.
77+
- The ninth value **manual** is the provenance of the sink, which is used to identify the origin of the sink.
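To tie the nine values back to the tuple, here is the same row written with one value per line and a comment for each column. This is only a reading aid: it should be equivalent to the single-line form, assuming the YAML parser in use accepts multi-line flow sequences with comments, which standard YAML permits.

```yaml
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      - [
          "java.sql",     # 1. package name
          "Statement",    # 2. class (type) name
          True,           # 3. sink also applies to overrides of the method
          "execute",      # 4. method name
          "(String)",     # 5. method input type signature
          "",             # 6. not relevant for most practical purposes
          "Argument[0]",  # 7. access path: first argument passed to the method
          "sql",          # 8. kind of the sink
          "manual"        # 9. provenance (origin) of the sink
        ]
```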
78+
79+
Example: Taint source from the '<TODO>' package.
80+
------------------------------------------------
81+
82+
83+
Example: Adding flow through '<TODO>' methods.
84+
----------------------------------------------
85+
86+
Example: Adding **neutral** methods.
87+
------------------------------------
88+
This is purely for consistency and has no impact on the analysis.
89+
90+
Reference material
91+
------------------
92+
93+
The following sections provide reference material for extension points.
94+
This includes descriptions of each of the arguments (e.g. access paths, types, and kinds).
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
.. pull-quote::
2+
3+
   Beta Notice - Unstable API
4+
5+
   Library customization using data extensions is currently in beta and subject to change.
6+
7+
   Breaking changes to this format may occur while in beta.
