Commit f6ef558

Java: Add source example.
1 parent 1fd2844 commit f6ef558

1 file changed, +45 -5 lines changed
docs/codeql/codeql-language-guides/customizing-library-models-for-java.rst

Lines changed: 45 additions & 5 deletions
@@ -32,6 +32,7 @@ Are we going for extensions packs as the recommended default?
 If yes, then we probably need to elaborate with a concrete example.
 
 In the sections below, we will go through the different extension points using concrete examples.
+The **Reference material** section will in more detail describe the *mini DSLs* that are used to comprise a model definition.
 
 Example: Taint sink in the **java.sql** package.
 ------------------------------------------------
@@ -42,12 +43,12 @@ Please note that this sink is already added to the CodeQL Java analysis.
 
 .. code-block:: java
 
-   public static void tainted(Connection conn, String query) throws SQLException {
+   public static void taintsink(Connection conn, String query) throws SQLException {
        Statement stmt = conn.createStatement();
        stmt.execute(query);
    }
 
-This can be achieved by adding the following data extensions.
+This can be achieved by adding the following data extension.
 
 .. code-block:: yaml
 
@@ -69,20 +70,59 @@ The first five values are used to identify the method (callable) which we are de
 - The fourth value **execute** is the method name.
 - The fifth value **(String)** is the method input type signature.
 
-For most practical purposes the six value is not relevant.
+For most practical purposes the sixth value is not relevant.
 The remaining values are used to define the **access path**, the **kind**, and the **provenance** (origin) of the sink.
 
 - The seventh value **Argument[0]** is the access path to the first argument passed to the method, which means that this is the location of the sink.
 - The eighth value **sql** is the kind of the sink. The sink kind is used to define for which queries the sink is in scope.
 - The ninth value **manual** is the provenance of the sink, which is used to identify the origin of the sink.
 
-Example: Taint source from the '<TODO>' package.
-------------------------------------------------
+Example: Taint source from the **java.net** package.
+----------------------------------------------------
+In this example we will see, how to define the return value from the **getInputStream** method as a remote source.
+This is the **getInputStream** method in the **Socket** class, which is located in the 'java.net' package.
+Please note that this source is already added to the CodeQL Java analysis.
+
+.. code-block:: java
+
+   public static InputStream tainted(Socket socket) throws IOException {
+       InputStream stream = socket.getInputStream();
+       return stream;
+   }
+
+This can be achieved by adding the following data extension.
+
+.. code-block:: yaml
 
+   extensions:
+     - addsTo:
+         pack: codeql/java-all
+         extensible: sourceModel
+       data:
+         - ["java.net", "Socket", False, "getInputStream", "()", "", "ReturnValue", "remote", "manual"]
+
+Reasoning:
+
+Since we are adding a new source, we need to add a tuple to the **sourceModel** extension point.
+The first five values are used to identify the method (callable) which we are defining a source on.
+
+- The first value **java.net** is the package name.
+- The second value **Socket** is the class (type) name.
+- The third value **False** is flag indicating, whether the source also applies to all overrides of the method.
+- The fourth value **getInputStream** is the method name.
+- The fifth value **()** is the method input type signature.
+
+For most practical purposes the sixth value is not relevant.
+The remaining values are used to define the **access path**, the **kind**, and the **provenance** (origin) of the source.
+
+- The seventh value **ReturnValue** is the access path to the return of the method, which means that it is the return value that should be considered a tainted source.
+- The eighth value **remote** is the kind of the source. The source kind is used to define for which queries the source is in scope. **remote** applies to many of security related queries as it means a remote source of untrusted data. As an example the SQL injection query uses **remote** sources.
+- The ninth value **manual** is the provenance of the source, which is used to identify the origin of the source.
 
 Example: Adding flow through '<TODO>' methods.
 ----------------------------------------------
 
+
 Example: Adding **neutral** methods.
 ------------------------------------
 This is purely for consistency and has no impact on the analysis.
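
The source and sink examples in this diff can be read together. The following is a minimal Java sketch, not part of the commit, of how data from the modeled **remote** source (the return value of ``Socket.getInputStream``) might reach the modeled **sql** sink (the first argument of ``Statement.execute``). The class name ``SourceToSinkExample`` and the helper ``readFirstLine`` are hypothetical, and the sketch assumes the analysis tracks taint through the standard reader wrappers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; none of these names come from the commit.
public class SourceToSinkExample {

    // Hypothetical helper: returns the first line of the stream, or "" if the stream is empty.
    // The reader is deliberately not closed so the caller keeps control of the stream.
    public static String readFirstLine(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line = reader.readLine();
        return line == null ? "" : line;
    }

    public static void handle(Socket socket, Connection conn)
            throws IOException, SQLException {
        // Source: matches the sourceModel tuple (java.net, Socket, getInputStream, ReturnValue, remote).
        InputStream stream = socket.getInputStream();

        // Assumption: taint is propagated from the stream to the string read from it.
        String query = readFirstLine(stream);

        try (Statement stmt = conn.createStatement()) {
            // Sink: matches the sink tuple described above (execute, Argument[0], sql).
            stmt.execute(query);
        }
    }
}
```

Under these assumptions, a query that uses **remote** sources and **sql** sinks, such as the SQL injection query mentioned in the diff, would be expected to report the call to ``stmt.execute(query)`` in ``handle``.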
