Commit af6a088

C++: Update the doc text for C/C++.
1 parent e87593a commit af6a088

1 file changed (+33 -66 lines changed)

docs/codeql/codeql-language-guides/customizing-library-models-for-cpp.rst

Lines changed: 33 additions & 66 deletions
@@ -1,7 +1,7 @@
1-
.. _customizing-library-models-for-csharp:
1+
.. _customizing-library-models-for-cpp:
22

3-
Customizing library models for C#
4-
=================================
3+
Customizing library models for C and C++
4+
========================================
55

66
You can model the methods and callables that control data flow in any framework or library. This is especially useful for custom frameworks or niche libraries, that are not supported by the standard CodeQL libraries.
77

@@ -10,28 +10,28 @@ You can model the methods and callables that control data flow in any framework
1010
About this article
1111
------------------
1212

13-
This article contains reference material about how to define custom models for sources, sinks, and flow summaries for C# dependencies in data extension files.
13+
This article contains reference material about how to define custom models for sources, sinks, and flow summaries for C and C++ dependencies in data extension files.
1414

1515
About data extensions
1616
---------------------
1717

18-
You can customize analysis by defining models (summaries, sinks, and sources) of your code's C#/.NET dependencies in data extension files. Each model defines the behavior of one or more elements of your library or framework, such as methods, properties, and callables. When you run dataflow analysis, these models expand the potential sources and sinks tracked by dataflow analysis and improve the precision of results.
18+
You can customize analysis by defining models (summaries, sinks, and sources) of your code's C and C++ dependencies in data extension files. Each model defines the behavior of one or more elements of your library or framework, such as callables. When you run dataflow analysis, these models expand the potential sources and sinks tracked by dataflow analysis and improve the precision of results.
1919

20-
Most of the security queries search for paths from a source of untrusted input to a sink that represents a vulnerability. This is known as taint tracking. Each source is a starting point for dataflow analysis to track tainted data and each sink is an end point.
20+
Many of the security queries search for paths from a source of untrusted input to a sink that represents a vulnerability. This is known as taint tracking. Each source is a starting point for dataflow analysis to track tainted data and each sink is an end point.
2121

2222
Taint tracking queries also need to know how data can flow through elements that are not included in the source code. These are modeled as summaries. A summary model enables queries to synthesize the flow behavior through elements in dependency code that is not stored in your repository.
2323

2424
Syntax used to define an element in an extension file
2525
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2626

2727
Each model of an element is defined using a data extension where each tuple constitutes a model.
28-
A data extension file to extend the standard C# queries included with CodeQL is a YAML file with the form:
28+
A data extension file to extend the standard CPP queries included with CodeQL is a YAML file with the form:
2929

3030
.. code-block:: yaml
3131
3232
extensions:
3333
- addsTo:
34-
pack: codeql/csharp-all
34+
pack: codeql/cpp-all
3535
extensible: <name of extensible predicate>
3636
data:
3737
- <tuple1>
@@ -50,30 +50,31 @@ Publish data extension files in a CodeQL model pack to share
5050

5151
You can group one or more data extension files into a CodeQL model pack and publish it to the GitHub Container Registry. This makes it easy for anyone to download the model pack and use it to extend their analysis. For more information, see `Creating a CodeQL model pack <https://docs.github.com/en/code-security/codeql-cli/using-the-advanced-functionality-of-the-codeql-cli/creating-and-working-with-codeql-packs#creating-a-codeql-model-pack>`__ and `Publishing and using CodeQL packs <https://docs.github.com/en/code-security/codeql-cli/using-the-advanced-functionality-of-the-codeql-cli/publishing-and-using-codeql-packs/>`__ in the CodeQL CLI documentation.
5252

53-
Extensible predicates used to create custom models in C#
54-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53+
Extensible predicates used to create custom models in C and C++
54+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5555

56-
The CodeQL library for C# analysis exposes the following extensible predicates:
56+
The CodeQL library for CPP analysis exposes the following extensible predicates:
5757

58-
- ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model sources of potentially tainted data. The ``kind`` of the sources defined using this predicate determine which threat model they are associated with. Different threat models can be used to customize the sources used in an analysis. For more information, see ":ref:`Threat models <threat-models-csharp>`."
58+
- ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model sources of potentially tainted data. The ``kind`` of the sources defined using this predicate determine which threat model they are associated with. Different threat models can be used to customize the sources used in an analysis. For more information, see ":ref:`Threat models <threat-models-cpp>`."
5959
- ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)``. This is used to model sinks where tainted data may be used in a way that makes the code vulnerable.
6060
- ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to model flow through elements.
61-
- ``neutralModel(namespace, type, name, signature, kind, provenance)``. This is similar to a summary model but used to model the flow of values that have only a minor impact on the dataflow analysis. Manual neutral models (those with a provenance such as ``manual`` or ``ai-manual``) can be used to override generated summary models (those with a provenance such as ``df-generated``), so that the summary model will be ignored. Other than that, neutral models have no effect.
6261

6362
The extensible predicates are populated using the models defined in data extension files.
6463

6564
Examples of custom model definitions
6665
------------------------------------
6766

68-
The examples in this section are taken from the standard CodeQL C# query pack published by GitHub. They demonstrate how to add tuples to extend extensible predicates that are used by the standard queries.
67+
TODO: one good example might do, but we currently have zero.
68+
69+
The examples in this section are taken from the standard CodeQL CPP query pack published by GitHub. They demonstrate how to add tuples to extend extensible predicates that are used by the standard queries.
6970

7071
Example: Taint sink in the ``System.Data.SqlClient`` namespace
7172
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
7273

73-
This example shows how the C# query pack models the argument of the ``SqlCommand`` constructor as a SQL injection sink.
74+
This example shows how the CPP query pack models the argument of the ``SqlCommand`` constructor as a SQL injection sink.
7475
This is the constructor of the ``SqlCommand`` class, which is located in the ``System.Data.SqlClient`` namespace.
7576

76-
.. code-block:: csharp
77+
.. code-block:: csharp TODO
7778

7879
public static void TaintSink(SqlConnection conn, string query) {
7980
SqlCommand command = new SqlCommand(query, connection) // The argument to this method is a SQL injection sink.
@@ -86,7 +87,7 @@ We need to add a tuple to the ``sinkModel``\(namespace, type, subtypes, name, si
8687
8788
extensions:
8889
- addsTo:
89-
pack: codeql/csharp-all
90+
pack: codeql/cpp-all
9091
extensible: sinkModel
9192
data:
9293
- ["System.Data.SqlClient", "SqlCommand", False, "SqlCommand", "(System.String,System.Data.SqlClient.SqlConnection)", "", "Argument[0]", "sql-injection", "manual"]
@@ -109,10 +110,10 @@ The remaining values are used to define the ``access path``, the ``kind``, and t
109110

110111
Example: Taint source from the ``System.Net.Sockets`` namespace
111112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
112-
This example shows how the C# query pack models the return value from the ``GetStream`` method as a ``remote`` source.
113+
This example shows how the CPP query pack models the return value from the ``GetStream`` method as a ``remote`` source.
113114
This is the ``GetStream`` method in the ``TcpClient`` class, which is located in the ``System.Net.Sockets`` namespace.
114115

115-
.. code-block:: csharp
116+
.. code-block:: csharp TODO
116117

117118
public static void Tainted(TcpClient client) {
118119
NetworkStream stream = client.GetStream(); // The return value of this method is a remote source of taint.
@@ -125,7 +126,7 @@ We need to add a tuple to the ``sourceModel``\(namespace, type, subtypes, name,
125126
126127
extensions:
127128
- addsTo:
128-
pack: codeql/csharp-all
129+
pack: codeql/cpp-all
129130
extensible: sourceModel
130131
data:
131132
- ["System.Net.Sockets", "TcpClient", False, "GetStream", "()", "", "ReturnValue", "remote", "manual"]
@@ -144,15 +145,15 @@ The sixth value should be left empty and is out of scope for this documentation.
144145
The remaining values are used to define the ``access path``, the ``kind``, and the ``provenance`` (origin) of the source.
145146

146147
- The seventh value ``ReturnValue`` is the access path to the return of the method, which means that it is the return value that should be considered a source of tainted input.
147-
- The eighth value ``remote`` is the kind of the source. The source kind is used to define the threat model where the source is in scope. ``remote`` applies to many of the security related queries as it means a remote source of untrusted data. As an example the SQL injection query uses ``remote`` sources. For more information, see ":ref:`Threat models <threat-models-csharp>`."
148+
- The eighth value ``remote`` is the kind of the source. The source kind is used to define the threat model where the source is in scope. ``remote`` applies to many of the security related queries as it means a remote source of untrusted data. As an example the SQL injection query uses ``remote`` sources. For more information, see ":ref:`Threat models <threat-models-cpp>`."
148149
- The ninth value ``manual`` is the provenance of the source, which is used to identify the origin of the source.
149150

150151
Example: Add flow through the ``Concat`` method
151152
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
152-
This example shows how the C# query pack models flow through a method for a simple case.
153+
This example shows how the CPP query pack models flow through a method for a simple case.
153154
This pattern covers many of the cases where we need to summarize flow through a method that is stored in a library or framework outside the repository.
154155

155-
.. code-block:: csharp
156+
.. code-block:: cpp TODO
156157

157158
public static void TaintFlow(string s1, string s2) {
158159
string t = String.Concat(s1, s2); // There is taint flow from s1 and s2 to t.
@@ -165,7 +166,7 @@ We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name,
165166
166167
extensions:
167168
- addsTo:
168-
pack: codeql/csharp-all
169+
pack: codeql/cpp-all
169170
extensible: summaryModel
170171
data:
171172
- ["System", "String", False, "Concat", "(System.Object,System.Object)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
@@ -198,7 +199,7 @@ It would also be possible to merge the two rows into one by using a comma-separa
198199
199200
extensions:
200201
- addsTo:
201-
pack: codeql/csharp-all
202+
pack: codeql/cpp-all
202203
extensible: summaryModel
203204
data:
204205
- ["System", "String", False, "Concat", "(System.Object,System.Object)", "", "Argument[0,1]", "ReturnValue", "taint", "manual"]
@@ -207,9 +208,9 @@ This row defines flow from both the first and the second argument to the return
207208

208209
Example: Add flow through the ``Trim`` method
209210
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
210-
This example shows how the C# query pack models flow through a method for a simple case.
211+
This example shows how the CPP query pack models flow through a method for a simple case.
211212

212-
.. code-block:: csharp
213+
.. code-block:: cpp TODO
213214

214215
public static void TaintFlow(string s) {
215216
string t = s.Trim(); // There is taint flow from s to t.
@@ -222,7 +223,7 @@ We need to add a tuple to the ``summaryModel``\(namespace, type, subtypes, name,
222223
223224
extensions:
224225
- addsTo:
225-
pack: codeql/csharp-all
226+
pack: codeql/cpp-all
226227
extensible: summaryModel
227228
data:
228229
- ["System", "String", False, "Trim", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"]
@@ -250,10 +251,10 @@ The remaining values are used to define the ``access path``, the ``kind``, and t
250251

251252
Example: Add flow through the ``Select`` method
252253
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
253-
This example shows how the C# query pack models a more complex flow through a method.
254+
This example shows how the CPP query pack models a more complex flow through a method.
254255
Here we model flow through higher order methods and collection types, as well as how to handle extension methods and generics.
255256

256-
.. code-block:: csharp
257+
.. code-block:: cpp TODO
257258

258259
public static void TaintFlow(IEnumerable<string> stream) {
259260
IEnumerable<string> lines = stream.Select(item => item + "\n");
@@ -266,7 +267,7 @@ We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name,
266267
267268
extensions:
268269
- addsTo:
269-
pack: codeql/csharp-all
270+
pack: codeql/cpp-all
270271
extensible: summaryModel
271272
data:
272273
- ["System.Linq", "Enumerable", False, "Select<TSource,TResult>", "(System.Collections.Generic.IEnumerable<TSource>,System.Func<TSource,TResult>)", "", "Argument[0].Element", "Argument[1].Parameter[0]", "value", "manual"]
@@ -307,41 +308,7 @@ For the remaining values for both rows:
307308

308309
That is, the first row specifies that values can flow from the elements of the qualifier enumerable into the first argument of the function provided to ``Select``. The second row specifies that values can flow from the return value of the function to the elements of the enumerable returned from ``Select``.
309310

310-
Example: Add a ``neutral`` method
311-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
312-
This example shows how we can model a method as being neutral with respect to flow. We will also cover how to model a property by modeling the getter of the ``Now`` property of the ``DateTime`` class as neutral.
313-
A neutral model is used to define that there is no flow through a method.
314-
315-
.. code-block:: csharp
316-
317-
public static void TaintFlow() {
318-
System.DateTime t = System.DateTime.Now; // There is no flow from Now to t.
319-
...
320-
}
321-
322-
We need to add a tuple to the ``neutralModel``\(namespace, type, name, signature, kind, provenance) extensible predicate by updating a data extension file.
323-
324-
.. code-block:: yaml
325-
326-
extensions:
327-
- addsTo:
328-
pack: codeql/csharp-all
329-
extensible: neutralModel
330-
data:
331-
- ["System", "DateTime", "get_Now", "()", "summary", "manual"]
332-
333-
334-
Since we are adding a neutral model, we need to add tuples to the ``neutralModel`` extensible predicate.
335-
The first four values identify the callable (in this case the getter of the ``Now`` property) to be modeled as a neutral, the fifth value is the kind, and the sixth value is the provenance (origin) of the neutral.
336-
337-
- The first value ``System`` is the namespace name.
338-
- The second value ``DateTime`` is the class (type) name.
339-
- The third value ``get_Now`` is the method name. Getter and setter methods are named ``get_<name>`` and ``set_<name>`` respectively.
340-
- The fourth value ``()`` is the method input type signature.
341-
- The fifth value ``summary`` is the kind of the neutral.
342-
- The sixth value ``manual`` is the provenance of the neutral.
343-
344-
.. _threat-models-csharp:
311+
.. _threat-models-cpp:
345312

346313
Threat models
347314
-------------
