
Commit da4c178

felicitymayaibaars authored and committed
Update main Python articles
1 parent 8eeba92 commit da4c178

7 files changed: 19 additions & 47 deletions

docs/codeql/codeql-language-guides/analyzing-control-flow-in-python.rst

Lines changed: 3 additions & 3 deletions
@@ -47,7 +47,7 @@ Example finding unreachable AST nodes
 where not exists(node.getAFlowNode())
 select node

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/669220024/>`__. The demo projects on LGTM.com all have some code that has no control flow node, and is therefore unreachable. However, since the ``Module`` class is also a subclass of the ``AstNode`` class, the query also finds any modules implemented in C or with no source code. Therefore, it is better to find all unreachable statements.
+Many codebases have some code that has no control flow node, and is therefore unreachable. However, since the ``Module`` class is also a subclass of the ``AstNode`` class, the query also finds any modules implemented in C or with no source code. Therefore, it is better to find all unreachable statements.

 Example finding unreachable statements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -60,7 +60,7 @@ Example finding unreachable statements
 where not exists(s.getAFlowNode())
 select s

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/670720181/>`__. This query gives fewer results, but most of the projects have some unreachable nodes. These are also highlighted by the standard "Unreachable code" query. For more information, see `Unreachable code <https://lgtm.com/rules/3980095>`__ on LGTM.com.
+This query should give fewer results. You can also find unreachable code using the standard "Unreachable code" query. For more information, see `Unreachable code <https://codeql.github.com/codeql-query-help/python/py-unreachable-statement/>`__.

 The ``BasicBlock`` class
 ------------------------
@@ -114,7 +114,7 @@ Example finding mutually exclusive blocks within the same function
 )
 select b1, b2

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/671000028/>`__. This typically gives a very large number of results, because it is a common occurrence in normal control flow. It is, however, an example of the sort of control-flow analysis that is possible. Control-flow analyses such as this are an important aid to data flow analysis. For more information, see ":doc:`Analyzing data flow in Python <analyzing-data-flow-in-python>`."
+This typically gives a very large number of results, because it is a common occurrence in normal control flow. It is, however, an example of the sort of control-flow analysis that is possible. Control-flow analyses such as this are an important aid to data flow analysis. For more information, see ":doc:`Analyzing data flow in Python <analyzing-data-flow-in-python>`."

 Further reading
 ---------------
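
Illustration (not part of this commit): the idea behind the "unreachable statements" query in this file can be approximated in plain Python with the standard ``ast`` module. This is only a rough sketch and far cruder than CodeQL's control-flow graph: it flags statements that directly follow a ``return``, ``raise``, ``break`` or ``continue`` in the same block, and nothing else. The helper name ``find_unreachable`` and the sample source are hypothetical.

```python
import ast

SAMPLE = '''
def f(x):
    if x:
        return 1
        print("never runs")
    return 2
'''

# Statement types after which nothing else in the same block can execute.
TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)


def find_unreachable(source):
    """Return line numbers of statements that follow a terminator in the same block."""
    unreachable = []
    for node in ast.walk(ast.parse(source)):
        # Statement lists live in fields such as body, orelse and finalbody.
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue  # e.g. a lambda's body is a single expression, not a list
            for i, stmt in enumerate(block):
                if isinstance(stmt, TERMINATORS):
                    unreachable.extend(s.lineno for s in block[i + 1:])
                    break
    return sorted(unreachable)


print(find_unreachable(SAMPLE))
```

A real analysis would also need to handle cases such as ``if`` statements whose branches all return, which is where a proper control-flow graph like the one CodeQL builds becomes necessary.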

docs/codeql/codeql-language-guides/analyzing-data-flow-in-python.rst

Lines changed: 6 additions & 12 deletions
@@ -97,11 +97,9 @@ Python has builtin functionality for reading and writing files, such as the func
 call = API::moduleImport("os").getMember("open").getACall()
 select call.getArg(0)

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8635258505893505141/>`__. Two of the demo projects make use of this low-level API.
-
 Notice the use of the ``API`` module for referring to library functions. For more information, see ":doc:`Using API graphs in Python <using-api-graphs-in-python>`."

-Unfortunately this will only give the expression in the argument, not the values which could be passed to it. So we use local data flow to find all expressions that flow into the argument:
+Unfortunately this query will only give the expression in the argument, not the values which could be passed to it. So we use local data flow to find all expressions that flow into the argument:

 .. code-block:: ql

@@ -115,9 +113,7 @@ Unfortunately this will only give the expression in the argument, not the values
 DataFlow::localFlow(expr, call.getArg(0))
 select call, expr

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8213643003890447109/>`__. Many expressions flow to the same call.
-
-We see that we get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``. We could demand that ``expr`` is such a node:
+Typically, you will see several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``. We could demand that ``expr`` is such a node:

 .. code-block:: ql

@@ -160,9 +156,9 @@ As an alternative, we can ask more directly that ``expr`` is a local source of t
 expr = call.getArg(0).getALocalSource()
 select call, expr

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/6602079735954016687/>`__. All these three queries give identical results. We now mostly have one expression per call.
+These three queries all give identical results. We now mostly have one expression per call.

-We still have some cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting, as in the second case).
+We still have some cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting).

 We might want to make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:

@@ -178,7 +174,7 @@ We might want to make the source more specific, for example a parameter to a fun
 DataFlow::localFlow(p, call.getArg(0))
 select call, p

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/3998032643497238063/>`__. Very few results now; these could feasibly be inspected manually.
+For most codebases, this will return only a few results and these could be inspected manually.

 Using the exact name supplied via the parameter may be too strict. If we want to know if the parameter influences the file name, we can use taint tracking instead of data flow. This query finds calls to ``os.open`` where the filename is derived from a parameter:

@@ -194,7 +190,7 @@ Using the exact name supplied via the parameter may be too strict. If we want to
 TaintTracking::localTaint(p, call.getArg(0))
 select call, p

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2129957933670836953/>`__. Now we get more results and in more projects.
+Typically, this finds more results.

 Global data flow
 ----------------
@@ -369,8 +365,6 @@ This data flow configuration tracks data flow from environment variables to open
 select fileOpen, "This call to 'os.open' uses data from $@.",
 environment, "call to 'os.getenv'"

-➤ `Running this in the query console on LGTM.com <https://lgtm.com/query/6582374907796191895/>`__ unsurprisingly yields no results in the demo projects.
-

 Further reading
 ---------------
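
Illustration (not part of this commit): the difference between the local data-flow and local taint-tracking queries in this file is easier to see against a concrete target. The hypothetical Python snippet below shows two call shapes. Assuming string concatenation is treated as a taint step but not as a value-preserving data-flow step, the ``DataFlow::localFlow`` query would be expected to report only ``open_exact``, while the ``TaintTracking::localTaint`` query would be expected to report ``open_derived`` as well. Both function names are made up for this sketch.

```python
import os


def open_exact(path):
    # The parameter reaches os.open unchanged, so the file name *is* the parameter.
    return os.open(path, os.O_RDONLY)


def open_derived(name):
    # The file name is only derived from the parameter (by concatenation),
    # so the parameter influences the argument without being equal to it.
    return os.open(name + ".log", os.O_RDONLY)
```

Both functions return a raw file descriptor, which the caller is responsible for closing with ``os.close``.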

docs/codeql/codeql-language-guides/codeql-for-python.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
1616
expressions-and-statements-in-python
1717
analyzing-control-flow-in-python
1818

19-
- :doc:`Basic query for Python code <basic-query-for-python-code>`: Learn to write and run a simple CodeQL query using LGTM.
19+
- :doc:`Basic query for Python code <basic-query-for-python-code>`: Learn to write and run a simple CodeQL query.
2020

2121
- :doc:`CodeQL library for Python <codeql-library-for-python>`: When you need to analyze a Python program, you can make use of the large collection of classes in the CodeQL library for Python.
2222

docs/codeql/codeql-language-guides/codeql-library-for-python.rst

Lines changed: 4 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -53,7 +53,7 @@ All scopes are basically a list of statements, although ``Scope`` classes have a
5353
where f.getScope() instanceof Function
5454
select f
5555
56-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/665620040/>`__. Many projects have nested functions.
56+
Many codebases use nested functions.
5757

5858
Statement
5959
^^^^^^^^^
@@ -95,7 +95,7 @@ As an example, to find expressions of the form ``a+2`` where the left is a simpl
 where bin.getLeft() instanceof Name and bin.getRight() instanceof Num
 select bin

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/669950026/>`__. Many projects include examples of this pattern.
+Many codebases include examples of this pattern.

 Variable
 ^^^^^^^^
@@ -126,7 +126,7 @@ For our first example, we can find all ``finally`` blocks by using the ``Try`` c
 from Try t
 select t.getFinalbody()

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/659662193/>`__. Many projects include examples of this pattern.
+Many codebases include examples of this pattern.

 2. Finding ``except`` blocks that do nothing
 ''''''''''''''''''''''''''''''''''''''''''''
@@ -157,7 +157,7 @@ Both forms are equivalent. Using the positive expression, the whole query looks
 where forall(Stmt s | s = ex.getAStmt() | s instanceof Pass)
 select ex

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/690010036/>`__. Many projects include pass-only ``except`` blocks.
+Many codebases include pass-only ``except`` blocks.

 Summary
 ^^^^^^^
@@ -278,8 +278,6 @@ Using this predicate we can select the longest ``BasicBlock`` by selecting the `
 where bb_length(b) = max(bb_length(_))
 select b

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/666730036/>`__. When we ran it on the LGTM.com demo projects, the *openstack/nova* and *ytdl-org/youtube-dl* projects both contained source code results for this query.
-
 .. pull-quote::

 Note

docs/codeql/codeql-language-guides/expressions-and-statements-in-python.rst

Lines changed: 2 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -54,8 +54,6 @@ The ``global`` statement in Python declares a variable with a global (module-lev
5454
where g.getScope() instanceof Module
5555
select g
5656
57-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/686330052/>`__. None of the demo projects on LGTM.com has a global statement that matches this pattern.
58-
5957
The line: ``g.getScope() instanceof Module`` ensures that the ``Scope`` of ``Global g`` is a ``Module``, rather than a class or function.
6058

6159
Example finding 'if' statements with redundant branches
@@ -81,7 +79,7 @@ To find statements like this that could be simplified we can write a query.
 and forall(Stmt p | p = l.getAnItem() | p instanceof Pass)
 select i

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/672230053/>`__. Many projects have some ``if`` statements that match this pattern.
+Many codebases have some ``if`` statements that match this pattern.

 The line: ``(l = i.getBody() or l = i.getOrelse())`` restricts the ``StmtList l`` to branches of the ``if`` statement.

@@ -150,8 +148,6 @@ We can check for these using a query.
 and cmp.getOp(0) instanceof Is and cmp.getComparator(0) = literal
 select cmp

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/688180010/>`__. Two of the demo projects on LGTM.com use this pattern: *saltstack/salt* and *openstack/nova*.
-
 The clause ``cmp.getOp(0) instanceof Is and cmp.getComparator(0) = literal`` checks that the first comparison operator is "is" and that the first comparator is a literal.

 .. pull-quote::
@@ -180,7 +176,7 @@ If there are duplicate keys in a Python dictionary, then the second key will ove
 and k1 != k2 and same_key(k1, k2)
 select k1, "Duplicate key in dict literal"

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/663330305/>`__. When we ran this query on LGTM.com, the source code of the *saltstack/salt* project contained an example of duplicate dictionary keys. The results were also highlighted as alerts by the standard "Duplicate key in dict literal" query. Two of the other demo projects on LGTM.com refer to duplicate dictionary keys in library files. For more information, see `Duplicate key in dict literal <https://lgtm.com/rules/3980087>`__ on LGTM.com.
+When we ran this query on some test codebases, we found examples of duplicate dictionary keys. The results were also highlighted as alerts by the standard "Duplicate key in dict literal" query. For more information, see `Duplicate key in dict literal <https://codeql.github.com/codeql-query-help/python/py-duplicate-key-dict-literal/>`__.

 The supporting predicate ``same_key`` checks that the keys have the same identifier. Separating this part of the logic into a supporting predicate, instead of directly including it in the query, makes it easier to understand the query as a whole. The casts defined in the predicate restrict the expression to the type specified and allow predicates to be called on the type that is cast-to. For example:

@@ -222,8 +218,6 @@ This basic query can be improved by checking that the one line of code is a Java
 and attr.getObject() = self and self.getId() = "self"
 select f, "This function is a Java-style getter."

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/669220054/>`__. Of the demo projects on LGTM.com, only the *openstack/nova* project has examples of functions that appear to be Java-style getters.
-
 .. code-block:: ql

 ret = f.getStmt(0) and ret.getValue() = attr
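
Illustration (not part of this commit): the getter query in this hunk matches functions whose single statement returns an attribute of ``self``. The same shape can be checked, very roughly and without CodeQL, using Python's standard ``ast`` module. The class ``Point`` and the helper ``java_style_getters`` are hypothetical names used only for this sketch.

```python
import ast

SAMPLE = '''
class Point:
    def __init__(self, x):
        self._x = x

    def get_x(self):
        return self._x

    def get_scaled(self, k):
        y = self._x * k
        return y
'''


def java_style_getters(source):
    """Return names of functions whose only statement is `return self.<attr>`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        if len(node.body) != 1:
            continue  # mirrors the "one statement" condition of the query
        ret = node.body[0]
        if (
            isinstance(ret, ast.Return)
            and isinstance(ret.value, ast.Attribute)
            and isinstance(ret.value.value, ast.Name)
            and ret.value.value.id == "self"
        ):
            found.append(node.name)
    return found


print(java_style_getters(SAMPLE))
```

Only ``get_x`` has the getter shape here; ``__init__`` has a single statement but it is an assignment, and ``get_scaled`` has two statements.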

docs/codeql/codeql-language-guides/functions-in-python.rst

Lines changed: 3 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ Using the member predicate ``Function.getName()``, we can list all of the getter
2828
where f.getName().matches("get%")
2929
select f, "This is a function called get..."
3030
31-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/669220031/>`__. This query typically finds a large number of results. Usually, many of these results are for functions (rather than methods) which we are not interested in.
31+
This query typically finds a large number of results. Usually, many of these results are for functions (rather than methods) which we are not interested in.
3232

3333
Finding all methods called "get..."
3434
-----------------------------------
@@ -43,7 +43,7 @@ You can modify the query above to return more interesting results. As we are onl
 where f.getName().matches("get%") and f.isMethod()
 select f, "This is a method called get..."

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/690010035/>`__. This finds methods whose name starts with ``"get"``, but many of those are not the sort of simple getters we are interested in.
+This finds methods whose name starts with ``"get"``, but many of those are not the sort of simple getters we are interested in.

 Finding one line methods called "get..."
 ----------------------------------------
@@ -59,7 +59,7 @@ We can modify the query further to include only methods whose body consists of a
 and count(f.getAStmt()) = 1
 select f, "This function is (probably) a getter."

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/667290044/>`__. This query returns fewer results, but if you examine the results you can see that there are still refinements to be made. This is refined further in ":doc:`Expressions and statements in Python <expressions-and-statements-in-python>`."
+This query returns fewer results, but if you examine the results you can see that there are still refinements to be made. This is refined further in ":doc:`Expressions and statements in Python <expressions-and-statements-in-python>`."

 Finding a call to a specific function
 -------------------------------------
@@ -74,8 +74,6 @@ This query uses ``Call`` and ``Name`` to find calls to the function ``eval`` - w
 where call.getFunc() = name and name.getId() = "eval"
 select call, "call to 'eval'."

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/6718356557331218618/>`__. Some of the demo projects on LGTM.com use this function.
-
 The ``Call`` class represents calls in Python. The ``Call.getFunc()`` predicate gets the expression being called. ``Name.getId()`` gets the identifier (as a string) of the ``Name`` expression.
 Due to the dynamic nature of Python, this query will select any call of the form ``eval(...)`` regardless of whether it is a call to the built-in function ``eval`` or not.
 In a later tutorial we will see how to use the type-inference library to find calls to the built-in function ``eval`` regardless of name of the variable called.
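
Illustration (not part of this commit): the caveat in this hunk, that the query matches any call written as ``eval(...)`` whether or not the name refers to the built-in, can be seen with a purely syntactic check. The sketch below uses Python's standard ``ast`` module rather than CodeQL; ``calls_named_eval`` and the sample source are hypothetical.

```python
import ast

SAMPLE = '''
def eval(expr):          # shadows the built-in
    return expr.upper()

print(eval("abc"))       # calls the local function, not the built-in eval
'''


def calls_named_eval(source):
    """Return line numbers of calls whose callee is the bare name `eval`."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
        ):
            lines.append(node.lineno)
    return lines


print(calls_named_eval(SAMPLE))
```

The call on the last line of the sample is reported even though it never reaches the built-in ``eval``. A name-based match cannot tell the two apart, which is why the tutorial text defers the precise version to type inference.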

docs/codeql/codeql-language-guides/using-api-graphs-in-python.rst

Lines changed: 0 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -29,8 +29,6 @@ following snippet demonstrates.
2929
3030
select API::moduleImport("re")
3131
32-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/1876172022264324639/>`__.
33-
3432
This query selects the API graph node corresponding to the ``re`` module. This node represents the fact that the ``re`` module has been imported rather than a specific location in the program where the import happens. Therefore, there will be at most one result per project, and it will not have a useful location, so you'll have to click `Show 1 non-source result` in order to see it.
3533

3634
To find where the ``re`` module is referenced in the program, you can use the ``getAUse`` method. The following query selects all references to the ``re`` module in the current database.
@@ -42,8 +40,6 @@ To find where the ``re`` module is referenced in the program, you can use the ``

 select API::moduleImport("re").getAUse()

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8072356519514905526/>`__.
-
 Note that the ``getAUse`` method accounts for local flow, so that ``my_re_compile``
 in the following snippet is
 correctly recognized as a reference to the ``re.compile`` function.
@@ -77,8 +73,6 @@ the above ``re.compile`` example, you can now find references to ``re.compile``.

 select API::moduleImport("re").getMember("compile").getAUse()

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/7970570434725297676/>`__.
-
 In addition to ``getMember``, you can use the ``getUnknownMember`` method to find references to API
 components where the name is not known statically. You can use the ``getAMember`` method to
 access all members, both known and unknown.
@@ -97,15 +91,11 @@ where the return value of ``re.compile`` is used:

 select API::moduleImport("re").getMember("compile").getReturn().getAUse()

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/4346050399960356921/>`__.
-
 Note that this includes all uses of the result of ``re.compile``, including those reachable via
 local flow. To get just the *calls* to ``re.compile``, you can use ``getAnImmediateUse`` instead of
 ``getAUse``. As this is a common occurrence, you can use ``getACall`` instead of
 ``getReturn`` followed by ``getAnImmediateUse``.

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8143347716552092926/>`__.
-
 Note that the API graph does not distinguish between class instantiations and function calls. As far
 as it's concerned, both are simply places where an API graph node is called.

@@ -134,8 +124,6 @@ all subclasses of ``View``, you must explicitly include the subclasses of ``Meth

 select viewClass().getAUse()

-➤ `See this in the query console on LGTM.com <https://lgtm.com/query/288293322319747121/>`__.
-
 Note the use of the set literal ``["View", "MethodView"]`` to match both classes simultaneously.

 Built-in functions and classes
