Commit 748749c

Python, doc: Describe smoother syntax
1 parent f561c45 commit 748749c

1 file changed

+33
-5
lines changed

docs/codeql/codeql-language-guides/analyzing-data-flow-in-python.rst

Lines changed: 33 additions & 5 deletions
@@ -107,7 +107,7 @@ Unfortunately this will only give the expression in the argument, not the values
107107
108108
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8213643003890447109/>`__. Many expressions flow to the same call.
109109

110-
We see that we get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``:
110+
We see that we get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``. We could simply demand that ``expr`` is such a node:
111111

112112
.. code-block:: ql
113113
@@ -122,11 +122,39 @@ We see that we get several data-flow nodes for an expression as it flows towards
122122
expr instanceof DataFlow::LocalSourceNode
123123
select call, expr
124124
125-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2017139821928498055/>`__. We now mostly have one expression per call.
125+
However, we could also enforce this by casting. That would allow us to use the member predicate ``flowsTo`` on ``LocalSourceNode`` like so:
126+
127+
.. code-block:: ql
128+
129+
import python
130+
import semmle.python.dataflow.new.DataFlow
131+
import semmle.python.ApiGraphs
132+
133+
from DataFlow::CallCfgNode call, DataFlow::ExprNode expr
134+
where
135+
call = API::moduleImport("os").getMember("open").getACall() and
136+
expr.(DataFlow::LocalSourceNode).flowsTo(call.getArg(0))
137+
select call, expr
138+
139+
As an alternative, we can ask more directly that ``expr`` is a local source of the first argument, via the predicate ``getALocalSource``:
140+
141+
.. code-block:: ql
142+
143+
import python
144+
import semmle.python.dataflow.new.DataFlow
145+
import semmle.python.ApiGraphs
146+
147+
from DataFlow::CallCfgNode call, DataFlow::ExprNode expr
148+
where
149+
call = API::moduleImport("os").getMember("open").getACall() and
150+
expr = call.getArg(0).getALocalSource()
151+
select call, expr
152+
153+
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/6602079735954016687/>`__. All three of these queries give identical results. We now mostly have one expression per call.
126154

127155
We still have some cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting, as in the second case).
128156

129-
We can also make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:
157+
We might want to make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:
130158

131159
.. code-block:: ql
132160
@@ -140,7 +168,7 @@ We can also make the source more specific, for example a parameter to a function
140168
DataFlow::localFlow(p, call.getArg(0))
141169
select call, p
142170
143-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/3998032643497238063/>`__. Very few hits now; these could feasibly be inspected manually.
171+
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/3998032643497238063/>`__. Very few results now; these could feasibly be inspected manually.
144172

145173
Using the exact name supplied via the parameter may be too strict. If we want to know if the parameter influences the file name, we can use taint tracking instead of data flow. This query finds calls to ``os.open`` where the filename is derived from a parameter:
146174

@@ -156,7 +184,7 @@ Using the exact name supplied via the parameter may be too strict. If we want to
156184
TaintTracking::localTaint(p, call.getArg(0))
157185
select call, p
158186
159-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2129957933670836953/>`__. Now we get more hits and in more projects.
187+
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2129957933670836953/>`__. Now we get more results and in more projects.
160188

161189
Global data flow
162190
----------------
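
To make the distinctions in the local-flow section of this diff concrete, here is a minimal Python sketch of the kind of target code the queries inspect. It is not part of the commit: the function names and the file name ``example.txt`` are hypothetical, and the comments describe the results one would expect from the queries rather than verified output.

```python
import os


def open_exact(path):
    # The parameter reaches os.open unchanged, so there is local data flow
    # from `path` to the first argument. The localFlow-based parameter query
    # would be expected to report this call.
    return os.open(path, os.O_RDONLY)


def open_derived(directory):
    # The file name is computed from the parameter, so the parameter's value
    # is not preserved. Only the localTaint-based query would be expected to
    # report this call.
    path = directory + os.sep + "example.txt"
    return os.open(path, os.O_RDONLY)


def open_branching(path, use_default):
    # Two different expressions can reach the same call through different
    # code paths, which is one way a call can still be paired with more than
    # one local source.
    if use_default:
        name = os.devnull
    else:
        name = path
    return os.open(name, os.O_RDONLY)
```

Under these assumptions, ``open_exact`` illustrates the case where the exact value supplied via the parameter is used, ``open_derived`` the case where the parameter only influences the file name, and ``open_branching`` the case of several expressions flowing to one call.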
