docs/codeql/codeql-language-guides/analyzing-data-flow-in-python.rst
Lines changed: 33 additions & 5 deletions
@@ -107,7 +107,7 @@ Unfortunately this will only give the expression in the argument, not the values
107
107
108
108
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8213643003890447109/>`__. Many expressions flow to the same call.
109
109
110
-
We see that we get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``:
110
+
We see that we get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``. We could simply demand that ``expr`` is such a node:
111
111
112
112
.. code-block:: ql
113
113
@@ -122,11 +122,39 @@ We see that we get several data-flow nodes for an expression as it flows towards
122
122
expr instanceof DataFlow::LocalSourceNode
123
123
select call, expr
124
124
125
-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2017139821928498055/>`__. We now mostly have one expression per call.
125
+
However, we could also enforce this by casting. That would allow us to use the member predicate ``flowsTo`` on ``LocalSourceNode`` like so:
126
+
127
+
.. code-block:: ql
128
+
129
+
import python
130
+
import semmle.python.dataflow.new.DataFlow
131
+
import semmle.python.ApiGraphs
132
+
133
+
from DataFlow::CallCfgNode call, DataFlow::ExprNode expr
134
+
where
135
+
call = API::moduleImport("os").getMember("open").getACall() and
As an alternative, we can ask more directly that ``expr`` is a local source of the first argument, via the predicate ``getALocalSource``:
140
+
141
+
.. code-block:: ql
142
+
143
+
import python
144
+
import semmle.python.dataflow.new.DataFlow
145
+
import semmle.python.ApiGraphs
146
+
147
+
from DataFlow::CallCfgNode call, DataFlow::ExprNode expr
148
+
where
149
+
call = API::moduleImport("os").getMember("open").getACall() and
150
+
expr = call.getArg(0).getALocalSource()
151
+
select call, expr
152
+
153
+
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/6602079735954016687/>`__. All three of these queries give identical results. We now mostly have one expression per call.
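To make the notion of a local source concrete, here is a small hypothetical Python function of the kind these queries analyze. The function name ``open_null_device`` and its variables are illustrative assumptions, not taken from any analyzed project. The value passed to ``os.open`` travels through two intermediate variables, so several expression nodes flow to the call, but only the right-hand side of the first assignment would be expected to count as the local source.

```python
import os


def open_null_device() -> int:
    # The attribute read `os.devnull` is where the value first appears in
    # this function, so it is the expected local source.
    path = os.devnull
    # The later reads of `path` and `alias` carry the same value; they are
    # the extra nodes an unrestricted local-flow query would also report.
    alias = path
    return os.open(alias, os.O_RDONLY)


# Minimal usage: open the platform's null device and close it again.
fd = open_null_device()
os.close(fd)
```

Restricting ``expr`` to a ``LocalSourceNode`` should leave a single result for this call instead of one per intermediate read.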
126
154
127
155
We still have some cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting, as in the second case).
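As a rough sketch of how one call can still have more than one local source, consider the hypothetical function below (``open_either`` and its parameter names are assumptions made for illustration). Each branch assigns a different parameter to ``name``, so two distinct local sources reach the same ``os.open`` call along different code paths.

```python
import os


def open_either(primary: str, fallback: str, use_fallback: bool) -> int:
    # Each branch binds `name` to a different parameter, so the single
    # os.open call below has two candidate local sources.
    if use_fallback:
        name = fallback
    else:
        name = primary
    return os.open(name, os.O_RDONLY)


# Minimal usage: both paths point at the null device so the call succeeds.
fd = open_either(os.devnull, os.devnull, use_fallback=False)
os.close(fd)
```

A query restricted to local sources would be expected to report both ``primary`` and ``fallback`` for this call, which matches the "more than one expression" cases described above.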
128
156
129
-
We can also make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:
157
+
We might want to make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:
130
158
131
159
.. code-block:: ql
132
160
@@ -140,7 +168,7 @@ We can also make the source more specific, for example a parameter to a function
140
168
DataFlow::localFlow(p, call.getArg(0))
141
169
select call, p
142
170
143
-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/3998032643497238063/>`__. Very few hits now; these could feasibly be inspected manually.
171
+
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/3998032643497238063/>`__. Very few results now; these could feasibly be inspected manually.
144
172
145
173
Using the exact name supplied via the parameter may be too strict. If we want to know if the parameter influences the file name, we can use taint tracking instead of data flow. This query finds calls to ``os.open`` where the filename is derived from a parameter:
146
174
@@ -156,7 +184,7 @@ Using the exact name supplied via the parameter may be too strict. If we want to
156
184
TaintTracking::localTaint(p, call.getArg(0))
157
185
select call, p
158
186
159
-
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2129957933670836953/>`__. Now we get more hits and in more projects.
187
+
➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2129957933670836953/>`__. Now we get more results and in more projects.
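The difference between the two parameter queries can be illustrated with a pair of hypothetical Python functions (the names ``open_exact`` and ``open_derived`` are assumptions for this sketch). It assumes that string concatenation is treated as a taint step but not as a value-preserving data-flow step, so the second function would only be expected to show up in the taint-tracking query.

```python
import os


def open_exact(path: str) -> int:
    # The parameter reaches os.open unchanged, so both the data-flow query
    # and the taint-tracking query would be expected to report this call.
    return os.open(path, os.O_RDONLY)


def open_derived(directory: str, name: str) -> int:
    # The file name is computed from the parameters rather than passed
    # through as-is; only the taint-tracking query is expected to match.
    return os.open(directory + name, os.O_RDONLY)


# Minimal usage: both calls end up opening the platform's null device.
fd_exact = open_exact(os.devnull)
fd_derived = open_derived("", os.devnull)
os.close(fd_exact)
os.close(fd_derived)
```

This is consistent with the taint-tracking query returning more results than the plain data-flow query.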