Skip to content

Commit 7cdf439

Browse files
committed
Python: Clean up basicStoreStep
Moves the `flowsTo` logic into the shared implementation, so that `TypeTrackingPrivate` only has to define the shape of immediate store steps. Also cleans up the documentation to talk a bit more about what `content` can represent, and what caveats there are.
1 parent 0e81fd2 commit 7cdf439

File tree

2 files changed

+47
-33
lines changed

2 files changed

+47
-33
lines changed

python/ql/src/experimental/typetracking/TypeTracker.qll

Lines changed: 43 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,18 @@
22

33
private import TypeTrackerPrivate
44

5-
/** Any string that may appear as the name of a piece of content. */
5+
/**
6+
* Any string that may appear as the name of a piece of content. This will usually include things like:
7+
* - Attribute names (in Python)
8+
* - Property names (in JavaScript)
9+
*
10+
* In general, this can also be used to model things like stores to specific list indices. To ensure
11+
* correctness, it is important that
12+
*
13+
* - different types of content do not have overlapping names, and
14+
* - the empty string `""` is not a valid piece of content, as it is used to indicate the absence of
15+
* content instead.
16+
*/
617
class ContentName extends string {
718
ContentName() { this = getPossibleContentName() }
819
}
@@ -70,14 +81,41 @@ module StepSummary {
7081
summary = ReturnStep()
7182
or
7283
exists(string content |
73-
basicStoreStep(nodeFrom, nodeTo, content) and
84+
localSourceStoreStep(nodeFrom, nodeTo, content) and
7485
summary = StoreStep(content)
7586
or
7687
basicLoadStep(nodeFrom, nodeTo, content) and summary = LoadStep(content)
7788
)
7889
}
79-
}
8090

91+
/**
92+
* Holds if `nodeFrom` is being written to the `content` content of the object in `nodeTo`.
93+
*
94+
* Note that `nodeTo` will always be a local source node that flows to the place where the content
95+
* is written in `basicStoreStep`. This may lead to the flow of information going "back in time"
96+
* from the point of view of the execution of the program.
97+
*
98+
* For instance, if we interpret attribute writes in Python as writing to content with the same
99+
* name as the attribute and consider the following snippet
100+
*
101+
* ```python
102+
* def foo(y):
103+
* x = Foo()
104+
* bar(x)
105+
* x.attr = y
106+
* baz(x)
107+
*
108+
* def bar(x):
109+
* z = x.attr
110+
* ```
111+
* for the attribute write `x.attr = y`, we will have `content` being the literal string `"attr"`,
112+
* `nodeFrom` will be `y`, and `nodeTo` will be the object `Foo()` created on the first line of the
113+
* function. This means we will track the fact that `x.attr` can have the type of `y` into the
114+
* assignment to `z` inside `bar`, even though this attribute write happens _after_ `bar` is called.
115+
*/
116+
predicate localSourceStoreStep(Node nodeFrom, LocalSourceNode nodeTo, string content) {
117+
exists(Node obj | nodeTo.flowsTo(obj) and basicStoreStep(nodeFrom, obj, content))
118+
}
81119
}
82120

83121
private newtype TTypeTracker = MkTypeTracker(Boolean hasCall, OptionalContentName content)
@@ -92,7 +130,7 @@ private newtype TTypeTracker = MkTypeTracker(Boolean hasCall, OptionalContentNam
92130
*
93131
* It is recommended that all uses of this type are written in the following form,
94132
* for tracking some type `myType`:
95-
* ```
133+
* ```ql
96134
* DataFlow::LocalSourceNode myType(DataFlow::TypeTracker t) {
97135
* t.start() and
98136
* result = < source of myType >
@@ -253,7 +291,7 @@ private newtype TTypeBackTracker = MkTypeBackTracker(Boolean hasReturn, Optional
253291
* It is recommended that all uses of this type are written in the following form,
254292
* for back-tracking some callback type `myCallback`:
255293
*
256-
* ```
294+
* ```ql
257295
* DataFlow::LocalSourceNode myCallback(DataFlow::TypeBackTracker t) {
258296
* t.start() and
259297
* result = (< some API call >).getArgument(< n >).getALocalSource()

python/ql/src/experimental/typetracking/TypeTrackerPrivate.qll

Lines changed: 4 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -13,10 +13,8 @@ predicate typePreservingStep(Node nodeFrom, Node nodeTo) {
1313
}
1414

1515
/**
16-
* Gets the name of a possible piece of content. This will usually include things like
17-
*
18-
* - Attribute names (in Python)
19-
* - Property names (in JavaScript)
16+
* Gets the name of a possible piece of content. For Python, this is currently only attribute names,
17+
* using the name of the attribute for the corresponding content.
2018
*/
2119
string getPossibleContentName() { result = any(DataFlowPublic::AttrRef a).getAttributeName() }
2220

@@ -54,34 +52,12 @@ predicate returnStep(DataFlowPrivate::ReturnNode nodeFrom, Node nodeTo) {
5452

5553
/**
5654
* Holds if `nodeFrom` is being written to the `content` content of the object in `nodeTo`.
57-
*
58-
* Note that the choice of `nodeTo` does not have to make sense "chronologically".
59-
* All we care about is whether the `content` content of `nodeTo` can have a specific type,
60-
* and the assumption is that if a specific type appears here, then any access of that
61-
* particular content can yield something of that particular type.
62-
*
63-
* Thus, in an example such as
64-
*
65-
* ```python
66-
* def foo(y):
67-
* x = Foo()
68-
* bar(x)
69-
* x.content = y
70-
* baz(x)
71-
*
72-
* def bar(x):
73-
* z = x.content
74-
* ```
75-
* for the content write `x.content = y`, we will have `content` being the literal string `"content"`,
76-
* `nodeFrom` will be `y`, and `nodeTo` will be the object `Foo()` created on the first line of the
77-
* function. This means we will track the fact that `x.content` can have the type of `y` into the
78-
* assignment to `z` inside `bar`, even though this content write happens _after_ `bar` is called.
7955
*/
80-
predicate basicStoreStep(Node nodeFrom, LocalSourceNode nodeTo, string content) {
56+
predicate basicStoreStep(Node nodeFrom, Node nodeTo, string content) {
8157
exists(DataFlowPublic::AttrWrite a |
8258
a.mayHaveAttributeName(content) and
8359
nodeFrom = a.getValue() and
84-
nodeTo.flowsTo(a.getObject())
60+
nodeTo = a.getObject()
8561
)
8662
}
8763

0 commit comments

Comments
 (0)