Skip to content

Commit 0e0b18c

Browse files
committed
Python: Adjust comment based on review.
1 parent d18c160 commit 0e0b18c

File tree

1 file changed

+21
-15
lines changed

1 file changed

+21
-15
lines changed

python/ql/src/semmle/python/dataflow/new/internal/DataFlowPrivate.qll

Lines changed: 21 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1059,30 +1059,36 @@ predicate subscriptReadStep(CfgNode nodeFrom, Content c, CfgNode nodeTo) {
10591059
* (a, [b, *c]) = ("a", ["b", SOURCE]) # RHS has content `TupleElementContent(1); ListElementContent`
10601060
* ```
10611061
* where `a` should not receive content, but `b` and `c` should. `c` will be `[SOURCE]` so
1062-
* should have the content converted and transferred, while `b` should read it.
1062+
* should have the content transferred, while `b` should read it.
10631063
*
1064-
* The strategy for converting content type is to break the transfer up into a read step
1065-
* and a store step, together creating a converting transfer step.
1066-
* For this we need a synthetic node in the middle, which we call `TIterableElement(receiver)`.
1067-
* It is associated with the receiver of the transfer, because we know the receiver type (tuple) from the syntax.
1068-
* Since we sometimes need a converting read step (in the example above, `[b, *c]` reads the content
1069-
* `ListElementContent` but should have content `TupleElementContent(0)` and `TupleElementContent(0)`),
1070-
* we actually need a second synthetic node. A converting read step is a read step followed by a
1071-
* converting transfer.
1064+
* To transfer content from RHS to the elements of the LHS in the expression `sequence = iterable`,
1065+
* we use two synthetic nodes:
10721066
*
1073-
* We can have a uniform treatment by always having two synthetic nodes and so we can view it as
1074-
* two stages of the same node. So we read into (or transfer to) `TIterableSequence(receiver)`,
1075-
* from which we take a read step to `TIterableElement(receiver)` and then a store step to `receiver`.
1067+
* - `TIterableSequence(sequence)` which captures the content-modeling the entire `sequence` will have
1068+
* (essentially just a copy of the content-modeling the RHS has)
10761069
*
1077-
* In order to preserve precise content, we also take a flow step from `TIterableSequence(receiver)`
1078-
* directly to `receiver`.
1070+
* - `TIterableElement(sequence)` which captures the content-modeling that will be assigned to an element.
1071+
* Note that an empty access path means that the value we are tracking flows directly to the element.
1072+
*
1073+
*
1074+
* The `TIterableSequence(sequence)` is at this point superflous but becomes useful when handling recursive
1075+
* structures in the LHS, where `sequence` is some internal sequence node. We can have a uniform treatment
1076+
* by always having these two synthetic nodes. So we transfer to (or, in the recursive case, read into)
1077+
* `TIterableSequence(sequence)`, from which we take a read step to `TIterableElement(sequence)` and then a
1078+
* store step to `sequence`.
1079+
*
1080+
* This allows the unknown content from the RHS to be read into `TIterableElement(sequence)` and tuple content
1081+
* to then be stored into `sequence`. If the content is already tuple content, this inderection creates crosstalk
1082+
* between indices. Therefore, tuple content is never read into `TIterableElement(sequence)`; it is instead
1083+
* transferred directly from `TIterableSequence(sequence)` to `sequence` via a flow step. Such a flow step will
1084+
* also transfer other content, but only tuple content is further read from `sequence` into its elements.
10791085
*
10801086
* The strategy is then via several read-, store-, and flow steps:
10811087
* 1. [Flow] Content is transferred from `iterable` to `TIterableSequence(sequence)` via a
10821088
* flow step. From here, everything happens on the LHS.
10831089
*
10841090
* 2. [Flow] Content is transferred from `TIterableSequence(sequence)` to `sequence` via a
1085-
* flow step.
1091+
* flow step. (Here only tuple content is relevant.)
10861092
*
10871093
* 3. [Read] Content is read from `TIterableSequence(sequence)` into `TIterableElement(sequence)`.
10881094
* As `sequence` is modeled as a tuple, we will not read tuple content as that would allow

0 commit comments

Comments
 (0)