@@ -1151,6 +1151,40 @@ predicate normalCallArg(CallNode call, Node arg, ArgumentPosition apos) {
  *
  * Note: If `Bar.meth` and `Foo.meth` resolves to the same function, we will end up
  * sending both `self` arguments to that function, which is by definition the right thing to do.
+ *
+ * ### Bound methods
+ *
+ * For bound methods, such as `bm = x.m; bm()`, it is unclear whether we should
+ * still use the object in the attribute lookup (`x.m`) as the self argument in the
+ * call (`bm()`). We currently do this, but there might also be cases where we don't
+ * want to do this.
+ *
+ * In the example below, we want to clear taint from the list before it reaches the
+ * sink, but because there is no use of `l` in the `clear()` call, we currently
+ * have no way to achieve this. (Note that this is a contrived example.)
+ *
+ * ```py
+ * l = list()
+ * clear = l.clear
+ * l.append(tainted)
+ * clear()
+ * sink(l)
+ * ```
+ *
+ * To make matters worse, bound methods have a `__self__` attribute that refers to
+ * the object the method is bound to, so we can rewrite the code as:
+ *
+ * ```py
+ * l = list()
+ * clear = l.clear
+ * clear.__self__.append(tainted)
+ * clear()
+ * sink(l)
+ * ```
+ *
+ * One idea to solve this is to track the object in a synthetic data-flow node every
+ * time the bound method is used, so that the `clear()` call would essentially be
+ * translated into `l.clear()`, and we would still have use-use flow.
  */
 cached
 predicate getCallArg(CallNode call, Function target, CallType type, Node arg, ArgumentPosition apos) {
@@ -1160,16 +1194,24 @@ predicate getCallArg(CallNode call, Function target, CallType type, Node arg, Ar
     type instanceof CallTypePlainFunction and
     normalCallArg(call, arg, apos)
     or
-    // self argument for normal method calls
+    // self argument for normal method calls -- see note above about bound methods
     type instanceof CallTypeNormalMethod and
     apos.isSelf() and
     resolveMethodCall(call, target, type, arg) and
-    // dataflow lib has requirement that arguments and calls are in same enclosing callable.
+    // The dataflow library requires that arguments and calls are in the same
+    // enclosing callable. This requirement would be broken if we used `my_obj` as
+    // the self argument in the `f()` call in the example below:
+    // ```py
+    // def call_func(f):
+    //     f()
+    //
+    // call_func(my_obj.some_method)
+    // ```
     exists(CfgNode cfgNode | cfgNode.getNode() = call |
       cfgNode.getEnclosingCallable() = arg.getEnclosingCallable()
     )
     or
-    // cls argument for classmethod calls
+    // cls argument for classmethod calls -- see note above about bound methods
     type instanceof CallTypeClassMethod and
     apos.isSelf() and
     resolveMethodCall(call, target, type, arg) and
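
The Python runtime behavior that the comments in this diff rely on can be checked directly. The following is only an illustrative sketch in plain Python, not CodeQL library code; the `tainted` value and the `MyObj` class are hypothetical stand-ins for the names used in the comments, and `call_func` returns the result of `f()` here purely so it can be inspected.

```python
# Illustrative sketch of Python bound-method behavior; not CodeQL library code.
# `tainted` and `MyObj` are hypothetical stand-ins for the names in the comments.

tainted = "tainted-value"

# 1. A bound method keeps a reference to its receiver in `__self__`.
l = list()
clear = l.clear
assert clear.__self__ is l

# Appending through `__self__` mutates the same list object as `l`.
clear.__self__.append(tainted)
contents_before_clear = list(l)

# Calling the bound method empties `l`, although `l` is not mentioned in the call.
clear()
contents_after_clear = list(l)


# 2. The receiver can cross a callable boundary inside the bound method.
class MyObj:
    def some_method(self):
        return self


def call_func(f):
    # The call `f()` lives in `call_func`; the receiver was created outside it.
    return f()


my_obj = MyObj()
receiver_seen_in_call = call_func(my_obj.some_method)
```

In the first part, the list is emptied by a call that has no syntactic use of `l`, which is the situation the "Bound methods" note describes. In the second part, the receiver of `f()` is `my_obj`, an object from a different enclosing callable than the call itself, which is the case the enclosing-callable check excludes.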