Commit ef39968

Python: ORM: Add data-flow plumbing for ORM modeling
The idea is to model flow as `save ==> synthetic` and `synthetic ==> load`, so we don't need to do CP between save/load. Having a synthetic node in the middle also allows a limited amount of the field-flow that we could do with real flow-summary support.
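
To make the `save ==> synthetic ==> load` idea concrete, here is a minimal sketch in plain Python. It is not CodeQL and not the code from this commit. It reads "CP" as the cross product of save sites and load sites, which is an assumption, and the node names (`save_0`, `load_0`, the `[orm-model] Person` label) are made up for illustration.

```python
from itertools import product


def direct_edges(saves, loads):
    """Connect every save site to every load site (the assumed cross product)."""
    return set(product(saves, loads))


def synthetic_edges(saves, loads, synthetic="[orm-model] Person"):
    """Route all flow through a single synthetic node standing for the model."""
    return {(s, synthetic) for s in saves} | {(synthetic, l) for l in loads}


def reachable(edges, start):
    """Return every node reachable from `start` by following edges."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


# Hypothetical save and load sites for one ORM model.
saves = [f"save_{i}" for i in range(4)]
loads = [f"load_{i}" for i in range(5)]

direct = direct_edges(saves, loads)        # 4 * 5 edges
via_node = synthetic_edges(saves, loads)   # 4 + 5 edges
```

Under these assumptions both graphs let every save site reach every load site, but the synthetic-node version needs a number of edges that grows with the sum of the two counts rather than their product.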
1 parent d3f07cd commit ef39968

3 files changed: +94 -2 lines changed

python/ql/lib/semmle/python/dataflow/new/internal/DataFlowPrivate.qll

Lines changed: 79 additions & 0 deletions
@@ -935,6 +935,24 @@ string ppReprType(DataFlowType t) { none() }
  * taken into account.
  */
 predicate jumpStep(Node nodeFrom, Node nodeTo) {
+  jumpStepSharedWithTypeTracker(nodeFrom, nodeTo)
+  or
+  jumpStepNotSharedWithTypeTracker(nodeFrom, nodeTo)
+}
+
+/**
+ * Set of jumpSteps that are shared with type-tracker implementation.
+ *
+ * For ORM modeling we want to add jumpsteps to global dataflow, but since these are
+ * based on type-trackers, it's important that these new ORM jumsteps are not used in
+ * the type-trackers as well, as that would make evaluation of type-tracking recursive
+ * with the new jumpsteps.
+ *
+ * Holds if `pred` can flow to `succ`, by jumping from one callable to
+ * another. Additional steps specified by the configuration are *not*
+ * taken into account.
+ */
+predicate jumpStepSharedWithTypeTracker(Node nodeFrom, Node nodeTo) {
   runtimeJumpStep(nodeFrom, nodeTo)
   or
   // Read of module attribute:
@@ -948,6 +966,22 @@ predicate jumpStep(Node nodeFrom, Node nodeTo) {
   defaultValueFlowStep(nodeFrom, nodeTo)
 }

+/**
+ * Set of jumpSteps that are NOT shared with type-tracker implementation.
+ *
+ * For ORM modeling we want to add jumpsteps to global dataflow, but since these are
+ * based on type-trackers, it's important that these new ORM jumsteps are not used in
+ * the type-trackers as well, as that would make evaluation of type-tracking recursive
+ * with the new jumpsteps.
+ *
+ * Holds if `pred` can flow to `succ`, by jumping from one callable to
+ * another. Additional steps specified by the configuration are *not*
+ * taken into account.
+ */
+predicate jumpStepNotSharedWithTypeTracker(Node nodeFrom, Node nodeTo) {
+  any(Orm::AdditionalOrmSteps es).jumpStep(nodeFrom, nodeTo)
+}
+
 /**
  * Holds if the module `m` defines a name `name` by assigning `defn` to it. This is an
  * overapproximation, as `name` may not in fact be exported (e.g. by defining an `__all__` that does
@@ -991,6 +1025,51 @@ predicate storeStep(Node nodeFrom, Content c, Node nodeTo) {
   kwOverflowStoreStep(nodeFrom, c, nodeTo)
   or
   matchStoreStep(nodeFrom, c, nodeTo)
+  or
+  any(Orm::AdditionalOrmSteps es).storeStep(nodeFrom, c, nodeTo)
+}
+
+/**
+ * INTERNAL: Do not use.
+ *
+ * Provides classes for modeling data-flow through ORM models saved in a DB.
+ */
+module Orm {
+  /**
+   * INTERNAL: Do not use.
+   *
+   * A unit class for adding additional data-flow steps for ORM models.
+   */
+  class AdditionalOrmSteps extends Unit {
+    /**
+     * Holds if data can flow from `nodeFrom` to `nodeTo` via an assignment to
+     * content `c`.
+     */
+    abstract predicate storeStep(Node nodeFrom, Content c, Node nodeTo);
+
+    /**
+     * Holds if `pred` can flow to `succ`, by jumping from one callable to
+     * another. Additional steps specified by the configuration are *not*
+     * taken into account.
+     */
+    abstract predicate jumpStep(Node nodeFrom, Node nodeTo);
+  }
+
+  /** A synthetic node representing the data for an ORM model saved in a DB. */
+  class SyntheticOrmModelNode extends Node, TSyntheticOrmModelNode {
+    Class cls;
+
+    SyntheticOrmModelNode() { this = TSyntheticOrmModelNode(cls) }
+
+    override string toString() { result = "[orm-model] " + cls.toString() }
+
+    override Scope getScope() { result = cls.getEnclosingScope() }
+
+    override Location getLocation() { result = cls.getLocation() }
+
+    /** Gets the class that defines this ORM model. */
+    Class getClass() { result = cls }
+  }
 }

 /** Data flows from an element of a list to the list. */

python/ql/lib/semmle/python/dataflow/new/internal/DataFlowPublic.qll

Lines changed: 14 additions & 1 deletion
@@ -87,7 +87,20 @@ newtype TNode =
   /**
    * A synthetic node representing element content in a star pattern.
    */
-  TStarPatternElementNode(MatchStarPattern target)
+  TStarPatternElementNode(MatchStarPattern target) or
+  /**
+   * INTERNAL: Do not use.
+   *
+   * A synthetic node representing the data for an ORM model saved in a DB.
+   */
+  // TODO: Limiting the classes here to the ones that are actually ORM models was
+  // non-trivial, since that logic is based on API::Node results, and trying to do this
+  // causes non-monotonic recursion, and makes the API graph evaluation recursive with
+  // data-flow, which might do bad things for performance.
+  //
+  // So for now we live with having these synthetic ORM nodes for _all_ classes, which
+  // is a bit wasteful, but we don't think it will hurt too much.
+  TSyntheticOrmModelNode(Class cls)

 /** Helper for `Node::getEnclosingCallable`. */
 private DataFlowCallable getCallableScope(Scope s) {

python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackerSpecific.qll

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ class TypeTrackingNode = DataFlowPublic::TypeTrackingNode;

 predicate simpleLocalFlowStep = DataFlowPrivate::simpleLocalFlowStep/2;

-predicate jumpStep = DataFlowPrivate::jumpStep/2;
+predicate jumpStep = DataFlowPrivate::jumpStepSharedWithTypeTracker/2;

 /**
  * Gets the name of a possible piece of content. For Python, this is currently only attribute names,
