Skip to content

Commit ada0b37

Browse files
Merge pull request #20120 from joefarebrother/python-qual-unexpected-raise-special
Python: Modernize Unexpected Raise In Special Method query
2 parents a3aacfb + 5b0beb9 commit ada0b37

14 files changed

+285
-163
lines changed

python/ql/src/Functions/IncorrectRaiseInSpecialMethod.py

Lines changed: 0 additions & 16 deletions
This file was deleted.

python/ql/src/Functions/IncorrectRaiseInSpecialMethod.qhelp

Lines changed: 22 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -5,11 +5,11 @@
55
<overview>
66
<p>User-defined classes interact with the Python virtual machine via special methods (also called "magic methods").
77
For example, for a class to support addition it must implement the <code>__add__</code> and <code>__radd__</code> special methods.
8-
When the expression <code>a + b</code> is evaluated the Python virtual machine will call <code>type(a).__add__(a, b)</code> and if that
8+
When the expression <code>a + b</code> is evaluated, the Python virtual machine will call <code>type(a).__add__(a, b)</code>, and if that
99
is not implemented it will call <code>type(b).__radd__(b, a)</code>.</p>
1010
<p>
1111
Since the virtual machine calls these special methods for common expressions, users of the class will expect these operations to raise standard exceptions.
12-
For example, users would expect that the expression <code>a.b</code> might raise an <code>AttributeError</code>
12+
For example, users would expect that the expression <code>a.b</code> may raise an <code>AttributeError</code>
1313
if the object <code>a</code> does not have an attribute <code>b</code>.
1414
If a <code>KeyError</code> were raised instead,
1515
then this would be unexpected and may break code that expected an <code>AttributeError</code>, but not a <code>KeyError</code>.
@@ -20,50 +20,48 @@ Therefore, if a method is unable to perform the expected operation then its resp
2020
</p>
2121

2222
<ul>
23-
<li>Attribute access, <code>a.b</code>: Raise <code>AttributeError</code></li>
24-
<li>Arithmetic operations, <code>a + b</code>: Do not raise an exception, return <code>NotImplemented</code> instead.</li>
25-
<li>Indexing, <code>a[b]</code>: Raise <code>KeyError</code>.</li>
26-
<li>Hashing, <code>hash(a)</code>: Use <code>__hash__ = None</code> to indicate that an object is unhashable.</li>
27-
<li>Equality methods, <code>a != b</code>: Never raise an exception, always return <code>True</code> or <code>False</code>.</li>
28-
<li>Ordering comparison methods, <code>a &lt; b</code>: Raise a <code>TypeError</code> if the objects cannot be ordered.</li>
29-
<li>Most others: Ideally, do not implement the method at all, otherwise raise <code>TypeError</code> to indicate that the operation is unsupported.</li>
23+
<li>Attribute access, <code>a.b</code> (<code>__getattr__</code>): Raise <code>AttributeError</code>.</li>
24+
<li>Arithmetic operations, <code>a + b</code> (<code>__add__</code>): Do not raise an exception, return <code>NotImplemented</code> instead.</li>
25+
<li>Indexing, <code>a[b]</code> (<code>__getitem__</code>): Raise <code>KeyError</code> or <code>IndexError</code>.</li>
26+
<li>Hashing, <code>hash(a)</code> (<code>__hash__</code>): Should not raise an exception. Use <code>__hash__ = None</code> to indicate that an object is unhashable rather than raising an exception.</li>
27+
<li>Equality methods, <code>a == b</code> (<code>__eq__</code>): Never raise an exception, always return <code>True</code> or <code>False</code>.</li>
28+
<li>Ordering comparison methods, <code>a &lt; b</code> (<code>__lt__</code>): Raise a <code>TypeError</code> if the objects cannot be ordered.</li>
29+
<li>Most others: If the operation is never supported, the method often does not need to be implemented at all; otherwise a <code>TypeError</code> should be raised.</li>
3030
</ul>
3131

3232
</overview>
3333
<recommendation>
34-
<p>If the method is meant to be abstract, then declare it so using the <code>@abstractmethod</code> decorator.
35-
Otherwise, either remove the method or ensure that the method raises an exception of the correct type.
34+
<p>If the method always raises as exception, then if it is intended to be an abstract method, the <code>@abstractmethod</code> decorator should be used.
35+
Otherwise, ensure that the method raises an exception of the correct type, or remove the method if the operation does not need to be supported.
3636
</p>
3737

3838
</recommendation>
3939
<example>
4040

4141
<p>
42-
This example shows two unhashable classes. The first class is unhashable in a non-standard way which may cause maintenance problems.
43-
The second, corrected, class uses the standard idiom for unhashable classes.
42+
In the following example, the <code>__add__</code> method of <code>A</code> raises a <code>TypeError</code> if <code>other</code> is of the wrong type.
43+
However, it should return <code>NotImplemented</code> instead of rising an exception, to allow other classes to support adding to <code>A</code>.
44+
This is demonstrated in the class <code>B</code>.
4445
</p>
45-
<sample src="IncorrectRaiseInSpecialMethod.py" />
46+
<sample src="examples/IncorrectRaiseInSpecialMethod.py" />
4647
<p>
47-
In this example, the first class is implicitly abstract; the <code>__add__</code> method is unimplemented,
48-
presumably with the expectation that it will be implemented by sub-classes.
49-
The second class makes this explicit with an <code>@abstractmethod</code> decoration on the unimplemented <code>__add__</code> method.
48+
In the following example, the <code>__getitem__</code> method of <code>C</code> raises a <code>ValueError</code>, rather than a <code>KeyError</code> or <code>IndexError</code> as expected.
5049
</p>
51-
<sample src="IncorrectRaiseInSpecialMethod2.py" />
50+
<sample src="examples/IncorrectRaiseInSpecialMethod2.py" />
5251
<p>
53-
In this last example, the first class implements a collection backed by the file store.
54-
However, should an <code>IOError</code> be raised in the <code>__getitem__</code> it will propagate to the caller.
55-
The second class handles any <code>IOError</code> by reraising a <code>KeyError</code> which is the standard exception for
56-
the <code>__getitem__</code> method.
52+
In the following example, the class <code>__hash__</code> method of <code>D</code> raises <code>NotImplementedError</code>.
53+
This causes <code>D</code> to be incorrectly identified as hashable by <code>isinstance(obj, collections.abc.Hashable)</code>; so the correct
54+
way to make a class unhashable is to set <code>__hash__ = None</code>.
5755
</p>
5856

59-
<sample src="IncorrectRaiseInSpecialMethod3.py" />
57+
<sample src="examples/IncorrectRaiseInSpecialMethod3.py" />
6058

6159

6260
</example>
6361
<references>
6462

6563
<li>Python Language Reference: <a href="http://docs.python.org/dev/reference/datamodel.html#special-method-names">Special Method Names</a>.</li>
66-
<li>Python Library Reference: <a href="https://docs.python.org/2/library/exceptions.html">Exceptions</a>.</li>
64+
<li>Python Library Reference: <a href="https://docs.python.org/3/library/exceptions.html">Exceptions</a>.</li>
6765

6866

6967

python/ql/src/Functions/IncorrectRaiseInSpecialMethod.ql

Lines changed: 151 additions & 77 deletions
Original file line numberDiff line numberDiff line change
@@ -7,114 +7,188 @@
77
* error-handling
88
* @problem.severity recommendation
99
* @sub-severity high
10-
* @precision very-high
10+
* @precision high
1111
* @id py/unexpected-raise-in-special-method
1212
*/
1313

1414
import python
15+
import semmle.python.ApiGraphs
16+
import semmle.python.dataflow.new.internal.DataFlowDispatch
1517

16-
private predicate attribute_method(string name) {
17-
name = "__getattribute__" or name = "__getattr__" or name = "__setattr__"
18+
/** Holds if `name` is the name of a special method for attribute access such as `a.b`, that should raise an `AttributeError`. */
19+
private predicate attributeMethod(string name) {
20+
name = ["__getattribute__", "__getattr__", "__delattr__"] // __setattr__ excluded as it makes sense to raise different kinds of errors based on the `value` parameter
1821
}
1922

20-
private predicate indexing_method(string name) {
21-
name = "__getitem__" or name = "__setitem__" or name = "__delitem__"
23+
/** Holds if `name` is the name of a special method for indexing operations such as `a[b]`, that should raise a `LookupError`. */
24+
private predicate indexingMethod(string name) {
25+
name = ["__getitem__", "__delitem__"] // __setitem__ excluded as it makes sense to raise different kinds of errors based on the `value` parameter
2226
}
2327

24-
private predicate arithmetic_method(string name) {
25-
name in [
26-
"__add__", "__sub__", "__or__", "__xor__", "__rshift__", "__pow__", "__mul__", "__neg__",
27-
"__radd__", "__rsub__", "__rdiv__", "__rfloordiv__", "__div__", "__rdiv__", "__rlshift__",
28-
"__rand__", "__ror__", "__rxor__", "__rrshift__", "__rpow__", "__rmul__", "__truediv__",
29-
"__rtruediv__", "__pos__", "__iadd__", "__isub__", "__idiv__", "__ifloordiv__", "__idiv__",
30-
"__ilshift__", "__iand__", "__ior__", "__ixor__", "__irshift__", "__abs__", "__ipow__",
31-
"__imul__", "__itruediv__", "__floordiv__", "__div__", "__divmod__", "__lshift__", "__and__"
28+
/** Holds if `name` is the name of a special method for arithmetic operations. */
29+
private predicate arithmeticMethod(string name) {
30+
name =
31+
[
32+
"__add__", "__sub__", "__and__", "__or__", "__xor__", "__lshift__", "__rshift__", "__pow__",
33+
"__mul__", "__div__", "__divmod__", "__truediv__", "__floordiv__", "__matmul__", "__radd__",
34+
"__rsub__", "__rand__", "__ror__", "__rxor__", "__rlshift__", "__rrshift__", "__rpow__",
35+
"__rmul__", "__rdiv__", "__rdivmod__", "__rtruediv__", "__rfloordiv__", "__rmatmul__",
36+
"__iadd__", "__isub__", "__iand__", "__ior__", "__ixor__", "__ilshift__", "__irshift__",
37+
"__ipow__", "__imul__", "__idiv__", "__idivmod__", "__itruediv__", "__ifloordiv__",
38+
"__imatmul__", "__pos__", "__neg__", "__abs__", "__invert__",
3239
]
3340
}
3441

35-
private predicate ordering_method(string name) {
36-
name = "__lt__"
37-
or
38-
name = "__le__"
39-
or
40-
name = "__gt__"
41-
or
42-
name = "__ge__"
43-
or
44-
name = "__cmp__" and major_version() = 2
42+
/** Holds if `name is the name of a special method for ordering operations such as `a < b`. */
43+
private predicate orderingMethod(string name) {
44+
name =
45+
[
46+
"__lt__",
47+
"__le__",
48+
"__gt__",
49+
"__ge__",
50+
]
4551
}
4652

47-
private predicate cast_method(string name) {
48-
name = "__nonzero__" and major_version() = 2
49-
or
50-
name = "__int__"
51-
or
52-
name = "__float__"
53-
or
54-
name = "__long__"
55-
or
56-
name = "__trunc__"
57-
or
58-
name = "__complex__"
53+
/** Holds if `name` is the name of a special method for casting an object to a numeric type, such as `int(x)` */
54+
private predicate castMethod(string name) {
55+
name =
56+
[
57+
"__int__",
58+
"__float__",
59+
"__index__",
60+
"__trunc__",
61+
"__complex__"
62+
] // __bool__ excluded as it makes sense to allow it to always raise
5963
}
6064

61-
predicate correct_raise(string name, ClassObject ex) {
62-
ex.getAnImproperSuperType() = theTypeErrorType() and
65+
/** Holds if we allow a special method named `name` to raise `exec` as an exception. */
66+
predicate correctRaise(string name, Expr exec) {
67+
execIsOfType(exec, "TypeError") and
6368
(
64-
name = "__copy__" or
65-
name = "__deepcopy__" or
66-
name = "__call__" or
67-
indexing_method(name) or
68-
attribute_method(name)
69+
indexingMethod(name) or
70+
attributeMethod(name) or
71+
// Allow add methods to raise a TypeError, as they can be used for sequence concatenation as well as arithmetic
72+
name = ["__add__", "__iadd__", "__radd__"]
6973
)
7074
or
71-
preferred_raise(name, ex)
72-
or
73-
preferred_raise(name, ex.getASuperType())
75+
exists(string execName |
76+
preferredRaise(name, execName, _) and
77+
execIsOfType(exec, execName)
78+
)
7479
}
7580

76-
predicate preferred_raise(string name, ClassObject ex) {
77-
attribute_method(name) and ex = theAttributeErrorType()
78-
or
79-
indexing_method(name) and ex = Object::builtin("LookupError")
80-
or
81-
ordering_method(name) and ex = theTypeErrorType()
82-
or
83-
arithmetic_method(name) and ex = Object::builtin("ArithmeticError")
84-
or
85-
name = "__bool__" and ex = theTypeErrorType()
81+
/** Holds if it is preferred for `name` to raise exceptions of type `execName`. `message` is the alert message. */
82+
predicate preferredRaise(string name, string execName, string message) {
83+
attributeMethod(name) and
84+
execName = "AttributeError" and
85+
message = "should raise an AttributeError instead."
86+
or
87+
indexingMethod(name) and
88+
execName = "LookupError" and
89+
message = "should raise a LookupError (KeyError or IndexError) instead."
90+
or
91+
orderingMethod(name) and
92+
execName = "TypeError" and
93+
message = "should raise a TypeError or return NotImplemented instead."
94+
or
95+
arithmeticMethod(name) and
96+
execName = "ArithmeticError" and
97+
message = "should raise an ArithmeticError or return NotImplemented instead."
98+
or
99+
name = "__bool__" and
100+
execName = "TypeError" and
101+
message = "should raise a TypeError instead."
86102
}
87103

88-
predicate no_need_to_raise(string name, string message) {
89-
name = "__hash__" and message = "use __hash__ = None instead"
90-
or
91-
cast_method(name) and message = "there is no need to implement the method at all."
104+
/** Holds if `exec` is an exception object of the type named `execName`. */
105+
predicate execIsOfType(Expr exec, string execName) {
106+
// Might make sense to have execName be an IPA type here. Or part of a more general API modeling builtin/stdlib subclass relations.
107+
exists(string subclass |
108+
execName = "TypeError" and
109+
subclass = "TypeError"
110+
or
111+
execName = "LookupError" and
112+
subclass = ["LookupError", "KeyError", "IndexError"]
113+
or
114+
execName = "ArithmeticError" and
115+
subclass = ["ArithmeticError", "FloatingPointError", "OverflowError", "ZeroDivisionError"]
116+
or
117+
execName = "AttributeError" and
118+
subclass = "AttributeError"
119+
|
120+
exec = API::builtin(subclass).getACall().asExpr()
121+
or
122+
exec = API::builtin(subclass).getASubclass().getACall().asExpr()
123+
)
124+
}
125+
126+
/**
127+
* Holds if `meth` need not be implemented if it always raises. `message` is the alert message, and `allowNotImplemented` is true
128+
* if we still allow the method to always raise `NotImplementedError`.
129+
*/
130+
predicate noNeedToAlwaysRaise(Function meth, string message, boolean allowNotImplemented) {
131+
meth.getName() = "__hash__" and
132+
message = "use __hash__ = None instead." and
133+
allowNotImplemented = false
134+
or
135+
castMethod(meth.getName()) and
136+
message = "this method does not need to be implemented." and
137+
allowNotImplemented = true and
138+
// Allow an always raising cast method if it's overriding other behavior
139+
not exists(Function overridden |
140+
overridden.getName() = meth.getName() and
141+
overridden.getScope() = getADirectSuperclass+(meth.getScope()) and
142+
not alwaysRaises(overridden, _)
143+
)
144+
}
145+
146+
/** Holds if `func` has a decorator likely marking it as an abstract method. */
147+
predicate isAbstract(Function func) { func.getADecorator().(Name).getId().matches("%abstract%") }
148+
149+
/** Holds if `f` always raises the exception `exec`. */
150+
predicate alwaysRaises(Function f, Expr exec) {
151+
directlyRaises(f, exec) and
152+
strictcount(Expr e | directlyRaises(f, e)) = 1 and
153+
not exists(f.getANormalExit())
92154
}
93155

94-
predicate is_abstract(FunctionObject func) {
95-
func.getFunction().getADecorator().(Name).getId().matches("%abstract%")
156+
/** Holds if `f` directly raises `exec` using a `raise` statement. */
157+
predicate directlyRaises(Function f, Expr exec) {
158+
exists(Raise r |
159+
r.getScope() = f and
160+
exec = r.getException() and
161+
exec instanceof Call
162+
)
96163
}
97164

98-
predicate always_raises(FunctionObject f, ClassObject ex) {
99-
ex = f.getARaisedType() and
100-
strictcount(f.getARaisedType()) = 1 and
101-
not exists(f.getFunction().getANormalExit()) and
102-
/* raising StopIteration is equivalent to a return in a generator */
103-
not ex = theStopIterationType()
165+
/** Holds if `exec` is a `NotImplementedError`. */
166+
predicate isNotImplementedError(Expr exec) {
167+
exec = API::builtin("NotImplementedError").getACall().asExpr()
104168
}
105169

106-
from FunctionObject f, ClassObject cls, string message
170+
/** Gets the name of the builtin exception type `exec` constructs, if it can be determined. */
171+
string getExecName(Expr exec) { result = exec.(Call).getFunc().(Name).getId() }
172+
173+
from Function f, Expr exec, string message
107174
where
108-
f.getFunction().isSpecialMethod() and
109-
not is_abstract(f) and
110-
always_raises(f, cls) and
175+
f.isSpecialMethod() and
176+
not isAbstract(f) and
177+
directlyRaises(f, exec) and
111178
(
112-
no_need_to_raise(f.getName(), message) and not cls.getName() = "NotImplementedError"
179+
exists(boolean allowNotImplemented, string subMessage |
180+
alwaysRaises(f, exec) and
181+
noNeedToAlwaysRaise(f, subMessage, allowNotImplemented) and
182+
(allowNotImplemented = true implies not isNotImplementedError(exec)) and // don't alert if it's a NotImplementedError and that's ok
183+
message = "This method always raises $@ - " + subMessage
184+
)
113185
or
114-
not correct_raise(f.getName(), cls) and
115-
not cls.getName() = "NotImplementedError" and
116-
exists(ClassObject preferred | preferred_raise(f.getName(), preferred) |
117-
message = "raise " + preferred.getName() + " instead"
186+
not isNotImplementedError(exec) and
187+
not correctRaise(f.getName(), exec) and
188+
exists(string subMessage | preferredRaise(f.getName(), _, subMessage) |
189+
if alwaysRaises(f, exec)
190+
then message = "This method always raises $@ - " + subMessage
191+
else message = "This method raises $@ - " + subMessage
118192
)
119193
)
120-
select f, "Function always raises $@; " + message, cls, cls.toString()
194+
select f, message, exec, getExecName(exec)

python/ql/src/Functions/IncorrectRaiseInSpecialMethod2.py

Lines changed: 0 additions & 15 deletions
This file was deleted.

0 commit comments

Comments
 (0)