Skip to content

Commit 43f7eed

Browse files
authored
Merge pull request #6182 from haby0/python/LogInjection
Python: CWE-117 Log injection
2 parents c007c94 + d52f95d commit 43f7eed

File tree

12 files changed

+409
-0
lines changed

12 files changed

+409
-0
lines changed
Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
2+
<!DOCTYPE qhelp PUBLIC
3+
"-//Semmle//qhelp//EN"
4+
"qhelp.dtd">
5+
<qhelp>
6+
7+
<overview>
8+
9+
<p>If unsanitized user input is written to a log entry, a malicious user may be able to forge new log entries.</p>
10+
11+
<p>Forgery can occur if a user provides some input creating the appearance of multiple
12+
log entries. This can include unescaped new-line characters, or HTML or other markup.</p>
13+
</overview>
14+
15+
<recommendation>
16+
<p>
17+
User input should be suitably sanitized before it is logged.
18+
</p>
19+
<p>
20+
If the log entries are plain text then line breaks should be removed from user input, using for example
21+
<code>replace(old, new)</code> or similar. Care should also be taken that user input is clearly marked
22+
in log entries, and that a malicious user cannot cause confusion in other ways.
23+
</p>
24+
<p>
25+
For log entries that will be displayed in HTML, user input should be HTML encoded before being logged, to prevent forgery and
26+
other forms of HTML injection.
27+
</p>
28+
</recommendation>
29+
30+
<example>
31+
<p>
32+
In the example, the name provided by the user is recorded using the log output function (<code>logging.info</code> or <code>app.logger.info</code>, etc.).
33+
In these four cases, the name provided by the user is not provided The processing is recorded. If a malicious user provides <code>Guest%0D%0AUser name: Admin</code>
34+
as a parameter, the log entry will be divided into two lines, the first line is <code>User name: Guest</code> code>, the second line is <code>User name: Admin</code>.
35+
</p>
36+
<sample src="LogInjectionBad.py" />
37+
38+
<p>
39+
In a good example, the program uses the <code>replace</code> function to provide parameter processing to the user, and replace <code>\r\n</code> and <code>\n</code>
40+
with empty characters. To a certain extent, the occurrence of log injection vulnerabilities is reduced.
41+
</p>
42+
43+
<sample src="LogInjectionGood.py" />
44+
</example>
45+
46+
<references>
47+
<li>OWASP: <a href="https://owasp.org/www-community/attacks/Log_Injection">Log Injection</a>.</li>
48+
</references>
49+
</qhelp>
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
/**
2+
* @name Log Injection
3+
* @description Building log entries from user-controlled data is vulnerable to
4+
* insertion of forged log entries by a malicious user.
5+
* @kind path-problem
6+
* @problem.severity error
7+
* @precision high
8+
* @id py/log-injection
9+
* @tags security
10+
* external/cwe/cwe-117
11+
*/
12+
13+
import python
14+
import experimental.semmle.python.security.injection.LogInjection
15+
import DataFlow::PathGraph
16+
17+
from LogInjectionFlowConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
18+
where config.hasFlowPath(source, sink)
19+
select sink.getNode(), source, sink, "$@ flows to log entry.", source.getNode(),
20+
"User-provided value"
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
#!/usr/bin/env python
2+
# -*- coding: UTF-8 -*-
3+
"""
4+
@Desc :Log Injection
5+
"""
6+
from flask import Flask
7+
from flask import request
8+
from django.utils.log import request_logger
9+
import logging
10+
11+
logging.basicConfig(level=logging.DEBUG)
12+
13+
app = Flask(__name__)
14+
15+
@app.route('/bad1')
16+
def bad1():
17+
name = request.args.get('name')
18+
app.logger.info('User name: ' + name) # Bad
19+
return 'bad1'
20+
21+
@app.route('/bad2')
22+
def bad2():
23+
name = request.args.get('name')
24+
logging.info('User name: ' + name) # Bad
25+
return 'bad2'
26+
27+
@app.route('/bad3')
28+
def bad3():
29+
name = request.args.get('name')
30+
request_logger.warn('User name: ' + name) # Bad
31+
return 'bad3'
32+
33+
@app.route('/bad4')
34+
def bad4():
35+
name = request.args.get('name')
36+
logtest = logging.getLogger('test')
37+
logtest.debug('User name: ' + name) # Bad
38+
return 'bad4'
39+
40+
if __name__ == '__main__':
41+
app.debug = True
42+
handler = logging.FileHandler('log')
43+
app.logger.addHandler(handler)
44+
app.run()
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
#!/usr/bin/env python
2+
# -*- coding: UTF-8 -*-
3+
"""
4+
@Desc :Log Injection
5+
"""
6+
from flask import Flask
7+
from flask import request
8+
import logging
9+
10+
logging.basicConfig(level=logging.DEBUG)
11+
12+
app = Flask(__name__)
13+
14+
@app.route('/good1')
15+
def good1():
16+
name = request.args.get('name')
17+
name = name.replace('\r\n','').replace('\n','')
18+
logging.info('User name: ' + name) # Good
19+
return 'good1'
20+
21+
if __name__ == '__main__':
22+
app.debug = True
23+
handler = logging.FileHandler('log')
24+
app.logger.addHandler(handler)
25+
app.run()

python/ql/src/experimental/semmle/python/Concepts.qll

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,36 @@ private import semmle.python.dataflow.new.RemoteFlowSources
1414
private import semmle.python.dataflow.new.TaintTracking
1515
private import experimental.semmle.python.Frameworks
1616

17+
/** Provides classes for modeling log related APIs. */
18+
module LogOutput {
19+
/**
20+
* A data flow node for log output.
21+
*
22+
* Extend this class to model new APIs. If you want to refine existing API models,
23+
* extend `LogOutput` instead.
24+
*/
25+
abstract class Range extends DataFlow::Node {
26+
/**
27+
* Get the parameter value of the log output function.
28+
*/
29+
abstract DataFlow::Node getAnInput();
30+
}
31+
}
32+
33+
/**
34+
* A data flow node for log output.
35+
*
36+
* Extend this class to refine existing API models. If you want to model new APIs,
37+
* extend `LogOutput::Range` instead.
38+
*/
39+
class LogOutput extends DataFlow::Node {
40+
LogOutput::Range range;
41+
42+
LogOutput() { this = range }
43+
44+
DataFlow::Node getAnInput() { result = range.getAnInput() }
45+
}
46+
1747
/** Provides classes for modeling Regular Expression-related APIs. */
1848
module RegexExecution {
1949
/**

python/ql/src/experimental/semmle/python/Frameworks.qll

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,3 +5,4 @@
55
private import experimental.semmle.python.frameworks.Stdlib
66
private import experimental.semmle.python.frameworks.LDAP
77
private import experimental.semmle.python.frameworks.NoSQL
8+
private import experimental.semmle.python.frameworks.Log
Lines changed: 118 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,118 @@
1+
/**
2+
* Provides classes modeling security-relevant aspects of the log libraries.
3+
*/
4+
5+
private import python
6+
private import semmle.python.dataflow.new.DataFlow
7+
private import semmle.python.dataflow.new.TaintTracking
8+
private import semmle.python.dataflow.new.RemoteFlowSources
9+
private import experimental.semmle.python.Concepts
10+
private import semmle.python.frameworks.Flask
11+
private import semmle.python.ApiGraphs
12+
13+
/**
14+
* Provides models for Python's log-related libraries.
15+
*/
16+
private module log {
17+
/**
18+
* Log output method list.
19+
*
20+
* See https://docs.python.org/3/library/logging.html#logger-objects
21+
*/
22+
private class LogOutputMethods extends string {
23+
LogOutputMethods() {
24+
this in ["info", "error", "warn", "warning", "debug", "critical", "exception", "log"]
25+
}
26+
}
27+
28+
/**
29+
* The class used to find the log output method of the `logging` module.
30+
*
31+
* See `LogOutputMethods`
32+
*/
33+
private class LoggingCall extends DataFlow::CallCfgNode, LogOutput::Range {
34+
LoggingCall() {
35+
this = API::moduleImport("logging").getMember(any(LogOutputMethods m)).getACall()
36+
}
37+
38+
override DataFlow::Node getAnInput() {
39+
this.getFunction().(DataFlow::AttrRead).getAttributeName() != "log" and
40+
result in [this.getArg(_), this.getArgByName(_)] // this includes the arg named "msg"
41+
or
42+
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "log" and
43+
result in [this.getArg(any(int i | i > 0)), this.getArgByName(any(string s | s != "level"))]
44+
}
45+
}
46+
47+
/**
48+
* The class used to find log output methods related to the `logging.getLogger` instance.
49+
*
50+
* See `LogOutputMethods`
51+
*/
52+
private class LoggerCall extends DataFlow::CallCfgNode, LogOutput::Range {
53+
LoggerCall() {
54+
this =
55+
API::moduleImport("logging")
56+
.getMember("getLogger")
57+
.getReturn()
58+
.getMember(any(LogOutputMethods m))
59+
.getACall()
60+
}
61+
62+
override DataFlow::Node getAnInput() {
63+
this.getFunction().(DataFlow::AttrRead).getAttributeName() != "log" and
64+
result in [this.getArg(_), this.getArgByName(_)] // this includes the arg named "msg"
65+
or
66+
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "log" and
67+
result in [this.getArg(any(int i | i > 0)), this.getArgByName(any(string s | s != "level"))]
68+
}
69+
}
70+
71+
/**
72+
* The class used to find the relevant log output method of the `flask.Flask.logger` instance (flask application).
73+
*
74+
* See `LogOutputMethods`
75+
*/
76+
private class FlaskLoggingCall extends DataFlow::CallCfgNode, LogOutput::Range {
77+
FlaskLoggingCall() {
78+
this =
79+
Flask::FlaskApp::instance()
80+
.getMember("logger")
81+
.getMember(any(LogOutputMethods m))
82+
.getACall()
83+
}
84+
85+
override DataFlow::Node getAnInput() {
86+
this.getFunction().(DataFlow::AttrRead).getAttributeName() != "log" and
87+
result in [this.getArg(_), this.getArgByName(_)] // this includes the arg named "msg"
88+
or
89+
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "log" and
90+
result in [this.getArg(any(int i | i > 0)), this.getArgByName(any(string s | s != "level"))]
91+
}
92+
}
93+
94+
/**
95+
* The class used to find the relevant log output method of the `django.utils.log.request_logger` instance (django application).
96+
*
97+
* See `LogOutputMethods`
98+
*/
99+
private class DjangoLoggingCall extends DataFlow::CallCfgNode, LogOutput::Range {
100+
DjangoLoggingCall() {
101+
this =
102+
API::moduleImport("django")
103+
.getMember("utils")
104+
.getMember("log")
105+
.getMember("request_logger")
106+
.getMember(any(LogOutputMethods m))
107+
.getACall()
108+
}
109+
110+
override DataFlow::Node getAnInput() {
111+
this.getFunction().(DataFlow::AttrRead).getAttributeName() != "log" and
112+
result in [this.getArg(_), this.getArgByName(_)] // this includes the arg named "msg"
113+
or
114+
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "log" and
115+
result in [this.getArg(any(int i | i > 0)), this.getArgByName(any(string s | s != "level"))]
116+
}
117+
}
118+
}
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
import python
2+
import semmle.python.Concepts
3+
import experimental.semmle.python.Concepts
4+
import semmle.python.dataflow.new.DataFlow
5+
import semmle.python.dataflow.new.TaintTracking
6+
import semmle.python.dataflow.new.RemoteFlowSources
7+
8+
/**
9+
* A taint-tracking configuration for tracking untrusted user input used in log entries.
10+
*/
11+
class LogInjectionFlowConfig extends TaintTracking::Configuration {
12+
LogInjectionFlowConfig() { this = "LogInjectionFlowConfig" }
13+
14+
override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }
15+
16+
override predicate isSink(DataFlow::Node sink) { sink = any(LogOutput logoutput).getAnInput() }
17+
18+
override predicate isSanitizer(DataFlow::Node node) {
19+
exists(CallNode call |
20+
node.asCfgNode() = call.getFunction().(AttrNode).getObject("replace") and
21+
call.getArg(0).getNode().(StrConst).getText() in ["\r\n", "\n"]
22+
)
23+
}
24+
}
Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
edges
2+
| LogInjectionBad.py:17:12:17:18 | ControlFlowNode for request | LogInjectionBad.py:17:12:17:23 | ControlFlowNode for Attribute |
3+
| LogInjectionBad.py:17:12:17:23 | ControlFlowNode for Attribute | LogInjectionBad.py:18:21:18:40 | ControlFlowNode for BinaryExpr |
4+
| LogInjectionBad.py:23:12:23:18 | ControlFlowNode for request | LogInjectionBad.py:23:12:23:23 | ControlFlowNode for Attribute |
5+
| LogInjectionBad.py:23:12:23:23 | ControlFlowNode for Attribute | LogInjectionBad.py:24:18:24:37 | ControlFlowNode for BinaryExpr |
6+
| LogInjectionBad.py:29:12:29:18 | ControlFlowNode for request | LogInjectionBad.py:29:12:29:23 | ControlFlowNode for Attribute |
7+
| LogInjectionBad.py:29:12:29:23 | ControlFlowNode for Attribute | LogInjectionBad.py:30:25:30:44 | ControlFlowNode for BinaryExpr |
8+
| LogInjectionBad.py:35:12:35:18 | ControlFlowNode for request | LogInjectionBad.py:35:12:35:23 | ControlFlowNode for Attribute |
9+
| LogInjectionBad.py:35:12:35:23 | ControlFlowNode for Attribute | LogInjectionBad.py:37:19:37:38 | ControlFlowNode for BinaryExpr |
10+
nodes
11+
| LogInjectionBad.py:17:12:17:18 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
12+
| LogInjectionBad.py:17:12:17:23 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
13+
| LogInjectionBad.py:18:21:18:40 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
14+
| LogInjectionBad.py:23:12:23:18 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
15+
| LogInjectionBad.py:23:12:23:23 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
16+
| LogInjectionBad.py:24:18:24:37 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
17+
| LogInjectionBad.py:29:12:29:18 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
18+
| LogInjectionBad.py:29:12:29:23 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
19+
| LogInjectionBad.py:30:25:30:44 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
20+
| LogInjectionBad.py:35:12:35:18 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
21+
| LogInjectionBad.py:35:12:35:23 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
22+
| LogInjectionBad.py:37:19:37:38 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
23+
subpaths
24+
#select
25+
| LogInjectionBad.py:18:21:18:40 | ControlFlowNode for BinaryExpr | LogInjectionBad.py:17:12:17:18 | ControlFlowNode for request | LogInjectionBad.py:18:21:18:40 | ControlFlowNode for BinaryExpr | $@ flows to log entry. | LogInjectionBad.py:17:12:17:18 | ControlFlowNode for request | User-provided value |
26+
| LogInjectionBad.py:24:18:24:37 | ControlFlowNode for BinaryExpr | LogInjectionBad.py:23:12:23:18 | ControlFlowNode for request | LogInjectionBad.py:24:18:24:37 | ControlFlowNode for BinaryExpr | $@ flows to log entry. | LogInjectionBad.py:23:12:23:18 | ControlFlowNode for request | User-provided value |
27+
| LogInjectionBad.py:30:25:30:44 | ControlFlowNode for BinaryExpr | LogInjectionBad.py:29:12:29:18 | ControlFlowNode for request | LogInjectionBad.py:30:25:30:44 | ControlFlowNode for BinaryExpr | $@ flows to log entry. | LogInjectionBad.py:29:12:29:18 | ControlFlowNode for request | User-provided value |
28+
| LogInjectionBad.py:37:19:37:38 | ControlFlowNode for BinaryExpr | LogInjectionBad.py:35:12:35:18 | ControlFlowNode for request | LogInjectionBad.py:37:19:37:38 | ControlFlowNode for BinaryExpr | $@ flows to log entry. | LogInjectionBad.py:35:12:35:18 | ControlFlowNode for request | User-provided value |
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
experimental/Security/CWE-117/LogInjection.ql

0 commit comments

Comments
 (0)