Commit bdc4808

Python: MaD summary models
Two of the generated summaries have been excluded:

- ["re", "Member[split]", "Argument[0,pattern:]", "ReturnValue", "taint"]
  From the documentation, it is not clear why pattern should figure in the return value: it is the part denoting the split points, so all of its occurrences are filtered out.
  From the implementation:
  split function: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L199
  _compile function being called by split: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L280
  we see that if the pattern is already a compiled `Pattern`, it is returned directly from _compile and could thus be part of the return value from split. An attacker can probably not arrange this, so the summary would be an FP in practice.

- ["urllib2", "Member[unquote]", "Argument[0,string:]", "ReturnValue", "taint"]
  urllib2 seems to exist only in Python 2 (e.g. https://docs.python.org/2.7/library/urllib2.html), and I cannot locate the function unquote.
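The reasoning about `re.split` can be checked from Python. The sketch below is illustrative only and is not part of the commit: it goes through the public `re.compile` rather than the private `_compile`, relies on CPython handing an already compiled `Pattern` back unchanged, and uses made-up variable names.

```python
import re

# A compiled Pattern passed back into re.compile is returned unchanged
# (CPython behaviour; the private _compile helper does the check).
compiled = re.compile(r",")
same_object = re.compile(compiled) is compiled

# The pieces returned by split come from the string being split,
# not from the pattern that marks the split points.
parts = re.split(compiled, "a,b,c")

print(same_object)
print(parts)
```

With no capturing group in the pattern, the separator itself does not appear among the returned pieces, which is consistent with dropping the generated `pattern` to `ReturnValue` summary.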
1 parent bc55117 commit bdc4808

1 file changed: +104 -1 lines changed

python/ql/lib/semmle/python/frameworks/Stdlib/StdLib.model.yml

Lines changed: 104 additions & 1 deletion
@@ -7,19 +7,107 @@ extensions:
   - addsTo:
       pack: codeql/python-all
       extensible: sinkModel
-    data: []
+    data:
+      - ["subprocess.Popen!","Subclass.Call.Argument[0,args:]", "log-injection"]
+      - ["zipfile.ZipFile","Member[extractall].Argument[0,path:]", "path-injection"]
 
   - addsTo:
       pack: codeql/python-all
       extensible: summaryModel
     data:
+      # See
+      # - https://docs.python.org/3/glossary.html#term-mapping
+      # - https://docs.python.org/3/library/stdtypes.html#dict.get
+      - ["_collections_abc.Mapping", "Member[get]", "Argument[1,default:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/argparse.html#argparse.ArgumentParser
+      - ["argparse.ArgumentParser", "Member[_parse_known_args,_read_args_from_files]", "Argument[0,arg_strings:]", "ReturnValue", "taint"]
+      - ["argparse.ArgumentParser", "Member[parse_args,parse_known_args]", "Argument[0,args:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/cgi.html#higher-level-interface
+      - ["cgi.FieldStorage", "Member[getfirst,getlist,getvalue]", "Argument[self]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack
+      - ["contextlib.ExitStack", "Member[enter_context]", "Argument[0,cm:]", "ReturnValue", "taint"]
       # See https://docs.python.org/3/library/copy.html#copy.deepcopy
       - ["copy", "Member[copy,deepcopy]", "Argument[0,x:]", "ReturnValue", "value"]
+      # See
+      # - https://docs.python.org/3/library/ctypes.html#ctypes.create_string_buffer
+      # - https://docs.python.org/3/library/ctypes.html#ctypes.create_unicode_buffer
+      - ["ctypes", "Member[create_string_buffer,create_unicode_buffer]", "Argument[0,init:,init_or_size:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3.11/distutils/apiref.html#distutils.util.change_root
+      - ["distutils", "Member[util].Member[change_root]", "Argument[0,new_root:,1,pathname:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/email.header.html#email.header.Header
+      - ["email.header.Header!", "Subclass.Call", "Argument[0,s:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/email.utils.html#email.utils.parseaddr
+      - ["email", "Member[utils].Member[parseaddr]", "Argument[0,addr:]", "ReturnValue", "taint"]
+      - ["email", "Member[utils].Member[parseaddr]", "Argument[0,addr:]", "ReturnValue.TupleElement[0,1]", "taint"]
       # See See https://docs.python.org/3/library/fnmatch.html#fnmatch.filter
       - ["fnmatch", "Member[filter]", "Argument[0,names:].ListElement", "ReturnValue.ListElement", "value"]
       - ["fnmatch", "Member[filter]", "Argument[0,names:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/getopt.html#getopt.getopt
+      - ["getopt", "Member[getopt]", "Argument[0,args:]", "ReturnValue.TupleElement[1]", "taint"]
+      - ["getopt", "Member[getopt]", "Argument[1,shortopts:,2,longopts:]", "ReturnValue.TupleElement[0].ListElement.TupleElement[0]", "taint"]
+      # See https://docs.python.org/3/library/gettext.html#gettext.gettext
+      - ["gettext", "Member[gettext]", "Argument[0,message:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/gzip.html#gzip.GzipFile
+      - ["gzip.GzipFile!", "Subclass.Call", "Argument[0,filename:]", "ReturnValue", "taint"]
+      # See
+      # - https://docs.python.org/3/library/html.html#html.escape
+      # - https://docs.python.org/3/library/html.html#html.unescape
+      - ["html", "Member[escape,unescape]", "Argument[0,s:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/html.parser.html#html.parser.HTMLParser.feed
+      - ["html.parser.HTMLParser", "Member[feed]", "Argument[0,data:]", "Argument[self]", "taint"]
+      # See https://docs.python.org/3.11/library/imp.html#imp.find_module
+      - ["imp", "Member[find_module]", "Argument[0,name:,1,path:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/logging.html#logging.getLevelName
+      # specifically the no matching case
+      - ["logging", "Member[getLevelName]", "Argument[0,level:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/logging.html#logging.LogRecord.getMessage
+      - ["logging.LogRecord", "Member[getMessage]", "Argument[self]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/mimetypes.html#mimetypes.guess_type
+      - ["mimetypes", "Member[guess_type]", "Argument[0,url:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.connection.Listener
+      - ["multiprocessing.connection.Listener!", "Subclass.Call", "Argument[3,authkey:]", "ReturnValue", "taint"]
+      # See https://github.com/python/cpython/blob/main/Lib/nturl2path.py
+      # No user-facing documentation, unfortunately.
+      - ["nturl2path", "Member[pathname2url]", "Argument[0,p:]", "ReturnValue", "taint"]
+      - ["nturl2path", "Member[url2pathname]", "Argument[0,url:]", "ReturnValue", "taint"]
       # See https://docs.python.org/3/library/optparse.html#optparse.OptionParser.parse_args
       - ["optparse.OptionParser", "Member[parse_args]", "Argument[0,args:,1,values:]", "ReturnValue.TupleElement[0,1]", "taint"]
+      # See https://github.com/python/cpython/blob/3.10/Lib/pathlib.py#L972-L973
+      - ["pathlib.Path", ".Member[__enter__]", "Argument[self]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/os.html#os.PathLike.__fspath__
+      - ["pathlib.PurePath", "Member[__fspath__]", "Argument[self]", "ReturnValue", "taint"]
+      # See
+      # - https://docs.python.org/3/library/asyncio-queue.html#asyncio.Queue.put
+      # - https://docs.python.org/3/library/asyncio-queue.html#asyncio.Queue.put_nowait
+      - ["queue.Queue", "Member[put,put_nowait]", "Argument[0,item:]", "Argument[self]", "taint"]
+      # See
+      # - https://docs.python.org/3/library/random.html#random.choice
+      # - https://docs.python.org/3/library/random.html#module-random
+      - ["random", "Member[choice]", "Argument[0,seq:]", "ReturnValue", "taint"]
+      - ["random.Random", "Member[choice]", "Argument[0,seq:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/shlex.html#shlex.quote
+      - ["shlex", "Member[quote]", "Argument[0,s:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/shutil.html#shutil.rmtree
+      - ["shutil", "Member[rmtree]", "Argument[0,path:]", "Argument[2,onerror:,onexc:].Argument[1]", "taint"]
+      # See https://docs.python.org/3/library/shutil.html#shutil.which
+      - ["shutil", "Member[which]", "Argument[0,cmd:,2,path:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/subprocess.html#subprocess.Popen
+      - ["subprocess.Popen!", "Subclass.Call", "Argument[0,args:]", "ReturnValue", "taint"]
+      # See
+      # - https://docs.python.org/3/library/tarfile.html#tarfile.open
+      # - https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.open
+      - ["tarfile", "Member[open]", "Argument[0,name:,2,fileobj:]", "ReturnValue", "taint"]
+      - ["tarfile.TarFile", "Member[open]", "Argument[0,name:,2,fileobj:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/tempfile.html#tempfile.mkdtemp
+      - ["tempfile", "Member[mkdtemp]", "Argument[0,suffix:,1,prefix:,2,dir:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp
+      - ["tempfile", "Member[mkstemp]", "Argument[0,suffix:,1,prefix:,2,dir:]", "ReturnValue.TupleElement[0,1]", "taint"]
+      # See https://docs.python.org/3/library/textwrap.html#textwrap.dedent
+      - ["textwrap", "Member[dedent]", "Argument[0,text:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/traceback.html#traceback.StackSummary.from_list
+      - ["traceback.StackSummary", "Member[from_list]", "Argument[0,a_list:]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/typing.html#typing.cast
+      - ["typing", "Member[cast]", "Argument[1,val:]", "ReturnValue", "value"]
       # See https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote
       - ["urllib", "Member[parse].Member[quote]", "Argument[0,string:]", "ReturnValue", "taint"]
       # See https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_plus
@@ -35,6 +123,21 @@ extensions:
       - ["urllib", "Member[parse].Member[urlencode]", "Argument[0,query:]", "ReturnValue", "taint"]
       # See https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urljoin
       - ["urllib", "Member[parse].Member[urljoin]", "Argument[0,base:,1,url:]", "ReturnValue", "taint"]
+      # See the internal documentation
+      # https://github.com/python/cpython/blob/3.12/Lib/zipfile/_path/__init__.py#L103-L105
+      - ["zipfile.CompleteDirs", "Member[namelist]", "Argument[self]", "ReturnValue", "taint"]
+      # See https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile
+      # it may be necessary to read the code to understand the taint propagation
+      # Constructor: https://github.com/python/cpython/blob/3.12/Lib/zipfile/__init__.py#L1266
+      - ["zipfile.ZipFile!", "Subclass.Call", "Argument[0,file:]", "ReturnValue", "taint"]
+      - ["zipfile.ZipFile!", "Subclass.Call", "Argument[0,file:]", "ReturnValue.Attribute[filelist].ListElement.Attribute[filename]", "value"]
+      # _extract_member: https://github.com/python/cpython/blob/3.12/Lib/zipfile/__init__.py#L1761
+      - ["zipfile.ZipFile", "Member[_extract_member]", "Argument[1,targetpath:]", "ReturnValue", "taint"]
+      # infolist: https://github.com/python/cpython/blob/3.12/Lib/zipfile/__init__.py#L1498-L1501
+      - ["zipfile.ZipFile", "Member[infolist]", "Argument[self]", "ReturnValue", "taint"]
+      - ["zipfile.ZipFile", "Member[infolist]", "Argument[self].Attribute[filelist]", "ReturnValue", "value"]
+      # namelist: https://github.com/python/cpython/blob/3.12/Lib/zipfile/__init__.py#L1494-L1496
+      - ["zipfile.ZipFile", "Member[namelist]", "Argument[self]", "ReturnValue", "taint"]
   - addsTo:
       pack: codeql/python-all
       extensible: neutralModel
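To make the access paths in a few of these rows more concrete, here is a small Python sketch of the runtime behaviour they summarise. It is not part of the commit: the archive contents, argument lists, and variable names are invented for illustration, and it shows only how the standard library behaves, not how CodeQL evaluates the models.

```python
import getopt
import io
import zipfile

# getopt: leftover positional arguments from `args` end up in the second
# tuple element, matching Argument[0,args:] -> ReturnValue.TupleElement[1].
options, remaining = getopt.getopt(["-o", "out.txt", "input.txt"], "o:")

# zipfile: build a small in-memory archive (hypothetical member names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docs/readme.txt", "hello")
    archive.writestr("data.bin", b"\x00\x01")

with zipfile.ZipFile(buffer) as archive:
    # namelist() and infolist() are both derived from the archive object,
    # matching the Argument[self] -> ReturnValue rows.
    names = archive.namelist()
    infos = archive.infolist()
    # The filenames live on the entries of `filelist`, matching the
    # Attribute[filelist].ListElement.Attribute[filename] path.
    names_from_filelist = [info.filename for info in archive.filelist]

print(options, remaining)
print(names)
```

In both cases the data in the result is derived from the input, so a tainted argument list or archive yields tainted leftover arguments or member names.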
