By Sarah Scheffler and Hien Nguyen
- `detect_xss.py` - given a CFG for a single file, calls `block_parser` to detect sources and sinks, then calls `path_finder` to detect paths between them.
- `scan_all.sh` - calls `detect_xss.py` on many CFG files.
This is a git clone of https://github.com/joyrexus/dijkstra with one added file:
- `path_finder.py` - contains two main functions:
  - `cfg_to_graph(cfg_file)` - parses the CFG file into a list of nodes using functionality defined in `spoon-master/test/block/block_parser.py`, then parses that list into a graph that can be input to `dijkstra`.
  - `get_path_if_exists(graph, start, end)` - takes in the output of `cfg_to_graph`, a `start` node, and an `end` node. It returns a tuple `(start, end, pred)` where `start` and `end` are the unaltered inputs (this was useful to have as an output later), and `pred` is the output of `dijkstra`. It is a path, albeit one that is difficult to read. `pred` is an empty dictionary `{}` if there is no path.
All other files in the dijkstra folder are unaltered from the source.
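As a rough illustration of the `(start, end, pred)` convention described above, here is a minimal, self-contained sketch. It does not use the cloned `dijkstra` code; `shortest_path_predecessors` and `readable_path` are hypothetical helpers written for this example, and the graph format (a dict mapping each node to a dict of successor weights) is an assumption rather than the actual output of `cfg_to_graph`.

```python
import heapq


def shortest_path_predecessors(graph, start, end):
    """Return a {node: predecessor} map if end is reachable from start, else {}."""
    dist = {start: 0}
    pred = {}
    heap = [(0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return pred
        for succ, weight in graph.get(node, {}).items():
            new_dist = d + weight
            if succ not in dist or new_dist < dist[succ]:
                dist[succ] = new_dist
                pred[succ] = node
                heapq.heappush(heap, (new_dist, succ))
    return {}  # end was never reached


def get_path_if_exists(graph, start, end):
    """Mirror the README's return shape: (start, end, pred), with pred == {} if no path."""
    return (start, end, shortest_path_predecessors(graph, start, end))


def readable_path(start, end, pred):
    """Hypothetical helper: turn the predecessor map into an ordered list of nodes."""
    if end not in pred:
        return []
    path = [end]
    while path[-1] != start:
        path.append(pred[path[-1]])
    path.reverse()
    return path


# Toy CFG: block 0 -> 1, block 1 -> 2 and 3, block 2 -> 4 (every edge has weight 1).
graph = {0: {1: 1}, 1: {2: 1, 3: 1}, 2: {4: 1}, 3: {}, 4: {}}
print(readable_path(*get_path_if_exists(graph, 0, 4)))
print(get_path_if_exists(graph, 3, 0))
```

Returning `start` and `end` alongside `pred` lets a later step rebuild the path without carrying the endpoints separately, which fits the README's remark that this was useful to have as an output.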
This contains the code for a dummy extension that contains a very basic XSS vulnerability. The basis of this code was the developer tutorial for Chrome extensions, and the extension itself was written by us.
This contains the code for crawling the Chrome Web Store.
- `selenium_crawler.py` - the code that we eventually used to crawl the Store.
- `how_to_setup.txt` - explains the process of setting up the crawler (largely written to make moving our code over to the MOC server easier).
- `unzip_all_expansions.sh` - unzips extensions. (`.crx` files can be unzipped using the normal `unzip` utility.)
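For readers who would rather unpack a `.crx` from Python than from the shell, the following is a small sketch of the same idea. A `.crx` file is a ZIP archive preceded by a header, and Python's `zipfile` module generally tolerates such leading bytes. The function name `extract_crx` is hypothetical and is not part of this repository.

```python
import zipfile


def extract_crx(crx_path, dest_dir):
    """Extract the ZIP payload of a .crx file into dest_dir and return the member names."""
    with zipfile.ZipFile(crx_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

The repository itself uses a shell script for this step; the sketch only illustrates that no special tooling is needed beyond a ZIP reader.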
This contains the text file outputs of potential vulnerabilities, sorted by extension ID and then filename.
This is a git clone of https://github.com/indutny/spoon with several added files:
- `test/block/block_parser.py` - Main file that finds sources and sinks within a CFG. Uses regexes to check for a list of sinks and sources. It has three main functions:
  - `parseCFG(filename)` - parses the CFG from the output of `spoon` to a list of blocks, where each block is a tuple with the relevant information (block number, predecessors, successors, instructions, etc.) extracted using a regex
  - `get_sink_blocks(filename)` - uses `parseCFG` and then uses the sink regexes to detect which blocks are sinks
  - `get_source_block(filename)` - uses `parseCFG` and then uses the source regexes to detect which blocks are sources
- `test/block/tester.py` - simple file to test sink and source capturing
- `test/cfg_collector.py` - script to calculate the CFG for all downloaded extensions using `file_to_cfg.js`
- `file_to_cfg.js` - calls Esprima to construct an AST and then spoon to construct a CFG for a single `.js` file
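The regex-based source and sink detection described for `block_parser.py` could look roughly like the sketch below. Everything here is illustrative: the `Block` tuple layout, the helper `matching_blocks`, and the specific patterns are assumptions made for the example, not the lists or data structures actually used in `block_parser.py`, and the blocks are built by hand rather than parsed from `spoon` output.

```python
import re
from collections import namedtuple

# Assumed block layout; the real parser's tuples may carry different fields.
Block = namedtuple("Block", ["number", "predecessors", "successors", "instructions"])

# Illustrative patterns only, not the project's actual source and sink lists.
SINK_PATTERNS = [re.compile(p) for p in (r"\binnerHTML\b", r"\bdocument\.write\b", r"\beval\b")]
SOURCE_PATTERNS = [re.compile(p) for p in (r"\blocation\.hash\b", r"\bdocument\.URL\b")]


def matching_blocks(blocks, patterns):
    """Return the numbers of blocks with at least one instruction matching any pattern."""
    return [
        block.number
        for block in blocks
        if any(p.search(instr) for instr in block.instructions for p in patterns)
    ]


blocks = [
    Block(0, [], [1], ["payload = location.hash"]),
    Block(1, [0], [2], ["trimmed = payload.slice(1)"]),
    Block(2, [1], [], ["el.innerHTML = trimmed"]),
]

sources = matching_blocks(blocks, SOURCE_PATTERNS)
sinks = matching_blocks(blocks, SINK_PATTERNS)
print(sources, sinks)
```

Block numbers found this way can then be passed as the `start` and `end` nodes of a path search over the CFG.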
Contains a number of old or abandoned parts of this project.
The following are dependencies that we copied to this repository for ease of use:
The following are dependencies that can be obtained from apt-get or a similar package manager:
- nodejs-legacy
- nodejs
- npm
- xvfb
- chromium-chromedriver
The following are dependencies that must be obtained from npm:
- esprima
- json
- fs
- estraverse
- escodegen
- assert
The following are dependencies that must be obtained from pip3:
- selenium
- pyvirtualdisplay