Improve README (how to reproduce our results)

JannisBush · JannisBush · commit 9aac8d7dc77e · 2024-10-17T10:42:55.000+02:00
diff --git a/README.md b/README.md
@@ -4,41 +4,110 @@ This repository contains all code for our paper `Head(er)s Up! Detecting Securit
 
 This repository is a fork of [WPT](https://github.com/web-platform-tests/wpt), the original README can be found [here](./README_original.md).
 All test and analysis code for our paper can be found in the [_hp](./_hp/README.md) directory.
-Our modified version of the wptserve HTTP server implementation can be found in `tools/serve` and `tools/wptserve`. All other directories are untouched and required for `wptserve` to run, we removed the other WPT test directories for better clarity.
+Our modified version of the wptserve HTTP server implementation can be found in the `tools/serve` and `tools/wptserve` directories. All other directories are untouched and required for `wptserve` to run; we removed the other WPT test directories for clarity.
 
-## Setup
-- Create a fresh Ubuntu22 container/VM: `lxc launch ubuntu:22.04 <name>` and connect to it `lxc exec <name> bash`
+## Setup and Start the Header Testing Server
+- Create a fresh Ubuntu22 container/VM: `lxc launch ubuntu:22.04 <name>` and connect to it `lxc exec <name> bash` (Other environments might also work but are not tested)
   - Switch to the ubuntu user: `su - ubuntu`
   - Clone this repository: `git clone git@github.com:header-testing/header-testing.git`
   - Run the setup file: `cd header-testing/_hp`, `./setup.bash` (reopen all terminals or run `source ~/.bashrc` afterwards)
-  - Configure DB settings in [config.json](_hp/tools/config.json); Make sure to create a database with the correct name
+  - Start a postgres instance somewhere that is reachable from this container.
+  - Configure DB settings in [config.json](_hp/hp/tools/config.json); Make sure that a database with the correct name already exists
   - Setup the database: `cd _hp/hp/tools && poetry run python models.py`
   - Setup certs: either remove `.demo` from the files in `_hp/hp/tools/certs/` to use self-signed certs or add your own certs here
+- Create the basic and parsing responses: Run `cd _hp/hp/tools && poetry run python create_responses.py` (basic), run `cd analysis && poetry run jupyter-lab` and execute `response_header_generation.ipynb` to generate the parsing responses.
+- Start the WPT server (from the top-most folder): `poetry run -C _hp python wpt serve --config _hp/wpt-config.json`
+- Manually check if the server and the tests are working: Visit http://sub.headers.websec.saarland:80/_hp/tests/framing.sub.html and confirm that tests are loaded and executed.
+- Optional: Run tests to check that everything is working correctly: `poetry run -C _hp pytest _hp`
+- (TODO Optional: Change the used domains in [_hp/wpt-config.json](_hp/wpt-config.json) and [_hp/host-config.txt](_hp/host-config.txt); domains are hardcoded at several places and thus this is not enough at the moment.)
 
-## Run Instructions
-- Always start the WPT server first (from the top-most folder): `poetry run -C _hp python wpt serve --config _hp/wpt-config.json`
-- Create the basic and parsing responses: Run `cd _hp/hp/tools && poetry run python create_responses.py` (basic), run `cd analysis` and execute `response_header_generation.ipynb` to generate the parsing responses.
-- Manually check if the server and the tests are working: Visit http://sub.headers.websec.saarland:80/_hp/tests/framing.sub.html
-- Automatic testrunners:
-  - `cd _hp/hp/tools/crawler`
-  - Android: `poetry run python android_intent.py` (TODO: Additional config required; solve android_intent and more?!)
-  - MacOS/Ubuntu: `poetry run python desktop_selenium.py` (For a quick test run: `poetry run python desktop_selenium.py --debug_browsers --resp_type debug --ignore_certs`)
-  - iPadOS/iOS: `poetry run python desktop_selenium.py ----gen_page_runner --page_runner_json urls.json --max_urls_until_restart 10000"`, then visit the URLs in that file manually
-  - TODO: Exact settings of the runs for our experiment:
-    - TODO: some information about how to exactly reproduce our results?
-    - TODO: repeat to ensure each test has 5x repetitions (`poetry run python create_repeat ...`)
-    - ...
-- Optional configuration to run headfull browsers on linux server:
+
+## Reproduce or Enhance our Results
+In the following, we describe how to reproduce all results from our paper.
+By slightly adapting the configuration and updating the used browsers, it is also possible to run our tool chain on new/other browser configurations.
+
+### Desktop Browsers (Linux Ubuntu)
+- Execute `cd _hp/hp/tools/crawler`
+- If using self-signed certs, add `--ignore_certs` to all commands.
+- Run the following for a quick test run to check that everything is working: `poetry run python desktop_selenium.py --debug_browsers --resp_type debug`
+- Full run:
+  - If the test environment cannot support 50 parallel browsers, reduce the `num_browsers` parameter.
+  - Run all basic tests: `for i in {1..5}; do poetry run python desktop_selenium.py --num_browsers 50 --resp_type basic; done`
+  - Run all parsing tests: `for i in {1..5}; do poetry run python desktop_selenium.py --num_browsers 50 --resp_type parsing; done`
+  - Due to timeouts and similar issues, some tests might have fewer than 5 results after the above commands. Run the commands below to ensure that all tests have at least 5 results.
+  - Run missing basic tests: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'basic' and \"Browser\".os = 'Ubuntu 22.04'"` and `poetry run python desktop_selenium.py --num_browsers 50 --run_mode repeat --max_urls_until_restart 50`
+  - Run missing parsing tests: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'parsing' and \"Browser\".os = 'Ubuntu 22.04'"` and `poetry run python desktop_selenium.py --num_browsers 50 --run_mode repeat --max_urls_until_restart 50`
+- Optional configuration to debug headful browsers on the Ubuntu container:
 ```bash
 Xvfb :99 -screen 0 1920x1080x24 &
 x11vnc -display :99 -bg -shared -forever -passwd abc -xkb -rfbport 5900
 export DISPLAY=:99 && fluxbox -log fluxbox.log &
 ```
-- Analysis:
-  - Run `cd _hp/hp/tools/analysis && poetry run jupyter-lab`
-  - Open `_hp/hp/tools/analysis/main_analysis_desktop_basic+parsing.ipynb`
-  - TODO: rename: (Also contains the mobile analysis)
-  - TODO: check analysis code and improve
+
+### Desktop Browsers (MacOS)
+- These tests must be run on a real MacOS device; we used versions 17.3 and 17.5 (adjust the browser configuration in `desktop_selenium.py` if using another version).
+- Make sure that the MacOS device can reach the Header Testing server. (Alternatively it could also work to run the header testing server and the database locally on the MacOS device).
+- If using self-signed certs, add `--ignore_certs` to all commands.
+- Execute `cd _hp/hp/tools/crawler`
+- Full run:
+  - On the Header Testing Server:
+    - Create test-page-runner pages for basic tests: `poetry run python desktop_selenium.py --resp_type basic --gen_page_runner --max_urls_until_restart 100`
+    - Create test-page-runner pages for parsing tests: `poetry run python desktop_selenium.py --resp_type parsing --gen_page_runner --max_urls_until_restart 1000`
+    - Each of the above two commands outputs a path similar to `basic-MaxURLs100-MaxResps10-MaxPopups100-53332b.json`. Copy the files to the MacOS device and replace the file names in the following commands.
+  - On the MacOS device:
+    - Run the basic tests: `for i in {1..5}; do poetry run python desktop_selenium.py --num_browsers 1 --page_runner_json <basic-test-json> --timeout_task 1000; done`
+    - Run the parsing tests: `for i in {1..5}; do poetry run python desktop_selenium.py --num_browsers 1 --page_runner_json <parsing-test-json> --timeout_task 10000; done`
+  - It can happen that not all tests recorded 5 results, thus run the following to ensure that all tests are executed at least 5 times:
+    - For the basic tests: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'basic' and \"Browser\".os != 'Android 11'"` and `poetry run python desktop_selenium.py --num_browsers 1 --run_mode repeat --timeout_task 10000`
+    - For the parsing tests: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'parsing' and \"Browser\".os != 'Android 11'"` and `poetry run python desktop_selenium.py --num_browsers 1 --run_mode repeat --timeout_task 10000`
+
+### Mobile Browsers (Android)
+- Execute `cd _hp/hp/tools/crawler`
+- To run the tests on Android devices, first some emulators have to be set up and the browsers have to be installed and configured:
+  - TODO: include all info here (Tin)
+  - Emulator Setup (install adb, setup emulator, ...)
+  - Browser Installation and Setup
+  - Additional browser config (popups need to be allowed):
+    - Chrome: By default, pop-ups and redirects are blocked. To allow them, go to Settings/Site Settings and turn on the Pop-Ups and Redirects option
+    - Brave: By default, pop-ups are blocked by the Privacy Shields setting. To allow them, go to Settings/Brave Shields & privacy and select Allow all trackers and ads
+    - Firefox: To allow popups, go to `about:config` and set `dom.disable_open_during_load` to false.
+- The emulators also need to be able to reach the Header Testing server.
+- Issue: this currently does not work with self-signed certs; make sure to have correct certs set up
+- Full run:
+  - Run the basic tests: `for i in {1..5}; do timeout 15m poetry run python android_intent.py -browsers chrome -repeat 1 -num_devices 30 -type basic -auto_restart; done` (@Tin can we simply use `-browsers all` or do we have to run it three times?)
+  - Run the parsing tests: `for i in {1..5}; do timeout 6h poetry run python android_intent.py -browsers chrome -repeat 1 -num_devices 30 -type parsing -auto_restart; done`
+  - Similarly to the other tests, it could happen that not all tests collected 5 results, thus run the following to rerun some tests.
+  - Create the repeat file for the basic tests: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'basic' and \"Browser\".os = 'Android 11'"`
+  - Run them: `poetry run python android_intent.py -browsers all -repeat 1 -num_devices 30 -url_json repeat.json -auto_restart`
+  - Create the repeat file for the parsing tests: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'parsing' and \"Browser\".os = 'Android 11'"`
+  - Run them: `poetry run python android_intent.py -browsers all -repeat 1 -num_devices 30 -url_json repeat.json -auto_restart`
+
+
+### Mobile Browsers (iPadOS)
+- To run the tests on iPadOS a real iPad is required. The iPad also needs to be able to reach the Header Testing Server.
+- Issue: this currently does not work with self-signed certs; make sure to have correct certs set up
+- On the iPad install Chrome (uses WebKit) and allow popups (Open Settings -> Content-Settings -> Block Pop-Ups -> Off)
+- Full run:
+  - On the Header Testing server:
+    - Execute `cd _hp/hp/tools/crawler`
+    - Add the DB entry: adjust the browser/os version info and then run `poetry run python create_ipados_browser.py` and note the returned browser_id
+    - Generate URLs to visit:
+      - Basic: `poetry run python desktop_selenium.py --resp_type basic --gen_page_runner --max_urls_until_restart 10000 --gen_multiplier 5`
+      - Parsing: `poetry run python desktop_selenium.py --resp_type parsing --gen_page_runner --max_urls_until_restart 100000 --gen_multiplier 5`
+  - On the iPad:
+    - Visit the URLs generated by the above commands and add `?browser_id=<browser_id>` to the URL, example: `https://sub.headers.websec.saarland/_hp/tests/test-page-runner-1_ed4f3b-0.html?browser_id=16`
+  - To ensure that all tests have at least 5 results run the following:
+    - On the server:
+      - Generate the repeats: `poetry run python create_repeat.py --selection_str "\"Response\".resp_type = 'parsing' and \"Browser\".os != 'Android 11'"`
+      - Create a page-runner URL containing all URLs: `poetry run python create_page_runner_repeats.py --browser_id <browser_id>`
+    - On the iPad:
+      - Visit the page-runner URL
+
+### Analysis
+  - Execute `cd _hp/hp/tools/analysis && poetry run jupyter-lab`
+  - Open and run `_hp/hp/tools/analysis/analysis_may_2024.ipynb`
+  - Note that the analysis is tailored towards our results from May 2024 and some small changes might be required if run on new data
+  - TODO: update with updated browser analysis (e.g., `analysis_august_2024_new_chrome.ipynb`)
 
 ## Inventory
 - `_hp/`: All test and analysis code for the paper:
diff --git a/_hp/hp/test_internals.py b/_hp/hp/test_internals.py
@@ -45,7 +45,7 @@ def test_get_tests():
     max_resps = 10
     browser_modifier = 2
     tests = get_tests(resp_type=resp_type, browser_id=browser_id, scheme=scheme, max_popups=max_popups, max_resps=max_resps, browser_modifier=browser_modifier)
-    assert len(tests) == 325
+    assert len(tests) == 269
 
 def test_get_resp_ids():
     """Check whether get_resp_ids returns valid splits
diff --git a/_hp/hp/tools/crawler/create_ipados_browser.py b/_hp/hp/tools/crawler/create_ipados_browser.py
@@ -0,0 +1,10 @@
+"""
+Generate the DB entry for the iPadOS browser
+"""
+from hp.tools.crawler.utils import get_or_create_browser
+
+browser_name = "chrome"
+browser_version = "122.0.6261.89"
+
+browser_id = get_or_create_browser(browser_name, browser_version, 'iPadOS 17.3.1', 'real', 'intent', '')
+print(browser_id)
diff --git a/_hp/hp/tools/crawler/create_repeat.py b/_hp/hp/tools/crawler/create_repeat.py