Update analysis setup

JannisBush · JannisBush · commit 635d76bdb3c8 · 2025-08-24T20:35:51.000+02:00
diff --git a/README.md b/README.md
@@ -14,15 +14,19 @@ The project is made out of 6 parts:
   - (Optional) Run tests to verify server is setup correctly: `sudo docker compose exec header-testing-server bash -c "poetry run -C _hp pytest /app/_hp"`
   - The server is now serving all the tests pages and reponses for our paper. Depending on the configuration the server is now available within and outside the Docker network. E.g., by default it should bind to port 80 and 443 and `curl -I http://localhost/_hp/common/empty.html` and `curl -I -k https://localhost/_hp/common/empty.html` (our dummy certificates are not valid, thus `-k`/insecure is required) on the host should return a response from `BaseHTTP/0.6 Python/3.11.5`
 2. (Optional) Analysis scripts
-   - Dockerized jupyter-lab and works on MacOS and Linux
-   - Run: `(sudo) docker compose exec header-testing-server bash -c "cd /app/_hp/hp/tools/analysis && poetry run jupyter-lab --allow-root --ip 0.0.0.0"` and access the URL printed on your local browser
-   - Open `analysis_ae.ipynb` and run it to analyze the browser runs; If none of the test runners were executed yet, no 
-   - The files `analysis_may_2024.ipynb` and `analysis_december_2024.ipynb` contain the full analysis for the original browser run and the updated browser run experiments described in the paper, including the output of the analysis. Re-running them requires access to our originally collected data.
+   - Dockerized; works on macOS and Linux
+   - Run: `(sudo) docker compose exec header-testing-server bash -c "cd _hp/hp/tools/analysis && poetry run python analysis_demo.py"` to print basic statistics about the test runs recorded by the unit tests and the browser test runners.
+   - We also provide the data and the analysis scripts used for the paper:
+    - Download the database from **TODO**
+    - Import the database into your local postgres: `(sudo) docker compose exec -T postgres pg_restore -U header_user -d http_header_original -v /tmp/data/http_header_original.dump`
+    - Start jupyter-lab: `(sudo) docker compose exec header-testing-server bash -c "cd /app/_hp/hp/tools/analysis && poetry run jupyter-lab --allow-root --ip 0.0.0.0"` and open the printed URL in your local browser
+    - The files `analysis_may_2024.ipynb` and `analysis_december_2024.ipynb` contain the full analysis for the original browser run and the updated browser run experiments described in the paper, including the output of the analysis, and can be executed to reproduce it. Note: re-executing these notebooks requires a large amount of RAM (>20GB) in the Docker container.
 3. (Optional) Test runner for desktop linux browsers
-   - Dockerized and only work on Linux (issues with the MacOS Linux emulation)
-   - Run: `(sudo) docker compose exec header-testing-server bash -c "cd /app/_hp/hp/tools/crawler/ && poetry run python desktop_selenium.py --debug_browsers --resp_type debug --ignore_certs"` for a quick check that data can be collected
+   - Dockerized demo works on Linux and macOS; for a full run, the browser runner needs to be installed outside of Docker on a Linux system.
+   - Demo run:
+     - Run: `(sudo) docker compose exec header-testing-server bash -c "cd /app/_hp/hp/tools/crawler/ && poetry run python desktop_selenium.py --debug_browsers --resp_type debug --ignore_certs"` for a quick check that data can be collected
      - This should take around 2-3m
-     - Check `_hp/hp/tools/crawler/logs/desktop-selenium/` for logs, there should be two rows with `Start chrome (128)` and two with `Finish chrome (128)` and no additional rows. The results of these tests can also be seen in the database or checked with the `analysis_ae.ipynb` script
+     - Check `_hp/hp/tools/crawler/logs/desktop-selenium/` for logs; there should be two rows with `Start chrome (128)`, two with `Finish chrome (128)`, and no additional rows. The results of these tests can also be seen in the database or checked with the `analysis_demo.py` script
    - Reproduce the basic experiment:
      - TODO (copy from below + verify, there are some issues with dockerized setup e.g. `--no-sandbox`?)
      - Run `(sudo) docker compose exec header-testing-server bash -c "cd /app/_hp/hp/tools/crawler/ && for i in {1..5}; do poetry run python desktop_selenium.py --num_browsers 50 --resp_type basic --ignore_certs; done"`
@@ -36,7 +40,7 @@ The project is made out of 6 parts:
      - Lastly, the modified WPT server needs to be reachable. One option is to modify `/etc/hosts/` to point the required hosts to the docker container  (see `_hp/host-config.txt`)
      - Then `poetry run python desktop_selenium.py --help` can be used to see all settings of the test runner and then executed as wanted
 4. (Optional) Test runner for macOS browser
-   - Requires access to a macOS device with a display 
+   - Requires access to a macOS device with a display
    - The Safari version is bound to the operating system, for an exact reproduction of our results, macOS devices in the correct version are required. To test the test runner on macOS, the used version can be updated in `desktop_selenium.py`
    - Requirements: `python=3.11.5`, `poetry` (see `setup.bash`) and access to the modified WPT server
    - Install `poetry install`
diff --git a/TODOS.md b/TODOS.md
@@ -4,7 +4,8 @@
 - [ ] Move all configs in one place/less places?
 - [ ] Remove hardcoded stuff
 - [ ] Test, test, test! + fix bugs/remove manual steps (e.g., brave is currently manually managed?)
-- [ ] analysis_ae.ipynb basic analysis script that simply works
+- [x] analysis_demo.py basic analysis script that simply works
+- [ ] original data archive + import to recreate analysis
 - [ ] Improve and update REAMDE! What is where and so on
 - [ ] Create artifact eval document
 - [ ] ...
diff --git a/_hp/hp/tools/analysis/analysis_ae.ipynb b/_hp/hp/tools/analysis/analysis_ae.ipynb
diff --git a/_hp/hp/tools/analysis/analysis_december_2024.ipynb b/_hp/hp/tools/analysis/analysis_december_2024.ipynb
@@ -14,9 +14,7 @@
   {
    "cell_type": "markdown",
    "id": "35ccff59-8ebc-4fe8-8b0e-cf8dfe03ef49",
-   "metadata": {
-    "jp-MarkdownHeadingCollapsed": true
-   },
+   "metadata": {},
    "source": [
     "# Setup"
    ]
@@ -41,7 +39,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": null,
    "id": "a5639063-07f2-4283-ba32-098aa5b57671",
    "metadata": {},
    "outputs": [
@@ -70,7 +68,9 @@
     "JOIN \"Response\" ON \"Result\".response_id = \"Response\".id JOIN \"Browser\" ON \"Result\".browser_id = \"Browser\".id\n",
     "WHERE \"Browser\".name != 'Unknown' AND \"Response\".resp_type !=  'debug';\n",
     "\"\"\"\n",
-    "df = get_data(Config(), initial_data)\n",
+    "config = Config()\n",
+    "config.DB_NAME = \"http_header_original\"\n",
+    "df = get_data(config, initial_data)\n",
     "df = add_columns(df)"
    ]
   },
@@ -110,7 +110,7 @@
     "responses = \"\"\"\n",
     "SELECT * from \"Response\";\n",
     "\"\"\"\n",
-    "responses = get_data(Config(), responses)"
+    "responses = get_data(config, responses)"
    ]
   },
   {
@@ -120,38 +120,7 @@
    "source": [
     "# Overview\n",
     "- Collected between 885730 and 1558656 results for 12 browsers (original)\n",
-    "- Collected between 885730 and 898346results for 4 browsers (update december)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "id": "589725d1-b403-4c0c-9fb1-219560b382c3",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Only keep main and finished browsers\n",
-    "# Remove other browsers/os\n",
-    "# iOS unfinished browsers: old chrome (26), old brave (27), opera (28), safari (30), brave (54)\n",
-    "# edge (same as chrome, only desktop): 44\n",
-    "# duckduckgo (very old webview version): 66\n",
-    "# opera (same as chrome, only mobile): 63\n",
-    "# chrome (HSTS testing only): 70\n",
-    "# brave (android shields disabled): 65\n",
-    "# chrome (test version): 71\n",
-    "\n",
-    "df = df.loc[~df[\"browser_id\"].isin([30, 27, 26, 54, 28, 66, 63, 44, 70, 65, 71])]"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "16c48765-0460-427a-b2cf-7e084bbd4788",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Do not analyze OAC as it is only supported in Chromium + noisy \n",
-    "df = df.loc[~df[\"test_name\"].str.startswith(\"oac\")]"
+    "- Collected between 885730 and 898346 results for 4 browsers (update december)"
    ]
   },
   {
@@ -9789,9 +9758,7 @@
   {
    "cell_type": "markdown",
    "id": "ba9dc4c8-af9d-461a-8428-81c243a1eae7",
-   "metadata": {
-    "jp-MarkdownHeadingCollapsed": true
-   },
+   "metadata": {},
    "source": [
     "## PerformanceAPI\n",
     "- Testfile: `perfAPI-tao.sub.html`\n",
diff --git a/_hp/hp/tools/analysis/analysis_demo.py b/_hp/hp/tools/analysis/analysis_demo.py
@@ -0,0 +1,30 @@
+import pandas as pd
+import numpy as np
+import matplotlib.pyplot as plt
+import re
+import json
+
+from datetime import datetime
+
+from utils import get_data, Config, clean_url, make_clickable, add_columns
+
+# Load every result together with its response and browser metadata.
+initial_data = """
+SELECT "Result".*,
+"Response".raw_header, "Response".status_code, "Response".label, "Response".resp_type,
+"Browser".name, "Browser".version, "Browser".headless_mode, "Browser".os, "Browser".automation_mode, "Browser".add_info
+FROM "Result"
+JOIN "Response" ON "Result".response_id = "Response".id JOIN "Browser" ON "Result".browser_id = "Browser".id;
+"""
+# Config() holds the default database connection settings.
+df = get_data(Config(), initial_data)
+# Add the derived columns used below (see utils.add_columns).
+df = add_columns(df)
+
+print(f"Collected {len(df)} results:")
+
+# Number of results per browser and test.
+print(df.groupby(['browser'])['test_name'].value_counts())
+
+# Show the last row as an example.
+print(f"Example row:\n {df.iloc[-1]}")
diff --git a/_hp/hp/tools/analysis/analysis_may_2024.ipynb b/_hp/hp/tools/analysis/analysis_may_2024.ipynb
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
    "id": "961069ec-1248-4bef-b760-65030ca43783",
    "metadata": {},
    "outputs": [],
@@ -14,16 +14,14 @@
   {
    "cell_type": "markdown",
    "id": "35ccff59-8ebc-4fe8-8b0e-cf8dfe03ef49",
-   "metadata": {
-    "jp-MarkdownHeadingCollapsed": true
-   },
+   "metadata": {},
    "source": [
     "# Setup"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 3,
    "id": "69930996-4101-4f83-8162-3560b81ef2f9",
    "metadata": {},
    "outputs": [],
@@ -41,14 +39,15 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 154,
+   "execution_count": 6,
    "id": "a5639063-07f2-4283-ba32-098aa5b57671",
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
+      "Config(DB_HOST='postgres', DB_NAME='http_header_original', DB_USER='header_user', DB_PASSWORD='header_password', DB_PORT='5432')\n",
       "Connecting to the PostgreSQL database...\n",
       "Connection successful\n"
      ]
@@ -67,7 +66,9 @@
     "JOIN \"Response\" ON \"Result\".response_id = \"Response\".id JOIN \"Browser\" ON \"Result\".browser_id = \"Browser\".id\n",
     "WHERE \"Browser\".name != 'Unknown';\n",
     "\"\"\"\n",
-    "df = get_data(Config(), initial_data)\n",
+    "config = Config()\n",
+    "config.DB_NAME = \"http_header_original\"\n",
+    "df = get_data(config, initial_data)\n",
     "df = add_columns(df)"
    ]
   },
@@ -107,15 +108,13 @@
     "responses = \"\"\"\n",
     "SELECT * from \"Response\";\n",
     "\"\"\"\n",
-    "responses = get_data(Config(), responses)"
+    "responses = get_data(config, responses)"
    ]
   },
   {
    "cell_type": "markdown",
    "id": "4d838597-a856-47fd-b0e2-42005c0c5e95",
-   "metadata": {
-    "jp-MarkdownHeadingCollapsed": true
-   },
+   "metadata": {},
    "source": [
     "# Overview\n",
     "- Collected between 885730 and 1558656 results for 12 browsers"
@@ -128,27 +127,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Only main and finished browsers\n",
-    "# Remove other browsers/os\n",
-    "# iOS unfinished browsers: old chrome (26), old brave (27), opera (28), safari (30), brave (54)\n",
-    "# edge (same as chrome, only desktop): 44\n",
-    "# duckduckgo (very old webview version): 66\n",
-    "# opera (same as chrome, only mobile: 63\n",
-    "# chrome (HSTS testing only): 70\n",
-    "# brave (android shields disabled): 65\n",
-    "\n",
-    "df = df.loc[~df[\"browser_id\"].isin([30, 27, 26, 54, 28, 66, 63, 44, 70, 65])]"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 157,
-   "id": "16c48765-0460-427a-b2cf-7e084bbd4788",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Do not analyze OAC as it is only supported in Chromium + noisy \n",
-    "df = df.loc[~df[\"test_name\"].str.startswith(\"oac\")]"
+    "# Drop the browsers added in the second experiment in December\n",
+    "df = df.loc[~df[\"browser_id\"].isin([73, 74, 75, 76])]"
    ]
   },
   {
diff --git a/data/.gitkeep b/data/.gitkeep
diff --git a/docker-compose.yml b/docker-compose.yml
@@ -1,12 +1,13 @@
 services:
   postgres:
-    image: postgres:15
+    image: postgres:17
     environment:
       POSTGRES_DB: http_header_demo
       POSTGRES_USER: header_user
       POSTGRES_PASSWORD: header_password
     volumes:
       - postgres_data:/var/lib/postgresql/data
+      - ./data:/tmp/data
     ports:
       - "5432:5432"
     healthcheck: