Commit af85759

Fix dockerized server setup
Desktop browsers still have some issues in Docker, possibly due to macOS vs. Linux differences.
1 parent be906ff commit af85759

25 files changed: 993 additions and 51 deletions

Dockerfile

Lines changed: 5 additions & 1 deletion
@@ -31,4 +31,8 @@ RUN chmod +x /app/entrypoint.sh
 # Expose ports
 EXPOSE 80 443 8443 9000
 
-# WORKDIR /app
+# WORKDIR /app
+
+ENTRYPOINT ["/app/entrypoint.sh"]
+# CMD is default command if not overridden in docker-compose
+CMD ["poetry", "run", "-C", "/app/_hp", "python", "/app/wpt", "serve", "--config", "/app/_hp/wpt-config.json"]
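
The interplay between the new ENTRYPOINT and CMD lines can be illustrated with a small sketch. It models Docker's documented behaviour for exec-form instructions: CMD supplies default arguments that are appended to ENTRYPOINT, and a `command:` entry in docker-compose replaces CMD only. The helper name `effective_argv` is hypothetical, and the sketch assumes that `/app/entrypoint.sh` eventually runs the arguments it receives; the script itself is not shown in this diff.

```python
from typing import List, Optional

# Values copied from the Dockerfile hunk above.
ENTRYPOINT = ["/app/entrypoint.sh"]
CMD = ["poetry", "run", "-C", "/app/_hp", "python", "/app/wpt", "serve",
       "--config", "/app/_hp/wpt-config.json"]


def effective_argv(entrypoint: List[str], cmd: List[str],
                   command_override: Optional[List[str]] = None) -> List[str]:
    """Return the argv a container starts with (exec form only).

    CMD is appended to ENTRYPOINT as default arguments. An override
    (docker-compose `command:` or trailing `docker run` arguments)
    replaces CMD but leaves ENTRYPOINT in place.
    """
    args = cmd if command_override is None else command_override
    return list(entrypoint) + list(args)


if __name__ == "__main__":
    # Default start: the entrypoint script receives the serve command.
    print(effective_argv(ENTRYPOINT, CMD))
    # Overridden start (hypothetical command): the entrypoint still runs first.
    print(effective_argv(ENTRYPOINT, CMD, ["poetry", "run", "-C", "/app/_hp", "pytest", "/app/_hp"]))
```

Under this model, any setup performed by the entrypoint script runs regardless of which command docker-compose supplies.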

README.md

Lines changed: 2 additions & 1 deletion
@@ -20,14 +20,15 @@ Our modified version of the wptserve HTTP server implementation can be found in
 - Manually check if the server and the tests are working: Visit http://sub.headers.websec.saarland:80/_hp/tests/framing.sub.html and confirm that tests are loaded and executed.
 - Optional: Run tests to check that everything is working correctly: `poetry run -C _hp pytest _hp`
 - Optional: Change the used domains in [_hp/wpt-config.json](_hp/wpt-config.json) and [_hp/host-config.txt](_hp/host-config.txt)
-- To run it inside a Docker container: `docker compose up --build`. This should spin up the server.
+- To run it inside a Docker container: `docker compose up --build`. This should spin up the server (as we use the same docker for the linux desktop browsers, the container is configured as `platform: linux/amd64` meaning it is emulated and slow on AppleSilicon)
 
 
 ## Reproduce or Enhance our Results
 In the following, we describe how to reproduce all our results from the paper.
 By slightly adapting the configuration and updating the used browsers, it is also possible to run our tool chain on new/other browser configurations.
 
 ### Desktop Browsers (Linux Ubuntu)
+- Note: if running in the docker container on AppleSilicon only headless browser will work as Xvfb cannot be emulated
 - Execute `cd _hp/hp/tools/crawler`
 - If using self-signed certs, add `--ignore_certs` to all commands.
 - Run the following for a quick test run to check that everything is working: `poetry run python desktop_selenium.py --debug_browsers --resp_type debug`
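
The two README notes about Apple Silicon describe a simple rule: an ARM host running a `linux/amd64` container is emulated, and in that case only headless browsers are expected to work. A minimal host-side sketch of that rule follows. The function name `is_emulated` is hypothetical, and the check is meant to run on the host, since inside an emulated container the reported architecture would typically already be x86_64.

```python
import platform

# Platform value quoted in the README note above.
CONTAINER_PLATFORM = "linux/amd64"


def is_emulated(host_machine: str, container_platform: str = CONTAINER_PLATFORM) -> bool:
    """Return True if an amd64 container would be emulated on this host."""
    host_is_arm = host_machine.lower() in {"arm64", "aarch64"}
    return host_is_arm and container_platform.endswith("/amd64")


if __name__ == "__main__":
    if is_emulated(platform.machine()):
        print("Emulated amd64 container: expect slow runs, use headless browsers only.")
    else:
        print("Native container: no emulation expected.")
```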

_hp/hp/test_external_api.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 import json
 import httpx
 
-with open("_hp/wpt-config.json", "r") as f:
+with open("/app/_hp/wpt-config.json", "r") as f:
     wpt_config = json.load(f)
 
 def test_get_resp_ids():

_hp/hp/test_internals.py

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
 from hp.tools.create_responses import create_responses
 from hp.tools.crawler.desktop_selenium import get_child_processes
 
-with open("_hp/wpt-config.json", "r") as f:
+with open("/app/_hp/wpt-config.json", "r") as f:
     wpt_config = json.load(f)
 
 
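
Both test modules now read the configuration from an absolute container path, which ties them to the `/app` layout. One possible way to keep them runnable outside the container is sketched below. The environment variable `WPT_CONFIG_PATH` and the helper `load_wpt_config` are assumptions made for illustration and are not part of this commit.

```python
import json
import os
import tempfile
from pathlib import Path

# Container path used in the diff above; serves as the fallback.
DEFAULT_CONFIG_PATH = "/app/_hp/wpt-config.json"


def load_wpt_config(env_var: str = "WPT_CONFIG_PATH") -> dict:
    """Load the wpt config from the path in env_var, else the container default."""
    path = Path(os.environ.get(env_var, DEFAULT_CONFIG_PATH))
    with path.open("r") as f:
        return json.load(f)


if __name__ == "__main__":
    # Demonstration with a temporary file holding placeholder content.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "wpt-config.json"
        cfg.write_text(json.dumps({"example_key": "example_value"}))
        os.environ["WPT_CONFIG_PATH"] = str(cfg)
        print(load_wpt_config())
```

Inside the container nothing needs to be set; on a developer machine the variable would point at the checkout's `_hp/wpt-config.json`.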

_hp/hp/tools/analysis/utils.py

Lines changed: 13 additions & 13 deletions
@@ -8,10 +8,10 @@
 
 @dataclass
 class Config:
-    DB_HOST: str="localhost"
+    DB_HOST: str="postgres"
     DB_NAME: str="http_header_demo"
-    DB_USER: str=""
-    DB_PASSWORD: str=""
+    DB_USER: str="header_user"
+    DB_PASSWORD: str="header_password"
     DB_PORT: str="5432"
 
 def get_data(config: Config, select_query: str, quiet=False) -> pd.DataFrame:
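
The hunk above switches the defaults from a local database to the docker-compose service and hardcodes credentials. As an alternative sketch (not what the commit does), the same dataclass could take its defaults from environment variables and fall back to the values in the diff. The variable names `DB_HOST`, `DB_NAME`, `DB_USER`, `DB_PASSWORD` and `DB_PORT` and the helper `env_default` are assumptions.

```python
import os
from dataclasses import dataclass, field


def env_default(name: str, fallback: str):
    """Dataclass field whose default is read from the environment at instantiation."""
    return field(default_factory=lambda: os.environ.get(name, fallback))


@dataclass
class Config:
    # Fallbacks mirror the values introduced in the diff above.
    DB_HOST: str = env_default("DB_HOST", "postgres")
    DB_NAME: str = env_default("DB_NAME", "http_header_demo")
    DB_USER: str = env_default("DB_USER", "header_user")
    DB_PASSWORD: str = env_default("DB_PASSWORD", "header_password")
    DB_PORT: str = env_default("DB_PORT", "5432")


if __name__ == "__main__":
    # Outside Docker, a developer could export DB_HOST=localhost instead of editing the file.
    os.environ["DB_HOST"] = "localhost"
    print(Config().DB_HOST)
```

Explicit keyword arguments such as `Config(DB_PORT="6543")` still take precedence over both the environment and the fallbacks.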
@@ -64,17 +64,17 @@ def postgresql_to_dataframe(conn, select_query, non_cat=None):
         print("Error: %s" % error)
         cursor.close()
         return 1
-
+
     # Naturally we get a list of tuples
     tuples = cursor.fetchall()
     cursor.close()
-
+
     # We just need to turn it into a pandas dataframe
     df = pd.DataFrame(tuples, columns=column_names)
-
+
     # Convert all string (object) columns to categorical to speed things up
     # df[df.select_dtypes(['object']).columns] = df.select_dtypes(['object']).apply(to_cat, non_cat=non_cat)
-
+
     return df
 
 def to_cat(column, non_cat=[]):
@@ -105,7 +105,7 @@ def clickable(title=None, url=None):
 
 def add_columns(df):
     """Create extra columns: e.g., clean_url and filtered outcome_str"""
-
+
     df["outcome_str"] = df["outcome_value"].fillna("None").astype(str)
     df["clean_url"] = df["full_url"].apply(clean_url)
     @lru_cache(maxsize=None)
@@ -115,16 +115,16 @@ def id_to_browser(id):
     df["browser"] = df["browser_id"].apply(id_to_browser)
     df["org_origin"] = df["org_scheme"] + "://" + df["org_host"]
     df["resp_origin"] = df["resp_scheme"] + "://" + df["resp_host"]
-
+
     # Unify outcomes that are semantically the same (only the exact error string is different in different browsers)
-
+
     # Fetch fails:
     # Firefox: {'error': 'object "TypeError: NetworkError when attempting to fetch resource."', 'headers': ''}
     # Chromium: {'error': 'object "TypeError: Failed to fetch"', 'headers': ''}
     # Safari: {'error': 'object "TypeError: Load failed"', 'headers': ''}
     df["outcome_str"] = df["outcome_str"].replace("TypeError: Load failed", "TypeError: Failed to fetch", regex=True)
     df["outcome_str"] = df["outcome_str"].replace("TypeError: NetworkError when attempting to fetch resource.", "TypeError: Failed to fetch", regex=True)
-
+
     # Fetch is aborted:
     # Firefox: AbortError: The operation was aborted.<space>
     # Safari: AbortError: Fetch is aborted
@@ -137,10 +137,10 @@ def id_to_browser(id):
     # Firefox: {'window.open.opener': 'object "TypeError: w is null"'}
     df["outcome_str"] = df["outcome_str"].replace("TypeError: w is null", "No window-reference. Probably popup blocked", regex=True)
     df["outcome_str"] = df["outcome_str"].replace("TypeError: Cannot read properties of null (reading \'opener\')", "No window-reference. Probably popup blocked", regex=True)
-
+
     # For document referrer we only want to know whether it is a origin or the full URl?
     df['outcome_str'] = df['outcome_str'].apply(lambda x: 'document.referrer: full_url' if 'responses.py?feature_group' in x else x)
     # The differences always only are between http-origin, https-origin, full-url, none, timeout; there is never a difference between the various origins, thus we can merge them to make our live easier
     df["outcome_str"] = df["outcome_str"].apply(lambda x: "document.referrer: https://origin" if "https://" in x else x)
     df["outcome_str"] = df["outcome_str"].apply(lambda x: "document.referrer: http://origin" if "http://" in x else x)
-    return df
+    return df
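
The outcome-unification lines in the context above pass literal error messages as patterns with `regex=True`. If the pattern reaches the regex engine unchanged, metacharacters in the message are interpreted: the parentheses in the Chromium `opener` message form a group instead of matching literal parentheses. The sketch below reproduces this with the standard `re` module rather than pandas, so it shows regex semantics only. The sample `outcome` string is constructed for illustration, modeled on the Firefox example in the code comments.

```python
import re

CHROMIUM_MSG = "TypeError: Cannot read properties of null (reading 'opener')"
UNIFIED_POPUP = "No window-reference. Probably popup blocked"

# Illustrative outcome string, shaped like the examples in the comments above.
outcome = "{'window.open.opener': 'object \"" + CHROMIUM_MSG + "\"'}"

# Unescaped: "(reading 'opener')" is a capture group, so the literal
# parentheses in the text are never matched and nothing is replaced.
unescaped = re.sub(CHROMIUM_MSG, UNIFIED_POPUP, outcome)

# Escaped: re.escape makes every metacharacter literal, so the message matches.
escaped = re.sub(re.escape(CHROMIUM_MSG), UNIFIED_POPUP, outcome)

if __name__ == "__main__":
    print(unescaped == outcome)      # the unescaped pattern left the text unchanged
    print(UNIFIED_POPUP in escaped)  # the escaped pattern performed the replacement
```

In the pandas calls, wrapping the pattern in `re.escape(...)` or passing `regex=False` would be two possible ways to get literal matching; the trailing `.` in the Firefox fetch message is affected in the same way, although there it only makes the match slightly more permissive.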
5 files renamed without changes.
