Commit 47beb9f

Trim trailing slashes when matching URLs (#910)
This is another bit of Browsertrix normalization I missed in #909. Makes WARC import a little more stable and reliable.
1 parent 021fe49 commit 47beb9f

1 file changed: 1 addition, 1 deletion
web_monitoring/utils.py

Lines changed: 1 addition & 1 deletion
@@ -231,7 +231,7 @@ def matchable_url(url: str) -> str:
     parsed = urlsplit(url)
     return parsed._replace(
         netloc=normalize_netloc(parsed),
-        path=(parsed.path or '/'),
+        path=(parsed.path or '/').rstrip('/'),
         query=matchable_querystring(parsed.query),
         fragment=''
     ).geturl()
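
The effect of the change above can be sketched in isolation. The helper below is hypothetical (`trim_path_slashes` is not part of `web_monitoring`). It applies only the new `path` expression and the existing `fragment=''` reset. It leaves out `normalize_netloc` and `matchable_querystring` because their bodies are not shown in this diff. The example URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def trim_path_slashes(url: str) -> str:
    """Hypothetical helper: apply only the path handling from this commit."""
    parsed = urlsplit(url)
    return parsed._replace(
        # Same expression as the added line in the diff: default an empty
        # path to '/', then strip every trailing slash.
        path=(parsed.path or '/').rstrip('/'),
        fragment='',
    ).geturl()


# URLs that differ only by a trailing slash now produce the same string.
with_slash = trim_path_slashes('https://example.gov/about/')
without_slash = trim_path_slashes('https://example.gov/about')
print(with_slash == without_slash)
```

Note that `rstrip('/')` removes all trailing slashes, so a bare root path of `/` becomes an empty path. Because of that, `https://example.gov/` and `https://example.gov` also reduce to the same string, and the query string is left in place after the trimmed path.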
