Potential fix for code scanning alert no. 3: Incomplete URL substring sanitization by BetterMint · Pull Request #15 · BetterMint/BetterMITM

BetterMint · 2025-12-15T01:37:03Z

Potential fix for https://github.com/BetterMint/BetterMITM/security/code-scanning/3

The test should not rely on a simple substring match, but rather should parse and analyze the output to ensure that "example.com" is present in the intended, safe context. If the output is HTML that should contain a link to "example.com", the test should parse the HTML and check for the presence of an href (or similar) attribute referencing it (using, for example, html.parser from the Python standard library or BeautifulSoup).

Concretely:

In the affected assertion, instead of assert "example.com" in str(f.response.content), use an HTML parser to extract all links/URLs from the response content and check that at least one is to "example.com" (and, optionally, NOT as a substring of some other, unrelated domain).
This can be done with the standard library's html.parser (which needs a small HTMLParser subclass) or, more robustly, with BeautifulSoup (commonly available in test environments). Since the file does not already import BeautifulSoup, use the built-in html.parser.

Required changes:

Add an import of HTMLParser from html.parser.
Define a small helper class/function within the test method or at the module level, which parses f.response.content and collects all URLs containing "example.com" in, e.g., href attributes.
Replace the substring assertion with one that asserts the correct and intended presence of "example.com".
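
As a rough illustration of the required changes above, the following sketch collects href/src values with the standard library's HTMLParser. The names LinkCollector and collect_links and the sample markup are hypothetical and are not taken from the test file; the real helper in the PR is ExampleComLinkParser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect every href/src value seen in start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def collect_links(html):
    """Parse an HTML string and return the collected link values in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# The bare text "example.com" in the paragraph is ignored; only attribute values count.
sample = '<p>example.com</p><a href="http://example.com/x">x</a><img src="/logo.png">'
links = collect_links(sample)
```

Collecting attribute values first keeps the test from passing merely because "example.com" appears somewhere in the body text.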

Suggested fixes powered by Copilot Autofix. Review carefully before merging.

… sanitization Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

examples/contrib/webscanner_helper/test_urlinjection.py

+            def handle_starttag(self, tag, attrs):
+                for attr in attrs:
+                    if attr[0] in ('href', 'src'):
+                        if attr[1].startswith("http") and "://example.com" in attr[1]:


To fix the substring-based check of "://example.com" in the parsing logic, replace it with code that parses the URL and accurately checks the hostname. Specifically, use Python's standard library urllib.parse.urlparse to extract the hostname of the URL in attr[1], then compare the hostname (case-insensitively) to example.com (optionally allowing for subdomains, based on requirements). This avoids errors due to the substring appearing outside the actual hostname.
Required file changes:

In ExampleComLinkParser.handle_starttag, replace the condition attr[1].startswith("http") and "://example.com" in attr[1] with:

Parse attr[1] with urlparse

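Check whether parsed.hostname (lowercased) is exactly example.com. If subdomains are desired, also accept hostnames that end with .example.com; a startswith check is not suitable, because it would accept hosts such as example.com.attacker.com.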

Implementation additions:

Import urllib.parse.urlparse at the top of the file, if not present.
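
A minimal sketch of the hostname check described above, assuming only http and https URLs should count. The function name is_example_com and the allow_subdomains flag are illustrative and do not appear in the PR.

```python
from urllib.parse import urlparse


def is_example_com(url, allow_subdomains=False):
    """Return True if url is an http(s) URL whose host is example.com."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname  # lowercased by urlparse; None when there is no host
    except ValueError:
        # urlparse raises ValueError for some malformed inputs, e.g. a bad IPv6 literal.
        return False
    if parsed.scheme not in ("http", "https") or not host:
        return False
    if host == "example.com":
        return True
    # Require the leading dot so that "notexample.com" is not treated as a subdomain.
    return allow_subdomains and host.endswith(".example.com")
```

Comparing the parsed hostname, rather than searching the raw string, means a URL that merely mentions example.com in its path or query string is rejected.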

examples/contrib/webscanner_helper/test_urlinjection.py

+
+        parser = ExampleComLinkParser()
+        parser.feed(f.response.text if hasattr(f.response, "text") else str(f.response.content))
+        assert any(url.startswith("http://example.com") or url.startswith("https://example.com") for url in parser.example_com_links)


To fix the issue, you should parse each URL using Python's urllib.parse.urlparse function and then inspect the .hostname attribute of the resulting object. Your test should then confirm that the hostname of each found URL is exactly example.com or, if the intention is to allow subdomains, that it ends with .example.com. This prevents various hostname tricks, such as example.com.attacker.com or example.com@evil.com, which would otherwise match a naive prefix check.

Specifically, in file examples/contrib/webscanner_helper/test_urlinjection.py, locate line 54:

assert any(url.startswith("http://example.com") or url.startswith("https://example.com") for url in parser.example_com_links)

and replace it with:

from urllib.parse import urlparse
assert any(urlparse(url).hostname == "example.com" for url in parser.example_com_links)

Also, ensure from urllib.parse import urlparse is imported at the top of the file (after existing imports).
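
To make the difference concrete, this small sketch runs the old prefix check and the hostname check over the two tricky URL shapes mentioned above. The candidate list and the helper names prefix_match and hostname_match are made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical inputs: one genuine link and two look-alikes.
candidates = [
    "http://example.com/page",
    "http://example.com.attacker.com/page",
    "https://example.com@evil.com/page",
]


def prefix_match(url):
    # The original assertion's logic: a plain string prefix test.
    return url.startswith("http://example.com") or url.startswith("https://example.com")


def hostname_match(url):
    # The replacement logic: compare the parsed hostname exactly.
    return urlparse(url).hostname == "example.com"


prefix_hits = [u for u in candidates if prefix_match(u)]
hostname_hits = [u for u in candidates if hostname_match(u)]
```

With these inputs, prefix_hits contains all three URLs, while hostname_hits contains only http://example.com/page, since the other two parse to the hosts example.com.attacker.com and evil.com.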

examples/contrib/webscanner_helper/test_urlinjection.py

+
+        parser = ExampleComLinkParser()
+        parser.feed(f.response.text if hasattr(f.response, "text") else str(f.response.content))
+        assert any(url.startswith("http://example.com") or url.startswith("https://example.com") for url in parser.example_com_links)


To fix this problem, the test should parse each URL using urllib.parse, extract the hostname, and check if it matches exactly example.com instead of relying on substring or prefix matches. This means replacing the startswith checks in line 54 with logic using urllib.parse.urlparse(url).hostname == "example.com". You will need to import the urllib.parse module. Only examples/contrib/webscanner_helper/test_urlinjection.py, line 54, needs changes, plus the required import.

Potential fix for code scanning alert no. 3: Incomplete URL substring…

fe0afba

github-advanced-security bot found potential problems Dec 15, 2025

BetterMint marked this pull request as ready for review December 15, 2025 01:39

BetterMint merged commit d10a380 into main Dec 15, 2025
11 of 46 checks passed

BetterMint deleted the alert-autofix-3 branch December 15, 2025 12:10

@@ -1,6 +1,7 @@
             import json
             from unittest import mock
             from html.parser import HTMLParser
+            from urllib.parse import urlparse
             from examples.contrib.webscanner_helper.urlinjection import HTMLInjection
             from examples.contrib.webscanner_helper.urlinjection import InjectionGenerator
             from examples.contrib.webscanner_helper.urlinjection import logger
@@ -46,8 +47,12 @@
                         def handle_starttag(self, tag, attrs):
                             for attr in attrs:
                                 if attr[0] in ('href', 'src'):
-                                    if attr[1].startswith("http") and "://example.com" in attr[1]:
-                                        self.example_com_links.append(attr[1])
+                                    try:
+                                        parsed = urlparse(attr[1])
+                                        if parsed.scheme in ("http", "https") and parsed.hostname and parsed.hostname.lower() == "example.com":
+                                            self.example_com_links.append(attr[1])
+                                    except Exception:
+                                        pass
                     parser = ExampleComLinkParser()
                     parser.feed(f.response.text if hasattr(f.response, "text") else str(f.response.content))

@@ -1,6 +1,7 @@
             import json
             from unittest import mock
             from html.parser import HTMLParser
+            from urllib.parse import urlparse
             from examples.contrib.webscanner_helper.urlinjection import HTMLInjection
             from examples.contrib.webscanner_helper.urlinjection import InjectionGenerator
             from examples.contrib.webscanner_helper.urlinjection import logger
@@ -51,7 +52,7 @@
                     parser = ExampleComLinkParser()
                     parser.feed(f.response.text if hasattr(f.response, "text") else str(f.response.content))
-                    assert any(url.startswith("http://example.com") or url.startswith("https://example.com") for url in parser.example_com_links)
+                    assert any(urlparse(url).hostname == "example.com" for url in parser.example_com_links)
                 def test_inject_insert_body(self):
                     html_injection = HTMLInjection(insert=True)

@@ -1,6 +1,7 @@
             import json
             from unittest import mock
             from html.parser import HTMLParser
+            import urllib.parse
             from examples.contrib.webscanner_helper.urlinjection import HTMLInjection
             from examples.contrib.webscanner_helper.urlinjection import InjectionGenerator
             from examples.contrib.webscanner_helper.urlinjection import logger
@@ -51,7 +52,7 @@
                     parser = ExampleComLinkParser()
                     parser.feed(f.response.text if hasattr(f.response, "text") else str(f.response.content))
-                    assert any(url.startswith("http://example.com") or url.startswith("https://example.com") for url in parser.example_com_links)
+                    assert any(urllib.parse.urlparse(url).hostname == "example.com" for url in parser.example_com_links)
                 def test_inject_insert_body(self):
                     html_injection = HTMLInjection(insert=True)

BetterMint merged 1 commit into main from alert-autofix-3
