Skip to content

SP-2372: Fixed handling of links to external URLs in HtmlBrowser (esp in About Box)#1446

Merged
tombogle merged 12 commits intomasterfrom
sp-2372-handle-links-to-external-window-in-about-box
Jul 18, 2025
Merged

SP-2372: Fixed handling of links to external URLs in HtmlBrowser (esp in About Box)#1446
tombogle merged 12 commits intomasterfrom
sp-2372-handle-links-to-external-window-in-about-box

Conversation

@tombogle
Copy link
Contributor

@tombogle tombogle commented Jul 11, 2025

This change is Reviewable

@tombogle tombogle requested a review from imnasnainaec July 11, 2025 13:39
@tombogle tombogle self-assigned this Jul 11, 2025
tombogle added a commit to sillsdev/saymore that referenced this pull request Jul 11, 2025
…in new window.

The ultimate fix will require a change in libpalaso: sillsdev/libpalaso#1446
@github-actions
Copy link

github-actions bot commented Jul 11, 2025

Palaso Tests

     4 files  ±  0       4 suites  ±0   9m 53s ⏱️ +25s
 5 053 tests + 75   4 819 ✅ + 75  234 💤 ±0  0 ❌ ±0 
16 411 runs  +225  15 708 ✅ +224  703 💤 +1  0 ❌ ±0 

Results for commit 1934655. ± Comparison against base commit 5a3933d.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 7 files at r1.
Reviewable status: 1 of 7 files reviewed, 2 unresolved discussions (waiting on @tombogle)


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.cs line 152 at r1 (raw file):

			const string internalLinkHtmlContent = @"
						  <p>Testing the about box with an <a href='#internal'>internal link</a>.</p>
						  <p>Some ipsums and maybe a lorum or two.</p>

I like this story, but your could probably cut it to half the size.


SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 45 at r1 (raw file):

		/// <summary>
		/// Occurs before browser control navigation occurs within the HTML browser control in the
		/// About dialog box. This event fires before browser navigation occurs. It allows

Are the first two sentences in this <summary> partially redundant?

@jasonleenaylor
Copy link
Contributor

TestApps/SIL.Windows.Forms.TestApp/TestAppForm.cs line 152 at r1 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I like this story, but your could probably cut it to half the size.

Hmm, it could be shorter but I am a strong believer that test code needs an appropriate sprinkling of humor to keep developers amused. 🪰 🐸 🐱 🐶 🐄 🦊 🔫 🐑

@tombogle tombogle enabled auto-merge July 14, 2025 12:30
@tombogle
Copy link
Contributor Author

SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 45 at r1 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Are the first two sentences in this <summary> partially redundant?

Not anymore

Copy link
Contributor Author

@tombogle tombogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 7 files reviewed, 2 unresolved discussions (waiting on @tombogle)


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.cs line 152 at r1 (raw file):

Previously, jasonleenaylor (Jason Naylor) wrote…

Hmm, it could be shorter but I am a strong believer that test code needs an appropriate sprinkling of humor to keep developers amused. 🪰 🐸 🐱 🐶 🐄 🦊 🔫 🐑

Humor meets real-life business requirements.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 7 files at r1, all commit messages.
Reviewable status: 2 of 7 files reviewed, 5 unresolved discussions (waiting on @tombogle)


SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 303 at r2 (raw file):

			
			var regexHtmlHasExtLinks =
				new Regex(@"<a\s+[^>]*href\s*=\s*['""](?!#)[^'""]+['""][^>]*>", IgnoreCase);

This will catch (e.g.) <a href='mailto:someone@example.com'>Email</a>. Is that desired?


SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 307 at r2 (raw file):

				return html;
			
			var regexHasTargetBlank = new Regex(@"\btarget\s*=\s*['""]?_blank['""]?", IgnoreCase);

What about other intentional targets (https://www.w3schools.com/tags/att_a_target.asp)?


CHANGELOG.md line 23 at r2 (raw file):

- [SIL.Windows.Forms] Added PortableClipboard.CanGetImage()
- [ClipboardTestApp] Restored this test program and added tests for PortableClipboard.CanGetImage() and GetImageFromClipboard()
- [SIL.Windows.Forms] Check in debug mode to alert developer when using the SILAboutBox to display HTML that has links but has neither a base target in the head element (`<base target=""_blank""> element`) nor explicit `target=""_blank""` attributes for any of the links when they have not handled the Navigating event to customize the navigation behavior. In this situation links will likely open directly in the About browser window and will probably not behave as expected.

I don't think we want the escaped quotes ("") here.


CHANGELOG.md line 25 at r2 (raw file):

- [SIL.Windows.Forms] Check in debug mode to alert developer when using the SILAboutBox to display HTML that has links but has neither a base target in the head element (`<base target=""_blank""> element`) nor explicit `target=""_blank""` attributes for any of the links when they have not handled the Navigating event to customize the navigation behavior. In this situation links will likely open directly in the About browser window and will probably not behave as expected.
- [SIL.Windows.Forms] Added SILAboutBox.Navigating and SILAboutBox.Navigated events to allow callers to customize how HTML links in the embedded browser are handled.
- [SIL.Windows.Forms] Added SILAboutBox.AllowExternalLinksToOpenInsideAboutBox property to control whether a <base target="_blank" rel="noopener noreferrer"> line is automatically added to the HTML (if missing) to ensure links open in the system browser rather than within the About dialog box.

This <base target="_blank" rel="noopener noreferrer"> could go in a ` code block.


CHANGELOG.md line 30 at r2 (raw file):

- [SIL.Windows.Forms] In `CustomDropDown.OnOpening`, fixed check that triggers timer to stop.
- [SIL.Windows.Forms] Fixed HtmlBrowserHandled.OnNewWindow to open external URLs (target="_blank") in the system’s default browser instead of Internet Explorer. This improves behavior in SILAboutBox and other components that use the embedded browser.

Probably wrap this target="_blank" in a ` code block (or else escape the _ with ="\_blank").

@tombogle tombogle marked this pull request as draft July 14, 2025 19:44
auto-merge was automatically disabled July 14, 2025 19:44

Pull request was converted to draft

Methods in this class add support for fixing missing target attributes on links and copying simple asset files when creating a temp HTML file to display.
Also, prepped CHANGELOG.md for v. 16.1.0 release
@tombogle tombogle marked this pull request as ready for review July 16, 2025 11:25
Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 7 files at r3, all commit messages.
Dismissed @jasonleenaylor from a discussion.
Reviewable status: 4 of 11 files reviewed, 2 unresolved discussions (waiting on @tombogle)


CHANGELOG.md line 601 at r3 (raw file):

[Unreleased]: https://github.com/sillsdev/libpalaso/compare/v16.1.0...master
[16.1.0]: https://github.com/sillsdev/libpalaso/compare/v16.0.0...master

v16.0.0...v16.1.0


SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 287 at r3 (raw file):

				// Note: the following comment also applies to the case where HandleMissingLinkTargets tweaks
				// the HTML, but hopefully that will be rare in production code.
				// Create a temporary file with the DependencyMarker replaced with our collected Acknowledgements.

It's confusing to now have this comment so far away from the line it's talking about:

newHtmlContents = aboutBoxHtml.Replace(DependencyMarker, insertableAcknowledgements);

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 4 unresolved discussions (waiting on @tombogle)


SIL.Core/Html/HtmlUtils.cs line 15 at r3 (raw file):

	{
		private static readonly Regex regexTarget =
			new Regex(@"target\s*=\s*['""](?<target>[^'""]*)['""]", IgnoreCase | Compiled);

We've had to deal with filenames that have a ' in them. Would regexTarget/regexHref/regexLocalAssetReferences behave as intended in those cases (e.g. target="hawai'i.html")?


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 37 at r3 (raw file):

		[TestCase(@"<html>
<head><base target=""_blank"" rel=""noopener noreferrer""></head>

To avoid unnecessary " escaping, might as well use single quotes in test cases where double quotes aren't needed.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 96 at r3 (raw file):

</html>";
			if (useDoubleQuotes)
				html = html.Replace("'", @"""");

html.Replace("'", "\""); looks slightly more readable to me in these useDoubelQuotes cases, but looking at this whole document, you seem to prefer using @ and "" for strings with ", rather than \".

@tombogle
Copy link
Contributor Author

SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 287 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

It's confusing to now have this comment so far away from the line it's talking about:

newHtmlContents = aboutBoxHtml.Replace(DependencyMarker, insertableAcknowledgements);

Yes, I had intended to remove that comment block entirely as I think it is no longer very helpful. When I went to look at that, I realized I had failed to actually use my new method!

…lFile

Documented previously undocumented feature: SILAboutBox constructor can accept either a filename or a file URI.
@tombogle tombogle force-pushed the sp-2372-handle-links-to-external-window-in-about-box branch from d7eb1f7 to cd5dd1e Compare July 16, 2025 14:01
Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 2 unresolved discussions (waiting on @tombogle)


SIL.Core/Html/HtmlUtils.cs line 15 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

We've had to deal with filenames that have a ' in them. Would regexTarget/regexHref/regexLocalAssetReferences behave as intended in those cases (e.g. target="hawai'i.html")?

One option, with back references and lazy matching:

		private static readonly Regex regexTarget =
			new Regex(@"target\s*=\s*(['""])(?<target>.*?)\1", IgnoreCase | Compiled);
[...]
		private static readonly Regex regexHref =
			new Regex(@"href\s*=\s*(['""])(?<href>.+?)\1", IgnoreCase | Compiled);
[...]
		private static readonly Regex regexLocalAssetReferences =
			new Regex(@"(?:src|href)\s*=\s*(['""])\s*(./)?(?<filename>[^/>\s\\]+?)\2",
			IgnoreCase | Compiled);

…attributes and ensure the ending quote matches the opening quote
Copy link
Contributor Author

@tombogle tombogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 1 unresolved discussion (waiting on @imnasnainaec and @tombogle)


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 37 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

To avoid unnecessary " escaping, might as well use single quotes in test cases where double quotes aren't needed.

I changed a few. I intentionally want some variety, in order to ensure I'm exercising the regexes rigorously.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 96 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

html.Replace("'", "\""); looks slightly more readable to me in these useDoubelQuotes cases, but looking at this whole document, you seem to prefer using @ and "" for strings with ", rather than \".

Agreed. I actually don't have a strong preference. I hate them both about equally. I vibe-coded a lot of this, and ChatGPT seems to have preferred the @ notation.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 1 unresolved discussion (waiting on @tombogle)


SIL.Windows.Forms/HtmlBrowser/XWebBrowser.cs line 348 at r6 (raw file):

						Program.Process.SafeStart(url); // opens with default browser
						e.Cancel = true;
					}

Would this be worth an else { Logger.Write[Minor]Event if no url is extracted?


SIL.Core/Html/HtmlUtils.cs line 15 at r6 (raw file):

	{
		private static readonly Regex regexTarget =
			new Regex(@"target\s*=\s*(['""])(?<target>[^'""]*?)\1", IgnoreCase | Compiled);

With lazy matching (*?), the [^'""]* is unnecessary and can be replaced with .*.


SIL.Core/Html/HtmlUtils.cs line 30 at r6 (raw file):

		private static readonly Regex regexLocalAssetReferences =
			new Regex(@"(?:src|href)\s*=\s*(['""])\s*(./)?(?<filename>[^/>\s\\]+?)\1",

I had used \2 instead of \1 because I assumed (?:src|href) would be \1. However, the tests all pass whether the regex ends with \1 or \2. Any test case that could be added to straighten that out?

Also added and improved some unit test cases
Copy link
Contributor Author

@tombogle tombogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 1 unresolved discussion (waiting on @imnasnainaec and @tombogle)


SIL.Core/Html/HtmlUtils.cs line 15 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

With lazy matching (*?), the [^'""]* is unnecessary and can be replaced with .*.

Done


SIL.Core/Html/HtmlUtils.cs line 30 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I had used \2 instead of \1 because I assumed (?:src|href) would be \1. However, the tests all pass whether the regex ends with \1 or \2. Any test case that could be added to straighten that out?

Interesting. I thought it should be 2 also, but chatGPT said 1. The first group (with the :?) is a non-capturing group. When I change it to \2, lots of tests fail. Just to be rigorous, I changed one of the filenames to something that would match that first group if it were a capturing group. For completeness I also made the group designed to ignore the ./ into a non-capturing group.


SIL.Windows.Forms/HtmlBrowser/XWebBrowser.cs line 348 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Would this be worth an else { Logger.Write[Minor]Event if no url is extracted?

in practice, there's no known pathway for url to be null or empty without it being a deliberate, valid user interaction that doesn't require error logging.

Here’s a breakdown:


✅ What Happens in Various Cases:

  1. No href attribute at all:

    • The element isn’t rendered as a link → can’t be clicked, so OnNewWindow never runs.
      No need to log
  2. Empty href="":

    • This becomes a link to the current page or folder (file:///...)

    • url is not null or empty, just a local URI → handled like a normal case
      No need to log

  3. Malformed href or inaccessible Document.ActiveElement:

    • Unlikely, but could happen in extreme/edge cases (e.g., script errors, DOM timing weirdness)

    • Already caught in the catch (Exception) block
      Already logged if something goes wrong

Why Logging an else Might Not Help:

Logging a “no URL extracted” case (i.e., if (IsNullOrEmpty(url))) would:

  • Add noise in virtually all realistic usage scenarios

  • Potentially confuse maintainers into thinking something is broken when it’s not

  • Not be helpful unless we had a reproducible case where url is unexpectedly null and no exception is thrown — which you've just ruled out.

Recommendation:

Don't add the else log. It's not meaningfully actionable, and the current try/catch already captures and logs any real failure in the URL extraction logic.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 7 files at r1, all commit messages.
Reviewable status: 5 of 11 files reviewed, 2 unresolved discussions (waiting on @tombogle)


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 62 at r7 (raw file):

		[TestCase(htmlWithSelfClosingEmptyHead)]
		[TestCase(htmlWithNonEmptyHeadButNoTarget)]
		public void HasBaseTarget_No_ReturnsFalse(string html)

How about checking HasBaseTarget in the case that base exists with href but no target? E.g.,

		[TestCase(@"<html>
<head><base href='http://example.com/'></head>
<body><p>Stuff.</p></body>
</html>")]

SIL.Core.Tests/Html/HtmlUtilsTests.cs line 145 at r7 (raw file):

		{
			var html = @"<html><head></head><body><a href=""#section'1'"">Jump</a> and 
				<a href='http://example.com'>Go</a>!</body></html>";

Maybe something like

			var html = @"<html><head></head><body>
<a href=""#section'1'"">Jump</a> and <a href='http://example.com'>Go</a>!
</body></html>";

for readability and consistency. (Some with a few other var htmls below.)


CHANGELOG.md line 25 at r5 (raw file):

- [SIL.Windows.Forms] Added PortableClipboard.CanGetImage()
- [ClipboardTestApp] Restored this test program and added tests for PortableClipboard.CanGetImage() and GetImageFromClipboard()
- [SIL.Windows.Forms] Check in debug mode to alert developer when using the SILAboutBox to display HTML that has links but has neither a base target in the head element (`<base target="_blank"> element`) nor explicit `target="_blank"` attributes for any of the links when they have not handled the Navigating event to customize the navigation behavior. In this situation links will likely open directly in the About browser window and will probably not behave as expected.

element` -> ` element

Also, this first sentence is hard to parse.

Copy link
Contributor Author

@tombogle tombogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 11 files reviewed, 1 unresolved discussion (waiting on @imnasnainaec and @tombogle)


CHANGELOG.md line 25 at r5 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

element` -> ` element

Also, this first sentence is hard to parse.

Improved wording and moved to Changed section


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 62 at r7 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

How about checking HasBaseTarget in the case that base exists with href but no target? E.g.,

		[TestCase(@"<html>
<head><base href='http://example.com/'></head>
<body><p>Stuff.</p></body>
</html>")]

Done


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 145 at r7 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Maybe something like

			var html = @"<html><head></head><body>
<a href=""#section'1'"">Jump</a> and <a href='http://example.com'>Go</a>!
</body></html>";

for readability and consistency. (Some with a few other var htmls below.)

Done

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 11 files reviewed, 1 unresolved discussion (waiting on @tombogle)


CHANGELOG.md line 38 at r8 (raw file):

- [SIL.WritingSystems] Updated embedded langtags.json
- [SIL.WritingSystems] Updated embedded ianaSubtagRegistry.txt
- [SIL.Windows.Forms] Improved SILAboutBox to help prevent navigation issues when the supplied HTML contains links but lacks both a <base target="_blank"> element and explicit target="_blank" attributes on the links. If the Navigating event is not handled, such links would otherwise open within the About box itself, often resulting in unexpected behavior. A sensible fallback is now applied automatically to the HTML, and in debug mode, a developer warning is shown.

These <base target="_blank"> and target="_blank" could both use ` code blocks.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 11 files reviewed, 3 unresolved discussions (waiting on @tombogle)


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 239 at r8 (raw file):

		[TestCase("tel:8008008000")]
		[TestCase(null)]
		public void IsExternalHref_IsNotExternal_ReturnsFalse(string href)

I don't know if you might want to add test cases to either IsExternalHref_IsExternal_ReturnsTrue or IsExternalHref_IsNotExternal_ReturnsFalse for any of these other URI schemes that ChatGPT says are fairly common:

bitcoin: blob: data: ftp: ftps: geo: git: intent: market: mms: rtsp: sftp: sip: sips: slack: smb: sms: spotify: ssh: whatsapp: ws: wss: zoommtg:


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 255 at r8 (raw file):

		[TestCase("")]
		[TestCase(@"<p><a name='gumby'/></p>")]
		[TestCase(@"<p><a href='https://www.example.com'/></p>")]

These two @s no longer needed.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 291 at r8 (raw file):

		{
			const string cssName = "style.css";
			var cssPath = Combine(_testDir, cssName);

With only using cssPath once, (as in the other tests) it doesn't need its own variable.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 308 at r8 (raw file):

		[TestCase("./")]
		[TestCase(" ")]
		[TestCase(" ./")]

The " ./" case could be added to the previous test.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 11 files reviewed, 7 unresolved discussions (waiting on @tombogle)


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs line 101 at r8 (raw file):

			this._cboAboutHTML.Items.AddRange(new object[] {
			"Links without target attribute",
			"HTML head includes <base target=\"_blank\" rel=\"noopener noreferrer\">",

The text of this longest option is getting cut off. The open dropdown menu needs to be wider.


SIL.Core/Html/HtmlUtils.cs line 55 at r8 (raw file):

		/// <param name="additionalDebugInfo">Any additional information to be appended to the
		/// debug message for the developer.</param>
		/// <returns></returns>

Drop empty <returns>


SIL.Core/Html/HtmlUtils.cs line 113 at r8 (raw file):

					@"{0} HTML has links but has neither a base target in the head element (`<base target=""_blank""> element`) nor explicit `target=""_blank""` attributes for any of the links.
Unless you override the default Navigating behavior, links may open directly in the {0} browser window and will probably not behave as expected.
Easy fix: consider adding <base target=""_blank"" rel=""noopener noreferrer""> inside your HTML head if your HTML does not use internal or mailto links.

Could add ` needed around <base target=""_blank"" rel=""noopener noreferrer""> to match the first sentence.


SIL.Core/Html/HtmlUtils.cs line 130 at r8 (raw file):

		}

		internal static bool IsExternalHref(string href)

In IsExternalHref, rather than checking if it starts with www., we could check if it looks like it starts with a domain with something like

^([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(:\d+)?(\/.*)?$

Enhanced IsExternalHref to cover edge cases and unlikely schemes.
Made Regex's public in case they prove useful elsewhere.
Copy link
Contributor Author

@tombogle tombogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 5 unresolved discussions (waiting on @imnasnainaec and @tombogle)


CHANGELOG.md line 38 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

These <base target="_blank"> and target="_blank" could both use ` code blocks.

Oops. Lost those when ChatGPT helped me revise the wording!


SIL.Core/Html/HtmlUtils.cs line 55 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Drop empty <returns>

Helpful content filled in.


SIL.Core/Html/HtmlUtils.cs line 113 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Could add ` needed around <base target=""_blank"" rel=""noopener noreferrer""> to match the first sentence.

Done. Made references to those elements/attributes in the first sentence more generic to accurately reflect what the code does.


SIL.Core/Html/HtmlUtils.cs line 130 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

In IsExternalHref, rather than checking if it starts with www., we could check if it looks like it starts with a domain with something like

^([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(:\d+)?(\/.*)?$

The problem is that there is no way to distinguish between a bare domain (example.com) and a local file (help.htm) without a complex and unreliable heuristic based on either common URL endings or file extensions.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 239 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I don't know if you might want to add test cases to either IsExternalHref_IsExternal_ReturnsTrue or IsExternalHref_IsNotExternal_ReturnsFalse for any of these other URI schemes that ChatGPT says are fairly common:

bitcoin: blob: data: ftp: ftps: geo: git: intent: market: mms: rtsp: sftp: sip: sips: slack: smb: sms: spotify: ssh: whatsapp: ws: wss: zoommtg:

Added a couple. Probably none of these are likely to be used in an HTML file that a developer would want to host in an old IE-style browser control.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 255 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

These two @s no longer needed.

Removed

Comment on lines +96 to +97
this._cboAboutHTML.Anchor = ((System.Windows.Forms.AnchorStyles)(((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Left)
| System.Windows.Forms.AnchorStyles.Right)));

Check warning

Code scanning / CodeQL

Cast to same type Warning

This cast is redundant because the expression already has type AnchorStyles.

Copilot Autofix

AI 8 months ago

The redundant cast in the expression should be removed. Specifically, we need to eliminate the outer cast (System.Windows.Forms.AnchorStyles) because the result of the bitwise OR operation already has the type System.Windows.Forms.AnchorStyles. This does not affect the functionality of the code but simplifies its readability and maintainability.


Suggested changeset 1
TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs b/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
--- a/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
+++ b/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
@@ -93,8 +93,8 @@
             // 
             // _cboAboutHTML
             // 
-            this._cboAboutHTML.Anchor = ((System.Windows.Forms.AnchorStyles)(((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Left) 
-            | System.Windows.Forms.AnchorStyles.Right)));
+            this._cboAboutHTML.Anchor = (System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Left 
+            | System.Windows.Forms.AnchorStyles.Right);
             this._cboAboutHTML.DropDownStyle = System.Windows.Forms.ComboBoxStyle.DropDownList;
             this._cboAboutHTML.DropDownWidth = 380;
             this._cboAboutHTML.FormattingEnabled = true;
EOF
@@ -93,8 +93,8 @@
//
// _cboAboutHTML
//
this._cboAboutHTML.Anchor = ((System.Windows.Forms.AnchorStyles)(((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Left)
| System.Windows.Forms.AnchorStyles.Right)));
this._cboAboutHTML.Anchor = (System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Left
| System.Windows.Forms.AnchorStyles.Right);
this._cboAboutHTML.DropDownStyle = System.Windows.Forms.ComboBoxStyle.DropDownList;
this._cboAboutHTML.DropDownWidth = 380;
this._cboAboutHTML.FormattingEnabled = true;
Copilot is powered by AI and may make mistakes. Always verify output.
//
// label1
//
this.label1.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));

Check warning

Code scanning / CodeQL

Cast to same type Warning

This cast is redundant because the expression already has type AnchorStyles.

Copilot Autofix

AI 8 months ago

The best way to fix this issue involves removing the redundant cast to System.Windows.Forms.AnchorStyles. Since the expression already evaluates to the correct type (AnchorStyles), the cast is superfluous. Removing it will not alter the functionality of the code but will make it cleaner and easier to read.

Suggested changeset 1
TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs b/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
--- a/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
+++ b/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
@@ -185,7 +185,7 @@
             // 
             // label1
             // 
-            this.label1.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));
+            this.label1.Anchor = (System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left);
             this.label1.AutoSize = true;
             this.label1.Location = new System.Drawing.Point(12, 583);
             this.label1.Name = "label1";
EOF
@@ -185,7 +185,7 @@
//
// label1
//
this.label1.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));
this.label1.Anchor = (System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left);
this.label1.AutoSize = true;
this.label1.Location = new System.Drawing.Point(12, 583);
this.label1.Name = "label1";
Copilot is powered by AI and may make mistakes. Always verify output.
//
// label2
//
this.label2.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));

Check warning

Code scanning / CodeQL

Cast to same type Warning

This cast is redundant because the expression already has type AnchorStyles.

Copilot Autofix

AI 8 months ago

To fix the issue, we should remove the redundant cast around the System.Windows.Forms.AnchorStyles expression on line 212. This will simplify the code while maintaining the same behavior. The type of the expression is already AnchorStyles, so the cast provides no additional value.

Suggested changeset 1
TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs b/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
--- a/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
+++ b/TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs
@@ -209,7 +209,7 @@
             // 
             // label2
             // 
-            this.label2.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));
+            this.label2.Anchor = (System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left);
             this.label2.AutoSize = true;
             this.label2.Location = new System.Drawing.Point(12, 603);
             this.label2.Name = "label2";
EOF
@@ -209,7 +209,7 @@
//
// label2
//
this.label2.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));
this.label2.Anchor = (System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left);
this.label2.AutoSize = true;
this.label2.Location = new System.Drawing.Point(12, 603);
this.label2.Name = "label2";
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r9, all commit messages.
Reviewable status: 5 of 11 files reviewed, 6 unresolved discussions (waiting on @tombogle)


SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 275 at r9 (raw file):

			if (!AllowExternalLinksToOpenInsideAboutBox || _navigatingHandlers != null)
			{
				var fixedContents = HtmlUtils.HandleMissingLinkTargets(newHtmlContents ?? aboutBoxHtml,

For code readability, it might be nice for each argument of HandleMissingLinkTargets to start on a new line.


CHANGELOG.md line 19 at r8 (raw file):

## [Unreleased]

## [16.1.0] - 2025-07-17

Are you expecting to release as soon as this is merged? It might be better to leave off this header (or at least the date) until a release is actually happening.


SIL.Core/Html/HtmlUtils.cs line 34 at r9 (raw file):

		public static readonly Regex RegexUriSchemeLike = new Regex(@"^[a-zA-Z][a-zA-Z0-9+\-.]*:",
			IgnoreCase | Compiled);

Aren't a-zA-Z and IgnoreCase redundant?


SIL.Core/Html/HtmlUtils.cs line 150 at r9 (raw file):

			// Windows-Explorer-like view, but then the user has no way to get back, so that's
			// probably not what we want.
			return href.Length == 0 || 

line-final space after 0 ||.


SIL.Core/Html/HtmlUtils.cs line 158 at r9 (raw file):

					href.EndsWith(".com") ||
			        href.EndsWith(".org") ||
			        href.EndsWith(".net");

I don't think these EndsWith checks are very helpful if your not first splitting by / and taking the first element.


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs line 0 at r9 (raw file):
Now this file has a mix of tab and space indenting.


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.cs line 273 at r9 (raw file):

								LocalizationManager.GetString(
									"About.ExternalNavigationConfirmationTitle",
									"External navigation request"), MessageBoxButtons.OKCancel);

Would be nice to have MessageBoxButtons.OKCancel); on a new line for code readibility.

…l test cases

Also added tests to HtmlUtilsCreatePatchedTempHtmlFileTests to cover exception/edge cases
Copy link
Contributor Author

@tombogle tombogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 11 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @tombogle)


CHANGELOG.md line 19 at r8 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Are you expecting to release as soon as this is merged? It might be better to leave off this header (or at least the date) until a release is actually happening.

Yes, I was planning on it, as this solves a problem affecting the current release of 3 products.


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.Designer.cs line at r9 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Now this file has a mix of tab and space indenting.

I didn't edit that file directly (outside of Designer). Designer doesn't follow our .editorconfig rules (at least for the lines inside InitializeComponent), so I'm not sure if there's anything I can/should do. I could use Replace to put them back to tabs, but next time someone opens it in Designer, they're going to revert to spaces. ChatGPT claims I could add this to .gitattributes:
*.Designer.cs linguist-generated=true
I don't know if that would work or whether it would be worth it. Depending on the tools you use, you could still see that some are tabs and some are spaces. Alternatively, we could put an override in .editorconfig to use spaces instead of tabs in all Designer files. Then they would be internally consistent, though out of step with other files. But if we were to globally change all Designer files in accordance with that, that should definitely be a separate PR just to do that and would probably require some level of team buy-in.

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r9, 1 of 5 files at r10, all commit messages.
Reviewable status: 7 of 11 files reviewed, 1 unresolved discussion (waiting on @tombogle)


SIL.Windows.Forms/Miscellaneous/SILAboutBox.cs line 275 at r9 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

For code readability, it might be nice for each argument of HandleMissingLinkTargets to start on a new line.

The 2nd and 3rd arguments aren't split.


TestApps/SIL.Windows.Forms.TestApp/TestAppForm.cs line 273 at r9 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Would be nice to have MessageBoxButtons.OKCancel); on a new line for code readibility.

And probably indented at the same level as the other arguments.


SIL.Core.Tests/Html/HtmlUtilsTests.cs line 406 at r10 (raw file):

			var tempDir = GetDirectoryName(tempFile.Path);
			Assert.That(Directory.GetFiles(tempDir).Length, Is.EqualTo(1),
				"Subdirectory assets should not be copied.");

Copy-pasted "Subdirectory assets should not be copied." needs to be updated for this test.


SIL.Core/Html/HtmlUtils.cs line 172 at r10 (raw file):

			// Must contain at least one dot to have a top-level domain
			return lastDotIndex >= 0 && lastDotIndex < host.Length - 1 &&
				s_commonTopLevelDomains.Contains(host.Substring(lastDotIndex));

For consistency and readability (though without really considering performance), how about something like one of

			// Strip off any path and get the TLD.
			var host = href.Split(new[] { '/' }, 2)[0];
			var domainParts = host.Split('.');
			var numParts = domainParts.Length;
			return numParts > 1 && s_commonTopLevelDomains.Contains(domainParts[numParts - 1]);

or


			// Strip off any path and get the TLD.
			var firstSlashIndex = href.IndexOf('/');
			var host = firstSlashIndex >= 0 ? href.Substring(0, firstSlashIndex) : href;
			var lastDotIndex = host.LastIndexOf('.');
			var tld = lastDotIndex >= 0 ? host.Substring(lastDotIndex) : "";
			return s_commonTopLevelDomains.Contains(tld);

(The former would require removing the initial . from each entry in s_commonTopLevelDomains.)

Copy link
Contributor

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r11, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @tombogle)

@tombogle tombogle merged commit 2f62f87 into master Jul 18, 2025
12 checks passed
@tombogle tombogle deleted the sp-2372-handle-links-to-external-window-in-about-box branch July 18, 2025 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants