Commit ec05e4f

Merge pull request commonmark#292 from kevinbackhouse/domains-with-underscores
Add test for domains with underscores and fix roundtrip behavior
2 parents 15010f1 + 7327ccb commit ec05e4f

2 files changed: 22 additions, 0 deletions
extensions/autolink.c

Lines changed: 13 additions & 0 deletions
@@ -115,7 +115,20 @@ static size_t autolink_delim(uint8_t *data, size_t link_end) {
 static size_t check_domain(uint8_t *data, size_t size, int allow_short) {
   size_t i, np = 0, uscore1 = 0, uscore2 = 0;
 
+  /* The purpose of this code is to reject urls that contain an underscore
+   * in one of the last two segments. Examples:
+   *
+   *   www.xxx.yyy.zzz     autolinked
+   *   www.xxx.yyy._zzz    not autolinked
+   *   www.xxx._yyy.zzz    not autolinked
+   *   www._xxx.yyy.zzz    autolinked
+   *
+   * The reason is that domain names are allowed to include underscores,
+   * but host names are not. See: https://stackoverflow.com/a/2183140
+   */
   for (i = 1; i < size - 1; i++) {
+    if (data[i] == '\\' && i < size - 2)
+      i++;
     if (data[i] == '_')
       uscore2++;
     else if (data[i] == '.') {
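
The rule described in the new comment can be illustrated in isolation. The sketch below is not the code from `check_domain`: the helper name `last_two_segments_clean` is hypothetical, it takes a NUL-terminated string instead of a buffer and length, and it leaves out the backslash skipping and any other checks the real function performs. It only shows how two counters that shift on each `.` end up holding the underscore counts for the last two segments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of cmark-gfm. Returns 1 when neither of the
 * last two dot-separated segments of `host` contains an underscore, and 0
 * otherwise. Simplified: no escape handling, no host-character validation. */
static int last_two_segments_clean(const char *host) {
  size_t prev = 0; /* underscores in the segment before the current one */
  size_t cur = 0;  /* underscores in the current segment */
  size_t i, n = strlen(host);

  for (i = 0; i < n; i++) {
    if (host[i] == '_') {
      cur++;
    } else if (host[i] == '.') {
      /* A dot closes the current segment: shift the counters so that only
       * the two most recent segments are remembered. */
      prev = cur;
      cur = 0;
    }
  }
  /* After the loop, `prev` and `cur` describe the last two segments. */
  return prev == 0 && cur == 0;
}
```

Applied to the four examples from the comment, this helper accepts `www.xxx.yyy.zzz` and `www._xxx.yyy.zzz` and rejects `www.xxx.yyy._zzz` and `www.xxx._yyy.zzz`, because an underscore in an earlier segment is shifted out of both counters before the end of the string is reached.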

test/extensions.txt

Lines changed: 9 additions & 0 deletions
@@ -581,6 +581,12 @@ www.github.com www.github.com/á
 
 www.google.com/a_b
 
+Underscores not allowed in host name www.xxx.yyy._zzz
+
+Underscores not allowed in host name www.xxx._yyy.zzz
+
+Underscores allowed in domain name www._xxx.yyy.zzz
+
 **Autolink and http://inlines**
 
 ![http://inline.com/image](http://inline.com/image)
@@ -618,6 +624,9 @@ http://🍄.ga/ http://x🍄.ga/
 <p>Email me at:<a href="mailto:[email protected]">[email protected]</a></p>
 <p><a href="http://www.github.com">www.github.com</a> <a href="http://www.github.com/%C3%A1">www.github.com/á</a></p>
 <p><a href="http://www.google.com/a_b">www.google.com/a_b</a></p>
+<p>Underscores not allowed in host name www.xxx.yyy._zzz</p>
+<p>Underscores not allowed in host name www.xxx._yyy.zzz</p>
+<p>Underscores allowed in domain name <a href="http://www._xxx.yyy.zzz">www._xxx.yyy.zzz</a></p>
 <p><strong>Autolink and <a href="http://inlines">http://inlines</a></strong></p>
 <p><img src="http://inline.com/image" alt="http://inline.com/image" /></p>
 <p><a href="mailto:[email protected]">[email protected]</a></p>
