Commit bb117ff

Fix for when footnote reference labels get broken up into multiple cmark_nodes.
Sometimes, the autolinker will go ahead and greedily split input into multiple text nodes in the hopes of matching a hyperlink. This broke footnotes, which expected a singular node. Instead of relying on the tokenizing to have worked perfectly, when handling footnote references we now simply insert the reference based on the closing bracket and ignore and delete any existing and superfluous nodes.
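The label-from-brackets idea can be illustrated outside cmark. The following is a minimal sketch, not the parser's code: `footnote_label` is a hypothetical helper, and it assumes both columns are 1-based, with the end column being the column of the closing `]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of cmark: copy the footnote label found
 * between "[^" and "]" on a single line.
 *
 * open_col  - 1-based column of the opening '['
 * close_col - 1-based column of the closing ']'
 *
 * Returns a malloc'd, NUL-terminated string that the caller must free,
 * or NULL when the span does not look like "[^label]". */
static char *footnote_label(const char *line, size_t open_col, size_t close_col) {
  assert(line != NULL);
  size_t line_len = strlen(line);

  /* Need room for '[', '^', at least one label byte, and ']'. */
  if (open_col < 1 || close_col > line_len || close_col < open_col + 3)
    return NULL;
  if (line[open_col - 1] != '[' || line[open_col] != '^' ||
      line[close_col - 1] != ']')
    return NULL;

  /* Same arithmetic as the commit: end minus start, minus '[' and '^'. */
  size_t label_len = (close_col - open_col) - 2;

  char *label = malloc(label_len + 1);
  if (label == NULL)
    return NULL;

  /* The label starts one byte after '^', at 0-based index open_col + 1. */
  memcpy(label, line + open_col + 1, label_len);
  label[label_len] = '\0';
  return label;
}
```

Because the label is read straight from the line by column, it does not matter how many text nodes the tokenizer produced between the two brackets.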
1 parent 85d8952 commit bb117ff

1 file changed: +55 -10 lines changed

src/inlines.c

Lines changed: 55 additions & 10 deletions
@@ -1137,17 +1137,62 @@ static cmark_node *handle_close_bracket(cmark_parser *parser, subject *subj) {
   // What if we're a footnote link?
   if (parser->options & CMARK_OPT_FOOTNOTES &&
       opener->inl_text->next &&
-      opener->inl_text->next->type == CMARK_NODE_TEXT &&
-      !opener->inl_text->next->next) {
+      opener->inl_text->next->type == CMARK_NODE_TEXT) {
+
     cmark_chunk *literal = &opener->inl_text->next->as.literal;
-    if (literal->len > 1 && literal->data[0] == '^') {
-      inl = make_simple(subj->mem, CMARK_NODE_FOOTNOTE_REFERENCE);
-      inl->as.literal = cmark_chunk_dup(literal, 1, literal->len - 1);
-      inl->start_line = inl->end_line = subj->line;
-      inl->start_column = opener->inl_text->start_column;
-      inl->end_column = subj->pos + subj->column_offset + subj->block_offset;
-      cmark_node_insert_before(opener->inl_text, inl);
-      cmark_node_free(opener->inl_text->next);
+
+    // look back to the opening '[', and skip ahead to the next character
+    // if we're looking at a '[^' sequence, and there is other text or nodes
+    // after the ^, let's call it a footnote reference.
+    if (literal->data[0] == '^' && (literal->len > 1 || opener->inl_text->next->next)) {
+
+      cmark_node *fnref = make_simple(subj->mem, CMARK_NODE_FOOTNOTE_REFERENCE);
+
+      // the start and end of the footnote ref is the opening and closing brace
+      // i.e. the subject's current position, and the opener's start_column
+      int fnref_end_column = subj->pos + subj->column_offset + subj->block_offset;
+      int fnref_start_column = opener->inl_text->start_column;
+
+      // any given node delineates a substring of the line being processed,
+      // with the remainder of the line being pointed to thru its 'literal'
+      // struct member.
+      // here, we copy the literal's pointer, moving it past the '^' character
+      // for a length equal to the size of footnote reference text.
+      // i.e. end_col minus start_col, minus the [ and the ^ characters
+      //
+      // this copies the footnote reference string, even if between the
+      // `opener` and the subject's current position there are other nodes
+      fnref->as.literal = cmark_chunk_dup(literal, 1, (fnref_end_column - fnref_start_column) - 2);
+
+      fnref->start_line = fnref->end_line = subj->line;
+      fnref->start_column = fnref_start_column;
+      fnref->end_column = fnref_end_column;
+
+      // we then replace the opener with this new fnref node, the net effect
+      // being replacing the opening '[' text node with a `^footnote-ref]` node.
+      cmark_node_insert_before(opener->inl_text, fnref);
+
+      // sometimes, the footnote reference text gets parsed into multiple nodes
+      // i.e. '[^example]' parsed into '[', '^exam', 'ple]'.
+      // this happens for ex with the autolink extension. when the autolinker
+      // finds the 'w' character, it will split the text into multiple nodes
+      // in hopes of being able to match a 'www.' substring.
+      //
+      // because this function is called one character at a time via the
+      // `parse_inlines` function, and the current subj->pos is pointing at the
+      // closing ] brace, and because we copy all the text between the [ ]
+      // braces, we should be able to safely ignore and delete any nodes after
+      // the opener->inl_text->next.
+      //
+      // therefore, here we walk thru the list and free them all up
+      cmark_node *next_node;
+      cmark_node *current_node = opener->inl_text->next;
+      while(current_node) {
+        next_node = current_node->next;
+        cmark_node_free(current_node);
+        current_node = next_node;
+      }
+
       cmark_node_free(opener->inl_text);
       process_emphasis(parser, subj, opener->previous_delimiter);
       pop_bracket(subj);
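
The cleanup loop in this hunk reads each node's successor before freeing the node, since the `next` pointer cannot be read once the node is gone. A minimal standalone sketch of that pattern follows; `text_node`, `free_node_run`, and `make_node_run` are hypothetical stand-ins and do not use the cmark API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a sibling-linked inline node; not cmark's struct. */
typedef struct text_node {
  struct text_node *next;
} text_node;

/* Free `first` and every node after it. Returns how many nodes were freed. */
static size_t free_node_run(text_node *first) {
  size_t freed = 0;
  text_node *current = first;
  while (current != NULL) {
    /* Read the successor before freeing: `current` is invalid afterwards. */
    text_node *next = current->next;
    free(current);
    current = next;
    freed++;
  }
  return freed;
}

/* Build a run of `count` linked nodes, for demonstration only. */
static text_node *make_node_run(size_t count) {
  text_node *head = NULL;
  for (size_t i = 0; i < count; i++) {
    text_node *node = malloc(sizeof *node);
    assert(node != NULL);
    node->next = head;
    head = node;
  }
  return head;
}
```

Calling `free_node_run(make_node_run(3))` frees all three nodes, and passing `NULL` is a no-op, which mirrors how the loop in the diff simply does nothing when no nodes follow the opener.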