Commit bfdee30

chore(parser): add inline heavy fixture doc
1 parent 3bc9a4f commit bfdee30

2 files changed: +9177 -0 lines changed

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
+= Inline-Heavy Benchmark Document
+:author: Test Author
+:toc:
+
+== Introduction
+
+This document exercises *bold*, _italic_, `monospace`, and #highlight# formatting extensively.
+It also uses **unconstrained bold**, __unconstrained italic__, ``unconstrained monospace``, and ##unconstrained highlight##.
+
+Here is some *bold text* followed by _italic text_ and `monospace text` and #highlighted text#.
+Multiple *bold* words _italic_ words `mono` words #mark# words in a single line.
+Nesting works too: *bold _italic_ text* and _italic `monospace` text_ are common patterns.
+
+This is a longer paragraph of plain text without any formatting to exercise the negative lookahead
+path of the parser. Each character must be checked against all possible inline constructs before
+being accepted as plain text. The more plain text there is, the more the lookahead cache helps.
+This paragraph deliberately avoids special characters to maximize the number of cache lookups
+that occur during parsing. We want to stress the plain text accumulation loop here.
+
+== Cross-references and anchors
+
+[[section-one]]
+=== Section one with anchor
+
+See <<section-one>> for details. Also see <<section-two,Section Two>> for more.
+Reference xref:other-doc.adoc[another document] and xref:guide.adoc#tips[specific section].
+
+[[section-two]]
+=== Section two with anchor
+
+Back to <<section-one>> and forward to <<section-three,the third section>>.
+Cross-reference xref:api.adoc[API docs] and xref:tutorial.adoc#step1[step one].
+
+[[section-three]]
+=== Section three
+
+More xrefs: <<section-one>>, <<section-two>>, <<section-three>>.
+
+== Index terms
+
+This section has index terms. (((primary term))) Here is a concealed index.
+And ((visible index term)) appears inline. Also indexterm:[another term] works.
+Multiple indexterm2:[term1, term2] entries indexterm:[entry three] in one paragraph.
+More text with (((term A))) and (((term B))) and (((term C))) scattered throughout.
+Flow terms like ((alpha)) and ((beta)) and ((gamma)) are also present.
+
+== Dense formatting
+
+The *quick* _brown_ `fox` #jumps# over the *lazy* _dog_ `and` #runs# away.
+A *bold statement* with _italic emphasis_ in `monospace code` and #highlighted text# here.
+Then *more bold* and _more italic_ and `more mono` and #more highlight# continues.
+Even *more* _formatting_ `mixed` #together# in *every* _single_ `line` #here#.
+
+*Bold* at start, middle *bold* word, and *bold* at end.
+_Italic_ at start, middle _italic_ word, and _italic_ at end.
+`Mono` at start, middle `mono` word, and `mono` at end.
+#Mark# at start, middle #mark# word, and #mark# at end.
+
+=== Escaped syntax
+
+Use \*not bold* and \_not italic_ and \`not mono` and \#not highlight#.
+Double escape \\*also not bold* and \\_also not italic_.
+The backslash \\ is literal here and here \\ too.
+Escaped cross-ref \<<not-a-ref>> and escaped anchor \[[not-an-anchor]].
+
+== Mixed inline constructs
+
+A paragraph with *bold*, _italic_, `monospace`, #highlight#, ^super^, ~sub~,
+((index)), (((concealed))), <<section-one>>, and xref:doc.adoc[link] all together.
+Followed by more *bold words* and _italic words_ and `monospace words` and #highlight words#.
+
+Another paragraph: the *first* word is bold, the _second_ is italic, the `third` is monospace,
+the #fourth# is highlighted, then *fifth* bold, _sixth_ italic, `seventh` mono, #eighth# mark.
+
+Yet another line with *a* _b_ `c` #d# *e* _f_ `g` #h# *i* _j_ `k` #l# *m* _n_ `o` #p#.
+
+== Long plain text sections
+
+This is a very long section of plain text that contains no special formatting whatsoever.
+The parser must check every single character against all the negative lookahead patterns
+and find that none of them match. This exercises the caching behavior because the same
+position will be checked for bold, italic, monospace, highlight, cross-references, anchors,
+index terms, escaped syntax, and many other patterns before accepting each character.
+
+More plain text follows here without any special characters or patterns. Just regular
+English prose that flows naturally from one sentence to the next. The sentences are
+designed to be long enough to stress the character-by-character lookahead checking
+but not so long that they become difficult to read or maintain as test fixtures.
+
+A third paragraph of plain text continues the theme. Each word here is checked against
+the full set of inline constructs before being accepted as part of the plain text node.
+The parser verifies that the character is not a star, underscore, backtick, hash, caret,
+tilde, open bracket, less-than sign, or any other trigger for inline constructs.
+
+Final plain text paragraph. This rounds out the long plain text section with even more
+content that must be parsed character by character through the negative lookahead loop.
+The goal is to have enough text that the caching optimization becomes measurable in
+benchmarks. Without caching, each position triggers dozens of failed match attempts.
+
+== Repeated patterns
+
+*bold1* text *bold2* text *bold3* text *bold4* text *bold5* text.
+_ital1_ text _ital2_ text _ital3_ text _ital4_ text _ital5_ text.
+`mono1` text `mono2` text `mono3` text `mono4` text `mono5` text.
+#mark1# text #mark2# text #mark3# text #mark4# text #mark5# text.
+
+*b1* _i1_ `m1` #h1# text *b2* _i2_ `m2` #h2# text *b3* _i3_ `m3` #h3# text.
+<<section-one>> text <<section-two>> text <<section-three>> text.
+(((term1))) text (((term2))) text (((term3))) text (((term4))) text.
+((vis1)) text ((vis2)) text ((vis3)) text ((vis4)) text ((vis5)) text.
+
+== Final section
+
+This *final* section _wraps_ up the `inline-heavy` benchmark #document# with a mix
+of all formatting types: *bold*, _italic_, `mono`, #mark#, ^super^, ~sub~,
+<<section-one,cross-ref>>, ((index)), (((concealed index))), and plain text.
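
Note on what the fixture targets: its plain-text paragraphs describe a per-position negative lookahead whose results are cached. The sketch below is only an illustration of that idea in Python, not the parser's actual code. The class name `LookaheadCache`, the method names, and the simplified regular expressions are all hypothetical and cover only a few of the constructs the fixture uses.

```python
import re

# Hypothetical, simplified patterns for a few inline constructs.
# The real parser's grammar is not shown in this commit.
INLINE_PATTERNS = [
    re.compile(r"\*\S(?:.*?\S)?\*"),   # *bold*
    re.compile(r"_\S(?:.*?\S)?_"),     # _italic_
    re.compile(r"`\S(?:.*?\S)?`"),     # `monospace`
    re.compile(r"#\S(?:.*?\S)?#"),     # #highlight#
    re.compile(r"<<[^>]+>>"),          # <<cross-reference>>
    re.compile(r"\(\(\(.+?\)\)\)"),    # (((concealed index term)))
]


class LookaheadCache:
    """Memoize, per position, whether any inline construct starts there."""

    def __init__(self, text):
        self.text = text
        self.cache = {}    # position -> bool
        self.attempts = 0  # raw pattern match attempts, for measurement

    def starts_construct(self, pos):
        hit = self.cache.get(pos)
        if hit is None:
            hit = False
            for pattern in INLINE_PATTERNS:
                self.attempts += 1
                # Pattern.match(string, pos) anchors the match at pos.
                if pattern.match(self.text, pos):
                    hit = True
                    break
            self.cache[pos] = hit
        return hit

    def plain_run(self, start):
        """Return the end index of the plain-text run beginning at start."""
        pos = start
        while pos < len(self.text) and not self.starts_construct(pos):
            pos += 1
        return pos


text = "plain text then *bold* here"
lookahead = LookaheadCache(text)
end = lookahead.plain_run(0)
first_pass = lookahead.attempts
lookahead.plain_run(0)  # second scan is served entirely from the cache
second_pass = lookahead.attempts
```

In this sketch a second scan over the same positions adds no match attempts, which is the effect the long plain-text paragraphs in the fixture are meant to make measurable.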
