| 1 | += Inline-Heavy Benchmark Document |
| 2 | +:author: Test Author |
| 3 | +:toc: |
| 4 | + |
| 5 | +== Introduction |
| 6 | + |
| 7 | +This document exercises *bold*, _italic_, `monospace`, and #highlight# formatting extensively. |
| 8 | +It also uses **unconstrained bold**, __unconstrained italic__, ``unconstrained monospace``, and ##unconstrained highlight##. |
| 9 | + |
| 10 | +Here is some *bold text* followed by _italic text_ and `monospace text` and #highlighted text#. |
| 11 | +Multiple *bold* words _italic_ words `mono` words #mark# words in a single line. |
| 12 | +Nesting works too: *bold _italic_ text* and _italic `monospace` text_ are common patterns. |
| 13 | + |
| 14 | +This is a longer paragraph of plain text without any formatting to exercise the negative lookahead |
| 15 | +path of the parser. Each character must be checked against all possible inline constructs before |
| 16 | +being accepted as plain text. The more plain text there is, the more the lookahead cache helps. |
| 17 | +This paragraph deliberately avoids special characters to maximize the number of cache lookups |
| 18 | +that occur during parsing. We want to stress the plain text accumulation loop here. |
| 19 | + |
| 20 | +== Cross-references and anchors |
| 21 | + |
| 22 | +[[section-one]] |
| 23 | +=== Section one with anchor |
| 24 | + |
| 25 | +See <<section-one>> for details. Also see <<section-two,Section Two>> for more. |
| 26 | +Reference xref:other-doc.adoc[another document] and xref:guide.adoc#tips[specific section]. |
| 27 | + |
| 28 | +[[section-two]] |
| 29 | +=== Section two with anchor |
| 30 | + |
| 31 | +Back to <<section-one>> and forward to <<section-three,the third section>>. |
| 32 | +Cross-reference xref:api.adoc[API docs] and xref:tutorial.adoc#step1[step one]. |
| 33 | + |
| 34 | +[[section-three]] |
| 35 | +=== Section three |
| 36 | + |
| 37 | +More xrefs: <<section-one>>, <<section-two>>, <<section-three>>. |
| 38 | + |
| 39 | +== Index terms |
| 40 | + |
| 41 | +This section has index terms. (((primary term))) Here is a concealed index. |
| 42 | +And ((visible index term)) appears inline. Also indexterm:[another term] works. |
| 43 | +Multiple indexterm2:[term1, term2] entries indexterm:[entry three] in one paragraph. |
| 44 | +More text with (((term A))) and (((term B))) and (((term C))) scattered throughout. |
| 45 | +Flow terms like ((alpha)) and ((beta)) and ((gamma)) are also present. |
| 46 | + |
| 47 | +== Dense formatting |
| 48 | + |
| 49 | +The *quick* _brown_ `fox` #jumps# over the *lazy* _dog_ `and` #runs# away. |
| 50 | +A *bold statement* with _italic emphasis_ in `monospace code` and #highlighted text# here. |
| 51 | +Then *more bold* and _more italic_ and `more mono` and #more highlight# continues. |
| 52 | +Even *more* _formatting_ `mixed` #together# in *every* _single_ `line` #here#. |
| 53 | + |
| 54 | +*Bold* at start, middle *bold* word, and *bold* at end. |
| 55 | +_Italic_ at start, middle _italic_ word, and _italic_ at end. |
| 56 | +`Mono` at start, middle `mono` word, and `mono` at end. |
| 57 | +#Mark# at start, middle #mark# word, and #mark# at end. |
| 58 | + |
| 59 | +=== Escaped syntax |
| 60 | + |
| 61 | +Use \*not bold* and \_not italic_ and \`not mono` and \#not highlight#. |
| 62 | +Double escape \\*also not bold* and \\_also not italic_. |
| 63 | +The backslash \\ is literal here and here \\ too. |
| 64 | +Escaped cross-ref \<<not-a-ref>> and escaped anchor \[[not-an-anchor]]. |
| 65 | + |
| 66 | +== Mixed inline constructs |
| 67 | + |
| 68 | +A paragraph with *bold*, _italic_, `monospace`, #highlight#, ^super^, ~sub~, |
| 69 | +((index)), (((concealed))), <<section-one>>, and xref:doc.adoc[link] all together. |
| 70 | +Followed by more *bold words* and _italic words_ and `monospace words` and #highlight words#. |
| 71 | + |
| 72 | +Another paragraph: the *first* word is bold, the _second_ is italic, the `third` is monospace, |
| 73 | +the #fourth# is highlighted, then *fifth* bold, _sixth_ italic, `seventh` mono, #eighth# mark. |
| 74 | + |
| 75 | +Yet another line with *a* _b_ `c` #d# *e* _f_ `g` #h# *i* _j_ `k` #l# *m* _n_ `o` #p#. |
| 76 | + |
| 77 | +== Long plain text sections |
| 78 | + |
| 79 | +This is a very long section of plain text that contains no special formatting whatsoever. |
| 80 | +The parser must check every single character against all the negative lookahead patterns |
| 81 | +and find that none of them match. This exercises the caching behavior because the same |
| 82 | +position will be checked for bold, italic, monospace, highlight, cross-references, anchors, |
| 83 | +index terms, escaped syntax, and many other patterns before accepting each character. |
| 84 | + |
| 85 | +More plain text follows here without any special characters or patterns. Just regular |
| 86 | +English prose that flows naturally from one sentence to the next. The sentences are |
| 87 | +designed to be long enough to stress the character-by-character lookahead checking |
| 88 | +but not so long that they become difficult to read or maintain as test fixtures. |
| 89 | + |
| 90 | +A third paragraph of plain text continues the theme. Each word here is checked against |
| 91 | +the full set of inline constructs before being accepted as part of the plain text node. |
| 92 | +The parser verifies that the character is not a star, underscore, backtick, hash, caret, |
| 93 | +tilde, open bracket, less-than sign, or any other trigger for inline constructs. |
| 94 | + |
| 95 | +Final plain text paragraph. This rounds out the long plain text section with even more |
| 96 | +content that must be parsed character by character through the negative lookahead loop. |
| 97 | +The goal is to have enough text that the caching optimization becomes measurable in |
| 98 | +benchmarks. Without caching, each position triggers dozens of failed match attempts. |
| 99 | + |
| 100 | +== Repeated patterns |
| 101 | + |
| 102 | +*bold1* text *bold2* text *bold3* text *bold4* text *bold5* text. |
| 103 | +_ital1_ text _ital2_ text _ital3_ text _ital4_ text _ital5_ text. |
| 104 | +`mono1` text `mono2` text `mono3` text `mono4` text `mono5` text. |
| 105 | +#mark1# text #mark2# text #mark3# text #mark4# text #mark5# text. |
| 106 | + |
| 107 | +*b1* _i1_ `m1` #h1# text *b2* _i2_ `m2` #h2# text *b3* _i3_ `m3` #h3# text. |
| 108 | +<<section-one>> text <<section-two>> text <<section-three>> text. |
| 109 | +(((term1))) text (((term2))) text (((term3))) text (((term4))) text. |
| 110 | +((vis1)) text ((vis2)) text ((vis3)) text ((vis4)) text ((vis5)) text. |
| 111 | + |
| 112 | +== Final section |
| 113 | + |
| 114 | +This *final* section _wraps_ up the `inline-heavy` benchmark #document# with a mix |
| 115 | +of all formatting types: *bold*, _italic_, `mono`, #mark#, ^super^, ~sub~, |
| 116 | +<<section-one,cross-ref>>, ((index)), (((concealed index))), and plain text. |
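
The plain-text paragraphs in this fixture describe a parser that checks each character against every inline construct before accepting it as plain text, and that relies on a lookahead cache to avoid repeating failed checks. A minimal Python sketch of that idea follows. It is an illustration under assumptions, not the parser under test: the rule set, `LookaheadCache`, and `plain_run` are hypothetical names, and the regular expressions cover only a simplified subset of the constructs used above.

```python
import re

# Hypothetical, simplified inline rules (assumption: the real parser has many more).
INLINE_RULES = {
    "bold": re.compile(r"\*\S(?:[^*]*\S)?\*"),
    "italic": re.compile(r"_\S(?:[^_]*\S)?_"),
    "monospace": re.compile(r"`\S(?:[^`]*\S)?`"),
    "highlight": re.compile(r"#\S(?:[^#]*\S)?#"),
    "xref": re.compile(r"<<[^>]+>>"),
}


class LookaheadCache:
    """Memoizes the answer to 'does an inline construct start at pos?'."""

    def __init__(self, text):
        self.text = text
        self.cache = {}    # pos -> bool
        self.attempts = 0  # regex match attempts actually executed

    def starts_inline(self, pos):
        # Return the memoized answer if this position was already checked.
        if pos in self.cache:
            return self.cache[pos]
        result = False
        for rule in INLINE_RULES.values():
            self.attempts += 1
            # Pattern.match(string, pos) anchors the match at pos.
            if rule.match(self.text, pos):
                result = True
                break
        self.cache[pos] = result
        return result


def plain_run(cache, start):
    """Accumulate plain text from start until an inline construct begins."""
    pos = start
    while pos < len(cache.text) and not cache.starts_inline(pos):
        pos += 1
    return cache.text[start:pos], pos


if __name__ == "__main__":
    cache = LookaheadCache("plain text *bold* more")
    run, end = plain_run(cache, 0)
    first = cache.attempts
    plain_run(cache, 0)  # a second pass is answered entirely from the cache
    print(repr(run), end, first, cache.attempts)
```

In this sketch the second call to `plain_run` performs no new match attempts, which is the effect the long plain-text sections are meant to make measurable. A real parser would more likely key the cache by rule and position and discard it between blocks; the single per-position boolean here is a deliberate simplification.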