6 files changed, +8 -8 lines changed
expat-internals-a-simple-parse
expat-internals-encodings
expat-internals-parsing-xml-declarations
expat-internals-string-pools
writing-a-custom-encoding
@@ -592,7 +592,7 @@ <h2>Macro Abuse</h2>
 three bytes available, then calls through the <code>isNmstrt3</code> function
 pointer in the encoding. This time <code>utf8_isNmstrt3()</code> as it becomes
 is a real function, one that uses macros to turn the UTF-8 into
-a <a href="http://unicode.org/glossary/#code_point">Unicode codepoint</a> and
+a <a href="https://unicode.org/glossary/#code_point">Unicode codepoint</a> and
 look up that codepoint (an integer in the range 0–1114111
 (0x10ffff in hexadecimal), or rather 2048–65535
 (0x0800–0xffff in hex) given that it comes from a three-byte
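Reviewer note on the hunk above: the context lines say that a three-byte UTF-8 sequence yields a codepoint in the range 2048–65535 (0x0800–0xffff). A minimal sketch of that conversion follows. The helper name `sketch_decode_utf8_3` is hypothetical and this is not Expat's macro-based implementation; it only illustrates the bit layout 1110xxxx 10yyyyyy 10zzzzzz, and it does not reject the surrogate range.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Expat: decode one three-byte UTF-8
 * sequence (1110xxxx 10yyyyyy 10zzzzzz) into a Unicode codepoint.
 * Returns -1 if the bytes do not have the three-byte shape, or if the
 * result is below 0x0800 (an overlong encoding). */
static int32_t sketch_decode_utf8_3(const unsigned char *p)
{
    /* Lead byte must be 1110xxxx; both trail bytes must be 10xxxxxx. */
    if ((p[0] & 0xF0) != 0xE0 || (p[1] & 0xC0) != 0x80
        || (p[2] & 0xC0) != 0x80)
        return -1;

    /* 4 payload bits from the lead byte, 6 from each trail byte. */
    int32_t cp = ((int32_t)(p[0] & 0x0F) << 12)
               | ((int32_t)(p[1] & 0x3F) << 6)
               | (int32_t)(p[2] & 0x3F);

    /* Values below 0x0800 fit in fewer bytes, so reject them here. */
    if (cp < 0x0800)
        return -1;
    return cp;
}
```

For example, the bytes 0xE2 0x82 0xAC (the euro sign) decode to 0x20AC, which lies inside the 0x0800–0xffff range quoted in the hunk.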
@@ -105,7 +105,7 @@ <h1><a href="../../doc/expat-internals-encodings/">Expat Internals: Encodings</a
 maintainers understand what it does.</p>
 <h2>What Is A Character Encoding?</h2>
 <p>A <em>character encoding</em> in Expat is a combination of tables and
-functions that translates a sequence of bytes into <a href="http://unicode.org/glossary/#code_point">Unicode
+functions that translates a sequence of bytes into <a href="https://unicode.org/glossary/#code_point">Unicode
 codepoints</a> and from there
 to UTF-8 or UTF-16 (as configured at compile time). This includes
 functions to determine various syntactic elements of XML, such as
@@ -381,7 +381,7 @@ <h2>The <code>normal_encoding</code> Structure</h2>
 use them. The macro definitions for the 16-bit encodings still use
 the <code>type</code> table as an optimisation, but use the function
 <code>unicode_byte_type()</code> to convert the input into a byte type. Slightly
-different logic is used to deal with <a href="http://unicode.org/glossary/#surrogate_pair">surrogate
+different logic is used to deal with <a href="https://unicode.org/glossary/#surrogate_pair">surrogate
 pairs</a>, and as a result
 none of the functions are needed.</p>
 <h2>Table-Building Macros</h2>
@@ -376,7 +376,7 @@ <h2>Macro Abuse Redux</h2>
 character permitted in XML. That includes the ASCII <a href="https://en.wikipedia.org/wiki/Control_character">control
 characters</a> other
 than whitespace characters, and bytes that would start a four byte
-sequence that would encode a <a href="http://unicode.org/glossary/#code_point">Unicode
+sequence that would encode a <a href="https://unicode.org/glossary/#code_point">Unicode
 codepoint</a> outside the
 permitted range.</p>
 <p><code>BT_MALFORM</code> is slightly different; it is reserved for 0xFE and 0xFF,
@@ -252,7 +252,7 @@ <h3>Initialising and Expanding a Pool</h3>
 block can be hanging off <code>blocks</code>. That makes linking and unlinking
 simple.</p>
 <p>The pointer-fiddling for <code>pool->blocks</code> and <code>pool->freeBlocks</code> is
-<a href="http://www.learn-c.org/en/Linked_lists">fairly standard linked-list</a>
+<a href="https://www.learn-c.org/en/Linked_lists">fairly standard linked-list</a>
 stuff. More interesting is the initialisation of <code>start</code> (to the
 start of the available memory in the block), <code>end</code> (calculated from
 the number of characters available) and <code>ptr</code> (same as <code>start</code>,
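Reviewer note on the hunk above: the context lines describe how a freshly linked block sets `start`, `end` and `ptr`. The sketch below illustrates that description only. The type and function names (`sketch_block`, `sketch_pool`, `sketch_pool_first_block`), the use of plain `char`, and the `size` field are assumptions made for the example, not Expat's actual definitions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not Expat's real structures. */
typedef struct sketch_block {
    struct sketch_block *next; /* singly linked list of blocks */
    size_t size;               /* capacity in characters (assumed field) */
    char s[];                  /* character storage follows the header */
} sketch_block;

typedef struct {
    sketch_block *blocks;      /* blocks currently in use */
    sketch_block *freeBlocks;  /* blocks available for reuse */
    char *start;               /* start of the available memory */
    char *ptr;                 /* next free character */
    const char *end;           /* one past the last usable character */
} sketch_pool;

/* Allocate a block holding nchars characters, push it onto pool->blocks,
 * and initialise start, end and ptr as the text describes.
 * Returns 1 on success, 0 if allocation fails. */
static int sketch_pool_first_block(sketch_pool *pool, size_t nchars)
{
    sketch_block *b = malloc(sizeof *b + nchars * sizeof(char));
    if (b == NULL)
        return 0;
    b->size = nchars;
    b->next = pool->blocks;           /* standard list push */
    pool->blocks = b;
    pool->start = b->s;               /* start of available memory */
    pool->end = pool->start + nchars; /* from the character count */
    pool->ptr = pool->start;          /* same as start: nothing stored yet */
    return 1;
}
```

Starting from a zeroed `sketch_pool`, one call leaves `ptr == start` and `end - start` equal to the requested character count, which is the state the paragraph in the hunk describes.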
@@ -775,7 +775,7 @@ <h2>Footnotes</h2>
 <p><a name="cunning">3</a>: something is described as cunning if it is
 very clever, often deceitful. In recent years it has come to have
 sarcastic overtones, thanks to
-<a href="http://www.bbc.co.uk/programmes/b006xxw3">Blackadder</a>; Baldrick's cry
+<a href="https://www.bbc.co.uk/programmes/b006xxw3">Blackadder</a>; Baldrick's cry
 of "I have a cunning plan, milord" generally introduced a bizarre,
 complicated and very stupid suggestion.</p>
 <p>—Rhodri James, 19 July 2017</p>
@@ -125,7 +125,7 @@ <h1>Other Resources</h1>
 <h1>External Articles and References</h1>
 <ul>
 <li><a href="http://www.jclark.com/xml/expat.html">James Clark's original Expat page</a>, for Expat 1.2 and earlier</li>
-<li><a href="http://www.xml.com/pub/1999/09/expat/index.html">Introductory article "Using Expat"</a>
+<li><a href="https://www.xml.com/pub/1999/09/expat/index.html">Introductory article "Using Expat"</a>
   by <a href="https://www.xml.com/pub/au/43">Clark Cooper</a></li>
 </ul>
 </div>
@@ -107,7 +107,7 @@ <h1><a href="../../doc/writing-a-custom-encoding/">Writing A Custom Encoding</a>
 encoding.</p>
 <h2>What Is A Custom Encoding?</h2>
 <p>A <em>character encoding</em> in Expat is a combination of tables and
-functions that translates a sequence of bytes into <a href="http://unicode.org/glossary/#code_point">Unicode
+functions that translates a sequence of bytes into <a href="https://unicode.org/glossary/#code_point">Unicode
 codepoints</a> and from there to
 UTF-8 or UTF-16 (as configured at compile time) for the library's
 internal use. Expat natively understands several encodings: UTF-8,