Commit e3006b5

Mozilla bug 1997049 - Accelerate more HTML tokenizer states with SIMD. r=smaug
This restores the exact old structure for Java, and only injects the SIMD acceleration between incrementing pos and checking if pos reached endPos. This makes going back to SIMD after a character reference significantly simpler.

The downside is that wholly-non-BMP text (as opposed to isolated non-BMP emoji or Hanzi) ends up uselessly bouncing to the SIMD code without benefiting from it when loading from network and counting column numbers as Unicode scalar values. If we want to avoid this failure mode, we should change column numbers to count UTF-16 code units instead of scalars. Either way, the column is "wrong" in some cases.

Differential Revision: https://phabricator.services.mozilla.com/D270673
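
The loop shape described in the message can be illustrated with a minimal sketch. This is not the code from the patch: `TokenizerLoopSketch`, `skipPlain`, `findTagOpen`, and the two column helpers are hypothetical names, the "SIMD" step is a scalar stand-in, and the set of characters it stops at (including surrogates) is an assumption chosen to show why wholly-non-BMP text gains nothing from the fast path.

```java
final class TokenizerLoopSketch {
    /**
     * Hypothetical scalar stand-in for the SIMD routine. Returns how many
     * leading chars of buf[pos, endPos) need no special handling. Stopping at
     * surrogates is an assumption, made to mirror scalar-based column counting.
     */
    static int skipPlain(char[] buf, int pos, int endPos) {
        int n = 0;
        while (pos + n < endPos) {
            char c = buf[pos + n];
            if (c == '<' || c == '&' || c == '\r' || c == '\n' || c == '\0'
                    || Character.isSurrogate(c)) {
                break;
            }
            n++;
        }
        return n;
    }

    /** Returns the index of the first '<' in buf[start, endPos), or -1. */
    static int findTagOpen(char[] buf, int start, int endPos) {
        int pos = start - 1;
        for (;;) {
            pos++;                               // 1. increment pos
            pos += skipPlain(buf, pos, endPos);  // 2. injected acceleration
            if (pos == endPos) {                 // 3. end-of-buffer check
                return -1;
            }
            if (buf[pos] == '<') {
                return pos;
            }
            // Any other special char (for example '&' or a surrogate) would be
            // handled by the scalar path here; the next iteration re-enters
            // step 2 without any extra bookkeeping.
        }
    }

    /** Column width when counting Unicode scalar values. */
    static int columnsAsScalars(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Column width when counting UTF-16 code units. */
    static int columnsAsCodeUnits(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        char[] buf = "ab&c<d".toCharArray();
        System.out.println(findTagOpen(buf, 0, buf.length));
        String emoji = "a\uD83D\uDE00"; // 'a' followed by one non-BMP scalar
        System.out.println(columnsAsScalars(emoji) + " vs " + columnsAsCodeUnits(emoji));
    }
}
```

In this sketch, a buffer made up entirely of surrogate pairs makes `skipPlain` return 0 on every call, which is the "bouncing" failure mode the message describes. The two column helpers show the disagreement between the counting schemes: one non-BMP character is one scalar but two UTF-16 code units.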
1 parent 4686aff commit e3006b5

3 files changed: +137 -237 lines changed

