Add scientific notation support to XPath number parser #692

Closed
f-seoane wants to merge 1 commit into zeux:master from BBIO-1590276000:master

@f-seoane

  • Extended xpath_lexer to recognize exponential notation (e/E with optional +/- sign)
  • Updated check_string_to_number_format() to validate scientific notation syntax
  • Handles formats like: 1.5e10, 3.14E-5, 2e+8, .5e3
  • Added comprehensive test cases for scientific notation parsing

Changes were made in both lexer tokenization (for numbers starting with a digit and for numbers starting with a decimal point) and number validation, so that scientific notation is parsed correctly in XPath expressions.

@zeux
Owner

zeux commented Dec 20, 2025

pugixml implements XPath 1.0 as per W3C recommendation: https://www.w3.org/TR/1999/REC-xpath-19991116/

(the implementation is mostly complete, but has a couple namespace- and Unicode-related caveats, see https://pugixml.org/docs/manual.html#xpath.w3c)

The specification is quite clear about the number format: neither lexing of XPath expressions, nor number(string) function, supports scientific notation.
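For reference, the XPath 1.0 grammar defines `Number ::= Digits ('.' Digits?)? | '.' Digits` with `Digits ::= [0-9]+`, so there is no production for an exponent. A minimal standalone sketch of that rule follows; the function name `is_xpath1_number_literal` is hypothetical and this is not pugixml's actual lexer code. It checks only the literal itself, not the optional surrounding whitespace and leading minus sign that `number(string)` additionally accepts.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of pugixml): returns true when the whole
// string matches the XPath 1.0 production
//   Number ::= Digits ('.' Digits?)? | '.' Digits
// There is no sign and no exponent in this grammar.
bool is_xpath1_number_literal(const std::string& s)
{
    std::size_t i = 0;
    const std::size_t n = s.size();

    // Consumes a run of ASCII digits and returns how many were consumed.
    auto skip_digits = [&]() -> std::size_t {
        const std::size_t start = i;
        while (i < n && s[i] >= '0' && s[i] <= '9') ++i;
        return i - start;
    };

    const std::size_t int_digits = skip_digits();
    std::size_t frac_digits = 0;

    if (i < n && s[i] == '.') {
        ++i;
        frac_digits = skip_digits();
    }

    // The whole input must be consumed and at least one digit must be present.
    return i == n && (int_digits > 0 || frac_digits > 0);
}
```

Under this rule `1.5`, `1.` and `.5` are accepted, while `1.5e10`, `3.14E-5` and `2e+8` are rejected because the exponent part is left unconsumed.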

This was changed in XPath 2.0; however, pugixml doesn't support any XPath 2.0 features, doesn't plan to, and I would like to not have to consider individual XPath 2.0 features for inclusion - there's too many, and different people have different opinions on which features are important and which aren't. So I don't think I can merge this PR.

If you need to use scientific notation with XPath 1.0 queries, depending on the use case there are some potential options:

  • If the number is a query input, instead of dynamically constructing the query string you could use XPath variables, and set the variable from C++ code
  • If the number is a query output, and no arithmetic is performed, you could output a string instead and convert it on the C++ side
  • If the number is used in mathematical expressions in the query, I suppose it's possible to pre-process the XML document and replace attribute values that may use scientific notation with equivalent plain decimal strings.
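The second and third options can be sketched with the C++ standard library alone. The helpers below are hypothetical names, not pugixml API: `parse_scientific` converts a string taken from a query result into a `double` on the C++ side, and `to_plain_decimal` produces an exponent-free string that could be written back into the document before querying. Both assume the default "C" locale, and the default of 10 fractional digits is an arbitrary choice for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse text such as "1.5e10" or "3.14E-5" in C++.
// std::strtod accepts exponents (and also leading whitespace, "inf", "nan"
// and hex floats), so it is more permissive than XPath 1.0 number().
bool parse_scientific(const std::string& text, double& out)
{
    const char* begin = text.c_str();
    char* end = nullptr;
    const double value = std::strtod(begin, &end);

    // Reject empty input and trailing garbage.
    if (end == begin || *end != '\0') return false;

    out = value;
    return true;
}

// Hypothetical helper: format a value without an exponent so that XPath 1.0
// can read it. Values smaller than the chosen precision collapse to "0".
std::string to_plain_decimal(double value, int fraction_digits = 10)
{
    char buffer[512];
    const int written =
        std::snprintf(buffer, sizeof(buffer), "%.*f", fraction_digits, value);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof(buffer))
        return std::string();

    std::string result(buffer, static_cast<std::size_t>(written));

    // Trim trailing zeros after the decimal point, then a dangling point.
    if (result.find('.') != std::string::npos) {
        result.erase(result.find_last_not_of('0') + 1);
        if (!result.empty() && result.back() == '.') result.pop_back();
    }
    return result;
}
```

A pre-processing pass would call `parse_scientific` on each affected attribute value and, on success, store `to_plain_decimal(value)` in its place. The fixed precision loses very small magnitudes, so the digit count should be chosen to match the data.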

@f-seoane
Author

Sure, that's fine

@f-seoane f-seoane closed this Dec 22, 2025

2 participants