@pyup-bot pyup-bot commented Apr 5, 2020

This PR updates pyparsing from 2.0.1 to 2.4.7.

Changelog

2.4.6

------------------------------
- Fixed typos in White mapping of whitespace characters, to use
correct "\u" prefix instead of "u\".

- Fix bug in left-associative ternary operators defined using
infixNotation. First reported on StackOverflow by user Jeronimo.

- Backport of pyparsing_test namespace from 3.0.0, including
TestParseResultsAsserts mixin class defining unittest-helper
methods:
. def assertParseResultsEquals(
         self, result, expected_list=None, expected_dict=None, msg=None)
. def assertParseAndCheckList(
         self, expr, test_string, expected_list, msg=None, verbose=True)
. def assertParseAndCheckDict(
         self, expr, test_string, expected_dict, msg=None, verbose=True)
. def assertRunTestResults(
         self, run_tests_report, expected_parse_results=None, msg=None)
. def assertRaisesParseException(self, exc_type=ParseException, msg=None)

To use the methods in this mixin class, declare your unittest classes as:

 import unittest
 from pyparsing import pyparsing_test as ppt
 class MyParserTest(ppt.TestParseResultsAsserts, unittest.TestCase):
     ...

2.4.5

------------------------------
- NOTE: final release compatible with Python 2.x.

- Fixed issue with reading README.rst as part of setup.py's
initialization of the project's long_description, with a
non-ASCII space character causing errors when installing from
source on platforms where UTF-8 is not the default encoding.

2.4.4

--------------------------------
- Unresolved symbol reference in 2.4.3 release was masked by stdout
buffering in unit tests, thanks for the prompt heads-up, Ned
Batchelder!

2.4.3

------------------------------
- Fixed a bug in ParserElement.__eq__ that would for some parsers
create a recursion error at parser definition time. Thanks to
Michael Clerx for the assist. (Addresses issue 123)

- Fixed bug in indentedBlock where a block that ended at the end
of the input string could cause pyparsing to loop forever. Raised
as part of discussion on StackOverflow with geckos.

- Backports from pyparsing 3.0.0:
. __diag__.enable_all_warnings()
. Fixed bug in PrecededBy which caused infinite recursion, issue 127
. support for using regex-compiled RE to construct Regex expressions

2.4.2

- API change adding support for `expr[...]` - the original
code in 2.4.1 incorrectly implemented this as OneOrMore.
Code using this feature under this release should explicitly
use `expr[0, ...]` for ZeroOrMore and `expr[1, ...]` for
OneOrMore. In 2.4.2 you will be able to write `expr[...]`
equivalent to `ZeroOrMore(expr)`.

- Bug if composing And, Or, MatchFirst, or Each expressions
using a single expression rather than a list. This only affects code which uses
explicit expression construction using the And, Or, etc.
classes instead of using overloaded operators '+', '^', and
so on. If constructing an And using a single expression,
you may get an error that "cannot multiply ParserElement by
0 or (0, 0)" or a Python `IndexError`. Change code like

 cmd = Or(Word(alphas))

to

 cmd = Or([Word(alphas)])

(Note that this is not the recommended style for constructing
Or expressions.)

- Some newly-added `__diag__` switches are enabled by default,
which may give rise to noisy user warnings for existing parsers.
You can disable them using:

 import pyparsing as pp
 pp.__diag__.warn_multiple_tokens_in_named_alternation = False
 pp.__diag__.warn_ungrouped_named_tokens_in_collection = False
 pp.__diag__.warn_name_set_on_empty_Forward = False
 pp.__diag__.warn_on_multiple_string_args_to_oneof = False
 pp.__diag__.enable_debug_on_named_expressions = False

In 2.4.2 these will all be set to False by default.

2.4.2a1

----------------------------
It turns out I got the meaning of `[...]` absolutely backwards,
so I've deleted 2.4.1 and am repushing this release as 2.4.2a1
for people to give it a try before I can call it ready to go.

The `expr[...]` notation was pushed out to be synonymous with
`OneOrMore(expr)`, but this is really counter to most Python
notations (and even other internal pyparsing notations as well).
It should have been defined to be equivalent to ZeroOrMore(expr).

- Changed [...] to emit ZeroOrMore instead of OneOrMore.

- Removed code that treats ParserElements like iterables.

- Change all __diag__ switches to False.

2.4.1.1

-------------------------------
This is a re-release of version 2.4.1 to restore the release history
in PyPI, since the 2.4.1 release was deleted.

There are 3 known issues in this release, which are fixed in

2.4.1

--------------------------
- NOTE: Deprecated functions and features that will be dropped
in pyparsing 2.5.0 (planned next release):

. support for Python 2 - ongoing users running with
 Python 2 can continue to use pyparsing 2.4.1

. ParseResults.asXML() - if used for debugging, switch
 to using ParseResults.dump(); if used for data transfer,
 use ParseResults.asDict() to convert to a nested Python
 dict, which can then be converted to XML or JSON or
 other transfer format

. operatorPrecedence synonym for infixNotation -
 convert to calling infixNotation

. commaSeparatedList - convert to using
 pyparsing_common.comma_separated_list

. upcaseTokens and downcaseTokens - convert to using
 pyparsing_common.upcaseTokens and downcaseTokens

. __compat__.collect_all_And_tokens will not be settable to
 False to revert to pre-2.3.1 results name behavior -
 review use of names for MatchFirst and Or expressions
 containing And expressions, as they will return the
 complete list of parsed tokens, not just the first one.
 Use __diag__.warn_multiple_tokens_in_named_alternation
 (described below) to help identify those expressions
 in your parsers that will have changed as a result.

- A new shorthand notation has been added for repetition
expressions: expr[min, max], with '...' valid as a min
or max value:
  - expr[...] is equivalent to OneOrMore(expr)
  - expr[0, ...] is equivalent to ZeroOrMore(expr)
  - expr[1, ...] is equivalent to OneOrMore(expr)
  - expr[n, ...] or expr[n,] is equivalent
       to expr*n + ZeroOrMore(expr)
       (read as "n or more instances of expr")
  - expr[..., n] is equivalent to expr*(0, n)
  - expr[m, n] is equivalent to expr*(m, n)
Note that expr[..., n] and expr[m, n] do not raise an exception
if more than n exprs exist in the input stream.  If this
behavior is desired, then write expr[..., n] + ~expr.
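As a rough illustration of the bracket notation described above, the sketch below spells out the bounds explicitly, since the meaning of a bare `expr[...]` changed between the 2.4.1 and 2.4.2 releases. It assumes pyparsing 2.4.2 or later is installed; the variable names are made up for the example.

```python
import pyparsing as pp

number = pp.Word(pp.nums)

zero_or_more = number[0, ...]   # same as ZeroOrMore(number)
one_or_more = number[1, ...]    # same as OneOrMore(number)
at_most_two = number[..., 2]    # same as number*(0, 2)

# With no leading number, the zero-or-more form matches an empty list.
empty = zero_or_more.parseString("abc").asList()
several = one_or_more.parseString("1 2 3").asList()

# An upper bound stops matching after two numbers but does not fail
# when more are present in the input.
first_two = at_most_two.parseString("1 2 3").asList()

# Appending a negative lookahead turns the extra number into an error.
strict = number[..., 2] + ~number
try:
    strict.parseString("1 2 3")
    rejected = False
except pp.ParseException:
    rejected = True

print(empty, several, first_two, rejected)
```

Writing `expr[0, ...]` or `expr[1, ...]` keeps the intent unambiguous regardless of which 2.4.x release is installed.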

- '...' can also be used as short hand for SkipTo when used
in adding parse expressions to compose an And expression.

   Literal('start') + ... + Literal('end')
   And(['start', ..., 'end'])

are both equivalent to:

   Literal('start') + SkipTo('end')("_skipped*") + Literal('end')

The '...' form has the added benefit of not requiring repeating
the skip target expression. Note that the skipped text is
returned with '_skipped' as a results name, and that the contents of
`_skipped` will contain a list of text from all `...`s in the expression.

- '...' can also be used as a "skip forward in case of error" expression:

     expr = "start" + (Word(nums).setName("int") | ...) + "end"

     expr.parseString("start 456 end")
     ['start', '456', 'end']

     expr.parseString("start 456 foo 789 end")
     ['start', '456', 'foo 789 ', 'end']
     - _skipped: ['foo 789 ']

     expr.parseString("start foo end")
     ['start', 'foo ', 'end']
     - _skipped: ['foo ']

     expr.parseString("start end")
     ['start', '', 'end']
     - _skipped: ['missing <int>']

Note that in all the error cases, the '_skipped' results name is
present, showing a list of the extra or missing items.

This form is only valid when used with the '|' operator.

- Improved exception messages to show what was actually found, not
just what was expected.

 word = pp.Word(pp.alphas)
 pp.OneOrMore(word).parseString("aaa bbb 123", parseAll=True)

Former exception message:

 pyparsing.ParseException: Expected end of text (at char 8), (line:1, col:9)

New exception message:

 pyparsing.ParseException: Expected end of text, found '1' (at char 8), (line:1, col:9)

- Added diagnostic switches to help detect and warn about common
parser construction mistakes, or enable additional parse
debugging. Switches are attached to the pyparsing.__diag__
namespace object:
  - warn_multiple_tokens_in_named_alternation - flag to enable warnings when a results
    name is defined on a MatchFirst or Or expression with one or more And subexpressions
    (default=True)
  - warn_ungrouped_named_tokens_in_collection - flag to enable warnings when a results
    name is defined on a containing expression with ungrouped subexpressions that also
    have results names (default=True)
  - warn_name_set_on_empty_Forward - flag to enable warnings when a Forward is defined
    with a results name, but has no contents defined (default=False)
  - warn_on_multiple_string_args_to_oneof - flag to enable warnings when oneOf is
    incorrectly called with multiple str arguments (default=True)
  - enable_debug_on_named_expressions - flag to auto-enable debug on all subsequent
    calls to ParserElement.setName() (default=False)

warn_multiple_tokens_in_named_alternation is intended to help
those who currently have set __compat__.collect_all_And_tokens to
False as a workaround for using the pre-2.3.1 code with named
MatchFirst or Or expressions containing an And expression.

- Added ParseResults.from_dict classmethod, to simplify creation
of a ParseResults with results names using a dict, which may be nested.
This makes it easy to add a sub-level of named items to the parsed
tokens in a parse action.

- Added asKeyword argument (default=False) to oneOf, to force
keyword-style matching on the generated expressions.

- ParserElement.runTests now accepts an optional 'file' argument to
redirect test output to a file-like object (such as a StringIO,
or opened file). Default is to write to sys.stdout.

- conditionAsParseAction is a helper method for constructing a
parse action method from a predicate function that simply
returns a boolean result. Useful for those places where a
predicate cannot be added using addCondition, but must be
converted to a parse action (such as in infixNotation). May be
used as a decorator if default message and exception types
can be used. See ParserElement.addCondition for more details
about the expected signature and behavior for predicate condition
methods.

- While investigating issue 93, I found that Or and
addCondition could interact to select an alternative that
is not the longest match. This is because Or first checks
all alternatives for matches without running attached
parse actions or conditions, orders by longest match, and
then rechecks for matches with conditions and parse actions.
Some expressions, when checking with conditions, may end
up matching on a shorter token list than originally matched,
but would be selected because of its original priority.
This matching code has been expanded to do more extensive
searching for matches when a second-pass check matches a
smaller list than in the first pass.

- Fixed issue 87, a regression in indented block.
Reported by Renz Bagaporo, who submitted a very nice repro
example, which makes the bug-fixing process a lot easier,
thanks!

- Fixed MemoryError issue 85 and 91 with str generation for
Forwards. Thanks decalage2 and Harmon758 for your patience.

- Modified setParseAction to accept None as an argument,
indicating that all previously-defined parse actions for the
expression should be cleared.

- Modified pyparsing_common.real and sci_real to parse reals
without leading integer digits before the decimal point,
consistent with Python real number formats. Original PR 98
submitted by ansobolev.

- Modified runTests to call postParse function before dumping out
the parsed results - allows for postParse to add further results,
such as indications of additional validation success/failure.

- Updated statemachine example: refactored state transitions to use
overridden classmethods; added <statename>Mixin class to simplify
definition of application classes that "own" the state object and
delegate to it to model state-specific properties and behavior.

- Added example nested_markup.py, showing a simple wiki markup with
nested markup directives, and illustrating the use of '...' for
skipping over input to match the next expression. (This example
uses syntax that is not valid under Python 2.)

- Rewrote delta_time.py example (renamed from deltaTime.py) to
fix some omitted formats and upgrade to latest pyparsing idioms,
beginning with writing an actual BNF.

- With the help and encouragement from several contributors, including
Matěj Cepl and Cengiz Kaygusuz, I've started cleaning up the internal
coding styles in core pyparsing, bringing it up to modern coding
practices from pyparsing's early development days dating back to
2003. Whitespace has been largely standardized along PEP8 guidelines,
removing extra spaces around parentheses, and adding them around
arithmetic operators and after colons and commas. I was going to hold
off on doing this work until after 2.4.1, but after cleaning up a
few trial classes, the difference was so significant that I continued
on to the rest of the core code base. This should facilitate future
work and submitted PRs, allowing them to focus on substantive code
changes, and not get sidetracked by whitespace issues.

2.4.0

---------------------------
- Well, it looks like the API change that was introduced in 2.3.1 was more
drastic than expected, so for a friendlier forward upgrade path, this
release:
. Bumps the current version number to 2.4.0, to reflect this
 incompatible change.
. Adds a pyparsing.__compat__ object for specifying compatibility with
 future breaking changes.
. Conditionalizes the API-breaking behavior, based on the value
 pyparsing.__compat__.collect_all_And_tokens.  By default, this value
 will be set to True, reflecting the new bugfixed behavior. To set this
 value to False, add to your code:

     import pyparsing
     pyparsing.__compat__.collect_all_And_tokens = False

. User code that is dependent on the pre-bugfix behavior can restore
 it by setting this value to False.

In 2.5 and later versions, the conditional code will be removed and
setting the flag to True or False in these later versions will have no
effect.

- Updated unitTests.py and simple_unit_tests.py to be compatible with
"python setup.py test". To run tests using setup, do:

   python setup.py test
   python setup.py test -s unitTests.suite
   python setup.py test -s simple_unit_tests.suite

Prompted by issue 83 and PR submitted by bdragon28, thanks.

- Fixed bug in runTests handling '\n' literals in quoted strings.

- Added tag_body attribute to the start tag expressions generated by
makeHTMLTags, so that you can avoid using SkipTo to roll your own
tag body expression:

   a, aEnd = pp.makeHTMLTags('a')
   link = a + a.tag_body("displayed_text") + aEnd
   for t in link.searchString(html_page):
       print(t.displayed_text, '->', t.startA.href)

- indentedBlock failure handling was improved; PR submitted by TMiguelT,
thanks!

- Address Py2 incompatibility in simpleUnitTests, plus explain() and
Forward str() cleanup; PRs graciously provided by eswald.

- Fixed docstring with embedded '\w', which creates SyntaxWarnings in
Py3.8, issue 80.

- Examples:

- Added example parser for rosettacode.org tutorial compiler.

- Added example to show how an HTML table can be parsed into a
 collection of Python lists or dicts, one per row.

- Updated SimpleSQL.py example to handle nested selects, reworked
 'where' expression to use infixNotation.

- Added include_preprocessor.py, similar to macroExpander.py.

- Examples using makeHTMLTags use new tag_body expression when
 retrieving a tag's body text.

- Updated examples that are runnable as unit tests:

     python setup.py test -s examples.antlr_grammar_tests
     python setup.py test -s examples.test_bibparse

2.3.1

-----------------------------
- POSSIBLE API CHANGE: this release fixes a bug when results names were
attached to a MatchFirst or Or object containing an And object.
Previously, a results name on an And object within an enclosing MatchFirst
or Or could return just the first token in the And. Now, all the tokens
matched by the And are correctly returned. This may result in subtle
changes in the tokens returned if you have this condition in your pyparsing
scripts.

- New staticmethod ParseException.explain() to help diagnose parse exceptions
by showing the failing input line and the trace of ParserElements in
the parser leading up to the exception. explain() returns a multiline
string listing each element by name. (This is still an experimental
method, and the method signature and format of the returned string may
evolve over the next few releases.)

Example:
     # define a parser to parse an integer followed by an
     # alphabetic word
     expr = (pp.Word(pp.nums).setName("int")
             + pp.Word(pp.alphas).setName("word"))
     try:
         # parse a string with a numeric second value instead of alpha
         expr.parseString("123 355")
     except pp.ParseException as pe:
         print(pp.ParseException.explain(pe))

Prints:
     123 355
         ^
     ParseException: Expected word (at char 4), (line:1, col:5)
     __main__.ExplainExceptionTest
     pyparsing.And - {int word}
     pyparsing.Word - word

explain() will accept any exception type and will list the function
names and parse expressions in the stack trace. This is especially
useful when an exception is raised in a parse action.

Note: explain() is only supported under Python 3.

- Fix bug in dictOf which could match an empty sequence, making it
infinitely loop if wrapped in a OneOrMore.

- Added unicode sets to pyparsing_unicode for Latin-A and Latin-B ranges.

- Added ability to define custom unicode sets as combinations of other sets
using multiple inheritance.

 class Turkish_set(pp.pyparsing_unicode.Latin1, pp.pyparsing_unicode.LatinA):
     pass

 turkish_word = pp.Word(Turkish_set.alphas)

- Updated state machine import examples, with state machine demos for:
. traffic light
. library book checkin/checkout
. document review/approval

In the traffic light example, you can use the custom 'statemachine' keyword
to define the states for a traffic light, and have the state classes
auto-generated for you:

   statemachine TrafficLightState:
       Red -> Green
       Green -> Yellow
       Yellow -> Red

Similar for state machines with named transitions, like the library book
state example:

   statemachine LibraryBookState:
       New -(shelve)-> Available
       Available -(reserve)-> OnHold
       OnHold -(release)-> Available
       Available -(checkout)-> CheckedOut
       CheckedOut -(checkin)-> Available

Once the classes are defined, then additional Python code can reference those
classes to add class attributes, instance methods, etc.

See the examples in examples/statemachine

- Added an example parser for the decaf language. This language is used in
CS compiler classes in many colleges and universities.

- Fixup of docstrings to Sphinx format, inclusion of test files in the source
package, and convert markdown to rst throughout the distribution, great job
by Matěj Cepl!

- Expanded the whitespace characters recognized by the White class to include
all unicode defined spaces. Suggested in Issue 51 by rtkjbillo.

- Added optional postParse argument to ParserElement.runTests() to add a
custom callback to be called for test strings that parse successfully. Useful
for running tests that do additional validation or processing on the parsed
results. See updated chemicalFormulas.py example.

- Removed distutils fallback in setup.py. If installing the package fails,
please update to the latest version of setuptools. Plus overall project code
cleanup (CRLFs, whitespace, imports, etc.), thanks Jon Dufresne!

- Fix bug in CaselessKeyword, to make its behavior consistent with
Keyword(caseless=True). Fixes Issue 65 reported by telesphore.

2.3.0

-----------------------------
- NEW SUPPORT FOR UNICODE CHARACTER RANGES
This release introduces the pyparsing_unicode namespace class, defining
a series of language character sets to simplify the definition of alphas,
nums, alphanums, and printables in the following language sets:
. Arabic
. Chinese
. Cyrillic
. Devanagari
. Greek
. Hebrew
. Japanese (including Kanji, Katakana, and Hiragana subsets)
. Korean
. Latin1 (includes 7 and 8-bit Latin characters)
. Thai
. CJK (combination of Chinese, Japanese, and Korean sets)

For example, your code can define words using:

 korean_word = Word(pyparsing_unicode.Korean.alphas)

See their use in the updated examples greetingInGreek.py and
greetingInKorean.py.

This namespace class also offers access to these sets using their
unicode identifiers.

- POSSIBLE API CHANGE: Fixed bug where a parse action that explicitly
returned the input ParseResults could add another nesting level in
the results if the current expression had a results name.

     vals = pp.OneOrMore(pp.pyparsing_common.integer)("int_values")

     def add_total(tokens):
         tokens['total'] = sum(tokens)
         return tokens  # this line can be removed

     vals.addParseAction(add_total)
     print(vals.parseString("244 23 13 2343").dump())

Before the fix, this code would print (note the extra nesting level):

 [244, 23, 13, 2343]
 - int_values: [244, 23, 13, 2343]
   - int_values: [244, 23, 13, 2343]
   - total: 2623
 - total: 2623

With the fix, this code now prints:

 [244, 23, 13, 2343]
 - int_values: [244, 23, 13, 2343]
 - total: 2623

This fix will change the structure of ParseResults returned if a
program defines a parse action that returns the tokens that were
sent in. This is not necessary, and statements like "return tokens"
in the example above can be safely deleted prior to upgrading to
this release, in order to avoid the bug and get the new behavior.

Reported by seron in Issue 22, nice catch!

- POSSIBLE API CHANGE: Fixed a related bug where a results name
erroneously created a second level of hierarchy in the returned
ParseResults. The intent for accumulating results names into ParseResults
is that, in the absence of Group'ing, all names get merged into a
common namespace. This allows us to write:

    key_value_expr = (Word(alphas)("key") + '=' + Word(nums)("value"))
    result = key_value_expr.parseString("a = 100")

and have result structured as {"key": "a", "value": "100"}
instead of [{"key": "a"}, {"value": "100"}].

However, if a named expression is used in a higher-level non-Group
expression that *also* has a name, a false sub-level would be created
in the namespace:

     num = pp.Word(pp.nums)
     num_pair = ("[" + (num("A") + num("B"))("values") + "]")
     U = num_pair.parseString("[ 10 20 ]")
     print(U.dump())

Since there is no grouping, "A", "B", and "values" should all appear
at the same level in the results, as:

     ['[', '10', '20', ']']
     - A: '10'
     - B: '20'
     - values: ['10', '20']

Instead, an extra level of "A" and "B" show up under "values":

     ['[', '10', '20', ']']
     - A: '10'
     - B: '20'
     - values: ['10', '20']
       - A: '10'
       - B: '20'

This bug has been fixed. Now, if this hierarchy is desired, then a
Group should be added:

     num_pair = ("[" + pp.Group(num("A") + num("B"))("values") + "]")

Giving:

     ['[', ['10', '20'], ']']
     - values: ['10', '20']
       - A: '10'
       - B: '20'

But in no case should "A" and "B" appear in multiple levels. This bug-fix
fixes that.

If you have current code which relies on this behavior, then add or remove
Groups as necessary to get your intended results structure.

Reported by Athanasios Anastasiou.

- IndexError's raised in parse actions will get explicitly reraised
as ParseExceptions that wrap the original IndexError. Since
IndexError sometimes occurs as part of pyparsing's normal parsing
logic, IndexErrors that are raised during a parse action may have
gotten silently reinterpreted as parsing errors. To retain the
information from the IndexError, these exceptions will now be
raised as ParseExceptions that reference the original IndexError.
This wrapping will only be visible when run under Python3, since it
emulates "raise ... from ..." syntax.

Addresses Issue 4, reported by guswns0528.
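The wrapping described above relies on Python 3 exception chaining. The following is a minimal sketch of that mechanism only, not pyparsing's implementation; `ParseError`, `run_parse_action`, and `buggy_action` are hypothetical names invented for the illustration.

```python
class ParseError(Exception):
    """Hypothetical stand-in for a parser's own exception type."""


def run_parse_action(action, tokens):
    """Call a parse action, wrapping any IndexError it raises."""
    try:
        return action(tokens)
    except IndexError as exc:
        # "raise ... from ..." stores the original error on __cause__,
        # so the traceback still shows where the IndexError came from.
        raise ParseError("IndexError raised in parse action") from exc


def buggy_action(tokens):
    return tokens[5]  # out of range for a two-item token list


try:
    run_parse_action(buggy_action, ["a", "b"])
except ParseError as err:
    wrapped = err  # keep a reference; "err" is cleared after the block
    print(type(wrapped.__cause__).__name__)
```

Inspecting `__cause__` on the caught exception is one way to tell a genuine parse failure from a bug inside a parse action.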

- Added Char class to simplify defining expressions of a single
character. (Char("abc") is equivalent to Word("abc", exact=1))
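A short sketch of the equivalence stated above, assuming pyparsing 2.3.0 or later is installed. The vowel character set and the sample input are arbitrary choices for the example.

```python
import pyparsing as pp

# Both expressions match exactly one character from the given set.
vowel = pp.Char("aeiou")
vowel_word = pp.Word("aeiou", exact=1)

from_char = vowel.parseString("eagle").asList()
from_word = vowel_word.parseString("eagle").asList()
print(from_char, from_word)
```

Although "ea" are both vowels, each expression consumes only the first character, which is the point of the `exact=1` form.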

- Added class PrecededBy to perform lookbehind tests. PrecededBy is
used in the same way as FollowedBy, passing in an expression that
must occur just prior to the current parse location.

For fixed-length expressions like a Literal, Keyword, Char, or a
Word with an `exact` or `maxLen` length given, `PrecededBy(expr)`
is sufficient. For varying length expressions like a Word with no
given maximum length, `PrecededBy` must be constructed with an
integer `retreat` argument, as in
`PrecededBy(Word(alphas, nums), retreat=10)`, to specify the maximum
number of characters pyparsing must look backward to make a match.
pyparsing will check all the values from 1 up to retreat characters
back from the current parse location.

When stepping backwards through the input string, PrecededBy does
*not* skip over whitespace.

PrecededBy can be created with a results name so that, even though
it always returns an empty parse result, the result *can* include
named results.

Idea first suggested in Issue 30 by Freakwill.

- Updated FollowedBy to accept expressions that contain named results,
so that results names defined in the lookahead expression will be
returned, even though FollowedBy always returns an empty list.
Inspired by the same feature implemented in PrecededBy.

2.2.2

-------------------------------
- Fixed bug in SkipTo: if a SkipTo expression was skipping to
an expression that returned a list (such as an And), and the
SkipTo was saved as a named result, the named result could be
saved as a ParseResults. It should always be saved as a string.
Issue 28, reported by seron.

- Added simple_unit_tests.py, as a collection of easy-to-follow unit
tests for various classes and features of the pyparsing library.
Primary intent is more to be instructional than actually rigorous
testing. Complex tests can still be added in the unitTests.py file.

- New features added to the Regex class:
- optional asGroupList parameter, returns all the capture groups as
 a list
- optional asMatch parameter, returns the raw re.match result
- new sub(repl) method, which adds a parse action calling
 re.sub(pattern, repl, parsed_result). Simplifies creating
 Regex expressions to be used with transformString. Like re.sub,
 repl may be an ordinary string (similar to using pyparsing's
 replaceWith), or may contain references to capture groups by group
 number, or may be a callable that takes an re match group and
 returns a string.

 For instance:
     expr = pp.Regex(r"([Hh]\d):\s*(.*)").sub(r"<\1>\2</\1>")
     expr.transformString("h1: This is the title")

 will return
     <h1>This is the title</h1>
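Since `sub` is described as calling `re.sub`, the replacement semantics can be checked with the standard library alone. This sketch reuses the pattern from the example above; the callable `upper_tag` is a hypothetical helper added to show the callable form of `repl`.

```python
import re

pattern = re.compile(r"([Hh]\d):\s*(.*)")

# String replacement with numbered references to the capture groups.
converted = pattern.sub(r"<\1>\2</\1>", "h1: This is the title")


def upper_tag(match):
    """Build the replacement from the match object, upper-casing the tag."""
    tag = match.group(1).upper()
    return "<{0}>{1}</{0}>".format(tag, match.group(2))


# Callable replacement: re.sub passes each match object to the function.
shouted = pattern.sub(upper_tag, "h2: Section")
print(converted)
print(shouted)
```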

- Fixed omission of LICENSE file in source tarball, also added
CODE_OF_CONDUCT.md per GitHub community standards.

2.2.1

-------------------------------
- Applied changes necessary to migrate hosting of pyparsing source
over to GitHub. Many thanks for help and contributions from hugovk,
jdufresne, and cngkaygusuz among others through this transition,
sorry it took me so long!

- Fixed import of collections.abc to address DeprecationWarnings
in Python 3.7.

- Updated oc.py example to support function calls in arithmetic
expressions; fixed regex for '==' operator; and added packrat
parsing. Raised on the pyparsing wiki by Boris Marin, thanks!

- Fixed bug in select_parser.py example, group_by_terms was not
reported. Reported on SF bugs by Adam Groszer, thanks Adam!

- Added "Getting Started" section to the module docstring, to
guide new users to the most common starting points in pyparsing's
API.

- Fixed bug in Literal and Keyword classes, which erroneously
raised IndexError instead of ParseException.

2.2.0

---------------------------
- Bumped minor version number to reflect compatibility issues with
OneOrMore and ZeroOrMore bugfixes in 2.1.10. (2.1.10 fixed a bug
that was introduced in 2.1.4, but the fix could break code
written against 2.1.4 - 2.1.9.)

- Updated setup.py to address recursive import problems now
that pyparsing is part of 'packaging' (used by setuptools).
Patch submitted by Joshua Root, much thanks!

- Fixed KeyError issue reported by Yann Bizeul when using packrat
parsing in the Graphite time series database, thanks Yann!

- Fixed incorrect usages of '\' in literals, as described in
https://docs.python.org/3/whatsnew/3.6.html#deprecated-python-behavior
Patch submitted by Ville Skyttä - thanks!
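The Python 3.6 change deprecates unrecognized backslash escapes in ordinary string literals. A small sketch of the usual remedy, using only the standard library (the sample pattern and text are made up for the example):

```python
import re

# A backslash followed by "d" is not a recognized string escape, so writing
# it in an ordinary literal triggers a warning on Python 3.6 and later.
# Doubling the backslash or using a raw string avoids the warning, and both
# spellings produce the same pattern text.
escaped = "\\d+"
raw = r"\d+"

same_pattern = escaped == raw
digits = re.findall(raw, "a1 b22 c333")
print(same_pattern, digits)
```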

- Minor internal change when using '-' operator, to be compatible
with ParserElement.streamline() method.

- Expanded infixNotation to accept a list or tuple of parse actions
to attach to an operation.

- New unit test added for dill support for storing pyparsing parsers.
Ordinary Python pickle can be used to pickle pyparsing parsers as
long as they do not use any parse actions. The 'dill' module is an
extension to pickle which *does* support pickling of attached
parse actions.

2.1.10

-------------------------------
- Fixed bug in reporting named parse results for ZeroOrMore
expressions, thanks Ethan Nash for reporting this!

- Fixed behavior of LineStart to be much more predictable.
LineStart can now be used to detect if the next parse position
is col 1, factoring in potential leading whitespace (which would
cause LineStart to fail). Also fixed a bug in col, which is
used in LineStart, where '\n's were erroneously considered to
be column 1.

- Added support for multiline test strings in runTests.

- Fixed bug in ParseResults.dump when keys were not strings.
Also changed display of string values to show them in quotes,
to help distinguish parsed numeric strings from parsed integers
that have been converted to Python ints.

2.1.9

-------------------------------
- Added class CloseMatch, a variation on Literal which matches
"close" matches, that is, strings with at most 'n' mismatching
characters.
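To make the idea of a "close" match concrete, here is a standalone sketch that compares two strings position by position. It is not the CloseMatch implementation; `mismatch_positions` and `is_close_match` are hypothetical helpers, and the sketch assumes the candidate has the same length as the target.

```python
def mismatch_positions(candidate, target):
    """Return the indexes at which two equal-length strings differ."""
    return [i for i, (c, t) in enumerate(zip(candidate, target)) if c != t]


def is_close_match(candidate, target, max_mismatches=1):
    """True if candidate differs from target in at most max_mismatches places."""
    if len(candidate) != len(target):
        return False
    return len(mismatch_positions(candidate, target)) <= max_mismatches


one_off = is_close_match("ATCATCGAATGGA", "ATCATCGAAXGGA")
two_off = is_close_match("ATCATCGAATGGA", "ATCAXCGAAXGGA")
relaxed = is_close_match("ATCATCGAATGGA", "ATCAXCGAAXGGA", max_mismatches=2)
print(one_off, two_off, relaxed)
```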

- Fixed bug in Keyword.setDefaultKeywordChars(), reported by Kobayashi
Shinji - nice catch, thanks!

- Minor API change in pyparsing_common. Renamed some of the common
expressions to PEP8 format (to be consistent with the other
pyparsing_common expressions):
. signedInteger -> signed_integer
. sciReal -> sci_real

Also, in trying to stem the API bloat of pyparsing, I've copied
some of the global expressions and helper parse actions into
pyparsing_common, with the originals to be deprecated and removed
in a future release:
. commaSeparatedList -> pyparsing_common.comma_separated_list
. upcaseTokens -> pyparsing_common.upcaseTokens
. downcaseTokens -> pyparsing_common.downcaseTokens

(I don't expect any other expressions, like the comment expressions,
quotedString, or the Word-helping strings like alphas, nums, etc.
to migrate to pyparsing_common - they are just too pervasive. As for
the PEP8 vs camelCase naming, all the expressions are PEP8, while
the parse actions in pyparsing_common are still camelCase. It's a
small step - when pyparsing 3.0 comes around, everything will change
to PEP8 snake case.)

- Fixed Python3 compatibility bug when using dict keys() and values()
in ParseResults.getName().

- After some prodding, I've reworked the unitTests.py file for
pyparsing over the past few releases. It uses some variations on
unittest to handle my testing style. The test now:
. auto-discovers its test classes (while maintaining their order
 of definition)
. suppresses voluminous 'print' output for tests that pass

2.1.8

----------------------------
- Fixed issue in the optimization to _trim_arity, when the full
stacktrace is retrieved to determine if a TypeError is raised in
pyparsing or in the caller's parse action. Code was traversing
the full stacktrace, and potentially encountering UnicodeDecodeError.

- Fixed bug in ParserElement.inlineLiteralsUsing, causing infinite
loop with Suppress.

- Fixed bug in Each, when merging named results from multiple
expressions in a ZeroOrMore or OneOrMore. Also fixed bug when
ZeroOrMore expressions were erroneously treated as required
expressions in an Each expression.

- Added a few more inline doc examples.

- Improved use of runTests in several example scripts.

2.1.7

----------------------------
- Fixed regression reported by Andrea Censi (surfaced in PyContracts
tests) when using ParseSyntaxExceptions (raised when using operator '-')
with packrat parsing.

- Minor fix to oneOf, to accept all iterables, not just space-delimited
strings and lists. (If you have a list or set of strings, it is
not necessary to concat them using ' '.join to pass them to oneOf,
oneOf will accept the list or set or generator directly.)
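
For instance (a minimal sketch, assuming pyparsing 2.4.x; the HTTP verbs are just sample data), the same alternation can be built from a string, a list, or a generator:

```python
from pyparsing import oneOf

from_string = oneOf("GET POST PUT")
from_list = oneOf(["GET", "POST", "PUT"])
from_generator = oneOf(v.upper() for v in ("get", "post", "put"))

# All three expressions accept the same alternatives.
parsed = [expr.parseString("POST")[0]
          for expr in (from_string, from_list, from_generator)]
```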

2.1.6

----------------------------
- *Major packrat upgrade*, inspired by patch provided by Tal Einat -
many, many, thanks to Tal for working on this! Tal's tests show
faster parsing performance (2X in some tests), *and* memory reduction
from 3GB down to ~100MB! Requires no changes to existing code using
packratting. (Uses OrderedDict, available in Python 2.7 and later.
For Python 2.6 users, will attempt to import from ordereddict
backport. If not present, will implement pure-Python Fifo dict.)

- Minor API change - to better distinguish between the flexible
numeric types defined in pyparsing_common, I've changed "numeric"
(which parsed numbers of different types and returned int for ints,
float for floats, etc.) and "number" (which parsed numbers of int
or float type, and returned all floats) to "number" and "fnumber"
respectively. I hope the "f" prefix of "fnumber" will be a better
indicator of its internal conversion of parsed values to floats,
while the generic "number" is similar to the flexible number syntax
in other languages. Also fixed a bug in pyparsing_common.numeric
(now renamed to pyparsing_common.number), integers were parsed and
returned as floats instead of being retained as ints.
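
The difference between the two renamed expressions can be sketched as follows (assuming pyparsing 2.4.x):

```python
from pyparsing import pyparsing_common as ppc

# number keeps the parsed type: ints stay ints, floats stay floats
n_int = ppc.number.parseString("42")[0]
n_float = ppc.number.parseString("3.14")[0]

# fnumber always converts to float
f_int = ppc.fnumber.parseString("42")[0]
```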

- Fixed bug in upcaseTokens and downcaseTokens introduced in 2.1.5,
when the parse action was used in conjunction with results names.
Reported by Steven Arcangeli from the dql project, thanks for your
patience, Steven!

- Major change to docs! After seeing some comments on reddit about
general issue with docs of Python modules, and thinking that I'm a
little overdue in doing some doc tuneup on pyparsing, I decided to
follow the suggestions of the redditor and add more inline examples
to the pyparsing reference documentation. I hope this addition
will clarify some of the more common questions people have, especially
when first starting with pyparsing/Python.

- Deprecated ParseResults.asXML. I've never been too happy with this
method, and it usually forces some unnatural code in the parsers in
order to get decent tag names. The amount of guesswork that asXML
has to do to try to match names with values should have been a red
flag from day one. If you are using asXML, you will need to implement
your own ParseResults->XML serialization. Or consider migrating to
a more current format such as JSON (which is very easy to do:
results_as_json = json.dumps(parse_result.asDict())). Hopefully, when
I remove this code in a future version, I'll also be able to simplify
some of the craziness in ParseResults, which IIRC was only there to try
to make asXML work.
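
A minimal sketch of the JSON route mentioned above, assuming pyparsing 2.4.x; the "key = value" grammar and its results names are hypothetical, chosen only to have something to serialize:

```python
import json
from pyparsing import Suppress, Word, alphas, nums

# Hypothetical grammar with two results names
setting = Word(alphas)("key") + Suppress("=") + Word(nums)("value")
parse_result = setting.parseString("timeout = 30")

# asDict() exposes the named results as a plain dict
results_as_json = json.dumps(parse_result.asDict(), sort_keys=True)
```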

- Updated traceParseAction parse action decorator to show the repr
of the input and output tokens, instead of the str format, since
str has been simplified to just show the token list content.

(The change to ParseResults.__str__ occurred in pyparsing 2.0.4, but
it seems that didn't make it into the release notes - sorry! Too
many users, especially beginners, were confused by the
"([token_list], {names_dict})" str format for ParseResults, thinking
they were getting a tuple containing a list and a dict. The full form
can be seen if using repr().)

For tracing tokens in and out of parse actions, the more complete
repr form provides important information when debugging parse actions.


Version 2.1.5 - June, 2016
------------------------------
- Added ParserElement.split() generator method, similar to re.split().
Includes optional arguments maxsplit (to limit the number of splits),
and includeSeparators (to include the separating matched text in the
returned output, default=False).
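
A short sketch of split() and its two optional arguments (assuming pyparsing 2.4.x; the comma-separated input is sample data):

```python
from pyparsing import Literal

comma = Literal(",")

# split() is a generator, so wrap it in list() to see all pieces
fields = list(comma.split("a,b,c,d"))
limited = list(comma.split("a,b,c,d", maxsplit=2))
with_seps = list(comma.split("a,b", includeSeparators=True))
```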

- Added a new parse action construction helper tokenMap, which will
apply a function and optional arguments to each element in a
ParseResults. So this parse action:

   def lowercase_all(tokens):
       return [str(t).lower() for t in tokens]
   OneOrMore(Word(alphas)).setParseAction(lowercase_all)

can now be written:

   OneOrMore(Word(alphas)).setParseAction(tokenMap(str.lower))

Also simplifies writing conversion parse actions like:

   integer = Word(nums).setParseAction(lambda t: int(t[0]))

to just:

   integer = Word(nums).setParseAction(tokenMap(int))

If additional arguments are necessary, they can be included in the
call to tokenMap, as in:

   hex_integer = Word(hexnums).setParseAction(tokenMap(int, 16))

- Added more expressions to pyparsing_common:
. IPv4 and IPv6 addresses (including long, short, and mixed forms
 of IPv6)
. MAC address
. ISO8601 date and date time strings (with named fields for year, month, etc.)
. UUID (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
. hex integer (returned as int)
. fraction (integer '/' integer, returned as float)
. mixed integer (integer '-' fraction, or just fraction, returned as float)
. stripHTMLTags (parse action to remove tags from HTML source)
. parse action helpers convertToDate and convertToDatetime to do custom parse
 time conversions of parsed ISO8601 strings
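
A few of these expressions in use, as a sketch assuming pyparsing 2.4.x (the attribute names below are those of the 2.4.x pyparsing_common class; the inputs are arbitrary samples):

```python
import datetime
from pyparsing import pyparsing_common as ppc

hex_value = ppc.hex_integer.parseString("FF")[0]      # converted to int
half = ppc.fraction.parseString("1/2")[0]             # converted to float
mixed = ppc.mixed_integer.parseString("1-3/4")[0]     # integer plus fraction
ip = ppc.ipv4_address.parseString("192.168.0.1")[0]   # stays a string

# convertToDate() builds a parse action; attach it to a copy of the expression
date_expr = ppc.iso8601_date.copy().setParseAction(ppc.convertToDate())
day = date_expr.parseString("1999-12-31")[0]
```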

- runTests now returns a two-tuple: success if all tests succeed,
and an output list of each test and its output lines.

- Added failureTests argument (default=False) to runTests, so that
tests can be run that are expected failures, and runTests' success
value will return True only if all tests *fail* as expected. Also,
parseAll now defaults to True.
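
The return value and the failureTests flag might be used like this (a sketch assuming pyparsing 2.4.x; printResults=False only keeps the example quiet):

```python
from pyparsing import Word, nums

integer = Word(nums)

# Tests that should parse: success is True when every line parses
success, report = integer.runTests("""
    1
    23
    """, printResults=False)

# Tests that should fail: success is True when every line fails
all_failed, _ = integer.runTests("""
    abc
    x1
    """, printResults=False, failureTests=True)
```

The second element of the tuple holds one entry per test line, which is handy for asserting on individual results in a unit test.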

- New example numerics.py, shows samples of parsing integer and real
numbers using locale-dependent formats:

 4.294.967.295,000
 4 294 967 295,000
 4,294,967,295.000

2.1.4

------------------------------
- Split out the '==' behavior in ParserElement, now implemented
as the ParserElement.matches() method. Using '==' for string test
purposes will be removed in a future release.
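
A minimal sketch of matches() as the replacement for '==' string tests, assuming pyparsing 2.4.x:

```python
from pyparsing import Word, nums

integer = Word(nums)

ok = integer.matches("100")
# By default the whole string must match
bad = integer.matches("100 apples")
# parseAll=False accepts a match on a leading portion of the string
partial = integer.matches("100 apples", parseAll=False)
```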

- Expanded capabilities of runTests(). Will now accept embedded
comments (default is Python style, leading '#' character, but
customizable). Comments will be emitted along with the tests and
test output. Useful during test development, to create a test string
consisting only of test case description comments separated by
blank lines, and then fill in the test cases. Will also highlight
ParseFatalExceptions with "(FATAL)".

- Added a 'pyparsing_common' class containing common/helpful little
expressions such as integer, float, identifier, etc. I used this
class as a sort of embedded namespace, to contain these helpers
without further adding to pyparsing's namespace bloat.

- Minor enhancement to traceParseAction decorator, to retain the
parse action's name for the trace output.

- Added optional 'fatal' keyword arg to addCondition, to indicate that
a condition failure should halt parsing immediately.
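
To illustrate the 'fatal' argument (a sketch assuming pyparsing 2.4.x; make_year and outcome are hypothetical helpers written for this example):

```python
from pyparsing import ParseException, ParseFatalException, Word, nums

def make_year(fatal):
    # Hypothetical helper: an integer expression restricted to years >= 2000
    year = Word(nums).setParseAction(lambda t: int(t[0]))
    year.addCondition(lambda t: t[0] >= 2000,
                      message="year must be 2000 or later", fatal=fatal)
    return year

def outcome(expr, text):
    # Report which kind of exception, if any, the condition raised
    try:
        return expr.parseString(text)[0]
    except ParseFatalException:
        return "fatal"
    except ParseException:
        return "recoverable"

normal_failure = outcome(make_year(False), "1999")
fatal_failure = outcome(make_year(True), "1999")
accepted = outcome(make_year(True), "2024")
```

With fatal=False the failure is an ordinary ParseException, so an enclosing alternation could still try other alternatives; with fatal=True parsing stops at once.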

2.1.3

------------------------------
- _trim_arity fix in 2.1.2 was very version-dependent on Py 3.5.0.
Now works for Python 2.x, 3.3, 3.4, 3.5.0, and 3.5.1 (and hopefully
beyond).

2.1.2

------------------------------
- Fixed bug in _trim_arity when pyparsing code is included in a
PyInstaller, reported by maluwa.

- Fixed catastrophic regex backtracking in implementation of the
quoted string expressions (dblQuotedString, sglQuotedString, and
quotedString). Reported on the pyparsing wiki by webpentest,
good catch! (Also tuned up some other expressions susceptible to the
same backtracking problem, such as cStyleComment, cppStyleComment,
etc.)

2.1.1

---------------------------
- Added support for assigning to ParseResults using slices.

- Fixed bug in ParseResults.asDict(), in which dict values were always
converted to dicts, even if they were just unkeyed lists of tokens.
Reported on SO by Gerald Thibault, thanks Gerald!

- Fixed bug in SkipTo when using failOn, reported by robyschek, thanks!

- Fixed bug in Each introduced in 2.1.0, reported by robyschek, who
also submitted the patch and unit test, well done!

- Removed use of functools.partial in replaceWith, as this creates
an ambiguous signature for the generated parse action, which fails in
PyPy. Reported by Evan Hubinger, thanks Evan!

- Added default behavior to QuotedString to convert embedded '\t', '\n',
etc. characters to their whitespace counterparts. Found during Q&A
exchange on SO with Maxim.

2.1.0

------------------------------
- Modified the internal _trim_arity method to distinguish between
TypeError's raised while trying to determine parse action arity and
those raised within the parse action itself. This will clear up those
confusing "<lambda>() takes exactly 1 argument (0 given)" error
messages when there is an actual TypeError in the body of the parse
action. Thanks to all who have raised this issue in the past, and
most recently to Michael Cohen, who sent in a proposed patch, and got
me to finally tackle this problem.

- Added compatibility for pickle protocols 2-4 when pickling ParseResults.
In Python 2.x, protocol 0 was the default, and protocol 2 did not work.
In Python 3.x, protocol 3 is the default, so explicitly naming
protocol 0 or 1 was required to pickle ParseResults. With this release,
all protocols 0-4 are supported. Thanks for reporting this on StackOverflow,
Arne Wolframm, and for providing a nice simple test case!
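
A round-trip check over the protocols named above could look like this (a sketch assuming pyparsing 2.4.x; the grammar and results names are made up for the example):

```python
import pickle
from pyparsing import Word, alphas, nums

person = Word(alphas)("name") + Word(nums)("age")
original = person.parseString("Alice 30")

round_trips = []
for protocol in range(5):  # pickle protocols 0 through 4
    restored = pickle.loads(pickle.dumps(original, protocol))
    # Both the token list and the results names should survive
    round_trips.append((restored.asList(), restored["name"], restored["age"]))
```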

- Added optional 'stopOn' argument to ZeroOrMore and OneOrMore, to
simplify breaking on stop tokens that would match the repetition
expression.

It is a common problem to fail to look ahead when matching repetitive
tokens if the sentinel at the end also matches the repetition
expression, as when parsing "BEGIN aaa bbb ccc END" with:

 "BEGIN" + OneOrMore(Word(alphas)) + "END"

Since "END" matches the repetition expression "Word(alphas)", it will
never get parsed as the terminating sentinel. Up until now, this has
to be resolved by the user inserting their own negative lookahead:

 "BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"

Using stopOn, they can more easily write:

 "BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"

The stopOn argument can be a literal string or a pyparsing expression.
Inspired by a question by Lamakaha on StackOverflow (and many previous
questions with the same negative-lookahead resolution).

- Added expression names for many internal and builtin expressions, to
reduce name and error message overhead during parsing.

- Converted helper lambdas to functions to refactor and add docstring
support.

- Fixed ParseResults.asDict() to correctly convert nested ParseResults
values to dicts.

- Cleaned up some examples, fixed typo in fourFn.py identified by
aristotle2600 on reddit.

- Removed keepOriginalText helper method, which was deprecated ages ago.
Superseded by originalTextFor.

- Same for the Upcase class, which was long ago deprecated and replaced
with the upcaseTokens method.

2.0.7

------------------------------
- Simplified string representation of Forward class, to avoid memory
and performance errors while building ParseException messages. Thanks,
Will McGugan, Andrea Censi, and Martijn Vermaat for the bug reports and
test code.

- Cleaned up additional issues from enhancing the error messages for
Or and MatchFirst, handling Unicode values in expressions. Fixes Unicode
encoding issues in Python 2, thanks to Evan Hubinger for the bug report.

- Fixed implementation of dir() for ParseResults - was leaving out all the
defined methods and just adding the custom results names.

- Fixed bug in ignore() that was introduced in pyparsing 1.5.3, that would
not accept a string literal as the ignore expression.

- Added new example parseTabularData.py to illustrate parsing of data
formatted in columns, with detection of empty cells.

- Updated a number of examples to more current Python and pyparsing
forms.

2.0.6

------------------------------
- Fixed a bug in Each when multiple Optional elements are present.
Thanks for reporting this, whereswalden on SO.

- Fixed another bug in Each, when Optional elements have results names
or parse actions, reported by Max Rothman - thank you, Max!

- Added optional parseAll argument to runTests, whether tests should
require the entire input string to be parsed or not (similar to
parseAll argument to parseString). Plus a little neaten-up of the
output on Python 2 (no stray ()'s).

- Modified exception messages from MatchFirst and Or expressions. These
were formerly misleading as they would only give the first or longest
exception mismatch error message. Now the error message includes all
the alternatives that were possible matches. Originally proposed by
a pyparsing user, but I've lost the email thread - finally figured out
a fairly clean way to do this.

- Fixed a bug in Or, when a parse action on an alternative raises an
exception, other potentially matching alternatives were not always tried.
Reported by TheVeryOmni on the pyparsing wiki, thanks!

- Fixed a bug to dump() introduced in 2.0.4, where list values were shown
in duplicate.

2.0.5

-----------------------------
- (&$(&$(!!!!  Some "print" statements snuck into pyparsing v2.0.4,
breaking Python 3 compatibility! Fixed. Reported by jenshn, thanks!

2.0.4

-----------------------------
- Added ParserElement.addCondition, to simplify adding parse actions
that act primarily as filters. If the given condition evaluates False,
pyparsing will raise a ParseException. The condition should be a method
with the same method signature as a parse action, but should return a
boolean. Suggested by Victor Porton, nice idea Victor, thanks!

- Slight mod to srange to accept unicode literals for the input string,
such as "[а-яА-Я]" instead of "[\u0430-\u044f\u0410-\u042f]". Thanks
to Alexandr Suchkov for the patch!

- Enhanced implementation of replaceWith.

- Fixed enhanced ParseResults.dump() method when the results consists
only of an unnamed array of sub-structure results. Reported by Robin
Siebler, thanks for your patience and persistence, Robin!

- Fixed bug in fourFn.py example code, where pi and e were defined using
CaselessLiteral instead of CaselessKeyword. This was not a problem until
adding a new function 'exp', and the leading 'e' of 'exp' was accidentally
parsed as the mathematical constant 'e'. Nice catch, Tom Grydeland - thanks!
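
The underlying difference can be reproduced in a few lines (a sketch assuming pyparsing 2.4.x; this is not the fourFn.py code itself):

```python
from pyparsing import CaselessKeyword, CaselessLiteral, ParseException

literal_e = CaselessLiteral("e")
keyword_e = CaselessKeyword("e")

# The literal happily matches the leading 'e' of "exp"
literal_hit = literal_e.parseString("exp")[0]

# The keyword refuses, because 'e' is followed by more identifier characters
try:
    keyword_e.parseString("exp")
    keyword_matched_exp = True
except ParseException:
    keyword_matched_exp = False

# A standalone 'E' is still accepted by the keyword
keyword_alone = keyword_e.parseString("E")[0]
```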

- Adopt new-fangled Python features, like decorators and ternary expressions,
per suggestions from Williamzjc - thanks William! (Oh yeah, I'm not
supporting Python 2.3 with this code any more...) Plus, some additional
code fixes/cleanup - thanks again!

- Added ParserElement.runTests, a little test bench for quickly running
an expression against a list of sample input strings. Basically, I got
tired of writing the same test code over and over, and finally added it
as a test point method on ParserElement.

- Added withClass helper method, a simplified version of withAttribute for
the common but annoying case when defining a filter on a div's class -
made difficult because 'class' is a Python reserved word.

2.0.3

-----------------------------
- Fixed escaping behavior in QuotedString. Formerly, only quotation
marks (or characters designated as quotation marks in the QuotedString
constructor) would be escaped. Now all escaped characters will be
escaped, and the escaping backslashes will be removed.

- Fixed regression in ParseResults.pop() - pop() was pretty much
broken after I added *improvements* in 2.0.2. Reported by Iain
Shelvington, thanks Iain!

- Fixed bug in And class when initializing using a generator.

- Enhanced ParseResults.dump() method to list out nested ParseResults that
are unnamed arrays of sub-structures.

- Fixed UnboundLocalError under Python 3.4 in oneOf method, reported
on Sourceforge by aldanor, thanks!

- Fixed bug in ParseResults __init__ method, when returning non-ParseResults
types from parse actions that implement __eq__. Raised during discussion
on the pyparsing wiki with cyrfer.

2.0.2

---------------------------
- Extended "expr(name)" shortcut (same as "expr.setResultsName(name)")
to accept "expr()" as a shortcut for "expr.copy()".

- Added "locatedExpr(expr)" helper, to decorate any returned tokens
with their location within the input string. Adds the results names
locn_start and locn_end to the output parse results.

- Added "pprint()" method to ParseResults, to simplify troubleshooting
and prettified output. Now instead of importing the pprint module
and then writing "pprint.pprint(result)", you can just write
"result.pprint()".  This method also accepts additional positional and
keyword arguments (such as indent, width, etc.), which get passed
through directly to the pprint method
(see https://docs.python.org/2/library/pprint.html#pprint.pprint).

- Removed deprecation warnings when using '<<' for Forward expression
assignment. '<<=' is still preferred, but '<<' will be retained
for cases where '<<=' operator is not suitable (such as in defining
lambda expressions).
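
Both assignment forms side by side, as a sketch assuming pyparsing 2.4.x (grammar_body is a hypothetical helper that builds a small nested-list grammar):

```python
from pyparsing import Forward, Group, Suppress, Word, ZeroOrMore, nums

def grammar_body(ref):
    # Hypothetical helper: an integer, or a parenthesized list of items
    return Word(nums) | Group(Suppress("(") + ZeroOrMore(ref) + Suppress(")"))

# Preferred form: augmented assignment
preferred = Forward()
preferred <<= grammar_body(preferred)

# Retained form: plain '<<', usable where a statement is not allowed
legacy = Forward()
legacy << grammar_body(legacy)

tree_preferred = preferred.parseString("(1 (2 3) 4)").asList()
tree_legacy = legacy.parseString("(1 (2 3) 4)").asList()
```

One reason '<<=' is preferred: '<<' binds more tightly than '|', so writing `expr << a | b` without parentheses attaches only `a` to the Forward.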

- Expanded argument compatibility for classes and functions that
take list arguments, to now accept generators as well.

- Extended list-like behavior of ParseResults, adding support for
append and extend. NOTE: if you have existing applications using
these names as results names, you will have to access them using
dict-style syntax: res["append"] and res["extend"]

- ParseResults emulates the change in list vs. iterator semantics for
methods like keys(), values(), and items(). Under Python 2.x, these
methods will return lists, under Python 3.x, these methods will
return iterators.

- ParseResults now has a method haskeys() which returns True or False
depending on whether any results names have been defined. This simplifies
testing for the existence of results names under Python 3.x, which
returns keys() as an iterator, not a list.

- ParseResults now supports both list and dict semantics for pop().
If passed no argument or an integer argument, it will use list semantics
and pop tokens from the list of parsed tokens. If passed a non-integer
argument (most likely a string), it will use dict semantics and
pop the corresponding value from any defined results names. A
second default return value argument is supported, just as in
dict.pop().
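
The list and dict flavors of pop() can be sketched as follows (assuming pyparsing 2.4.x; the grammar and the "label" results name are illustrative):

```python
from pyparsing import OneOrMore, Word, alphas, nums

patt = Word(alphas)("label") + OneOrMore(Word(nums))
result = patt.parseString("AAB 123 321")

last = result.pop()                   # list semantics: removes the last token
label = result.pop("label")           # dict semantics: removes the results name
missing = result.pop("nope", "n/a")   # default value, as with dict.pop()

remaining = result.asList()
```

Popping by name removes only the name; the token itself stays in the token list.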

- Fixed bug in markInputline, thanks for reporting this, Matt Grant!

- Cleaned up my unit test environment, now runs with Python 2.6 and
3.3.

@pyup-bot

Closing this in favor of #818

@pyup-bot pyup-bot closed this Oct 24, 2021
@cclauss cclauss deleted the pyup-update-pyparsing-2.0.1-to-2.4.7 branch October 24, 2021 20:13