--------------------------
This version of pyparsing includes work on two long-standing
FAQs: support for forcing parsing of the complete input string
(without having to explicitly append StringEnd() to the grammar),
and a way to report more precisely where syntax errors occur in
an input string when the grammar has various optional and
alternative paths. This release also includes a helper method
to simplify definition of indentation-based grammars. With
these changes (and the past few minor updates), I thought it was
finally time to bump the minor rev number on pyparsing - so
1.5.0 is now available! Read on...
- AT LAST!!! You can now call parseString and have it raise
an exception if the expression does not parse the entire
input string. This has been an FAQ for a LONG time.
The parseString method now includes an optional parseAll
argument (default=False). If parseAll is set to True, then
the given parse expression must parse the entire input
string. (This is equivalent to adding StringEnd() to the
end of the expression.) The default value is False to
retain backward compatibility.
Inspired by MANY requests over the years, most recently by
ecir-hana on the pyparsing wiki!
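
A minimal sketch of the difference, assuming a made-up grammar
(the name "integers" and the sample input are illustrative only):

```python
from pyparsing import OneOrMore, ParseException, Word, nums

# Illustrative grammar: one or more whitespace-separated integers.
integers = OneOrMore(Word(nums))

# Default (parseAll=False): parsing stops quietly at the first
# text that does not match, and the leading matches are returned.
partial = integers.parseString("1 2 3 abc")
print(list(partial))

# parseAll=True: unparsed trailing input raises ParseException.
try:
    integers.parseString("1 2 3 abc", parseAll=True)
    parsed_everything = True
except ParseException as err:
    parsed_everything = False
    print("parse failed:", err)
```
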
- Added new operator '-' for composing grammar sequences. '-'
behaves just like '+' in creating And expressions, but '-'
is used to mark grammar structures that should stop parsing
immediately and report a syntax error, rather than just
backtracking to the last successful parse and trying another
alternative. For instance, when running the following code:

    port_definition = Keyword("port") + '=' + Word(nums)
    entity_definition = (Keyword("entity") + "{" +
        Optional(port_definition) + "}")

    entity_definition.parseString("entity { port 100 }")

pyparsing does not report the missing '=' in the port definition.
Since port_definition is optional, pyparsing backtracks and then
tries to match the closing '}' of the entity_definition. Not
finding it, pyparsing reports that there was no '}' after the '{'
character. Instead, we would like pyparsing to parse the 'port'
keyword, and if not followed by an equals sign and an integer,
to signal this as a syntax error.
This can now be done simply by changing the port_definition to:

    port_definition = Keyword("port") - '=' + Word(nums)

Now after successfully parsing 'port', pyparsing must also find
an equals sign and an integer, or it will raise a fatal syntax
exception.
By judicious insertion of '-' operators, a pyparsing developer
can have their grammar report much more informative syntax error
messages.
Patches and suggestions proposed by several contributors on
the pyparsing mailing list and wiki - special thanks to
Eike Welk and Thomas/Poldy on the pyparsing wiki!
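
The two variants of the port_definition example can be compared
side by side with a sketch like the following. The helper names
make_entity_grammar and parse_error are hypothetical, introduced
only for this illustration:

```python
from pyparsing import (Keyword, Optional, ParseBaseException,
                       ParseException, ParseSyntaxException, Word, nums)

def make_entity_grammar(use_error_stop):
    # Hypothetical helper: build the entity grammar with either
    # '+' or '-' following the "port" keyword.
    if use_error_stop:
        port_definition = Keyword("port") - "=" + Word(nums)
    else:
        port_definition = Keyword("port") + "=" + Word(nums)
    return Keyword("entity") + "{" + Optional(port_definition) + "}"

def parse_error(grammar, text):
    # Hypothetical helper: return the exception raised while
    # parsing text, or None if parsing succeeds.
    try:
        grammar.parseString(text)
    except ParseBaseException as err:
        return err
    return None

text = "entity { port 100 }"
loose_error = parse_error(make_entity_grammar(False), text)
strict_error = parse_error(make_entity_grammar(True), text)

# With '+', the error is an ordinary ParseException raised after
# backtracking; with '-', it is a ParseSyntaxException raised at
# the point where the '=' was expected.
print(type(loose_error).__name__, "-", loose_error)
print(type(strict_error).__name__, "-", strict_error)
```

Note that the '-' only takes effect once 'port' has been matched,
so an entity with no port definition at all still parses.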
- Added indentedBlock helper method, to encapsulate the parse
actions and indentation stack management needed to keep track of
indentation levels. Use indentedBlock to define indentation-based
grouping grammars, like Python's.
indentedBlock takes up to 3 parameters:
- blockStatementExpr - expression defining syntax of statement
that is repeated within the indented block
- indentStack - list created by caller to manage indentation
stack (multiple indentedBlock expressions
within a single grammar should share a common indentStack)
- indent - boolean indicating whether block must be indented
beyond the current level; set to False for block of
left-most statements (default=True)
A valid block must contain at least one indented statement.
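
A small sketch of how indentedBlock might be used, assuming a toy
outline language in which "name:" opens an indented block of
further names or blocks. The grammar and the sample text are
invented for illustration:

```python
from pyparsing import (Forward, Group, OneOrMore, Suppress, Word,
                       alphas, indentedBlock)

# One stack, created by the caller and shared by every
# indentedBlock expression in the grammar.
indent_stack = [1]

stmt = Forward()
name = Word(alphas)

# A block is "name:" followed by one or more indented statements.
block = Group(name + Suppress(":") + indentedBlock(stmt, indent_stack))
stmt << (block | name)

outline = OneOrMore(stmt)

text = """\
outer:
  a
  b
  inner:
    c
d
"""

tree = outline.parseString(text)
print(tree)
```
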
- Fixed bug in nestedExpr in which ignored expressions needed
to be set off with whitespace. Reported by Stefaan Himpe,
nice catch!
- Expanded multiplication of an expression by a tuple, to
accept tuple values of None:
. expr*(n,None) or expr*(n,) is equivalent
to expr*n + ZeroOrMore(expr)
(read as "at least n instances of expr")
. expr*(None,n) is equivalent to expr*(0,n)
(read as "0 to n instances of expr")
. expr*(None,None) is equivalent to ZeroOrMore(expr)
. expr*(1,None) is equivalent to OneOrMore(expr)
Note that expr*(None,n) does not raise an exception if
more than n exprs exist in the input stream; that is,
expr*(None,n) does not enforce a maximum number of expr
occurrences. If this behavior is desired, then write
expr*(None,n) + ~expr
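
The tuple forms above can be tried out with a sketch like this
one, where the "number" expression and the matches helper are
assumptions made for the example:

```python
from pyparsing import ParseException, Word, nums

number = Word(nums)

at_least_two = number * (2, None)        # number*2 + ZeroOrMore(number)
up_to_three = number * (None, 3)         # same as number*(0, 3)
no_more_than_three = number * (None, 3) + ~number

def matches(expr, text):
    # Hypothetical helper: True if expr parses the start of text.
    try:
        expr.parseString(text)
        return True
    except ParseException:
        return False

many = list(at_least_two.parseString("1 2 3 4"))

# No maximum is enforced: the first three numbers are returned and
# the remaining input is simply left unparsed.
first_three = list(up_to_three.parseString("1 2 3 4 5"))

print(many)
print(first_three)
print(matches(no_more_than_three, "1 2 3 4 5"))
```
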
- Added None as a possible operator for operatorPrecedence.
None signifies "no operator", as in multiplying m times x
in "y=mx+b".
- Fixed bug in Each, reported by Michael Ramirez, in which the
order of terms in the Each affected the parsing of the results.
Problem was due to premature grouping of the expressions in
the overall Each during grammar construction, before the
complete Each was defined. Thanks, Michael!
- Also fixed bug in Each in which Optionals with default values
were not getting the defaults added to the results of the
overall Each expression.
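
As a sketch of the fixed behavior, assuming a made-up grammar of
an optional color and a required shape in any order (all names
and values here are illustrative):

```python
from pyparsing import Optional, oneOf

color = Optional(oneOf("red green blue"),
                 default="black").setResultsName("color")
shape = oneOf("circle square").setResultsName("shape")

# '&' builds an Each: both terms may appear in either order.
spec = color & shape

with_color = spec.parseString("square red")
without_color = spec.parseString("circle")

print(with_color["shape"], with_color["color"])
# The Optional's default should now show up in the Each results.
print(without_color["shape"], without_color["color"])
```
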
- Fixed a bug in Optional in which results names were not
assigned if a default value was supplied.
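
A short sketch of what this fix enables, assuming an invented
"item" grammar with an optional quantity that defaults to "1":

```python
from pyparsing import Optional, Word, alphas, nums

item = (Word(alphas).setResultsName("name")
        + Optional(Word(nums), default="1").setResultsName("quantity"))

explicit = item.parseString("widget 12")
defaulted = item.parseString("widget")

print(explicit["name"], explicit["quantity"])
# The results name is assigned even when the default is used.
print(defaulted["name"], defaulted["quantity"])
```
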
- Cleaned up Py3K compatibility statements, including exception
construction statements, and better equivalence between _ustr
and basestring, and __nonzero__ and __bool__.