Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Parser/Python.asdl
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ module Python
| UnaryOp(unaryop op, expr operand)
| Lambda(arguments args, expr body)
| IfExp(expr test, expr body, expr orelse)
| Dict(expr* keys, expr* values)
| Dict(expr?* keys, expr* values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of change makes me wonder whether we should also add '+' to indicate 1 or more items. For instance nonlocal requires at least one name, same for import, global and others.

Similarly, there is no explanation of what * and ? actually mean. While they can be inferred, I would expect a caption for them. WDYT @JelleZijlstra?

| Set(expr* elts)
| ListComp(expr elt, comprehension* generators)
| SetComp(expr elt, comprehension* generators)
Expand Down
64 changes: 37 additions & 27 deletions Parser/asdl.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,14 +12,15 @@
# type ::= product | sum
# product ::= fields ["attributes" fields]
# fields ::= "(" { field, "," } field ")"
# field ::= TypeId ["?" | "*"] [Id]
# field ::= TypeId { "?" | "*" } [Id]
# sum ::= constructor { "|" constructor } ["attributes" fields]
# constructor ::= ConstructorId [fields]
#
# [1] "The Zephyr Abstract Syntax Description Language" by Wang, et. al. See
# http://asdl.sourceforge.net/
#-------------------------------------------------------------------------------
from collections import namedtuple
import enum
import re

__all__ = [
Expand Down Expand Up @@ -64,34 +65,43 @@ def __init__(self, name, fields=None):
def __repr__(self):
return 'Constructor({0.name}, {0.fields})'.format(self)

class Quantifier(enum.Enum):
OPTIONAL = enum.auto()
SEQUENCE = enum.auto()

class Field(AST):
def __init__(self, type, name=None, seq=False, opt=False):
def __init__(self, type, name=None, quantifiers=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to support more than two qualifiers? and/or shouldn't we check the consistency of those qualifiers as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently there are no more than two quantifiers on any type, as per my change to Python.asdl.

I'm just implementing the EBNF grammar described at the top of the asdl.py file. And I don't think this files does any amount of extra validation so far anyway.

self.type = type
self.name = name
self.seq = seq
self.opt = opt
self.seq = False
self.opt = False
self.quantifiers = quantifiers or []
if len(self.quantifiers) > 0:
self.seq = self.quantifiers[-1] is Quantifier.SEQUENCE
self.opt = self.quantifiers[-1] is Quantifier.OPTIONAL

def __str__(self):
if self.seq:
extra = "*"
elif self.opt:
extra = "?"
else:
extra = ""
extra = ""
for mod in self.quantifiers:
if mod is Quantifier.SEQUENCE:
extra += "*"
elif mod is Quantifier.OPTIONAL:
extra += "?"

return "{}{} {}".format(self.type, extra, self.name)

def __repr__(self):
if self.seq:
extra = ", seq=True"
elif self.opt:
extra = ", opt=True"
else:
extra = ""
extra = ""
for mod in self.quantifiers:
if mod is Quantifier.SEQUENCE:
extra += ", SEQUENCE"
elif mod is Quantifier.OPTIONAL:
extra += ", OPTIONAL"

if self.name is None:
return 'Field({0.type}{1})'.format(self, extra)
return 'Field({0.type}, quantifiers=[{1}])'.format(self, extra)
else:
return 'Field({0.type}, {0.name}{1})'.format(self, extra)
return 'Field({0.type}, {0.name}, quantifiers=[{1}])'.format(self, extra)

class Sum(AST):
def __init__(self, types, attributes=None):
Expand Down Expand Up @@ -314,10 +324,10 @@ def _parse_fields(self):
self._match(TokenKind.LParen)
while self.cur_token.kind == TokenKind.TypeId:
typename = self._advance()
is_seq, is_opt = self._parse_optional_field_quantifier()
quantifiers = self._parse_optional_field_quantifier()
id = (self._advance() if self.cur_token.kind in self._id_kinds
else None)
fields.append(Field(typename, id, seq=is_seq, opt=is_opt))
fields.append(Field(typename, id, quantifiers=quantifiers))
if self.cur_token.kind == TokenKind.RParen:
break
elif self.cur_token.kind == TokenKind.Comma:
Expand All @@ -339,14 +349,14 @@ def _parse_optional_attributes(self):
return None

def _parse_optional_field_quantifier(self):
is_seq, is_opt = False, False
if self.cur_token.kind == TokenKind.Asterisk:
is_seq = True
self._advance()
elif self.cur_token.kind == TokenKind.Question:
is_opt = True
quantifiers = []
while self.cur_token.kind in (TokenKind.Asterisk, TokenKind.Question):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check that we don't have ** for instance or ???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. But both ?? and ** are legitimate types, ** perhaps more legitimate than ?? is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, actually the current parser doesn't care. I thought that it would have warned the user. And should T** actually mean a list of list of T? is the obtained C code designed to handle this kind of grammar?

I have no strong opinion now on whether we should warn ??.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is essentially no change to the C code at all at this point. I don't know how this was handled before, since expr?* was always there implicitly, but this didn't seem to matter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think this is fine as is

if self.cur_token.kind == TokenKind.Asterisk:
quantifiers.append(Quantifier.SEQUENCE)
elif self.cur_token.kind == TokenKind.Question:
quantifiers.append(Quantifier.OPTIONAL)
self._advance()
return is_seq, is_opt
return quantifiers

def _advance(self):
""" Return the value of the current token and read the next one into
Expand Down
4 changes: 2 additions & 2 deletions Python/Python-ast.c

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading