@@ -6,6 +6,28 @@ put here.
66
77## Versioning
88
9+ ### Since 2021
10+
11+ The Typee language grammar is specified in the formal, unambiguous
12+ and easy-to-read PEG format (short for *Parsing Expression Grammar*).
13+
14+ Parsing Expression Grammars were first specified by Bryan Ford (MIT)
15+ in 2004 in his famous paper "Parsing Expression Grammars: A Recognition-Based
16+ Syntactic Foundation". This article was published at the POPL’04
17+ conference, held on January 14–16, 2004 in Venice, Italy. It is a
18+ copyrighted document of ACM with number 1-58113-729-X/04/0001 and it can be
19+ easily found on the Net.
20+
21+ From now on, all versions of the Typee grammar specifications are numbered
22+ and named `typee_specs_PEG_v<XX>.grm`, where `<XX>` ranges from
23+ 01 to 99.
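
As an illustration of this naming scheme, here is a minimal Python sketch that checks a file name and extracts its version number. The function name `peg_spec_version` is hypothetical, and the sketch assumes that `<XX>` is always written with exactly two digits (`01` to `99`), which the text above suggests but does not state explicitly.

```python
import re
from typing import Optional

# Assumption: <XX> is always written with exactly two digits, from 01 to 99.
_PEG_SPEC_NAME = re.compile(r"typee_specs_PEG_v(0[1-9]|[1-9][0-9])\.grm")


def peg_spec_version(filename: str) -> Optional[int]:
    """Return the version number encoded in a PEG spec file name, or None."""
    match = _PEG_SPEC_NAME.fullmatch(filename)
    return int(match.group(1)) if match else None
```

For instance, `peg_spec_version("typee_specs_PEG_v07.grm")` yields `7`, while the unnumbered `typee_specs_LL1.grm` yields `None`.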
24+
25+
26+ ### Before 2021
27+
28+ Note that the Typee language was specified with a formal, unambiguous LL(1)
29+ grammar.
30+
931The very first version of these specifications is not numbered. See file
1032`typee_specs_LL1.grm`.
1133
@@ -14,18 +36,39 @@ Next versions are all numbered: `...-v2.grm`, `...-v3.grm`, ...,
1436document.
1537
1638You'll notice that version ` v4 ` is missing. Unfortunately, it has been lost
17- before being stored here...
39+ before being stored here.
1840
19- Finally, since version ` v8 ` , a second version of the specifications is
41+ Since version `v8`, a second version of the specifications is
2042provided: from `...-v8-EBNF.grm` to `...-v10-EBNF.grm`.
2143
22- Notice that Typee language is specified with a formal, unambiguous, LL(1)
23- grammar.
2444
2545
2646## Specification format
2747
28- ### Formal classical description
48+ ### Since 2021
49+
50+ The syntax used to describe the grammar rules is the original PEG one. So,
51+ we use the notations `<-` and `/` for instance, while newer papers use `::=`
52+ or `|` instead, as is usual in CFGs (*Context-Free Grammars*, which is what
53+ LL(1) grammars are).
54+
55+ Note that we use the notation '##' to start comments, while
56+ Bryan Ford used '#' in his original paper. This is a convenience we use
57+ to get syntax coloring in Notepad++. Note also that comments are single-line
58+ comments only in the very first description of PEG grammars. The PEG
59+ specification of **Typee** conforms to this.
60+
61+ We strongly encourage the reader to get access to the original article by
62+ Bryan Ford. Section 2 of this paper fully explains the syntax of PEGs:
63+ "*Parsing Expression Grammars: A Recognition-Based Syntactic Foundation*",
64+ ACM 1-58113-729-X/04/0001, POPL’04, January 14–16, 2004, Venice, Italy. It
65+ can easily be found on the Internet with these references. We shall not copy
66+ anything from it here without permission.
67+
68+
69+ ### Before 2021
70+
71+ #### Formal classical description - LL(1)
2972
3073The syntax used to describe the grammar rules is very classical.
3174Rules are named between angle brackets: `<rule name>`. Those names may
@@ -67,7 +110,7 @@ group together successive rule names between parenthesis. Next specifications
67110are unambiguous:
68111
69112 <single string> ::= "'" <single string'> "'"
70- | '"' <ingle string"> '"'
113+ | '"' <single string"> '"'
71114
72115 <single string'> ::= <any escaped char> <single string'>
73116 | <any string quote char> <single string'>
@@ -82,7 +125,7 @@ this rule may be not derived. In the above rules, it states that empty strings
82125formed as ` '' ` or ` "" ` are legal in Typee.
83126
84127
85- ### Extended Backus-Naur Form (EBNF) description
128+ #### Extended Backus-Naur Form (EBNF) description
86129
87130This is a simplified and easier-to-read format for grammar rule
88131specifications. We have been providing them since version `v8` of Typee
@@ -106,12 +149,12 @@ underscore, i.e. `'_'`, preceding any series of alphanumerical character and
106149underscores, i.e. any character from the group `['0'...'9', 'A'...'Z', 'a'...'z', '_']`.
107150
108151Parentheses are also used jointly with a trailing star character `'*'`. There,
109- they men that the derivations rules the group together may be derived from 0
110- to many times (with no limitations):
152+ they mean that the derivation rules that they group together may be derived
153+ zero or more times (with no upper limit):
111154
112155 <identifier> ::= ( <alpha char> | '_' ) ( <alpha num char> | '_' )*
113156
114- You can get here that this kind of factorization helps easying the reading of
157+ You can see here that this kind of factorization makes grammar
115158specifications easier to read and also helps reduce their
116159size.
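
To make the `( ... )*` notation concrete, the `<identifier>` rule above can be transcribed as a regular expression. This is only a sketch: the function name `is_identifier` is hypothetical, and it assumes that `<alpha char>` covers `'A'...'Z'` and `'a'...'z'` and that `<alpha num char>` additionally covers `'0'...'9'`.

```python
import re

# Assumption: <alpha char> is 'A'...'Z' or 'a'...'z';
# <alpha num char> additionally allows '0'...'9'.
# ( <alpha char> | '_' )       -> [A-Za-z_]
# ( <alpha num char> | '_' )*  -> [0-9A-Za-z_]*
_IDENTIFIER = re.compile(r"[A-Za-z_][0-9A-Za-z_]*")


def is_identifier(text: str) -> bool:
    """Tell whether the whole text can be derived from the <identifier> rule."""
    return _IDENTIFIER.fullmatch(text) is not None
```

The starred group may match nothing at all, so a single letter or a lone underscore is accepted, whereas a name starting with a digit is rejected.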
117160