
Commit 87b519b

New Lexer
The lexer generates a syntax tree during the input parsing. Introduce math operators
1 parent 6cf8a10 commit 87b519b

40 files changed: 6458 additions and 2500 deletions

Makefile

Lines changed: 5 additions & 9 deletions
@@ -1,20 +1,16 @@
-.PHONY: gen-proto gen-fbs gen-tokenz test bench bench-report escape-analysis
+.PHONY: gen-fbs gen-lexer test bench bench-report escape-analysis
 
 default: test
 
-gen-proto:
-	@go install github.com/gogo/protobuf/protoc-gen-gogofaster
-	@protoc --gogofaster_out=. ./ast/ast.proto
-
 gen-fbs:
 	@rm -f bytecode/*.go
 	@flatc -g -o . bytecode/proto.fbs
 
-gen-tokenz: gen-proto
-	@ragel -Z -G2 tokenz/tokenz.go.rl -o tokenz/tokenz.go
-	@goimports -w tokenz/tokenz.go
+gen-lexer:
+	@ragel -Z -G2 lexer/lexer.go.rl -o lexer/lexer.go
+	@goimports -w lexer/lexer.go
 
-test: gen-fbs gen-tokenz
+test: gen-fbs gen-lexer
 	@go test -v -cover ./...
 
 bench: test

README.md

Lines changed: 27 additions & 42 deletions
@@ -40,6 +40,15 @@ true, false
 [["hello"], "world!"]
 ```
 
+### Operators
+
+The virtual machine supports basic math operators `+-*/`. A math expression might be surrounded by parentheses.
+Examples:
+```
+1 + -1
+1 * (2 + 3)
+```
+
 ### Delegators
 
 In general, delegators are functions implemented by the hosted application.
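
Note on the Operators section added in this hunk: the sketch below shows one way the two README examples could be evaluated, with unary minus, parentheses, and the usual precedence of `*` and `/` over `+` and `-`. It is a hand-written precedence-climbing evaluator, not the project's Ragel-generated lexer. The names `parser`, `prec`, and `evalExpr` are hypothetical, and integer arithmetic is an assumption, since the README does not state the numeric type.

```go
package main

import (
	"fmt"
	"strconv"
)

// parser is a hypothetical helper for this sketch, not the project's lexer.
type parser struct {
	src string
	pos int
}

// prec maps a binary operator to its binding power; other bytes map to 0.
var prec = map[byte]int{'+': 1, '-': 1, '*': 2, '/': 2}

// peek skips spaces and returns the next byte, or 0 at the end of input.
func (p *parser) peek() byte {
	for p.pos < len(p.src) && p.src[p.pos] == ' ' {
		p.pos++
	}
	if p.pos < len(p.src) {
		return p.src[p.pos]
	}
	return 0
}

// expr parses binary operators that bind at least as tightly as min (min >= 1).
func (p *parser) expr(min int) (int, error) {
	v, err := p.factor()
	for err == nil && prec[p.peek()] >= min {
		op := p.src[p.pos]
		p.pos++
		var r int
		// Parsing the right side one level tighter keeps operators left-associative.
		if r, err = p.expr(prec[op] + 1); err != nil {
			break
		}
		switch {
		case op == '+':
			v += r
		case op == '-':
			v -= r
		case op == '*':
			v *= r
		case r == 0:
			err = fmt.Errorf("division by zero")
		default:
			v /= r
		}
	}
	return v, err
}

// factor := '-' factor | '(' expr ')' | integer
func (p *parser) factor() (int, error) {
	switch c := p.peek(); {
	case c == '-':
		p.pos++
		v, err := p.factor()
		return -v, err
	case c == '(':
		p.pos++
		v, err := p.expr(1)
		if err == nil && p.peek() != ')' {
			err = fmt.Errorf("expected ')' at offset %d", p.pos)
		}
		p.pos++
		return v, err
	case c >= '0' && c <= '9':
		start := p.pos
		for p.pos < len(p.src) && p.src[p.pos] >= '0' && p.src[p.pos] <= '9' {
			p.pos++
		}
		return strconv.Atoi(p.src[start:p.pos])
	}
	return 0, fmt.Errorf("unexpected input at offset %d", p.pos)
}

// evalExpr evaluates src with integer arithmetic (an assumption of this sketch).
func evalExpr(src string) (int, error) {
	p := &parser{src: src}
	v, err := p.expr(1)
	if err == nil && p.peek() != 0 {
		err = fmt.Errorf("unexpected trailing input at offset %d", p.pos)
	}
	return v, err
}

func main() {
	fmt.Println(evalExpr("1 + -1"))
	fmt.Println(evalExpr("1 * (2 + 3)"))
}
```

Under these assumptions the two README examples evaluate to 0 and 5. The real virtual machine executes compiled bytecode rather than evaluating text directly, so this only illustrates the expected semantics of the operators.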
@@ -92,70 +101,46 @@ contains([1, 2, 3], 4) // false
 
 ## Architecture
 
-The architecture consists of 4 components:
-1. Tokenizer
-2. Syntax Tree Builder
-3. Compiler
-4. Virtual Machine
+The architecture consists of 3 components:
+1. Lexer
+2. Compiler
+3. Virtual Machine
 
-**The Tokenizer** parses the input text:
+**The lexer** parses the input text:
 ```
 join(",", ["a", "b"])
 ```
-and returns the following tokens:
+and generates a syntax tree:
 ```
-IDENT join
-PUNCT (
 STR ","
-PUNCT ,
-PUNCT [
 STR "a"
-PUNCT ,
 STR "b"
-PUNCT ]
-PUNCT )
-```
-
-> The tokenizer is implemented using [Ragel State Machine Compiler](https://www.colm.net/open-source/ragel/).
-
-**The Syntax Tree Builder** generates a syntax tree from tokens:
-```
-EXIT
-|-- CALL(join)
-|-- STR(",")
-|-- ARR
-|-- STR("a")
-|-- STR("b")
-```
-
-> A schema of the syntax tree is described by [Protocol Buffers 3](https://developers.google.com/protocol-buffers/) to make it easy traversable by any programming language.
-
-**The Compiler** makes a bytecode from the syntax tree to make it executable by **a stack-based virtual machine**:
-```
-PUSH_STR ","
-PUSH_STR "a"
-PUSH_STR "b"
-PUSH_VECTOR 2
-SYS_CALL "join" 2
-RET
+ARR 2
+INVOKE join 2
 ```
+> The lexer is implemented using [Ragel State Machine Compiler](https://www.colm.net/open-source/ragel/).
 
+**The compiler** makes a bytecode from the syntax tree to make it executable by **a stack-based virtual machine**.
 > The bytecode is described by [Flatbuffers](https://google.github.io/flatbuffers/flatbuffers_guide_use_go.html) to achieve high-throughput with low memory consumption.
 
 ## Usage
 
 Compilation:
 ```go
-import "github.com/regeda/expr/asm"
+import (
+	"github.com/regeda/expr/compiler"
+	"github.com/regeda/expr/lexer"
+)
 
 code := `join(",", ["a", "b"])`
 
-a := asm.New()
-bytecode, err := a.Assemble([]byte(code))
+tokens, err := lexer.Parse([]byte(code))
 if err != nil {
 	panic(err)
 }
 
+bytecode := compiler.Compile(tokens)
+
 // save `bytecode` to be executed by the virtual machine
 ```
 
@@ -189,5 +174,5 @@ equals("foo,bar,baz", join(",", ["foo", "bar", "baz"]))
 ```
 cpu: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
 BenchmarkExec
-BenchmarkExec-8 1508277 798.5 ns/op 0 B/op 0 allocs/op
+BenchmarkExec-8 1635091 746.7 ns/op 0 B/op 0 allocs/op
 ```
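
Note on the new Architecture text: the syntax tree that the README now shows for `join(",", ["a", "b"])` is a flat list in which operands come before the node that consumes them (`ARR 2`, `INVOKE join 2`). The sketch below illustrates how a list in that order can be evaluated with a single stack, which is presumably why it suits a stack-based virtual machine. The `node` type, `eval`, and the local `join` stand-in are hypothetical and not taken from the repository; the real pipeline compiles the tree to bytecode first.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical postfix node; the real lexer's types may differ.
type node struct {
	kind string // "STR", "ARR" or "INVOKE"
	text string // literal value (STR) or function name (INVOKE)
	n    int    // element count (ARR) or argument count (INVOKE)
}

// join is a stand-in for the delegator of the same name.
func join(args []any) (any, error) {
	if len(args) != 2 {
		return nil, fmt.Errorf("join: want 2 arguments, got %d", len(args))
	}
	sep, ok := args[0].(string)
	elems, ok2 := args[1].([]any)
	if !ok || !ok2 {
		return nil, fmt.Errorf("join: want (string, array)")
	}
	parts := make([]string, len(elems))
	for i, e := range elems {
		if parts[i], ok = e.(string); !ok {
			return nil, fmt.Errorf("join: element %d is not a string", i)
		}
	}
	return strings.Join(parts, sep), nil
}

// eval walks the nodes left to right, keeping intermediate values on a stack.
func eval(nodes []node) (any, error) {
	var stack []any
	for _, nd := range nodes {
		if nd.kind == "STR" {
			stack = append(stack, nd.text)
			continue
		}
		if nd.n > len(stack) {
			return nil, fmt.Errorf("%s needs %d values, stack has %d", nd.kind, nd.n, len(stack))
		}
		// Copy the operands so the result does not alias the stack.
		operands := append([]any(nil), stack[len(stack)-nd.n:]...)
		stack = stack[:len(stack)-nd.n]
		switch {
		case nd.kind == "ARR":
			stack = append(stack, operands) // the array becomes one stack value
		case nd.kind == "INVOKE" && nd.text == "join":
			res, err := join(operands)
			if err != nil {
				return nil, err
			}
			stack = append(stack, res)
		default:
			return nil, fmt.Errorf("unsupported node %s %s", nd.kind, nd.text)
		}
	}
	if len(stack) != 1 {
		return nil, fmt.Errorf("want 1 result, stack has %d", len(stack))
	}
	return stack[0], nil
}

func main() {
	// The node list shown in the README for join(",", ["a", "b"]).
	nodes := []node{
		{kind: "STR", text: ","},
		{kind: "STR", text: "a"},
		{kind: "STR", text: "b"},
		{kind: "ARR", n: 2},
		{kind: "INVOKE", text: "join", n: 2},
	}
	fmt.Println(eval(nodes))
}
```

With this node list the sketch yields the string `a,b`. The operand counts on `ARR` and `INVOKE` are what let a single left-to-right pass work without the bracket and comma tokens that the old tokenizer emitted.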

asm/asm.go

Lines changed: 0 additions & 25 deletions
This file was deleted.
