Merged
67 commits
6a9705d
feat(ast): Start AST generation (handles smt)
MarcellPerger1 Nov 5, 2024
49abb1f
feat(ast): Finish expr handling in AstGen
MarcellPerger1 Nov 6, 2024
d521160
fix(ast): Fix the AST walk not doing anything
MarcellPerger1 Nov 6, 2024
e53ca64
perf(cst): Use cached CST result if possible
MarcellPerger1 Nov 6, 2024
c5467ed
fix(ast): Fix crash when finding indirect autowalker
MarcellPerger1 Nov 6, 2024
8a8d387
refactor(test): Rename CST functions in CommonTestCase
MarcellPerger1 Nov 6, 2024
e7ac0ac
refactor(test): Move assertValidParseAST to CommonTestCase
MarcellPerger1 Nov 6, 2024
3356879
test(ast): Use fuzzing for AST
MarcellPerger1 Nov 6, 2024
cbcf6ee
ci: Add option of running 4 different machines (per python version) i…
MarcellPerger1 Nov 6, 2024
6b58983
fix(cst): Fix checked_cast on named children for Getattr/Getitem
MarcellPerger1 Nov 6, 2024
f302955
feat(ast): Add AST node handling to tree_print.py
MarcellPerger1 Nov 7, 2024
e9dd389
test(ast): Test main examples AST
MarcellPerger1 Nov 7, 2024
004c064
refactor(ast): Add type annotations
MarcellPerger1 Nov 7, 2024
db9730d
refactor(ast): Extract out errors.py
MarcellPerger1 Nov 7, 2024
75cec69
feat(ast): Add function to evaluate literal strings
MarcellPerger1 Nov 7, 2024
9c6d004
test(ast): Add consistency test for eval_string
MarcellPerger1 Nov 7, 2024
3b7941e
fix(ast): Make _register_autowalk_expr work without parens
MarcellPerger1 Nov 7, 2024
76fff51
fix(snapshots): Fix the distinct lack of snapshots for AST
MarcellPerger1 Nov 7, 2024
c52f91f
refactor(ast): Add __all__ to ast_node.py
MarcellPerger1 Nov 7, 2024
54dc854
fix: Make it work for Python 3.10
MarcellPerger1 Nov 8, 2024
4c2b25e
fix(ci): Fix fuzzer action on GH mobile
MarcellPerger1 Nov 8, 2024
9b7ab37
fix: Fix workflow names
MarcellPerger1 Nov 8, 2024
a941fd6
ci: Add Python version to fuzzing workflow name
MarcellPerger1 Nov 8, 2024
5378d8a
fix(ci): Fix fuzzing workflow name
MarcellPerger1 Nov 8, 2024
aec85e2
fix(ci): Fix fuzzing workflow name, yet again
MarcellPerger1 Nov 8, 2024
05c71d9
ci: Add Python 3.12 to test workflow
MarcellPerger1 Nov 8, 2024
0a0ad70
refactor: Clarify some comments in error.py
MarcellPerger1 Nov 9, 2024
35b5ca6
test: Add some tests for bugs in eval_string
MarcellPerger1 Nov 9, 2024
8774a99
refactor: Extract eval_string to _EvalString class
MarcellPerger1 Nov 9, 2024
b999852
feat(ast): Add values to literal AST nodes
MarcellPerger1 Nov 9, 2024
8e456cc
feat(ast): Store `AstNumber.value` as int to avoid useless `.0`s
MarcellPerger1 Nov 9, 2024
733010c
fix(snapshots!): Update snapshots
MarcellPerger1 Nov 9, 2024
987d387
refactor: Remove outdated TODOs
MarcellPerger1 Nov 9, 2024
42099c8
fix: Disable grammar check in Pycharm (too many false positives)
MarcellPerger1 Nov 9, 2024
2e78a1a
refactor: Add AST to main.py
MarcellPerger1 Nov 9, 2024
8c1bb54
fix: Fix/refactor stuff in eval_literal.py from review
MarcellPerger1 Nov 10, 2024
58ca1a2
refactor: Better error msg in _detect_autowalk_type_from_annot
MarcellPerger1 Nov 10, 2024
77abb19
perf(ast): Cache types in _lookup_autowalk_fn
MarcellPerger1 Nov 10, 2024
1d0a4c6
refactor: Rearrange AstGen a bit
MarcellPerger1 Nov 10, 2024
252498e
refactor(ast): Extract _walk_conditional to method
MarcellPerger1 Nov 10, 2024
a306255
refactor(ast): Extract _walk_var_decl and tidy up code a bit
MarcellPerger1 Nov 10, 2024
906fd34
refactor: Improve type annotations, remove an `assert`
MarcellPerger1 Nov 10, 2024
23ac7d6
refactor: Rename `AstGen.walk` -> `.parse`
MarcellPerger1 Nov 10, 2024
1e9e345
fix(snapshottest): Fix regression when creating new snapshot file
MarcellPerger1 Nov 10, 2024
0bf120d
test: Make existing test into a snapshot test
MarcellPerger1 Nov 10, 2024
f5f6d83
test: Add coverage scripts
MarcellPerger1 Nov 10, 2024
972e038
test: Add over-the-top CSS/JS injection to make HTML output look decent
MarcellPerger1 Nov 10, 2024
d6b81e3
fix: Fix injection for functions/classes page
MarcellPerger1 Nov 10, 2024
1090a94
refactor: Refactor the htmlcov-enhancer
MarcellPerger1 Nov 10, 2024
f08499a
test: Add test for NopNode-handling
MarcellPerger1 Nov 10, 2024
4c751f7
test: Add some stuff as ignored
MarcellPerger1 Nov 10, 2024
595e7f9
test: Add multiprocessing/SimpleProcessPool coverage
MarcellPerger1 Nov 11, 2024
7ffad98
refactor: Remove unneeded pycharm ignore
MarcellPerger1 Nov 11, 2024
1b7119b
refactor: Rename overly long class in simple_process_pool.py to _Time…
MarcellPerger1 Nov 11, 2024
26b5e5e
fix: Increase _finalize timeout as it takes ages to shut down.
MarcellPerger1 Nov 11, 2024
50fe66e
fix: Fix 'process didn't close' message formatting
MarcellPerger1 Nov 11, 2024
24064cb
perf: Don't test as many combinations in test_py_consistency
MarcellPerger1 Nov 11, 2024
3527128
test: Add tests for error handling in eval_literal.py
MarcellPerger1 Nov 11, 2024
19b66a2
refactor: Add coverage ignore to should-be-unreachable assert
MarcellPerger1 Nov 11, 2024
1568f83
refactor(test): Return error from assertFailsGracefully, add AST variant
MarcellPerger1 Nov 12, 2024
3d2073a
test: Add more tests for AstGen
MarcellPerger1 Nov 12, 2024
a85e7e2
test: Update snapshots
MarcellPerger1 Nov 12, 2024
41dd6e4
fix: Fix bug with AstAugAssign in astgen.py
MarcellPerger1 Nov 12, 2024
ed14ed7
fix!(snapshottest): Use UTF8 encoding
MarcellPerger1 Nov 12, 2024
dcfc26d
test: Add tests for string and autocat
MarcellPerger1 Nov 12, 2024
553d27c
test: Add test for unaries
MarcellPerger1 Nov 12, 2024
2e8c982
test(coverage): Add coverage ignores
MarcellPerger1 Nov 12, 2024
5 changes: 5 additions & 0 deletions .coveragerc
@@ -0,0 +1,5 @@
+[report]
+exclude_also =
+    if __name__ == ['"]__main__['"]:
+    if TYPE_CHECKING:
+    assert 0\b
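As a rough illustration of what these three `exclude_also` entries select, the sketch below applies them as regular expressions searched anywhere in a source line, which is how coverage.py documents exclusion patterns as working. The helper name `is_excluded` and the sample lines are made up for this example and are not part of the PR.

```python
import re

# The three patterns from .coveragerc, copied verbatim.
EXCLUDE_ALSO = [
    r"""if __name__ == ['"]__main__['"]:""",
    r"if TYPE_CHECKING:",
    r"assert 0\b",
]


def is_excluded(line: str) -> bool:
    """Hypothetical helper: True if any pattern is found somewhere in the line."""
    return any(re.search(pattern, line) for pattern in EXCLUDE_ALSO)


if __name__ == "__main__":
    samples = [
        'if __name__ == "__main__":',
        "if TYPE_CHECKING:",
        'assert 0, "unreachable"',
        "assert 0x10 == n",  # \b fails here: "0" is followed by a word character
    ]
    for sample in samples:
        print(f"{sample!r}: {is_excluded(sample)}")
```

The `\b` in the last pattern keeps it from matching literals such as `0x10`, so only assertions that start with a bare `0` are treated as should-be-unreachable.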
20 changes: 19 additions & 1 deletion .github/workflows/fuzzer.yml
@@ -5,16 +5,34 @@ on:
         description: Number of iterations
         type: number
         default: 250000
+      do_shard:
+        description: Run it on 4 separate machines (`n` on each)?
+        type: boolean
+        default: false


 jobs:
-  test:
+  fuzzer:
     runs-on: ubuntu-latest
     strategy:
       matrix:
         py_version:
           - "3.10"
           - "3.11"
+          - "3.12"
+        do_shard:
+          - ${{ inputs.do_shard }}
+        shard_index: [0, 1, 2, 3]
+        exclude:
+          - do_shard: false
+        include:
+          - do_shard: false
+            py_version: "3.10"
+          - do_shard: false
+            py_version: "3.11"
+          - do_shard: false
+            py_version: "3.12"
+    name: ${{ inputs.do_shard && format('Run fuzzer (Python {0}, shard {1})', matrix.py_version, matrix.shard_index) || format('Run fuzzer (Python {0})', matrix.py_version) }}
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
1 change: 1 addition & 0 deletions .github/workflows/run_tests.yml
@@ -15,6 +15,7 @@ jobs:
         py_version:
           - "3.10"
           - "3.11"
+          - "3.12"
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
1 change: 1 addition & 0 deletions .idea/inspectionProfiles/project_inspections.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 5 additions & 3 deletions fuzz.py
@@ -1,5 +1,6 @@
 import time

+from parser.astgen.astgen import AstGen
 from parser.lexer.tokenizer import Tokenizer
 from parser.cst.treegen import TreeGen
 from parser.common.error import BaseParseError
@@ -29,7 +30,7 @@ def fuzz(buf):
     try:
         string = buf.decode("ascii")
         try:
-            TreeGen(Tokenizer(string)).parse()
+            AstGen(TreeGen(Tokenizer(string))).parse()
         except BaseParseError:
             pass
     except UnicodeDecodeError:
@@ -39,11 +40,12 @@ def fuzz(buf):
 if __name__ == '__main__':
     import argparse
     ap = argparse.ArgumentParser("fuzz.py", description="Runs a fuzzer for n iterations")
+    # Use type=float as gh mobile cannot specify integers as workflow args
     ap.add_argument('-n', '--iterations', default=-1,
-                    type=int, help="Number of iterations to run pythonfuzz for")
+                    type=float, help="Number of iterations to run pythonfuzz for")
     ap.add_argument('-i', '--infinite',
                     action='store_const', const=-1, dest='iterations')
     args = ap.parse_args()

-    fuzzer = Fuzzer(fuzz, dirs=['./pythonfuzz_corpus'], timeout=30, runs=args.iterations)
+    fuzzer = Fuzzer(fuzz, dirs=['./pythonfuzz_corpus'], timeout=30, runs=int(args.iterations))
     fuzzer.start()
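The `type=float` plus `int(...)` change can be tried in isolation. The sketch below reproduces only the two arguments from the diff; the wrapper function `parse_iterations` is a hypothetical name added so the parsing can be called with an explicit argument list, and the `Fuzzer` call is left out.

```python
import argparse


def parse_iterations(argv: list[str]) -> int:
    """Hypothetical wrapper around the argument handling shown in fuzz.py."""
    ap = argparse.ArgumentParser("fuzz.py", description="Runs a fuzzer for n iterations")
    # Accept a float so that an input such as "250000.0" still parses.
    ap.add_argument('-n', '--iterations', default=-1,
                    type=float, help="Number of iterations to run pythonfuzz for")
    ap.add_argument('-i', '--infinite',
                    action='store_const', const=-1, dest='iterations')
    args = ap.parse_args(argv)
    # Convert once, at the point where an integer run count is required.
    return int(args.iterations)


if __name__ == '__main__':
    print(parse_iterations(['-n', '250000.0']))
    print(parse_iterations(['--infinite']))
```

Note that `int()` truncates toward zero, so a fractional input such as `10.9` would silently become `10` under this approach.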
31 changes: 25 additions & 6 deletions main.py
@@ -1,6 +1,7 @@
 import cProfile
 import time

+from parser.astgen.astgen import AstGen
 from util import readfile
 from parser.cst.treegen import TreeGen
 from parser.lexer import Tokenizer, print_tokens
@@ -14,7 +15,20 @@ def make_tree(src: str):
 PROFILER = True


-def run(src: str, idx: int = -1):
+def run(src: str, idx: int = -1, do_ast=True):
+    node = ast_node = None
+    ta1 = tp1 = ta0 = 0.0 # will be overwritten
+
+    def doit_trees():
+        nonlocal node, tp1, ta0, ast_node, ta1
+        treegen = TreeGen(tn)
+        node = treegen.parse()
+        tp1 = time.perf_counter()
+        if do_ast:
+            ta0 = time.perf_counter()
+            ast_node = AstGen(treegen).parse()
+            ta1 = time.perf_counter()
+
     tn0 = time.perf_counter()
     tn = Tokenizer(src).tokenize()
     tn1 = time.perf_counter()
@@ -25,24 +39,29 @@ def run(src: str, idx: int = -1):
     tp0 = time.perf_counter()
     if PROFILER:
         with cProfile.Profile() as p:
-            node = TreeGen(tn).parse()
-            tp1 = time.perf_counter()
+            doit_trees()
         p.dump_stats(f'perf_dump_{idx}.prof')
     else:
-        node = TreeGen(tn).parse()
-        tp1 = time.perf_counter()
+        doit_trees()
     print('CST:')
     tpr_cst0 = time.perf_counter()
     tprint(node)
     tpr_cst1 = time.perf_counter()
+    tpr_ast0 = tpr_ast1 = time.perf_counter()
+    if do_ast:
+        tprint(ast_node)
+        tpr_ast1 = time.perf_counter()
     print(rf'Tokens done in {(tn1 - tn0) * 1000:.2f}ms')
     print(rf'Tokens_print done in {(tpr_tk1 - tpr_tk0) * 1000:.2f}ms')
     print(rf'CST done in {(tp1 - tp0) * 1000:.2f}ms')
     print(rf'CST_print done in {(tpr_cst1 - tpr_cst0) * 1000:.2f}ms')
+    if do_ast:
+        print(rf'AST done in {(ta1 - ta0) * 1000:.2f}ms')
+        print(rf'AST_print done in {(tpr_ast1 - tpr_ast0) * 1000:.2f}ms')


 def main():
-    run(readfile('main_example_0.st'), 0)
+    run(readfile('main_example_0.st'), 0, do_ast=False)
     run(readfile('main_example_1.st'), 1)
Empty file added parser/astgen/__init__.py
Empty file.
181 changes: 181 additions & 0 deletions parser/astgen/ast_node.py
@@ -0,0 +1,181 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from enum import Enum
+
+from ..common import HasRegion, StrRegion
+
+__all__ = [
+    "AstNode", "AstProgramNode", "VarDeclType", "AstDeclNode", "AstRepeat",
+    "AstIf", "AstWhile", "AstAssign", "AstAugAssign", "AstDefine", "AstNumber",
+    "AstString", "AstAnyName", "AstIdent", "AstAttrName", "AstAttribute",
+    "AstItem", "AstCall", "AstOp", "AstBinOp", "AstUnaryOp",
+]
+
+
+@dataclass
+class AstNode(HasRegion):
+    region: StrRegion
+    name = None # type: str
+    del name # So we get better error msg if we forget to add it to a class
+
+
+@dataclass
+class AstProgramNode(AstNode):
+    name = 'program'
+    statements: list[AstNode]
+
+
+# region ---- <Statements> ----
+class VarDeclType(Enum):
+    LET = 'let'
+    GLOBAL = 'global'
+
+
+@dataclass
+class AstDeclNode(AstNode):
+    name = 'var_decl'
+    type: VarDeclType
+    decls: list[tuple[AstIdent, AstNode | None]]
+
+
+@dataclass
+class AstRepeat(AstNode):
+    name = 'repeat'
+    count: AstNode
+    body: list[AstNode]
+
+
+@dataclass
+class AstIf(AstNode):
+    name = 'if'
+    cond: AstNode
+    if_body: list[AstNode]
+    # elseif = else{if
+    else_body: list[AstNode] | None = None
+    # ^ Separate cases for no block and empty block (can be else {} to easily
+    # add extra blocks in scratch interface)
+
+
+@dataclass
+class AstWhile(AstNode):
+    name = 'while'
+    cond: AstNode
+    body: list[AstNode]
+
+
+@dataclass
+class AstAssign(AstNode):
+    name = '='
+    target: AstNode
+    source: AstNode
+
+
+@dataclass
+class AstAugAssign(AstNode):
+    op: str # maybe attach a StrRegion to the location of the op??
+    target: AstNode
+    source: AstNode
+
+    @property
+    def name(self):
+        return self.op
+
+
+@dataclass
+class AstDefine(AstNode):
+    name = 'def'
+
+    ident: AstIdent
+    params: list[tuple[AstIdent, AstIdent]] # type, ident
+    body: list[AstNode]
+# endregion ---- </Statements> ----
+
+
+# region ---- <Expressions> ----
+@dataclass
+class AstNumber(AstNode):
+    # No real point in storing the string representation (could always StrRegion.resolve())
+    value: float | int
+
+
+@dataclass
+class AstString(AstNode):
+    value: str # Values with escapes, etc. resolved
+
+
+@dataclass
+class AstAnyName(AstNode):
+    id: str
+
+    def __post_init__(self):
+        if type(self) == AstAnyName:
+            raise TypeError("AstAnyName must not be instantiated directly.")
+
+
+@dataclass
+class AstIdent(AstAnyName):
+    name = 'ident'
+
+
+@dataclass
+class AstAttrName(AstAnyName):
+    name = 'attr'
+
+
+@dataclass
+class AstAttribute(AstNode):
+    name = '.'
+    obj: AstNode
+    attr: AstAttrName
+
+
+@dataclass
+class AstItem(AstNode):
+    name = 'item'
+    obj: AstNode
+    index: AstNode
+
+
+@dataclass
+class AstCall(AstNode):
+    name = 'call'
+    obj: AstNode
+    args: list[AstNode]
+
+
+@dataclass
+class AstOp(AstNode):
+    op: str
+
+
+@dataclass
+class AstBinOp(AstOp):
+    left: AstNode
+    right: AstNode
+
+    valid_ops = [*'+-*/%', '**', '..', '||', '&&', # ops
+                 '==', '!=', '<', '>', '<=', '>=' # comparisons
+                 ] # type: list[str]
+
+    def __post_init__(self):
+        assert self.op in self.valid_ops
+
+    @property
+    def name(self):
+        return self.op
+
+
+@dataclass
+class AstUnaryOp(AstOp):
+    operand: AstNode
+
+    valid_ops = ('+', '-', '!')
+
+    def __post_init__(self):
+        assert self.op in self.valid_ops
+
+    @property
+    def name(self):
+        return self.op
+# endregion ---- </Expressions> ----
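To see how the operator nodes behave once constructed, here is a minimal self-contained sketch of the same pattern. It is an assumption-laden stand-in rather than the PR's code: `Node`, `Number` and `BinOp` are hypothetical simplified names, a plain `(start, end)` tuple replaces `StrRegion`, and a `ValueError` replaces the `assert` so the check still runs under `python -O`.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class Node:
    region: tuple[int, int]  # stand-in for StrRegion (assumption)


@dataclass
class Number(Node):
    value: float | int


@dataclass
class BinOp(Node):
    op: str
    left: Node
    right: Node

    # No annotation, so dataclasses treats this as a class attribute, not a field.
    valid_ops = ('+', '-', '*', '/')

    def __post_init__(self):
        # Runs after the generated __init__ and validates the operator.
        if self.op not in self.valid_ops:
            raise ValueError(f"unknown operator: {self.op!r}")

    @property
    def name(self):
        # The display name of an operator node is the operator itself.
        return self.op


if __name__ == '__main__':
    # Hand-built tree for the text "1+2*3".
    tree = BinOp((0, 5), '+', Number((0, 1), 1),
                 BinOp((2, 5), '*', Number((2, 3), 2), Number((4, 5), 3)))
    print(tree.name, tree.right.name)
    print([f.name for f in fields(BinOp)])
```

Deriving `name` from `op` through a property means the set of valid operators can grow without adding one node class per operator.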