Well, something is wrong. The mergeability check is not completing.

@Bormotoon for some reason the merge check never finishes. Could you close and re-open this, please?
KvanTTT
left a comment
I recommend simplifying the lexer as suggested in the comments below.
kumir/KumirLexer.g4
Outdated
| // --- Keywords (Core Language) --- | ||
| // Keywords are case-insensitive (both lowercase and uppercase Cyrillic are matched). | ||
| MODULE : 'модуль'; | ||
| ENDMODULE : ('конец' WS 'модуля' | 'конецмодуля' | 'конец_модуля'); |
I recommend simplifying:
| ENDMODULE : ('конец' WS 'модуля' | 'конецмодуля' | 'конец_модуля'); | |
| fragment WS_FRAGMENT : [ \t\r\n]+; | |
| ENDMODULE : 'конец' (WS_FRAGMENT | '_')? 'модуля'; |
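A quick way to sanity-check that this rewrite accepts the same spellings as the original rule is to model both with regular expressions. This is only a sketch: Python's `re` stands in for the ANTLR lexer here (an assumption, since ANTLR does not match tokens this way), and the names `ORIGINAL_ENDMODULE`, `SIMPLIFIED_ENDMODULE`, and `same_verdict` are made up for the illustration.

```python
import re

# Regex stand-in for the whitespace class used by WS / WS_FRAGMENT.
WS = r"[ \t\r\n]+"

# Model of the original rule: three explicit alternatives.
ORIGINAL_ENDMODULE = re.compile(rf"конец{WS}модуля|конецмодуля|конец_модуля")

# Model of the suggested rule: one optional separator (whitespace or '_').
SIMPLIFIED_ENDMODULE = re.compile(rf"конец(?:{WS}|_)?модуля")


def same_verdict(text: str) -> bool:
    """Return True when both models agree on whether `text` is a whole token."""
    original = ORIGINAL_ENDMODULE.fullmatch(text) is not None
    simplified = SIMPLIFIED_ENDMODULE.fullmatch(text) is not None
    return original == simplified


# Spellings that should be accepted, plus a few that should be rejected.
samples = [
    "конец модуля",
    "конец\t\n модуля",
    "конецмодуля",
    "конец_модуля",
    "конец__модуля",   # two underscores: rejected by both models
    "конец _модуля",   # whitespace and underscore together: rejected by both
]

accepted = {s: SIMPLIFIED_ENDMODULE.fullmatch(s) is not None for s in samples}
all_agree = all(same_verdict(s) for s in samples)
```

For these samples the two models agree, which matches the intent of the suggestion: the three alternatives collapse into one optional separator without changing what is accepted.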
kumir/KumirLexer.g4
Outdated
| POST_CONDITION : 'надо'; | ||
| ASSERTION : 'утв'; | ||
| LOOP : 'нц'; | ||
| ENDLOOP_COND : ('кц' WS 'при' | 'кц_при'); |
Suggested change:
| ENDLOOP_COND : ('кц' WS 'при' | 'кц_при'); | |
| ENDLOOP_COND : 'кц' (WS_FRAGMENT | '_')? 'при'; |
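One thing worth checking for this rule: the original had no glued alternative, so making the separator optional accepts one extra spelling. The sketch below models both rules with Python regular expressions (a stand-in for the lexer, not how ANTLR works internally); `ORIGINAL_ENDLOOP`, `SUGGESTED_ENDLOOP`, and `accepted_by` are hypothetical names.

```python
import re

# Regex stand-in for the whitespace class used by WS / WS_FRAGMENT.
WS = r"[ \t\r\n]+"

# Model of the original rule: a separator is mandatory.
ORIGINAL_ENDLOOP = re.compile(rf"кц{WS}при|кц_при")

# Model of the suggested rule: the separator became optional.
SUGGESTED_ENDLOOP = re.compile(rf"кц(?:{WS}|_)?при")


def accepted_by(pattern: re.Pattern, text: str) -> bool:
    """Return True when `text` as a whole matches `pattern`."""
    return pattern.fullmatch(text) is not None


candidates = ["кц при", "кц_при", "кцпри"]

# Spellings the suggested rule accepts but the original rule did not.
only_in_suggested = [
    c for c in candidates
    if accepted_by(SUGGESTED_ENDLOOP, c) and not accepted_by(ORIGINAL_ENDLOOP, c)
]

print(only_in_suggested)  # ['кцпри']
```

Whether the glued form matters in practice depends on how the rest of the grammar tokenizes identifiers, which this sketch does not model. If the glued spelling should stay rejected, the `?` can simply be dropped from the suggested rule.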
kumir/KumirLexer.g4
Outdated
| OR : 'или'; | ||
| OUT_PARAM : 'рез'; | ||
| IN_PARAM : 'арг'; | ||
| INOUT_PARAM : ('аргрез' | 'арг' WS 'рез' | 'арг_рез'); |
Suggested change:
| INOUT_PARAM : ('аргрез' | 'арг' WS 'рез' | 'арг_рез'); | |
| INOUT_PARAM : 'арг' (WS_FRAGMENT | '_')? 'рез'; |
kumir/KumirLexer.g4
Outdated
| INTEGER_ARRAY_TYPE : ('цел' WS? 'таб' | 'цел_таб'); | ||
| REAL_ARRAY_TYPE : ('вещ' WS? 'таб' | 'вещ_таб'); | ||
| CHAR_ARRAY_TYPE : ('сим' WS? 'таб' | 'сим_таб'); | ||
| STRING_ARRAY_TYPE : ('лит' WS? 'таб' | 'лит_таб'); | ||
| BOOLEAN_ARRAY_TYPE : ('лог' WS? 'таб' | 'лог_таб'); |
Suggested change:
| INTEGER_ARRAY_TYPE : ('цел' WS? 'таб' | 'цел_таб'); | |
| REAL_ARRAY_TYPE : ('вещ' WS? 'таб' | 'вещ_таб'); | |
| CHAR_ARRAY_TYPE : ('сим' WS? 'таб' | 'сим_таб'); | |
| STRING_ARRAY_TYPE : ('лит' WS? 'таб' | 'лит_таб'); | |
| BOOLEAN_ARRAY_TYPE : ('лог' WS? 'таб' | 'лог_таб'); | |
| INTEGER_ARRAY_TYPE : 'цел' (WS_FRAGMENT | '_')? 'таб'; | |
| REAL_ARRAY_TYPE : 'вещ' (WS_FRAGMENT | '_')? 'таб'; | |
| CHAR_ARRAY_TYPE : 'сим' (WS_FRAGMENT | '_')? 'таб'; | |
| STRING_ARRAY_TYPE : 'лит' (WS_FRAGMENT | '_')? 'таб'; | |
| BOOLEAN_ARRAY_TYPE : 'лог' (WS_FRAGMENT | '_')? 'таб'; |
kumir/KumirLexer.g4
Outdated
| // Color constants | ||
| PROZRACHNIY : 'прозрачный'; | ||
| BELIY : 'белый'; | ||
| CHERNIY : 'чёрный' | 'черный'; |
Suggested change:
| CHERNIY : 'чёрный' | 'черный'; | |
| fragment E_OR_YO : 'ё' | 'е'; | |
| CHERNIY : 'ч' E_OR_YO 'рный'; |
kumir/KumirLexer.g4
Outdated
| ZELENIY : 'зелёный' | 'зеленый'; | ||
| ZHELTIY : 'жёлтый' | 'желтый'; |
Suggested change:
| ZELENIY : 'зелёный' | 'зеленый'; | |
| ZHELTIY : 'жёлтый' | 'желтый'; | |
| ZELENIY : 'зел' E_OR_YO 'ный'; | |
| ZHELTIY : 'ж' E_OR_YO 'лтый'; |
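The effect of the shared `E_OR_YO` fragment can be illustrated with a small regex model. This is a sketch under the assumption that a character class behaves like the fragment for these words; `COLOR_PATTERNS` and `color_token` are hypothetical helpers, not part of the grammar.

```python
import re
from typing import Optional

# Regex counterpart of the E_OR_YO fragment: either spelling of the vowel.
E_OR_YO = "[ёе]"

# One pattern per color token, each built from the shared fragment.
COLOR_PATTERNS = {
    "CHERNIY": re.compile(f"ч{E_OR_YO}рный"),
    "ZELENIY": re.compile(f"зел{E_OR_YO}ный"),
    "ZHELTIY": re.compile(f"ж{E_OR_YO}лтый"),
}


def color_token(word: str) -> Optional[str]:
    """Return the token name whose pattern matches `word`, or None."""
    for name, pattern in COLOR_PATTERNS.items():
        if pattern.fullmatch(word):
            return name
    return None


# Both spellings map to the same token.
both_spellings = (color_token("чёрный"), color_token("черный"))
```

Each color keeps a single rule, and the two spellings differ only at the one position covered by the fragment.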
kumir/KumirLexer.g4
Outdated
| DOC_COMMENT : '#' ~[\r\n]* -> channel(HIDDEN); | ||
|
|
||
| // --- Whitespace --- | ||
| WS : [ \t\r\n]+ -> skip; |
Use the previously introduced WS_FRAGMENT:
| WS : [ \t\r\n]+ -> skip; | |
| WS : WS_FRAGMENT+ -> skip; |
kumir/KumirLexer.g4
Outdated
| fragment DIGIT : [0-9]; | ||
| fragment HEX_DIGIT : [0-9a-fA-F]; | ||
| fragment LETTER : [a-zA-Zа-яА-ЯёЁ]; |
The uppercase ranges are not needed because the grammar sets caseInsensitive = true:
| fragment DIGIT : [0-9]; | |
| fragment HEX_DIGIT : [0-9a-fA-F]; | |
| fragment LETTER : [a-zA-Zа-яА-ЯёЁ]; | |
| fragment DIGIT : [0-9]; | |
| fragment HEX_DIGIT : [0-9a-f]; | |
| fragment LETTER : [a-zа-яё]; |
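To illustrate why the lowercase-only class is enough, and why `ё` still has to be listed separately, here is a rough model using Python's `re.IGNORECASE`. It is assumed, not verified against ANTLR, that this flag behaves like `caseInsensitive = true` for these letters; `LETTER`, `LETTER_WITHOUT_YO`, and `is_letter` are names invented for the sketch.

```python
import re

# Model of the simplified LETTER fragment under case-insensitive matching.
LETTER = re.compile(r"[a-zа-яё]", re.IGNORECASE)

# Same class without 'ё', to show that the range а-я does not cover it:
# 'ё' is U+0451, outside the а..я range U+0430..U+044F.
LETTER_WITHOUT_YO = re.compile(r"[a-zа-я]", re.IGNORECASE)


def is_letter(ch: str, pattern: re.Pattern = LETTER) -> bool:
    """Return True when the single character `ch` matches `pattern`."""
    return pattern.fullmatch(ch) is not None


# Uppercase letters are accepted although only lowercase ranges are listed.
uppercase_ok = all(is_letter(ch) for ch in "ZЯЁ")

# Without the explicit 'ё', neither case of that letter is accepted.
yo_missing = not is_letter("ё", LETTER_WITHOUT_YO) and not is_letter("Ё", LETTER_WITHOUT_YO)
```

So dropping `A-Z`, `А-Я`, and `Ё` loses nothing under case-insensitive matching, while the lowercase `ё` must stay in the class.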
kumir/KumirLexer.g4
Outdated
| fragment LETTER : [a-zA-Zа-яА-ЯёЁ]; | ||
| fragment DecInteger : DIGIT+; | ||
| fragment HexInteger : '$' HEX_DIGIT+; | ||
| fragment ExpFragment: [eE] [+-]? DIGIT+; |
Suggested change:
| fragment ExpFragment: [eE] [+-]? DIGIT+; | |
| fragment ExpFragment: [e] [+-]? DIGIT+; |

Is this ready to merge?

There are some minor issues to fix, but generally yes.

Sorry, not yet. There will be some more changes and optimizations as suggested here.
Pull request overview
This pull request adds comprehensive support for the Kumir programming language, a Russian algorithmic language primarily used for teaching programming in schools. The PR includes well-structured ANTLR v4 grammars (lexer and parser), extensive documentation, and a comprehensive test suite.
Key changes:
- Complete ANTLR v4 grammar implementation for Kumir with lexer and parser definitions
- 60 example programs covering various language features (loops, arrays, strings, recursion, file I/O, etc.)
- Documentation including README with language features and integration guide
Reviewed changes
Copilot reviewed 58 out of 58 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| kumir/KumirLexer.g4 | Defines lexical tokens for Kumir with case-insensitive Cyrillic keywords |
| kumir/KumirParser.g4 | Defines parsing rules for language constructs including complex algorithm names |
| kumir/README.md | Comprehensive documentation covering language features, grammar files, and usage |
| kumir/desc.xml | Configuration file for grammars-v4 test infrastructure integration |
| kumir/examples/*.kum | 60 example programs demonstrating various Kumir language features |