Kumir lang support by Bormotoon · Pull Request #4477 · antlr/grammars-v4

Bormotoon · 2025-04-26T13:19:41Z

Kumir is a Russian algorithmic language primarily used for teaching programming in schools.

teverett · 2025-05-18T14:56:54Z

Well, something is wrong. The check for the ability to merge is not completing.

teverett · 2025-06-03T20:10:17Z

@Bormotoon for some reason the merge check is not finishing. Could you close and re-open this please?

KvanTTT

I recommend simplifying the lexer as suggested in the comments below.

KvanTTT · 2025-06-25T11:54:38Z

kumir/KumirLexer.g4

+// --- Keywords (Core Language) ---
+// Keywords are case-insensitive (both lowercase and uppercase Cyrillic are matched).
+MODULE              : 'модуль';
+ENDMODULE           : ('конец' WS 'модуля' | 'конецмодуля' | 'конец_модуля');


I recommend simplifying:

Suggested change

-ENDMODULE : ('конец' WS 'модуля' | 'конецмодуля' | 'конец_модуля');
+fragment WS_FRAGMENT: [ \t\r\n]+;
+ENDMODULE : 'конец' (WS_FRAGMENT | '_')? 'модуля';
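As a rough cross-check of this suggestion, the two rule shapes can be compared with regular expressions. This is only a Python `re` analogue of the lexer rules, not ANTLR itself; the names `ORIGINAL_ENDMODULE`, `SIMPLIFIED_ENDMODULE`, `same_verdict` and `SAMPLES` are made up for the illustration.

```python
import re

# Regex analogue of the original rule: three explicit alternatives.
# WS in the grammar is [ \t\r\n]+, mirrored here as a character class.
ORIGINAL_ENDMODULE = re.compile(r"конец[ \t\r\n]+модуля|конецмодуля|конец_модуля")

# Regex analogue of the suggested rule: 'конец' (WS_FRAGMENT | '_')? 'модуля'.
SIMPLIFIED_ENDMODULE = re.compile(r"конец(?:[ \t\r\n]+|_)?модуля")


def same_verdict(text: str) -> bool:
    """Return True when both patterns agree on whether `text` fully matches."""
    original = ORIGINAL_ENDMODULE.fullmatch(text) is not None
    simplified = SIMPLIFIED_ENDMODULE.fullmatch(text) is not None
    return original == simplified


# Hypothetical sample spellings: four that should match, two that should not.
SAMPLES = [
    "конец модуля",
    "конец \t модуля",
    "конецмодуля",
    "конец_модуля",
    "конец__модуля",
    "конец-модуля",
]

for sample in SAMPLES:
    print(repr(sample), same_verdict(sample))
```

On these samples the two patterns agree, which is consistent with the simplified rule being a pure refactoring for ENDMODULE. The check is a spot test over a handful of strings, not a proof of equivalence.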

KvanTTT · 2025-06-25T11:55:04Z

kumir/KumirLexer.g4

+POST_CONDITION      : 'надо';
+ASSERTION           : 'утв';
+LOOP                : 'нц';
+ENDLOOP_COND        : ('кц' WS 'при' | 'кц_при');


Suggested change

-ENDLOOP_COND : ('кц' WS 'при' | 'кц_при');
+ENDLOOP_COND : 'кц' (WS_FRAGMENT | '_')? 'при';
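One thing worth checking for this rule: unlike ENDMODULE, the original ENDLOOP_COND has no joined alternative, so making the separator optional appears to admit one extra spelling. The sketch below uses Python `re` as a stand-in for the lexer rules; `ORIGINAL_ENDLOOP_COND`, `SIMPLIFIED_ENDLOOP_COND` and `accepted_by` are hypothetical names, and the regexes are an analogue rather than the grammar itself.

```python
import re

# Analogue of: ('кц' WS 'при' | 'кц_при')
ORIGINAL_ENDLOOP_COND = re.compile(r"кц[ \t\r\n]+при|кц_при")

# Analogue of: 'кц' (WS_FRAGMENT | '_')? 'при'
SIMPLIFIED_ENDLOOP_COND = re.compile(r"кц(?:[ \t\r\n]+|_)?при")


def accepted_by(text: str) -> tuple:
    """Return (original_accepts, simplified_accepts) for a full match of `text`."""
    return (
        ORIGINAL_ENDLOOP_COND.fullmatch(text) is not None,
        SIMPLIFIED_ENDLOOP_COND.fullmatch(text) is not None,
    )


for spelling in ["кц при", "кц_при", "кцпри"]:
    print(repr(spelling), accepted_by(spelling))
```

If the joined form 'кцпри' should not be a keyword, the `?` could be dropped for this rule. Whether that spelling would actually reach ENDLOOP_COND in the real lexer also depends on rule order, since ANTLR picks the longest match and breaks ties in favour of the rule defined first.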

KvanTTT · 2025-06-25T11:55:48Z

kumir/KumirLexer.g4

+OR                  : 'или';
+OUT_PARAM           : 'рез';
+IN_PARAM            : 'арг';
+INOUT_PARAM         : ('аргрез' | 'арг' WS 'рез' | 'арг_рез');


Suggested change

-INOUT_PARAM : ('аргрез' | 'арг' WS 'рез' | 'арг_рез');
+INOUT_PARAM : 'арг' (WS_FRAGMENT | '_')? 'рез';

KvanTTT · 2025-06-25T11:56:33Z

kumir/KumirLexer.g4

+INTEGER_ARRAY_TYPE  : ('цел' WS? 'таб' | 'цел_таб');
+REAL_ARRAY_TYPE     : ('вещ' WS? 'таб' | 'вещ_таб');
+CHAR_ARRAY_TYPE     : ('сим' WS? 'таб' | 'сим_таб');
+STRING_ARRAY_TYPE   : ('лит' WS? 'таб' | 'лит_таб');
+BOOLEAN_ARRAY_TYPE  : ('лог' WS? 'таб' | 'лог_таб');


Suggested change

-INTEGER_ARRAY_TYPE : ('цел' WS? 'таб' | 'цел_таб');
-REAL_ARRAY_TYPE : ('вещ' WS? 'таб' | 'вещ_таб');
-CHAR_ARRAY_TYPE : ('сим' WS? 'таб' | 'сим_таб');
-STRING_ARRAY_TYPE : ('лит' WS? 'таб' | 'лит_таб');
-BOOLEAN_ARRAY_TYPE : ('лог' WS? 'таб' | 'лог_таб');
+INTEGER_ARRAY_TYPE : 'цел' (WS_FRAGMENT | '_')? 'таб';
+REAL_ARRAY_TYPE : 'вещ' (WS_FRAGMENT | '_')? 'таб';
+CHAR_ARRAY_TYPE : 'сим' (WS_FRAGMENT | '_')? 'таб';
+STRING_ARRAY_TYPE : 'лит' (WS_FRAGMENT | '_')? 'таб';
+BOOLEAN_ARRAY_TYPE : 'лог' (WS_FRAGMENT | '_')? 'таб';

KvanTTT · 2025-06-25T11:57:21Z

kumir/KumirLexer.g4

+// Color constants
+PROZRACHNIY         : 'прозрачный';
+BELIY               : 'белый';
+CHERNIY             : 'чёрный' | 'черный';


Suggested change

-CHERNIY : 'чёрный' | 'черный';
+fragment E_OR_YO : 'ё' | 'е';
+CHERNIY : 'ч' E_OR_YO 'рный';

KvanTTT · 2025-06-25T11:59:45Z

kumir/KumirLexer.g4

+ZELENIY             : 'зелёный' | 'зеленый';
+ZHELTIY             : 'жёлтый' | 'желтый';


Suggested change

-ZELENIY : 'зелёный' | 'зеленый';
-ZHELTIY : 'жёлтый' | 'желтый';
+ZELENIY : 'зел' E_OR_YO 'ный';
+ZHELTIY : 'ж' E_OR_YO 'лтый';
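To illustrate what the E_OR_YO fragment buys, here is a small Python `re` analogue of the three colour rules. It is a sketch only: `E_OR_YO` is modelled as a character class, and `COLOR_PATTERNS` and `classify` are hypothetical helpers, not part of the grammar.

```python
import re
from typing import Optional

# Analogue of: fragment E_OR_YO : 'ё' | 'е';
E_OR_YO = "[ёе]"

# Analogues of the three colour rules that use the fragment.
COLOR_PATTERNS = {
    "CHERNIY": re.compile(f"ч{E_OR_YO}рный"),
    "ZELENIY": re.compile(f"зел{E_OR_YO}ный"),
    "ZHELTIY": re.compile(f"ж{E_OR_YO}лтый"),
}


def classify(word: str) -> Optional[str]:
    """Return the name of the first colour pattern that fully matches, else None."""
    for name, pattern in COLOR_PATTERNS.items():
        if pattern.fullmatch(word):
            return name
    return None


for word in ["чёрный", "черный", "зелёный", "зеленый", "жёлтый", "желтый", "белый"]:
    print(word, classify(word))
```

Both spellings of each colour map to the same token name, so the ё/е choice is stated once instead of being repeated in every rule. The sketch assumes 'ё' is written as the single precomposed character, as in the grammar source.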

KvanTTT · 2025-06-25T12:00:33Z

kumir/KumirLexer.g4

+DOC_COMMENT         : '#' ~[\r\n]* -> channel(HIDDEN);
+
+// --- Whitespace ---
+WS                  : [ \t\r\n]+ -> skip;


Use the previously introduced WS_FRAGMENT:

Suggested change

-WS : [ \t\r\n]+ -> skip;
+WS : WS_FRAGMENT+ -> skip;

KvanTTT · 2025-06-25T12:01:36Z

kumir/KumirLexer.g4

+fragment DIGIT      : [0-9];
+fragment HEX_DIGIT  : [0-9a-fA-F];
+fragment LETTER     : [a-zA-Zа-яА-ЯёЁ];


The uppercase ranges are not needed, since the grammar sets caseInsensitive = true.

Suggested change

-fragment DIGIT : [0-9];
-fragment HEX_DIGIT : [0-9a-fA-F];
-fragment LETTER : [a-zA-Zа-яА-ЯёЁ];
+fragment DIGIT : [0-9];
+fragment HEX_DIGIT : [0-9a-f];
+fragment LETTER : [a-zа-яё];
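The effect of dropping the uppercase ranges can be approximated in Python with `re.IGNORECASE`, which plays a role loosely comparable to ANTLR's caseInsensitive option. This is an analogy under that assumption, not a statement about ANTLR's exact case-folding; `LETTER`, `HEX_DIGIT`, `is_letter` and `is_hex_digit` below are names chosen for the sketch.

```python
import re

# Lowercase-only character classes, matched case-insensitively,
# as a stand-in for the suggested fragments under caseInsensitive = true.
LETTER = re.compile(r"[a-zа-яё]", re.IGNORECASE)
HEX_DIGIT = re.compile(r"[0-9a-f]", re.IGNORECASE)


def is_letter(ch: str) -> bool:
    """True if the single character `ch` matches the lowercase-only LETTER class."""
    return LETTER.fullmatch(ch) is not None


def is_hex_digit(ch: str) -> bool:
    """True if the single character `ch` matches the lowercase-only HEX_DIGIT class."""
    return HEX_DIGIT.fullmatch(ch) is not None


for ch in ["z", "Q", "я", "Я", "ё", "Ё", "F", "1", "_"]:
    print(repr(ch), is_letter(ch), is_hex_digit(ch))
```

Uppercase Latin and Cyrillic letters, including 'Ё', are still accepted even though only lowercase ranges are written, which is the behaviour the comment relies on.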

KvanTTT · 2025-06-25T12:01:48Z

kumir/KumirLexer.g4

+fragment LETTER     : [a-zA-Zа-яА-ЯёЁ];
+fragment DecInteger : DIGIT+;
+fragment HexInteger : '$' HEX_DIGIT+;
+fragment ExpFragment: [eE] [+-]? DIGIT+;


Suggested change

-fragment ExpFragment: [eE] [+-]? DIGIT+;
+fragment ExpFragment: [e] [+-]? DIGIT+;

teverett · 2025-06-26T16:34:03Z

Is this ready to merge?

KvanTTT · 2025-06-26T16:35:37Z

There are some minor issues to fix, but generally yes.

Bormotoon · 2025-06-26T16:36:38Z

Sorry, not yet. There will be some more changes and optimizations as suggested here.

Copilot

Pull request overview

This pull request adds comprehensive support for the Kumir programming language, a Russian algorithmic language primarily used for teaching programming in schools. The PR includes well-structured ANTLR v4 grammars (lexer and parser), extensive documentation, and a comprehensive test suite.

Key changes:

Complete ANTLR v4 grammar implementation for Kumir with lexer and parser definitions
60 example programs covering various language features (loops, arrays, strings, recursion, file I/O, etc.)
Documentation including README with language features and integration guide

Reviewed changes

Copilot reviewed 58 out of 58 changed files in this pull request and generated 7 comments.

File	Description
kumir/KumirLexer.g4	Defines lexical tokens for Kumir with case-insensitive Cyrillic keywords
kumir/KumirParser.g4	Defines parsing rules for language constructs including complex algorithm names
kumir/README.md	Comprehensive documentation covering language features, grammar files, and usage
kumir/desc.xml	Configuration file for grammars-v4 test infrastructure integration
kumir/examples/*.kum	60 example programs demonstrating various Kumir language features

Kumir lang support · 657d280

kaby76 reviewed Apr 27, 2025 · kumir/KumirParser.g4 (outdated)

kaby76 reviewed Apr 27, 2025 · kumir/examples/1-empty.kum (outdated)

KvanTTT requested changes Apr 27, 2025

kaby76 reviewed Apr 27, 2025 · kumir/KumirLexer.g4 (outdated)

kaby76 reviewed Apr 27, 2025 · kumir/KumirParser.g4

teverett added the kumir and new-grammar labels May 4, 2025

Updated according to recommendations; examples from the textbook have been added. · 66566da

kaby76 mentioned this pull request May 6, 2025 · antlr/antlr4#4830 (closed): Python runtime: NoViableAltException at <EOF> reported by listener despite successful exit from start rule in trace

Another update with some fixes from recommendations · 10cfb4a

kaby76 approved these changes May 18, 2025

teverett closed this Jun 17, 2025

teverett reopened this Jun 17, 2025

KvanTTT reviewed Jun 25, 2025

Some fixes as suggested. · 354c096

Copilot AI review requested due to automatic review settings December 18, 2025 08:23

Copilot started reviewing on behalf of Bormotoon December 18, 2025 08:23

Copilot AI reviewed Dec 18, 2025

Fixed stuff that copilot found · 68bf09e

Bormotoon requested a review from KvanTTT December 18, 2025 09:26

KvanTTT approved these changes Dec 18, 2025
