JohannesGezachew

This change allows you to use Amharic keywords alongside standard English
keywords in Python source code.

Modifications include:

1.  **Parser Generator (`Tools/peg_generator/pegen/c_generator.py`):**
    *   I updated the `_setup_keywords` method to include Amharic keywords in the `reserved_keywords` C array generated for the parser.
    *   Amharic keywords are mapped to the token types of their corresponding English keywords (e.g., "ከሆነ" maps to the IF token type).
    *   Keywords are now grouped by their UTF-8 byte length to ensure correct handling of multibyte Amharic characters in the C keyword array.
    *   The `n_keyword_lists` (max keyword byte length) is also calculated based on UTF-8 byte lengths.

2.  **Grammar (`Grammar/python.gram`):**
    *   I updated parsing rules to include Amharic keywords as alternatives to their English counterparts. For example, `if_stmt` can now be introduced by either 'if' or 'ከሆነ'.
    *   I applied this change to all keywords specified in the issue, including those for simple and compound statements, expressions, and operators.
    *   I also updated corresponding "invalid" rules in the grammar for consistency.
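
The byte-length grouping described in item 1 can be sketched as follows. This is a minimal illustration, not the code from `c_generator.py`: the helper name `group_keywords_by_byte_length`, the token type numbers, and the "maximum length plus one" convention for `n_keyword_lists` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical token type numbers, for illustration only.
# The Amharic keyword shares the token type of its English counterpart.
KEYWORDS = {"if": 500, "else": 501, "ከሆነ": 500}


def group_keywords_by_byte_length(keywords):
    """Group (keyword, token_type) pairs by UTF-8 encoded length in bytes."""
    groups = defaultdict(list)
    for word, token_type in keywords.items():
        byte_length = len(word.encode("utf-8"))
        groups[byte_length].append((word, token_type))
    return dict(groups)


groups = group_keywords_by_byte_length(KEYWORDS)

# Assumption: the C array is indexed directly by keyword length,
# so it needs one slot more than the longest keyword's byte length.
n_keyword_lists = max(groups) + 1 if groups else 0

for byte_length in sorted(groups):
    print(byte_length, groups[byte_length])
```

The distinction matters because `len("ከሆነ")` is 3 in Python (three code points), while its UTF-8 encoding is 9 bytes, and a C lookup that compares raw bytes sees the latter.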

To use this feature:
1.  Regenerate the parser and related files: `make regen-all`
2.  Recompile CPython: `make`

This enables direct execution of Python code written with Amharic keywords, for example:

```python
ተግባር አስላ(ሀ, ለ):
  ከሆነ ሀ > ለ:
    መልስ ሀ - ለ
  አለበለዚያ:
    መልስ ለ - ሀ

print(አስላ(5, 10))
```
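
For comparison, a standard-Python reading of the example above might look like the sketch below. Only the `ከሆነ` to `if` mapping is stated in this description; reading `ተግባር` as `def`, `መልስ` as `return`, and `አለበለዚያ` as `else` is inferred from the structure of the example, and the names `compute`, `a`, and `b` are stand-ins for the Amharic identifiers.

```python
# Assumed English-keyword equivalent of the Amharic example.
def compute(a, b):
    if a > b:
        return a - b
    else:
        return b - a


print(compute(5, 10))  # prints 5
```
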
@python-cla-bot

The following commit authors need to sign the Contributor License Agreement:

  • 161369871+google-labs-jules[bot]@users.noreply.github.com

CLA signed

@bedevere-app

bedevere-app bot commented May 26, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@picnixz
Member

picnixz commented May 26, 2025

This is a non-trivial feature that I don't think is useful. Please read https://devguide.python.org/ before submitting pull requests with major changes.

@picnixz picnixz closed this May 26, 2025
@JohannesGezachew JohannesGezachew deleted the feat-amharic-keywords branch May 26, 2025 23:14