gh-126807: pygettext: Do not attempt to extract messages from function definitions. #126808
Fixes a bug where pygettext would attempt to extract a message from code like this: `def _(x): pass`. This happens because pygettext only looks at one token at a time, so `_(x)` looks like a function call. However, since `x` is not a string literal, it would erroneously issue a warning. This commit fixes that by keeping track of the previous token and checking whether it is `def` or `class`.
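To illustrate the description above, here is a minimal sketch using the standard `tokenize` module. It is not code from pygettext; the helper name `name_and_op_tokens` is made up for this example. It shows that, once the leading `def` is out of view, the tokens of the definition are indistinguishable from those of a call.

```python
import io
import tokenize


def name_and_op_tokens(source):
    """Return (type name, string) pairs for the NAME and OP tokens in source.

    Hypothetical helper for illustration only; not part of pygettext.
    """
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NAME, tokenize.OP)
    ]


definition = name_and_op_tokens("def _(x): pass\n")
call = name_and_op_tokens("_(x)\n")

# The definition starts with the NAME token 'def' ...
print(definition[0])
# ... and the four tokens after it are exactly the tokens of the call '_(x)'.
print(definition[1:5] == call)
```

A scanner that inspects only the current token therefore cannot tell the two cases apart unless it remembers that `def` (or `class`) came first.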
| @@ -1,11 +1,10 @@ | |||
| #! /usr/bin/env python3 | |||
| # -*- coding: iso-8859-1 -*- | |||
No other file uses this encoding, so I think it's safe (and more practical) to use utf-8.
This is an unrelated change, so please keep the coding cookie.
Got it! I'll revert :) Would you accept a separate (perhaps not backported) PR that removes the coding and the commented-out code or do you think it's not worth it?
I'll accept it if there are pygettext tests for files with non-UTF-8 encoding.
Fair enough, I'll add it to my todo list :)
        
          
Tools/i18n/pygettext.py (outdated)
      | if ( | ||
| ttype == tokenize.NAME and tstring in opts.keywords | ||
| and (not self.__prev_token or not _is_def_or_class_keyword(self.__prev_token)) | ||
| ): | 
The new logic: we transition to `__keywordseen` only if we see one of the gettext keywords and the previous token is not `def` or `class`.
Note that no warnings are emitted if the --docstrings option is used. I think that we can use a similar approach. We can add
            if ttype == tokenize.NAME and tstring in ('class', 'def'):
                self.__state = self.__ignorenext
                return
where __ignorenext simply sets self.__state = self.__waiting.
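The suggestion above can be sketched as a small standalone state machine. This is a toy model under stated assumptions, not pygettext's actual `TokenEater`: the class `MiniEater`, the function `run`, and the state names `waiting`, `ignore_next`, `keyword_seen`, and `open_seen` are all hypothetical, and the real tool handles many more cases (multiple arguments, docstrings, comments, and so on).

```python
import io
import tokenize


class MiniEater:
    """Toy token-driven state machine; NOT pygettext's real TokenEater."""

    def __init__(self, keywords=("_",)):
        self.keywords = keywords
        self.messages = []   # raw string tokens found as first argument
        self.warnings = []   # non-string first arguments
        self.state = self.waiting

    def __call__(self, ttype, tstring):
        self.state(ttype, tstring)

    def waiting(self, ttype, tstring):
        # After 'def' or 'class', skip the next token (the name being
        # defined) so that it is never treated as a gettext keyword.
        if ttype == tokenize.NAME and tstring in ("class", "def"):
            self.state = self.ignore_next
            return
        if ttype == tokenize.NAME and tstring in self.keywords:
            self.state = self.keyword_seen

    def ignore_next(self, ttype, tstring):
        # Swallow exactly one token, then resume normal scanning.
        self.state = self.waiting

    def keyword_seen(self, ttype, tstring):
        if ttype == tokenize.OP and tstring == "(":
            self.state = self.open_seen
        else:
            self.state = self.waiting

    def open_seen(self, ttype, tstring):
        if ttype == tokenize.STRING:
            self.messages.append(tstring)
        else:
            self.warnings.append(tstring)
        self.state = self.waiting


def run(source):
    """Feed every token of source to a fresh MiniEater and return it."""
    eater = MiniEater()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        eater(tok.type, tok.string)
    return eater


result = run('def _(x): pass\nprint(_("hello"))\n')
print(result.messages, result.warnings)
```

In this sketch the definition `def _(x)` produces no warning because `_` is consumed by `ignore_next`, while a genuine call with a non-literal argument, such as `_(x)`, still lands in `warnings`.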
LGTM. 👍
Thanks @tomasr8 for the PR, and @serhiy-storchaka for merging it 🌮🎉. I'm working now to backport this PR to: 3.12, 3.13.
…unction definitions. (pythonGH-126808) Fixes a bug where pygettext would attempt to extract a message from a code like this: def _(x): pass This is because pygettext only looks at one token at a time and '_(x)' looks like a function call. However, since 'x' is not a string literal, it would erroneously issue a warning. (cherry picked from commit 9a45638) Co-authored-by: Tomas R. <[email protected]>
GH-126846 is a backport of this pull request to the 3.13 branch.
GH-126847 is a backport of this pull request to the 3.12 branch.