- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 33.2k
gh-72902: improve Fraction(str) speed (don't use regexp's) #133994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
Co-authored-by: Wolfgang Maier <[email protected]>
Co-authored-by: Pieter Eendebak <[email protected]>
3687e34    to
    05e9110      
    Compare
  
    
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
Co-authored-by: Pieter Eendebak <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skirpichev Could you also add a test case for Fraction('232e\t2') (this should fail). One optimization I tried passes all the tests, but fails on this one.
Tests pass, and all paths in the new code are needed, so approving. The new code is not very easy to read, but neither is the regular expression that is currently in main.
| 
 Done. 
 I think that the later is more clear. Also, it has more nice tracebacks. So, I'm still not sure about this. Speed gain is x2 at most. Maybe regexps can be eventually optimized? | 
| I was surprised that manual parsing is faster than a regexp-based approach. The regular expression is not optimal, so I tried to optimize it, but the benefit was minuscule. The regexp engine needs to be optimized. Can we please wait a few months with these changes while I try to optimize the regexp engine for such case? This seems so typical that a lot of code would benefit from optimization. | 
| 
 But not too much. 
 Sure. I'm not familiar with the regexp module code and I would be interested in such fix, if it's possible at all. BTW, so far this pr was not approved by core dev. @serhiy-storchaka, but what about other change, i.e. moving down numbers.Rational check? Maybe I should factor out this to a separate pr? See benchmarks (third column) and my arguments for this. | 
| Ok, now this has only optimization for str inputs. Other goes to #134320 | 
For reviewers. I'm not sure if it worth (code seems more complex, IMO), but speedup for string parsing seems noticeable (1.3-1.5x).