- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 33.2k
gh-106670: Allow Pdb to move between chained exceptions #106676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Addressed some of the comments in the linked issue, rebased pushed, and updated description of this issue to reflect the codebase status. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. For this feature to be merged, you'll need to work on the docs as well (it's a new command). We need review from a core dev, especially on the interface. The main concern I have is about the conflict between backward-compatibility and readability:
def post_mortem(traceback=None):
    # We allow(actually prefer) the input traceback to be an Exception.How should we deal with that? Changing the argument name traceback would break some backward compatibility, enforcing the argument to be an Exception would break a lot.
This is a more serious concern for public interfaces, but also on private methods, for example, Pdb.interaction takes a tb argument which can be an Exception now. This is not good for readability.
Could @iritkatriel or @brandtbucher share some thoughts on this? Thanks!
        
          
                Misc/NEWS.d/next/Library/2023-07-12-10-59-08.gh-issue-106670.goQ2Sy.rst
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | 
 I'm' not a particularly huge fan of the following but we can do: That way we only emit on use that use the keyword way, and this avoid using *args, **kwargs and custom hard to understand logic. Then in N versions you can go to  We can try to apply something similar to  | 
| 
 This might be too complicated. I'm personally leaning to the more aggressive way - rename the argument to  I did a preliminary search on github with  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the overall direction is great and we should have this feature in pdb. However, we do need opinions from core devs, especially for the backward compatibility part. I'll wait for a couple of days and reach out to core devs if no one volunteers :)
| Thanks for the feedback and patience, I'm back from taking some time off and updated to reflect your comment. I'd appreciate if you can get a core dev review. | 
| Rebased on main (except last commit to allow to just see the changes since  I now to a breadth first listing of the context/cause tree of exceptions. I've rephrased the first commit description as well as the PR description to mention  I think we could have a better UI to see the chain as a tree when doing  | 
        
          
                Lib/pdb.py
              
                Outdated
          
        
      | If given an integer as argument, switch to the exception at that index. | ||
| """ | ||
| if not arg: | ||
| nwid = len(str(len(self._chained_exceptions) - 1)) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just use 3 and somehow limit the number of exceptions in a chain you browse to 999. Truncate (with some msg) if there are more or something like that.
Which reminds me that we also need to handle cycles in the chain. Look at the traceback module to see how it's done there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, hardcoded to 3 digits (999) set on a private class variable of Pdb just in case.
At startup Pm stops after 999 exception and print a message that only the first 999 will be browsable with exceptions. I was not sure it was worth to make that lazy until called for in do_exceptions.
While I was also dealing with cycles, I added support for exceptions groups as well and added test with various exceptions, groups, cause, context...
        
          
                Lib/pdb.py
              
                Outdated
          
        
      | traceback, exception = tb_or_exc.__traceback__, tb_or_exc | ||
| current = exception | ||
| while current: | ||
| currents = [exception] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow - what it currents, and what is _exceptions, and why do we need both?  Are you trying to allow navigating into exception groups? I think that would require some more design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I might have taken your following comment too literally:
Which reminds me that we also need to handle cycles in the chain. Look at the traceback module to see how it's done there.
In the tb module you use a queue and a set of _seen traceback object ids. I assumed that what you wanted me to do, but the queue there can only old multiple elements because it handles exceptions group I belive. So I assumed you pointed me there for me to also see the exception group.
I've remove the handling of exception groups, and reverted to the previous logic, simply check if the cause or context is in the exceptions we've already visited.
Unless I still misunderstand what you asked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I wasn't clear. This is what I meant. This seems fine from the exceptions POV, but I'd like @gaogaotiantian to think about implications for PDB. In particular, we are now saving a list of exceptions, which can create references to practically everything in the program through tracebacks (which reference frames). So what are the implications re memory leaks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option is to store a traceback.TracebackException object instead of the exception itself. These are objects that don't hold references to the whole world.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can use a weaklist, as the result is anyway derived from sys.last_exc, there is at least this that keeps everything around and it is unlikely to disapear untl we exit the pm ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding onto this immediately, it looks like most setup/forget are in pairs, if we want to do a refactor that I would prefer to do as a separate PR, I would change the setup/forget pair into a context manager.
I also pushed a commit that remove references to the exceptions after the command loop in interaction, and put a comment as to why it's not in forget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, setup() is to setup the frames and exceptions needs to setup a new set of frames when switching. Never mind.
Can we still pack the code setting up the exceptions to a separate function and call it from interaction? The cleaning part is compact and easy to follow at this point, but the setup piece is a bit large.
I actually have a separate question for @iritkatriel - we expect this feature mainly used with pm right? What's the most likely usage from the users? I would guess it's executing the code with CLI - python -m pdb the_code.py. However, the current implementation won't work on it because the main() function is not changed for the new feature. Should we change the main() function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the
main()function?
I don't see why not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then let's change the main function as well, so that python -m pdb raise_exception.py could enter pm with exception list. Another minor issue - when there's no exception, exceptions command does nothing. I think we should tell the user that there's no exceptions or something like that - so the users know that they are not supposed to use the command. I know break does the same thing - I don't like that either but let's deal with it later. We should do something like display - let the users know what happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Message added in do_exceptions if no chained exception found.
- Main function modified to pass exception to post_mortem by default.
- Extracted the logic to create self._chained_exceptionsinto both a utility function and a context manager to make sure references are properly released.
This lets Pdb receive an exception, instead of a traceback, and when
this is the case and the exception are chained, the new `exceptions` command
allows to both list (no arguments) and move between the chained exceptions.
That is to say if you have something like
    def out():
        try:
            middle()                                # B
        except Exception as e:
            raise ValueError("foo(): bar failed")   # A
    def middle():
        try:
            return inner(0)                         # D
        except Exception as e:
            raise ValueError("Middle fail") from e  # C
    def inner(x):
        1 / x                                       # E
Only A was reachable after calling `out()` and doing post mortem debug.
With this all A-E points are reachable with a combination of up/down,
and ``exception <number>``.
This also change the default behavior of ``pdb.pm()``, as well as
`python -m pdb <script.py>` to receive `sys.last_exc` so that chained
exception navigation is enabled.
We do follow the logic of the ``traceback`` module and handle the
``_context__`` and ``__cause__`` in the same way. That is to say, we try
``__cause__`` first, and if not present chain with ``__context__``. In
the same vein, if we encounter an exception that has
``__suppress_context__`` (like when ``raise ... from None``), we do stop
walking the chain.
Some implementation notes:
 - We do handle cycle in exceptions
 - cleanup of references to tracebacks are not cleared in ``forget()``, as
   ``setup()`` and ``forget()`` are both for setting a single
   exception.
 - We do not handle sub-exceptions of exception groups.
Closes pythongh-106670
    Also move the release of the list of exception to a context manager for security
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just need to resolve the warning in the news file.
Co-authored-by: Alex Waygood <[email protected]>
        
          
                Doc/whatsnew/3.13.rst
              
                Outdated
          
        
      |  | ||
| * Add ability to move between chained exceptions during post mortem debugging in :func:`~pdb.pm` using | ||
| the new ``exceptions [exc_number]`` command for Pdb. (Contributed by Matthias | ||
| Bussonnier in :gh:`106676`.) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry one last nit. The modules here are sorted alphabetically, so this needs to be between path lib and sqlite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, done and rebased/squashed a bit for a cleaner history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, done and rebased/squashed a bit for a cleaner history.
Please don't use rebases or squashes on CPython PRs, when you can help it :) It interacts badly with the GitHub UI, making it hard for reviewers to see what changed between commits. Everything in CPython is squash-merged anyway, so we really don't care about messy commit history within PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
In the future don't squash. We squash on merge anyway, so the commit history doesn't end up in the commit log. But it's useful to have it on the PR, and it's easier to review just the last commit and see what you've changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can also un-squash as well I have the reflog.
Head branch was pushed to by a user without write access
| @iritkatriel FYI the latest force-push automatically disabled auto-merge | 
| Thanks all for the reviews and the merge let me know if there is anything I can help with. | 
Closes ipython#13982 This is a "backport" of python/cpython#106676 See documentation there
Closes ipython#13982 This is a "backport" of python/cpython#106676 See documentation there
Closes #13982 This is a "backport" of python/cpython#106676 See documentation there
The introduction of chained exception in pythongh-106676 would lead to File .../Lib/pdb.py", line 298, in setup self.curframe = self.stack[self.curindex][0] ~~~~~~~~~~^^^^^^^^^^^^^^^ IndexError: list index out of range This fixes that by filtering exceptions that that do not have a stack. Update tests to not use stack-less exceptions when testing another feature, and add an explicit test on how we handle stackless exceptions.
The introduction of chained exception in pythongh-106676 would lead to File .../Lib/pdb.py", line 298, in setup self.curframe = self.stack[self.curindex][0] ~~~~~~~~~~^^^^^^^^^^^^^^^ IndexError: list index out of range This fixes that by filtering exceptions that that do not have a stack. Update tests to not use stack-less exceptions when testing another feature, and add an explicit test on how we handle stackless exceptions.
The introduction of chained exception in pythongh-106676 would lead to File .../Lib/pdb.py", line 298, in setup self.curframe = self.stack[self.curindex][0] ~~~~~~~~~~^^^^^^^^^^^^^^^ IndexError: list index out of range This fixes that by filtering exceptions that that do not have a stack. Update tests to not use stack-less exceptions when testing another feature, and add an explicit test on how we handle stackless exceptions.
The introduction of chained exception in pythongh-106676 would sometime lead to File .../Lib/pdb.py", line 298, in setup self.curframe = self.stack[self.curindex][0] ~~~~~~~~~~^^^^^^^^^^^^^^^ IndexError: list index out of range This fixes that by filtering exceptions that that do not have a stack/traceback. Update tests to not use stack-less exceptions when testing another feature, and add an explicit test on how we handle stackless exceptions.
| I seemed to have missed and edge case that make the debugger crash, see #108865 as a followup. | 
| Actually there's a bug in the documentation, it says you can use  | 
| @user202729 thank you for pointing this out. Fixed in #132277 | 
Allow Pdb to move between chained exception.
This lets Pdb receive an exception, instead of a traceback, and when
this is the case and the exception are chained, the new
exceptionscommandallows to both list (no arguments) and move between the chained exceptions.
That is to say if you have something like
Only A was reachable after calling
out()and doing post mortem debug.With this all A-E points are reachable with a combination of up/down,
and
exceptions <number>.This also change the default behavior of
pdb.pm(), as well aspython -m pdb <script.py>to receivesys.last_excso that chainedexception navigation is enabled.
We do follow the logic of the
tracebackmodule and handle the_context__and__cause__in the same way. That is to say, we try__cause__first, and if not present chain with__context__. Inthe same vein, if we encounter an exception that has
__suppress_context__(like whenraise ... from None), we do stopwalking the chain.
Some implementation notes:
forget(), assetup()andforget()are both for setting a singleexception.
context manager.
maximum number we allow
Closes gh-106670