Feature or enhancement
Proposal:
TL;DR: I am not suggesting adding the assert at runtime. I am only suggesting that we avoid setting exceptions when another exception is already set, unless we explicitly clear that one before calling `PyErr_Format`.

TL;DR 2: This is NOT easy work, and I plan to work on it myself.
I've seen many cases where `PyErr_*` functions are called to set an exception while another exception is already set. It doesn't cause a crash, because we always clear the error indicator in `PyErr_Format` (since we need to use `PyUnicode_FromFormatV`). However, this means that we may be hiding some exceptions that actually shouldn't be hidden.
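To make the failure mode concrete, here is a minimal sketch (hypothetical code, not an actual CPython call site) of the pattern described above, where a pending exception from a callee is silently replaced:

```c
#include <Python.h>

/* Hypothetical callee: fails and sets its own, more specific exception. */
static int
do_work(PyObject *obj)
{
    (void)obj;
    PyErr_NoMemory();
    return -1;
}

static PyObject *
wrapper(PyObject *obj)
{
    if (do_work(obj) < 0) {
        /* The pending MemoryError is silently replaced here, because
         * PyErr_Format() clears the error indicator before formatting. */
        PyErr_Format(PyExc_ValueError, "operation failed for %R", obj);
        return NULL;
    }
    Py_RETURN_NONE;
}
```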
One example is `PyLong_FromUnicodeObject`, which calls `PyLong_FromString` and, on failure, sets an exception that is identical to the one that could already have been set by `PyLong_FromString`. In particular, we do not propagate a MemoryError from `PyLong_FromString` (which happens if the `PyUnicode_FromStringAndSize` call inside `PyLong_FromString` fails).
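A minimal sketch of the kind of fix this implies for the caller (simplified and hypothetical, not the actual `PyLong_FromUnicodeObject` source; `unicode`, `buffer` and `base` stand in for the real arguments): only replace the callee's exception when it is the expected `ValueError`, and propagate anything else, such as a MemoryError, unchanged.

```c
#include <Python.h>

static PyObject *
long_from_unicode_sketch(PyObject *unicode, const char *buffer, int base)
{
    char *end = NULL;
    PyObject *result = PyLong_FromString(buffer, &end, base);
    if (result == NULL) {
        if (!PyErr_ExceptionMatches(PyExc_ValueError)) {
            /* e.g. a MemoryError raised inside PyLong_FromString():
             * propagate it instead of overwriting it. */
            return NULL;
        }
        /* Here the overwrite is intentional, so clear the pending
         * ValueError explicitly before setting the caller's message. */
        PyErr_Clear();
        PyErr_Format(PyExc_ValueError,
                     "invalid literal for int() with base %d: %R",
                     base, unicode);
        return NULL;
    }
    return result;
}
```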
I am currently investigating the other cases, but many tests would fail if I added an `assert(!PyErr_Occurred());` inside `PyErr_Format()` (mostly because of double-setting the same exception). I would suggest that we first fix the functions that actually set double exceptions, and then explicitly document that `PyErr_Format` shouldn't be called with an exception set, as it will replace the existing one.
Analysis
I was more or less able to fix the cases caught by `assert(!PyErr_Occurred());`, and the patterns I've seen so far are the following:
Exact matching exceptions overwritten
Sometimes, we explicitly match some exception types and then overwrite the exception. This happens in the numerical API with the `_range_error` function.

I've created something locally for that one but won't open a PR for now. The idea is to just look at places that match some specific exceptions, clear the exception, and then set the new one; the intent there was to clear any existing exception anyway.
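For illustration, a minimal sketch of that "match, clear, then overwrite" pattern (hypothetical code, not the actual `_range_error` implementation):

```c
#include <Python.h>

static void
range_error_sketch(int underflow)
{
    if (PyErr_Occurred() && !PyErr_ExceptionMatches(PyExc_OverflowError)) {
        /* Some other exception (e.g. MemoryError) is pending:
         * propagate it instead of replacing it. */
        return;
    }
    /* Either nothing is pending, or the pending OverflowError is the one
     * we intend to replace, so clearing it is explicit and safe. */
    PyErr_Clear();
    PyErr_SetString(PyExc_OverflowError,
                    underflow
                    ? "value too small to convert to C integer"
                    : "value too large to convert to C integer");
}
```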
Exceptions redundantly overwritten
Sometimes, we overwrite `TypeError` or `ValueError` exceptions with the ... same exception message (because one C API function calls another C API function, and both of them need to set exceptions since they can be called independently). The solution is to read the code and only set a "fresh" exception in the caller if no error is already set (it's cheaper than checking whether the callee's exception matches a specific type).
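A minimal sketch of that caller-side check (hypothetical names; `helper_convert()` stands in for a callee that sets its own exception on failure):

```c
#include <Python.h>

extern int helper_convert(PyObject *obj, long *out);  /* hypothetical callee */

static int
caller_sketch(PyObject *obj, long *out)
{
    if (helper_convert(obj, out) < 0) {
        if (!PyErr_Occurred()) {
            /* Only set the "fresh" exception if the callee did not
             * already report one; a single flag check is cheaper than
             * matching the callee's exception against a type. */
            PyErr_Format(PyExc_TypeError,
                         "expected an integer-like object, got %.200s",
                         Py_TYPE(obj)->tp_name);
        }
        return -1;
    }
    return 0;
}
```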
Exceptions in Python/getargs.c
The parsing process in `getargs.c` is messy on the exception-handling side. Whenever there is an error, we use the `converterr` function, which only exists to create error messages. However, we don't "chain" possible exceptions here: if the C API is called during the conversion and fails, we overwrite that error with a more general error saying "cannot convert" or something like this. I think I'll need to rewrite the entire logic of setting exceptions so that we know whether we should set a new exception message or not, depending on what we are currently parsing. This will be a separate PR.
Exceptions from PyObject_GetBuffer
The most common "bad" occurrences I found are related to the usage of `PyObject_GetBuffer`. As per PEP 3118, any object implementing the buffer API in C must use `PyExc_BufferError` if the request cannot be fulfilled and return -1 (this is highlighted in the docs, https://docs.python.org/3/c-api/buffer.html#c.PyObject_GetBuffer, with a huge MUST, so it's not a "maybe", it's explicit). In other words, if a custom implementation uses the C API, all its exceptions must be converted into `PyExc_BufferError`. I'm not sure that all implementations respect this, as it's easy to simply propagate the exception.
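For reference, a minimal sketch of what a conforming `bf_getbuffer` implementation looks like (the exporter type and its fields here are hypothetical):

```c
#include <Python.h>

typedef struct {
    PyObject_HEAD
    char *data;        /* hypothetical payload */
    Py_ssize_t size;
} MyObject;

static int
myobject_getbuffer(PyObject *self, Py_buffer *view, int flags)
{
    MyObject *obj = (MyObject *)self;
    if (obj->data == NULL) {
        /* PEP 3118: on failure, bf_getbuffer must raise BufferError,
         * set view->obj to NULL and return -1. */
        PyErr_SetString(PyExc_BufferError, "buffer is not available");
        view->obj = NULL;
        return -1;
    }
    return PyBuffer_FillInfo(view, self, obj->data, obj->size,
                             /*readonly=*/1, flags);
}
```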
Unfortunately, almost all places that use `PyObject_GetBuffer` and set an exception afterwards disregard a possible `PyExc_BufferError` that could have happened, and overwrite any exception, usually by raising a plain TypeError. The others just return -1 and propagate any existing exception (which is fine). For those overwriting the exception, this is not good. Instead, I decided on the following approach:
- If `PyObject_GetBuffer` raised a `TypeError`, we can safely overwrite it. We may lose information, though, if it was raised not by the base implementation of `PyObject_GetBuffer` but by a custom implementation. However, there is currently no way to distinguish between an exception set outside CPython and an exception set by CPython. We could try matching the error messages, but if we were to change the error message, the entire machinery would collapse.
- If `PyObject_GetBuffer` raised a non-`TypeError` exception and we actually overwrite it with a `TypeError`, I suggest adding that exception as the `__context__` of that `TypeError` (see the sketch after this list). There shouldn't be cycles here because the `TypeError` would be newly created. I doubt people would be happy if I were to add some eager assert about the fact that only `BufferError` can be raised from `PyBufferProcs.bf_getbuffer`, so I'll abstain for now. Maybe adding a warning would be a good idea in the first place (but it may become annoying, as most of the time this warning would come up in extension modules that extensively use the buffer API, usually outside the user's control).
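A sketch of that chaining step, assuming the Python 3.12+ exception API (`PyErr_GetRaisedException`/`PyErr_SetRaisedException`); on older versions the same can be done with `PyErr_Fetch`/`PyErr_Restore`. The call site and messages are hypothetical:

```c
#include <Python.h>

static int
get_buffer_with_context(PyObject *obj, Py_buffer *view, int flags)
{
    if (PyObject_GetBuffer(obj, view, flags) < 0) {
        if (PyErr_ExceptionMatches(PyExc_TypeError)) {
            PyErr_Clear();  /* safe to overwrite, per the first bullet */
            PyErr_Format(PyExc_TypeError,
                         "expected a bytes-like object, got %.200s",
                         Py_TYPE(obj)->tp_name);
            return -1;
        }
        /* Non-TypeError (e.g. BufferError): chain it instead of dropping it. */
        PyObject *original = PyErr_GetRaisedException();
        PyErr_Format(PyExc_TypeError,
                     "expected a bytes-like object, got %.200s",
                     Py_TYPE(obj)->tp_name);
        PyObject *new_exc = PyErr_GetRaisedException();
        PyException_SetContext(new_exc, original);  /* steals 'original' */
        PyErr_SetRaisedException(new_exc);
        return -1;
    }
    return 0;
}
```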
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
There was a previous suggestion here: #67759, where it was mentioned that "`PyUnicode_FromFormatV()` must not be called with an exception set".