add descriptive error message if re.split fails in numpy #784

tobiscode · 2025-03-14T08:58:10Z

Hi,
I ran into the same issue as reported in #758 and have created a fork that implements the solution proposed by @dhermangh. In its current form, it prints the affected content and the header it appears under. Because the individual docstring conversion function does not know which function, class, etc. it is operating on, it can't report that (or a line number in the original file, for that matter). If there is a way to get the latter, I'm happy to add it as well, but for now it at least suggests to the user that a type descriptor might be missing.
Does this look good to everyone? Happy to include more fixes.
Cheers,
Tobias

mhils · 2025-03-14T09:07:45Z

Thanks! I'd suggest we generically catch all exceptions in convert, and then do something like this:

except AnyException as e:
    raise RuntimeError(f"Docstring processing failed:\n{content=}\n{source_file=}\n{docformat=}") from e

(see pdoc.extract for AnyException - we can move it somewhere else)

tobiscode · 2025-03-17T08:33:34Z

Hi,

Got it. Do you want that in addition to the numpy-specific catch block, or instead of it? The advantage of keeping both would be that the specific block can point exactly to the problematic part of the docstring, which a catch block in convert by itself could not. It might become too verbose for the numpy case, but if I define a NumPyParseException or something similar, I could also handle that properly.

Cheers

mhils · 2025-03-17T08:52:26Z

Instead of. The point of that is to help debugging and fixing pdoc.

I'm happy to merge a second PR that fixes the actual bug in the numpy docstring conversion.

tobiscode · 2025-03-17T09:37:26Z

> that fixes the actual bug in the numpy docstring conversion.

I'm not sure it is a bug: the NumPy docstring convention requires a type for both input and output parameters. Sphinx has an extension that takes advantage of Python type hints, but I think it's just for convenience rather than a newer docstring convention. How would you suggest fixing the parsing bug otherwise?
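To make the failure mode concrete, here is a small illustration of how unpacking the result of `re.split` raises a `ValueError` when an entry has no type. This is a hypothetical helper written for this thread, not the code in pdoc's numpy converter; the regular expression and the function name `split_spec` are assumptions.

```python
import re


def split_spec(spec: str) -> tuple:
    """Split a 'name : type' entry into its two parts (illustrative only)."""
    # With no colon in the entry, re.split returns a single-element list,
    # so the two-target unpacking below raises ValueError.
    name, type_ = re.split(r"\s*:\s*", spec, maxsplit=1)
    return name, type_


print(split_spec("x : int"))  # an entry with a type splits cleanly

try:
    split_spec("The result.")  # a description with no 'name : type' part
except ValueError as e:
    print(f"ValueError: {e}")
```

The second call fails because the list returned by `re.split` has only one element, which matches the kind of error described in #758.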

mhils · 2025-03-17T14:41:30Z

pdoc.docstrings.convert is not supposed to raise an exception under any circumstances. It may produce a warning, but it must not error.

tobiscode · 2025-03-17T15:43:00Z

> pdoc.docstrings.convert is not supposed to raise an exception under any circumstances. It may produce a warning, but it must not error.

Now I'm even more confused, since in your first comment you suggested to catch AnyException in pdoc.docstrings.convert and then turn it into a RuntimeError. Don't get me wrong, I'm happy to implement whatever solution you prefer, it's just unclear to me what that is.

If you meant pdoc.docstrings.**numpy** is not supposed to raise an exception, then I could modify the re.split operation so that it emits a warning instead. In that case, I would need to know what the code is supposed to do for that section: either ignore it (in which case the Returns section would be empty) or add a dummy variable name (maybe _, in keeping with some Python conventions), which would allow the description to still appear in the generated documentation.

mhils · 2025-03-17T16:07:36Z

> Now I'm even more confused, since in your first comment you suggested to catch AnyException in pdoc.docstrings.convert and then turn it into a RuntimeError. Don't get me wrong, I'm happy to implement whatever solution you prefer, it's just unclear to me what that is.

It's not meant to ever throw an exception (but if it does, it should come with useful context for debugging). Phrased differently: If the code has a bug, except AnyException is supposed to attach the full docstring to the traceback for debuggability and then reraise.

> If you meant pdoc.docstrings.numpy is not supposed to raise an exception,

Correct, it's not supposed to. If it does, it's a bug. Generally speaking, if something does not parse as a particular format, we should leave the text as-is. I haven't looked super close into the specific numpy case. In either case, that's a separate PR with a test case for more clarity.

Thanks!
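The "leave the text as-is" behavior described above could look roughly like the following sketch. It is only an illustration under stated assumptions: `format_spec` is a hypothetical helper, and the Markdown it produces for a well-formed entry is made up for the example rather than taken from pdoc's output.

```python
import re
import warnings


def format_spec(spec: str) -> str:
    """Format a 'name : type' entry, or return it unchanged if it does not parse."""
    parts = re.split(r"\s*:\s*", spec, maxsplit=1)
    if len(parts) != 2:
        # Not in the expected format: warn, but never raise.
        warnings.warn(f"Could not parse {spec!r} as 'name : type'; leaving it unchanged.")
        return spec
    name, type_ = parts
    return f"**{name}** (`{type_}`)"
```

With this shape, an entry such as `"The result."` passes through untouched and only triggers a warning, so the description still shows up in the generated documentation.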

tobiscode · 2025-03-17T16:12:28Z

Okay now I think I get it, thanks for the clarification. What I'll do then is:

1. move the try/except block from the numpy subfunction to the convert function, and update this PR with that commit, and
2. make a minimal test case that, for the NumPy-specific case mentioned in issue #758 ("pdoc should display which dostring line is causing the issue if a ValueError is raised on re.split"), produces different test outputs. I'll write that to the issue first, and then after deciding how to handle it, I'll make a new PR.

mhils · 2025-03-17T16:15:46Z

Perfect, thanks!

tobiscode · 2025-03-17T16:44:59Z

Okay so it seems the test coverage action fails on these lines:

    except AnyException as e:
        raise RuntimeError(

How should I address this? By making a test case in test_docstrings.py which will raise that exception?

mhils · 2025-03-17T17:12:51Z

Yes please. The test needs to mock one of the subfunctions to raise. pytest's monkeypatch is great for that.
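A rough sketch of what such a test could look like follows. The `docstrings` namespace here is a hypothetical stand-in so the example runs on its own; in the real test the target of `monkeypatch.setattr` would be the actual module, and the names `rst` and `convert` are used only to mirror this discussion.

```python
import types

import pytest

# Assumption: same broad exception tuple as discussed earlier in the thread.
AnyException = (SystemExit, GeneratorExit, Exception)

# Hypothetical stand-in for the module under test.
docstrings = types.SimpleNamespace()


def _rst(content: str) -> str:
    return content


def _convert(content: str, docformat: str = "restructuredtext") -> str:
    try:
        # Looked up at call time, so a monkeypatched replacement is used.
        return docstrings.rst(content)
    except AnyException as e:
        raise RuntimeError(
            f"Docstring processing failed:\n{content=}\n{docformat=}"
        ) from e


docstrings.rst = _rst
docstrings.convert = _convert


def test_convert_exception(monkeypatch):
    def raise_(*_):
        raise Exception("simulated converter bug")

    # Replace the subfunction with one that always raises.
    monkeypatch.setattr(docstrings, "rst", raise_)
    with pytest.raises(RuntimeError, match="Docstring processing failed"):
        docstrings.convert("Some docstring")
```

The `monkeypatch` fixture undoes the replacement after the test finishes, so other tests still see the original `rst`.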

tobiscode · 2025-03-18T09:01:19Z

Okay, so since I wasn't familiar with monkeypatch I imitated how the AnyException catches were tested in test_extract.py and did something similar in test_docstrings.py. I hope that's okay, otherwise I can try to do it the monkeypatch way. With that, all tests pass, and the branch could be merged.

There is still a test warning appearing in the logs, however, which seems unrelated to my changes:

 test/test_doc_types.py::test_eval_fail2
  D:\a\pdoc\pdoc\pdoc\doc_types.py:148: UserWarning: Error parsing type annotation xyz for a. Import of xyz failed: name 'xyz' is not defined
    warnings.warn(

It looks to me in doc_types.py that this is one of the possible outcomes, but not the one it's testing for. I don't know how to fix this since I don't understand the logic flow in this test, but happy to make some changes if someone explains it to me. (And probably make a new PR for that.)

mhils · 2025-03-18T09:09:26Z

test/test_docstrings.py

    assert not s or content or options


+def test_convert_exception():


This is not a useful test because it tests unwanted behavior (we don't want any known cases of raising). Let's use pytest's monkeypatch to replace rst with a function that raises an exception, and then test against that the way you do now (with pytest.raises() is great).

tobiscode · 2025-03-18

Ahhh I see, that makes more sense.

Does the new commit look good now?

mhils

Thanks! 🍰 😃

add descriptive error message if re.split fails in numpy

1538ea8

Tobias Köhne and others added 2 commits March 17, 2025 16:31

move error catching from numpy() to convert(), reusing AnyException

242a252

[autofix.ci] apply automated fixes

6c26ce9

Tobias Köhne added 2 commits March 18, 2025 08:42

add error-raising test for docstring.convert

1bdfc83

fix test_smoke after moving AnyException

7a0c36e

Merge branch 'main' into split-error-trace

e17d523

mhils reviewed Mar 18, 2025


change test_convert_exception to use monkeypatch to raise error

3f386e9

mhils approved these changes Mar 18, 2025


mhils merged commit fd74e2e into mitmproxy:main Mar 18, 2025
14 checks passed

tobiscode mentioned this pull request Mar 18, 2025

pdoc should display which dostring line is causing the issue if a ValueError is raised on re.split #758

Closed
