@ievgen-kapinos ievgen-kapinos commented Oct 27, 2025

Closes #2610

This PR adds execution of all Python code snippets in the documentation via sphinx.ext.doctest.

In this PR:

  1. Added a new CI step that runs Sphinx's doctest build. Effectively, we test:
    • code blocks marked with the testcode directive in *.md files
    • code blocks from Python docstrings imported via the autoclass directive in *.rst files
  2. Added documentation on how to execute the code snippets locally.
  3. Made all output file names unique, so the output of every snippet can be checked in the docs/_build/doctest/pypdf_test directory.
  4. Reconfigured the CI pipeline steps for all Sphinx builds (html and doctest). The sphinx-build tool now runs in the docs directory, and the output file structure in docs/_build matches:
  5. Code snippets are now executable. However, it makes sense to review all input files used and make sure their content matches the text and images in the documentation, which is not always the case. I consider this out of scope for this task.
  6. A few tests are marked as :skipif: True:
    • We do not test code that requires access to cloud services: AWS and GCP.
    • We do not test code that requires additional dependencies: svgwrite and pdfminer.six. These can be added to requirements/docs.in. Please let me know if we need them.
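
To make item 1 concrete, here is a minimal sketch of the kind of docs/conf.py settings that enable the doctest builder. It is an illustration only, not the configuration from this PR: the extension list assumes that myst_parser is what lets the *.md pages use Sphinx directives, and the content of the setup string is hypothetical.

```python
# Hypothetical excerpt of a Sphinx conf.py; the real pypdf configuration may differ.

# sphinx.ext.doctest provides the "doctest" builder and the
# testcode/testsetup/testcleanup directives.
# myst_parser is an assumption: it allows directives inside *.md pages.
extensions = [
    "myst_parser",
    "sphinx.ext.autodoc",
    "sphinx.ext.doctest",
]

# Code that Sphinx runs before every test group, as if it were placed in a
# testsetup directive. The content below is illustrative only.
doctest_global_setup = """
import os
"""
```

With settings along these lines, running sphinx-build with the doctest builder executes the marked code blocks instead of rendering them.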

P.S. Read the Docs has a nice feature for viewing the diff between the main and PR branches. See the diff for this PR.

codecov bot commented Oct 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.10%. Comparing base (3b5c85f) to head (9cefe93).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3507   +/-   ##
=======================================
  Coverage   97.10%   97.10%           
=======================================
  Files          57       57           
  Lines        9711     9711           
  Branches     1759     1759           
=======================================
  Hits         9430     9430           
  Misses        168      168           
  Partials      113      113           

@ievgen-kapinos
Author

Hi @j-t-1 and @stefan6419846

May I ask you to pre-review the overall approach?

For now, only execution via CI is implemented. So far we have 72 failures. See the full list here

If this approach is generally acceptable, then within the scope of this PR I'll:

  • fix all examples in the docs. I expect the PR to become very big after that, which is why I'm requesting a pre-review
  • add a description to the documentation of how it works and how to execute it locally

@ievgen-kapinos
Author

FYI: code blocks in the HTML output render as before after I replaced python with {testcode}.

        sphinx-build --nitpicky --fail-on-warning --keep-going --show-traceback --builder html docs build/sphinx/html
    - name: Test docs examples
      run: |
        sphinx-build --nitpicky --fail-on-warning --keep-going --show-traceback --builder doctest docs build/sphinx/html
Collaborator

Are all of these parameters required for this call? I am especially not sure about the output directory, as just running doctests should not write any HTML in theory?

Author

These parameters do nothing. I've removed them. Thanks!

As for the output directory: it is necessary. The sphinx.ext.doctest extension creates a .doctrees sub-folder with *.doctree files. I suppose these files hold the documentation in some universal format that can then be transformed into a variety of output formats. The previous CI step (where we run sphinx-build ... --builder html ...) also creates this sub-folder. As far as I can see, Sphinx is smart enough to look only for outdated files, so it makes sense to point to the build/sphinx/html folder.
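
For comparison, an alternative to reusing the HTML output directory is to give both builds a shared doctree cache through sphinx-build's -d option, so that the doctest build can write to its own output folder. The sketch below only assembles the two command lines. The helper name and the directory names are hypothetical, and this layout has not been tried against this PR's CI.

```python
import shlex


def sphinx_command(builder: str, source_dir: str, out_dir: str, doctree_dir: str) -> list[str]:
    """Assemble a sphinx-build call: -b selects the builder, -d the doctree cache."""
    return ["sphinx-build", "-b", builder, "-d", doctree_dir, source_dir, out_dir]


# Hypothetical directories; both builds point at the same doctree cache.
html_cmd = sphinx_command("html", "docs", "build/sphinx/html", "build/sphinx/doctrees")
doctest_cmd = sphinx_command("doctest", "docs", "build/sphinx/doctest", "build/sphinx/doctrees")

# Print the commands in a form that can be pasted into a shell or a CI step.
print(shlex.join(html_cmd))
print(shlex.join(doctest_cmd))
```

Whether the second build actually reuses the cached doctrees depends on Sphinx's own change detection, so this is a layout option rather than a guaranteed speed-up.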

@ievgen-kapinos ievgen-kapinos marked this pull request as ready for review November 1, 2025 10:36
@ievgen-kapinos
Author

Hi @stefan6419846 and @j-t-1

This PR is ready for review. Please check out the updated description first.

Collaborator

@stefan6419846 stefan6419846 left a comment

Thanks for the PR.

I have left some review comments where your changes were not clear to me. Please note that some comments might apply to other files with similar patterns as well.

Author

@ievgen-kapinos ievgen-kapinos left a comment

@stefan6419846 Thanks a lot for the review! It is big and requires a lot of effort on your side. I appreciate it.

I've covered all your questions. Many of them were simply fixed (and marked with 👍). As for the rest, I've added explanations and sometimes questions. Let me know what needs to be done next in this PR.

    page.merge_page(stamp, over=False)  # here set to False for watermarking
    writer.write("out.pdf")
    writer.write("out-underlay.pdf")
Collaborator

Do we use "underlay" somewhere else as well? Otherwise I would go with consistent naming like "stamp" for over and "watermark" for under, which seems to be the current naming scheme?

Author

Done

    want the text. You might use pdfminer.six as a fallback and do this:

    ```python
    % We prefer not to execute doc examples for third-party package "pdfminer" used only in one code snippet
Collaborator

The package is named pdfminer.six:

Suggested change
% We prefer not to execute doc examples for third-party package "pdfminer" used only in one code snippet
% We prefer not to execute doc examples for third-party package "pdfminer.six" used in one code snippet only

Author

@ievgen-kapinos ievgen-kapinos Nov 5, 2025

Yeah... I missed it. It was mentioned 3 lines above 👍

    file = os.path.join(dst_root_dir, file_name)
    if os.path.isfile(file):
        if not has_files:
            print("Docs page was not configured properly for running code examples")
Collaborator

I guess these should print to stderr instead of stdout?

Author

With stderr, doctest ignores the output (see the official docs). I just double-checked: the message appears in the output, but the test does not fail. My intention is the opposite: to force use of the {testsetup} directive whenever doc examples write output files.
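
The difference can be reproduced with the standard library's doctest module alone. The sketch below runs two one-line examples that have no expected output: one prints to stdout, the other to stderr. The helper name count_failures is made up for this illustration, and the sketch assumes that sphinx.ext.doctest treats cleanup code the same way as plain doctest treats an example with empty expected output.

```python
import doctest
import sys


def count_failures(source: str) -> int:
    """Run the doctest examples found in a string and return the failure count."""
    test = doctest.DocTestParser().get_doctest(source, {"sys": sys}, "demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the textual failure report; only the counts matter here.
    return runner.run(test, out=lambda text: None).failed


# No expected output is given, so anything printed to stdout is a mismatch.
stdout_failures = count_failures('>>> print("unexpected file")\n')

# doctest does not capture stderr, so this example still passes.
stderr_failures = count_failures('>>> print("unexpected file", file=sys.stderr)\n')

print(stdout_failures, stderr_failures)
```

Only the stdout variant is counted as a failure, which is why the message is printed to stdout when the goal is to make the build fail.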

docs/conf.py Outdated
            print("Deleting unexpected file(s) in " + dst_root_dir)
            has_files = True
        print(f"- {{file_name}}")
        os.remove(file)  # We should not affect other tests
Collaborator

Suggested change
os.remove(file) # We should not affect other tests
os.remove(file) # Avoid side effects on other tests

Author

Done

docs/conf.py Outdated
    has_files = False
    for file_name in os.listdir(dst_root_dir):
        file = os.path.join(dst_root_dir, file_name)
Collaborator

Please avoid using the (previously) builtin name file to avoid confusion.

Author

@ievgen-kapinos ievgen-kapinos Nov 5, 2025

Done... thanks. I've learned something new 👍

Development

Successfully merging this pull request may close these issues.

Execute docs examples in CI

2 participants