Expandable sidebar sections with huge Sphinx sites (e.g. a large API reference) result in a very slow build #364

@jorisvandenbossche

Now that the collapsible sidebar sections are in master, I am noticing that they create some challenges for the pandas docs, especially in the API reference part.

If you have a huge site with many pages, the change in #349 has the consequence that every page is now included in the toctree (we use collapse=False now), which means that (1) the build is much slower and (2) the HTML build artefact is much bigger.

Specifically for the pandas API docs (https://pandas.pydata.org/docs/dev/reference/index.html), we have more than 1000 API pages, and since the docs started being built with #349 merged, the build time went from ~15 min to ~55 min and the build artifact from 180 MB to 1.1 GB. The HTML of a single page now contains 15,000 lines for the sidebar.
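As a rough sanity check on these numbers, here is a small back-of-the-envelope sketch. It only uses the figures quoted above, and it assumes (as a simplification) that with collapse=False every page embeds a sidebar entry for every other page, so the total sidebar markup grows roughly quadratically with the number of pages. The page count of 1000 is a lower bound, and 1.1 GB is taken as 1100 MB.

```python
# Figures quoted in the issue; n_pages is a lower bound ("more than 1000").
n_pages = 1000
sidebar_lines_per_page = 15_000
build_min_before, build_min_after = 15, 55
size_mb_before, size_mb_after = 180, 1100  # assumption: 1.1 GB ~= 1100 MB

# Total sidebar markup written across the whole site.
total_sidebar_lines = n_pages * sidebar_lines_per_page

# Relative slowdown and size growth after #349.
build_ratio = build_min_after / build_min_before
size_ratio = size_mb_after / size_mb_before


def total_sidebar_entries(pages: int) -> int:
    """Simplified model: each page lists every page, so the total is pages**2."""
    return pages * pages


if __name__ == "__main__":
    print(f"{total_sidebar_lines:,} sidebar lines across the site")
    print(f"build time grew about {build_ratio:.1f}x")
    print(f"artifact size grew about {size_ratio:.1f}x")
    growth = total_sidebar_entries(2 * n_pages) / total_sidebar_entries(n_pages)
    print(f"doubling the page count multiplies sidebar entries by {growth:.0f}")
```

Under this simplified model, deduplicating pages helps more than linearly, but the per-page sidebar stays large as long as every page is listed.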

(aside: we have too many API pages in pandas, as some are identical (duplicated) pages for methods shared across subclasses; this is something we can improve by deduplicating. But even with fewer API pages, you will still notice the difference)

Brainstorming some ways this issue could be addressed (in general, and for pandas specifically):

  • Speed-wise: could we cache some results? Since each page shares almost the same html, but not exactly (the current/active page is different), I am not sure this is possible
  • The fact that autosummary automatically adds those pages to the toctree, so they end up in the sidebar, is IMO not really necessary (e.g. see https://pandas.pydata.org/docs/dev/reference/io.html: all functions are already listed on the page, so also including them in the sidebar is duplicative and makes the sidebar very long). But I am not sure how to avoid this using Sphinx.
  • We could set collapse=True specifically in the API reference section of the pandas docs. Pandas could do this by overriding the template and depending on the pagename set collapse to True or False. Or would there be another way to configure this?
