Consider using content hashes for change detection instead of modification time #11556

@dbitouze

Continuous integration with Sphinx-doc, as described here, works well.

Unfortunately, CI on gitlab.com is Docker-based and clones the repository afresh for each pipeline run. So even if only a single source file is modified, the HTML pages for all the .rst source files are rebuilt (even though the cache is reported as restored and the doctree directory is indeed restored). This isn't a problem with only a few source files, but it becomes unusable with many (more than 1,200 in my real-life use case: the rebuild takes more than 15 minutes and a lot of resources are consumed unnecessarily).

This problem may be due to the fact that Git, unlike some other version control systems, does not preserve the original timestamps of committed files. So restoring the timestamps with git-restore-mtime should be a solution. But it is not, as you can see with the following sandbox repository:

https://gitlab.com/denisbitouze/minimal-sphinx-minimal/

There, the commit changes only the source file test.rst, yet it also triggers a rebuild of index.html, whose source file index.rst hasn't been changed.

I've had a look at the code but can't work out how Sphinx's change detection works. Would it be possible to document it? That would be very useful, especially now that CI/CD is becoming more and more widespread.
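
To illustrate the idea in the title, here is a minimal sketch of change detection based on content hashes rather than modification times. It is not how Sphinx works internally (which is exactly what I couldn't work out); the function names `file_digest` and `changed_sources` and the JSON manifest file are hypothetical and chosen only for this example.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_sources(srcdir: Path, manifest: Path, pattern: str = "*.rst") -> list[str]:
    """List the sources whose content differs from the digests saved in
    `manifest` (a hypothetical JSON file), then update the manifest.

    Modification times are never consulted, so a fresh `git clone`
    does not make unchanged files look outdated.
    """
    # Digests recorded by the previous run (empty on the first run).
    old = json.loads(manifest.read_text()) if manifest.exists() else {}
    # Digests of the current sources, keyed by path relative to srcdir.
    new = {
        p.relative_to(srcdir).as_posix(): file_digest(p)
        for p in sorted(srcdir.rglob(pattern))
    }
    manifest.write_text(json.dumps(new, indent=2))
    # A source is "changed" if it is new or its digest differs.
    return [name for name, digest in new.items() if old.get(name) != digest]
```

With such a scheme, only the manifest would need to be kept in the CI cache between runs: a freshly cloned file with a new timestamp but identical content would not be reported as changed.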
