Cache
This wiki page is about caching code chunks.
We do not recommend that you set the chunk
option cache = TRUE globally in a document. Caching can be fairly tricky. Instead, we recommend that you enable caching only on individual code chunks that are surely time-consuming and do not have side effects.
We will follow this advice: rather than setting the chunk
option cache = TRUE globally in a document, we will set
cache = TRUE on each individual chunk that we want to cache.
We will cache code chunks that:
- Load data
- Run time-consuming calculations
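As a minimal sketch of per-chunk caching, the example below shows the body of one cached chunk with its header given in a comment. The chunk label `slow-sum`, the variable `total`, and the simulated delay are assumptions made for illustration, not part of our documents.

```r
# Chunk header in the .Rmd file (the label `slow-sum` is hypothetical):
#   {r slow-sum, cache = TRUE}
#
# Chunk body: a stand-in for a time-consuming calculation
# that has no side effects.
Sys.sleep(2)                      # simulate a slow step
total <- sum(sqrt(seq_len(1e6)))  # object created in the chunk, saved in the cache
```

With cache = TRUE on this chunk only, knitr can reuse the saved result on later builds as long as the chunk's code and options are unchanged, while every other chunk still runs as usual.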
Further following the advice in the R Markdown Cookbook, Chapter 11, Chunk Options, we will use our own chunk option to ensure that the cache is invalidated (and the chunk is run again) whenever the data is updated. Specifically:
You have to let knitr know if the data file has been changed. One way to do it is to add another chunk option
cache.extra = file.mtime('my-precious.csv') or more rigorously, cache.extra = tools::md5sum('my-precious.csv'). The former means if the modification time of the file has been changed, we need to invalidate the cache. The latter means if the content of the file has been modified, we update the cache. Note that cache.extra is not a built-in knitr chunk option. You can use any other name for this option, as long as it does not conflict with built-in option names.
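The difference between the two checks can be sketched in plain R, outside of any document. The temporary file and all variable names below are assumptions made for illustration; only file.mtime() and tools::md5sum() come from the advice above.

```r
# Hypothetical data file, created in a temporary location for this demo only.
data_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), data_file, row.names = FALSE)

mtime_1 <- file.mtime(data_file)
hash_1  <- tools::md5sum(data_file)

# Rewrite identical content after a pause: the content hash is unchanged,
# while the modification time is normally updated.
Sys.sleep(1)
write.csv(data.frame(x = 1:3), data_file, row.names = FALSE)
mtime_2 <- file.mtime(data_file)
hash_2  <- tools::md5sum(data_file)

same_hash_after_rewrite <- identical(hash_1, hash_2)
mtime_changed           <- mtime_2 > mtime_1

# Change the content: the hash now differs from the original.
write.csv(data.frame(x = 4:6), data_file, row.names = FALSE)
hash_3 <- tools::md5sum(data_file)
hash_changed_after_edit <- !identical(hash_1, hash_3)
```

A chunk option based on file.mtime() would therefore invalidate the cache whenever the file is rewritten, even with the same content, whereas one based on tools::md5sum() should change only when the content itself changes.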
Instead of calling our chunk option cache.extra, which isn’t very
descriptive, we will call it cache.invalidate.if. So, the entire chunk
option will be written as:
cache = TRUE, cache.invalidate.if = tools::md5sum('file_name.ext')
Actually, it seems like
cache.invalidate.if = tools::md5sum('file_name.ext') might be causing
the code chunks to be invalidated every time we build the book. We will
remove that chunk option for now and see whether removing it causes any problems.