Help
Description

I have a model pipeline with a large model (GB+). This model runs 6 times per day, and each time I have it create a new cache, so I can come back to it hours or days later. To save disk space, I read the model into an object outside the pipeline so that it doesn't get cached. In my example, the data only takes ~1s to read in, and the pipeline takes ~1s to read in the data and negligible time to run the target, but a whopping 30s to finish. If I move the model inside the pipeline, it runs much faster.
```r
library(targets)
library(tarchetypes)
list(
  tar_target(large_model_file, "large_model.rds", format = "file"),
  tar_target(large_model, readRDS(large_model_file))
)
```


The targets in the pipeline depend on the global objects in memory that
`_targets.R` populates. Most of those objects are functions, but in your case, you also have a large model. To properly track dependencies, `targets` hashes those in-memory dependencies. So if you create the model as an object in memory, every call to `tar_make()` will hash the object. I would recommend including it in the pipeline and tracking the input file for changes.
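To make the advice concrete, here is a minimal sketch of a `_targets.R` that follows it. The `predictions` target and its call to `predict()` are hypothetical, added only to show how a downstream target would consume `large_model`; only the two file-tracking targets come from the answer above.

```r
# _targets.R -- minimal sketch, not the asker's actual pipeline.
library(targets)

list(
  # Track the model file itself, so the pipeline invalidates only
  # when large_model.rds changes on disk.
  tar_target(large_model_file, "large_model.rds", format = "file"),

  # Read the model inside the pipeline; targets caches the object
  # and hashes it once per build rather than on every tar_make().
  tar_target(large_model, readRDS(large_model_file)),

  # Hypothetical downstream target consuming the model.
  tar_target(predictions, predict(large_model, newdata = mtcars))
)
```

Run the pipeline with `tar_make()`: because `large_model_file` uses `format = "file"`, `large_model` is re-read only when the file on disk changes, so no global copy of the model sits in memory to be hashed on every call.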