Help
Description

I have a model pipeline with a large model (GB+). This model runs 6 times per day, and each time I have it create a new cache, so I can come back to it hours or days later. To save disk space, I read the model into an object outside the pipeline so that it doesn't get cached. In my example, the data only takes ~1s to read in, and the pipeline takes ~1s to read in the data and negligible time to run the target, but a whopping 30s to finish. If I move the model inside the pipeline, it runs much faster.
```r
library(targets)
library(tarchetypes)
list(
  tar_target(large_model_file, "large_model.rds", format = "file"),
  tar_target(large_model, readRDS(large_model_file))
)
```


The targets in the pipeline depend on the global objects in memory that
`_targets.R` populates. Most of those objects are functions, but in your case, you also have a large model. To properly track dependencies, `targets` hashes those in-memory dependencies. So if you create the model as an object in memory, every call to `tar_make()` will hash the object. I would recommend including it in the pipeline and tracking the input file for changes.
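To make the advice concrete, here is a minimal sketch of a `_targets.R` that follows it. The `predictions` target and its call to `predict()` are hypothetical, added only to show how a downstream target would consume `large_model`; only the two file-tracking targets come from the answer above.

```r
# _targets.R -- minimal sketch, not the asker's actual pipeline.
library(targets)

list(
  # Track the model file itself, so the pipeline invalidates only
  # when large_model.rds changes on disk.
  tar_target(large_model_file, "large_model.rds", format = "file"),

  # Read the model inside the pipeline; targets caches the object
  # and hashes it once per build rather than on every tar_make().
  tar_target(large_model, readRDS(large_model_file)),

  # Hypothetical downstream target consuming the model.
  tar_target(predictions, predict(large_model, newdata = mtcars))
)
```

Run the pipeline with `tar_make()`: because `large_model_file` uses `format = "file"`, `large_model` is re-read only when the file on disk changes, so no global copy of the model sits in memory to be hashed on every call.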