Description
What would you like to happen?
Goal
Speed up development and debugging cycles in Apache Hop pipelines by reusing a step's output as a “checkpoint”, so that the upstream steps do not have to be reprocessed on every run.
Concept (proposal)
The user marks a step as a checkpoint.
Hop stores the step output (rows/dataset) in a local cache.
On later runs, Hop can skip upstream steps and start from the checkpoint, loading data from cache.
The user explicitly chooses:
Run from the beginning
Run from the last checkpoint
Note
Unlike a “test pipeline”, the idea is not to create dataset metadata. This is a development-only cache feature to speed up iteration, especially when upstream extraction (e.g., from a database) is slow. The user can define where the cache is stored and how many rows are saved. At runtime, they choose whether to run from the beginning or resume from the last checkpoint, with no extra metadata or separate test pipelines required.
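To make the proposal concrete, here is a minimal sketch of the intended behavior in plain Python. It is not Hop code and does not use any Hop API. The function name `run_to_checkpoint`, the JSON Lines cache format, and the fallback to a full run when no cache exists are all assumptions made for illustration only.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List, Optional

Row = dict  # hypothetical row type; a real pipeline row would also carry field metadata


def run_to_checkpoint(
    upstream: Callable[[], Iterable[Row]],
    cache_file: Path,
    from_checkpoint: bool,
    max_rows: Optional[int] = None,
) -> List[Row]:
    """Return the rows available at the checkpoint step.

    upstream        -- callable standing in for all steps before the checkpoint
    cache_file      -- user-chosen cache location (assumed format: JSON Lines)
    from_checkpoint -- True for "run from the last checkpoint"
    max_rows        -- user-chosen limit on how many rows are saved
    """
    if from_checkpoint and cache_file.exists():
        # Run from the last checkpoint: skip upstream and load cached rows.
        with cache_file.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    # Run from the beginning (assumed fallback when no cache exists yet):
    # execute upstream, then refresh the cache.
    rows = list(upstream())
    to_save = rows if max_rows is None else rows[:max_rows]
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("w", encoding="utf-8") as f:
        for row in to_save:
            f.write(json.dumps(row) + "\n")
    return rows
```

One consequence worth noting in this sketch: when a row limit is set, a run resumed from the checkpoint sees only the saved rows, not the full upstream output. That seems acceptable for a development-only feature, but the GUI should probably make it visible.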
Issue Priority
Priority: 3
Issue Component
Component: Hop Gui