Commit 13a530e

fix(cluster): propagate checkpoint save failure in saveCheckpoint
Capture writeJSON return value and throw on failure with logged context so callers are aware when checkpoint persistence fails.
1 parent 7a03ba1 commit 13a530e

1 file changed: +14 -1 lines changed

packages/tm-core/src/modules/cluster/cluster-execution-domain.ts

Lines changed: 14 additions & 1 deletion
@@ -202,7 +202,20 @@ export class ClusterExecutionDomain {
       taskStatuses
     };
 
-    await writeJSON(checkpointPath, checkpoint);
+    const ok = await writeJSON(checkpointPath, checkpoint);
+
+    if (!ok) {
+      this.logger.error('Failed to save checkpoint', {
+        checkpointPath,
+        checkpointId: checkpoint.currentClusterId,
+        checkpointState: {
+          completedClusters: checkpoint.completedClusters.length,
+          completedTasks: checkpoint.completedTasks.length,
+          failedTasks: checkpoint.failedTasks.length
+        }
+      });
+      throw new Error(`Failed to persist checkpoint to ${checkpointPath}`);
+    }
 
     this.logger.info('Checkpoint saved', { checkpointPath });
   }
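
The change above turns a falsy return value from writeJSON into a thrown error. The sketch below illustrates that pattern in isolation and how a caller might react to it. It is not the repository's code: `JsonWriter`, `Checkpoint`, `saveCheckpoint` as a free function, and `runWithCheckpoint` are hypothetical stand-ins, and the only assumption taken from the diff is that the writer resolves to a falsy value on failure instead of throwing.

```typescript
// Hypothetical stand-in for writeJSON: resolves to false on failure
// rather than throwing (assumed from the `if (!ok)` check in the diff).
type JsonWriter = (path: string, data: unknown) => Promise<boolean>;

// Hypothetical, trimmed-down checkpoint shape for illustration only.
interface Checkpoint {
  currentClusterId: string;
  completedTasks: string[];
}

// Convert a silent boolean failure into an exception so callers cannot
// mistake a failed write for a successful one.
async function saveCheckpoint(
  write: JsonWriter,
  checkpointPath: string,
  checkpoint: Checkpoint
): Promise<void> {
  const ok = await write(checkpointPath, checkpoint);
  if (!ok) {
    throw new Error(`Failed to persist checkpoint to ${checkpointPath}`);
  }
}

// Hypothetical caller: reports whether the checkpoint was persisted,
// returning the error message when it was not.
async function runWithCheckpoint(write: JsonWriter): Promise<string> {
  const checkpoint: Checkpoint = {
    currentClusterId: 'cluster-1',
    completedTasks: ['task-a']
  };
  try {
    await saveCheckpoint(write, '/tmp/checkpoint.json', checkpoint);
    return 'saved';
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}
```

Whether a caller should retry, abort the run, or continue without a checkpoint is a policy decision this commit leaves to the call sites; the sketch only shows that the failure is now observable.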
