
Commit 6064368

zhengruifeng authored and srowen committed
[SPARK-27018][CORE] Fix incorrect removal of checkpointed file in PeriodicCheckpointer
## What changes were proposed in this pull request?

Remove the oldest checkpointed file only if the next checkpoint exists.

I think this patch needs back-porting.

## How was this patch tested?

Existing tests, plus a local check in spark-shell with the following suite:

```
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.classification.GBTClassifier

case class Row(features: org.apache.spark.ml.linalg.Vector, label: Int)

sc.setCheckpointDir("/checkpoints")

val trainingData = sc.parallelize(1 to 2426874, 256)
  .map(x => Row(Vectors.dense(x, x + 1, x * 2 % 10), if (x % 5 == 0) 1 else 0))
  .toDF

val classifier = new GBTClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setProbabilityCol("probability")
  .setMaxIter(100)
  .setMaxDepth(10)
  .setCheckpointInterval(2)

classifier.fit(trainingData)
```

Closes apache#24870 from zhengruifeng/ck_update.

Authored-by: zhengruifeng <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
1 parent 0671395 commit 6064368

1 file changed: +1 -1 lines changed

core/src/main/scala/org/apache/spark/util/PeriodicCheckpointer.scala

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ private[spark] abstract class PeriodicCheckpointer[T](
       var canDelete = true
       while (checkpointQueue.size > 1 && canDelete) {
         // Delete the oldest checkpoint only if the next checkpoint exists.
-        if (isCheckpointed(checkpointQueue.head)) {
+        if (isCheckpointed(checkpointQueue(1))) {
           removeCheckpointFile()
         } else {
           canDelete = false
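
To make the effect of the one-line change easier to see, here is a minimal, self-contained sketch of the cleanup loop. It is not the Spark implementation: `Ckpt`, `CheckpointCleanupSketch`, `cleanup`, and the `checkNext` flag are hypothetical names introduced for illustration, and the `materialized` field is an assumed stand-in for what `isCheckpointed(...)` reports. The scenario assumes the oldest checkpoint is already on disk while the newer one has not been written yet.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a checkpointed dataset; `materialized` plays the
// role of isCheckpointed(...) in PeriodicCheckpointer.
final case class Ckpt(id: Int, materialized: Boolean)

object CheckpointCleanupSketch {
  // Runs the cleanup loop from the diff and returns the ids of the removed
  // checkpoints. `checkNext = false` models the old condition (queue.head),
  // `checkNext = true` models the patched condition (queue(1)).
  def cleanup(queue: mutable.Queue[Ckpt], checkNext: Boolean): List[Int] = {
    val removed = mutable.ListBuffer.empty[Int]
    var canDelete = true
    while (queue.size > 1 && canDelete) {
      val probe = if (checkNext) queue(1) else queue.head
      if (probe.materialized) {
        // Stands in for removeCheckpointFile(): drop the oldest entry.
        removed += queue.dequeue().id
      } else {
        canDelete = false
      }
    }
    removed.toList
  }

  def main(args: Array[String]): Unit = {
    // Oldest checkpoint is on disk; the newer one is not materialized yet.
    def fresh(): mutable.Queue[Ckpt] =
      mutable.Queue(Ckpt(1, materialized = true), Ckpt(2, materialized = false))

    println(s"old condition removed: ${cleanup(fresh(), checkNext = false)}") // List(1)
    println(s"new condition removed: ${cleanup(fresh(), checkNext = true)}")  // List()
  }
}
```

Under these assumptions the old condition deletes checkpoint 1 even though checkpoint 2 is not yet available, while the patched condition keeps checkpoint 1 until its successor exists, which matches the intent stated in the code comment.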
