This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Commit c76153c

[SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest.

## What changes were proposed in this pull request?

apache#19197 fixed double caching for MLlib algorithms, but missed PySpark `OneVsRest`; this PR fixes it.

## How was this patch tested?

Existing tests.

Author: Yanbo Liang <[email protected]>

Closes apache#19220 from yanboliang/SPARK-18608.
1 parent 66cb72d commit c76153c

File tree

1 file changed, +2 -4 lines changed


python/pyspark/ml/classification.py

Lines changed: 2 additions & 4 deletions
@@ -1773,8 +1773,7 @@ def _fit(self, dataset):
         multiclassLabeled = dataset.select(labelCol, featuresCol)

         # persist if underlying dataset is not persistent.
-        handlePersistence = \
-            dataset.rdd.getStorageLevel() == StorageLevel(False, False, False, False)
+        handlePersistence = dataset.storageLevel == StorageLevel(False, False, False, False)
         if handlePersistence:
             multiclassLabeled.persist(StorageLevel.MEMORY_AND_DISK)

@@ -1928,8 +1927,7 @@ def _transform(self, dataset):
         newDataset = dataset.withColumn(accColName, initUDF(dataset[origCols[0]]))

         # persist if underlying dataset is not persistent.
-        handlePersistence = \
-            dataset.rdd.getStorageLevel() == StorageLevel(False, False, False, False)
+        handlePersistence = dataset.storageLevel == StorageLevel(False, False, False, False)
         if handlePersistence:
             newDataset.persist(StorageLevel.MEMORY_AND_DISK)
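The patched check persists the dataset only when its current storage level is the "no persistence" level, i.e. `StorageLevel(False, False, False, False)`. The sketch below models that logic with a plain namedtuple so it runs without Spark; `StorageLevel`, `NONE`, and `needs_persist` here are simplified stand-ins (the real PySpark `StorageLevel` also carries a replication field), not PySpark's actual classes.

```python
from collections import namedtuple

# Simplified stand-in for PySpark's StorageLevel flags; namedtuple
# equality compares all four flags field by field.
StorageLevel = namedtuple(
    "StorageLevel", ["useDisk", "useMemory", "useOffHeap", "deserialized"]
)

# All flags False means the dataset is not persisted anywhere.
NONE = StorageLevel(False, False, False, False)

def needs_persist(level):
    # Mirror of the handlePersistence check: persist only if the
    # caller has not already persisted the dataset at some level.
    return level == NONE

print(needs_persist(NONE))                                     # True
print(needs_persist(StorageLevel(True, True, False, False)))   # False
```

The point of the commit is which level gets fed into this check: reading `dataset.storageLevel` inspects the DataFrame directly, whereas the old `dataset.rdd.getStorageLevel()` went through the DataFrame-to-RDD conversion, which is what led to the double caching the parent PR was fixing.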
