Commit 0fcde87

chihhanyu authored and HyukjinKwon committed
[SPARK-21658][SQL][PYSPARK] Add default None for value in na.replace in PySpark
## What changes were proposed in this pull request?

JIRA issue: https://issues.apache.org/jira/browse/SPARK-21658

Add a default of None for `value` in `na.replace`, since `DataFrame.replace` and `DataFrameNaFunctions.replace` are aliases. Both now have the same default values.

```
>>> df = sqlContext.createDataFrame([('Alice', 10, 80.0)])
>>> df.replace({"Alice": "a"}).first()
Row(_1=u'a', _2=10, _3=80.0)
>>> df.na.replace({"Alice": "a"}).first()
Row(_1=u'a', _2=10, _3=80.0)
```

## How was this patch tested?

Existing tests.

cc viirya

Author: byakuinss <[email protected]>

Closes apache#18895 from byakuinss/SPARK-21658.
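The mismatch this change removes can be illustrated without Spark. The following is a minimal sketch using hypothetical stand-in classes (`Frame`, `NaFunctionsOld`, `NaFunctionsNew`); it is not PySpark's implementation and only shows why a delegating alias needs the same default as the method it forwards to.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame; not PySpark code."""

    def __init__(self, rows):
        self.rows = rows

    def replace(self, to_replace, value=None, subset=None):
        # A dict carries both the old and the new values, so `value`
        # can be omitted. `subset` is accepted but ignored in this sketch.
        if isinstance(to_replace, dict):
            mapping = to_replace
        else:
            mapping = {to_replace: value}
        return Frame([tuple(mapping.get(v, v) for v in row) for row in self.rows])


class NaFunctionsOld:
    """Alias with the pre-change signature: `value` is required."""

    def __init__(self, df):
        self.df = df

    def replace(self, to_replace, value, subset=None):
        return self.df.replace(to_replace, value, subset)


class NaFunctionsNew:
    """Alias with the post-change signature: `value` defaults to None."""

    def __init__(self, df):
        self.df = df

    def replace(self, to_replace, value=None, subset=None):
        return self.df.replace(to_replace, value, subset)


df = Frame([("Alice", 10, 80.0)])

# Calling the method directly and through the updated alias gives the same rows.
direct = df.replace({"Alice": "a"}).rows
via_new = NaFunctionsNew(df).replace({"Alice": "a"}).rows

# The old alias rejects the same call because `value` has no default.
try:
    NaFunctionsOld(df).replace({"Alice": "a"})
    old_failed = False
except TypeError:
    old_failed = True
```

In this sketch the dict form works through the alias only once `value` has a default, which mirrors the behavior described in the commit message.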
1 parent 6847e93 commit 0fcde87

1 file changed: 11 additions and 1 deletion

python/pyspark/sql/dataframe.py

@@ -1403,6 +1403,16 @@ def replace(self, to_replace, value=None, subset=None):
         |null|  null|null|
         +----+------+----+

+        >>> df4.na.replace('Alice').show()
+        +----+------+----+
+        | age|height|name|
+        +----+------+----+
+        |  10|    80|null|
+        |   5|  null| Bob|
+        |null|  null| Tom|
+        |null|  null|null|
+        +----+------+----+
+
         >>> df4.na.replace(['Alice', 'Bob'], ['A', 'B'], 'name').show()
         +----+------+----+
         | age|height|name|
@@ -1837,7 +1847,7 @@ def fill(self, value, subset=None):

     fill.__doc__ = DataFrame.fillna.__doc__

-    def replace(self, to_replace, value, subset=None):
+    def replace(self, to_replace, value=None, subset=None):
         return self.df.replace(to_replace, value, subset)

     replace.__doc__ = DataFrame.replace.__doc__
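
The patch relies on existing tests. One way the parity between a method and its alias could be checked is sketched below, assuming hypothetical stand-in classes (`FakeDataFrame`, `FakeNaFunctions`) and a hypothetical helper `defaults`. It is not part of this commit and does not use PySpark.

```python
import inspect


class FakeDataFrame:
    """Hypothetical stand-in; only the signature and docstring matter here."""

    def replace(self, to_replace, value=None, subset=None):
        """Return a new frame with one value replaced by another."""
        return self


class FakeNaFunctions:
    """Hypothetical alias holder that forwards to FakeDataFrame."""

    def __init__(self, df):
        self.df = df

    def replace(self, to_replace, value=None, subset=None):
        return self.df.replace(to_replace, value, subset)

    # Share the docstring, following the pattern shown in the diff.
    replace.__doc__ = FakeDataFrame.replace.__doc__


def defaults(func):
    """Map each parameter name (except self) to its default value."""
    params = inspect.signature(func).parameters
    return {name: p.default for name, p in params.items() if name != "self"}


same_defaults = defaults(FakeDataFrame.replace) == defaults(FakeNaFunctions.replace)
same_doc = FakeNaFunctions.replace.__doc__ == FakeDataFrame.replace.__doc__
```

A check of this kind would catch the alias and the original drifting apart again if one signature were changed without the other.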
