docs/source/user-guide/common-operations (1 file changed, +36 -0)
@@ -129,3 +129,39 @@ The function :py:func:`~datafusion.functions.in_list` allows to check a column f
 .limit(20)
 .to_pandas()
 )
+
+
+Handling Missing Values
+=======================
+
+DataFusion provides two DataFrame methods for handling missing values:
+
+fill_null
+---------
+
+The ``fill_null()`` method replaces NULL values with a provided value, either in every column or only in the columns named in ``subset``:
+
+.. code-block:: python
+
+    # Fill all NULL values with 0 where possible
+    df = df.fill_null(0)
+
+    # Fill NULL values only in specific string columns
+    df = df.fill_null("missing", subset=["name", "category"])
+
+The fill value is cast to each column's type. If the cast fails for a column, that column is left unchanged.
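
The cast-or-skip rule described above can be illustrated with a small pure-Python model. This is only a sketch of the documented behaviour, not DataFusion's implementation: the helper ``fill_null_sketch`` and the toy table layout (column name mapped to a Python type and a list of values, with ``None`` standing in for NULL) are assumptions made for illustration.

```python
def fill_null_sketch(table, value, subset=None):
    """Toy model of the documented fill_null behaviour (hypothetical helper).

    table maps column name -> (python_type, values); None stands in for NULL.
    """
    result = {}
    for name, (col_type, values) in table.items():
        # Columns outside the requested subset are copied through untouched.
        if subset is not None and name not in subset:
            result[name] = (col_type, list(values))
            continue
        try:
            # Cast the fill value to this column's type.
            cast_value = col_type(value)
        except (TypeError, ValueError):
            # The cast failed, so this column is left unchanged.
            result[name] = (col_type, list(values))
            continue
        result[name] = (col_type, [cast_value if v is None else v for v in values])
    return result


table = {
    "id": (int, [1, None, 3]),
    "name": (str, ["a", None, "c"]),
}

# "missing" cannot be cast to int, so only the string column is filled.
filled = fill_null_sketch(table, "missing")
```

In this model the ``id`` column keeps its ``None`` because ``int("missing")`` fails, while ``name`` receives the fill value; filling with ``0`` instead would succeed for both columns.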
+
+fill_nan
+--------
+
+The ``fill_nan()`` method replaces NaN values in floating-point columns with a provided numeric value:
+
+.. code-block:: python
+
+    # Fill all NaN values with 0 in floating-point columns
+    df = df.fill_nan(0)
+
+    # Fill NaN values in specific floating-point columns
+    df = df.fill_nan(99.9, subset=["price", "score"])
+
+``fill_nan()`` applies only to floating-point columns (float32, float64), and the fill value must be numeric (int or float).
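
The two constraints above (floating-point columns only, numeric fill value only) can be sketched in plain Python. As before, this is a toy model rather than DataFusion's code: ``fill_nan_sketch`` and the table layout are hypothetical, and the choice to leave ``None`` (NULL) entries alone reflects the assumption that NaN and NULL are distinct kinds of missing value.

```python
import math


def fill_nan_sketch(table, value, subset=None):
    """Toy model of the documented fill_nan behaviour (hypothetical helper).

    table maps column name -> (python_type, values); None stands in for NULL.
    """
    # The fill value must be numeric; bool is rejected even though it subclasses int.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("fill value must be an int or a float")
    result = {}
    for name, (col_type, values) in table.items():
        selected = subset is None or name in subset
        # Only floating-point columns are touched.
        if not selected or col_type is not float:
            result[name] = (col_type, list(values))
            continue
        fill = float(value)
        result[name] = (
            col_type,
            [fill if isinstance(v, float) and math.isnan(v) else v for v in values],
        )
    return result


table = {
    "price": (float, [1.5, float("nan"), None]),
    "qty": (int, [1, None, 3]),
}

filled = fill_nan_sketch(table, 0)
```

Here only the NaN in ``price`` is replaced; the ``None`` in ``price`` and the whole integer column ``qty`` pass through unchanged, so NULL entries would still need a separate ``fill_null()`` call.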