@@ -87,88 +87,64 @@ Methods that support User-Defined Functions
8787
8888User-Defined Functions can be applied across various pandas methods:
8989
90- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
91- | Method | Function Input | Function Output | Description |
92- +============================+========================+==========================+===========================================================================+
93- | :meth: `map ` | Scalar | Scalar | Apply a function to each element |
94- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
95- | :meth: `apply ` (axis=0) | Column (Series) | Column (Series) | Apply a function to each column |
96- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
97- | :meth: `apply ` (axis=1) | Row (Series) | Row (Series) | Apply a function to each row |
98- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
99- | :meth: `agg ` | Series/DataFrame | Scalar or Series | Aggregate and summarizes values, e.g., sum or custom reducer |
100- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
101- | :meth: `transform ` | Series/DataFrame | Same shape as input | Apply a function while preserving shape; raises error if shape changes |
102- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
103- | :meth: `filter ` | - | - | Return rows that satisfy a boolean condition |
104- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
105- | :meth: `pipe ` | Series/DataFrame | Series/DataFrame | Chain functions together to apply to Series or Dataframe |
106- +----------------------------+------------------------+--------------------------+---------------------------------------------------------------------------+
90+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
91+ | Method | Function Input | Function Output | Description |
92+ +============================+========================+==========================+==============================================================================================================================================+
93+ | :meth: `map ` | Scalar | Scalar | Apply a function to each element |
94+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
95+ | :meth: `apply ` (axis=0) | Column (Series) | Column (Series) | Apply a function to each column |
96+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
97+ | :meth: `apply ` (axis=1) | Row (Series) | Row (Series) | Apply a function to each row |
98+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
99+ | :meth: `agg ` | Series/DataFrame | Scalar or Series | Aggregate and summarize values, e.g., sum or a custom reducer |
100+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
101+ | :meth: `transform ` (axis=0) | Column (Series) | Column (Series) | Same as :meth: `apply ` with (axis=0), but it raises an exception if the function changes the shape of the data |
102+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
103+ | :meth: `transform ` (axis=1) | Row (Series) | Row (Series) | Same as :meth: `apply ` with (axis=1), but it raises an exception if the function changes the shape of the data |
104+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
105+ | :meth: `filter ` | Series or DataFrame | Boolean | Only accepts UDFs in groupby. The function is called for each group, and the group is removed from the result if the function returns ``False`` |
106+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
107+ | :meth: `pipe ` | Series/DataFrame | Series/DataFrame | Chain functions together to apply to Series or DataFrame |
108+ +----------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
107109
108- .. note ::
109- Some of these methods are can also be applied to groupby, resample, and various window objects.
110- See :ref: `groupby `, :ref: `resample()<timeseries> `, :ref: `rolling()<window> `, :ref: `expanding()<window> `,
111- and :ref: `ewm()<window> ` for details.
112-
113-
114- Choosing the Right Method
115- -------------------------
116110When applying UDFs in pandas, it is essential to select the appropriate method based
117111on your specific task. Each method has its strengths and is designed for different use
118112cases. Understanding the purpose and behavior of each method will help you make informed
119113decisions, ensuring more efficient and maintainable code.
120114
121- Below is a table overview of all methods that accept UDFs:
122-
123- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
124- | Method | Purpose | Supports UDFs | Keeps Shape | Recommended Use Case |
125- +==================+======================================+===========================+====================+==========================================+
126- | :meth: `apply ` | General-purpose function | Yes | Yes (when axis=1) | Custom row-wise or column-wise operations|
127- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
128- | :meth: `agg ` | Aggregation | Yes | No | Custom aggregation logic |
129- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
130- | :meth: `transform`| Transform without reducing dimensions| Yes | Yes | Broadcast element-wise transformations |
131- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
132- | :meth: `map ` | Element-wise mapping | Yes | Yes | Simple element-wise transformations |
133- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
134- | :meth: `pipe ` | Functional chaining | Yes | Yes | Building clean operation pipelines |
135- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
136- | :meth: `filter ` | Row/Column selection | Not directly | Yes | Subsetting based on conditions |
137- +------------------+--------------------------------------+---------------------------+--------------------+------------------------------------------+
115+ .. note::
116+ Some of these methods can also be applied to groupby, resample, and various window objects.
117+ See :ref:`groupby`, :ref:`resample()<timeseries>`, :ref:`rolling()<window>`, :ref:`expanding()<window>`,
118+ and :ref:`ewm()<window>` for details.
119+
138120
139121:meth:`DataFrame.apply`
140122~~~~~~~~~~~~~~~~~~~~~~~
141123
142- The :meth: `DataFrame. apply ` allows you to apply UDFs along either rows or columns. While flexible,
124+ The :meth:`apply` method allows you to apply UDFs along either rows or columns. While flexible,
143125it is slower than vectorized operations and should be used only when you need operations
144126that cannot be achieved with built-in pandas functions.
145127
146- When to use: :meth: `DataFrame. apply ` is suitable when no alternative vectorized method or UDF method is available,
128+ When to use: :meth:`apply` is suitable when no alternative vectorized method or UDF method is available,
147129but consider optimizing performance with vectorized operations wherever possible.
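
As a minimal sketch of the two axes, the snippet below applies one UDF column-wise and row-wise. The function name ``value_range`` and the sample data are assumptions made for illustration, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})


def value_range(values: pd.Series) -> int:
    """Hypothetical UDF: spread between the largest and smallest value."""
    return values.max() - values.min()


# axis=0 passes each column to the UDF as a Series.
col_ranges = df.apply(value_range, axis=0)

# axis=1 passes each row to the UDF as a Series.
row_ranges = df.apply(value_range, axis=1)

# The column-wise result can also be computed without a UDF.
vectorized_col_ranges = df.max() - df.min()
```

Here the column-wise case has a direct vectorized equivalent, which is the kind of replacement worth checking for before reaching for ``apply``.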
148130
149- Documentation can be found at :meth: `~DataFrame.apply `.
150-
151131:meth:`DataFrame.agg`
152132~~~~~~~~~~~~~~~~~~~~~
153133
154- If you need to aggregate data, :meth: `DataFrame. agg ` is a better choice than apply because it is
134+ If you need to aggregate data, :meth:`agg` is a better choice than apply because it is
155135specifically designed for aggregation operations.
156136
157- When to use: Use :meth: `DataFrame. agg ` for performing custom aggregations, where the operation returns
137+ When to use: Use :meth:`agg` for performing custom aggregations, where the operation returns
158138a scalar value on each input.
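
A small sketch of a custom aggregation, assuming a hypothetical reducer named ``midrange`` that returns one scalar per column:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})


def midrange(values: pd.Series) -> float:
    """Hypothetical reducer: midpoint between the minimum and maximum."""
    return (values.max() + values.min()) / 2


# Each column is reduced to a single scalar, so the result is a Series
# indexed by the column names.
summary = df.agg(midrange)
```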
159139
160- Documentation can be found at :meth: `~DataFrame.agg `.
161-
162140:meth:`DataFrame.transform`
163141~~~~~~~~~~~~~~~~~~~~~~~~~~~
164142
165- The transform method is ideal for performing element-wise transformations while preserving the shape of the original DataFrame.
143+ The :meth:`transform` method is ideal for performing element-wise transformations while preserving the shape of the original DataFrame.
166144It is generally faster than apply because it can take advantage of pandas' internal optimizations.
167145
168146When to use: When you need to perform element-wise transformations that retain the original structure of the DataFrame.
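
The shape-preserving behavior can be sketched as follows. The UDF ``center`` and the sample values are assumptions for illustration; the second call shows a reducing function being rejected.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 60.0]})


def center(values: pd.Series) -> pd.Series:
    """Hypothetical UDF: subtract the column mean from every value."""
    return values - values.mean()


# The result has the same index and columns as the input.
centered = df.transform(center)

# A function that reduces each column to a scalar changes the shape,
# so transform raises instead of returning a smaller result.
try:
    df.transform(lambda values: values.sum())
    reducer_rejected = False
except ValueError:
    reducer_rejected = True
```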
169147
170- Documentation can be found at :meth: `~DataFrame.transform `.
171-
172148.. code-block:: python
173149
174150 from sklearn.linear_model import LinearRegression
@@ -193,11 +169,11 @@ Documentation can be found at :meth:`~DataFrame.transform`.
193169:meth:`DataFrame.filter`
194170~~~~~~~~~~~~~~~~~~~~~~~~
195171
196- The :meth: `DataFrame. filter ` method is used to select subsets of the DataFrame’s
172+ The :meth:`filter` method is used to select subsets of the DataFrame’s
197173columns or rows. It is useful when you want to extract specific columns or rows that
198174match particular conditions.
199175
200- When to use: Use :meth: `DataFrame. filter ` when you want to use a UDF to create a subset of a DataFrame or Series
176+ When to use: Use :meth:`filter` when you want to use a UDF to create a subset of a DataFrame or Series.
201177
202178.. note::
203179 :meth:`DataFrame.filter` does not accept UDFs, but can accept
@@ -223,27 +199,20 @@ When to use: Use :meth:`DataFrame.filter` when you want to use a UDF to create a
223199 Since filter does not directly accept a UDF, you have to apply the UDF indirectly,
224200for example, by using list comprehensions.
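
One way to sketch the indirect approach: a list comprehension runs the UDF over the column labels, and the resulting list is passed through the ``items`` parameter. The predicate ``is_price_column`` and the column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"price_usd": [1.0, 2.0], "qty": [3, 4], "price_eur": [0.9, 1.8]}
)


def is_price_column(name: str) -> bool:
    """Hypothetical UDF deciding which column labels to keep."""
    return name.startswith("price_")


# Evaluate the UDF on the labels first, then hand the selection to filter.
price_columns = [name for name in df.columns if is_price_column(name)]
prices = df.filter(items=price_columns)
```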
225201
226- Documentation can be found at :meth: `~DataFrame.filter `.
227-
228202:meth:`DataFrame.map`
229203~~~~~~~~~~~~~~~~~~~~~
230204
231- :meth: `DataFrame.map ` is used specifically to apply element-wise UDFs and is better
232- for this purpose compared to :meth: `DataFrame.apply ` because of its better performance.
205+ The :meth:`map` method is used specifically to apply element-wise UDFs.
233206
234- When to use: Use map for applying element-wise UDFs to DataFrames or Series.
235-
236- Documentation can be found at :meth: `~DataFrame.map `.
207+ When to use: Use :meth:`map` for applying element-wise UDFs to DataFrames or Series.
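
A minimal element-wise sketch using ``Series.map``, with a hypothetical UDF ``format_price``. ``DataFrame.map`` takes a scalar-to-scalar function in the same way, but it is only available from pandas 2.1 onward, so the example sticks to a Series.

```python
import pandas as pd

prices = pd.Series([1.5, 2.25, 10.0])


def format_price(value: float) -> str:
    """Hypothetical UDF: scalar in, scalar out."""
    return f"${value:.2f}"


# The UDF is called once per element.
labels = prices.map(format_price)
```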
237208
238209:meth:`DataFrame.pipe`
239210~~~~~~~~~~~~~~~~~~~~~~
240211
241- The pipe method is useful for chaining operations together into a clean and readable pipeline.
212+ The :meth:`pipe` method is useful for chaining operations together into a clean and readable pipeline.
242213It is a helpful tool for organizing complex data processing workflows.
243214
244- When to use: Use pipe when you need to create a pipeline of operations and want to keep the code readable and maintainable.
245-
246- Documentation can be found at :meth: `~DataFrame.pipe `.
215+ When to use: Use :meth:`pipe` when you need to create a pipeline of operations and want to keep the code readable and maintainable.
247216
248217
249218Performance
@@ -255,7 +224,7 @@ consider using built-in ``NumPy`` or ``pandas`` functions instead of UDFs
255224for common operations.
256225
257226.. note::
258- If performance is critical, explore **vectorizated operations ** before resorting
227+ If performance is critical, explore **vectorized operations** before resorting
259228 to UDFs.
260229
261230Vectorized Operations
@@ -283,9 +252,9 @@ Measuring how long each operation takes:
283252
284253 Vectorized operations in pandas are significantly faster than using :meth:`DataFrame.apply`
285254with UDFs because they leverage highly optimized C functions
286- via NumPy to process entire arrays at once. This approach avoids the overhead of looping
255+ via ``NumPy`` to process entire arrays at once. This approach avoids the overhead of looping
287256through rows in Python and making separate function calls for each row, which is slow and
288- inefficient. Additionally, NumPy arrays benefit from memory efficiency and CPU-level
257+ inefficient. Additionally, ``NumPy`` arrays benefit from memory efficiency and CPU-level
289258optimizations, making vectorized operations the preferred choice whenever possible.
290259
291260
@@ -306,10 +275,10 @@ especially for computationally heavy tasks.
306275Using :meth:`DataFrame.pipe` for Composable Logic
307276~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
308277
309- Another useful pattern for improving readability and composability— especially when mixing
310- vectorized logic with UDFs— is to use the :meth: `DataFrame.pipe ` method.
278+ Another useful pattern for improving readability and composability, especially when mixing
279+ vectorized logic with UDFs, is to use the :meth:`DataFrame.pipe` method.
311280
312- The `` .pipe `` method doesn't improve performance directly, but it enables cleaner
281+ :meth:`DataFrame.pipe` doesn't improve performance directly, but it enables cleaner
313282method chaining by passing the entire object into a function. This is especially helpful
314283when chaining custom transformations:
315284
@@ -327,8 +296,8 @@ when chaining custom transformations:
327296 )
328297
329298 This is functionally equivalent to calling ``add_ratio_column(df) ``, but keeps your code
330- clean and composable. The function you pass to `` .pipe ` ` can use vectorized operations,
331- row-wise UDFs, or any other logic—`` .pipe ` ` is agnostic.
299+ clean and composable. The function you pass to :meth:`DataFrame.pipe` can use vectorized operations,
300+ row-wise UDFs, or any other logic; :meth:`DataFrame.pipe` is agnostic.
332301
333302.. note::
334303 While :meth:`DataFrame.pipe` does not improve performance on its own,