@@ -38,21 +38,24 @@ Methods that support User-Defined Functions

 User-Defined Functions can be applied across various pandas methods:

-* :meth:`DataFrame.apply` - A flexible method that allows applying a function to Series,
+* :meth:`~DataFrame.apply` - A flexible method that allows applying a function to Series,
   DataFrames, or groups of data.
-* :meth:`DataFrame.agg` (Aggregate) - Used for summarizing data, supporting multiple
+* :meth:`~DataFrame.agg` (Aggregate) - Used for summarizing data, supporting multiple
   aggregation functions.
-* :meth:`DataFrame.transform` - Applies a function to groups while preserving the shape of
+* :meth:`~DataFrame.transform` - Applies a function to groups while preserving the shape of
   the original data.
-* :meth:`DataFrame.filter` - Filters groups based on a list of Boolean conditions.
-* :meth:`DataFrame.map` - Applies an element-wise function to a Series, useful for
+* :meth:`~DataFrame.filter` - Filters groups based on a list of Boolean conditions.
+* :meth:`~DataFrame.map` - Applies an element-wise function to a Series, useful for
   transforming individual values.
-* :meth:`DataFrame.pipe` - Allows chaining custom functions to process entire DataFrames or
+* :meth:`~DataFrame.pipe` - Allows chaining custom functions to process entire DataFrames or
   Series in a clean, readable manner.

 All of these pandas methods can be used with both Series and DataFrame objects, providing versatile
 ways to apply UDFs across different pandas data structures.

+.. note::
+    Some of these methods can also be applied to GroupBy objects. Refer to :ref:`groupby`.
+

 Choosing the Right Method
 -------------------------
@@ -70,7 +73,7 @@ Below is a table overview of all methods that accept UDFs:
 +------------------+--------------------------------------+---------------------------+--------------------+---------------------------+------------------------------------------+
 | :meth:`agg`      | Aggregation                          | Yes                       | No                 | Fast (if using built-ins) | Custom aggregation logic                 |
 +------------------+--------------------------------------+---------------------------+--------------------+---------------------------+------------------------------------------+
-| :meth:`transform`| Transform without reducing dimensions| Yes                       | Yes                | Fast (if vectorized)      | Broadcast Element-wise transformations   |
+| :meth:`transform`| Transform without reducing dimensions| Yes                       | Yes                | Fast (if vectorized)      | Broadcast element-wise transformations   |
 +------------------+--------------------------------------+---------------------------+--------------------+---------------------------+------------------------------------------+
 | :meth:`map`      | Element-wise mapping                 | Yes                       | Yes                | Moderate                  | Simple element-wise transformations      |
 +------------------+--------------------------------------+---------------------------+--------------------+---------------------------+------------------------------------------+
@@ -89,7 +92,7 @@ that cannot be achieved with built-in pandas functions.
 When to use: :meth:`DataFrame.apply` is suitable when no alternative vectorized method is available, but consider
 optimizing performance with vectorized operations wherever possible.

-Examples of usage can be found at :meth:`DataFrame.apply`
+Examples of usage can be found :ref:`here<api.dataframe.apply>`.

 :meth:`DataFrame.agg`
 ~~~~~~~~~~~~~~~~~~~~~
@@ -100,7 +103,7 @@ specifically designed for aggregation operations.
 When to use: Use :meth:`DataFrame.agg` for performing aggregations like sum, mean, or custom aggregation
 functions across groups.

-Examples of usage can be found at :meth:`DataFrame.agg <api.dataframe.agg>`
+Examples of usage can be found :ref:`here <api.dataframe.agg>`.

 :meth:`DataFrame.transform`
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -110,7 +113,7 @@ It’s generally faster than apply because it can take advantage of pandas' inte

 When to use: When you need to perform element-wise transformations that retain the original structure of the DataFrame.

-Documentation: DataFrame.transform
+Documentation can be found :ref:`here<api.dataframe.transform>`.

 Attempting to use common aggregation functions such as ``mean`` or ``sum`` will result in
 values being broadcasted to the original dimensions:
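
A minimal sketch of this broadcasting follows. It is not the example from the file being
patched: the ``df`` frame and its ``Category`` and ``Values`` columns are hypothetical, and the
sketch assumes the grouped form ``groupby(...).transform("mean")``, compared against
``agg("mean")`` on the same groups.

.. code-block:: python

    import pandas as pd

    # Hypothetical sample data (an assumption, not taken from the patched file)
    df = pd.DataFrame(
        {
            "Category": ["A", "A", "B", "B", "B"],
            "Values": [10, 20, 30, 40, 50],
        }
    )

    # transform("mean") computes one mean per group and repeats it for every
    # row of that group, so the result has the same length as the input
    group_means = df.groupby("Category")["Values"].transform("mean")

    # agg("mean") reduces each group to a single value instead
    reduced = df.groupby("Category")["Values"].agg("mean")

    print(group_means.tolist())
    print(reduced.to_dict())

Here ``group_means`` keeps one entry per original row, while ``reduced`` holds one entry per
group.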
@@ -162,6 +165,9 @@ When to use: Use :meth:`DataFrame.filter` when you want to use a UDF to create a
     df_filtered = df[[col for col in df.columns if is_long_name(col)]]
     print(df_filtered)

+Since filter does not directly accept a UDF, you have to apply the UDF indirectly,
+such as by using list comprehensions.
+
 :meth:`DataFrame.map`
 ~~~~~~~~~~~~~~~~~~~~~

@@ -170,7 +176,7 @@ for this purpose compared to :meth:`DataFrame.apply` because of its better perfo

 When to use: Use map for applying element-wise UDFs to DataFrames or Series.

-Documentation: DataFrame.map
+Documentation can be found :ref:`here<api.dataframe.map>`.

 :meth:`DataFrame.pipe`
 ~~~~~~~~~~~~~~~~~~~~~~
@@ -180,7 +186,7 @@ It is a helpful tool for organizing complex data processing workflows.

 When to use: Use pipe when you need to create a pipeline of transformations and want to keep the code readable and maintainable.

-Documentation: DataFrame.pipe
+Documentation can be found :ref:`here<api.dataframe.pipe>`.


 Best Practices
@@ -198,9 +204,9 @@ for common operations.
 Vectorized Operations
 ~~~~~~~~~~~~~~~~~~~~~

-Below is an example of vectorized operations in pandas:
+Below is a comparison of using UDFs versus using Vectorized Operations:

-.. code-block:: text
+.. code-block:: python

     # User-defined function
     def calc_ratio(row):
@@ -215,8 +221,8 @@ Measuring how long each operation takes:

 .. code-block:: text

-    Vectorized: 0.0043 secs
     User-defined function: 5.6435 secs
+    Vectorized: 0.0043 secs

 Vectorized operations in pandas are significantly faster than using :meth:`DataFrame.apply`
 with UDFs because they leverage highly optimized C functions