Commit fe67ec8

committed
added agg, transform and filter
1 parent bf984ca commit fe67ec8

1 file changed

+80
-20
lines changed

doc/source/user_guide/user_defined_functions.rst

Lines changed: 80 additions & 20 deletions
Original file line number | Diff line number | Diff line change
@@ -13,10 +13,6 @@ flexibility when built-in methods are not sufficient. These functions can be
1313
applied at different levels: element-wise, row-wise, column-wise, or group-wise,
1414
depending on the method used.
1515

16-
.. .. note::
17-
18-
.. User-Defined Functions will be abbreviated to UDFs throughout this guide.
19-
2016
Why Use User-Defined Functions?
2117
-------------------------------
2218

@@ -36,7 +32,7 @@ needs go beyond standard aggregation, transformation, or filtering. UDFs allow y
3632
What functions support User-Defined Functions
3733
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3834

39-
UDFs can be applied across various pandas methods that work with Series and DataFrames:
35+
User-Defined Functions can be applied across various pandas methods that work with Series and DataFrames:
4036

4137
* :meth:`DataFrame.apply` - A flexible method that allows applying a function to Series,
4238
DataFrames, or groups of data.
@@ -60,7 +56,6 @@ ways to apply user-defined functions across different pandas data structures.
6056
The :meth:`DataFrame.apply` method applies a user-defined function along either axis (rows or columns):
6157

6258
.. ipython:: python
63-
6459
import pandas as pd
6560
6661
# Sample DataFrame
@@ -71,8 +66,8 @@ The :meth:`DataFrame.apply` allows applying a user-defined functions along eithe
7166
return x + 1
7267
7368
# Apply function
74-
df_transformed = df.apply(add_one)
75-
print(df_transformed)
69+
df_applied = df.apply(add_one)
70+
print(df_applied)
7671
7772
# This works with lambda functions too
7873
df_lambda = df.apply(lambda x : x + 1)
@@ -82,9 +77,6 @@ The :meth:`DataFrame.apply` allows applying a user-defined functions along eithe
8277
:meth:`DataFrame.apply` also accepts a dictionary that maps column names to user-defined functions:
8378

8479
.. ipython:: python
85-
86-
import pandas as pd
87-
8880
# Sample DataFrame
8981
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, 3]})
9082
@@ -96,8 +88,8 @@ The :meth:`DataFrame.apply` allows applying a user-defined functions along eithe
9688
return x + 2
9789
9890
# Apply function
99-
df_transformed = df.apply({"A": add_one, "B": add_two})
100-
print(df_transformed)
91+
df_applied = df.apply({"A": add_one, "B": add_two})
92+
print(df_applied)
10193
10294
# This works with lambda functions too
10395
df_lambda = df.apply({"A": lambda x : x + 1, "B": lambda x : x + 2})
@@ -106,9 +98,6 @@ The :meth:`DataFrame.apply` allows applying a user-defined functions along eithe
10698
Series objects have an analogous method, :meth:`Series.apply`:
10799

108100
.. ipython:: python
109-
110-
import pandas as pd
111-
112101
# Sample Series
113102
s = pd.Series([1, 2, 3])
114103
@@ -117,8 +106,8 @@ The :meth:`DataFrame.apply` allows applying a user-defined functions along eithe
117106
return x + 1
118107
119108
# Apply function
120-
s_transformed = s.apply(add_one)
121-
print(df_transformed)
109+
s_applied = s.apply(add_one)
110+
print(s_applied)
122111
123112
# This works with lambda functions too
124113
s_lambda = s.apply(lambda x : x + 1)
@@ -127,10 +116,9 @@ The :meth:`DataFrame.apply` allows applying a user-defined functions along eithe
127116
:meth:`DataFrame.agg`
128117
---------------------
129118

130-
When working with grouped data, user-defined functions can be used within :meth:`DataFrame.agg`:
119+
The :meth:`DataFrame.agg` method aggregates with a user-defined function along either axis (rows or columns); the same pattern works on grouped data:
131120

132121
.. ipython:: python
133-
134122
# Sample DataFrame
135123
df = pd.DataFrame({
136124
'Category': ['A', 'A', 'B', 'B'],
@@ -145,6 +133,78 @@ When working with grouped data, user-defined functions can be used within :meth:
145133
grouped_result = df.groupby('Category')['Values'].agg(group_mean)
146134
print(grouped_result)
147135
136+
In terms of the API, :meth:`DataFrame.agg` has similar usage to :meth:`DataFrame.apply`,
137+
but it is primarily used for **aggregation**, applying functions that summarize or reduce data.
138+
Typically, the result of :meth:`DataFrame.agg` reduces the dimensions of data as shown
139+
in the above example. Conversely, :meth:`DataFrame.apply` is more general and allows for both
140+
transformations and custom row-wise or element-wise operations.
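
As a rough illustration of that contrast, the sketch below passes a reducing function to ``agg`` and a shape-preserving function to ``apply``. The sample frame and the helper name ``value_range`` are made up for this example and are not part of the commit.

```python
import pandas as pd

# Hypothetical sample data, chosen so the results are easy to check by hand
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})


def value_range(col):
    # Reduce one column to a single number: max minus min
    return col.max() - col.min()


# agg with a reducing function returns one value per column
reduced = df.agg(value_range)

# apply with a non-reducing function keeps the original shape
centered = df.apply(lambda col: col - col.mean())

print(reduced.shape)
print(centered.shape)
```

Here ``reduced`` is a Series with one entry per column, while ``centered`` has the same rows and columns as ``df``.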
141+
142+
:meth:`DataFrame.transform`
143+
---------------------------
144+
145+
The :meth:`DataFrame.transform` method transforms a DataFrame, Series, or grouped object
146+
while preserving the original shape of the object.
147+
148+
.. ipython:: python
149+
# Sample DataFrame
150+
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
151+
152+
# User-Defined Function
153+
def double(x):
154+
return x * 2
155+
156+
# Apply transform
157+
df_transformed = df.transform(double)
158+
print(df_transformed)
159+
160+
# This works with lambda functions too
161+
df_lambda = df.transform(lambda x: x * 2)
162+
print(df_lambda)
163+
164+
With grouped data, passing an aggregation such as ``mean`` or ``sum`` to ``transform`` computes
165+
one value per group and broadcasts it back to the original dimensions:
166+
167+
.. ipython:: python
168+
# Sample DataFrame
169+
df = pd.DataFrame({
170+
'Category': ['A', 'A', 'B', 'B', 'B'],
171+
'Values': [10, 20, 30, 40, 50]
172+
})
173+
174+
# Using transform with mean
175+
df['Mean_Transformed'] = df.groupby('Category')['Values'].transform('mean')
176+
177+
# Using transform with sum
178+
df['Sum_Transformed'] = df.groupby('Category')['Values'].transform('sum')
179+
180+
# Result broadcasted to DataFrame
181+
print(df)
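
The grouped example above passes function names as strings. A user-defined function can be passed to a grouped ``transform`` in the same way; the sketch below assumes a helper named ``demean``, which is hypothetical and only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "B"],
    "Values": [10, 20, 30, 40, 50],
})


def demean(group):
    # Called once per group; subtract the group's own mean from each value
    return group - group.mean()


# The result has one value per original row, so it can be assigned as a column
df["Demeaned"] = df.groupby("Category")["Values"].transform(demean)
print(df)
```

Because the function returns a Series as long as its input group, the result lines up with the original index.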
182+
183+
:meth:`DataFrame.filter`
184+
------------------------
185+
186+
The :meth:`DataFrame.filter` method selects a subset of the DataFrame's rows or
187+
columns by label, using its ``items``, ``like``, or ``regex`` parameters. It does not
188+
call a user-defined function itself, but a custom condition can build the labels passed
189+
to ``items``. To filter with a boolean-returning function, use :meth:`.DataFrameGroupBy.filter`.
190+
191+
.. ipython:: python
192+
# Sample DataFrame
193+
df = pd.DataFrame({
194+
'A': [1, 2, 3],
195+
'B': [4, 5, 6],
196+
'C': [7, 8, 9],
197+
'D': [10, 11, 12]
198+
})
199+
200+
# Build the list of labels with a custom condition: keep columns whose mean is greater than 5
201+
df_filtered_func = df.filter(items=[col for col in df.columns if df[col].mean() > 5])
202+
print(df_filtered_func)
203+
204+
Unlike the methods discussed earlier, :meth:`DataFrame.filter` does not accept
205+
functions at all, so reductions such as ``mean`` or ``sum`` cannot be passed to it.
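
For filtering driven by a boolean-returning function, the grouped ``filter`` method is the relevant tool. The following is a minimal sketch, assuming a made-up predicate ``has_large_total`` and sample data that are not part of the commit.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "B"],
    "Values": [10, 20, 30, 40, 50],
})


def has_large_total(group):
    # Called once per group; must return a single boolean for the whole group
    return group["Values"].sum() > 50


# Rows belonging to groups for which the function returns False are dropped
kept = df.groupby("Category").filter(has_large_total)
print(kept)
```

Group ``A`` sums to 30 and is dropped, while group ``B`` sums to 120 and all of its rows are kept.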
206+
207+
148208
Performance Considerations
149209
--------------------------
150210
