py-src/data_formulator/agents/agent_data_rec.py
+6 -2 (6 additions & 2 deletions)
@@ -52,7 +52,7 @@
 (4) "visualization_fields" should be no more than 3 (for x,y,legend).
 (5) "chart_type" must be one of "point", "bar", "line", or "boxplot"

-2. Then, write a python function based on the inferred goal, the function input is a dataframe "df" and the output is the transformed dataframe "transformed_df". "transformed_df" should contain all "output_fields" from the refined goal.
+2. Then, write a python function based on the inferred goal, the function input is a dataframe "df" (or multiple dataframes based on tables presented in the [CONTEXT] section) and the output is the transformed dataframe "transformed_df". "transformed_df" should contain all "output_fields" from the refined goal.
 The python function must follow the template provided in [TEMPLATE], do not import any other libraries or modify function name. The function should be as simple as possible and easily readable.
 If there is no data transformation needed based on "output_fields", the transformation function can simply "return df".

@@ -63,11 +63,15 @@
 import collections
 import numpy as np

-def transform_data(df):
+def transform_data(df1, df2, ...):
     # complete the template here
     return transformed_df
 ```

+note:
+- if the user provided one table, then it should be def transform_data(df1), if the user provided multiple tables, then it should be def transform_data(df1, df2, ...) and you should consider the join between tables to derive the output.
+- try to use table names to refer to the input dataframes, for example, if the user provided two tables city and weather, you can use `transform_data(df_city, df_weather)` to refer to the two dataframes.
+
 3. The [OUTPUT] must only contain a json object representing the refined goal and a python code block representing the transformation code, do not add any extra text explanation.
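
For illustration, a function that follows the multi-table template added in this hunk might look like the sketch below. The table names `city` and `weather` come from the note in the prompt; the column names (`city`, `population`, `temperature`), the join key, and the sample rows are assumptions made up for this example and do not appear in the commit.

```python
import pandas as pd


def transform_data(df_city, df_weather):
    # Join the two tables on the shared "city" column (assumed join key).
    merged = df_city.merge(df_weather, on="city", how="inner")
    # Average temperature per city, keeping population for plotting.
    transformed_df = (
        merged.groupby(["city", "population"], as_index=False)["temperature"]
        .mean()
        .rename(columns={"temperature": "avg_temperature"})
    )
    return transformed_df


# Hypothetical input tables, standing in for the tables listed in [CONTEXT].
df_city = pd.DataFrame(
    {"city": ["Seattle", "Austin"], "population": [750000, 960000]}
)
df_weather = pd.DataFrame(
    {
        "city": ["Seattle", "Seattle", "Austin"],
        "temperature": [10.0, 14.0, 25.0],
    }
)

result = transform_data(df_city, df_weather)
```

The argument names mirror the table names, as the note suggests, so the join reads directly from the signature.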
py-src/data_formulator/agents/agent_data_transform_v2.py
+21 -3 (21 additions & 3 deletions)
@@ -45,7 +45,7 @@
 }
 ```

-2. Then, write a python function based on the refined goal, the function input is a dataframe "df" and the output is the transformed dataframe "transformed_df". "transformed_df" should contain all "output_fields" from the refined goal.
+2. Then, write a python function based on the refined goal, the function input is a dataframe "df" (or multiple dataframes based on tables presented in the [CONTEXT] section) and the output is the transformed dataframe "transformed_df". "transformed_df" should contain all "output_fields" from the refined goal.
 The python function must follow the template provided in [TEMPLATE], do not import any other libraries or modify function name. The function should be as simple as possible and easily readable.
 If there is no data transformation needed based on "output_fields", the transformation function can simply "return df".

@@ -56,11 +56,15 @@
 import collections
 import numpy as np

-def transform_data(df):
+def transform_data(df1, df2, ...):
     # complete the template here
     return transformed_df
 ```

+note:
+- if the user provided one table, then it should be def transform_data(df1), if the user provided multiple tables, then it should be def transform_data(df1, df2, ...) and you should consider the join between tables to derive the output.
+- try to use table names to refer to the input dataframes, for example, if the user provided two tables city and weather, you can use `transform_data(df_city, df_weather)` to refer to the two dataframes.
+
 3. The [OUTPUT] must only contain a json object representing the refined goal (including "detailed_instruction", "output_fields", "visualization_fields" and "reason") and a python code block representing the transformation code, do not add any extra text explanation.
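
The naming convention in the note (table `city` becomes argument `df_city`) could be applied mechanically when filling in the template. The sketch below is one possible way to do that; `arg_name_for_table` and `build_signature` are hypothetical helpers written for this example and are not part of the commit or of Data Formulator.

```python
import re
from typing import Sequence


def arg_name_for_table(table_name: str) -> str:
    """Turn a table name such as 'city' into an argument name such as 'df_city'."""
    # Collapse runs of characters that are not valid in identifiers into "_".
    cleaned = re.sub(r"\W+", "_", table_name.strip()).strip("_").lower()
    # Fall back to plain "df" when nothing usable is left.
    return f"df_{cleaned}" if cleaned else "df"


def build_signature(table_names: Sequence[str]) -> str:
    """Build the 'def transform_data(...)' line for the given tables."""
    args = ", ".join(arg_name_for_table(name) for name in table_names)
    return f"def transform_data({args}):"


one_table = build_signature(["city"])
two_tables = build_signature(["city", "weather"])
spaced_name = build_signature(["City Info"])
```

The `df_` prefix keeps the result a valid identifier even when a table name starts with a digit.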