
Commit 56ba15a

fix: refine prompt, equal lightgbm, discourage over hypertuning (#1072)
* feat: add mount_path parameter to run command
* feat: add runtime environment info and dynamic timeouts to DS runners
1 parent 9554f40 commit 56ba15a

9 files changed, +31 −5 lines changed


rdagent/app/utils/ws.py

Lines changed: 4 additions & 1 deletion
@@ -10,7 +10,7 @@
 
 
 @app.command()
-def run(competition: str, cmd: str, local_path: str = "./"):
+def run(competition: str, cmd: str, local_path: str = "./", mount_path: str | None = None):
     """
     Launch the data-science environment for a specific competition and run the
     provided command.
@@ -44,6 +44,9 @@ def run(competition: str, cmd: str, local_path: str = "./"):
         enable_cache=False,
     )
 
+    if mount_path is not None:
+        env.conf.mount_path = mount_path
+
     env.run(entry=cmd, local_path=local_path)
 
 

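The new `mount_path` argument is optional and only overrides `env.conf.mount_path` when given. A minimal sketch of calling the updated command programmatically; the competition name, command, and mount point below are illustrative, not from the commit:

```python
from rdagent.app.utils.ws import run

# Without mount_path, the environment keeps its configured default mount point.
run(competition="some-competition", cmd="python main.py", local_path="./workspace")

# With mount_path, the container mount point is overridden via env.conf.mount_path.
run(
    competition="some-competition",
    cmd="python main.py",
    local_path="./workspace",
    mount_path="/kaggle/input",
)
```
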
rdagent/components/coder/data_science/conf.py

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ class DSCoderCoSTEERSettings(CoSTEERSettings):
     class Config:
         env_prefix = "DS_Coder_CoSTEER_"
 
-    max_seconds: int = 2400
+    max_seconds: int = DS_RD_SETTING.debug_timeout * 4
     env_type: str = "docker"
     # TODO: extract a function for env and conf.
 

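The fixed 2400-second cap becomes a default derived from `DS_RD_SETTING.debug_timeout`, so the coder timeout scales with the scenario configuration. A minimal sketch (not the repo's actual classes) of how such a derived default behaves alongside the `DS_Coder_CoSTEER_` env prefix, assuming pydantic v1-style `BaseSettings` and a hypothetical `debug_timeout` of 600 seconds:

```python
import os

from pydantic import BaseSettings  # pydantic v1-style settings assumed


class FakeDSRDSetting(BaseSettings):
    debug_timeout: int = 600  # hypothetical value, in seconds


DS_RD_SETTING = FakeDSRDSetting()


class FakeCoderSettings(BaseSettings):
    class Config:
        env_prefix = "DS_Coder_CoSTEER_"

    # Derived default: 4x the debug timeout instead of a hard-coded 2400.
    max_seconds: int = DS_RD_SETTING.debug_timeout * 4


print(FakeCoderSettings().max_seconds)  # 2400 when debug_timeout == 600

# An env-prefixed variable still overrides the derived default.
os.environ["DS_Coder_CoSTEER_max_seconds"] = "900"
print(FakeCoderSettings().max_seconds)  # 900
```
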
rdagent/components/coder/data_science/pipeline/prompts.yaml

Lines changed: 7 additions & 0 deletions
@@ -82,6 +82,7 @@ pipeline_coder:
     ```
     In debug mode, you should only sample ten percent of the training data and run the minimum epochs to quickly test the correctness of the code.
     In debug mode, you should implement a timer to measure the time taken for your debug configuration and estimate the time required for the full run.
+    In debug mode, your code should run faster, so the environment will set a shorter time limit than the standard time limit for your code.
     For example, you can sample ten percent of the training data and run for one epoch, then the full run with ten epochs will take one hundred times the time taken for the debug run. The scale is calculated by yourself depending on the data sampling and epoch number you choose. If your full run enables early stopping, the scale should be smaller considering the early stopping will stop the training earlier than the full epochs.
     You should sample the data after train valid split. When you split the data after sampling, you might get a class with only one sample which might cause the split strategy to fail.
     Your debug code should run exactly the same as the full run, except for the data sampling and epoch number, to ensure the correctness of the code.
@@ -133,7 +134,9 @@ pipeline_coder:
 
     {% if latest_code %}
     # Former code
+    ```
     {{ latest_code }}
+    ```
     {% if latest_code_feedback is not none %}
     ## Feedback to former code
     {{ latest_code_feedback }}
@@ -270,7 +273,11 @@ pipeline_eval:
     {{ spec }}
 
     # Code
+    ```
     {{ code }}
+    ```
 
     ## Execution Output
+    ```
     {{ stdout }}
+    ```

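The added line tells the coder that debug runs get a tighter time limit, which makes the timer-and-extrapolation step mandatory. A minimal sketch of that estimate, using the prompt's own example of a ten percent sample and one epoch versus ten (the training call is a placeholder):

```python
import time

SAMPLE_FRACTION = 0.1  # debug run trains on 10% of the data
DEBUG_EPOCHS = 1
FULL_EPOCHS = 10

start = time.time()
# ... train on the 10% sample for DEBUG_EPOCHS epochs (placeholder) ...
debug_seconds = time.time() - start

# Scale factor from the prompt's example: (1 / 0.1) * (10 / 1) = 100.
# With early stopping enabled in the full run, this is only an upper bound.
scale = (1.0 / SAMPLE_FRACTION) * (FULL_EPOCHS / DEBUG_EPOCHS)
estimated_full_seconds = debug_seconds * scale
print(f"debug: {debug_seconds:.1f}s, estimated full run: {estimated_full_seconds:.1f}s")
```
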
rdagent/core/scenario.py

Lines changed: 6 additions & 0 deletions
@@ -52,6 +52,12 @@ def get_scenario_all_desc(
         The scenario description varies based on the task being performed.
         """
 
+    @abstractmethod
+    def get_runtime_environment(self) -> str:
+        """
+        Get the runtime environment information
+        """
+
     @property
     def experiment_setting(self) -> str | None:
         """Get experiment setting and return as rich text string"""

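Every `Scenario` subclass now has to provide `get_runtime_environment`. A hedged sketch of what a concrete implementation might return; the fields are illustrative and this is not the repository's actual `runtime_info` code:

```python
import os
import platform


class MyScenario:  # would subclass rdagent.core.scenario.Scenario in the real code
    def get_runtime_environment(self) -> str:
        # Report a few basic facts about the machine the generated code will run on.
        return (
            f"OS: {platform.platform()}\n"
            f"Python: {platform.python_version()}\n"
            f"CPU cores: {os.cpu_count()}"
        )
```
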
rdagent/scenarios/data_science/dev/runner/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ class DSRunnerCoSTEERSettings(DSCoderCoSTEERSettings):
     class Config:
         env_prefix = "DS_Runner_CoSTEER_"
 
-    max_seconds: int = 3600
+    max_seconds: int = DS_RD_SETTING.full_timeout
     env_type: str = "docker"
     # TODO: extract a function for env and conf.
 

rdagent/scenarios/data_science/dev/runner/eval.py

Lines changed: 1 addition & 0 deletions
@@ -133,6 +133,7 @@ def evaluate(
             scenario=self.scen.get_scenario_all_desc(eda_output=implementation.file_dict.get("EDA.md", None)),
             is_sub_enabled=test_eval.is_sub_enabled(self.scen.competition),
             task_desc=target_task.get_task_information(),
+            runtime_environment=self.scen.get_runtime_environment(),
         )
         user_prompt = T(".prompts:DSCoSTEER_eval.user").r(
             code=implementation.all_codes,

rdagent/scenarios/data_science/dev/runner/prompts.yaml

Lines changed: 8 additions & 1 deletion
@@ -9,14 +9,21 @@ DSCoSTEER_eval:
     The task is as follows:
     {{ task_desc }}
 
+    You have following environment to run the code:
+    {{ runtime_environment }}
+
     The whole workflow includes multiple stages, such as:
     - Data loading
     - Feature engineering
     - Model training
     - Ensembling
 
     The user will provide you the time spent on the whole code execution and the timeout of the code execution. You should decide whether the hyperparameter is reasonable based on the time.
-    For example, if the code only spent ten percent of the timeout and the hyperparameter like `n_estimators` or 'epochs' is very small or batch size is small you should suggest to increase these hyperparameter.
+    For example, if the code uses only a small portion of the allowed time, and hyperparameters like `n_estimators` or `epochs` have low values, with early stopping not being triggered and possible signs of underfitting, you should suggest increasing these hyperparameters.
+
+    You should also notice other resources utilization hyper-parameters,
+    For example, if you are using a GPU with large memory, and the batch size is set very low, you should suggest increasing the batch size if it is not reasonable.
+
     Please provide your feedback in two key-value pairs:
     "hyperparameter_tuning_decision": <true/false>
     "hyperparameter_tuning_suggestion": <suggestion in plain text for hyperparameter tuning, e.g., increase n_estimators to 1000, increase epochs to 100, increase batch size to 64, give an empty string if decide not to tune the hyperparameter>

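For reference, an illustrative example (not taken from the repository) of the two key-value pairs this prompt asks for, one for a run that finished far under the time limit and one where no tuning is needed:

```python
# Run used only a small fraction of the timeout and early stopping never fired.
tune = {
    "hyperparameter_tuning_decision": True,
    "hyperparameter_tuning_suggestion": (
        "Only ~10% of the timeout was used and early stopping was not triggered; "
        "increase n_estimators to 1000 and raise the batch size to 64."
    ),
}

# Hyperparameters already look reasonable for the available time and hardware.
no_tune = {
    "hyperparameter_tuning_decision": False,
    "hyperparameter_tuning_suggestion": "",
}
```
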
rdagent/scenarios/data_science/scen/runtime_info.py

Lines changed: 2 additions & 0 deletions
@@ -62,6 +62,8 @@ def get_gpu_info():
     "numpy",
     "scikit-learn",
     "scipy",
+    "xgboost",
+    "sklearn",
     "lightgbm",
     "vtk",
     "opencv-python",

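The diff extends a list of package names, presumably the ones whose installed versions are reported as part of the runtime environment. A hedged sketch of how such versions could be collected with `importlib.metadata`; it mirrors the apparent intent of `runtime_info.py` but is not its actual code:

```python
from importlib import metadata

PACKAGES = ["numpy", "scikit-learn", "scipy", "xgboost", "sklearn", "lightgbm", "vtk", "opencv-python"]


def package_versions(names: list[str]) -> dict[str, str]:
    """Map each distribution name to its installed version, or a marker if missing."""
    versions: dict[str, str] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # e.g. "sklearn" is usually not a real distribution name; scikit-learn is.
            versions[name] = "not installed"
    return versions


print(package_versions(PACKAGES))
```
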
rdagent/scenarios/data_science/test_eval.py

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ def valid(self, competition: str, workspace: FBWorkspace) -> tuple[str, int]:
 
     @abstractmethod
     def enabled(self, competition) -> bool:
-        """able to eval or not"""
+        """support `eval` & `valid` or not"""
 
     @abstractmethod
     def get_sample_submission_name(self, competition: str) -> str:
