Commit c670d0a

Fix taskset and eval_tasksets path (#367)
1 parent aa32213 commit c670d0a

25 files changed, +46 -42 lines changed

docs/sphinx_doc/source/tutorial/example_async_mode.md

Lines changed: 3 additions & 3 deletions
@@ -31,7 +31,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: train
   format:
@@ -79,7 +79,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   format:
     prompt_key: 'question'
@@ -143,7 +143,7 @@ buffer:
 taskset: # important
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   format:
     prompt_key: 'question'
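
The new `path` value uses the `${oc.env:NAME,default}` interpolation form: the environment variable is used when it is set, and the text after the comma is the fallback. The following is a minimal standard-library sketch that only emulates that lookup for illustration. It is not OmegaConf's parser, and `resolve_env`, its `environ` parameter, and the `/data/gsm8k` path are hypothetical names chosen for this example.

```python
import os
import re

# Matches ${oc.env:NAME} or ${oc.env:NAME,default}. This is a simplified
# stand-in for the resolver syntax, not the real OmegaConf grammar.
_ENV_PATTERN = re.compile(r"\$\{oc\.env:([A-Za-z_][A-Za-z0-9_]*)(?:,([^}]*))?\}")


def resolve_env(value, environ=None):
    """Substitute each ${oc.env:NAME,default} with the variable or its default."""
    env = os.environ if environ is None else environ

    def _replace(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is None:
            # No fallback was given, so an unset variable is an error.
            raise KeyError(f"environment variable {name!r} is not set")
        return default

    return _ENV_PATTERN.sub(_replace, value)


path_template = "${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}"

# Variable unset: the default after the comma is used.
print(resolve_env(path_template, environ={}))
# Variable set: its value overrides the default (hypothetical local path).
print(resolve_env(path_template, environ={"TRINITY_TASKSET_PATH": "/data/gsm8k"}))
```

With this form, the same YAML file can point at a local copy of the dataset by exporting `TRINITY_TASKSET_PATH`, while still falling back to `openai/gsm8k` when the variable is absent.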

docs/sphinx_doc/source/tutorial/example_reasoning_basic.md

Lines changed: 2 additions & 2 deletions
@@ -69,7 +69,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: 'train'
   format:
@@ -81,7 +81,7 @@ buffer:
 eval_tasksets:
   - name: gsm8k-eval
     storage_type: file
-    path: 'openai/gsm8k'
+    path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
     subset_name: 'main'
     split: 'test'
     format:

docs/sphinx_doc/source/tutorial/trinity_configs.md

Lines changed: 1 addition & 1 deletion
@@ -275,7 +275,7 @@ The configuration for each task dataset is defined as follows:
   - `file`: The dataset is stored in `jsonl`/`parquet` files. The data file organization is required to meet the huggingface standard. *We recommend using this storage type for most cases.*
   - `sql`: The dataset is stored in a SQL database. *This type is unstable and will be optimized in future versions.*
 - `path`: The path to the task dataset.
-  - For `file` storage type, the path points to the directory that contains the task dataset files.
+  - For `file` storage type, the path points to the directory that contains the task dataset files. It supports loading both local and remote data files in a format compatible with the [`datasets.load_dataset()`](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_dataset) function.
   - For `sql` storage type, the path points to the sqlite database file.
 - `subset_name`: The subset name of the task dataset, corresponding to the `name` parameter in huggingface datasets `load_dataset` function. Default is `None`.
 - `split`: The split of the task dataset, corresponding to the `split` parameter in huggingface datasets `load_dataset` function. Default is `train`.
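
The documentation above maps taskset fields onto `load_dataset` parameters: `path` is passed as the dataset location, `subset_name` corresponds to `name`, and `split` corresponds to `split`. A minimal sketch of that mapping follows. The helper `to_load_dataset_kwargs` is a hypothetical name for this illustration and is not taken from the Trinity code base; how the project actually forwards these values is an assumption.

```python
from typing import Any, Dict, Optional


def to_load_dataset_kwargs(
    path: str,
    subset_name: Optional[str] = None,
    split: str = "train",
) -> Dict[str, Any]:
    """Translate taskset fields into keyword arguments for datasets.load_dataset()."""
    return {
        "path": path,         # local directory or remote dataset identifier
        "name": subset_name,  # `subset_name` corresponds to the `name` parameter
        "split": split,       # `split` is passed through unchanged
    }


kwargs = to_load_dataset_kwargs("openai/gsm8k", subset_name="main", split="test")
print(kwargs)

# With the `datasets` package installed, the dataset could then be loaded with:
#   from datasets import load_dataset
#   dataset = load_dataset(**kwargs)
```

The defaults mirror the documented ones: `subset_name` defaults to `None` and `split` defaults to `train`.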

docs/sphinx_doc/source_zh/tutorial/example_async_mode.md

Lines changed: 3 additions & 3 deletions
@@ -31,7 +31,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: train
   format:
@@ -79,7 +79,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   format:
     prompt_key: 'question'
@@ -143,7 +143,7 @@ buffer:
 taskset: # important
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   format:
     prompt_key: 'question'

docs/sphinx_doc/source_zh/tutorial/example_reasoning_basic.md

Lines changed: 2 additions & 2 deletions
@@ -69,7 +69,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: 'train'
   format:
@@ -81,7 +81,7 @@ buffer:
 eval_tasksets:
   - name: gsm8k-eval
     storage_type: file
-    path: 'openai/gsm8k'
+    path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
     subset_name: 'main'
     split: 'test'
     format:

docs/sphinx_doc/source_zh/tutorial/trinity_configs.md

Lines changed: 1 addition & 1 deletion
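@@ -272,7 +272,7 @@ buffer:
   - `file`: The dataset is stored in `jsonl`/`parquet` files. The data file organization must follow the HuggingFace standard. *This storage type is recommended for most cases.*
   - `sql`: The dataset is stored in a SQL database. *This type is not yet stable and will be optimized in future versions.*
 - `path`: The path to the task dataset.
-  - For the `file` type, the path points to the directory that contains the task dataset files.
+  - For the `file` type, the path points to the directory that contains the task dataset files. It supports loading local and remote data files in a format compatible with the [`datasets.load_dataset()`](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_dataset) function.
   - For the `sql` type, the path points to the sqlite database file.
 - `subset_name`: The subset name of the task dataset, corresponding to the `name` parameter in the huggingface datasets `load_dataset` function. Default is `None`.
 - `split`: The split of the task dataset, corresponding to the `split` parameter in the huggingface datasets `load_dataset` function. Default is `train`.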

examples/agentscope_react/gsm8k.yaml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: 'train'
   format:

examples/agentscope_tool_react/agentscopev0_tool_react_gsm8k.yaml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: 'train'
   format:

examples/agentscope_tool_react/agentscopev1_tool_react_dapo.yaml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ buffer:
 taskset:
   name: dapo
   storage_type: file
-  path: open-r1/DAPO-Math-17k-Processed
+  path: ${oc.env:TRINITY_TASKSET_PATH,open-r1/DAPO-Math-17k-Processed}
   subset_name: en
   split: train
   format:

examples/asymre_gsm8k/gsm8k.yaml

Lines changed: 2 additions & 2 deletions
@@ -28,7 +28,7 @@ buffer:
 taskset:
   name: gsm8k
   storage_type: file
-  path: 'openai/gsm8k'
+  path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
   subset_name: 'main'
   split: train
   format:
@@ -40,7 +40,7 @@ buffer:
 eval_tasksets:
   - name: gsm8k-eval
     storage_type: file
-    path: 'openai/gsm8k'
+    path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
     subset_name: 'main'
     split: test
     format:
