Commit c42b604

refactor: refactor data format to support multi-modal input
1 parent 7643b9f commit c42b604

24 files changed: 261 additions and 147 deletions

examples/configs/cot_config.yaml

Lines changed: 0 additions & 33 deletions
This file was deleted.

examples/configs/multi_hop_config.yaml

Lines changed: 0 additions & 34 deletions
This file was deleted.

examples/configs/vqa_config.yaml

Lines changed: 0 additions & 32 deletions
This file was deleted.

examples/extract/extract_schema_guided.sh renamed to examples/extract/extract_schema_guided/extract_schema_guided.sh

File renamed without changes.

examples/configs/schema_guided_extraction_config.yaml renamed to examples/extract/extract_schema_guided/schema_guided_extraction_config.yaml

File renamed without changes.

examples/generate/generate_aggregated_qa/aggregated_config.yaml

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ nodes:
       - quiz
     execution_params:
       replicas: 1
-      batch_size: 16
+      batch_size: 128
 
   - id: partition
     op_name: partition
@@ -71,7 +71,7 @@ nodes:
       - partition
     execution_params:
       replicas: 1
-      batch_size: 16
+      batch_size: 128
     params:
       method: aggregated # atomic, aggregated, multi_hop, cot, vqa
       data_format: ChatML # Alpaca, Sharegpt, ChatML
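
Both hunks above raise `batch_size` from 16 to 128. The diff does not say what the setting means, so the following is only a rough sketch: it assumes `batch_size` is the number of items handed to an operator per call, and it uses a hypothetical workload of 1,000 items to show how the number of calls would shrink.

```python
import math


def num_batches(num_items: int, batch_size: int) -> int:
    """Batches needed to cover num_items; the last batch may be partial."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_items / batch_size)


# Hypothetical workload size, not taken from the commit.
ITEMS = 1000

old_batches = num_batches(ITEMS, 16)    # value before this commit
new_batches = num_batches(ITEMS, 128)   # value after this commit
print(old_batches, new_batches)  # prints "63 8"
```

Under that assumption, the larger batch means roughly eight times fewer calls, at the cost of more items held in memory per call.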

examples/generate/generate_atomic_qa/atomic_config.yaml

Lines changed: 9 additions & 1 deletion
@@ -8,20 +8,25 @@ nodes:
     dependencies: []
     params:
       input_path:
-        - resources/input_examples/json_demo.json
+        - examples/input_examples/json_demo.json
 
   - id: chunk
     op_name: chunk
     type: map_batch
     dependencies:
       - read
+    execution_params:
+      replicas: 4
     params:
       chunk_size: 1024
       chunk_overlap: 100
 
   - id: build_kg
     op_name: build_kg
     type: map_batch
+    execution_params:
+      replicas: 1
+      batch_size: 128
     dependencies:
       - chunk
 
@@ -40,6 +45,9 @@ nodes:
     type: map_batch
     dependencies:
       - partition
+    execution_params:
+      replicas: 1
+      batch_size: 128
     params:
       method: atomic
       data_format: Alpaca

examples/generate/generate_cot.sh

Lines changed: 0 additions & 3 deletions
This file was deleted.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Generate CoT QAs
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+global_params:
+  working_dir: cache
+
+nodes:
+  - id: read
+    op_name: read
+    type: source
+    dependencies: []
+    params:
+      input_path:
+        - examples/input_examples/txt_demo.txt
+
+  - id: chunk
+    op_name: chunk
+    type: map_batch
+    dependencies:
+      - read
+    execution_params:
+      replicas: 4
+    params:
+      chunk_size: 1024
+      chunk_overlap: 100
+
+  - id: build_kg
+    op_name: build_kg
+    type: map_batch
+    execution_params:
+      replicas: 1
+      batch_size: 128
+    dependencies:
+      - chunk
+
+  - id: partition
+    op_name: partition
+    type: aggregate
+    dependencies:
+      - build_kg
+    params:
+      method: leiden
+      method_params:
+        max_size: 20
+        use_lcc: false
+        random_seed: 42
+
+  - id: generate
+    op_name: generate
+    type: map_batch
+    dependencies:
+      - partition
+    execution_params:
+      replicas: 1
+      batch_size: 128
+    params:
+      method: cot
+      data_format: Sharegpt
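
The new CoT config above wires five nodes together through their `dependencies` lists. As a minimal sketch of how such a list can be turned into a run order, the snippet below copies the node ids and dependencies from the config into a plain dict and sorts them with the standard-library `graphlib`. The `execution_order` helper is hypothetical and is not taken from the project; how the project itself schedules nodes is not shown in this commit.

```python
from graphlib import TopologicalSorter

# Node ids mapped to their dependencies, copied from the config above.
NODES = {
    "read": [],
    "chunk": ["read"],
    "build_kg": ["chunk"],
    "partition": ["build_kg"],
    "generate": ["partition"],
}


def execution_order(nodes: dict[str, list[str]]) -> list[str]:
    """Return node ids so that every node comes after its dependencies."""
    referenced = {dep for deps in nodes.values() for dep in deps}
    unknown = referenced - set(nodes)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    # TopologicalSorter takes {node: predecessors}; it raises
    # graphlib.CycleError if the dependencies contain a cycle.
    return list(TopologicalSorter(nodes).static_order())


print(execution_order(NODES))
# prints ['read', 'chunk', 'build_kg', 'partition', 'generate']
```

Because this config is a simple chain, only one order is valid; with branching dependencies, `static_order()` would return one of several valid orders.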
