README.md: 5 additions & 3 deletions
@@ -37,7 +37,7 @@ SyGra Framework is created to generate synthetic data. As it is a complex proces
- Define a task, which involves graph node configuration, flow between nodes, and conditions between the nodes
- Define the output location to dump the generated data

-Seed data can be pulled from either Huggingface or file system. Once the seed data is loaded, SyGra platform allows datagen users to write any data processing using the data transformation module. When the data is ready, users can define the data flow with various types of nodes. A node can also be a subgraph defined in another yaml file.
+Seed data can be pulled from various data sources; examples include Huggingface, the file system, and a ServiceNow instance. Once the seed data is loaded, the SyGra platform lets datagen users write any data processing using the data transformation module. When the data is ready, users can define the data flow with various types of nodes. A node can also be a subgraph defined in another YAML file.
Each node can be defined with preprocessing, post-processing, and an LLM prompt with model parameters. Prompts can use seed data as Python template keys.
Edges define the flow between nodes, which can be conditional or non-conditional, with support for parallel and one-to-many flows.
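As an illustration of the node-and-edge model described above, a task's graph configuration might look like the following YAML sketch. All field names here (`nodes`, `edges`, `prompt`, `condition`, and so on) are hypothetical placeholders, not SyGra's actual schema:

```yaml
# Hypothetical task configuration; key names are illustrative only.
nodes:
  paraphrase:
    prompt: "Paraphrase the following text: {seed_text}"  # seed data as a template key
    model:
      temperature: 0.7
  judge:
    prompt: "Rate this paraphrase from 1 to 5: {paraphrase}"
edges:
  - from: paraphrase
    to: judge
  - from: judge
    to: paraphrase
    condition: "score < 3"  # conditional edge: loop back when quality is low
```

The intent of the sketch is only to show how prompts, model parameters, and conditional flow between nodes could sit together in one task file.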
@@ -114,8 +114,10 @@ workflow.run(num_records=1)
The SyGra architecture is composed of multiple components. The following diagrams illustrate the four primary components and their associated modules.
### Data Handler
-Data handler is used for reading and writing the data. Currently, it supports file handler with various file types and huggingface handler.
-When reading data from huggingface, it can read the whole dataset and process, or it can stream chunk of data.
+Data handler is used for reading and writing the data. Currently, it supports the following handlers:
+- File handler: supports various file types such as JSON, JSONL, CSV, and Parquet, as well as folders of supported file types.
+- Huggingface handler: when reading data from Huggingface, it can read and process the whole dataset, or stream chunks of data.
+- ServiceNow handler: connects to a ServiceNow instance; currently it reads from or writes to a single table per dataset configuration.
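To make the handler list above concrete, a dataset configuration might select a source and a sink along these lines. The keys (`data_config`, `type`, `repo_id`, `table`, etc.) are assumptions for illustration, not SyGra's documented schema:

```yaml
# Hypothetical data_config; key names are illustrative only.
data_config:
  source:
    type: hf                       # Huggingface handler
    repo_id: org/my-seed-dataset
    streaming: true                # stream chunks instead of loading the full dataset
  sink:
    type: servicenow               # ServiceNow handler: one table per dataset configuration
    instance: https://example.service-now.com
    table: u_generated_data
```

Swapping `type` to a file handler with a path to a JSONL or Parquet file would be the analogous file-system setup.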