
Commit cff1134

Merge pull request #27 from FederatedAI/dev-2.1.1
Dev 2.1.1
2 parents ff20184 + 11fccd7 commit cff1134

17 files changed: +602 -126 lines

RELEASE.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+## Release 2.1.1
+### Major Features and Improvements
+> Fate-Test: FATE Automated Testing Tool
+* Add new subcommand `llmsuite` for FATE-LLM training and evaluation
+
 ## Release 2.1.0
 ### Major Features and Improvements
 > Fate-Test: FATE Automated Testing Tool

doc/fate_test.md

Lines changed: 14 additions & 2 deletions
@@ -9,7 +9,7 @@ A collection of useful tools to running FATE tests and PipeLine tasks.
 ```bash
 pip install -e python/fate_test
 ```
-2. edit default fate\_test\_config.yaml
+2. edit default fate\_test\_config.yaml; edit path to fate base/data base accordingly
 
 ```bash
 # edit priority config file with system default editor
@@ -88,4 +88,16 @@ shown in last step
 ```bash
 fate_test data generate -i <path contains *performance.yaml> -ng 10000 -fg 10 -fh 10 -m 1.0 --upload-data
 fate_test performance -i <path contains *performance.yaml> --skip-data
-```
+```
+
+- [llm-suite](./fate_test_command.md#llmsuite): used for running FATE-Llm testsuites, collection of FATE-Llm jobs and/or evaluations
+
+  Before running llmsuite for the first time, make sure to install FATE-Llm and allow its import in FATE-Test scripts:
+
+  ```bash
+  fate_test config include fate-llm
+  ```
+
+  ```bash
+  fate_test llmsuite -i <path contains *llmsuite.yaml>
+  ```

doc/fate_test_command.md

Lines changed: 152 additions & 0 deletions
@@ -867,3 +867,155 @@ fate_test data --help
 data after generate and upload dataset in testsuites
 *path1*
 
+
+## Llmsuite
+
+Llmsuite is used for running a collection of FATE-Llm jobs in sequence and then evaluating them on user-specified tasks.
+It also allows users to compare the results of different llm jobs.
+
+### command options
+
+```bash
+fate_test llmsuite --help
+```
+
+1. include:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml>
+   ```
+
+   will run llm testsuites in *path1*
+
+2. exclude:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> -e <path2 to exclude> -e <path3 to exclude> ...
+   ```
+
+   will run llm testsuites in *path1* but not in *path2* and *path3*
+
+3. glob:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> -g "hetero*"
+   ```
+
+   will run llm testsuites in sub-directories of *path1* starting with *hetero*
+
+4. algorithm-suite:
+
+   ```bash
+   fate_test llmsuite -a "pellm"
+   ```
+
+   will run the built-in 'pellm' llm testsuite, which will train and evaluate a FATE-Llm model and a zero-shot model
+
+5. timeout:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> -m 3600
+   ```
+
+   will run llm testsuites in *path1* and time out when a job does not finish within 3600s; if tasks need more time, use a larger threshold
+
+6. task-cores:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> -p 4
+   ```
+
+   will run llm testsuites in *path1* with script config "task-cores" set to 4
+
+7. eval-config:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> --eval-config <path2>
+   ```
+
+   will run llm testsuites in *path1* with evaluation configuration set to *path2*
+
+8. skip-evaluate:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> --skip-evaluate
+   ```
+
+   will run llm testsuites in *path1* without running evaluation
+
+9. provider:
+
+   ```bash
+   fate_test llmsuite -i <path1 contains *llmsuite.yaml> --provider <provider_name>
+   ```
+
+   will run llm testsuites in *path1* with FATE provider set to *provider_name*
+
+10. yes:
+
+    ```bash
+    fate_test llmsuite -i <path1 contains *llmsuite.yaml> --yes
+    ```
+
+    will run llm testsuites in *path1* directly, skipping the double check
+
+
+### FATE-Llm job configuration
+
+Configuration of jobs should be specified in an llm testsuite whose
+file name ends with "\*llmsuite.yaml". For an llm testsuite example,
+please refer [here](https://github.com/FederatedAI/FATE-LLM).
+
+A FATE-Llm testsuite includes the following elements:
+
+- job group: each group includes an arbitrary number of jobs with paths
+  to the corresponding script and configuration
+
+- job: name of the evaluation job to be run, must be unique within each group
+  list
+
+  - script: path to [testing script](#testing-script), should be
+    relative to testsuite, optional for evaluation-only jobs;
+    note that the pretrained model, if available, should be returned at the end of the script
+  - conf: path to job configuration file for script, should be
+    relative to testsuite, optional for evaluation-only jobs
+  - pretrained: path to pretrained model, should be either a model name from Huggingface or a path relative to
+    testsuite, optional for jobs that run a FATE-Llm training job, where the
+    script should return the path to the pretrained model
+  - peft: path to peft file, should be relative to testsuite,
+    optional for jobs that run a FATE-Llm training job
+  - tasks: list of tasks to be evaluated, optional for jobs skipping evaluation
+  - include_path: should be specified if tasks are user-defined
+  - eval_conf: path to evaluation configuration file, should be
+    relative to testsuite; if not provided, will use default conf
+
+  ```yaml
+  bloom_lora:
+    pretrained: "models/bloom-560m"
+    script: "./test_bloom_lora.py"
+    conf: "./bloom_lora_config.yaml"
+    peft_path_format: "{{fate_base}}/fate_flow/model/{{job_id}}/guest/{{party_id}}/{{model_task_name}}/0/output/output_model/model_directory"
+    tasks:
+      - "dolly-15k"
+  ```
+
+- llm suite
+
+  ```yaml
+  hetero_nn_sshe_binary_0:
+    bloom_lora:
+      pretrained: "bloom-560m"
+      script: "./test_bloom_lora.py"
+      conf: "./bloom_lora_config.yaml"
+      peft_path_format: "{{fate_base}}/fate_flow/model/{{job_id}}/guest/{{party_id}}/{{model_task_name}}/0/output/output_model/model_directory"
+      tasks:
+        - "dolly-15k"
+    bloom_zero_shot:
+      pretrained: "bloom-560m"
+      tasks:
+        - "dolly-15k"
+  ```
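The added documentation states that a training job's script should return the path to the trained/pretrained model at its end so the evaluation stage can pick it up. Below is a minimal, hypothetical sketch of such a script (e.g. something shaped like ./test_bloom_lora.py); the entry-point signature, the config path, and the returned directory are illustrative assumptions, and the actual FATE-Llm pipeline construction is omitted since it depends on the installed FATE-LLM version.

```python
# Hypothetical skeleton for a script referenced by "script" in an llmsuite entry.
# Real FATE-Llm pipeline construction and fitting are elided; only the
# "return the model path at the end" contract described above is illustrated.
import argparse


def main(config="./bloom_lora_config.yaml"):
    # ... build and fit the FATE-Llm training pipeline using `config` here ...
    trained_model_path = "models/bloom-560m"  # placeholder for the real output directory
    # llmsuite expects the script to return the model location at the end,
    # so the evaluation stage knows which model to load
    return trained_model_path


if __name__ == "__main__":
    parser = argparse.ArgumentParser("llmsuite job script sketch")
    parser.add_argument("--config", type=str, default="./bloom_lora_config.yaml",
                        help="path to job configuration file")
    args = parser.parse_args()
    print(main(config=args.config))
```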

python/fate_test/_config.py

Lines changed: 6 additions & 6 deletions
@@ -36,20 +36,20 @@
 # st_config_directory: examples/flow_test_template/hetero_lr/flow_test_config.yaml
 
 # directory stores testsuite file with min_test data sets to upload,
-# default location={FATE}/examples/data/upload_config/min_test_data_testsuite.json
-min_test_data_config: examples/data/upload_config/min_test_data_testsuite.json
+# default location={FATE}/examples/data/upload_config/min_test_data_testsuite.yaml
+min_test_data_config: examples/data/upload_config/min_test_data_testsuite.yaml
 # directory stores testsuite file with all example data sets to upload,
-# default location={FATE}/examples/data/upload_config/all_examples_data_testsuite.json
-all_examples_data_config: examples/data/upload_config/all_examples_data_testsuite.json
+# default location={FATE}/examples/data/upload_config/all_examples_data_testsuite.yaml
+all_examples_data_config: examples/data/upload_config/all_examples_data_testsuite.yaml
 
 # directory where FATE code locates, default installation location={FATE}/fate
 # python/ml -> $fate_base/python/ml
-fate_base: path(FATE)/fate
+fate_base: path(FATE)/
 
 # whether to delete data in suites after all jobs done
 clean_data: true
 
-# participating parties' id and correponding flow service ip & port information
+# participating parties' id and corresponding flow service ip & port information
 parties:
   guest: ['9999']
   host: ['10000', '9999']
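Since several defaults in the template above change (the upload testsuite configs now end in .yaml, and fate_base now points at the FATE root rather than {FATE}/fate), one way to sanity-check an edited priority config is to load it back and inspect those keys. This is an illustrative sketch only, not part of the commit; it assumes PyYAML is installed and that a fate_test_config.yaml sits in the working directory, which may not match where fate_test actually stores its priority config.

```python
# Sketch: read back the keys touched by this change from a local config copy.
from pathlib import Path

import yaml

config_path = Path("fate_test_config.yaml")  # hypothetical location
conf = yaml.safe_load(config_path.read_text())

# testsuite templates now end in .yaml, fate_base should be the FATE root
print(conf.get("min_test_data_config"))
print(conf.get("all_examples_data_config"))
print(conf.get("fate_base"))
print(conf.get("parties", {}).get("guest"))
```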

python/fate_test/_flow_client.py

Lines changed: 25 additions & 60 deletions
@@ -41,6 +41,29 @@ def __init__(self,
     def set_address(self, address):
         self.address = address
 
+    def bind_table(self, data: Data, callback=None):
+        conf = data.config
+        conf['file'] = os.path.join(str(self._data_base_dir), conf.get('file'))
+        path = Path(conf.get('file'))
+        if not path.exists():
+            raise Exception('The file is obtained from the fate flow client machine, but it does not exist, '
+                            f'please check the path: {path}')
+        response = self._client.table.bind_path(path=str(path),
+                                                namespace=data.namespace,
+                                                name=data.table_name)
+        try:
+            if callback is not None:
+                callback(response)
+                status = str(response['message']).lower()
+            else:
+                status = response["message"]
+            code = response["code"]
+            if code != 0:
+                raise RuntimeError(f"Return code {code} != 0, bind path failed")
+        except BaseException:
+            raise ValueError(f"Bind path failed, response={response}")
+        return status
+
     def transform_local_file_to_dataframe(self, data: Data, callback=None, output_path=None):
         #data_warehouse = self.upload_data(data, callback, output_path)
         #status = self.transform_to_dataframe(data.namespace, data.table_name, data_warehouse, callback)
@@ -82,44 +105,6 @@ def upload_file_and_convert_to_dataframe(self, data: Data, callback=None, output
         self._awaiting(job_id, "local", 0)
         return status
 
-    """def upload_data(self, data: Data, callback=None, output_path=None):
-        response, file_path = self._upload_data(data, output_path=output_path)
-        try:
-            if callback is not None:
-                callback(response)
-            code = response["code"]
-            if code != 0:
-                raise ValueError(f"Return code {code}!=0")
-
-            namespace = response["data"]["namespace"]
-            name = response["data"]["name"]
-            job_id = response["job_id"]
-        except BaseException:
-            raise ValueError(f"Upload data fails, response={response}")
-        # self.monitor_status(job_id, role=self.role, party_id=self.party_id)
-        self._awaiting(job_id, "local", 0)
-
-        return dict(namespace=namespace, name=name)
-
-    def transform_to_dataframe(self, namespace, table_name, data_warehouse, callback=None):
-        response = self._client.data.dataframe_transformer(namespace=namespace,
-                                                           name=table_name,
-                                                           data_warehouse=data_warehouse)
-
-        try:
-            if callback is not None:
-                callback(response)
-                status = self._awaiting(response["job_id"], "local", 0)
-                status = str(status).lower()
-            else:
-                status = response["retmsg"]
-
-        except Exception as e:
-            raise RuntimeError(f"upload data failed") from e
-        job_id = response["job_id"]
-        self._awaiting(job_id, "local", 0)
-        return status"""
-
     def delete_data(self, data: Data):
         try:
             table_name = data.config['table_name'] if data.config.get(
@@ -154,27 +139,6 @@ def _awaiting(self, job_id, role, party_id, callback=None):
                 callback(response)
             time.sleep(1)
 
-    """def _upload_data(self, data, output_path=None, verbose=0, destroy=1):
-        conf = data.config
-        # if conf.get("engine", {}) != "PATH":
-        if output_path is not None:
-            conf['file'] = os.path.join(os.path.abspath(output_path), os.path.basename(conf.get('file')))
-        else:
-            if _config.data_switch is not None:
-                conf['file'] = os.path.join(str(self._cache_directory), os.path.basename(conf.get('file')))
-            else:
-                conf['file'] = os.path.join(str(self._data_base_dir), conf.get('file'))
-        path = Path(conf.get('file'))
-        if not path.exists():
-            raise Exception('The file is obtained from the fate flow client machine, but it does not exist, '
-                            f'please check the path: {path}')
-        response = self._client.data.upload(file=str(path),
-                                            head=data.head,
-                                            meta=data.meta,
-                                            extend_sid=data.extend_sid,
-                                            partitions=data.partitions)
-        return response, conf["file"]"""
-
     def _output_data_table(self, job_id, role, party_id, task_name):
         response = self._client.output.data_table(job_id, role=role, party_id=party_id, task_name=task_name)
         if response.get("code") is not None:
@@ -223,7 +187,7 @@ def get_version(self):
     """def _add_notes(self, job_id, role, party_id, notes):
         data = dict(job_id=job_id, role=role, party_id=party_id, notes=notes)
         response = AddNotesResponse(self._post(url='job/update', json=data))
-        return response"""
+        return response
 
     def _table_bind(self, data):
         response = self._post(url='table/bind', json=data)
@@ -235,6 +199,7 @@ def _table_bind(self, data):
         except Exception as e:
             raise RuntimeError(f"table bind error: {response}") from e
         return response
+    """
 
 
 class Status(object):
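The new bind_table method above requires the data file to exist on the flow-client machine and treats a non-zero response code as a failed bind. A small standalone sketch of that error-handling pattern follows; the helper name, arguments, and sample response are hypothetical and are not part of the fate_test API.

```python
# Illustrative helper mirroring the error handling added in bind_table:
# the bound path must exist locally, and a response with code != 0 fails.
from pathlib import Path


def check_bind_response(path: str, response: dict) -> str:
    if not Path(path).exists():
        raise FileNotFoundError(f"please check the path: {path}")
    if response.get("code") != 0:
        raise RuntimeError(f"Bind path failed, response={response}")
    return str(response.get("message", "")).lower()


# assumed shape of a successful response: code 0 plus a message field
print(check_bind_response(".", {"code": 0, "message": "success"}))
```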

python/fate_test/_io.py

Lines changed: 4 additions & 0 deletions
@@ -32,6 +32,10 @@ def echo(cls, message, **kwargs):
         click.secho(message, **kwargs)
         click.secho(message, file=cls._file, **kwargs)
 
+    @classmethod
+    def sep_line(cls):
+        click.secho("-------------------------------------------------")
+
     @classmethod
     def file(cls, message, **kwargs):
         click.secho(message, file=cls._file, **kwargs)
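The added sep_line classmethod simply prints a fixed separator via click. A tiny standalone sketch of the same behaviour is shown below; in fate_test this lives on the echo class, whereas here it is a plain function for illustration.

```python
# Standalone sketch of the separator helper added to the echo class above.
import click


def sep_line():
    click.secho("-------------------------------------------------")


sep_line()
click.secho("llmsuite job output would be printed between separators")
sep_line()
```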

0 commit comments
