Commit a75c289

Yunnglin authored and Jintao-Huang committed
fix evalscope config (#5899)
1 parent 0e9a394 commit a75c289

File tree

4 files changed: +9 lines, -7 lines


docs/source/Instruction/评测.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ SWIFT的eval能力使用了魔搭社区[评测框架EvalScope](https://github.co
 
 目前我们支持了**标准评测集**的评测流程,以及**用户自定义**评测集的评测流程。其中**标准评测集**由三个评测后端提供支持:
 
-下面展示所支持的数据集名称,若需了解数据集的详细信息,请参考[所有支持的数据集](https://evalscope.readthedocs.io/zh-cn/latest/get_started/supported_dataset.html)
+下面展示所支持的数据集名称,若需了解数据集的详细信息,请参考[所有支持的数据集](https://evalscope.readthedocs.io/zh-cn/latest/get_started/supported_dataset/index.html)
 
 1. Native(默认):
 

docs/source_en/Instruction/Evaluation.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ SWIFT's eval capability utilizes the EvalScope evaluation framework from the Mag
 
 Currently, we support the evaluation process of **standard evaluation datasets** as well as the evaluation process of **user-defined** evaluation datasets. The **standard evaluation datasets** are supported by three evaluation backends:
 
-Below are the names of the supported datasets. For detailed information on the datasets, please refer to [all supported datasets](https://evalscope.readthedocs.io/en/latest/get_started/supported_dataset.html).
+Below are the names of the supported datasets. For detailed information on the datasets, please refer to [all supported datasets](https://evalscope.readthedocs.io/en/latest/get_started/supported_dataset/index.html).
 
 1. Native (default):
 

swift/llm/eval/utils.py

Lines changed: 2 additions & 1 deletion
@@ -99,9 +99,10 @@ def collect_model_arg(name: str) -> Optional[Any]:
         # Extract required model parameters
         self.model = collect_model_arg('model')  # model path or identifier
         self.template = collect_model_arg('template')  # conversation template
+        self.max_batch_size = collect_model_arg('max_batch_size')  # maximum batch size
 
         # Initialize the inference engine with batch support
-        self.engine = PtEngine.from_model_template(self.model, self.template, max_batch_size=self.config.batch_size)
+        self.engine = PtEngine.from_model_template(self.model, self.template, max_batch_size=self.max_batch_size)
 
     def generate(
         self,
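
For context: the fix makes the custom eval model read max_batch_size from the same per-model argument mapping as model and template, instead of the unrelated self.config.batch_size. Below is a minimal sketch of that collection pattern; the build_engine_kwargs helper and the plain-dict argument layout are illustrative assumptions, not the actual SWIFT/EvalScope API.

from typing import Any, Dict, Optional

def build_engine_kwargs(model_args: Dict[str, Any]) -> Dict[str, Any]:
    # Mirror collect_model_arg: read each parameter from the caller-supplied
    # model arguments rather than from an unrelated config attribute.
    def collect_model_arg(name: str) -> Optional[Any]:
        return model_args.get(name)

    return dict(
        model=collect_model_arg('model'),                    # model path or identifier
        template=collect_model_arg('template'),              # conversation template
        max_batch_size=collect_model_arg('max_batch_size'),  # forwarded to the engine
    )

# With the fix, the batch size supplied by the caller (e.g. the trainer) is the
# value the inference engine actually receives (placeholder paths/names below).
print(build_engine_kwargs({'model': '/path/to/ckpt', 'template': 'qwen', 'max_batch_size': 8}))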

swift/trainers/mixin.py

Lines changed: 5 additions & 4 deletions
@@ -779,16 +779,17 @@ def _compute_acc(self, outputs, labels) -> None:
 
     @torch.no_grad()
     def _evalscope_eval(self):
-        from ..llm.eval.utils import EvalModel  # registry here
+        from ..llm.eval.utils import EvalModel
         from evalscope import TaskConfig, run_task
 
         self.model.eval()
-
+        # prepare task config
         task_config_kwargs = dict(
-            model=f'model-step{self.state.global_step}',
-            model_args=dict(
+            model=EvalModel(
+                model_name=f'model-step{self.state.global_step}',
                 model=self.model,
                 template=self.template,
+                max_batch_size=self.args.per_device_eval_batch_size,
             ),
             eval_type='swift_custom',
             datasets=self.args.eval_dataset,
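
Taken together, the trainer-side change passes a ready-made EvalModel instance (carrying the per-device eval batch size, which EvalModel forwards to PtEngine as max_batch_size) rather than a model name plus a separate model_args dict. A rough sketch of the resulting call path, using only names visible in this diff; the standalone wrapper function and its parameters are illustrative assumptions:

from evalscope import TaskConfig, run_task

from swift.llm.eval.utils import EvalModel

def evalscope_eval_sketch(model, template, eval_datasets, global_step, eval_batch_size):
    # Wrap the in-memory model so EvalScope's 'swift_custom' backend can drive it,
    # forwarding the per-device eval batch size as max_batch_size.
    eval_model = EvalModel(
        model_name=f'model-step{global_step}',
        model=model,
        template=template,
        max_batch_size=eval_batch_size,
    )
    task_config = TaskConfig(
        model=eval_model,
        eval_type='swift_custom',
        datasets=eval_datasets,
    )
    return run_task(task_config)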

0 commit comments
