
Commit 0e9a394

[docs] update rejected_tools (#5878)
1 parent 54ddad3 commit 0e9a394


4 files changed: +5 -2 lines changed


docs/source/Customization/自定义数据集.md

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ alpaca format:
 
 > Note: RM additionally supports a margin column; see the [RM documentation](../Instruction/人类对齐.md#rm)
 
-Of course, you can also use `rejected_messages` directly instead of providing only `rejected_response`/`rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., multimodal/agent scenarios). In multimodal scenarios, if you use rejected_messages, you must additionally pass "rejected_images", "rejected_audios", "rejected_videos", and so on. An example of the data format follows
+Of course, you can also use `rejected_messages` directly instead of providing only `rejected_response`/`rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., multimodal/agent scenarios). If you use rejected_messages, then in multimodal scenarios you must additionally pass "rejected_images", "rejected_audios", "rejected_videos", and so on; in Agent scenarios you must additionally pass "rejected_tools", and so on. An example of the multimodal data format follows
 
 ```jsonl
 {"messages": [{"role": "user", "content": "<image>What is this"}, {"role": "assistant", "content": "This is a kitten."}], "images": ["cat.png"], "rejected_messages": [{"role": "user", "content": "<image>What is this"}, {"role": "assistant", "content": "This is a puppy."}], "rejected_images": ["cat.png"]}

docs/source_en/Customization/Custom-dataset.md

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ The format of multimodal data should follow the specifications in [Multimodal Da
 
 > Note: RM additionally supports the margin column. For details, refer to the [RM documentation](../Instruction/RLHF.md#rm).
 
-Sure, you can also directly use `rejected_messages` instead of only providing `rejected_response` / `rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., for multimodal or agent scenarios). In multimodal cases, if you use `rejected_messages`, you need to additionally provide fields such as `"rejected_images"`, `"rejected_audios"`, `"rejected_videos"`, etc. An example of the data format is as follows:
+Sure, you can also directly use `rejected_messages` instead of only providing `rejected_response` / `rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., for multimodal or agent scenarios). If you use "rejected_messages", then in multimodal scenarios you must also provide "rejected_images", "rejected_audios", "rejected_videos", etc.; in Agent scenarios you must also provide "rejected_tools", etc. An example of the multimodal data format is as follows:
 
 ```jsonl
 {"messages": [{"role": "user", "content": "<image>What is this?"}, {"role": "assistant", "content": "This is a kitten."}], "images": ["kitten.png"], "rejected_messages": [{"role": "user", "content": "<image>What is this?"}, {"role": "assistant", "content": "This is a puppy."}], "rejected_images": ["kitten.png"]}
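
The updated docs text gives a multimodal example only. For the Agent case, a row might be assembled as in the sketch below. It assumes that `rejected_tools` mirrors a chosen-side `tools` column and that each companion column (`images`, `audios`, `videos`, `tools`) needs a `rejected_` twin once `rejected_messages` is used. The helper `missing_rejected_fields`, the `get_weather` tool, its schema, and the message contents are all invented for illustration and are not taken from ms-swift.

```python
import json
from typing import List

# Companion columns named in the docs text; pairing each with a "rejected_"
# twin is this sketch's reading of that text, not ms-swift code.
COMPANION_FIELDS = ("images", "audios", "videos", "tools")


def missing_rejected_fields(record: dict) -> List[str]:
    """Return the `rejected_*` columns a row appears to lack (hypothetical helper)."""
    if "rejected_messages" not in record:
        return []  # rows using only rejected_response are outside this sketch
    return [
        f"rejected_{name}"
        for name in COMPANION_FIELDS
        if name in record and f"rejected_{name}" not in record
    ]


# Invented tool definition; the schema ms-swift actually expects may differ.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

agent_record = {
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"},
        {"role": "assistant", "content": "Let me check the forecast for Paris."},
    ],
    "tools": [weather_tool],
    "rejected_messages": [
        {"role": "user", "content": "What is the weather in Paris?"},
        {"role": "assistant", "content": "I cannot look that up."},
    ],
    "rejected_tools": [weather_tool],
}

# Serialize the row as one JSONL line, keeping non-ASCII text readable.
jsonl_line = json.dumps(agent_record, ensure_ascii=False)
print(missing_rejected_fields(agent_record))
```

Running the check before writing a dataset file would flag rows where `rejected_messages` is present but a companion column was forgotten.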

swift/llm/model/model/qwen.py

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,7 @@
 from transformers import AutoConfig, AutoTokenizer, BitsAndBytesConfig, PreTrainedTokenizerBase
 from transformers.dynamic_module_utils import get_class_from_dynamic_module
 from transformers.models.auto.tokenization_auto import get_tokenizer_config
+from transformers.utils.versions import require_version
 
 from swift.llm import TemplateType
 from swift.utils import get_device_count, get_dist_setting, get_env_args, get_logger
@@ -700,6 +701,7 @@ def get_model_tokenizer_qwen2_vl(*args, **kwargs):
         patch_get_input_embeddings(base_model.visual, 'patch_embed')
 
     from qwen_vl_utils import vision_process
+    require_version('qwen_vl_utils<0.0.12')
     global_vars = patch_qwen_vl_utils(vision_process)
     tokenizer.global_vars = global_vars  # In order to have different hashes for the template.
     return model, tokenizer
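
The added `require_version('qwen_vl_utils<0.0.12')` call pins an upper bound on the installed `qwen_vl_utils` release, so loading should fail early if a newer release is present. The sketch below only illustrates what the `<0.0.12` bound means numerically. It is a stdlib-only stand-in with hypothetical helper names, it handles plain dotted numeric releases only, and it is not how `transformers` implements the check.

```python
def parse_release(version: str) -> tuple:
    """Turn a dotted release string such as '0.0.11' into (0, 0, 11)."""
    return tuple(int(part) for part in version.split("."))


def is_below(installed: str, bound: str) -> bool:
    """True when `installed` is strictly older than `bound` (hypothetical helper)."""
    return parse_release(installed) < parse_release(bound)


print(is_below("0.0.11", "0.0.12"))  # True: satisfies '<0.0.12'
print(is_below("0.0.12", "0.0.12"))  # False: the bound itself is excluded
print(is_below("0.0.9", "0.0.12"))   # True: 9 < 12 when compared as integers
print("0.0.9" < "0.0.12")            # False: string comparison gets this wrong
```

The last two lines show why version bounds need numeric parsing rather than plain string comparison.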

swift/megatron/trainers/dpo_trainer.py

Lines changed: 1 addition & 0 deletions
@@ -57,6 +57,7 @@ def setup_model_and_optimizer(self, model_provider_func, model_type, *_args, **k
     def _forward_step_helper(model, inputs):
         args = get_args()
         if mpu.is_pipeline_first_stage():
+            assert args.padding_free, 'Currently `rlhf_type="dpo"` only supports padding_free.'
             micro_batch_size = 1  # use qkv_format 'thd'
             seq_length = inputs['input_ids'].shape[1]
             if args.sequence_parallel:
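
One plausible reading of the `micro_batch_size = 1  # use qkv_format 'thd'` context line is that, in padding-free mode, samples are concatenated into a single row and their boundaries are tracked by cumulative lengths instead of padding. The sketch below shows that packing idea in plain Python. The function name, the token ids, and the chosen/rejected pairing are assumptions made for illustration and are not taken from the trainer code.

```python
from itertools import accumulate
from typing import List, Tuple


def pack_padding_free(sequences: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate sequences and record their cumulative boundaries (hypothetical helper)."""
    packed = [token for seq in sequences for token in seq]
    # cu_seqlens[i] .. cu_seqlens[i + 1] delimits sequence i inside `packed`.
    cu_seqlens = [0] + list(accumulate(len(seq) for seq in sequences))
    return packed, cu_seqlens


chosen = [11, 12, 13]            # invented token ids
rejected = [21, 22, 23, 24, 25]  # invented token ids

packed, cu_seqlens = pack_padding_free([chosen, rejected])
batch = [packed]  # a single row, so the micro batch size is 1

print(packed)      # [11, 12, 13, 21, 22, 23, 24, 25]
print(cu_seqlens)  # [0, 3, 8]

# Each original sequence can be recovered from the boundaries alone.
recovered = [packed[start:end] for start, end in zip(cu_seqlens, cu_seqlens[1:])]
```

No padding tokens are introduced, which is why sequences of different lengths can share one row.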
