Skip to content

Commit efb5253

Browse files
author
root
committed
readme美观
1 parent 97240d0 commit efb5253

File tree

1 file changed

+18
-5
lines changed

1 file changed

+18
-5
lines changed

README.md

Lines changed: 18 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -4,19 +4,26 @@
44
https://github.com/microsoft/agent-lightning
55
源码改动:
66
注释掉agentlightning/runner.py 115行
7+
```
78
if trace_spans:
89
triplets = self.triplet_exporter.export(trace_spans)
10+
```
911
agentlightning/verl/daemon.py 338行
12+
```
1013
trace_list = [
1114
{"prompt_ids": t.prompt.get("token_ids", []), "response_ids": t.response.get("token_ids", []), "reward": t.reward}
1215
for t in rollout.triplets
1316
]
17+
```
1418
agentlightning/verl/daemon.py 418行
1519
注释掉
20+
```
1621
reward_list.append(sample_info["reward"])
22+
```
1723
改为
24+
```
1825
reward_list.append(trace["reward"])
19-
26+
```
2027
添加examples/werewolf 实现
2128

2229
和agentscope(458e8eedc94bba89bc3e4c6756e35fb4defbc0ac,Sep 15, 2025)实现的一个中文狼人杀agent-rl训练的案例
@@ -27,6 +34,7 @@ https://github.com/af-74413592/agentscope
2734
需做如下改动:
2835
src/agentscope/model/_openai_model.py 371行
2936
改为
37+
```
3038
if choice.message.content:
3139
try:
3240
thinking_part = choice.message.content.split("<think>")[1].split("</think>")[0]
@@ -50,8 +58,9 @@ except:
5058
text=response.choices[0].message.content,
5159
),
5260
)
53-
61+
```
5462
处理过长的prompt:src/agentscope/model/_openai_model.py OpenAIChatModel 的__call__ 函数
63+
```
5564
conversations = [{"role":msg["role"], "content":msg["content"][0]['text'] if type(msg["content"]) == list else msg["content"]} for msg in messages]
5665
input_ids = self.tokenizer.apply_chat_template(
5766
conversations,
@@ -67,17 +76,21 @@ while len(input_ids) > 10000: (比maxlen稍微小一点)
6776
add_generation_prompt=True,
6877
tokenize=True,
6978
)
70-
79+
```
7180
verlv0.5.0 改动
7281

7382
注释掉 verl trainer/ppo/ray_trainer.py 415-418行
83+
```
7484
real_train_batch_size = config.data.train_batch_size * config.actor_rollout_ref.rollout.n
7585
assert real_train_batch_size % minimal_bsz == 0, (
7686
f"real_train_batch_size ({real_train_batch_size}) must be divisible by minimal possible batch size "
7787
f"({minimal_bsz})"
7888
)
79-
注释掉 verl trainer/ppo/ray_trainer.py 500 行 # assert config.data.train_batch_size >= config.actor_rollout_ref.actor.ppo_mini_batch_size
80-
89+
```
90+
注释掉 verl trainer/ppo/ray_trainer.py 500 行
91+
```
92+
assert config.data.train_batch_size >= config.actor_rollout_ref.actor.ppo_mini_batch_size
93+
```
8194

8295
####################################################################
8396

0 commit comments

Comments
 (0)