Commit 4150cc8

Fix some typos (#447)
1 parent d2be0b7 commit 4150cc8

14 files changed: 23 additions and 23 deletions
benchmark/bench.py

Lines changed: 1 addition & 1 deletion
@@ -197,7 +197,7 @@ def prepare_configs(args, rank, current_time):
  )
  eval_taskset_config = config["buffer"]["explorer_input"]["eval_tasksets"]
  if len(eval_taskset_config) > 0:
- # TODO: support seperately set path for eval taskset
+ # TODO: support separately set path for eval taskset
  for eval_taskset_config in eval_taskset_config:
  eval_taskset_config["path"] = taskset_config["path"]
  if args.lr:

benchmark/config/guru_math-template.yaml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ buffer:
  format:
  prompt_key: question
  response_key: ground_truth
- system_prompt: "You are a helpful assistant. To answer a query from the user, please first thinks through the question step-by-step inside <think>...</think>, then provides the final response to user."
+ system_prompt: "You are a helpful assistant. To answer a query from the user, please first think through the question step-by-step inside <think>...</think>, then provides the final response to user."
  reply_prefix: "<think>"
  rollout_args:
  temperature: 1.0

examples/agentscope_frozenlake/agent.py

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ def get_prompt(self, observation: str) -> str:
  )
  if self.current_step > 0 and self.last_action is not None:
  if self.last_observation == observation:
- prompt += "\nYour last response is invalid. Your position didn't change at all. You may need to recheck your thinking process, action outputted, and the format of response. Remember, you should only output the NEXT ACTION at each interation in the ``` ```. For example, if you want to move up, you should output ```Up```."
+ prompt += "\nYour last response is invalid. Your position didn't change at all. You may need to recheck your thinking process, action outputted, and the format of response. Remember, you should only output the NEXT ACTION at each iteration in the ``` ```. For example, if you want to move up, you should output ```Up```."

  if self.max_steps is not None and self.max_steps - self.current_step > 0:
  prompt += (

examples/agentscope_frozenlake/utils.py

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@

  You will be provided the current observation, please decide on the next Action.
  You should show your thought process and then input the final action in ``` ```.
- You should only output the NEXT ACTION at each interation in the ``` ```. For example, if you want to move up, you should output ```Up```.
+ You should only output the NEXT ACTION at each iteration in the ``` ```. For example, if you want to move up, you should output ```Up```.
  You should plan ahead and need to achieve it in minimum number of steps.
  You should be aware that frozen tiles can be slippery, but the chance is small and you should not overthink it.

examples/grpo_alfworld/README.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ The SFT data should be named as `<TRINITY_SFT_DATASET_PATH>/data.json`, followin
  "messages": [
  {
  "role": "system", # fixed, align with the grpo workflow: alfworld_workflow.
- "content": "\nYou are an agent interacting with a virtual test-based environments.\n\n## Notes:\nAt each step, you should first think then perform action to fulfill the instruction. You should ALWAYS wrap your thinking with the tag and wrap your action with the tag.\nYou should ALWAYS take one action each step. \nYou should finish the task and buy the item within 15 steps.\nDONOT try to interact with the user at anytime. Finish the task and buy the item by yourself.\n\n## Action Format:\nBelow are the available commands you can use:\n look: look around your current location\n inventory: check your current inventory(you can only have 1 item in your inventory)\n go to (receptacle): move to a receptacle\n open (receptacle): open a receptacle\n close (receptacle): close a receptacle\n take (object) from (receptacle): take an object from a receptacle\n move (object) to (receptacle): place an object in or on a receptacle\n examine (something): examine a receptacle or an object\n use (object): use an object\n heat (object) with (receptacle): heat an object using a receptacle\n clean (object) with (receptacle): clean an object using a receptacle\n cool (object) with (receptacle): cool an object using a receptacle\n slice (object) with (object): slice an object using a sharp object\n\nFor example your output should be like this:\n To solve the task, I need first to ... go to cabinet 1\n"
+ "content": "\nYou are an agent interacting with a virtual text-based environment.\n\n## Notes:\nAt each step, you should first think then perform action to fulfill the instruction. You should ALWAYS wrap your thinking with the tag and wrap your action with the tag.\nYou should ALWAYS take one action each step. \nYou should finish the task and buy the item within 15 steps.\nDO NOT try to interact with the user at anytime. Finish the task and buy the item by yourself.\n\n## Action Format:\nBelow are the available commands you can use:\n look: look around your current location\n inventory: check your current inventory(you can only have 1 item in your inventory)\n go to (receptacle): move to a receptacle\n open (receptacle): open a receptacle\n close (receptacle): close a receptacle\n take (object) from (receptacle): take an object from a receptacle\n move (object) to (receptacle): place an object in or on a receptacle\n examine (something): examine a receptacle or an object\n use (object): use an object\n heat (object) with (receptacle): heat an object using a receptacle\n clean (object) with (receptacle): clean an object using a receptacle\n cool (object) with (receptacle): cool an object using a receptacle\n slice (object) with (object): slice an object using a sharp object\n\nFor example your output should be like this:\n To solve the task, I need first to ... go to cabinet 1\n"
  },
  {
  "role": "user",

examples/mix_chord/get_openr1_data.py

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@
  MAX_TOKEN_LENGTH = 8196
  SFT_SAMPLE_SIZE = 5000
  RL_SAMPLE_SIZE = 20000
- SYSTEM_PROMPT = """You are a helpful assistant that solves MATH problems. You should first thinks about the reasoning process in mind and then provides the user with the answer. You should present your reasoning process using the format: <think>\n ...your reasoning process here... </think>\n first. You should always include your final answer in \\boxed{} as closed-form results."""
+ SYSTEM_PROMPT = """You are a helpful assistant that solves MATH problems. You should first think about the reasoning process in mind and then provides the user with the answer. You should present your reasoning process using the format: <think>\n ...your reasoning process here... </think>\n first. You should always include your final answer in \\boxed{} as closed-form results."""


  def can_convert_to_int(answer):

tests/common/vllm_test.py

Lines changed: 1 addition & 1 deletion
@@ -470,7 +470,7 @@ async def test_api(self):

  You will be provided the current observation, please decide on the next Action.
  You should show your thought process and then input the final action in ``` ```.
- You should only output the NEXT ACTION at each interation in the ``` ```. For example, if you want to move up, you should output ```Up```.
+ You should only output the NEXT ACTION at each iteration in the ``` ```. For example, if you want to move up, you should output ```Up```.
  You should plan ahead and need to achieve it in minimum number of steps.
  You should be aware that frozen tiles can be slippery, but the chance is small and you should not overthink it.

trinity/common/rewards/dapo_reward.py

Lines changed: 3 additions & 3 deletions
@@ -53,11 +53,11 @@ def compute_overlong_penalty(self, response_token):
  ), "max_response_length must be greater than cache_length"

  response_len = len(response_token)
- excepted_len = self.max_response_length - self.cache_length
+ expected_len = self.max_response_length - self.cache_length

- if response_len < excepted_len:
+ if response_len < expected_len:
  return 0.0
  elif response_len > self.max_response_length:
  return -self.penalty_factor
  else:
- return (excepted_len - response_len) / self.cache_length * self.penalty_factor
+ return (expected_len - response_len) / self.cache_length * self.penalty_factor
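
Reviewer note: the rename in this hunk touches a piecewise length penalty, so here is a minimal standalone sketch of the same arithmetic as a reading aid. The free function `overlong_penalty` and its plain-argument signature are assumptions made for illustration; in the repository the logic is a method that reads `self.max_response_length`, `self.cache_length`, and `self.penalty_factor`.

```python
def overlong_penalty(
    response_len: int,
    max_response_length: int,
    cache_length: int,
    penalty_factor: float,
) -> float:
    """Hypothetical standalone version of the penalty shown in the hunk above."""
    assert (
        max_response_length > cache_length
    ), "max_response_length must be greater than cache_length"

    # Responses shorter than this length are not penalized.
    expected_len = max_response_length - cache_length

    if response_len < expected_len:
        return 0.0
    elif response_len > max_response_length:
        # Past the hard limit the penalty is capped at -penalty_factor.
        return -penalty_factor
    else:
        # Linear ramp from 0.0 down to -penalty_factor across the cache window.
        return (expected_len - response_len) / cache_length * penalty_factor


if __name__ == "__main__":
    # Illustrative values only: limit 1000 tokens, cache window 200, factor 1.0.
    for n in (100, 900, 1001):
        print(n, overlong_penalty(n, 1000, 200, 1.0))
```

With those illustrative values, a 900-token response sits halfway through the 200-token window and gets half of the maximum penalty.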

trinity/common/rewards/naive_dapo_score.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
  """
- This file contaims the naive dapo reward function for math tasks.
+ This file contains the naive dapo reward function for math tasks.
  Adapted from https://github.com/LLM360/Reasoning360/blob/main/verl/utils/reward_score/naive_dapo.py
  """

trinity/common/workflows/envs/alfworld/alfworld_workflow.py

Lines changed: 3 additions & 3 deletions
@@ -15,13 +15,13 @@
  Your task is to: put some spraybottle on toilet.

  Response:
- <think>To solve the task, I need to find and take a sparybottle, then put it on toilet. First I need to find a spraybottle. A spraybottle is more likely to appear in cabinet (1-4), countertop (1), toilet (1), sinkbasin (1-2), garbagecan (1). I can check one by one, starting with cabinet 1.</think><action>go to cabinet 1</action>
+ <think>To solve the task, I need to find and take a spraybottle, then put it on toilet. First I need to find a spraybottle. A spraybottle is more likely to appear in cabinet (1-4), countertop (1), toilet (1), sinkbasin (1-2), garbagecan (1). I can check one by one, starting with cabinet 1.</think><action>go to cabinet 1</action>

  Observation:
  On the cabinet 1, you see a cloth 1, a soapbar 1, a soapbottle 1.

  Response:
- <think>The sparybottle is not here, I should go to cabinet2 to check one by one</think><action>go to cabinet 2</action>
+ <think>The spraybottle is not here, I should go to cabinet2 to check one by one</think><action>go to cabinet 2</action>

  Observation:
  The cabinet 2 is closed.
@@ -49,7 +49,7 @@
  """

  AlfWORLD_SYSTEM_PROMPT = """
- You are an agent interacting with a virtual test-based environments.
+ You are an agent interacting with a virtual text-based environment.

  ## Notes:
  At each step, you should first think then perform action to fulfill the instruction. You should ALWAYS wrap your thinking with the <think> </think> tag and wrap your action with the <action> </action> tag.
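
Reviewer note: the prompt above asks the model to wrap its reasoning in `<think> </think>` and its command in `<action> </action>`. As a rough illustration of how such a response could be split apart, here is a small sketch; `parse_response` and its regular expression are hypothetical helpers written for this note and are not taken from the workflow code in this commit.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: one <think> block followed by one <action> block.
_RESPONSE_RE = re.compile(
    r"<think>(.*?)</think>\s*<action>(.*?)</action>", re.DOTALL
)


def parse_response(text: str) -> Optional[Tuple[str, str]]:
    """Return (thought, action), or None if the tags are missing or malformed."""
    match = _RESPONSE_RE.search(text)
    if match is None:
        return None
    thought, action = match.groups()
    return thought.strip(), action.strip()


if __name__ == "__main__":
    sample = (
        "<think>The spraybottle is not here, I should go to cabinet2 "
        "to check one by one</think><action>go to cabinet 2</action>"
    )
    print(parse_response(sample))
```

Returning `None` for malformed output leaves it to the caller to decide how to treat a response that ignores the format.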
