Commit b134240

fix hard coding path
Signed-off-by: root <zhangyuekai@foxmail.com>
1 parent a4cc659 commit b134240

5 files changed, +7 -6 lines changed

examples/configs/audio_grpo_3B_megatron.yaml

Lines changed: 3 additions & 3 deletions
@@ -54,14 +54,14 @@ loss_fn:
   force_on_policy_ratio: false
 checkpointing:
   enabled: true
-  checkpoint_dir: results/audio_grpo_3B_megatron_rerun
+  checkpoint_dir: results/audio_grpo_3B_megatron
   metric_name: val:accuracy
   higher_is_better: true
   keep_top_k: 10
   save_period: 100
   checkpoint_must_save_by: null
 policy:
-  model_name: /workspace_yuekai/HF/Qwen2.5-Omni-3B
+  model_name: Qwen/Qwen2.5-Omni-3B
   tokenizer:
     name: ${policy.model_name}
   train_global_batch_size: 32
@@ -224,7 +224,7 @@ data:
     split: validation
   # default settings for all datasets
   default:
-    prompt_file: examples/prompts/avqa_cot.txt
+    prompt_file: null
     system_prompt_file: null
     processor: "vlm_hf_data_processor"
     env_name: "avqa"

examples/configs/sft_audio_lm_megatron.yaml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ checkpointing:
   save_period: 500

 policy:
-  model_name: "/workspace_yuekai/HF/Qwen2-Audio-7B"
+  model_name: "Qwen/Qwen2-Audio-7B"
   tokenizer:
     name: ${policy.model_name}
   train_global_batch_size: 32

examples/prompts/avqa_cot.txt

Lines changed: 0 additions & 1 deletion
This file was deleted.

nemo_rl/data/datasets/response_datasets/avqa.py

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ def __init__(self, split: str = "train", **kwargs):
                 f"Invalid split: {split}. Please use one of {VALID_SPLITS}."
             )

-        self.dataset = load_dataset("/workspace_yuekai/HF/avqa-processed", split=split)
+        self.dataset = load_dataset("gijs/avqa-processed", split=split)

         self.dataset = self.dataset.add_column(
             "task_name", [self.task_name] * len(self.dataset)
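
Note: the change above swaps a machine-specific directory for a Hugging Face Hub repo id. For readers who still want to point at a local copy, one possible approach (not part of this commit) is an environment-variable override that falls back to the Hub id. The sketch below is a minimal illustration under that assumption; the helper name `resolve_dataset_source` and the variable `AVQA_DATASET_PATH` are hypothetical and do not exist in the repository.

```python
import os


def resolve_dataset_source(
    default_repo_id: str, env_var: str = "AVQA_DATASET_PATH"
) -> str:
    """Return a local override if one is set, otherwise the Hub repo id.

    Both the function and the environment variable name are hypothetical,
    used here only to illustrate avoiding a hard-coded local path.
    """
    # An unset or whitespace-only variable counts as "no override".
    override = os.environ.get(env_var, "").strip()
    return override or default_repo_id


if __name__ == "__main__":
    # Prints the Hub id unless AVQA_DATASET_PATH is set in the environment.
    print(resolve_dataset_source("gijs/avqa-processed"))
```

The returned string could then be passed as the first argument to `load_dataset`, which accepts either a Hub repo id or a local path.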

nemo_rl/data/processors.py

Lines changed: 2 additions & 0 deletions
@@ -469,6 +469,8 @@ def vlm_hf_data_processor(
         datum_dict = format_geometry3k_dataset(datum_dict)
     elif datum_dict["task_name"] == "avqa":
         pass  # AVQA data is already formatted by AVQADataset.format_data
+    elif datum_dict["task_name"] == "aishell":
+        pass  # AISHELL data is already formatted by AishellDataset.format_data
     else:
         raise ValueError(f"No data processor for task {datum_dict['task_name']}")

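
Note: the hunk above adds a second pass-through branch for a task whose dataset class already formats its samples. As the number of such tasks grows, the same dispatch could be expressed with a set of pre-formatted task names plus a formatter registry. The following is a rough standalone sketch of that idea, not the repository's implementation; `PREFORMATTED_TASKS`, `FORMATTERS`, `format_by_task`, and the stand-in formatter are all hypothetical names.

```python
from typing import Any, Callable

# Tasks whose dataset classes already emit formatted samples
# (the two pass-through branches shown in the diff).
PREFORMATTED_TASKS = frozenset({"avqa", "aishell"})


def _format_example_task(datum: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for a per-task formatter."""
    return {**datum, "formatted": True}


# Hypothetical registry mapping task names to formatter callables.
FORMATTERS: dict[str, Callable[[dict[str, Any]], dict[str, Any]]] = {
    "example_task": _format_example_task,
}


def format_by_task(datum: dict[str, Any]) -> dict[str, Any]:
    """Apply the formatter for the datum's task, or pass it through unchanged."""
    task = datum["task_name"]
    if task in PREFORMATTED_TASKS:
        return datum  # already formatted upstream
    formatter = FORMATTERS.get(task)
    if formatter is None:
        # Same error message shape as the original else-branch.
        raise ValueError(f"No data processor for task {task}")
    return formatter(datum)
```

With this layout, supporting another pre-formatted dataset would mean adding one string to the set rather than another `elif` branch.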
