Commit 2f693cc

fix documentation
1 parent e994b3f commit 2f693cc

2 files changed: 12 additions, 11 deletions

examples/evaluation/README.md

Lines changed: 7 additions & 6 deletions
@@ -161,6 +161,8 @@ uv run torchrun --standalone --nnodes=1 --nproc-per-node=1 -m lcm.evaluation \
 --tasks lcm_generation \
 --task_args '{"max_gen_len": 200}' \
 --dataset.parquet_path parquet_dataset/cnn_dailymail \
+--dataset.source_column prompt_sentences_sonar_emb \
+--dataset.target_column answer_sentences_sonar_emb \
 --data_loading.batch_size 16 \
 --dump_dir output_results
 ```
@@ -178,13 +180,12 @@ Similar to LLM evaluation, it is possible to specify the prompt prefix and suffi
 | `data_loading.batch_size` | Loading and evaluate data in batch. By default `batch_size=10` |
 | `dataset_dir` | The directory consists of different JSONL files processed in Step 1. Only used in LLM evaluation
 | `dataset.parquet_path` | The parquet path consists of different Parquet files files processed in Step 1. Only used in LCM evaluation
-| `dataset.source_column` | The column in the data that refers to the input embedding. Not applicable when evaluating LLMs
-| `dataset.source_text_column` | The column in the data that refers to the input text. Not applicable when evaluating LCMs
-| `dataset.source_text_column` | The column in the data that refers to the input text. Not applicable when evaluating LCMs
-| `dataset.target_column` | The column in the data that refers to the ground-truth embedding. Not applicable when evaluating LLMs
-| `dataset.target_text_column` | The column in the data that refers to the ground-truth text. Not applicable when evaluating LCMs
+| `dataset.source_column` | The column in the data that refers to the input embedding. Not applicable when evaluating LLMs.
+| `dataset.source_text_column` | The column in the data that refers to the input text.
+| `dataset.target_column` | The column in the data that refers to the ground-truth embedding. Not applicable when evaluating LLMs.
+| `dataset.target_text_column` | The column in the data that refers to the ground-truth text.
 | `dataset.source_text_prefix` | The text that will prepended to each input text to make the prompt for the model.
-| `dataset.source_text_prefix` | The text that will appended after each input text to make the prompt for the model.
+| `dataset.source_text_suffix` | The text that will appended after each input text to make the prompt for the model.
 | `task_args` | The JSON-formatted string that represents the task arguments. See [task param list](#task_param_list) below.
 | `dump_dir` | The directory consisting output of the eval run. If successful, there should be a file `metrics.eval.jsonl` that consists of metric results, the directory `results` that capture the verbose command line used with the detailed output scores, and the directory `raw_results` that shows
 the model output for each individual sample, together with the per-sample metric results.
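
The corrected `dataset.source_text_prefix` and `dataset.source_text_suffix` rows describe text wrapped around each input to form the prompt. A minimal sketch of that idea follows. The helper `build_prompt` is hypothetical and not part of the repository, and plain concatenation without any inserted separator is an assumption; the actual evaluation code may join the pieces differently.

```python
def build_prompt(source_text: str, prefix: str = "", suffix: str = "") -> str:
    """Hypothetical helper: wrap one input text with a prefix and a suffix."""
    # Assumption: the pieces are concatenated as-is, so any separating
    # whitespace has to be part of the prefix or suffix itself.
    return f"{prefix}{source_text}{suffix}"


if __name__ == "__main__":
    prompt = build_prompt(
        "The cat sat on the mat.",
        prefix="Summarize the following text: ",
        suffix=" Summary:",
    )
    print(prompt)
```

Before this commit the table listed `dataset.source_text_prefix` twice, so the suffix option had no documented name.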

scripts/prepare_wikipedia.py

Lines changed: 5 additions & 5 deletions
@@ -85,12 +85,12 @@ def run(output_dir: Path):
     # launching config, here we use `local` to run locally, but you can switch it to `slurm` if you have a SLURM cluster.
     launcher = Launcher(
         cache=None,
-        # cluster="local",
+        cluster="local",
         # for SLURM you can set some parameters of the launcher here
-        cluster="slurm",
-        update_parameters={
-            "partition": "learn",
-        },
+        # cluster="slurm",
+        # update_parameters={
+        #     "partition": "learn",
+        # },
     )
 
     # launch the shards processing
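
The hunk above switches the default from SLURM to local execution by commenting lines in and out. One possible alternative, sketched below, selects the keyword arguments at runtime. The helper `launcher_kwargs` is hypothetical; the only `Launcher` arguments it uses are the three visible in the diff (`cache`, `cluster`, `update_parameters`), and nothing else about the `Launcher` signature is assumed.

```python
from typing import Any, Dict, Optional


def launcher_kwargs(
    cluster: str = "local", partition: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for the Launcher call in the diff."""
    kwargs: Dict[str, Any] = {"cache": None, "cluster": cluster}
    if cluster == "slurm":
        # The diff passes the SLURM partition through `update_parameters`.
        if partition is None:
            raise ValueError("a partition is required when cluster='slurm'")
        kwargs["update_parameters"] = {"partition": partition}
    return kwargs


if __name__ == "__main__":
    print(launcher_kwargs())                  # local run, the new default
    print(launcher_kwargs("slurm", "learn"))  # the configuration that was previously hard-coded
    # Inside the script this would be used as: launcher = Launcher(**launcher_kwargs())
```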
