Merged
4 changes: 2 additions & 2 deletions docs/evaluation/index.md
@@ -9,7 +9,7 @@ We support many popular benchmarks and it's easy to add new in the future. The f
- [**Instruction following**](./instruction-following.md): e.g. [ifbench](./instruction-following.md#ifbench), [ifeval](./instruction-following.md#ifeval)
- [**Long-context**](./long-context.md): e.g. [ruler](./long-context.md#ruler), [mrcr](./long-context.md#mrcr)
- [**Tool-calling**](./tool-calling.md): e.g. [bfcl_v3](./tool-calling.md#bfcl_v3)
-- [**Multilingual**](./multilingual.md): e.g. [mmlu-prox](./multilingual.md#mmlu-prox), [flores-200](./multilingual.md#FLORES-200), [wmt24pp](./multilingual.md#wmt24pp)
+- [**Multilingual**](./multilingual.md): e.g. [mmlu-prox](./multilingual.md#mmlu-prox), [flores-200](./multilingual.md#flores-200), [wmt24pp](./multilingual.md#wmt24pp)
- [**Speech & Audio**](./speech-audio.md): e.g. [asr-leaderboard](./speech-audio.md#asr-leaderboard), [mmau-pro](./speech-audio.md#mmau-pro)

See [nemo_skills/dataset](https://github.com/NVIDIA-NeMo/Skills/blob/main/nemo_skills/dataset) where each folder is a benchmark we support.
@@ -177,7 +177,7 @@ code execution timeout for scicode benchmark
!!! tip "Passing Main Arguments with Config Files"

For parameters that are difficult to escape on the command line (like `end_reasoning_string='</think>'`),
-you can use YAML config files instead. See [Passing Main Arguments with Config Files](../pipelines/index.md###passing-main-arguments-with-config-files) for details.
+you can use YAML config files instead. See [Passing Main Arguments with Config Files](../pipelines/index.md#passing-main-arguments-with-config-files) for details.
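
Reviewer note: the tip calls `end_reasoning_string='</think>'` difficult to escape on the command line. A minimal sketch of why, using only Python's standard `shlex` module to mimic POSIX shell word splitting. The variable names are illustrative, and the argument is copied from the tip as written, so any prefix the real CLI expects is not shown.

```python
import shlex

# The argument exactly as it is written in the tip.
raw_arg = "end_reasoning_string='</think>'"

# Typed as-is, a POSIX shell consumes the single quotes before the
# program ever sees them.
naive = shlex.split(raw_arg)
print(naive)

# To deliver the quotes intact, the whole argument needs another
# layer of quoting, which is what shlex.quote produces.
shell_safe = shlex.quote(raw_arg)
print(shell_safe)

# Round trip: splitting the quoted form recovers the original argument.
assert shlex.split(shell_safe) == [raw_arg]
```

The quoted form is noticeably harder to read and type than the original, which is the situation a config file avoids.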


## Using data on cluster
3 changes: 1 addition & 2 deletions docs/index.md
@@ -21,7 +21,7 @@ Here are some of the features we support:
- [**Instruction following**](./evaluation/instruction-following.md): e.g. [ifbench](./evaluation/instruction-following.md#ifbench), [ifeval](./evaluation/instruction-following.md#ifeval)
- [**Long-context**](./evaluation/long-context.md): e.g. [ruler](./evaluation/long-context.md#ruler), [mrcr](./evaluation/long-context.md#mrcr)
- [**Tool-calling**](./evaluation/tool-calling.md): e.g. [bfcl_v3](./evaluation/tool-calling.md#bfcl_v3)
-- [**Multilingual capabilities**](./evaluation/multilingual.md): e.g. [mmlu-prox](./evaluation/multilingual.md#mmlu-prox), [flores-200](./evaluation/multilingual.md#FLORES-200), [wmt24pp](./evaluation/multilingual.md#wmt24pp)
+- [**Multilingual capabilities**](./evaluation/multilingual.md): e.g. [mmlu-prox](./evaluation/multilingual.md#mmlu-prox), [flores-200](./evaluation/multilingual.md#flores-200), [wmt24pp](./evaluation/multilingual.md#wmt24pp)
- [**Speech & Audio**](./evaluation/speech-audio.md): e.g. [asr-leaderboard](./evaluation/speech-audio.md#asr-leaderboard), [mmau-pro](./evaluation/speech-audio.md#mmau-pro)
- [**Robustness evaluation**](./evaluation/robustness.md): Evaluate model sensitivity against changes in prompt.
- Easily parallelize each evaluation across many Slurm jobs, self-host LLM judges, bring your own prompts or change benchmark configuration in any other way.
@@ -36,4 +36,3 @@ You can find more examples of how to use Nemo-Skills in the [tutorials](./tutori
We've built and released many popular models and datasets using Nemo-Skills. See all of them in the [Papers & Releases](./releases/index.md) documentation.

We support many popular benchmarks and it's easy to add new in the future. The following categories of benchmarks are supported

4 changes: 1 addition & 3 deletions docs/pipelines/generation.md
@@ -98,7 +98,7 @@ See [nemo_skills/inference/generate.py](https://github.com/NVIDIA-NeMo/Skills/bl
!!! tip "Passing Main Arguments with Config Files"

For parameters that are difficult to escape on the command line (like `end_reasoning_string='</think>'`),
-you can use YAML config files instead. See [Passing Main Arguments with Config Files](index.md###passing-main-arguments-with-config-files) for details.
+you can use YAML config files instead. See [Passing Main Arguments with Config Files](index.md#passing-main-arguments-with-config-files) for details.


## Sampling multiple generations
@@ -470,5 +470,3 @@ We support three methods for automatic trimming of generation budget or context:
++server.enable_soft_fail=True
++server.context_limit_retry_strategy=reduce_prompt_from_end
```


2 changes: 1 addition & 1 deletion docs/pipelines/start-server.md
@@ -64,7 +64,7 @@ Similarly, the local port for the sandbox server can be changed using `--sandbox

## Using the Server

-To use this started server in [Evaluation](/Skills/pipelines/evaluation/) or [Generation](/Skills/pipelines/generation/),
+To use this started server in [Evaluation](evaluation.md) or [Generation](generation.md),
all the model-related arguments can now be replaced with `--server_type=openai` and `server_address` arguments.

For instance, for the vLLM model server above, the `eval` pipeline arguments can be modified as,
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -1,3 +1,4 @@
+strict: true
site_name: Nemo-Skills
site_url: https://nvidia-nemo.github.io/Skills
extra_css:
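
Reviewer note: the link fixes in this change fall into three patterns: an anchor written with repeated `#` characters, an anchor with uppercase letters, and an absolute site path used where a relative `.md` path works. A minimal sketch of a standalone check for those patterns follows. It is not part of this PR or of MkDocs; the function name and regex are hypothetical, and it assumes heading anchors are lowercase slugs, which is the default behavior but can be configured otherwise.

```python
import re

# Inline markdown links of the form [text](target); hypothetical helper regex.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def suspicious_links(markdown_text):
    """Return link targets matching the three patterns fixed in this change."""
    flagged = []
    for target in LINK_RE.findall(markdown_text):
        # External URLs are out of scope for this check.
        if target.startswith(("http://", "https://")):
            continue
        path, sep, fragment = target.partition("#")
        repeated_hash = sep and fragment.startswith("#")
        uppercase_anchor = sep and fragment != fragment.lower()
        absolute_path = target.startswith("/")
        if repeated_hash or uppercase_anchor or absolute_path:
            flagged.append(target)
    return flagged


# Lines taken from the removed and added sides of the hunks above.
old_lines = (
    "[flores-200](./multilingual.md#FLORES-200) "
    "[Passing Main Arguments with Config Files]"
    "(index.md###passing-main-arguments-with-config-files) "
    "[Evaluation](/Skills/pipelines/evaluation/)"
)
new_lines = (
    "[flores-200](./multilingual.md#flores-200) "
    "[Passing Main Arguments with Config Files]"
    "(index.md#passing-main-arguments-with-config-files) "
    "[Evaluation](evaluation.md)"
)

print(suspicious_links(old_lines))
print(suspicious_links(new_lines))
```

Such a check would only complement the `strict: true` setting added here, which makes MkDocs abort the build when a warning is raised.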