diff --git a/docs/evaluation/index.md b/docs/evaluation/index.md
index b63943e869..bb35c4d52b 100644
--- a/docs/evaluation/index.md
+++ b/docs/evaluation/index.md
@@ -9,7 +9,7 @@ We support many popular benchmarks and it's easy to add new in the future. The f
 - [**Instruction following**](./instruction-following.md): e.g. [ifbench](./instruction-following.md#ifbench), [ifeval](./instruction-following.md#ifeval)
 - [**Long-context**](./long-context.md): e.g. [ruler](./long-context.md#ruler), [mrcr](./long-context.md#mrcr)
 - [**Tool-calling**](./tool-calling.md): e.g. [bfcl_v3](./tool-calling.md#bfcl_v3)
-- [**Multilingual**](./multilingual.md): e.g. [mmlu-prox](./multilingual.md#mmlu-prox), [flores-200](./multilingual.md#FLORES-200), [wmt24pp](./multilingual.md#wmt24pp)
+- [**Multilingual**](./multilingual.md): e.g. [mmlu-prox](./multilingual.md#mmlu-prox), [flores-200](./multilingual.md#flores-200), [wmt24pp](./multilingual.md#wmt24pp)
 - [**Speech & Audio**](./speech-audio.md): e.g. [asr-leaderboard](./speech-audio.md#asr-leaderboard), [mmau-pro](./speech-audio.md#mmau-pro)
 
 See [nemo_skills/dataset](https://github.com/NVIDIA-NeMo/Skills/blob/main/nemo_skills/dataset) where each folder is a benchmark we support.
@@ -177,7 +177,7 @@ code execution timeout for scicode benchmark
 
 !!! tip "Passing Main Arguments with Config Files"
     For parameters that are difficult to escape on the command line (like `end_reasoning_string=''`),
-    you can use YAML config files instead. See [Passing Main Arguments with Config Files](../pipelines/index.md###passing-main-arguments-with-config-files) for details.
+    you can use YAML config files instead. See [Passing Main Arguments with Config Files](../pipelines/index.md#passing-main-arguments-with-config-files) for details.
 
 ## Using data on cluster
 
diff --git a/docs/index.md b/docs/index.md
index ce53e256c3..a17dce8f81 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -21,7 +21,7 @@ Here are some of the features we support:
  - [**Instruction following**](./evaluation/instruction-following.md): e.g. [ifbench](./evaluation/instruction-following.md#ifbench), [ifeval](./evaluation/instruction-following.md#ifeval)
  - [**Long-context**](./evaluation/long-context.md): e.g. [ruler](./evaluation/long-context.md#ruler), [mrcr](./evaluation/long-context.md#mrcr)
  - [**Tool-calling**](./evaluation/tool-calling.md): e.g. [bfcl_v3](./evaluation/tool-calling.md#bfcl_v3)
- - [**Multilingual capabilities**](./evaluation/multilingual.md): e.g. [mmlu-prox](./evaluation/multilingual.md#mmlu-prox), [flores-200](./evaluation/multilingual.md#FLORES-200), [wmt24pp](./evaluation/multilingual.md#wmt24pp)
+ - [**Multilingual capabilities**](./evaluation/multilingual.md): e.g. [mmlu-prox](./evaluation/multilingual.md#mmlu-prox), [flores-200](./evaluation/multilingual.md#flores-200), [wmt24pp](./evaluation/multilingual.md#wmt24pp)
  - [**Speech & Audio**](./evaluation/speech-audio.md): e.g. [asr-leaderboard](./evaluation/speech-audio.md#asr-leaderboard), [mmau-pro](./evaluation/speech-audio.md#mmau-pro)
  - [**Robustness evaluation**](./evaluation/robustness.md): Evaluate model sensitvity against changes in prompt.
 - Easily parallelize each evaluation across many Slurm jobs, self-host LLM judges, bring your own prompts or change benchmark configuration in any other way.
@@ -36,4 +36,3 @@ You can find more examples of how to use Nemo-Skills in the [tutorials](./tutori
 We've built and released many popular models and datasets using Nemo-Skills. See all of them in the [Papers & Releases](./releases/index.md) documentation.
 We support many popular benchmarks and it's easy to add new in the future.
 The following categories of benchmarks are supported
-
diff --git a/docs/pipelines/generation.md b/docs/pipelines/generation.md
index dc299ee9dd..058c176e6c 100644
--- a/docs/pipelines/generation.md
+++ b/docs/pipelines/generation.md
@@ -98,7 +98,7 @@ See [nemo_skills/inference/generate.py](https://github.com/NVIDIA-NeMo/Skills/bl
 
 !!! tip "Passing Main Arguments with Config Files"
     For parameters that are difficult to escape on the command line (like `end_reasoning_string=''`),
-    you can use YAML config files instead. See [Passing Main Arguments with Config Files](index.md###passing-main-arguments-with-config-files) for details.
+    you can use YAML config files instead. See [Passing Main Arguments with Config Files](index.md#passing-main-arguments-with-config-files) for details.
 
 ## Sampling multiple generations
 
@@ -470,5 +470,3 @@ We support three methods for automatic trimming of generation budget or context:
 ++server.enable_soft_fail=True
 ++server.context_limit_retry_strategy=reduce_prompt_from_end
 ```
-
-
diff --git a/docs/pipelines/start-server.md b/docs/pipelines/start-server.md
index 36d9534022..fd3474a24e 100644
--- a/docs/pipelines/start-server.md
+++ b/docs/pipelines/start-server.md
@@ -64,7 +64,7 @@ Similarly, the local port for the sandbox server can be changed using `--sandbox
 
 ## Using the Server
 
-To use this started server in [Evaluation](/Skills/pipelines/evaluation/) or [Generation](/Skills/pipelines/generation/),
+To use this started server in [Evaluation](evaluation.md) or [Generation](generation.md),
 all the model-related arguments can now be replaced with `--server_type=openai` and `server_address` arguments.
 For instance, for the vLLM model server above, the `eval` pipeline arguments can be modified as,
 
diff --git a/mkdocs.yml b/mkdocs.yml
index 6ef479a929..2d6118732b 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,3 +1,4 @@
+strict: true
 site_name: Nemo-Skills
 site_url: https://nvidia-nemo.github.io/Skills
 extra_css: