Add readme doc for the paged programs script by spzala · Pull Request #132 · foundation-model-stack/aiu-fms-testing-utils

spzala · 2025-09-17T16:22:38Z

Add readme doc for the paged programs script.

spzala · 2025-09-17T16:23:36Z

JRosenkranz · 2025-09-17T17:19:10Z

scripts/README.md


+## How to run and validate paged programs
+
+The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. The script can run tests in a distributed environment, utilizing multiple instances for faster execution. To see the description of various command-line arguments that the script can parse, run it with `--help`. The following examples demonstrate the usage of the script.


We may also want to add an entirely separate README for this file that includes example output for the different test_types (tokens and metrics).

@JRosenkranz I was thinking of creating a new folder called drive_paged_program, moving drive_paged_programs.py into it, and adding the new dedicated README there. For now, I have just added a README with a specific name, README_drive_paged_program.md. A folder may give better clarity, but I'm not sure whether it would break anything that depends on the script path. Let me know your thoughts. Thanks!

JRosenkranz · 2025-09-17T17:20:52Z

scripts/README.md


+## How to run and validate paged programs
+
+The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. The script can run tests in a distributed environment, utilizing multiple instances for faster execution. To see the description of various command-line arguments that the script can parse, run it with `--help`. The following examples demonstrate the usage of the script.


We may also want to point out that you can skip CPU validation (`skip_validation`), which makes the script much faster, and that you can use `validation_info_outputs_dir` and `save_validation_info_outputs` to reuse saved CPU logits. Reusing them avoids recomputation and significantly reduces the script's run time.
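For the doc, the save-then-reuse idea could be illustrated with something like the sketch below. It is not the script's implementation: `load_or_compute`, the JSON file layout, and the `fake_cpu_reference` stand-in are all hypothetical, and only show why a second run with saved outputs can skip the expensive CPU pass.

```python
import json
import tempfile
from pathlib import Path


def load_or_compute(cache_dir, key, compute_fn):
    """Return (outputs, cache_hit) for `key`, computing and saving on a miss.

    Hypothetical helper: the real script's on-disk format is not shown here.
    """
    cache_file = Path(cache_dir) / f"{key}.json"
    if cache_file.exists():
        # Cache hit: reuse the saved reference outputs, no recompute.
        return json.loads(cache_file.read_text()), True
    result = compute_fn()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result, False


calls = []


def fake_cpu_reference():
    """Stand-in for the slow CPU reference computation."""
    calls.append(1)
    return [0.1, 0.2, 0.3]


with tempfile.TemporaryDirectory() as tmp:
    first, hit_first = load_or_compute(tmp, "prompt-0", fake_cpu_reference)
    second, hit_second = load_or_compute(tmp, "prompt-0", fake_cpu_reference)
```

The second call returns the saved values without invoking the reference computation again, which is the time saving the comment above describes.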

tharapalanivel

This is great, thanks for putting this doc together @spzala!

tharapalanivel · 2025-09-17T17:31:36Z

scripts/README.md

+
+```bash
+# Run with 4K context length
+VLLM_DT_MAX_BATCH_SIZE=4 VLLM_DT_MAX_CONTEXT_LEN=4096 HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub DT_DEEPRT_VERBOSE=-1 DTLOG_LEVEL=error torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py --max_new_tokens=8 --model_variant=ibm-granite/granite-3.3-8b-instruct --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json --test_type=tokens --distributed


Can we make the paths generic here please?

Yes. I kept the paths similar to the other script examples, but making them generic is a good idea. Will do.

tharapalanivel · 2025-09-17T17:33:58Z

scripts/README.md

+
+```bash
+# Run with 4K context length
+VLLM_DT_MAX_BATCH_SIZE=4 VLLM_DT_MAX_CONTEXT_LEN=4096 HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub DT_DEEPRT_VERBOSE=-1 DTLOG_LEVEL=error torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py --max_new_tokens=8 --model_variant=ibm-granite/granite-3.3-8b-instruct --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json --test_type=tokens --distributed


How do users get access to /home/senuser/models/fms-tests-dpp-programs/dpp-4k.json or does it get generated during the test?

It's a generated file, written to the provided path. I will clarify this in the doc.

tharapalanivel · 2025-09-17T17:35:15Z

scripts/drive_paged_programs.py

 parser.add_argument(
    "--program_criteria_json_path",
    type=str,
+    required=True,


Are we planning on providing a default program_criteria_json_path example in this repo?

Yes, this gets generated by default now, so I think we can set a default for this argument and remove `required=True`.
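A minimal sketch of what that change could look like, assuming a default path is agreed on. The value of `DEFAULT_PROGRAM_CRITERIA_JSON_PATH` and the help text below are placeholders, not the script's actual values.

```python
import argparse

# Hypothetical default location; the real default is for the maintainers to choose.
DEFAULT_PROGRAM_CRITERIA_JSON_PATH = "program_criteria.json"


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--program_criteria_json_path",
        type=str,
        # A default replaces required=True, so the flag becomes optional.
        default=DEFAULT_PROGRAM_CRITERIA_JSON_PATH,
        help="Path to the program criteria JSON file (placeholder help text).",
    )
    return parser


# Without the flag, the default is used; with it, the user's path wins.
default_args = build_parser().parse_args([])
custom_args = build_parser().parse_args(
    ["--program_criteria_json_path=/tmp/dpp.json"]
)
```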

spzala · 2025-09-17T20:04:10Z

@JRosenkranz @tharapalanivel thanks so much for the quick feedback. I will work on your suggestions.

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>

JRosenkranz · 2025-10-24T17:23:04Z

scripts/README_drive_paged_program.md

@@ -0,0 +1,76 @@
+The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. 
+
+It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.


We should mention that the script also supports a custom prompt file, and describe the format of that file. Currently, each line of the file is one sequence in the batch, so the batch size equals the number of lines in the file.
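As an illustration of that format, here is a small sketch of reading such a file into a batch. `read_prompt_batch` and the sample prompts are hypothetical and only mirror the one-line-per-sequence rule described above; the script's own loader may handle details such as blank lines differently.

```python
import tempfile
from pathlib import Path


def read_prompt_batch(path):
    """Read a prompt file where each line is one sequence in the batch."""
    with open(path, encoding="utf-8") as f:
        # splitlines() drops the line terminators and does not add an
        # extra empty entry for a trailing newline.
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "prompts.txt"
    prompt_file.write_text(
        "What is paged attention?\n"
        "Summarize this paragraph.\n"
        "Translate 'hello' to French.\n",
        encoding="utf-8",
    )
    batch = read_prompt_batch(prompt_file)

# The batch size is simply the number of lines in the file.
batch_size = len(batch)
```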

JRosenkranz · 2025-10-24T17:28:01Z

scripts/README_drive_paged_program.md

@@ -0,0 +1,76 @@
+The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. 
+
+It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.


We have since added a lot of support for the programs argument in the script, and we should have a separate set of examples showing how to use it. The current format is as follows:

`<program_id>:<batch_constraint>,<seq_len_constraint>`

`<program_id>` can be an int, `*`, or `?`. An int selects that exact program id. `*` selects all programs that match the batch_constraint and seq_len_constraint criteria. `?` selects one program that matches the batch_constraint and seq_len_constraint criteria.

`<batch_constraint>` can be an int or a conditional expression on the batch size. A bare int is treated as a `>=` expression on that value. Otherwise, the supported operators are `>`, `>=`, `<`, `<=`, and `==`, each followed by a value.

`<seq_len_constraint>` can be an int or a conditional expression on the sequence length. A bare int is treated as a `>=` expression on that value. Otherwise, the supported operators are `>`, `>=`, `<`, `<=`, and `==`, each followed by a value.

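To make the programs format concrete for the README, here is a rough sketch of how a spec string could be interpreted. It is not the script's parser: `ProgramSpec`, `parse_constraint`, and `parse_program_spec` are hypothetical names, and the sketch assumes both constraints are always present and that values are non-negative integers.

```python
import operator
import re
from typing import Callable, NamedTuple

# The comparison operators named in the format description.
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}
# Two-character operators come first so ">=" is not read as ">".
_CONSTRAINT_RE = re.compile(r"^(>=|<=|==|>|<)?(\d+)$")


class ProgramSpec(NamedTuple):
    program_id: str  # a decimal int as text, "*", or "?"
    batch_ok: Callable[[int], bool]
    seq_len_ok: Callable[[int], bool]


def parse_constraint(text):
    """Turn '4', '>=4', or '<4096' into a predicate on an int."""
    match = _CONSTRAINT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"invalid constraint: {text!r}")
    # A bare int defaults to a >= comparison.
    op = _OPS[match.group(1) or ">="]
    bound = int(match.group(2))
    return lambda value: op(value, bound)


def parse_program_spec(spec):
    """Parse '<program_id>:<batch_constraint>,<seq_len_constraint>'."""
    program_id, colon, constraints = spec.partition(":")
    program_id = program_id.strip()
    if not colon or (program_id not in ("*", "?") and not program_id.isdigit()):
        raise ValueError(f"invalid program spec: {spec!r}")
    batch_text, comma, seq_text = constraints.partition(",")
    if not comma:
        raise ValueError(f"expected two constraints in: {spec!r}")
    return ProgramSpec(
        program_id, parse_constraint(batch_text), parse_constraint(seq_text)
    )


# All programs with batch size >= 2 and sequence length <= 4096.
all_matching = parse_program_spec("*:2,<=4096")
# Program 7 with batch size exactly 4 and sequence length > 1024.
exact = parse_program_spec("7:==4,>1024")
```

In this sketch, `all_matching.batch_ok(2)` holds because the bare `2` is read as `>= 2`, while `exact.seq_len_ok(1024)` does not hold because the constraint is strictly greater than 1024.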
JRosenkranz · 2025-10-24T17:30:01Z

scripts/README_drive_paged_program.md

+The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. 
+
+It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.
+


We may also want to explain the `enforce_homogeneous_prompt_programs` parameter. It ensures that all sequences in a batch hit the same prefill program (by default, we only ensure that the largest prompt hits a specific prefill program).

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

matthew-pisano · 2026-02-12T21:39:30Z

@JRosenkranz @tharapalanivel Any further issues with this?

spzala marked this pull request as draft September 17, 2025 16:23

JRosenkranz reviewed Sep 17, 2025

tharapalanivel self-requested a review September 17, 2025 17:30

tharapalanivel requested changes Sep 17, 2025

spzala force-pushed the pagedprograms branch from 6559fdd to 3309773 Compare September 22, 2025 18:18

Add readme doc for the paged programs scripts

136839b

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>

spzala force-pushed the pagedprograms branch from 3309773 to 136839b Compare September 22, 2025 18:20

spzala marked this pull request as ready for review October 1, 2025 12:49

JRosenkranz reviewed Oct 24, 2025

Ssukriti and others added 2 commits November 3, 2025 16:23

add details of programs

6d878e5

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

Update README_drive_paged_program.md

e1f4621

Ssukriti marked this pull request as draft November 4, 2025 00:02

dataset formats

afa7b81

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

