Add readme doc for the paged programs script #132
spzala wants to merge 4 commits into foundation-model-stack:main

cc @JRosenkranz
scripts/README.md
Outdated

## How to run and validate paged programs

The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant. It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`. The script can run tests in a distributed environment, utilizing multiple instances for faster execution. To see the description of various command-line arguments that the script can parse, run it with `--help`. The following examples demonstrate the usage of the script.

We may also want to add a separate README dedicated to this script, including example output for the different `test_type` values (`tokens` and `metrics`).

@JRosenkranz I was thinking of creating a new folder called drive_paged_program, moving drive_paged_programs.py into it, and adding the new dedicated README there. For now, I have just added a README with a specific name, README_drive_paged_program.md. A folder may give better clarity, but I'm not sure whether moving the script would break anything that depends on its path. Let me know your thoughts. Thanks!

scripts/README.md
Outdated

We may also want to point out that you can skip CPU validation (`skip_validation`), which makes the script much faster. You can also use `validation_info_outputs_dir` together with `save_validation_info_outputs` to reuse saved CPU logits, which avoids recomputing them and significantly reduces the script's run time.

tharapalanivel left a comment:

This is great, thanks for putting this doc together, @spzala!

scripts/README.md
Outdated

```bash
# Run with 4K context length
VLLM_DT_MAX_BATCH_SIZE=4 VLLM_DT_MAX_CONTEXT_LEN=4096 HF_HUB_CACHE=/home/senuser/models/huggingface_cache/hub DT_DEEPRT_VERBOSE=-1 DTLOG_LEVEL=error torchrun --nproc-per-node=4 /home/senuser/aiu-fms-testing-utils/scripts/drive_paged_programs.py --max_new_tokens=8 --model_variant=ibm-granite/granite-3.3-8b-instruct --program_criteria_json_path=/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json --dataset_path=/home/senuser/models/ShareGPT_V3_unfiltered_cleaned_split.json --test_type=tokens --distributed
```

Can we make the paths generic here, please?

Yes. I kept the paths similar to those in the other script examples, but making them generic is a good idea; will do.

scripts/README.md
Outdated

How do users get access to `/home/senuser/models/fms-tests-dpp-programs/dpp-4k.json`? Or does it get generated during the test?

It's a generated file, stored at the provided path. I will clarify that in the doc.

scripts/drive_paged_programs.py
Outdated

parser.add_argument(
    "--program_criteria_json_path",
    type=str,
    required=True,

Are we planning on providing a default `program_criteria_json_path` example in this repo?

Yes, this gets generated by default now, so I think we can set a default for it and turn off the `required` flag.
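
To make the suggestion concrete, here is a minimal sketch of the argument with a default and without `required=True`. The default value below is a hypothetical placeholder, since this thread does not say what the real default should be, and `build_parser` is only an illustrative wrapper, not a function from the script.

```python
import argparse

# Hypothetical placeholder: the thread does not name the actual default path.
DEFAULT_PROGRAM_CRITERIA_JSON_PATH = "program_criteria.json"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where --program_criteria_json_path is optional."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--program_criteria_json_path",
        type=str,
        default=DEFAULT_PROGRAM_CRITERIA_JSON_PATH,  # replaces required=True
        help="Path to the generated program criteria JSON file.",
    )
    return parser


# With no flag given, the default is used; an explicit flag overrides it.
default_args = build_parser().parse_args([])
explicit_args = build_parser().parse_args(
    ["--program_criteria_json_path=/path/to/criteria.json"]
)
print(default_args.program_criteria_json_path)
print(explicit_args.program_criteria_json_path)
```

With this shape, existing invocations that pass the flag keep working unchanged, and new users can omit it.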

@JRosenkranz @tharapalanivel thanks so much for the quick feedback. I will work on your suggestions.

Force-pushed from 6559fdd to 3309773
Signed-off-by: Sahdev Zala <spzala@us.ibm.com>
Force-pushed from 3309773 to 136839b

@@ -0,0 +1,76 @@
The [drive_paged_programs.py](https://github.com/foundation-model-stack/aiu-fms-testing-utils/blob/main/scripts/drive_paged_programs.py) is designed to run and validate paged programs using a specified model variant.

It supports different attention types, including `paged` and `paged_fp8`, with the default set to `paged`. The supported dataset types are `sharegpt` and `rag_factoid`, with the default set to `sharegpt`.

We should mention that the script also supports a custom prompt file, and document the format of that file. Currently, each line of the file is one sequence in the batch, so the batch size equals the number of lines in the file.
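
As an illustration of that file format, here is a minimal sketch of reading such a prompt file into a batch. The helper name `load_prompt_batch`, the file name, and the prompts are made up for the example, and keeping blank lines as sequences is an assumption; the script's own loader may behave differently.

```python
import tempfile
from pathlib import Path


def load_prompt_batch(prompt_file: str) -> list[str]:
    """Read a prompt file where each line is one sequence in the batch."""
    # Assumption: lines are kept as-is (including empty ones), since the
    # thread does not say how blank lines are handled.
    return Path(prompt_file).read_text(encoding="utf-8").splitlines()


# Demo with a throwaway file; the file name and prompts are made up.
with tempfile.TemporaryDirectory() as tmp_dir:
    prompt_path = Path(tmp_dir) / "prompts.txt"
    prompt_path.write_text(
        "What is paged attention?\nSummarize this document.\nHello!\n",
        encoding="utf-8",
    )
    batch = load_prompt_batch(str(prompt_path))

# The batch size is simply the number of lines in the file.
print(f"batch size: {len(batch)}")
```

A three-line file therefore yields a batch of three sequences.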

We have since added a lot of support for the `programs` argument in the script, and we should have a separate set of examples showing how to use it. The current format is:

<program_id>:<batch_constraint>,<seq_len_constraint>

<program_id> is an int, *, or ?. An int selects that exact program id. * selects all programs that match the batch_constraint and seq_len_constraint criteria. ? selects one program that matches the batch_constraint and seq_len_constraint criteria.

<batch_constraint> is an int or a conditional expression on the batch size. A bare int is treated as a >= expression. Otherwise, >, >=, <, <=, and == with a value are supported.

<seq_len_constraint> is an int or a conditional expression on the sequence length. A bare int is treated as a >= expression. Otherwise, >, >=, <, <=, and == with a value are supported.
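
To show how a spec in this format could be read, here is a small illustrative parser. It is a sketch of the rules described in this comment, not the script's actual parsing code: `Constraint`, `ProgramSpec`, `parse_constraint`, and `parse_program_spec` are hypothetical names, and requiring both constraints to be present is an assumption.

```python
import operator
import re
from typing import NamedTuple

# Comparison operators named in the comment above.
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}
# Two-character operators come first so ">=" is not read as ">".
_CONSTRAINT_RE = re.compile(r"(>=|<=|==|>|<)?\s*(\d+)")


class Constraint(NamedTuple):
    op: str
    value: int

    def matches(self, actual: int) -> bool:
        return _OPS[self.op](actual, self.value)


class ProgramSpec(NamedTuple):
    program_id: str  # an int as a string, "*" (all matches), or "?" (one match)
    batch: Constraint
    seq_len: Constraint


def parse_constraint(text: str) -> Constraint:
    match = _CONSTRAINT_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"invalid constraint: {text!r}")
    op, value = match.groups()
    # A bare int defaults to a ">=" expression.
    return Constraint(op or ">=", int(value))


def parse_program_spec(spec: str) -> ProgramSpec:
    program_id, colon, constraints = spec.partition(":")
    batch_text, comma, seq_len_text = constraints.partition(",")
    program_id = program_id.strip()
    # Assumption: both constraints are required in this sketch.
    if not colon or not comma:
        raise ValueError(f"expected <id>:<batch>,<seq_len>, got {spec!r}")
    if program_id not in ("*", "?") and not re.fullmatch(r"\d+", program_id):
        raise ValueError(f"invalid program id: {program_id!r}")
    return ProgramSpec(
        program_id, parse_constraint(batch_text), parse_constraint(seq_len_text)
    )


spec = parse_program_spec("*:2,>=256")
print(spec)
```

Under these rules, `*:2,>=256` reads as "all programs with batch size >= 2 and sequence length >= 256", while `7:==1,<512` reads as "program 7 with batch size exactly 1 and sequence length below 512".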

We may also want to explain the `enforce_homogeneous_prompt_programs` parameter. It ensures that all sequences in a batch hit the same prefill program (by default, we only ensure that the largest prompt hits a specific prefill program).

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

@JRosenkranz @tharapalanivel Any further issues with this?

Add readme doc for the paged programs script.