Introduce GenerationConfig #10228
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10228
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit ef7d4ca with merge base f911567.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D73091676
  if (warmup) {
-   runner.warmup(prompt, seq_len);
+   runner.warmup(prompt, /*max_new_tokens=*/seq_len);
Would it be added in the internal runner as well?
which internal runner?
Summary: Started to implement #9341. Started to fix #8495. This PR introduces `GenerationConfig`, which holds the settings that can change between invocations of `generate()`. For example, `temperature` is moved out of the runner constructor because it is not tied to the runner instance and should be adjustable on every call to `generate()`. Similarly, `echo` and `warming` move into the config. Both `seq_len` and `max_new_tokens` can also be passed through the config; the effective value of `max_new_tokens` is determined from these two config values, the .pte file metadata, and the number of prompt tokens. Reviewed By: iseeyuan Differential Revision: D73091676
707bda8 to 74cfb7f
74cfb7f to 5ecf7b7
5ecf7b7 to 834fac2
834fac2 to 72cbdf1
72cbdf1 to 605ff4d
2fc8d51 to 163ccea
163ccea to e89ba89
e89ba89 to 0357334
0357334 to febbfa6
febbfa6 to 4009fda
4009fda to 9fe9659
9fe9659 to 4e038ea
4e038ea to 6004124
6004124 to ef7d4ca
Differential Revision: D73091676 Pull Request resolved: pytorch#10228
Summary:
Started to implement #9341
Started to fix #8495
This PR introduces `GenerationConfig`, which holds the settings that can change between invocations of `generate()`.
For example, `temperature` is moved out of the runner constructor because it is not tied to the runner instance and should be adjustable on every call to `generate()`.
Similarly, `echo` and `warming` move into the config.
Both `seq_len` and `max_new_tokens` can also be passed through the config; the effective value of `max_new_tokens` is determined from these two config values, the .pte file metadata, and the number of prompt tokens.
Differential Revision: D73091676
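To make the `max_new_tokens` resolution described above concrete, here is a minimal C++ sketch. It is not the code from this PR. The field names come from the summary, but the defaults, the `-1` "unset" sentinel, the free function `resolve_max_new_tokens`, and the `max_context_len` parameter (standing in for the limit read from the .pte metadata) are all assumptions made for illustration. The exact precedence rule is likewise a guess at one reasonable policy.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical per-call options. Field names follow the PR summary;
// the default values and the -1 "unset" sentinel are assumptions.
struct GenerationConfig {
  bool echo = true;             // whether to echo the prompt in the output
  bool warming = false;         // whether this call is a warmup run
  float temperature = 0.8f;     // sampling temperature for this call
  int32_t seq_len = -1;         // cap on prompt + generated tokens, -1 = unset
  int32_t max_new_tokens = -1;  // cap on generated tokens only, -1 = unset
};

// One plausible resolution rule (assumed, not taken from the PR):
// 1. start from the context limit stored in the model metadata,
// 2. tighten it with seq_len when seq_len is set,
// 3. subtract the tokens already used by the prompt,
// 4. tighten again with max_new_tokens when it is set,
// 5. never return a negative budget.
inline int32_t resolve_max_new_tokens(
    const GenerationConfig& config,
    int32_t max_context_len,
    int32_t num_prompt_tokens) {
  int32_t total = max_context_len;
  if (config.seq_len >= 0) {
    total = std::min(total, config.seq_len);
  }
  int32_t budget = total - num_prompt_tokens;
  if (config.max_new_tokens >= 0) {
    budget = std::min(budget, config.max_new_tokens);
  }
  return std::max<int32_t>(0, budget);
}
```

Under these assumptions, a caller can build a fresh `GenerationConfig` for each `generate()` call, for example changing `temperature` between calls, while the runner instance itself stays unchanged.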