Bump transformers and torch #117
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
# Create a list of CustomKVCache instances, one per layer
self.kv_cache = torch.nn.ModuleList()
for _ in range(config.num_hidden_layers):
What happened here? Does config not exist anymore?
It still exists; it just feels more idiomatic to iterate over the actual layers.
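For context, here is a minimal, self-contained sketch of the pattern being suggested: iterating over the instantiated layers rather than `range(config.num_hidden_layers)`. The class names are toy stand-ins, not the actual classes in this PR:

```python
import torch.nn as nn


class ToyKVCache(nn.Module):
    """Toy stand-in for CustomKVCache."""

    def __init__(self, head_dim: int):
        super().__init__()
        self.head_dim = head_dim


class ToyDecoderLayer(nn.Module):
    """Toy stand-in for a decoder layer."""

    def __init__(self, head_dim: int):
        super().__init__()
        self.head_dim = head_dim


class ToyModel(nn.Module):
    def __init__(self, num_layers: int, head_dim: int):
        super().__init__()
        self.layers = nn.ModuleList(ToyDecoderLayer(head_dim) for _ in range(num_layers))
        # One cache per layer, driven by the layers themselves instead of a
        # config-derived count; equivalent as long as config and layers agree.
        self.kv_cache = nn.ModuleList(ToyKVCache(layer.head_dim) for layer in self.layers)


model = ToyModel(num_layers=4, head_dim=64)
print(len(model.kv_cache))  # 4
```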
self._temp_dir = None

def __del__(self):
    """Clean up temporary files when the model instance is destroyed."""
Shouldn't this already happen automatically?
Yeah, probably, but I added it just to be extra sure it's cleaned up between tests.
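For reference, a minimal sketch of this belt-and-braces cleanup pattern; the class and attribute usage here are illustrative, not the exact code in this PR:

```python
import tempfile


class ExportedModel:
    """Illustrative stand-in for the model wrapper that stages export artifacts."""

    def __init__(self):
        # Holds intermediate export files for the lifetime of the instance.
        self._temp_dir = tempfile.TemporaryDirectory()

    def __del__(self):
        """Clean up temporary files when the model instance is destroyed.

        TemporaryDirectory usually cleans itself up on garbage collection or
        interpreter exit; doing it explicitly here just guarantees the files
        are gone between tests.
        """
        if getattr(self, "_temp_dir", None) is not None:
            self._temp_dir.cleanup()
            self._temp_dir = None


model = ExportedModel()
del model  # __del__ runs here (CPython refcounting), removing the staged files
```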
n_heads=self.num_key_value_heads,
head_dim=self.head_dim,
max_batch_size=layer.max_batch_size,
max_context_length=layer.max_cache_len,
Wait, what is happening here? Is this the same as sliding_window_len?
Yeah, they removed sliding_window_len; it's now just max_cache_len:
https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L357
https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L265
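A minimal sketch of the mapping this hunk relies on, assuming per-layer cache objects that expose the attributes shown in the diff (`max_batch_size`, `max_cache_len`); the dataclasses are hypothetical stand-ins, not the real transformers or ExecuTorch classes:

```python
from dataclasses import dataclass


@dataclass
class CacheLayer:
    """Stand-in for a per-layer cache after the transformers refactor: the
    geometry lives on each layer, and sliding-window layers simply report a
    smaller max_cache_len instead of a separate sliding_window_len."""

    max_batch_size: int
    max_cache_len: int


@dataclass
class CustomCacheArgs:
    """Stand-in mirroring the constructor kwargs in the hunk above."""

    n_heads: int
    head_dim: int
    max_batch_size: int
    max_context_length: int


def cache_args_for(layer: CacheLayer, n_heads: int, head_dim: int) -> CustomCacheArgs:
    # max_cache_len maps directly onto max_context_length, for both global
    # and sliding-window layers.
    return CustomCacheArgs(
        n_heads=n_heads,
        head_dim=head_dim,
        max_batch_size=layer.max_batch_size,
        max_context_length=layer.max_cache_len,
    )


print(cache_args_for(CacheLayer(max_batch_size=1, max_cache_len=512), n_heads=8, head_dim=64))
```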
Summary
Pin bumps
- torch: 20250601 nightly pin
- transformers: 4.54.1

Code changes
Includes changes to absorb the huggingface/transformers#39106 KV cache refactor introduced by the transformers upgrade, which now specifies KV cache attributes per layer.
cache_config is also no longer a CacheConfig instance but a dict after this PR, so we switch to using .get() (minimal sketch below).
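A minimal sketch of that access-pattern change; the key names and defaults are illustrative:

```python
# Before: attribute access on a CacheConfig-like object.
#     max_cache_len = cache_config.max_cache_len

# After this PR: cache_config is a plain dict, so use .get() with defaults
# to tolerate keys that callers may not have set.
cache_config = {"batch_size": 1, "max_cache_len": 1024}  # illustrative keys
max_cache_len = cache_config.get("max_cache_len", 4096)
batch_size = cache_config.get("batch_size", 1)
print(batch_size, max_cache_len)  # 1 1024
```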
Infra changes
Remove Mac tests; see #122 for more details. This also lets us iterate faster by cutting unnecessary CI: there is no need to run export tests on Mac when the Linux tests already cover them. Mac tests with larger runners are enabled reciprocally for major LLM models in ExecuTorch in pytorch/executorch#13400.
Known failures