
Commit 9ab75fc

Tialo and y.korobko authored
fix typo (#39936)
* fix typo
* fix modular instead
* fix

---------

Co-authored-by: y.korobko <[email protected]>
1 parent 43b3f58 commit 9ab75fc

File tree

2 files changed: +2 -2 lines changed


src/transformers/models/gpt_oss/modeling_gpt_oss.py

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@ def __init__(self, config):
 
     def forward(self, hidden_states: torch.Tensor, router_indices=None, routing_weights=None) -> torch.Tensor:
         """
-        When training is is more efficient to just loop over the experts and compute the output for each expert
+        When training it is more efficient to just loop over the experts and compute the output for each expert
         as otherwise the memory would explode.
 
         For inference we can sacrifice some memory and compute the output for all experts at once. By repeating the inputs.
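
The corrected docstring describes two dispatch strategies in the experts' forward pass: during training, loop over the experts so memory stays bounded; at inference, repeat the inputs and run every expert at once. Below is a minimal sketch of that trade-off, assuming a toy TinyExperts module with one linear weight matrix per expert and dense routing weights over all experts; the class name, shapes, and parameterization are illustrative assumptions, not the actual GptOssExperts implementation.

import torch
import torch.nn as nn


class TinyExperts(nn.Module):
    """Toy expert bank illustrating the two dispatch strategies from the docstring."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        self.num_experts = num_experts
        # One (hidden, hidden) matrix per expert, stacked so the inference path
        # can run all experts in a single batched matmul.
        self.weight = nn.Parameter(torch.randn(num_experts, hidden_size, hidden_size) * 0.02)

    def forward(self, hidden_states: torch.Tensor, routing_weights: torch.Tensor) -> torch.Tensor:
        # hidden_states: (num_tokens, hidden); routing_weights: (num_tokens, num_experts)
        if self.training:
            # Training path: loop over the experts so intermediate activations are
            # materialized one expert at a time, keeping peak memory bounded.
            output = torch.zeros_like(hidden_states)
            for expert_idx in range(self.num_experts):
                expert_out = hidden_states @ self.weight[expert_idx]                # (T, H)
                output += routing_weights[:, expert_idx : expert_idx + 1] * expert_out
            return output
        # Inference path: repeat the inputs once per expert and compute every
        # expert's output at once, trading memory for a single batched matmul.
        repeated = hidden_states.unsqueeze(0).expand(self.num_experts, -1, -1)      # (E, T, H)
        all_expert_out = torch.bmm(repeated, self.weight)                           # (E, T, H)
        return (routing_weights.t().unsqueeze(-1) * all_expert_out).sum(dim=0)      # (T, H)


# Both paths produce the same result up to floating-point error; only the
# memory/compute trade-off differs.
layer = TinyExperts(hidden_size=8, num_experts=4)
tokens = torch.randn(5, 8)
weights = torch.softmax(torch.randn(5, 4), dim=-1)
layer.train()
out_loop = layer(tokens, weights)
layer.eval()
with torch.no_grad():
    out_batched = layer(tokens, weights)
print(torch.allclose(out_loop, out_batched, atol=1e-5))

The batched path materializes the hidden states once per expert, so it costs roughly num_experts times the activation memory, but it replaces the Python loop with a single torch.bmm call; that is the memory-for-speed trade the docstring points at.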

src/transformers/models/gpt_oss/modular_gpt_oss.py

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ def __init__(self, config):
 
     def forward(self, hidden_states: torch.Tensor, router_indices=None, routing_weights=None) -> torch.Tensor:
         """
-        When training is is more efficient to just loop over the experts and compute the output for each expert
+        When training it is more efficient to just loop over the experts and compute the output for each expert
         as otherwise the memory would explode.
 
         For inference we can sacrifice some memory and compute the output for all experts at once. By repeating the inputs.

0 commit comments
