
Commit debf131

Jackmin801 authored and samsja committed
fix typos
1 parent 5e23acd commit debf131

3 files changed: +3 −3 lines changed


CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Development workflow

-This is the develpment workflow of prime intellect to build upon hivemind
+This is the development workflow of prime intellect to build upon hivemind

 ## Install dependencies

open_diloco/train_fsdp.py

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ def cast_str_to_list(cls, values: dict[str, Any]) -> dict[str, Any]:
 class Config(BaseConfig):
     path_model: str = "PrimeIntellect/llama-150m-fresh"
     torch_compile: bool = True
-    attn_implementation: str = "flash_attention_2"
+    attn_implementation: str = "sdpa"
     # Data
     dataset_name_or_path: str = "allenai/c4"
     seq_length: int = 1024
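For context, `attn_implementation` is the flag that Hugging Face `transformers` accepts when loading a model, so this change swaps the default from FlashAttention-2 to PyTorch's built-in scaled dot-product attention and drops the hard dependency on the `flash-attn` package. Below is a minimal sketch of how such a config value is typically consumed; the model name comes from the diff above, but the loading call itself is an assumption, not code from this commit:

```python
from transformers import AutoModelForCausalLM

# Assumption: train_fsdp.py forwards Config.attn_implementation to the
# Hugging Face loader. "sdpa" routes attention through
# torch.nn.functional.scaled_dot_product_attention, which ships with PyTorch;
# "flash_attention_2" additionally requires the flash-attn package.
model = AutoModelForCausalLM.from_pretrained(
    "PrimeIntellect/llama-150m-fresh",
    attn_implementation="sdpa",
)
```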

open_diloco/utils.py

Lines changed: 1 addition & 1 deletion
@@ -117,7 +117,7 @@ def get_compression_kwargs(hivemind_compression: str) -> dict:

 def found_inf_grad(optimizer: torch.optim.Optimizer, scaler: torch.cuda.amp.GradScaler) -> bool:
     """
-    this function check if the scaler has found inf grad for the optimizer. It does by looking up the optimzier state
+    this function check if the scaler has found inf grad for the optimizer. It does by looking up the optimizer state
     regsited inside the scaler. Code is mostly copied/inspired by the torch GradScaler codebase.
     """
     if not scaler._enabled:
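The docstring says the check works by looking up the optimizer state registered inside the scaler. A minimal sketch of how that lookup can be done, assuming GradScaler's private `_per_optimizer_states` bookkeeping; this mirrors PyTorch internals and is not the committed function body:

```python
import torch

def found_inf_grad_sketch(optimizer: torch.optim.Optimizer,
                          scaler: torch.cuda.amp.GradScaler) -> bool:
    # With scaling disabled there is no inf/NaN bookkeeping to consult.
    if not scaler._enabled:
        return False
    # Assumption: GradScaler keys per-optimizer state by id(optimizer); after
    # scaler.unscale_(optimizer), "found_inf_per_device" maps each device to a
    # tensor that is nonzero when an inf/NaN gradient was detected.
    state = scaler._per_optimizer_states[id(optimizer)]
    return sum(found.item() for found in state["found_inf_per_device"].values()) > 0
```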
