Commit 3356dd5

Pre-commit autofixes
1 parent 0c3120c commit 3356dd5

28 files changed, 42 additions and 47 deletions

.github/ISSUE_TEMPLATE/feature-request.yml

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@ body:
 description: |
 How can you contribute to this feature? For example, could you help by submitting a PR?
 validations:
-required: true
+required: true

README.md

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ dataset = load_dataset("HuggingFaceFW/fineweb-edu", name="sample-100BT", num_pro
 ## Training Recipes
 
 Here's an example of training a 340M FLA Transformer model with a LLaMA-like architecture from scratch on a 100BT subset of the Fineweb-edu corpus ~~in streaming mode~~. (Do not use streaming mode if you are concerned about resuming training.)
-
+
 > [!WARNING]
 > If the dataset is not downloaded beforehand, the streaming mode will attempt to fetch it from a remote server and download it on-the-fly, which can be highly unstable during training due to network issues.
 > For stable training, ensure the dataset is downloaded locally (see [**Dataset Preparation**](#dataset-preparation)). Otherwise, we assume you are only testing the new corpus.

configs/delta_net_1B.json

Lines changed: 1 addition & 1 deletion
@@ -26,4 +26,4 @@
 "use_gate": false,
 "use_output_norm": true,
 "use_short_conv": true
-}
+}

configs/delta_net_340M.json

Lines changed: 1 addition & 1 deletion
@@ -23,4 +23,4 @@
 "use_gate": false,
 "use_output_norm": true,
 "use_short_conv": true
-}
+}

configs/gated_deltanet_1B.json

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@
 "use_cache": true,
 "use_gate": true,
 "use_short_conv": true
-}
+}

configs/gated_deltanet_340M.json

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@
 "use_cache": true,
 "use_gate": true,
 "use_short_conv": true
-}
+}

configs/gla_340M.json

Lines changed: 1 addition & 1 deletion
@@ -21,4 +21,4 @@
 "use_gk": true,
 "use_gv": false,
 "vocab_size": 32000
-}
+}

configs/gla_7B.json

Lines changed: 1 addition & 1 deletion
@@ -22,4 +22,4 @@
 "use_gv": false,
 "use_output_gate": true,
 "use_short_conv": false
-}
+}

configs/gsa_340M.json

Lines changed: 1 addition & 1 deletion
@@ -26,4 +26,4 @@
 "use_output_gate": true,
 "use_rope": false,
 "use_short_conv": false
-}
+}

configs/hgrn2_340M.json

Lines changed: 1 addition & 1 deletion
@@ -17,4 +17,4 @@
 "tie_word_embeddings": false,
 "use_cache": true,
 "vocab_size": 32000
-}
+}
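
The removed and added lines in the hunks above look identical, which, together with the commit message, suggests whitespace-only fixes such as adding a missing final newline or stripping trailing spaces. The page does not show which hooks actually ran, so the following is only a minimal sketch of what such fixers typically do. The function names are hypothetical and this is not the implementation of any real pre-commit hook; it also assumes LF line endings.

```python
def fix_trailing_whitespace(text: str) -> str:
    """Strip spaces and tabs from the end of every line (assumes LF endings)."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def fix_end_of_file(text: str) -> str:
    """Make non-empty text end with exactly one newline.

    Text consisting only of newlines (or nothing) becomes the empty string.
    """
    stripped = text.rstrip("\n")
    return stripped + "\n" if stripped else ""


def autofix(text: str) -> str:
    """Apply both hypothetical fixers in sequence."""
    return fix_end_of_file(fix_trailing_whitespace(text))
```

Applied to a config whose last line is `}` with no newline, `fix_end_of_file` returns the same content with a single `\n` appended, which a diff viewer renders as a removed `}` and an added `}`.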
