
Commit 70865ce

remove unused settings and temporarily remove other model_implementations

1 parent: 73c3ce1


52 files changed: +2 additions, −5300 deletions

ci_scripts/train/ci_7B_sft.py

Lines changed: 0 additions & 2 deletions

@@ -101,14 +101,12 @@
 model = dict(
     checkpoint=False,
     num_attention_heads=NUM_ATTENTION_HEAD,
-    embed_split_hidden=True,
     vocab_size=VOCAB_SIZE,
     embed_grad_scale=1,
     parallel_output=True,
     hidden_size=HIDDEN_SIZE,
     num_layers=NUM_LAYER,
     mlp_ratio=MLP_RATIO,
-    apply_post_layer_norm=False,
     dtype="torch.bfloat16",
     norm_type="rmsnorm",
     layer_norm_epsilon=1e-5,
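
For reference, the trimmed model block in these configs now reads roughly as in the sketch below. This is a minimal, hypothetical sketch, not taken from the diff: the constant values are placeholders (each real config defines its own VOCAB_SIZE, HIDDEN_SIZE, and so on earlier in the file), and it assumes the two removed flags, embed_split_hidden and apply_post_layer_norm, are now fixed inside the remaining model implementation rather than exposed as user-facing settings.

# Minimal sketch of the trimmed model config after this commit.
# All constant values below are hypothetical placeholders.
NUM_ATTENTION_HEAD = 32  # hypothetical placeholder
VOCAB_SIZE = 103168      # hypothetical placeholder
HIDDEN_SIZE = 4096       # hypothetical placeholder
NUM_LAYER = 32           # hypothetical placeholder
MLP_RATIO = 8 / 3        # hypothetical placeholder

model = dict(
    checkpoint=False,  # activation checkpointing: True/False/[0-1]
    num_attention_heads=NUM_ATTENTION_HEAD,
    vocab_size=VOCAB_SIZE,
    embed_grad_scale=1,
    parallel_output=True,
    hidden_size=HIDDEN_SIZE,
    num_layers=NUM_LAYER,
    mlp_ratio=MLP_RATIO,
    dtype="torch.bfloat16",
    norm_type="rmsnorm",
    layer_norm_epsilon=1e-5,
)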

configs/1.8B_MoE16_sft.py

Lines changed: 0 additions & 2 deletions

@@ -136,14 +136,12 @@
 model = dict(
     checkpoint=False,  # The proportion of layers for activation checkpointing; optional values are True/False/[0-1]
     num_attention_heads=NUM_ATTENTION_HEAD,
-    embed_split_hidden=True,
     vocab_size=VOCAB_SIZE,
     embed_grad_scale=1,
     parallel_output=False,
     hidden_size=HIDDEN_SIZE,
     num_layers=NUM_LAYER,
     mlp_ratio=MLP_RATIO,
-    apply_post_layer_norm=False,
     dtype="torch.bfloat16",  # Support: "torch.float16", "torch.half", "torch.bfloat16", "torch.float32", "torch.tf32"
     norm_type="rmsnorm",
     layer_norm_epsilon=1e-5,

configs/57B_qwen2_MoE.py

Lines changed: 0 additions & 226 deletions
This file was deleted.

configs/7B_MoE4_sft.py

Lines changed: 0 additions & 2 deletions

@@ -149,14 +149,12 @@
 model = dict(
     checkpoint=False,  # The proportion of layers for activation checkpointing; optional values are True/False/[0-1]
     num_attention_heads=NUM_ATTENTION_HEAD,
-    embed_split_hidden=True,
     vocab_size=VOCAB_SIZE,
     embed_grad_scale=1,
     parallel_output=True,
     hidden_size=HIDDEN_SIZE,
     num_layers=NUM_LAYER,
     mlp_ratio=MLP_RATIO,
-    apply_post_layer_norm=False,
     dtype="torch.bfloat16",  # Support: "torch.float16", "torch.half", "torch.bfloat16", "torch.float32", "torch.tf32"
     norm_type="rmsnorm",
     layer_norm_epsilon=1e-5,

0 commit comments