Seeking help: I attached a small add-on network to ChatGLM, but the trained model has problems at inference time #336
Replies: 1 comment
-
Please raise this in #253 instead.
-
The custom network is as follows:
```python
import torch.nn as nn
from transformers import AutoTokenizer, AutoConfig, AutoModel

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True)
config.pre_seq_len = 10
config.prefix_projection = True

class ImageGLM(nn.Module):
    def __init__(self, MODEL_PATH, config):
        super().__init__()
        self.GLM = AutoModel.from_pretrained(MODEL_PATH, config=config, trust_remote_code=True)
        # for param in self.GLM.parameters():
        #     param.requires_grad = False
        self.layer0 = nn.Linear(4096, 65024)
        self.layer1 = nn.Linear(65024, 1024)
        self.layer2 = nn.Linear(1024, 65024)
```
trainable params: 1765767088 || all params: 8009351088 || trainable%: 22.04631896640863
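For reference, the three added `nn.Linear` layers account for only part of that ~1.77B trainable count; the remainder presumably comes from the prefix encoder enabled by `prefix_projection = True`. A minimal sketch (plain Python, layer sizes copied from `layer0`/`layer1`/`layer2` above) of how the added layers' share works out:

```python
# An nn.Linear(in_f, out_f) holds in_f * out_f weights plus out_f biases.
def linear_params(in_f, out_f):
    return in_f * out_f + out_f

# Sizes copied from layer0, layer1, layer2 in the class above.
added = (linear_params(4096, 65024)
         + linear_params(65024, 1024)
         + linear_params(1024, 65024))
print(added)  # 399638528, i.e. ~0.4B of the ~1.77B trainable parameters
```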
During training I simply call model(**inputs); after training, the output of model(**test) is unsatisfactory. How should I handle this?