Structured Text Processing: A Full Pipeline for High-Quality SFT Data Generation and Enhanced Evaluation #552
Hongru0306
started this conversation in
project
Replies: 1 comment
Great!
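Project Background
Today, one of the biggest difficulties in fine-tuning LLMs with SFT is preparing the SFT data. For role-play dialogue, dialogue pairs can be extracted by using the character names that appear before and after each utterance.
For professional books, however, whether crawled in bulk or obtained from printed copies, most of the content is a listing of knowledge points or chapter-by-chapter logical structure, which makes it hard to derive high-quality SFT data. This poses a serious challenge for training vertical-domain LLMs.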
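For the dialogue case mentioned above, name-based extraction might look roughly like the sketch below. It is only an illustration, not this project's method: the `Name: utterance` line format, the function name `extract_dialogue_pairs`, and the `instruction`/`output` field names are all assumptions made for the example.

```python
import re

# Assumption: each utterance sits on its own line as "Name: text"
# (ASCII or full-width colon). Lines without a colon are treated as narration.
SPEAKER_LINE = re.compile(r"^\s*([^:：\s][^:：]{0,30}?)\s*[:：]\s*(.+?)\s*$")


def extract_dialogue_pairs(text, target_role):
    """Collect pairs where target_role answers a different speaker.

    Hypothetical helper: the output field names are illustrative only.
    """
    # Step 1: keep only the lines that look like "Name: utterance".
    turns = []
    for line in text.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))

    # Step 2: pair each target_role turn with the turn right before it.
    pairs = []
    for (prev_speaker, prev_utt), (speaker, utt) in zip(turns, turns[1:]):
        if speaker == target_role and prev_speaker != target_role:
            pairs.append({"instruction": prev_utt, "output": utt})
    return pairs


sample = """Alice: What is SFT?
Bob: Supervised fine-tuning.
(Bob nods)
Alice: Thanks.
Bob: Anytime."""

pairs = extract_dialogue_pairs(sample, "Bob")
```

Here `pairs` holds two records, one for each time Bob replies to Alice. Book-style text has no such speaker markers to anchor on, which is why the same trick does not carry over to the professional-book case.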
Project Goals
Project Challenges
Project Highlights
Project Roadmap
our repo:link