| group name | description | rule / prompt |
|---|---|---|
| `default` | Rules for general text quality checks | `RuleColonEnd`, `RuleContentNull`, `RuleDocRepeat`, `RuleHtmlEntity`, `RuleIDCard`, `RuleNoPunc`, `RuleSpecialCharacter` |
| `sft` | Rules for checking SFT datasets | `RuleColonEnd`, `RuleContentNull`, `RuleDocRepeat`, `RuleHtmlEntity`, `RuleNoPunc`, `RuleSpecialCharacter`, `RuleLineStartWithBulletpoint` |
| `pretrain` | Rules for checking pretraining datasets | `RuleAlphaWords`, `RuleCapitalWords`, `RuleCharNumber`, `RuleColonEnd`, `RuleContentNull`, `RuleDocRepeat`, `RuleHtmlEntity`, `RuleIDCard`, `RuleLineEndWithEllipsis`, `RuleLineEndWithTerminal`, `RuleLineStartWithBulletpoint`, `RuleLineJavascriptCount`, `RuleLoremIpsum`, `RuleMeanWordLength`, `RuleNoPunc`, `RuleSentenceNumber`, `RuleSpecialCharacter`, `RuleStopWord`, `RuleSymbolWordRatio`, `RuleUniqueWords`, `RuleWordNumber` |
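
To make the differences between groups easier to see, the table can be transcribed into a plain Python mapping and compared programmatically. This is only an illustrative sketch: `RULE_GROUPS`, `rules_for`, and `only_in` are hypothetical names introduced here, not part of the library's API, and the mapping simply mirrors the rows above.

```python
# Hypothetical transcription of the table above; NOT the library's own data structure.
RULE_GROUPS = {
    "default": [
        "RuleColonEnd", "RuleContentNull", "RuleDocRepeat", "RuleHtmlEntity",
        "RuleIDCard", "RuleNoPunc", "RuleSpecialCharacter",
    ],
    "sft": [
        "RuleColonEnd", "RuleContentNull", "RuleDocRepeat", "RuleHtmlEntity",
        "RuleNoPunc", "RuleSpecialCharacter", "RuleLineStartWithBulletpoint",
    ],
    "pretrain": [
        "RuleAlphaWords", "RuleCapitalWords", "RuleCharNumber", "RuleColonEnd",
        "RuleContentNull", "RuleDocRepeat", "RuleHtmlEntity", "RuleIDCard",
        "RuleLineEndWithEllipsis", "RuleLineEndWithTerminal",
        "RuleLineStartWithBulletpoint", "RuleLineJavascriptCount",
        "RuleLoremIpsum", "RuleMeanWordLength", "RuleNoPunc",
        "RuleSentenceNumber", "RuleSpecialCharacter", "RuleStopWord",
        "RuleSymbolWordRatio", "RuleUniqueWords", "RuleWordNumber",
    ],
}


def rules_for(group: str) -> list[str]:
    """Return the rule names listed for a group, in table order."""
    try:
        return RULE_GROUPS[group]
    except KeyError:
        known = ", ".join(sorted(RULE_GROUPS))
        raise ValueError(f"unknown group {group!r}; expected one of: {known}") from None


def only_in(group: str, other: str) -> list[str]:
    """Rules that `group` lists but `other` does not, in table order."""
    other_rules = set(rules_for(other))
    return [rule for rule in rules_for(group) if rule not in other_rules]


if __name__ == "__main__":
    print(only_in("default", "sft"))       # ['RuleIDCard']
    print(only_in("sft", "default"))       # ['RuleLineStartWithBulletpoint']
    print(only_in("default", "pretrain"))  # []
```

Going by the table alone, `sft` differs from `default` only by dropping `RuleIDCard` and adding `RuleLineStartWithBulletpoint`, while `pretrain` includes every `default` rule plus the additional statistical and line-level checks.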