Could you explain how to implement prompt attack detection? #1466
Unanswered
chengq2020 asked this question in Q&A
Replies: 0 comments
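Some users may enter prompts that carry safety and compliance risks, such as content involving discrimination, pornography and violence, or infringement. Common techniques include goal hijacking and role-playing. I tested a few such prompts, and ChatGLM defended against all of them very well. Could you share how this is implemented and what defensive techniques are used? Thanks.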