Description
Is your feature request related to a problem? Please describe.
RAID is a great benchmark to measure against, but its train set is large and open-sourced, so anyone can train on it. Recently, solutions with a 1.0 score have even appeared, so the leaderboard no longer tells you who has the better AI-detection solution, but rather who is best overfitted to RAID.
Describe the solution you'd like
It would be great if you could mark solutions that were trained on the RAID train data (and maybe even maintain two separate leaderboards) to make it a source of "most accurate AI detector" information again. This could be done by:
a) simply asking in the metadata whether the RAID train set was part of their training data (though users may provide wrong info);
b) adding some watermarked data into the train set (for example, with unusual augmentations) that a normal detector should not be triggered by, but that a detector which saw such data during training will definitely trigger on (of course, the specific augmentation used should not be shared publicly). This approach can even be combined with the first one to catch users who lie in their metadata, with appropriate restrictions for them.
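To make the canary idea in (b) concrete, here is a minimal sketch. Everything in it is hypothetical: the maintainers would keep the real augmentation secret, and the zero-width-space insertion below is used purely as a stand-in for "some strange aug" that no honest detector should react to.

```python
# Stand-in watermark: an invisible zero-width space. The real canary
# augmentation would be kept secret by the benchmark maintainers.
ZWSP = "\u200b"

def watermark(text: str, every: int = 4) -> str:
    """Append the stand-in marker after every `every`-th word."""
    words = text.split()
    return " ".join(
        w + (ZWSP if i % every == 0 else "")
        for i, w in enumerate(words, start=1)
    )

def overfit_score(detector, human_texts, threshold: float = 0.5) -> float:
    """Fraction of watermarked *human* texts the detector flags as AI.

    A detector that never trained on the watermarked rows should score
    near 0 here; a high score suggests it memorized the canaries.
    """
    marked = [watermark(t) for t in human_texts]
    flagged = sum(detector(t) > threshold for t in marked)
    return flagged / len(marked)
```

A submission whose `overfit_score` is high on watermarked human-written texts has almost certainly seen those canary rows during training, which would corroborate (or contradict) what it declared in its metadata under option (a).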
Describe alternatives you've considered
I don't know your plans for a RAID 2.0, but LLMs are developing rapidly, and if you make a next iteration of RAID with an updated model pool, new augmentations, and maybe even adjusted text sources, it would be great not to publish the train set. That would create space for fair competition.