Conversation


@Gwyn9 Gwyn9 commented Oct 3, 2025

Hi @liamdugan 👋

This PR includes two new submissions for evaluation:

  • BERT_IdentifyIA_HF
  • RoBERTa_IdentifyIA_HF

Both detectors were fine-tuned on the Kaggle "AI vs Human Text" dataset (https://www.kaggle.com/datasets/shanegerami/ai-vs-human-text), and each submission includes predictions.json and metadata.json files following the leaderboard structure.

Apologies for the earlier submission issues; the folder structure has been fixed according to your guidance.

Thanks again for your help!

Gwyn9 and others added 13 commits September 27, 2025 15:12
Hi @liamdugan — I’ve followed your instructions and removed all results.json files from my submission.
Only the predictions.json and metadata.json files remain unchanged.

The PR is now ready for evaluation so the bot can generate the official results.json automatically.

Thanks for your guidance! 🚀
…/submissions/BERT_IdentifyIA_HF/predictions.json
…/submissions/RoBERTa_IdentifyIA_HF/metadata.json
…ard/submissions/RoBERTa_IdentifyIA_HF/predictions.json

github-actions bot commented Oct 7, 2025

Eval run succeeded! Link to run: link

Here are the results of the submission(s):

RoBERTa_IdentifyIA_HF

Release date: 2025-02-01

I've committed detailed results of this detector's performance on the test set to this PR.

Warning

Failed to find threshold values that achieve False Positive Rate(s): (['1%']) on all domains. This submission will not appear in the main leaderboard for those FPR values; it will only be visible within the splits in which the target FPR was achieved.

On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 61.27 and a TPR of 20.66% at FPR=5%.
Without adversarial attacks, it achieved an AUROC of 60.74 and a TPR of 22.61% at FPR=5%.

BERT_IdentifyIA_HF

Release date: 2025-02-01

I've committed detailed results of this detector's performance on the test set to this PR.

Warning

Failed to find threshold values that achieve False Positive Rate(s): (['5%', '1%']) on all domains. This submission will not appear in the main leaderboard for those FPR values; it will only be visible within the splits in which the target FPR was achieved.

If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID!

@liamdugan
Owner

Hi @Gwyn9, just to add a bit more context for this evaluation result: it seems there is no threshold at which your classifiers reach 99% accuracy on human-written text (i.e., a false positive rate of at most 1%) across all domains. I suggest investigating the results.json file to see which domains your classifier has issues on.

I'm happy to answer any more questions if you have them.
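The check the maintainer describes can be sketched as follows. This is a hypothetical illustration in plain Python, not the actual RAID evaluation code: for each target FPR, a single decision threshold must let at most that fraction of human-written texts score above it in every domain, so the strictest per-domain threshold wins (the sketch assumes untied scores; a detector that assigns its maximum score to human texts in some domain has no workable threshold at all):

```python
def threshold_for_fpr(human_scores, target_fpr):
    """Smallest score threshold whose false positive rate on
    human-written text is <= target_fpr (assumes no tied scores)."""
    ranked = sorted(human_scores, reverse=True)
    k = int(target_fpr * len(ranked))  # max false positives allowed
    if k == 0:
        return ranked[0]  # only scores strictly above the max are flagged
    return ranked[k]      # exactly k human scores exceed this value

def global_threshold(per_domain_scores, target_fpr):
    """One threshold must satisfy the target FPR in *every* domain,
    so take the strictest (highest) per-domain threshold."""
    return max(threshold_for_fpr(s, target_fpr)
               for s in per_domain_scores.values())
```

If one domain's human texts consistently score high, its threshold dominates and drags the detector's TPR down everywhere else, which is why the warning singles out specific FPR targets rather than the whole submission.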

@Gwyn9
Author

Gwyn9 commented Oct 16, 2025

Hi @liamdugan 👋,
I’m submitting two updated detector versions (V2) for evaluation on RAID.
These models are fine-tuned versions of our previous submissions, now uploaded to Hugging Face and evaluated locally before submission.

Models:

  • KewynG/Bert_IdentifyIA_V2
  • KewynG/RoBERTa_IdentifyIA_V2

Changes:

  • Retrained on a balanced and extended dataset derived from RAID subsets.
  • Improved cleaning, tokenizer alignment, and balanced class sampling.
  • Adjusted hyperparameters and checkpoints for better AUROC and recall stability.

Submission files:

  • leaderboard/submissions/BERT_IdentifyIA_HF_V2/
  • leaderboard/submissions/RoBERTa_IdentifyIA_HF_V2/

These submissions follow the same format as previous ones (metadata.json + predictions.json), now corresponding to the V2 detectors on Hugging Face.

Thanks again for maintaining the RAID benchmark; I'm looking forward to seeing how these updated models perform compared to the previous V1 results.

@github-actions

It looks like this eval run failed. Please check the workflow logs to see what went wrong, then push a new commit to your PR to rerun the eval.

@Gwyn9
Author

Gwyn9 commented Oct 17, 2025

Hi @liamdugan 👋,

I've pushed an update to fix the UTF-8 BOM issue that caused the previous evaluation to fail during the hydrate.py decoding step.

The affected files were the metadata.json and predictions.json for both:

  • BERT_IdentifyIA_HF_V2
  • RoBERTa_IdentifyIA_HF_V2

They have now been re-saved using UTF-8 (no BOM) encoding and recommitted.
Could you please re-run the evaluation for these updated submissions?
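For anyone hitting the same problem, Python's "utf-8-sig" codec decodes UTF-8 and silently drops a leading BOM (the EF BB BF byte sequence), so the re-save is a short generic script (this is a sketch, not part of the RAID tooling):

```python
from pathlib import Path

def strip_bom(path):
    """Re-save a file as UTF-8 without a byte-order mark.

    "utf-8-sig" drops a leading BOM (bytes EF BB BF) if present,
    while plain "utf-8" would keep it in the decoded text.
    """
    p = Path(path)
    text = p.read_bytes().decode("utf-8-sig")
    p.write_text(text, encoding="utf-8")
```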

Thanks again for your help and for maintaining RAID! 🚀

@github-actions

It looks like this eval run failed. Please check the workflow logs to see what went wrong, then push a new commit to your PR to rerun the eval.

@liamdugan
Owner

Hello @Gwyn9 it seems like both the BERT_IdentifyIA_HF_V2 and RoBERTa_IdentifyIA_HF_V2 metadata.json and predictions.json files were empty (0 lines). It looks like while editing the encoding you may have erased the contents of the files as well. Can you reupload them with the original contents in the new encoding?

@Gwyn9
Author

Gwyn9 commented Oct 20, 2025

Hi @liamdugan,

Restored full predictions.json files for BERT_IdentifyIA_HF_V2 and RoBERTa_IdentifyIA_HF_V2.
Both ~58 MB, valid UTF-8 (BOM removed), verified via json.tool.
Rebased cleanly and pushed to main; workflow ready for automatic re-evaluation.
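Since empty files slipped through once before, an explicit pre-push check helps. A small sketch that mirrors the json.tool verification and covers the three failure modes seen in this PR (emptiness, BOM, malformed JSON):

```python
import json
from pathlib import Path

def check_submission(path):
    """Fail loudly if a submission file is empty, BOM-prefixed,
    or not valid JSON."""
    raw = Path(path).read_bytes()
    assert raw, f"{path} is empty"
    assert not raw.startswith(b"\xef\xbb\xbf"), f"{path} has a UTF-8 BOM"
    json.loads(raw.decode("utf-8"))  # raises on malformed JSON
```

Running this over each submission folder's metadata.json and predictions.json before committing would catch all of the issues above locally.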

Thank you!!

@github-actions

Eval run succeeded! Link to run: link

Here are the results of the submission(s):

RoBERTa_IdentifyIA_HF_V2

Release date: 2025-02-01

I've committed detailed results of this detector's performance on the test set to this PR.

On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 79.26 and a TPR of 40.50% at FPR=5% and 19.97% at FPR=1%.
Without adversarial attacks, it achieved an AUROC of 84.04 and a TPR of 45.51% at FPR=5% and 22.91% at FPR=1%.

BERT_IdentifyIA_HF_V2

Release date: 2025-02-01

I've committed detailed results of this detector's performance on the test set to this PR.

On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 85.65 and a TPR of 63.22% at FPR=5% and 48.17% at FPR=1%.
Without adversarial attacks, it achieved an AUROC of 89.44 and a TPR of 70.74% at FPR=5% and 54.87% at FPR=1%.

If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID!

@liamdugan
Owner

Hey @Gwyn9, would you like me to merge this into RAID?
