FEATURE: Add AI-powered spam detection for new user posts #1004

SamSaffron · 2024-12-05T05:18:18Z

This PR introduces an AI-powered spam detection system that helps protect communities from spam by automatically scanning posts from new users.

Key Features:

Automatic scanning of the first 3 posts by new users (TL0-TL1)
Smart edit detection to catch spam modifications
Admin UI for configuration and monitoring
Integration with existing spam handling mechanisms
Statistics tracking for accuracy monitoring
Custom instructions support for site-specific rules

Technical Implementation:

New AiSpamLog and AiModerationSetting models for tracking and configuration
Post-creation and post-edit hooks for automated scanning
Intelligent rate limiting and edit detection
Custom LLM prompt engineering for spam detection
Integration with Discourse's existing spam handling system
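
As a rough illustration of how site-specific instructions might be folded into a spam-detection prompt, here is a minimal plain-Ruby sketch. Everything in it is an assumption made for illustration: the prompt wording, the `SPAM` / `NOT_SPAM` reply format, and the names `build_prompt`, `spam_verdict?`, and `scan` do not come from the PR, and the `llm` argument is a stand-in callable rather than the plugin's real LLM interface.

```ruby
# Hypothetical base prompt; the real prompt used by the plugin is not shown in this PR page.
BASE_PROMPT = <<~TEXT.strip
  You are a spam detection system. Reply with exactly one word: SPAM or NOT_SPAM.
TEXT

# Appends optional site-specific instructions to the base prompt.
def build_prompt(custom_instructions)
  extra = custom_instructions.to_s.strip
  return BASE_PROMPT if extra.empty?
  "#{BASE_PROMPT}\n\nSite-specific instructions:\n#{extra}"
end

# Interprets the (assumed) one-word reply from the model.
def spam_verdict?(llm_reply)
  llm_reply.to_s.strip.upcase.start_with?("SPAM")
end

# `llm` is any callable taking (system_prompt, post_raw) and returning a string.
def scan(post_raw, custom_instructions, llm)
  reply = llm.call(build_prompt(custom_instructions), post_raw)
  spam_verdict?(reply)
end
```

Keeping the reply to a single short token would fit with the later commit that shortens the response length to speed up scans, though the actual reply format is not visible here.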

Admin Features:

Enable/disable spam detection
Select LLM model for scanning
Configure custom site-specific instructions
Monitor detection statistics (scanned posts, detected spam, false positives/negatives)
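
The four counters above are enough to derive simple accuracy figures. The sketch below is a plain-Ruby illustration, not the plugin's reporting code: `SpamStats` and its fields are hypothetical names, and it assumes that the "detected spam" count includes posts later marked as false positives.

```ruby
# Hypothetical container for the four dashboard counters.
SpamStats = Struct.new(:scanned, :detected_spam, :false_positives, :false_negatives) do
  # Flagged posts that really were spam (assumes detected_spam includes false positives).
  def true_positives
    detected_spam - false_positives
  end

  # Share of flagged posts that were actually spam; nil when nothing was flagged.
  def precision
    return nil if detected_spam.zero?
    true_positives.fdiv(detected_spam)
  end

  # Share of actual spam that was caught; nil when no spam is known.
  def recall
    total_spam = true_positives + false_negatives
    return nil if total_spam.zero?
    true_positives.fdiv(total_spam)
  end
end
```

For example, `SpamStats.new(100, 10, 2, 2)` describes 100 scanned posts with 10 flagged, of which 2 were wrongly flagged and 2 spam posts were missed.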

Safety Features:

Only scans public posts
Limited to first 3 posts by new users
Minimum edit difference threshold
Rate limiting for rescanning edited posts
Integration with existing trust system
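
The safety checks listed above can be sketched as two small predicates. This is a simplified, self-contained illustration rather than the code in `lib/ai_moderation/spam_scanner.rb`: the `Author` and `Post` structs are hypothetical stand-ins for Discourse models, and the edit-difference threshold and rescan interval are assumed values, since the PR description does not state them.

```ruby
# Hypothetical, simplified stand-ins; not the plugin's real models.
Author = Struct.new(:trust_level, :post_count)
Post = Struct.new(:author, :publicly_visible, :raw)

MAX_TRUST_LEVEL = 1       # TL0-TL1, per the PR description
MAX_SCANNED_POSTS = 3     # first 3 posts, per the PR description
MIN_EDIT_DIFFERENCE = 10  # assumed threshold in characters
RESCAN_INTERVAL = 600     # assumed cooldown in seconds

# post_count is assumed to include the post being checked.
def should_scan_new_post?(post)
  post.publicly_visible &&
    post.author.trust_level <= MAX_TRUST_LEVEL &&
    post.author.post_count <= MAX_SCANNED_POSTS
end

# Levenshtein distance between two strings, computed row by row.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Rescan an edit only if the cooldown has passed and the change is large enough.
# Times are plain numbers of seconds here to keep the sketch self-contained.
def should_rescan_edit?(old_raw, new_raw, last_scanned_at, now)
  return false if now - last_scanned_at < RESCAN_INTERVAL
  edit_distance(old_raw, new_raw) >= MIN_EDIT_DIFFERENCE
end
```

Checking the cooldown before computing the distance keeps the cheap test first, so small or rapid edits never reach the LLM.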

SamSaffron · 2024-12-09T06:28:22Z

Left to do:

Some way to drill into the 4 lists (false positive / negative / scanned / spam)
Send images through

Except for that, this is pretty much ready

keegangeorge

Follow up changes look good, and overall everything looks good while testing locally. A few 🤏🏽 tiny things

assets/javascripts/discourse/components/modal/spam-test-modal.gjs

config/locales/client.en.yml

config/locales/server.en.yml

assets/javascripts/discourse/components/ai-spam.gjs

SamSaffron force-pushed the add-spam-tab branch from e7ace88 to dcba571 Compare December 6, 2024 02:57

SamSaffron marked this pull request as ready for review December 9, 2024 06:27

SamSaffron changed the title ~~WIP: new spam tab for Discourse AI~~ FEATURE: Add AI-powered spam detection for new user posts Dec 9, 2024

nattsw reviewed Dec 9, 2024

lib/ai_moderation/spam_scanner.rb (resolved)

nattsw reviewed Dec 9, 2024

app/controllers/discourse_ai/admin/ai_spam_controller.rb (outdated, resolved)

nattsw reviewed Dec 9, 2024

lib/ai_moderation/spam_scanner.rb (resolved)

keegangeorge suggested changes Dec 9, 2024

SamSaffron requested a review from keegangeorge December 11, 2024 00:02

xfalcox reviewed Dec 11, 2024

lib/ai_moderation/spam_scanner.rb (outdated, resolved)

xfalcox reviewed Dec 11, 2024

lib/ai_moderation/spam_scanner.rb (outdated, resolved)

keegangeorge previously requested changes Dec 11, 2024

SamSaffron and others added 18 commits December 12, 2024 09:04

WIP: new spam tab for Discourse AI

8a38a1d

# Conflicts: # config/routes.rb

Start styling the new page

5f2adca

more WIP

0c01a34

updated controller

712a676

UI basically functioning now

89031ae

start building scanning infra

21b02d4

added some code for spam scanner

d8f290b

work in progress

oops files in wrong spot

7bd2674

annotation and new model

58d8137

working on specs

1a80e34

Improve implementation

217b9dc

track reviewable from spam table

9f257d4

add report

syntax tree

441c831

linting

0df3758

linting

30292d3

rubocop

cba0f32

Update assets/javascripts/discourse/components/ai-spam.gjs

26e275a

Co-authored-by: Keegan George <[email protected]>

Update assets/javascripts/discourse/components/ai-spam.gjs

2485427

Co-authored-by: Keegan George <[email protected]>

SamSaffron and others added 23 commits December 12, 2024 09:04

Update assets/javascripts/discourse/components/ai-spam.gjs

819706f

Co-authored-by: Keegan George <[email protected]>

address PR comments

28073e8

scan images as well during spam scan

ec225e5

(if llm supports it)

Dedicated user for spam scanning

240222b

lint and address limit

52b5388

fix annotation

6366d42

move stuff around, show in enumerator

1922a18

Improves entire flow for first time usage and adds a system

186e51b

spec

Link to scanned posts in review queue

494f505

fix specs

99fa596

backend for test button

e2d8bc3

Finish adding test button

c64a5ef

UX: Rough edges

b1e7a3e

Use I18n in more places, fix text formatting, make stat tiles consistent visually with usage page

shorten tokens to 5 so scan is faster

1869f89

improve formatting

c806a44

Update config/locales/server.en.yml

7ab1062

Co-authored-by: Keegan George <[email protected]>

add inidicator wave

e2e30db

also send llm id to test model so you can test how various llms

b83bb7f

perform

respect seeded llms

75e2adf

lint

6eb43ae

Update config/locales/client.en.yml

f280505

Co-authored-by: Keegan George <[email protected]>

Update config/locales/client.en.yml

8d9c005

Co-authored-by: Keegan George <[email protected]>

Update config/locales/client.en.yml

fefdd9c

Co-authored-by: Keegan George <[email protected]>

SamSaffron force-pushed the add-spam-tab branch from ba408fc to fefdd9c Compare December 11, 2024 22:05

xfalcox approved these changes Dec 11, 2024

SamSaffron merged commit 47f5da7 into main Dec 11, 2024
6 checks passed

SamSaffron deleted the add-spam-tab branch December 11, 2024 22:17

6 participants
