Skip to content

Add an automated query/cleanup for potential spam accounts #1729

@ArtOfCode-

Description

@ArtOfCode-

Current query for these:

select
    u.id,
    u.email, 
    u.username,
    concat('https://', c.host, '/users/', u.id) as profile_link,
    u.profile_markdown
from users u
inner join community_users cu on u.id = cu.user_id and cu.id = (select id from community_users where user_id = u.id order by reputation desc limit 1)
inner join communities c on cu.community_id = c.id
left join posts p on p.user_id = u.id
left join comments cm on cm.user_id = u.id
left join votes v on v.user_id = u.id
where u.profile_markdown is not null
  and u.profile like '%href="%'
  and u.created_at >= date_sub(current_timestamp, interval 12 month)
  and u.deleted = false
  and u.email not like '%localhost'
  and cu.reputation = 1
group by u.id
having count(distinct p.id) = 0
   and count(distinct cm.id) = 0
   and count(distinct v.id) = 0
order by u.id

We should add this as an automated query/cleanup job in some form.

Discussion:

  • How can we avoid this catching legitimate users? Do we time-limit to only the last 24 hours if this is running daily? Hopefully any established user would have been around longer than that, although we still run the risk of catching the odd user who's customised their profile but not yet done anything.
  • There is sufficient volume of these that manually reviewing each one is a heavy job. How can we lighten that? Can we automate any part of this?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    Status

    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions