Implement flag (report) system for content moderation #192

@dahlia

Note

This issue is based on the Korean specification document Hackers' Pub 신고(flag) 기능 기획서 (Hackers' Pub flag feature specification).

Summary

This issue proposes implementing a comprehensive flag (report) system for Hackers' Pub that allows users to report content or accounts that violate the code of conduct. The system should support the federated nature of ActivityPub while protecting reporter anonymity and ensuring fair treatment of reported users.

Core philosophy

The flag system's ultimate purpose is education and growth, not creating a sterile community of perfect users. Reports should help individuals reflect on their behavior and grow as better community members.

Expulsion is the last resort. The system encourages a graduated approach:

  1. Awareness — The reported user learns their behavior may be problematic
  2. Reflection — They understand why the behavior is problematic
  3. Improvement — They modify their behavior and participate harmoniously
  4. Sanctions — Applied only when there's no willingness to improve or for severe violations

Design principles

Reporter protection

  • Reporter identity must be kept strictly confidential
  • Only moderators can access reporter information
  • The reported user only sees the report reason and violated rules, never who reported them
  • This protection is essential for the reporting system to function effectively

Right to know for reported users

  • Reported users have the right to know why they were reported
  • They receive information about which code of conduct provisions were violated
  • They see which content was flagged
  • This enables them to understand and improve their behavior

Flexible code of conduct referencing

  • The code of conduct is a living document that evolves over time
  • Report reasons should not be hardcoded to specific clause numbers
  • The system records the code of conduct version (Git commit hash) at report time
  • LLM analysis references the current full code of conduct text dynamically
  • Original reporter-written reasons are always preserved

Transparent processing

  • Reported users are informed of actions taken and reasons
  • Moderator decision rationale is recorded

Important

Reporter notification requires careful consideration. Notifying reporters of outcomes could enable malicious actors to probe moderation boundaries through repeated reports. The system should:

  • Distinguish between internal reports (from local users) and external reports (from remote instances via ActivityPub Flag)
  • Consider limiting detailed outcome notifications to trusted reporters or omitting them entirely
  • At minimum, avoid revealing specific sanction details that could be exploited

Graduated sanctions

  • Sanctions are proportional to violation severity and frequency
  • Warning → Content censorship → Temporary suspension → Permanent suspension
  • Violation history accumulates and affects subsequent sanction levels
  • Severe violations may skip stages for immediate strong action

Report targets

Content reports

Users can report individual pieces of content:

Article reports

  • Target: Long-form blog-style posts
  • Location: "Report" option in post footer or overflow menu
  • Collected data:
    • Article ID and permalink
    • Author information
    • Content snapshot at report time (evidence preservation)
    • Reporter's written reason

Note reports

  • Target: Short microblog-style posts
  • Location: "Report" option in overflow menu
  • Collected data: Same as articles

User reports

For problematic behavior patterns across multiple posts:

  • Use cases:
    • Persistent problematic behavior across multiple posts
    • Individual posts are borderline, but the overall pattern is problematic
    • Profile itself (name, bio, avatar) violates code of conduct
  • Location: "Report user" option in profile page overflow menu
  • Collected data:
    • User ID and profile link
    • Profile snapshot at report time
    • Reporter's written reason
    • (Optional) Related content links

Remote content and users

Content and users from other ActivityPub instances can be reported identically:

  • Federated timeline content affects Hackers' Pub users
  • Actions affect display/federation within Hackers' Pub
  • Optional: Reporter may explicitly opt in to forward the report to the remote instance via ActivityPub Flag activity (see Cross-instance report forwarding)

Report process

Report flow

```mermaid
flowchart TB
    start[User clicks Report<br>on content/user]
    form[Report form displayed]
    reason[Write reason<br>in free-form text]
    submit[Submit report]
    llm[LLM analyzes reason]
    coc[(Code of Conduct)]
    save[Save report<br>with pending status]
    notify_mod[Notify moderators]
    notify_reporter[Send confirmation<br>to reporter]

    start --> form --> reason --> submit --> llm
    coc -.-> llm
    llm --> save --> notify_mod --> notify_reporter
```

Report form

The form should be simple while collecting necessary information.

Required field: Report reason (free-form text)

Please explain why you are reporting this content/user.
You don't need to know the specific code of conduct provisions.
Feel free to describe what felt uncomfortable or problematic.

[                                        ]
[                                        ]
[                                        ]

Minimum 10 characters required.

Rationale:

  • Users shouldn't need to know all code of conduct provisions
  • Free-form input captures richer context
  • LLM analyzes the reason and matches relevant provisions

Optional field: Additional content links (for user reports)

If there are other related content items, add links here. (Optional)

[Add link +]

Optional field: Cross-instance forwarding (for remote content/users)

When reporting content or users from other ActivityPub instances, reporters may choose to forward the report:

☐ Also send this report to the remote instance (@user@remote.example)

Note: If enabled, a Flag activity will be sent to the remote server's
moderators. Your identity will NOT be revealed to the remote instance;
only the report reason and target will be shared.

This is opt-in by the reporter, not automatic.

LLM-based code of conduct matching

When a report is submitted, the LLM analyzes the reason and identifies relevant code of conduct provisions.

Warning

LLM analysis is a reference tool for moderators, not an automated decision system. LLMs can exhibit biases, particularly against marginalized communities. Research by Timnit Gebru, DAIR Institute, and others has documented how AI systems can perpetuate harmful biases. The matching results should:

  • Be treated as one input among many, not as authoritative judgments
  • Always be reviewed and validated by human moderators
  • Never be used for automated actions without human oversight
  • Be monitored for patterns of bias over time

Matching process

  1. Input composition:

    • Reporter's written reason
    • Current version of full code of conduct
    • Reported content (if applicable)
  2. LLM analysis:

    • Analyze relevance between report reason and code of conduct provisions
    • Identify relevant provisions with confidence scores
    • Generate analysis summary
  3. Result storage:

    • List of matched provisions with confidence scores
    • LLM analysis summary
    • Code of conduct version identifier at report time
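
The stored matching result might be shaped like the following TypeScript sketch. The field names and the 0.5 confidence threshold are illustrative assumptions, not a decided API:

```typescript
// Hypothetical shapes for the stored LLM matching result.
interface MatchedProvision {
  provision: string;   // e.g. a section heading from the code of conduct
  confidence: number;  // 0.0 – 1.0, as returned by the LLM
}

interface LlmAnalysis {
  matches: MatchedProvision[];
  summary: string;     // LLM-generated analysis summary
  cocVersion: string;  // Git commit hash of the code of conduct at report time
}

// The moderator UI might surface only reasonably confident matches while the
// full result stays stored for audit; the threshold value is an assumption.
function confidentMatches(a: LlmAnalysis, threshold = 0.5): MatchedProvision[] {
  return a.matches
    .filter((m) => m.confidence >= threshold)
    .sort((x, y) => y.confidence - x.confidence);
}
```

Storing the full result while filtering only at display time keeps the audit trail intact even if the display threshold changes later.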

Code of conduct version management

  • Use Git commit hash of code of conduct file as version identifier
  • Store version identifier with report record
  • Moderators can reference the specific version when reviewing
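
A minimal sketch of version stamping, assuming the code of conduct lives in a single Git-tracked file; the command shown in the comment and the validation regex are illustrative:

```typescript
// A full Git commit hash is 40 lowercase hex characters.
const COC_HASH_RE = /^[0-9a-f]{40}$/;

// Guard against storing something that is not a commit hash (e.g. a tag name)
// in the report record's coc_version column.
function isValidCocVersion(hash: string): boolean {
  return COC_HASH_RE.test(hash);
}

// In a deployment this might shell out once at startup, e.g.:
//   git log -n 1 --format=%H -- CODE_OF_CONDUCT.md
// and cache the result for stamping onto each new report.
```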

Matching result usage

  • Results serve as reference material for moderator review
  • Moderators can modify or override matching results
  • Final confirmed violations are communicated to reported user

Duplicate report handling

Multiple reports for the same content/user:

  • Reports for the same target are grouped into a single "report case"
  • Each report's reason is preserved individually
  • Displayed to moderators with report count
  • Higher report count increases priority

Rationale:

  • Multiple independent reporters finding the same issue suggests severity
  • Combining diverse report reasons enables more accurate judgment
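
The grouping rule could be sketched as a find-or-create step. The in-memory shapes below stand in for database queries and are illustrative only:

```typescript
// Minimal in-memory stand-in for the flag_case store.
interface FlagCase {
  id: string;
  targetId: string;
  status: string;
  reportCount: number;
}

// Only open cases absorb new reports; a report against an already-resolved
// target starts a fresh case.
function fileReport(
  cases: FlagCase[],
  targetId: string,
  nextId: () => string,
): FlagCase {
  let found = cases.find(
    (k) =>
      k.targetId === targetId &&
      k.status !== "resolved" &&
      k.status !== "dismissed",
  );
  if (!found) {
    found = { id: nextId(), targetId, status: "pending", reportCount: 0 };
    cases.push(found);
  }
  found.reportCount++; // higher count raises priority in the moderator queue
  return found;
}
```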

Report history

Reporters can check the status of their submitted reports:

  • Viewable: Report target, date, their written reason, processing status (pending/reviewing/resolved)
  • Not viewable: Specific action details, existence of other reporters, reported user's appeal content

Note

To prevent malicious actors from using reports to probe moderation boundaries, detailed outcome information (e.g., whether action was taken, what type of sanction) is intentionally limited.

Moderator processing

Moderation flow

```mermaid
flowchart TB
    pending[Report received<br>pending status]
    check[Moderator reviews report]
    reviewing[Status: reviewing]

    subgraph review [Review Process]
        r1[Check reported content]
        r2[Review report reasons]
        r3[Check LLM analysis]
        r4[Review user history]
        r5[Understand context]
    end

    decision{Decision}
    dismiss[Dismiss]
    warn[Warning]
    action[Sanction]

    notify[Record action & notify<br>• Action to reported user<br>• Flag to remote server if opted in]

    pending --> check --> reviewing --> review
    review --> decision
    decision --> dismiss
    decision --> warn
    decision --> action
    dismiss --> notify
    warn --> notify
    action --> notify
```

Report states

| State | Description |
| --- | --- |
| `pending` | Report received, awaiting review |
| `reviewing` | Moderator is reviewing |
| `resolved` | Processing complete (action taken) |
| `dismissed` | Dismissed (not a violation) |
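
The state model could be enforced with a small transition guard; the allowed transition set below is inferred from the states above and is an assumption:

```typescript
// Allowed report-state transitions; resolved and dismissed are terminal.
const TRANSITIONS: Record<string, string[]> = {
  pending: ["reviewing"],
  reviewing: ["resolved", "dismissed"],
  resolved: [],
  dismissed: [],
};

// Returns false for unknown states as well as disallowed moves.
function canTransition(from: string, to: string): boolean {
  return TRANSITIONS[from]?.includes(to) ?? false;
}
```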

Review checklist

Moderators review the following comprehensively:

Report information

  • Reporter's written reason
  • LLM-matched code of conduct provisions (as reference only; verify independently)
  • Report count (for duplicates)
  • Each reporter's reason (for duplicates)

Content information

  • Reported content original text
  • Content context (comment thread, etc.)
  • Snapshot at report time (if modified/deleted)

User information

  • Reported user's previous violation history
  • Previous warning/sanction records
  • Account creation date and activity period
  • Local/remote user status

Action options

| Action | Description | Criteria |
| --- | --- | --- |
| Dismiss | Judged not a violation | No code of conduct violation found |
| Warning | Send warning message | Minor violation, first offense |
| Content censorship | Hide the content | The content itself is problematic |
| Temporary suspension | Suspend account for a period | Repeated violations or moderate severity |
| Permanent suspension | Permanently suspend account | Severe violation or persistent malicious behavior |

Required inputs for actions

When taking action, moderators must record:

  • Violation provisions (final confirmation)
  • Action rationale (detailed description of judgment basis)
  • Message to reported user
  • (For temporary suspension) Suspension period: start date – end date

Reported user process

Report notification

Reported users are notified that a report was filed against them and why.

Notification timing

  • Not immediately notified: No notification right after report submission (prevents stress from frivolous reports)
  • Notified when: After moderator reviews and decides on action; optionally for dismissed reports (moderator discretion for educational purposes)

Notification content for warnings/sanctions

A report was received regarding your [content/account] and upon review,
it was determined to violate the code of conduct. The following action has been taken.

Violation:
[Related code of conduct provisions]

Target content:
[Content link if applicable]

Action:
[Warning / Content censorship / N-day suspension / Permanent suspension]

Moderator message:
[Moderator's written explanation]

If you have objections to this action, you can file an appeal
using the button below.

[File Appeal]

Optional notification for dismissals

A report was received regarding your [content/account], but upon review,
it was determined not to constitute a code of conduct violation.

However, please note that some community members may have felt uncomfortable.

Related information:
[Brief explanation]

Information visibility for reported users

| Information | Visible |
| --- | --- |
| That a report was made | Yes |
| Code of conduct provisions cited as violated | Yes |
| Target content | Yes |
| Action details and duration | Yes |
| Moderator's judgment rationale | Yes |
| Reporter identity | No |
| Original report reason written by reporter | No |
| Number of reports | No |

Restrictions during sanctions

Content censorship

  • Content hidden from timeline and search
  • Accessible via permalink with censorship notice displayed
  • Author can still view their own content

Temporary suspension

  • Cannot create new content
  • Cannot write comments
  • Cannot react
  • Cannot follow/unfollow
  • Can view existing content
  • Can receive DMs but cannot send
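
The restrictions above amount to a capability check per account state. A minimal sketch, with capability names invented for illustration:

```typescript
type Capability = "create" | "comment" | "react" | "follow" | "view" | "sendDm";

// Capabilities denied during a temporary suspension; viewing existing
// content and receiving DMs remain allowed per the list above.
const SUSPENDED_DENIED: Set<Capability> = new Set([
  "create",
  "comment",
  "react",
  "follow",
  "sendDm",
]);

function isAllowed(state: "active" | "suspended", cap: Capability): boolean {
  if (state === "active") return true;
  return !SUSPENDED_DENIED.has(cap);
}
```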

Permanent suspension

  • Cannot access account
  • All features disabled
  • Existing content hidden

Appeal process

Appeal flow

```mermaid
flowchart TB
    start[Reported user<br>files appeal]
    write[Write appeal reason]
    submit[Submit appeal]
    review[Moderator review<br>preferably different moderator]

    decision{Decision}
    reject[Dismiss appeal]
    uphold[Maintain action]
    modify[Modify action]

    notify[Notify result<br>• To reported user]

    start --> write --> submit --> review --> decision
    decision --> reject
    decision --> uphold
    decision --> modify
    reject --> notify
    uphold --> notify
    modify --> notify
```

Appeal eligibility

  • Only the sanctioned user can appeal
  • One appeal per action
  • Deadline: Within 14 days of action notification
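
The three eligibility rules above can be expressed as one predicate; the option names are illustrative:

```typescript
// 14 days, matching the appeal deadline above.
const APPEAL_WINDOW_MS = 14 * 24 * 60 * 60 * 1000;

function canAppeal(opts: {
  appellantId: string;
  sanctionedUserId: string;
  existingAppealCount: number; // appeals already filed against this action
  notifiedAt: Date;            // when the action notification was sent
  now: Date;
}): boolean {
  return (
    opts.appellantId === opts.sanctionedUserId &&            // only the sanctioned user
    opts.existingAppealCount === 0 &&                        // one appeal per action
    opts.now.getTime() - opts.notifiedAt.getTime() <= APPEAL_WINDOW_MS
  );
}
```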

Appeal form

Appeal reason:
[Explain why you believe this action is unjust]

Additional context or evidence:
[Provide any context or information you believe was not
considered in the decision]

[Submit]

Appeal review

  • Preferably reviewed by a different moderator than the original decision-maker
  • Comprehensive review of original report, action rationale, and appeal content
  • Check for new information or context

Decision options

  • Appeal dismissed: Original action maintained
  • Action reduced: Changed to lighter action (e.g., suspension → warning)
  • Action withdrawn: Action canceled and record corrected
  • Action increased: Rare cases where a more severe violation is discovered during the appeal

Result notification

To reported user

Here is the result of your appeal review.

Decision: [Appeal dismissed / Action reduced / Action withdrawn]

Judgment rationale:
[Moderator's review result explanation]

(If applicable) Modified action:
[New action details]

Penalty system

Penalty types and criteria

Warning

  • Description: Lightest action informing of violation and requesting prevention of recurrence
  • Criteria: Minor code of conduct violation, first offense with no apparent malice, violation due to mistake or ignorance
  • Effect: Warning message sent, warning history recorded for future reference, may be removed from history after certain period (e.g., 1 year)
  • Accumulation: 3 warnings automatically triggers review for stronger action (but no automatic sanction; requires moderator judgment)
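
The accumulation rule could be sketched as follows, assuming the 1-year removal period applies per warning; the function only flags the account for review, it does not sanction:

```typescript
// Warnings older than one year drop out of the active count.
const WARNING_TTL_MS = 365 * 24 * 60 * 60 * 1000;

// True when three or more active warnings have accumulated, i.e. the case
// should be surfaced to a moderator for review (no automatic sanction).
function needsReview(warningDates: Date[], now: Date): boolean {
  const active = warningDates.filter(
    (d) => now.getTime() - d.getTime() <= WARNING_TTL_MS,
  );
  return active.length >= 3;
}
```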

Content censorship

  • Description: Hide specific content from public areas
  • Criteria: The content itself violates code of conduct, the specific content is problematic rather than user's overall behavior
  • Effect: Content excluded from timeline, search, recommendations; permalink retained but shows censorship notice; Delete activity may be sent to federated servers

Temporary suspension

  • Description: Restrict account activity for a period
  • Criteria: Repeated violations despite warnings, moderate severity violation, immediate activity stop needed but not warranting permanent suspension
  • Duration: Minimum 1 day – Maximum 90 days
    • Minor repeated violations: 1–7 days
    • Moderate violations: 7–30 days
    • Severe violations (first offense): 30–90 days
  • Effect: Cannot create new content, cannot interact (reactions, comments), can view existing content, full functionality restored upon suspension end
  • For remote users: Federation blocked for the period within Hackers' Pub; if reporter opted in, notification sent to remote server admin via ActivityPub Flag activity

Permanent suspension

  • Description: Most severe action permanently deactivating the account
  • Criteria: Very severe code of conduct violation (hate speech, illegal content), same violation repeated after temporary suspension, clear malicious intent to harm community confirmed
  • Effect: Cannot log into account, all features disabled, public content hidden, profile page shows suspension notice
  • For remote users: Permanent federation block with Hackers' Pub; if reporter opted in, notification sent to remote server admin via ActivityPub Flag activity
  • Recovery: Permanent suspension is not restored in principle; in exceptional cases, re-review request possible after sufficient time

Penalty history management

| Penalty | Retention period | Notes |
| --- | --- | --- |
| Warning | 1 year | Excluded from history if no additional violations for 1 year |
| Content censorship | Indefinite | Maintained as long as content exists |
| Temporary suspension | Indefinite | Record maintained; elapsed time considered in judgment |
| Permanent suspension | Indefinite | |

ActivityPub federation handling

Overview

Hackers' Pub is part of a decentralized network using the ActivityPub protocol. The flag system must operate smoothly in this environment.

Flag activity

The ActivityPub specification defines the Flag activity for propagating reports across the federated network.

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Flag",
  "actor": "https://hackerspub.example/users/moderator",
  "object": [
    "https://remote.example/users/reported_user",
    "https://remote.example/posts/problematic_post"
  ],
  "content": "Violation of Code of Conduct: harassment"
}
```

Cross-instance report forwarding

When a local user reports remote content or users, they may choose to forward the report to the remote instance.

Important

Cross-instance report forwarding is opt-in by the reporter, not automatic. This design choice:

  • Respects reporter agency: some reporters may not want their report shared beyond Hackers' Pub
  • Prevents potential harassment: automatic forwarding could be weaponized
  • Aligns with privacy expectations: reporters should control where their report goes

Forwarding process (when opted in)

  1. Reporter explicitly checks the forwarding option in the report form
  2. After moderator action (not immediately upon report submission), the Flag activity is sent
  3. The Flag activity includes the violation reason but NOT the reporter's identity
  4. Action by the remote server is at their discretion
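
Step 3 can be made structural: the outgoing activity is built from the moderator actor and the case data, so the reporter's identity cannot leak by construction. A sketch mirroring the JSON example earlier in this issue (field values are illustrative):

```typescript
// Compose the outgoing Flag activity after moderator action. The actor is
// the instance's moderator actor, never the reporter.
function buildFlagActivity(opts: {
  moderatorActor: string; // e.g. this instance's moderator actor URI
  targets: string[];      // reported user and/or content URIs
  reason: string;         // violation reason (not the reporter's text verbatim)
}) {
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    type: "Flag",
    actor: opts.moderatorActor, // reporter identity is intentionally absent
    object: opts.targets,
    content: opts.reason,
  };
}
```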

Remote content report processing

Report receipt

  1. Local user reports remote content/user
  2. Report saved to Hackers' Pub database
  3. Moderator reviews as with regular reports

Action application

  1. Within Hackers' Pub:

    • Hide/delete local cache of content
    • Block federation with user (temporary/permanent)
  2. Remote server notification (only if reporter opted in):

    • Send Flag activity to remote server
    • Action by remote server is at their discretion

Handling incoming Flag activities

When a Flag activity is received from another server:

  1. Receive and parse Flag activity
  2. Verify if target is local user/content
  3. Notify moderator as external report
  4. Moderator reviews and acts according to own judgment
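
Step 2 above (verifying the target is local) might look like this sketch; the origin constant is an assumption for illustration:

```typescript
// Assumed local origin for this instance.
const LOCAL_ORIGIN = "https://hackerspub.example";

// Extract the locally-hosted targets from an incoming Flag activity.
// Non-Flag activities and purely remote targets yield an empty list.
function localTargets(activity: {
  type: string;
  object: string | string[];
}): string[] {
  if (activity.type !== "Flag") return [];
  const objects = Array.isArray(activity.object)
    ? activity.object
    : [activity.object];
  return objects.filter((o) => new URL(o).origin === LOCAL_ORIGIN);
}
```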

External report display:

[External Report] Received from remote.example

Target: @localuser's content
Reason: "Violation of our community guidelines"

* This report was received from an external server.
  Please judge according to our own code of conduct.

Note

External reports (incoming Flag activities) are treated differently from internal reports. They provide less context and come from unknown moderation cultures, so moderators should apply additional scrutiny.

Mastodon compatibility

Mastodon is the most widely used ActivityPub implementation:

  • Support Mastodon's Flag activity format
  • Consider integration with Mastodon admin API (future)
  • Receive and process reports sent from Mastodon

Notification system

Notification types

| Type | Recipient | Content |
| --- | --- | --- |
| `flag_received` | Moderator | New report received |
| `action_taken` | Reported user | Action has been taken |
| `appeal_received` | Moderator | Appeal received |
| `appeal_resolved` | Reported user | Appeal processing complete |
| `suspension_ending` | Reported user | Suspension end approaching |

Notification channels

  • In-app notification: Default method
  • Email: For important notifications (actions, suspensions)
  • ActivityPub: For remote users, sent to their server

Privacy and security

Reporter anonymity protection

  • Filter reporter information from API responses
  • Display reporter information only in moderator UI
  • Mask reporter information in logs (as needed)

Data access control

| Role | Accessible information |
| --- | --- |
| Regular user | Only their own report history |
| Reported user | Actions and reasons concerning themselves (excluding reporter info) |
| Moderator | All report information (including reporter info) |

Content snapshots

Reasons for saving content snapshots at report time:

  • Preserve original evidence even if reported user modifies/deletes content
  • Maintain records for fair judgment
  • Reference material for appeals

Retention period:

  • Minimum 1 year after case closure
  • Longer retention if legally required
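
Snapshot capture could be sketched as below; the fields mirror the `content_snapshot` table drafted later in this issue, and the post shape is illustrative:

```typescript
interface ContentSnapshot {
  flagId: string;
  postId: string;
  content: string;
  contentHtml: string;
  metadata: { authorHandle: string; publishedAt: string };
  createdAt: string;
}

// Copy the content at report time so later edits or deletion by the
// reported user cannot destroy the evidence.
function takeSnapshot(
  flagId: string,
  post: {
    id: string;
    content: string;
    contentHtml: string;
    authorHandle: string;
    publishedAt: string;
  },
  now: Date,
): ContentSnapshot {
  return {
    flagId,
    postId: post.id,
    content: post.content,
    contentHtml: post.contentHtml,
    metadata: { authorHandle: post.authorHandle, publishedAt: post.publishedAt },
    createdAt: now.toISOString(),
  };
}
```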

Abuse prevention

False report prevention

  • Limit repeated reports of same target by same user
  • Possible sanctions for false reporters
  • Monitor report patterns

Report flood prevention

  • Rate limiting for many reports in short time
  • Alert moderators of abnormal patterns
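
Rate limiting could start as a simple sliding window per reporter; the limit and window values below are placeholders, not decided policy:

```typescript
// Sliding-window limiter: at most `limit` reports per reporter within
// `windowMs` milliseconds.
function makeReportLimiter(limit: number, windowMs: number) {
  const submissions = new Map<string, number[]>(); // reporterId -> timestamps
  return function allow(reporterId: string, now: number): boolean {
    const recent = (submissions.get(reporterId) ?? []).filter(
      (t) => now - t < windowMs,
    );
    if (recent.length >= limit) {
      submissions.set(reporterId, recent);
      return false; // over the limit; also a signal to alert moderators
    }
    recent.push(now);
    submissions.set(reporterId, recent);
    return true;
  };
}
```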

Moderator dashboard

Dashboard overview

The moderator dashboard is the central hub for report management.

Main screens:

  1. Pending reports list
  2. Report detail and processing screen
  3. Appeals list
  4. Statistics and analysis
  5. Currently sanctioned users list

Report list screen

The main report list provides moderators with an at-a-glance view of all pending reports, prioritized by urgency.

Key elements:

  • Header: Title with a link to statistics view
  • Filters: Dropdowns for status (all/pending/reviewing/resolved), state, and sort order; plus a search box
  • High priority section: Reports with 5+ flags are highlighted with a warning indicator and displayed at the top
  • Report cards: Each card shows:
    • Priority indicator (red/yellow/green based on report count)
    • Target identifier (user handle or content reference)
    • Report count in parentheses
    • Summary of report reasons
    • Time since first report

```
┌─────────────────────────────────────────────────────────┐
│  Report Management                         [View Stats] │
├─────────────────────────────────────────────────────────┤
│  Filter: [All ▼] [Pending ▼] [Newest ▼]   Search: [____]│
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ⚠️ High Priority (5+ reports)                          │
│  ┌─────────────────────────────────────────────────┐    │
│  │ 🔴 @user123's content (7 reports)               │    │
│  │    "Hate speech", "Discriminatory language" +5  │    │
│  │    First report: 2 hours ago                    │    │
│  └─────────────────────────────────────────────────┘    │
│                                                         │
│  Regular Reports                                        │
│  ┌─────────────────────────────────────────────────┐    │
│  │ 🟡 @remote@other.server user (2 reports)        │    │
│  │    "Spam behavior"                              │    │
│  │    First report: 5 hours ago                    │    │
│  └─────────────────────────────────────────────────┘    │
│  ┌─────────────────────────────────────────────────┐    │
│  │ 🟢 @newuser's comment (1 report)                │    │
│  │    "Inappropriate language"                     │    │
│  │    Report: 1 day ago                            │    │
│  └─────────────────────────────────────────────────┘    │
│                                                         │
└─────────────────────────────────────────────────────────┘
```

Statistics screen

The statistics screen helps moderators understand reporting trends and measure moderation effectiveness.

Key features:

  • Period selector: Dropdown to set query range (e.g., last 7 days, 30 days, 90 days, or custom range)
  • Summary metrics: High-level KPIs for quick assessment
  • Action distribution: Breakdown of how reports were resolved, useful for identifying patterns (e.g., high dismissal rate may indicate unclear reporting guidelines)
  • Violation types: Most common violation categories, helps prioritize community guidelines and education efforts
  • Export option: Ability to export data for external analysis (future consideration)

Summary

These metrics provide a quick health check of the moderation queue:

| Item | Value |
| --- | --- |
| Total reports | 127 |
| Processed | 98 (77%) |
| Average processing time | 4.2 hours |

Action distribution

Shows how reports were resolved. A healthy distribution typically shows most reports resulting in dismissals or warnings, with severe sanctions being rare. Unusual patterns (e.g., very high dismissal rate) may indicate issues with reporting guidelines or user education.

| Action | Count | Ratio |
| --- | --- | --- |
| Dismissed | 45 | 46% |
| Warning | 38 | 39% |
| Content censorship | 10 | 10% |
| Temporary suspension | 4 | 4% |
| Permanent suspension | 1 | 1% |

Violation types (top 5)

Identifies the most common types of violations reported. This data helps:

  • Prioritize which code of conduct sections need clearer communication
  • Identify emerging problem areas (e.g., sudden spike in spam reports)
  • Guide community education efforts
  • Inform decisions about automated detection tools

| Rank | Type | Count |
| --- | --- | --- |
| 1 | Spam/advertising | 32 |
| 2 | Hate speech | 24 |
| 3 | Harassment | 18 |
| 4 | Inappropriate content | 12 |
| 5 | Misinformation | 8 |

Future considerations

Automation features (for future consideration)

  • Auto-hide: Temporarily hide content when reports exceed threshold before moderator review
  • AI-based pre-filtering: Automatic detection of obvious violations
  • Auto-spam handling: Automatic action for obvious spam

Caution

Automation features risk false positives and should be introduced carefully.

Community participation

  • Trusted reporters: Higher weight for reports from users with accurate report history
  • Community moderators: Consider community moderator system to distribute moderator burden

Multilingual support

  • Auto-translate report reasons (when moderator uses different language)
  • Integration with multilingual code of conduct versions
  • Multilingual templates for action notification messages

Legal requirements

  • Procedures for data preservation/provision per legal requests
  • Separate procedure for copyright infringement reports (DMCA, etc.)
  • Law enforcement cooperation procedures

Database schema (draft)

Design rationale

The schema separates individual reports (flag) from cases (flag_case) to handle duplicate reports elegantly. When multiple users report the same content, each report is preserved individually (maintaining reporter anonymity and capturing diverse perspectives), while moderators work with a single unified case.

Key design decisions:

  • Separation of flag and flag_case: Allows multiple reports to be grouped without losing individual report data. This is essential for both accurate statistics and protecting reporter privacy.
  • Immutable actions: Actions are recorded as separate records rather than updating the case, creating a full audit trail. If a case is reopened or an appeal changes the outcome, the history is preserved.
  • Content snapshots: Stored separately to preserve evidence even if the original content is modified or deleted. This ensures fair judgment and supports appeals.
  • Code of conduct versioning: The coc_version field captures the exact version of the code of conduct at report time, ensuring reports are evaluated in their original context even as rules evolve.

Tables

flag table

Stores individual reports submitted by users. Each report is a separate record, even if multiple users report the same content.

Design notes:

  • reporter_id is nullable to support anonymous reports from external ActivityPub servers where we may not have a local account
  • target_post_id is nullable because user reports (as opposed to content reports) don't target a specific post
  • llm_analysis uses JSONB to flexibly store structured analysis results without schema constraints, allowing the LLM output format to evolve
  • coc_version stores the Git commit hash rather than a version number, providing an immutable reference to the exact code of conduct text
  • forward_to_remote tracks whether the reporter opted in to cross-instance forwarding

| Column | Type | Description |
| --- | --- | --- |
| `id` | `uuid` | Primary key |
| `reporter_id` | `uuid` | Foreign key to account (nullable for anonymous/external) |
| `target_account_id` | `uuid` | Foreign key to account (reported user) |
| `target_post_id` | `uuid` | Foreign key to post (nullable; for content reports) |
| `reason` | `text` | Reporter's written reason |
| `coc_version` | `varchar` | Code of conduct Git commit hash at report time |
| `llm_analysis` | `jsonb` | LLM matching results |
| `status` | `varchar` | `pending`, `reviewing`, `resolved`, `dismissed` |
| `case_id` | `uuid` | Foreign key to flag_case (for grouping duplicates) |
| `forward_to_remote` | `boolean` | Whether reporter opted in to forward to remote instance |
| `created_at` | `timestamptz` | Report submission time |
| `updated_at` | `timestamptz` | Last update time |

flag_case table

Groups related reports into a single case for moderator review. This is the primary entity moderators interact with.

Design notes:

  • Cases are created automatically when the first report for a target is submitted
  • Subsequent reports for the same target are linked to the existing case
  • assigned_moderator_id enables workload distribution and prevents conflicts when multiple moderators are active
  • resolved_at is separate from created_at to track resolution time metrics

| Column | Type | Description |
| --- | --- | --- |
| `id` | `uuid` | Primary key |
| `target_account_id` | `uuid` | Foreign key to account |
| `target_post_id` | `uuid` | Foreign key to post (nullable) |
| `status` | `varchar` | Case status |
| `assigned_moderator_id` | `uuid` | Foreign key to account (assigned moderator) |
| `created_at` | `timestamptz` | Case creation time |
| `resolved_at` | `timestamptz` | Case resolution time |

flag_action table

Records actions taken by moderators. Actions are immutable—if a decision changes, a new action record is created rather than updating the existing one.

Design notes:

  • Immutable records create a complete audit trail for accountability and appeals
  • violated_provisions is an array to support cases where multiple provisions are violated
  • rationale is required for transparency and to support consistent decision-making across moderators
  • message_to_user is stored separately from rationale because the internal rationale may contain details not appropriate to share with the reported user
  • Suspension timestamps are nullable since they only apply to suspension actions

| Column | Type | Description |
| --- | --- | --- |
| `id` | `uuid` | Primary key |
| `case_id` | `uuid` | Foreign key to flag_case |
| `moderator_id` | `uuid` | Foreign key to account (moderator who took action) |
| `action_type` | `varchar` | `dismiss`, `warning`, `censor`, `suspend`, `ban` |
| `violated_provisions` | `text[]` | List of violated code of conduct provisions |
| `rationale` | `text` | Moderator's judgment rationale |
| `message_to_user` | `text` | Message sent to reported user |
| `suspension_start` | `timestamptz` | Suspension start (for temporary suspension) |
| `suspension_end` | `timestamptz` | Suspension end (for temporary suspension) |
| `created_at` | `timestamptz` | Action time |

flag_appeal table

Stores appeals filed by users against moderation actions.

Design notes:

  • Links to flag_action rather than flag_case because users appeal specific actions, and a case may have multiple actions over time
  • reviewer_id is tracked separately to ensure appeals are reviewed by a different moderator when possible
  • result is separate from status because a resolved appeal has both a completion status and an outcome
  • Both the appellant's reason and the reviewer's rationale are preserved for transparency

| Column | Type | Description |
| ------ | ---- | ----------- |
| id | uuid | Primary key |
| action_id | uuid | Foreign key to flag_action |
| appellant_id | uuid | Foreign key to account (person filing appeal) |
| reason | text | Appeal reason |
| additional_context | text | Additional context or evidence |
| status | varchar | pending, reviewing, resolved |
| result | varchar | dismissed, reduced, withdrawn, increased |
| reviewer_id | uuid | Foreign key to account (reviewing moderator) |
| review_rationale | text | Review result rationale |
| created_at | timestamptz | Appeal submission time |
| resolved_at | timestamptz | Appeal resolution time |
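
The "different moderator when possible" rule for reviewer_id can be expressed as a small selection function. This is an illustrative sketch (the function name and fallback policy are assumptions, not part of the spec): prefer any moderator other than the one who took the appealed action, and fall back to the original moderator only when no one else is available.

```typescript
// Pick a reviewer for an appeal. Prefers a moderator other than the
// one who took the appealed action; falls back to that moderator only
// when no other moderator exists.
function pickReviewer(
  moderatorIds: string[],
  actingModeratorId: string,
): string | undefined {
  const others = moderatorIds.filter((id) => id !== actingModeratorId);
  return others[0] ??
    (moderatorIds.includes(actingModeratorId) ? actingModeratorId : undefined);
}
```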

content_snapshot table

Preserves the state of reported content at the time of the report.

Design notes:

  • Essential for fair judgment—reported users may edit or delete content after being reported
  • Both text and HTML versions are stored because some violations may only be apparent in the rendered form (e.g., misleading link text)
  • metadata captures contextual information like author display name, avatar, and timestamps that may change or be lost if the account is modified
  • Linked to flag rather than flag_case because different reporters may see different versions of content if it's edited between reports

| Column | Type | Description |
| ------ | ---- | ----------- |
| id | uuid | Primary key |
| flag_id | uuid | Foreign key to flag |
| post_id | uuid | Foreign key to post |
| content | text | Content text at snapshot time |
| content_html | text | Content HTML at snapshot time |
| metadata | jsonb | Additional metadata (author info, timestamps, etc.) |
| created_at | timestamptz | Snapshot creation time |
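
Snapshot creation amounts to copying the post's current state at report time, so later edits or deletion cannot change what moderators see. A minimal sketch with hypothetical types (`Post`, `ContentSnapshot`) mirroring the columns above:

```typescript
interface Post {
  id: string;
  content: string;
  contentHtml: string;
  authorName: string;
}

interface ContentSnapshot {
  flagId: string;
  postId: string;
  content: string;
  contentHtml: string;
  metadata: Record<string, unknown>;
  createdAt: Date;
}

// Copy the post's state into an immutable snapshot tied to one flag.
// Both text and HTML are kept because some violations are only visible
// in the rendered form (e.g., misleading link text).
function snapshotPost(flagId: string, post: Post, now: Date): ContentSnapshot {
  return {
    flagId,
    postId: post.id,
    content: post.content,
    contentHtml: post.contentHtml,
    metadata: { authorName: post.authorName },
    createdAt: now,
  };
}
```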

Indexes

  • flag(target_account_id) — For querying reports by target user
  • flag(target_post_id) — For querying reports by target content
  • flag(status) — For filtering by status
  • flag(case_id) — For grouping by case
  • flag_case(status) — For moderator queue
  • flag_case(assigned_moderator_id) — For moderator workload
  • flag_action(case_id) — For case action history
  • flag_appeal(action_id) — For finding appeals by action

Implementation phases

Phase 1: Core infrastructure

  • Database schema and migrations
  • Basic flag model and CRUD operations
  • Content snapshot functionality

Phase 2: User-facing features

  • Report UI for articles, notes, and users
  • Report form with free-form reason input
  • Report history page for users

Phase 3: Moderator tools

  • Moderator dashboard
  • Report review and action interface
  • Statistics and analytics

Phase 4: LLM integration

  • Code of conduct version management
  • LLM-based provision matching
  • Analysis result storage and display

Phase 5: Notification system

  • In-app notifications for all parties
  • Email notifications for important actions
  • Notification preferences

Phase 6: Appeal system

  • Appeal submission interface
  • Appeal review workflow
  • Result notifications

Phase 7: ActivityPub federation

  • Outgoing Flag activity support (opt-in by reporter)
  • Incoming Flag activity handling
  • Remote user action handling
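
An outgoing report would be expressed as an ActivityStreams 2.0 Flag activity. The sketch below is an assumption about the eventual payload shape, not a finalized design: following common fediverse practice, the actor is an instance-level actor rather than the reporter, which preserves reporter anonymity even when the report federates, and the object lists the reported account together with the flagged posts.

```typescript
// Build an outgoing ActivityStreams 2.0 `Flag` activity. All URLs are
// illustrative placeholders; the payload shape is a sketch.
function buildFlagActivity(
  instanceActor: string,
  reportedAccount: string,
  reportedPosts: string[],
  reason: string,
) {
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    type: "Flag",
    // The instance actor, not the reporter, keeps the reporter anonymous.
    actor: instanceActor,
    // Reported account plus the specific posts being flagged.
    object: [reportedAccount, ...reportedPosts],
    content: reason,
  };
}
```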

Phase 8: Advanced features

  • Automation features (with careful consideration)
  • Advanced analytics
  • Multilingual support enhancements

Terminology reference

| English | Korean | Description |
| ------- | ------ | ----------- |
| flag/report | 신고 | Notifying of suspected violation content/user |
| code of conduct | 행동 강령 | Community rules |
| moderator | 관리자 | Person with report processing authority |
| censorship | 검열 | Hiding content |
| suspension | 정지 | Restricting account activity |
| appeal | 이의 제기 | Request for action re-review |
| federation | 연합 | Connection between decentralized networks |
| post | 콘텐츠 | Collective term for articles and notes |
| article | 게시글 | Long-form blog-style post |
| note | 단문 | Short microblog-style post |
| timeline | 타임라인 | Content feed |
| fediverse | 연합우주 | ActivityPub-based decentralized social network |
| instance | 인스턴스 | Individual server in the fediverse |
