Skip to content

Conversation

@tomerqodo
Copy link

@tomerqodo tomerqodo commented Dec 4, 2025

User description

Benchmark PR discourse#36132

Type: Clean (correct implementation)

Original PR Title: FIX: silent failure when watched words contains invalid regex
Original PR Description: If a watched words list contains any invalid regex, the test modal won't work because the "compiled" regex (that contains all the watched words) will be invalid.

This PR introduces a fallback to matching individual watched words when the compiled regex is invalid as well as displaying the error for each invalid regex.

Screenshot 2025-11-20 at 10 34 00
Screen.Recording.2025-11-20.at.10.34.05.mov

Internal ref - t/168206
Original PR URL: discourse#36132


PR Type

Bug fix, Enhancement, Tests


Description

  • Handle invalid regex in watched words gracefully with fallback matching

    • Log warnings for individual invalid regexes instead of silent failures
    • Fall back to individual word matching when compiled regex fails
  • Improve error display in admin UI for multiple invalid regexes

    • Show all invalid regex errors with specific error messages
    • Display errors in scrollable list format
  • Add comprehensive test coverage for mixed valid/invalid regex scenarios

    • Test modal functionality with invalid regexes present
    • Verify fallback matching works correctly

Diagram Walkthrough

flowchart LR
  A["Invalid Regex in Watched Words"] --> B["Log Warning + Return nil"]
  B --> C["Filter Out Invalid Regexes"]
  C --> D["Compile Valid Regexes"]
  D --> E{"Compiled Regex Valid?"}
  E -->|Yes| F["Use Compiled Regex"]
  E -->|No| G["Fall Back to Individual Words"]
  F --> H["Display Matches"]
  G --> H
  I["Admin UI"] --> J["Show All Regex Errors"]
  J --> K["List with Error Details"]
Loading

File Walkthrough

Relevant files
Error handling
word_watcher.rb
Add error handling and logging for invalid regexes             

app/services/word_watcher.rb

  • Added error handling for individual invalid regexes with logging
  • Added try-catch block around compiled regex creation to handle
    compilation failures
  • Returns nil for invalid regexes instead of raising exceptions
  • Logs warning messages with word, regex pattern, and error details
+14/-2   
Tests
word_watcher_spec.rb
Add tests for invalid regex compilation handling                 

spec/services/word_watcher_spec.rb

  • Added new test context for invalid regex causing compilation failure
  • Tests verify no exception is raised when invalid regex present
  • Tests confirm valid words still compile and match correctly
  • Tests validate serialized_regexps_for_action works with mixed
    valid/invalid regexes
+36/-0   
admin-watched-words-test.js
Add acceptance tests for invalid regex handling                   

frontend/discourse/tests/acceptance/admin-watched-words-test.js

  • Added test suite for mixed valid and invalid regex scenarios
  • Tests verify test modal works with replace action containing invalid
    regex
  • Tests confirm block action falls back to individual word matching
  • Tests validate unicode flag validation detects multiple invalid
    patterns
  • Tests check error messages display correctly in admin UI
+204/-0 
Enhancement
action.js
Refactor to collect and display multiple regex errors       

frontend/discourse/admin/controllers/admin-watched-words/action.js

  • Changed regexpError getter to regexpErrors returning array of errors
  • Added @cached decorator for performance optimization
  • Collects all invalid regex errors with word and error message
  • Deduplicates errors using Set to avoid duplicate messages
  • Uses unicode flag 'u' for regex validation
+27/-6   
watched-word-testing.gjs
Add error handling and fallback matching in test modal     

frontend/discourse/admin/components/modal/watched-word-testing.gjs

  • Refactored matches getter to matchesAndErrors returning both matches
    and errors
  • Added error handling with try-catch for each word regex compilation
  • Implements fallback to individual word matching when compiled
    expression fails
  • Collects and deduplicates regex errors during matching process
  • Added cleanErrorMessage helper to extract error details
+98/-42 
action.gjs
Update template to display multiple regex errors                 

frontend/discourse/admin/templates/admin-watched-words/action.gjs

  • Changed from single error display to multiple errors list
  • Updated template to iterate over regexpErrors array
  • Displays each error with word name and specific error message
  • Uses new .watched-word-regex-errors styling class
  • Added header text using new translation key
+12/-2   
Formatting
staff_logs.scss
Add styling for regex error list display                                 

app/assets/stylesheets/admin/staff_logs.scss

  • Added new .watched-word-regex-errors class for styling error list
  • Configured max-height with scrollable overflow for multiple errors
  • Applied disc list style and left padding for better readability
+7/-0     
Documentation
client.en.yml
Add translation for multiple regex errors                               

config/locales/client.en.yml

  • Added new translation key invalid_regex_multiple for multiple errors
    header
  • Maintains existing invalid_regex key for backward compatibility
+1/-0     

@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Logging Context: New warnings for invalid watched word regexes are added but it's unclear if
user/context identifiers and consistent structured fields are included in logs to support
full auditability.

Referred Code
        Rails.logger.warn(
          "Watched word '#{word}' has invalid regex '#{r}' for #{action}: #{e.message}",
        )
        nil
      end
    end
    .select { |r| r.present? }
    .join("|")

next if regexp.blank?

# Add word boundaries to the regexp for regular watched words
regexp =
  match_word_regexp(
    regexp,
    engine: engine,
  ) if !SiteSetting.watched_words_regular_expressions?

# Add case insensitive flag if needed
begin
  Regexp.new(regexp, group_key == :case_sensitive ? nil : Regexp::IGNORECASE)


 ... (clipped 5 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Error Detail Exposure: The UI surfaces raw regex engine error messages (e.message) which may expose internal
parsing details to admins; verify this is acceptable for the admin audience and does not
leak sensitive internals.

Referred Code
const key = `${id}:${word}:${e.message}`;
if (!seen.has(key)) {
  seen.add(key);
  const parts = e.message.split(": ");
  const cleanError = parts[parts.length - 1];
  errors.push({ word, error: cleanError });
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Log PII Risk: Logs include the watched word and its regex pattern which could contain user-provided or
sensitive terms; confirm that logging these values is acceptable and sanitized per policy.

Referred Code
        Rails.logger.warn(
          "Watched word '#{word}' has invalid regex '#{r}' for #{action}: #{e.message}",
        )
        nil
      end
    end
    .select { |r| r.present? }
    .join("|")

next if regexp.blank?

# Add word boundaries to the regexp for regular watched words
regexp =
  match_word_regexp(
    regexp,
    engine: engine,
  ) if !SiteSetting.watched_words_regular_expressions?

# Add case insensitive flag if needed
begin
  Regexp.new(regexp, group_key == :case_sensitive ? nil : Regexp::IGNORECASE)


 ... (clipped 5 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Implement server-side fallback for regex matching

The server-side WordWatcher service should implement a fallback to individual
regex matching, similar to the frontend test modal. This will prevent
inconsistencies where the test modal shows matches for invalid compiled regexes,
but the live site silently fails to moderate them.

Examples:

frontend/discourse/admin/components/modal/watched-word-testing.gjs [110-132]
      } catch {
        hasCompiledExpressionError = true;
      }
    });

    if (hasCompiledExpressionError) {
      matches = [];
      this.args.model.watchedWord.words.forEach((word) => {
        try {
          const regexp = new RegExp(

 ... (clipped 13 lines)
app/services/word_watcher.rb [198-231]
    regexps = self.class.compiled_regexps_for_action(action)
    return if regexps.blank?

    match_list = []
    regexps.each do |regexp|
      match = regexp.match(@raw)

      if !all_matches
        return match if match
        next

 ... (clipped 24 lines)

Solution Walkthrough:

Before:

# frontend/discourse/admin/components/modal/watched-word-testing.gjs
get matchesAndErrors() {
  let hasCompiledExpressionError = false;
  try {
    // Attempt to use the single compiled regex
  } catch {
    hasCompiledExpressionError = true;
  }

  if (hasCompiledExpressionError) {
    // Fallback: iterate and match each word's regex individually
  }
  return { matches, errors };
}

# app/services/word_watcher.rb
def word_matches_for_action?(action)
  # This method gets the compiled regexes.
  # If compilation failed for a group, it's silently removed.
  # There is no fallback to individual regex matching here.
  regexps = self.class.compiled_regexps_for_action(action)
  regexps.each do |regexp|
    # ... match against the compiled regex
  end
end

After:

# frontend/discourse/admin/components/modal/watched-word-testing.gjs
// The frontend logic can be simplified if the backend is consistent.
// The main change is in the backend.

# app/services/word_watcher.rb
def word_matches_for_action?(action)
  regexps = self.class.compiled_regexps_for_action(action)

  # If compiled regexps are empty or failed, but there are words
  if regexps.blank? && self.class.words_for_action_exist?(action)
    # Fallback: get individual regexes and match them one by one
    individual_regexps = self.class.regexps_for_action(action)
    # ... logic to match individual regexps
  else
    # Original logic: match against compiled regexes
    regexps.each do |regexp|
      # ... match against the compiled regex
    end
  end
end
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical discrepancy where the frontend test modal has a fallback for invalid compiled regexes, but the backend moderation logic does not, leading to silent failures on the live site.

High
Possible issue
Preserve valid matches during fallback

Modify the fallback logic to trigger only when a compiled regex fails and no
matches have been found, preventing the incorrect discarding of valid matches.

frontend/discourse/admin/components/modal/watched-word-testing.gjs [98-132]

 let matches = [];
 let hasCompiledExpressionError = false;
 
 this.args.model.watchedWord.compiledRegularExpression.forEach((entry) => {
   try {
     const [regexp, options] = Object.entries(entry)[0];
     const wordRegexp = new RegExp(
       regexp,
       options.case_sensitive ? "gu" : "gui"
     );
 
     matches.push(...(this.value.match(wordRegexp) || []));
   } catch {
     hasCompiledExpressionError = true;
   }
 });
 
-if (hasCompiledExpressionError) {
-  matches = [];
+if (hasCompiledExpressionError && matches.length === 0) {
   this.args.model.watchedWord.words.forEach((word) => {
     try {
       const regexp = new RegExp(
         word.regexp,
         word.case_sensitive ? "gu" : "gui"
       );
       let match;
 
       while ((match = regexp.exec(this.value)) !== null) {
         matches.push(match[1] || match[0]);
       }
     } catch (e) {
       addError(word.word, e.message);
     }
   });
 }
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a bug where valid matches from compiled regexes are discarded if a subsequent regex fails, and proposes a correct fix to preserve them.

Medium
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants