@MSAdministrator
Member

Description

Enhanced ScanIcs to extract URLs embedded in text fields (DESCRIPTION, SUMMARY, LOCATION) - a critical capability for detecting phishing attacks delivered via calendar invitations.

Previously, the scanner only extracted URLs from the dedicated iCalendar URL property. However, phishing campaigns commonly embed malicious URLs directly in the event description using patterns like Open<https://malicious-site.com>.

This change adds:

  • A regex pattern (URL_PATTERN) to match HTTP/HTTPS/FTP URLs, including those wrapped in angle brackets
  • A new _extract_urls_from_text() method that scans text content for embedded URLs
  • Integration with _extract_component_convenience_fields() to automatically extract URLs from DESCRIPTION, SUMMARY, and LOCATION fields
  • A new DEFAULT_MAX_URLS_PER_COMPONENT limit (100) to prevent resource exhaustion
  • Updated docstring to document the new URL extraction capability

Additionally, added comprehensive test coverage:

  • New test fixture: test_phishing.ics - an anonymized ICS file modeled after a real phishing calendar invite
  • New test file: test_scan_ics.py with 11 tests covering metadata extraction, component parsing, attendee/organizer extraction, URL extraction from descriptions, and datetime handling
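
For reviewers who want a feel for the approach without opening the diff, the extraction step can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the regex shown is an assumed stand-in for URL_PATTERN, the function is a standalone version of what _extract_urls_from_text() might do, and the de-duplication, document-order output, and trailing-punctuation trimming are assumptions.

```python
import re

# Assumed stand-in for URL_PATTERN: matches http/https/ftp URLs and tolerates
# an optional pair of angle brackets around the URL, as in "Open<https://...>".
URL_PATTERN = re.compile(r"<?((?:https?|ftp)://[^\s<>\"']+)>?", re.IGNORECASE)

# Limit named in the PR description; the value 100 comes from the description.
DEFAULT_MAX_URLS_PER_COMPONENT = 100


def extract_urls_from_text(text, limit=DEFAULT_MAX_URLS_PER_COMPONENT):
    """Return unique URLs found in text, in order of appearance, capped at limit."""
    urls = []
    seen = set()
    for match in URL_PATTERN.finditer(text or ""):
        # Trim punctuation that usually belongs to the sentence, not the URL
        # (an assumption; the real scanner may handle this differently).
        url = match.group(1).rstrip(".,;:!?)")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
            if len(urls) >= limit:
                break  # stop early to bound work on hostile input
    return urls


description = "Good morning,\nOpen<https://malicious-site.com>\nThanks."
found = extract_urls_from_text(description)
```

Stopping at the limit inside the loop, rather than truncating afterwards, keeps memory use bounded even when a description contains a very large number of URLs.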

Describe testing procedures

Added net new tests

Tests include:

  • test_calendar_metadata - Verifies PRODID, VERSION, METHOD extraction
  • test_component_counts - Verifies event, timezone, attendee, organizer counts
  • test_vevent_metadata - Verifies SUMMARY, UID, LOCATION, STATUS fields
  • test_attendee_extraction - Verifies email, name, role, RSVP parsing
  • test_organizer_extraction - Verifies organizer CN and email
  • test_url_extraction_from_description - Critical: Verifies embedded phishing URLs are extracted
  • test_url_total_count - Verifies total URL counter
  • test_description_contains_phishing_indicators - Verifies phishing text patterns are preserved
  • test_datetime_extraction - Verifies DTSTART/DTEND parsing
  • test_no_parse_errors - Verifies clean parsing
  • test_timezone_component - Verifies VTIMEZONE extraction

Sample output

{
  "elapsed": 0.002724,
  "flags": [],
  "total": {
    "components": 5,
    "events": 1,
    "todos": 0,
    "journals": 0,
    "timezones": 1,
    "alarms": 0,
    "attachments": 0,
    "extracted_files": 0,
    "attendees": 2,
    "organizers": 1,
    "urls": 3
  },
  "calendars": [
    {
      "prodid": "Microsoft Exchange Server 2010",
      "version": "2.0",
      "method": "REQUEST",
      "components": [
        {
          "type": "VEVENT",
          "attendees": [
            {
              "email": "[email protected]",
              "name": "ACME Corp IT Help Desk",
              "display_name": "ACME Corp IT Help Desk <[email protected]>",
              "role": "REQ-PARTICIPANT",
              "partstat": "NEEDS-ACTION",
              "rsvp": "TRUE"
            }
          ],
          "organizers": [
            {
              "email": "AcmeCorp",
              "name": "ACME Corp Share-File",
              "display_name": "ACME Corp Share-File <AcmeCorp>"
            }
          ],
          "attachments": [],
          "urls": [
            "http://www.linkedin.com/company/example",
            "https://malicious-phishing-site.lambda-url.us-east-1.on.aws/?e=dGVzdEBleGFtcGxlLmNvbQ==",
            "https://www.facebook.com/example"
          ],
          "summary": "Fw: Reminder - 2026 Annual Work Report ",
          "description": "Good morning,\n\nIs this phishing?...\nOpen<https://malicious-phishing-site.lambda-url.us-east-1.on.aws/?e=dGVzdEBleGFtcGxlLmNvbQ==>\n...",
          "uid": "test-uid-12345-phishing-ics",
          "dtstart": "2026-01-14T18:56:53+00:00",
          "dtend": "2026-01-14T18:56:53+00:00",
          "location": "Conference Room",
          "status": "CONFIRMED"
        }
      ]
    }
  ]
}
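
A consumer of this output could gather the per-component URLs and compare them against the total counter, roughly as below. This is a sketch under stated assumptions: the input is an abbreviated copy of the sample above, collect_urls is a hypothetical helper, and it assumes total.urls equals the sum of the per-component urls lists, which the sample suggests but does not prove.

```python
import json

# Abbreviated copy of the sample output above (only the fields used here).
scan_result = json.loads("""
{
  "total": {"urls": 3},
  "calendars": [
    {"components": [
      {"type": "VEVENT",
       "urls": [
         "http://www.linkedin.com/company/example",
         "https://malicious-phishing-site.lambda-url.us-east-1.on.aws/?e=dGVzdEBleGFtcGxlLmNvbQ==",
         "https://www.facebook.com/example"
       ]}
    ]}
  ]
}
""")


def collect_urls(result):
    """Hypothetical helper: flatten the urls lists of every component."""
    urls = []
    for calendar in result.get("calendars", []):
        for component in calendar.get("components", []):
            urls.extend(component.get("urls", []))
    return urls


all_urls = collect_urls(scan_result)
# Assumed invariant: the total counter matches the flattened list length.
counts_match = len(all_urls) == scan_result["total"]["urls"]
```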

Checklist

[x] My code follows the style guidelines of this project
[x] I have performed a self-review of and tested my code
[x] I have commented my code, particularly in hard-to-understand areas
[x] I have made corresponding changes to the documentation
[x] My changes generate no new warnings
