-
Notifications
You must be signed in to change notification settings - Fork 9
Enhance embed URL handling and validation system #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: embed-url-handling-pre
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR implements a comprehensive embedding system for Discourse that allows external websites to embed forum discussions and automatically import content from RSS/ATOM feeds. The changes introduce several key components:
Core Embedding Infrastructure:
- New
TopicEmbedmodel handles importing external content via URLs using the ruby-readability gem for content parsing and Nokogiri for URL absolutization EmbedControllerprovides the main endpoint (/embed/best) for external sites to retrieve embedded content with host validation and iframe supportTopicRetrieverclass manages URL validation, throttling (60-second cooldown), and delegates to TopicEmbed for content fetching- New database tables (
topic_embeds) track relationships between external URLs and Discourse topics
Content Processing Enhancements:
- Extended
Postmodel with acook_methodsenum supporting both regular markdown and raw HTML processing - Modified
PostCreatorandPostRevisorto handle custom cook methods and optional validation bypassing - New scheduled job
PollFeedautomatically imports RSS/ATOM feeds hourly using the simple-rss gem
Frontend Integration:
- JavaScript embed widget (
embed.js) for third-party sites to load Discourse content in iframes - Responsive embed layout with postMessage communication for dynamic resizing
- Loading states and "best posts" view for embedded content display
- Dedicated CSS styling optimized for embedded contexts
Configuration & Localization:
- New site settings section for embedding configuration (embeddable hosts, feed polling, embed user)
- Comprehensive localization strings for both client and server-side embed functionality
- Updated Disqus import task to leverage the new embed system
The system follows a security model where only configured hosts can embed content, with referer validation and throttling to prevent abuse. Background jobs handle potentially slow content retrieval operations asynchronously.
Confidence score: 1/5
- This PR contains multiple critical security vulnerabilities that make it unsafe to merge in its current state
- The implementation uses deprecated and unsafe methods like
open()for HTTP requests, has weak origin validation in JavaScript, potential XSS vulnerabilities from unescaped content, and SSRF attack vectors - Files requiring immediate attention:
app/models/topic_embed.rb,app/assets/javascripts/embed.js,app/controllers/embed_controller.rb,app/jobs/scheduled/poll_feed.rb,app/views/layouts/embed.html.erb
27 files reviewed, 28 comments
| class CreateTopTopics < ActiveRecord::Migration | ||
| def change | ||
| create_table :top_topics do |t| | ||
| create_table :top_topics, force: true do |t| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Adding force: true makes this migration destructive - it will drop and recreate the table if it exists, potentially causing data loss in production
| create_table :top_topics, force: true do |t| | |
| create_table :top_topics do |t| |
| <%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %> | ||
| <%- else %> | ||
| <%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %> | ||
| <%- end if %> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: Unusual end if syntax - should be just end
| <%- end if %> | |
| <%- end %> |
| require 'ruby-readability' | ||
|
|
||
| opts = opts || {} | ||
| doc = Readability::Document.new(open(url).read, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Using open() is deprecated and creates security vulnerabilities (command injection, SSRF). Use Net::HTTP, HTTParty, or similar safe HTTP libraries instead.
| doc = Readability::Document.new(open(url).read, | |
| doc = Readability::Document.new(Net::HTTP.get_response(URI(url)).body, |
| @@ -0,0 +1,40 @@ | |||
| require 'spec_helper' | |||
| require_dependency 'jobs/regular/process_post' | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Incorrect require_dependency path - should be 'jobs/scheduled/poll_feed' not 'jobs/regular/process_post'
|
|
||
| # Import an article from a source (RSS/Atom/Other) | ||
| def self.import(user, url, title, contents) | ||
| return unless url =~ /^https?\:\/\// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: URL validation is insufficient - allows access to internal services (localhost, 192.168.x.x, etc). Add proper SSRF protection.
|
|
||
| def ensure_embeddable | ||
| raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank? | ||
| raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Host validation can be bypassed if request.referer is nil - the || '' fallback makes URI('').host return nil, which could equal SiteSetting.embeddable_host if it's also nil
|
|
||
| function postMessageReceived(e) { | ||
| if (!e) { return; } | ||
| if (discourseUrl.indexOf(e.origin) === -1) { return; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Security flaw - indexOf allows subdomain attacks. Use exact origin matching: e.origin !== discourseUrl.replace(/\/$/, '')
| if (discourseUrl.indexOf(e.origin) === -1) { return; } | |
| var expectedOrigin = discourseUrl.replace(/\/$/, '').replace(/^(https?:\/\/[^/]+).*/, '$1'); | |
| if (e.origin !== expectedOrigin) { return; } |
| raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank? | ||
| raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host | ||
|
|
||
| response.headers['X-Frame-Options'] = "ALLOWALL" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Setting X-Frame-Options to 'ALLOWALL' permits embedding from any domain, potentially enabling clickjacking attacks from malicious sites
| return if user.blank? | ||
|
|
||
| require 'simple-rss' | ||
| rss = SimpleRSS.parse open(SiteSetting.feed_polling_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Security vulnerability: using open without URL validation allows potential SSRF attacks and local file access
| rss = SimpleRSS.parse open(SiteSetting.feed_polling_url) | |
| require 'net/http' | |
| require 'uri' | |
| uri = URI.parse(SiteSetting.feed_polling_url) | |
| raise ArgumentError, "Invalid URL scheme" unless %w[http https].include?(uri.scheme) | |
| response = Net::HTTP.get_response(uri) | |
| raise "HTTP error: #{response.code}" unless response.is_a?(Net::HTTPSuccess) | |
| rss = SimpleRSS.parse response.body |
|
|
||
| user = nil | ||
| if args[:user_id] | ||
| user = User.where(id: args[:user_id]).first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: use find_by(id: args[:user_id]) instead of where().first for cleaner code
Test 4