Commit 97edd7d

FIX: Strip detection text before truncation (#196)
Besides removing images, we must truncate the text _after_ stripping them; otherwise the text sent for detection can end up empty. For example, a cooked post that looks like this: `<p></p><div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://asd.cloudfront.net/original/4X/c/d/d/asd.jpeg\" data-download-href=\"/uploads/short-url/asd.jpeg?dl=1\" title=\"IMG_20928\"><img src=\"https://asd.asd.net/optimized/4X/c/d/d/asd.jpeg\" alt=\"IMG_2029\" data-base62-sha1=\"asd\" width=\"666\" height=\"500\" srcset=\"https://asd.cloudfront.net/optimized/4X/c/d/d/asd.jpeg, https://asd.cloudfront.net/optimized/4X/c/d/d/asd.jpeg 1.5x, https://asd.cloudfront.net/optimized/4X/c/d/d/asd.jpeg 2x\" data-dominant-color=\"767065\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use href=\"#far-image\"></use></svg><span class=\"filename\">IMG_2029</span><span class=\"informations\">1920×1440 742 KB</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use href=\"#discourse-expand\"></use></svg>\n</div></a></div>\n<p>L’església romànica de Santa Margarida.</p>` should have the lightbox wrapper `div` stripped and send `<p>L’església romànica de Santa Margarida.</p>`, but it currently sends `<p></p>` because truncation runs before stripping.
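The ordering problem can be reproduced in isolation. The following is a minimal sketch in plain Ruby, not the plugin's code: `LIMIT`, `strip_images`, and `cut` are hypothetical stand-ins for `DETECTION_CHAR_LIMIT`, `strip_tags_for_detection`, and ActiveSupport's `truncate(..., omission: nil)`, and the regex-based stripping is an assumption made only to keep the example self-contained.

```ruby
# Hypothetical stand-in for DETECTION_CHAR_LIMIT, kept small for illustration.
LIMIT = 20

# Crude stand-in for strip_tags_for_detection: removes <img ...> tags,
# including a tag that was cut off before its closing ">".
def strip_images(html)
  html.gsub(/<img\b[^>]*(?:>|\z)/, "")
end

# Plain-Ruby stand-in for String#truncate(limit, omission: nil).
def cut(text, limit)
  text[0, limit]
end

cooked = "<img src='http://example.com/image.png' />" + "hello world"

# Old order: the limit is spent on markup, so nothing useful survives.
truncate_then_strip = strip_images(cut(cooked, LIMIT))

# New order: markup is removed first, then the remaining text is truncated.
strip_then_truncate = cut(strip_images(cooked), LIMIT)

puts truncate_then_strip.inspect
puts strip_then_truncate.inspect
```

Under these assumptions, the old order leaves an empty string, while the new order keeps the post's text.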
1 parent 620d774 commit 97edd7d

2 files changed: 8 additions and 2 deletions

app/services/discourse_translator/base.rb

Lines changed: 3 additions & 2 deletions
@@ -79,8 +79,9 @@ def self.strip_tags_for_detection(detection_text)
     end
 
     def self.text_for_detection(topic_or_post)
-      strip_tags_for_detection(
-        get_text(topic_or_post).truncate(DETECTION_CHAR_LIMIT, omission: nil),
+      strip_tags_for_detection(get_text(topic_or_post)).truncate(
+        DETECTION_CHAR_LIMIT,
+        omission: nil,
       )
     end
 
spec/services/base_spec.rb

Lines changed: 5 additions & 0 deletions
@@ -82,6 +82,11 @@ class EmptyTranslator < DiscourseTranslator::Base
       post.cooked = text
       expect(DiscourseTranslator::Base.text_for_detection(post)).to eq(text)
     end
+
+    it "strips text before truncation" do
+      post.cooked = "<img src='http://example.com/image.png' />" + "a" * 1000
+      expect(DiscourseTranslator::Base.text_for_detection(post)).to eq("a" * 1000)
+    end
   end
 
   describe ".text_for_translation" do
