@mistydemeo mistydemeo commented Dec 7, 2025

This adds a check to the oembed preview code for cases where the oembed response contains less information than the page's OpenGraph data; in those cases, the oembed is ignored and the OpenGraph data is retained. At the moment, the only such case is Mastodon posts, but the check could be extended in the future.

Mastodon's oembed is JavaScript-based: it contains only stub HTML that's meant to be filled in by client-side JavaScript, which Synapse doesn't execute, so it doesn't include the actual post text. The og:description, on the other hand, does contain the post text. Because Synapse currently always prefers the oembed over OpenGraph, it ends up replacing the actual post content with a generic, unhelpful preview.

See mastodon/mastodon#34710 for context.
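
To illustrate the intended behaviour, here is a minimal sketch of the merge rule described above. It is not the code in this PR: `merge_preview` and its parameters are hypothetical names, and the "is this oembed a stub?" decision is passed in as a plain boolean rather than computed.

```python
from typing import Dict, Optional


def merge_preview(
    og: Dict[str, str],
    oembed_description: Optional[str],
    oembed_is_stub: bool,
) -> Dict[str, str]:
    """Combine OpenGraph data with an oembed description.

    Hypothetical helper: the oembed text only replaces og:description
    when it is present and has not been flagged as stub content.
    """
    merged = dict(og)  # copy so the caller's OpenGraph data is untouched
    if oembed_description and not oembed_is_stub:
        merged["og:description"] = oembed_description
    return merged
```

With a stub oembed, the og:description passes through unchanged; with a useful oembed, it is replaced as before.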

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

@mistydemeo mistydemeo requested a review from a team as a code owner December 7, 2025 18:59

CLAassistant commented Dec 7, 2025

CLA assistant check
All committers have signed the CLA.

@mistydemeo mistydemeo force-pushed the reject_mastodon_oembed branch from 085f270 to c57765e Compare December 7, 2025 19:00
@mistydemeo
Author

I also considered matching on the provider, but unfortunately Mastodon fills in the provider with the site name, so it's different for every Mastodon instance. The only two places where we can reliably tell that a result comes from Mastodon are a) the "View on Mastodon" text and b) the mastodon-embed class on the root <blockquote> element.
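
For reference, option b) could be checked roughly like this. This is a sketch using only the standard library's `html.parser` to stay self-contained; the function and class names are hypothetical, and the actual change may well use whatever HTML parsing Synapse already has in its oembed code.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _RootTagSniffer(HTMLParser):
    """Records the tag name and class list of the first element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.root_tag: Optional[str] = None
        self.root_classes: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Only the first start tag matters; later elements are ignored.
        if self.root_tag is None:
            self.root_tag = tag
            self.root_classes = (dict(attrs).get("class") or "").split()


def looks_like_mastodon_embed(oembed_html: str) -> bool:
    """Return True if the first element is <blockquote class="... mastodon-embed ...">."""
    sniffer = _RootTagSniffer()
    sniffer.feed(oembed_html)
    sniffer.close()
    return (
        sniffer.root_tag == "blockquote"
        and "mastodon-embed" in sniffer.root_classes
    )
```

Checking the class rather than the "View on Mastodon" text avoids depending on a string that could be reworded or localized, which is an assumption on my part rather than something confirmed upstream.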

Tumblr provides a generic oembed result that consists just of the
post URL. The og:description, on the other hand, contains an actual
abbreviated copy of the post, which is more useful.

This also adds a method to ignore other oembed results based on the
provider string. That wasn't possible for Mastodon, but it is for
Tumblr.
@mistydemeo
Author

I've added a second commit to also skip oembed text for Tumblr. Like Mastodon, Tumblr's oembed contains stub content and JavaScript to render a rich post view; in this case, the stub content is just the URL to the post, with no content. Here's what those previews look like right now:

image

And here's the oembed for this post: https://www.tumblr.com/oembed/1.0?url=https://www.suppermariobroth.com/post/802233242234339328/main-blog-patreon-twitter-bluesky-small

As with Mastodon, though, the og:description contains static content that's actually useful. I think it would be better for Synapse not to overwrite the useful og:description with the oembed's stub content. Since Tumblr has a reliable provider_name in its oembed, I've added a list of providers to skip as a constant and updated the oembed parser to also skip replacing the description if it encounters a provider in that list. Right now, the list just contains "Tumblr", but it can be updated as needed in the future.
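
The provider check amounts to something like the sketch below. The constant and function names here are placeholders for illustration, not necessarily the ones used in the commit, and exact string matching on provider_name is an assumption.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical constant: providers whose oembed description is a stub
# and should not replace og:description.
OEMBED_PROVIDERS_TO_SKIP: FrozenSet[str] = frozenset({"Tumblr"})


def should_skip_oembed_description(oembed: Dict[str, Any]) -> bool:
    """Return True if the oembed response names a provider on the skip list."""
    provider = oembed.get("provider_name")
    # provider_name is optional in oembed responses, so guard against
    # missing or non-string values before the membership test.
    return isinstance(provider, str) and provider in OEMBED_PROVIDERS_TO_SKIP
```

For example, `should_skip_oembed_description({"provider_name": "Tumblr"})` is true, while a response with no provider_name falls through to the existing behaviour.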
