@obenland
Member

obenland commented Jan 7, 2025

Should we sideload emoji images?

Fixes #970.

See https://mastodon.social/@obenland/113788023027390776
See https://obietester.blog/2025/01/06/99/#comment-31

Proposed changes:

  • Adds a callback function that replaces custom emoji strings with their image representations when available.
  • Adds a unit test to cover the new function.
  • Runs the author name and comment content through the new callback on insert and update.
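For illustration, here is a minimal sketch of the replacement step, written in Python rather than the plugin's PHP. It assumes custom emoji arrive as ActivityPub tags of type Emoji, each with a name such as :yikes: and an icon URL, which is how Mastodon federates them. The function name, the shortcode pattern, and the generated markup are assumptions for this sketch, not the code in this PR.

```python
import html
import re


def replace_custom_emoji(text, tags):
    """Replace :shortcode: occurrences with <img> tags for known Emoji tags."""
    # Build a lookup from emoji name (e.g. ":yikes:") to its image URL.
    emoji = {
        tag["name"]: tag["icon"]["url"]
        for tag in tags
        if tag.get("type") == "Emoji"
        and tag.get("name")
        and tag.get("icon", {}).get("url")
    }
    if not emoji:
        return text

    def to_img(match):
        name = match.group(0)
        url = emoji.get(name)
        if url is None:
            # Unknown shortcode: leave it as plain text.
            return name
        return '<img class="emoji" src="{}" alt="{}" />'.format(
            html.escape(url, quote=True),
            html.escape(name.strip(":"), quote=True),
        )

    # Assumed shortcode shape: letters, digits, and underscores between colons.
    return re.sub(r":[A-Za-z0-9_]+:", to_img, text)
```

Shortcodes without a matching tag are left untouched, so ordinary text that happens to contain colons is not rewritten.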

Other information:

  • Have you written new tests for your changes, if applicable?

Testing instructions:

  • Go to Mastodon, for example, and reply to a federated test post using a custom emoji such as :yikes: or :AngeryCat:.
  • Go to the comment list in wp-admin and make sure the emoji render as images.

@obenland obenland requested review from jeherve and a team January 7, 2025 17:32
@obenland obenland self-assigned this Jan 7, 2025
@pfefferle
Member

I think before allowing images in comments, we have to think about and implement proper blocking and moderation tooling (even more so if we are sideloading images): https://www.theverge.com/2023/7/24/23806093/mastodon-csam-study-decentralized-network

@obenland
Member Author

I just pushed an update that moves the replacement into a filter that runs after comment_content and the author have been sanitized. That way, only custom emoji images will be added to the content.

@jeherve
Member

jeherve commented Jan 13, 2025

Heads up: This PR adds img tags to the list of allowed HTML tags for Interactions!

Would sideloading the image onto the site with media_sideload_image be helpful here?

  • It would ensure the images remain available in the future.
  • It would avoid image hotlinking, which can be frowned upon.
  • It would add some validation to the image data before simply displaying it on the site.

Of course, we'd need to be okay with potentially adding a lot of new media to a site once that's enabled. I'm not sure that's okay.

@obenland
Member Author

> It would avoid image hotlinking, which can be frowned upon.

With custom emoji being shared tags, is that not kind of expected?

@jeherve
Member

jeherve commented Jan 14, 2025

> With custom emoji being shared tags, is that not kind of expected?

I'm honestly not sure what the common practice is with other Fediverse software. From what I can tell, on Mastodon and on GoToSocial the images from remote instances seem to be cached on the local server:

GoToSocial:

[image]

Mastodon:

[image]

@obenland obenland marked this pull request as draft February 19, 2025 23:26
@Jiwoon-Kim

This comment was marked as outdated.

@obenland
Member Author

obenland commented Jun 9, 2025

@Jiwoon-Kim I appreciate your input and feedback, but comments and issues of this length are not helpful. They generally lack a specific ask or suggestion and are incredibly hard to read. Going forward, please keep comments/issues concise and actionable, like I asked for previously.

@Jiwoon-Kim

@obenland Thank you for your feedback, and I really appreciate the attention.

As someone coming from a non-developer background, I tend to approach these topics from a more experiential and design-oriented perspective. That often leads me to start from broader conceptual ideas before narrowing down to specifics—especially when things are interconnected or have potential design implications that aren't immediately obvious.

I understand that this approach can result in comments that feel messy or overwhelming, and I apologize if it made the discussion harder to follow. I added the reference materials at the end to reduce the need for re-investigation, but I realize now that might have come across as overly verbose.

Also, since this pull request already exists, I thought discussing the context and UX considerations here would help avoid scattering the conversation into too many isolated issues. But I see now that clarity and conciseness in individual issues is important, so I’ll aim to keep future contributions more structured and actionable.

If you ever have time, I’d be grateful for any advice on how to break down larger conceptual suggestions into well-scoped issues. I’m eager to learn and collaborate more effectively with the team.

Thanks again!


Also, just to explain a bit about my workflow — I usually start with an idea in Korean, then use GPT to help flesh it out and translate it into English. Sometimes I’ll use Gemini or Perplexity to refine the wording too. So I’m not actually writing or editing in English directly most of the time.

If this were a Korean-language project, breaking things down into smaller, well-scoped issues like you suggested would honestly be much easier for me. But when working in English, it takes quite a lot more energy for me to carefully read through, translate, and restructure everything. That’s why I sometimes rely on machine translation to double-check and just post the draft as it is.

Hope you can understand that part of my process.

Anyway — I actually ran your feedback and my original comment through Gemini to auto-organize it a bit, and I’ll drop that summary in a follow-up comment below.

@obenland
Member Author

I do understand. Maybe part of your workflow could be to instruct your LLM to use natural language and distill its response to a sentence or two?

@Jiwoon-Kim

This comment was marked as outdated.

Replace multiple database queries with a single batched query and fast lookup arrays. This eliminates the N+1 query problem, where each emoji required separate database calls.

Changes:
- Single get_posts() query instead of one per emoji
- Fast O(1) lookup arrays instead of O(n) loops
- Consolidated meta query building into single loop
- Removed redundant array collections
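The batching idea can be sketched as follows. This is a Python illustration using an in-memory SQLite table, not the plugin's get_posts() call; the `attachments` table, its columns, and the function name are hypothetical. One `IN (...)` query fetches every known emoji at once, and the resulting dictionary gives O(1) lookups per emoji.

```python
import sqlite3


def find_attachments(conn, urls):
    """Return {source_url: attachment_id} for all given URLs using one query."""
    if not urls:
        return {}
    # De-duplicate while keeping order so each URL is bound only once.
    unique_urls = list(dict.fromkeys(urls))
    placeholders = ", ".join("?" for _ in unique_urls)
    rows = conn.execute(
        "SELECT source_url, id FROM attachments "
        "WHERE source_url IN ({})".format(placeholders),
        unique_urls,
    ).fetchall()
    # The dictionary replaces per-emoji loops with constant-time lookups.
    return dict(rows)
```

Any URL missing from the returned dictionary is one that still needs to be downloaded.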

Move temp file cleanup inside success block to prevent cleanup attempts when no file was created. This ensures temp files are properly cleaned up only when download_url() succeeds.

Adds a 10-second timeout to the download_url() call when downloading emoji files to prevent long-running requests.

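A rough Python sketch of the control flow described in these two commits, under stated assumptions: `download` and `store` are hypothetical callables standing in for the plugin's download and attachment-creation steps, and `fake_download` exists only so the example runs without network access. The point is that cleanup is reached only after a temp file exists, and that the timeout is passed explicitly.

```python
import os
import tempfile


def sideload_emoji(url, download, store, timeout=10):
    """Download an emoji to a temp file, store it, then remove the temp file.

    `download(url, timeout)` returns a temp file path or raises OSError.
    `store(path)` creates the attachment and returns its ID.
    Both are hypothetical stand-ins, not functions from the plugin.
    """
    try:
        tmp_path = download(url, timeout)
    except OSError:
        # The download failed, so no temp file exists and nothing needs cleanup.
        return None

    try:
        return store(tmp_path)
    finally:
        # Cleanup sits on the success path: it runs only once a file was created.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


def fake_download(url, timeout):
    """Offline stand-in for a real download; writes placeholder bytes."""
    fd, path = tempfile.mkstemp(suffix=".png")
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"fake image bytes")
    return path
```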
- Use get_comment_author filter instead of comment_author for proper timing
- Improve emoji detection to check for class="emoji" instead of just "emoji"
- Use proper HTML entity decoding with ENT_QUOTES | ENT_HTML5 flags
- Add test coverage for emoji in comment author names

Break up large replace_custom_emoji method into focused single-responsibility methods:
- extract_emoji_data(): Parse emoji from activity tags
- get_emoji_attachments(): Handle database queries and build lookup arrays
- get_or_create_emoji_attachment(): Decision logic for reuse vs download
- download_emoji(): Handle file download and WordPress attachment creation
- replace_emoji_in_text(): Handle text replacement with HTML

Improvements:
- Better separation of concerns following SRP
- Improved documentation with detailed parameter types
- Cleaner alt text (removes colons from emoji names)
- Simplified fallback logic using consistent emoji URLs

Moved custom emoji processing logic from Interactions to a new Activitypub\Emoji class for better separation of concerns and maintainability. Updated Interactions and related tests to use the new Emoji class.

Successfully merging this pull request may close these issues.

Federated comments: Fediverse handles including custom emoticons are displayed in plain text