Add story clustering to group duplicate stories across feeds by samuelclay · Pull Request #2057 · samuelclay/NewsBlur

samuelclay · 2026-02-12T05:51:45Z

Summary

New apps/clustering module that groups stories with matching or similar titles from different feeds using exact normalized title matching plus fuzzy Jaccard similarity on significant words
Clusters are stored in Redis (sCL: / zCL: keys, 14-day TTL) and displayed as an always-expanded inline source list below the representative story in both river and single-feed views
Celery task ComputeStoryClusters triggered after feed updates for feeds with premium subscribers, rate-limited to once per 6h per feed
Briefing integration: shared normalize_title(), pre-computed cluster lookups in _find_duplicate_stories(), and cluster annotations in AI summary prompts
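As a rough illustration of the matching described above (exact normalized title match, plus fuzzy Jaccard similarity on significant words), here is a minimal sketch. It is not the code in apps/clustering: the stopword list, the 0.6 threshold, the `len(w) > 2` filter, the union-find grouping, and the body of `normalize_title()` are all assumptions made for the example. Only the function name `normalize_title` comes from this PR.

```python
import re
from itertools import combinations

# Hypothetical stopword list; the PR does not say which words count as "significant".
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "on", "for", "with", "at", "by", "is", "as"}
)


def normalize_title(title):
    """Lowercase, replace punctuation with spaces, collapse whitespace (assumed rules)."""
    title = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(title.split())


def significant_words(title):
    """Words of the normalized title that are not stopwords and longer than 2 chars."""
    return frozenset(
        w for w in normalize_title(title).split() if w not in STOPWORDS and len(w) > 2
    )


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_stories(stories, threshold=0.6):
    """Group stories from different feeds whose titles match exactly or fuzzily.

    stories: list of (story_id, feed_id, title) tuples.
    Returns a list of clusters, each a list of story_ids with at least 2 members.
    """
    parent = {story_id: story_id for story_id, _, _ in stories}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (id_a, feed_a, title_a), (id_b, feed_b, title_b) in combinations(stories, 2):
        if feed_a == feed_b:
            continue  # only link stories that come from different feeds
        exact = normalize_title(title_a) == normalize_title(title_b)
        fuzzy = jaccard(significant_words(title_a), significant_words(title_b)) >= threshold
        if exact or fuzzy:
            root_a, root_b = find(id_a), find(id_b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for story_id, _, _ in stories:
        groups.setdefault(find(story_id), []).append(story_id)
    return [ids for ids in groups.values() if len(ids) > 1]
```

Because the grouping is transitive, two stories from the same feed can still land in one cluster if both match a story from a third feed. The pairwise loop is quadratic in the number of stories, so a real implementation would presumably bucket candidates first.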

Test plan

Open All Site Stories in river view, verify cluster sources appear inline below stories that have duplicates across feeds
Click into a single feed that has clustered stories, verify clusters also appear there
Verify cluster quality: all grouped stories should be about the same event from different feeds
Check dark theme styling of cluster source rows
Verify non-premium users do not see clusters
Confirm Celery task runs after feed updates (docker logs newsblur_celery | grep Clustering)

New apps/clustering module that groups stories with matching or similar titles from different feeds. Uses exact normalized title matching plus fuzzy Jaccard similarity on significant words. Clusters are stored in Redis and displayed inline below the representative story in both river and single-feed views. Triggered via Celery task after feed updates. Premium-only feature, always on.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
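The summary says the ComputeStoryClusters task in this first commit runs only for feeds with premium subscribers and at most once per 6h per feed. The follow-up commit e632e7a removes the rate limiting, so the sketch below only illustrates the gating as first described. The class and method names are hypothetical, and the in-memory dict is a stand-in for whatever store the real task uses.

```python
from datetime import datetime, timedelta

# Interval taken from the PR summary ("once per 6h per feed").
CLUSTER_INTERVAL = timedelta(hours=6)


class ClusterRateLimiter:
    """Hypothetical in-memory gate deciding whether to run clustering for a feed."""

    def __init__(self, interval=CLUSTER_INTERVAL):
        self.interval = interval
        self._last_run = {}  # feed_id -> datetime of the last accepted run

    def should_run(self, feed_id, has_premium_subscribers, now):
        """Return True and record the run if the feed qualifies right now."""
        if not has_premium_subscribers:
            return False
        last = self._last_run.get(feed_id)
        if last is not None and now - last < self.interval:
            return False  # still inside the per-feed window
        self._last_run[feed_id] = now
        return True
```

Passing `now` in explicitly keeps the gate deterministic and easy to test; a caller would typically supply `datetime.now()` when a feed update finishes.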

samuelclay and others added 2 commits February 11, 2026 21:51

Gate story clustering to archive subscribers and remove rate limiting

e632e7a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

samuelclay wants to merge 2 commits into main from story-clusters
