Skip to content
Closed
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
341c874
First batch of work
ReverM Dec 15, 2025
905d4f7
Added restriction to the tag pages
ReverM Dec 15, 2025
7ad262f
Merge branch 'otwcode:master' into AO3-6067
ReverM Dec 15, 2025
05bf0e1
Reverted autostyle
ReverM Dec 15, 2025
8502d6c
Merge branch 'otwcode:master' into AO3-6067
ReverM Dec 17, 2025
d9c3cbe
Test + Fix
ReverM Dec 18, 2025
a5f3f67
Merge branch 'AO3-6067' of https://github.com/ReverM/otwarchive into …
ReverM Dec 18, 2025
d5c7307
Changed some names for readability
ReverM Dec 18, 2025
e27e433
Did not forget linter, forgot rubocop
ReverM Dec 18, 2025
d109571
Add test for no double counts
ReverM Dec 19, 2025
556d1e7
Made sure that non-canonical aren't shown as well
ReverM Dec 19, 2025
e876c2b
Update app/models/search/search_counts.rb
ReverM Jan 1, 2026
fac8449
Changed behaviour when looking at collection specific tag
ReverM Jan 8, 2026
9e8fa6f
Fixed lint issue
ReverM Jan 8, 2026
dd384ef
disabled caching
ReverM Jan 12, 2026
ce9edfc
Adressed comment and tried suggestion to make test work
ReverM Jan 14, 2026
8db30ad
Moved suggestion
ReverM Jan 14, 2026
e0d1c2a
Maybe here will work?
ReverM Jan 14, 2026
4ed7df5
Maybe this will do?
ReverM Jan 15, 2026
ee889fb
Trying something before resorting to writing a spec test
ReverM Jan 15, 2026
c225c80
For testing what the output is on GitHub's autotest
ReverM Jan 15, 2026
cd6206f
Attempt at a fix
ReverM Jan 17, 2026
27c6d96
Changed search_count to see if it is more robust against test
ReverM Jan 17, 2026
c37aac8
Applied suggestiong to not have ifthen
ReverM Jan 20, 2026
9a783e2
Requested changes to tests
ReverM Jan 23, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion app/controllers/fandoms_controller.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ class FandomsController < ApplicationController

def index
if @collection
@media = Media.canonical.by_name - [Media.find_by(name: ArchiveConfig.MEDIA_NO_TAG_NAME)] - [Media.find_by(name: ArchiveConfig.MEDIA_UNCATEGORIZED_NAME)]
@media = Media.canonical.by_name.where.not(name: [ArchiveConfig.MEDIA_UNCATEGORIZED_NAME, ArchiveConfig.MEDIA_NO_TAG_NAME]) + [Media.uncategorized]
@page_subtitle = t(".collection_page_title", collection_title: @collection.title)
@medium = Media.find_by_name(params[:media_id]) if params[:media_id]
@counts = SearchCounts.fandom_ids_for_collection(@collection)
Expand Down
19 changes: 14 additions & 5 deletions app/controllers/tags_controller.rb
Original file line number Diff line number Diff line change
Expand Up @@ -70,14 +70,23 @@ def show
# if tag is NOT wrangled, prepare to show works, collections, and bookmarks that are using it
if [email protected] && [email protected]
@works = if logged_in? # current_user.is_a?User
@tag.works.visible_to_registered_user.paginate(page: params[:page])
@tag.works.visible_to_registered_user
elsif logged_in_as_admin?
@tag.works.visible_to_admin.paginate(page: params[:page])
@tag.works.visible_to_admin
else
@tag.works.visible_to_all.paginate(page: params[:page])
@tag.works.visible_to_all
end
@bookmarks = @tag.bookmarks.visible.paginate(page: params[:page])
@collections = @tag.collections.paginate(page: params[:page])
@bookmarks = @tag.bookmarks.visible
@collections = @tag.collections
# if from a collection, only show works from that collection
if @collection.present?
@works &= @collection.works
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not include works in the subcollections, but the counts on the collection page do include them, so this is a mismatch - I can click on "new fandom (1)" and get to a page with no works. For canonical tags the works from the subcollections are included, so we should do the same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, is there a quick way of having a list of collection from a parent collection? (since I assume child of child are also included)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fortunately, child collections can't have children -- subcollections only go one level deep.

@bookmarks = []
@collections = []
end
@works = @works.paginate(page: params[:page])
@bookmarks = @bookmarks.paginate(page: params[:page])
@collections = @collections.paginate(page: params[:page])
end
# cache the children, since it's a possibly massive query
@tag_children = Rails.cache.fetch "views/tags/#{@tag.cache_key}/children" do
Expand Down
2 changes: 2 additions & 0 deletions app/controllers/works_controller.rb
Original file line number Diff line number Diff line change
Expand Up @@ -734,6 +734,8 @@ def load_owner
else
redirect_to(tag_works_path(@tag.merger)) && return
end
elsif @collection.present?
redirect_to(collection_tag_path(@collection, @tag)) && return
else
redirect_to(tag_path(@tag)) && return
end
Expand Down
19 changes: 17 additions & 2 deletions app/models/search/search_counts.rb
Original file line number Diff line number Diff line change
Expand Up @@ -32,14 +32,14 @@ def bookmarkable_count_for_collection(collection)
def fandom_count_for_collection(collection)
Rails.cache.fetch(collection_cache_key(collection, :fandom_count),
collection_cache_options) do
collection_works_query(collection).field_count(:fandom_ids)
get_fandom_hash(collection_works_query(collection)).count
end
end

def fandom_ids_for_collection(collection)
Rails.cache.fetch(collection_cache_key(collection, :fandom_ids),
collection_cache_options) do
collection_works_query(collection).field_values(:fandom_ids)
get_fandom_hash(collection_works_query(collection))
end
end

Expand Down Expand Up @@ -114,6 +114,21 @@ def logged_in
User.current_user ? :logged_in : :logged_out
end

# Helper function to get both categorized and uncategorized fandoms from a WorkQuery object
def get_fandom_hash(query)
list = []
query.search_results.each do |work|
work.fandoms.each do |fandom|
Comment on lines +120 to +121
Copy link
Contributor

@Bilka2 Bilka2 Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to load all works in the collection and all fandoms of those works. This is not great, especially in the context in which this was implemented to be done via elasticsearch: https://otwarchive.atlassian.net/browse/AO3-6116. Since I'm writing this during maintenance and you can't look it up right now: Yuletide has 42 806 works. So we'd be loading all that at once, and then also all the fandom tags of those works.

During the migration of collections to elatiscsearch we tried to optimise a similar query and despite that it was still quite slow: #5234 (comment). This here isn't as bad because it doesn't include bookmarks or the crossover check, but we will also run this code here regularly, so we have higher performance requirements than a one-time task.

Well, the above isn't quite correct: As is, query.search_results will return one page worth of works. If we were to fix that (so that we don't get wrong information), then we'd load all of it.

So unfortunately I think we can't go with this approach. If we are going to get this information, we should get it directly from elasticsearch, maybe with some computation in Ruby, not with additional large database queries.

I believe this issue was marked as blocked by collections in elasticsearch because the original plan there was to include fandom information on the collection, so we could have gotten information via that. However, that plan was changed because it was looking too performance intensive. So now we have to solve this issue here differently.

With the existing information we have on works in elasticsearch, the uncategorized fandoms are included in filter_ids, mixed with a bunch of other tag types. So if we were to use them here, we'd still have to get the tags from the db to get their type. That's slightly better because we don't have to get the works, but still not great.

This may work on the fandoms page, since we already have to get the fandoms there to display them, so if we're clever about it we wouldn't introduce another query. But for the count in the sidebar, that doesn't apply.

In the big picture, we should put the data into elasticsearch in a format we can use well (so doing the computation when indexing), not do lots of computation afterwards to use it. So I'm leaning towards putting this information into elasticsearch over trying to use filter_ids. But that means reindexing etc. So I don't know where to go from here. Do you have an idea?

Copy link
Contributor Author

@ReverM ReverM Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While imperfect, an idea could be to combine both methods? That is, if a collection has a lot of work, we could go with the previous approach, but if it has less then we could have a more exact count using the tags?

Another would be to put the uncategorized fandom_ids in the fandom_id field of the query when they're added so we can go with the previous method

Could we add a property to query so that we can fetch only items with an uncategorized tag of some sort? That way we could do two queries (not ideal) then we do one query with fandom_ids for all, then one query that's more involved with the filter_ids when a work has an uncategorized tag then fetch the ones that are uncategorized fandoms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a collection has a lot of work, we could go with the previous approach, but if it has less then we could have a more exact count using the tags?

That's not viable because yuletide is one of the collections where it's especially important that uncategorized fandoms are shown, since it has many of those each year.

Another would be to put the uncategorized fandom_ids in the fandom_id field of the query

We can't modify what is in the field like that, it's used by more code than the counts here, like the facets (the filter sidebar) that expect only canonical tags

Could we add a property to query ...

Something like that is what I mean with "putting the right information into elasticsearch", yes. If we're adding a new property we should add it in a way that is directly useful here (e.g. all fandom tag ids or something), not something that needs another workaround in the queries. But that has the downsides I noted - we'd need to reindex completely, since the mapping changes, plus put a bunch of brainpower into designing that property. Which brings to the end of my last comment: I don't know what we should be doing here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see some possibilities for that new propriety:

We could add a new property that is all fandom tag ids. This would have the downside of having to parse through the tags to check canonicality, which might be not as useful. Some fandoms have a lot of synonyms so I'm not sure if that's the correct way forward. Plus, this feels like something which would have limited usefulness.

I'm unsure about how true that would be, but could having multiple uncategorized properties be useful for tag wrangling? For instance, this could allow someone to get all of the uncategorized tags or fandoms from all work with a specific fandom. I'm unsure if we can have one property that's like a list of list. So like freeform tags, relationship tags, character tags, which I guess could be more generically a list of list with types as keys. This is what feels the more generic and possibly more justifying a huge reindex.

I'm not full certain what is allowed as return type of elastic search, but this feels like a potential idea?

Finally, I apologize for not having understood what you meant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've discussed internally and unfortunately the conclusion is that we don't consider it worth the complications and maintenance cost to add something to the work elasticsearch index only for this usecase.

As I suspected above, this Jira issue was created under the assumption that this data would be available in the collections elasticsearch index. Since that was too complicated to implement, this feature here now also stops being viable, since the information isn't already available in elasticsearch and we don't want to add it just for this feature. I will close the Jira issue as well.

Apologies that this wasn't discovered, discussed and decided earlier in the process.

(Tag wrangling already works by having enough information in the tags elasticsearch index to show unwrangled tags per fandom, meaning tags used on works tagged with that fandom but not yet "connected" to it.)

if fandom.unwrangled? || fandom.canonical?
list.push(fandom.id)
else
list.push(fandom.merger_id)
end
end
end
list.tally
end

######################################################################
# CACHE OPTIONS
######################################################################
Expand Down
8 changes: 7 additions & 1 deletion app/views/tags/show.html.erb
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,13 @@
<p><%= ts("This tag has not been marked common and can't be filtered on (yet).") %></p>
<% unless @works.blank? %>
<div class="work listbox group">
<h3 class="heading"><%= ts('Works which have used it as a tag') %>:</h3>
<h3 class="heading">
<% if @collection.present? %>
<%= t(".unwrangled.header.collection") %>
<% else %>
<%= t(".unwrangled.header.generic") %>
<% end %>
</h3>
<%= paginated_section @works do %>
<ul class="index group">
<% for work in @works %>
Expand Down
4 changes: 4 additions & 0 deletions config/locales/views/en.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2405,6 +2405,10 @@ en:
filter_bookmarks: filter bookmarks
filter_works: filter works
list_fandom_tags_html: You can also access a list of %{fandom_relationship_tags_link}.
unwrangled:
header:
collection: Works in this collection which have used it as a tag
generic: Works which have used it as a tag
wrangle:
manage:
comments: Comments (%{count})
Expand Down
48 changes: 48 additions & 0 deletions features/collections/collection_navigation.feature
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,54 @@ Feature: Basic collection navigation
Then I should not see "High School Musical"
And I should see "Steven's Universe"

Scenario: Non-Canonical Fandoms are not double counted
Given I have the collection "Canons" with name "canon"
And I have a canonical "TV Shows" fandom tag named "TV"
And a synonym "Television" of the tag "TV"
When I am logged in as "Screen"
And I post the work "Full name" with fandom "Television" in the collection "Canons"
And I post the work "Small name" with fandom "TV" in the collection "Canons"
And the collection counts have expired
And I go to "Canons" collection's page
Then I should see "Fandoms (1)"
And I should see "Television"
When I follow "Fandoms (1)"
Then I should not see "Television"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also check TV (2) here, so we're sure that the syn is counted under the fandom?


@disable_caching
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still required or can it be replaced with a well placed the collection counts have expired?

Scenario: Uncategorized Fandoms should appear in Collection's Fandoms
Given I have the collection "Categoric" with name "categoric"
And the tag "Unrelated Fandom" does not exist
And I have a canonical "Video Games" fandom tag named "Undertale"
When I am logged in as "Frisk"
And I post the work "Categoric Sans" with fandom "Undertale" in the collection "Categoric"
And I post the work "Categoric Papyrus" with fandom "Undertale"
And I post the work "Uncategoric Sans" with fandom "Unrelated Fandom" in the collection "Categoric"
And I post the work "Uncategoric Papyrus" with fandom "Unrelated Fandom"
And I bookmark the work "Uncategoric Sans" with the note "Secret Sans" with the tags "Unrelated Fandom"
And I go to "Categoric" collection's page
Then I should see "Fandoms (2)"
When I follow "Fandoms (2)"
Then I should see "Unrelated Fandom"
And I should see "Undertale"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also check the counts here

When I select "Uncategorized Fandoms" from "media_id"
And I press "Show"
Then I should see "Unrelated Fandom"
And I should not see "Undertale"

# From collection we only see the works in the collection
When I follow "Unrelated Fandom"
Then I should see "Sans"
And I should not see "Papyrus"
And I should not see "Secret Sans"

# Else we see all
When I follow "Uncategorized Fandoms" within "#header"
And I follow "Unrelated Fandom"
Then I should see "Sans"
And I should see "Papyrus"
And I should see "Secret Sans"

Scenario: Browse tags within a collection (or not)
Given I have a collection "Randomness"
And a canonical fandom "Naruto"
Expand Down
Loading