-
Notifications
You must be signed in to change notification settings - Fork 671
AO3-6067 Uncategorized fandoms in collections not showing #5514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 23 commits
341c874
905d4f7
7ad262f
05bf0e1
8502d6c
d9c3cbe
a5f3f67
d5c7307
e27e433
d109571
556d1e7
e876c2b
fac8449
9e8fa6f
dd384ef
ce9edfc
8db30ad
e0d1c2a
4ed7df5
ee889fb
c225c80
cd6206f
27c6d96
c37aac8
9a783e2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -70,14 +70,23 @@ def show | |
| # if tag is NOT wrangled, prepare to show works, collections, and bookmarks that are using it | ||
| if [email protected] && [email protected] | ||
| @works = if logged_in? # current_user.is_a?User | ||
| @tag.works.visible_to_registered_user.paginate(page: params[:page]) | ||
| @tag.works.visible_to_registered_user | ||
| elsif logged_in_as_admin? | ||
| @tag.works.visible_to_admin.paginate(page: params[:page]) | ||
| @tag.works.visible_to_admin | ||
| else | ||
| @tag.works.visible_to_all.paginate(page: params[:page]) | ||
| @tag.works.visible_to_all | ||
| end | ||
| @bookmarks = @tag.bookmarks.visible.paginate(page: params[:page]) | ||
| @collections = @tag.collections.paginate(page: params[:page]) | ||
| @bookmarks = @tag.bookmarks.visible | ||
| @collections = @tag.collections | ||
| # if from a collection, only show works from that collection | ||
| if @collection.present? | ||
| @works &= @collection.works | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This does not include works in the subcollections, but the counts on the collection page do include them, so this is a mismatch - I can click on "new fandom (1)" and get to a page with no works. For canonical tags the works from the subcollections are included, so we should do the same here
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, is there a quick way of having a list of collection from a parent collection? (since I assume child of child are also included)
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fortunately, child collections can't have children -- subcollections only go one level deep. |
||
| @bookmarks = [] | ||
| @collections = [] | ||
| end | ||
| @works = @works.paginate(page: params[:page]) | ||
| @bookmarks = @bookmarks.paginate(page: params[:page]) | ||
| @collections = @collections.paginate(page: params[:page]) | ||
| end | ||
| # cache the children, since it's a possibly massive query | ||
| @tag_children = Rails.cache.fetch "views/tags/#{@tag.cache_key}/children" do | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -32,14 +32,14 @@ def bookmarkable_count_for_collection(collection) | |
| def fandom_count_for_collection(collection) | ||
| Rails.cache.fetch(collection_cache_key(collection, :fandom_count), | ||
| collection_cache_options) do | ||
| collection_works_query(collection).field_count(:fandom_ids) | ||
| get_fandom_hash(collection_works_query(collection)).count | ||
| end | ||
| end | ||
|
|
||
| def fandom_ids_for_collection(collection) | ||
| Rails.cache.fetch(collection_cache_key(collection, :fandom_ids), | ||
| collection_cache_options) do | ||
| collection_works_query(collection).field_values(:fandom_ids) | ||
| get_fandom_hash(collection_works_query(collection)) | ||
| end | ||
| end | ||
|
|
||
|
|
@@ -114,6 +114,21 @@ def logged_in | |
| User.current_user ? :logged_in : :logged_out | ||
| end | ||
|
|
||
| # Helper function to get both categorized and uncategorized fandoms from a WorkQuery object | ||
| def get_fandom_hash(query) | ||
| list = [] | ||
| query.search_results.each do |work| | ||
| work.fandoms.each do |fandom| | ||
|
Comment on lines
+120
to
+121
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This has to load all works in the collection and all fandoms of those works. This is not great, especially in the context in which this was implemented to be done via elasticsearch: https://otwarchive.atlassian.net/browse/AO3-6116. Since I'm writing this during maintenance and you can't look it up right now: Yuletide has 42 806 works. So we'd be loading all that at once, and then also all the fandom tags of those works. During the migration of collections to elatiscsearch we tried to optimise a similar query and despite that it was still quite slow: #5234 (comment). This here isn't as bad because it doesn't include bookmarks or the crossover check, but we will also run this code here regularly, so we have higher performance requirements than a one-time task. Well, the above isn't quite correct: As is, So unfortunately I think we can't go with this approach. If we are going to get this information, we should get it directly from elasticsearch, maybe with some computation in Ruby, not with additional large database queries. I believe this issue was marked as blocked by collections in elasticsearch because the original plan there was to include fandom information on the collection, so we could have gotten information via that. However, that plan was changed because it was looking too performance intensive. So now we have to solve this issue here differently. With the existing information we have on works in elasticsearch, the uncategorized fandoms are included in This may work on the fandoms page, since we already have to get the fandoms there to display them, so if we're clever about it we wouldn't introduce another query. But for the count in the sidebar, that doesn't apply. In the big picture, we should put the data into elasticsearch in a format we can use well (so doing the computation when indexing), not do lots of computation afterwards to use it. So I'm leaning towards putting this information into elasticsearch over trying to use
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While imperfect, an idea could be to combine both methods? That is, if a collection has a lot of work, we could go with the previous approach, but if it has less then we could have a more exact count using the tags? Another would be to put the uncategorized fandom_ids in the fandom_id field of the query when they're added so we can go with the previous method Could we add a property to query so that we can fetch only items with an uncategorized tag of some sort? That way we could do two queries (not ideal) then we do one query with fandom_ids for all, then one query that's more involved with the filter_ids when a work has an uncategorized tag then fetch the ones that are uncategorized fandoms.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That's not viable because yuletide is one of the collections where it's especially important that uncategorized fandoms are shown, since it has many of those each year.
We can't modify what is in the field like that, it's used by more code than the counts here, like the facets (the filter sidebar) that expect only canonical tags
Something like that is what I mean with "putting the right information into elasticsearch", yes. If we're adding a new property we should add it in a way that is directly useful here (e.g. all fandom tag ids or something), not something that needs another workaround in the queries. But that has the downsides I noted - we'd need to reindex completely, since the mapping changes, plus put a bunch of brainpower into designing that property. Which brings to the end of my last comment: I don't know what we should be doing here
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see some possibilities for that new propriety: We could add a new property that is all fandom tag ids. This would have the downside of having to parse through the tags to check canonicality, which might be not as useful. Some fandoms have a lot of synonyms so I'm not sure if that's the correct way forward. Plus, this feels like something which would have limited usefulness. I'm unsure about how true that would be, but could having multiple uncategorized properties be useful for tag wrangling? For instance, this could allow someone to get all of the uncategorized tags or fandoms from all work with a specific fandom. I'm unsure if we can have one property that's like a list of list. So like freeform tags, relationship tags, character tags, which I guess could be more generically a list of list with types as keys. This is what feels the more generic and possibly more justifying a huge reindex. I'm not full certain what is allowed as return type of elastic search, but this feels like a potential idea? Finally, I apologize for not having understood what you meant.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We've discussed internally and unfortunately the conclusion is that we don't consider it worth the complications and maintenance cost to add something to the work elasticsearch index only for this usecase. As I suspected above, this Jira issue was created under the assumption that this data would be available in the collections elasticsearch index. Since that was too complicated to implement, this feature here now also stops being viable, since the information isn't already available in elasticsearch and we don't want to add it just for this feature. I will close the Jira issue as well. Apologies that this wasn't discovered, discussed and decided earlier in the process. (Tag wrangling already works by having enough information in the tags elasticsearch index to show unwrangled tags per fandom, meaning tags used on works tagged with that fandom but not yet "connected" to it.) |
||
| if fandom.unwrangled? || fandom.canonical? | ||
| list.push(fandom.id) | ||
ReverM marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| else | ||
| list.push(fandom.merger_id) | ||
| end | ||
| end | ||
| end | ||
| list.tally | ||
| end | ||
|
|
||
| ###################################################################### | ||
| # CACHE OPTIONS | ||
| ###################################################################### | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -91,6 +91,54 @@ Feature: Basic collection navigation | |
| Then I should not see "High School Musical" | ||
| And I should see "Steven's Universe" | ||
|
|
||
| Scenario: Non-Canonical Fandoms are not double counted | ||
| Given I have the collection "Canons" with name "canon" | ||
| And I have a canonical "TV Shows" fandom tag named "TV" | ||
| And a synonym "Television" of the tag "TV" | ||
| When I am logged in as "Screen" | ||
| And I post the work "Full name" with fandom "Television" in the collection "Canons" | ||
| And I post the work "Small name" with fandom "TV" in the collection "Canons" | ||
| And the collection counts have expired | ||
| And I go to "Canons" collection's page | ||
| Then I should see "Fandoms (1)" | ||
| And I should see "Television" | ||
| When I follow "Fandoms (1)" | ||
| Then I should not see "Television" | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you also check TV (2) here, so we're sure that the syn is counted under the fandom? |
||
|
|
||
| @disable_caching | ||
|
||
| Scenario: Uncategorized Fandoms should appear in Collection's Fandoms | ||
| Given I have the collection "Categoric" with name "categoric" | ||
| And the tag "Unrelated Fandom" does not exist | ||
| And I have a canonical "Video Games" fandom tag named "Undertale" | ||
| When I am logged in as "Frisk" | ||
| And I post the work "Categoric Sans" with fandom "Undertale" in the collection "Categoric" | ||
| And I post the work "Categoric Papyrus" with fandom "Undertale" | ||
| And I post the work "Uncategoric Sans" with fandom "Unrelated Fandom" in the collection "Categoric" | ||
| And I post the work "Uncategoric Papyrus" with fandom "Unrelated Fandom" | ||
| And I bookmark the work "Uncategoric Sans" with the note "Secret Sans" with the tags "Unrelated Fandom" | ||
| And I go to "Categoric" collection's page | ||
| Then I should see "Fandoms (2)" | ||
| When I follow "Fandoms (2)" | ||
| Then I should see "Unrelated Fandom" | ||
| And I should see "Undertale" | ||
|
||
| When I select "Uncategorized Fandoms" from "media_id" | ||
| And I press "Show" | ||
| Then I should see "Unrelated Fandom" | ||
| And I should not see "Undertale" | ||
|
|
||
| # From collection we only see the works in the collection | ||
| When I follow "Unrelated Fandom" | ||
| Then I should see "Sans" | ||
| And I should not see "Papyrus" | ||
| And I should not see "Secret Sans" | ||
|
|
||
| # Else we see all | ||
| When I follow "Uncategorized Fandoms" within "#header" | ||
| And I follow "Unrelated Fandom" | ||
| Then I should see "Sans" | ||
| And I should see "Papyrus" | ||
| And I should see "Secret Sans" | ||
|
|
||
| Scenario: Browse tags within a collection (or not) | ||
| Given I have a collection "Randomness" | ||
| And a canonical fandom "Naruto" | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.