Skip to content

Refactors reading status Shelf and ReadThrough logic#2170

Draft
mouse-reeve wants to merge 5 commits intomainfrom
readthroughs
Draft

Refactors reading status Shelf and ReadThrough logic#2170
mouse-reeve wants to merge 5 commits intomainfrom
readthroughs

Conversation

@mouse-reeve
Copy link
Copy Markdown
Member

@mouse-reeve mouse-reeve commented Jul 3, 2022

Overview

The overlap between shelves and read-throughs has been a major source of bugs and confusion. I'm trying to figure out how to improve the situation without time travel to bestow my earlier self with the gift of retrospect, so these are my thoughts on moving forward. Various people, mostly notably as I recall, @arkhi, have been trying to tell me this for ages so I apologize for not hearing that earlier!

The functional shelves (to-read, currently reading, read, and stopped reading) should go away completely. The remaining, user-created shelves are left in a somewhat ambiguous state, especially given lists, but let's get back to that. Instead of shelves being involved at all, ReadThroughs should be the gold standard. Each ReadThrough should have a status (to-read, currently-reading, read, or stopped-reading), and whatever dates the user has added. There are some logical constraints: a "to-read" ReadThrough cannot have started or finished dates, and a "currently-reading" ReadThrough cannot have finished dates. But a finished ReadThrough does not need to have any dates associated with it. The finished date and stopped reading date can be the same because a ReadThrough can either be finished or stopped, not both.

I think the user interface itself should remain mostly the same. I'm not sure how the ActivityPub serialization needs to change, but it will -- it may be that ReadThroughs will need to be a completely new object, which is a downside in terms of backwards compatibility, but we're also a pretty small and up-to-date network so I think that's a worthwhile cost if it's the best approach longer-term.

This is much simpler and easier to understand. It removes a lot of janky code, like initializing shelves for all new users. But it's a big migration with the potential for data loss so I want to get thorough code review and testing before completing it.

Open Questions

  • Is ReadThrough the best name for this model? Not the most pressing issue but if it's going to change, now would be the time.
  • What should the ActivityPub serialization look like for ReadThrough objects
  • What is the fate of user-created shelves
  • How does reathrough privacy work now? Should ReadThrough objects have individual privacy settings (which makes sense from an ActivityPub perspective)? Or should there be shelf settings somewhere that governs every object (a much weirder solution from a code perspective)?

Steps

(Still working on this section)

  • Models
    • Update the ReadThrough model the better reflect what it needs to store
    • Migrate existing ReadThroughs to the new format
    • Migrate readthrough data that is stored implicitly in Shelfs into ReadThroughs
    • Test migrations
    • Remove functional shelves and related logic
    • Get database constraints working and migrated in safely
    • Figure out readthrough privacy
  • Views
    • Update flow of changing your reading status on a book
    • Update "shelves" view
    • Remove logic for figuring out "active shelf" and "latest readthrough"
  • ActivityPub
    • Support ReadThrough updates

Fixes #1737
Fixes #1754
Fixes #2058
Fixes #2075
Fixes #2161
Fixes #2165

Each readthrough should represent a unique instance of reading a book,
and so it should have a single status that represents that status of
that readthrough. This will be used to place the book on the appropriate
"shelves", and should allow for there to be a single active readthrough
if a book is in progress. That maeks the `is_active` field unnecessary.

Since the status of the readthrough is unambiguous, defining "finished
date" and "stopped date" separately is no longer necessary.

I've sketched out some database constraints but haven't implemented them
yet, because the conflicts might get gnarly and let's take this one step
at a time.

The migration logic is complex, so this is an untested draft and I'll
write unit tests for it in a subsequent commit.
@franbarra
Copy link
Copy Markdown

franbarra commented Jul 3, 2022

Sorry if I'm polluting the conversation, but I'm trying to wrap my head around the model since I would like to start contributing code in the future.

For example, in #2058 the user complained that a book couldn't be in "To Read" and "Read" at the same time.
With this proposed model, only using ReadThroughs, a Book can hold a reference to more than one ReadThrough at a time?

  1. If it does, it would solve When rereading a book, it no longer shows up in the "Read" shelf (but it should) #2058's problem by allowing a user to see many ReadThroughs of the same book, correct? The user would be able to consult, let's say by clicking "Detailed ReadThroughs" in a particular book, all the times they've read the book and the dates associated with those times?
  2. On the other hand, if a book can't hold more than one reference to ReadThroughs, then how could When rereading a book, it no longer shows up in the "Read" shelf (but it should) #2058 be solved? Should we just say: "the app can only maintain one reference to a ReadThrough so only your most recent ReadThrough is recorded? Again, I'm just thinking how ReadThrough would work in practice.

On another note, if we take away the field stopped_date from the ReadThrough model, then, how can one distinguish between a book that has been read and one that has been stopped? It looks like based on #2170 that these two cases would like exactly the same: a book that has been read would have a start/stop date, just like stopped_reading would, without an is_active field.

I apologize in advance if I sound aggresive or haven't expressed myself properly. Look forward to hearing some thoughts on these issues and clear any misunderstandings I may have.

@mouse-reeve
Copy link
Copy Markdown
Member Author

Thank you for reading and asking questions! You don't come across as rude or aggressive at all, I really appreciate the conversation. One thing that may not be obvious from how I wrote the proposal is that ReadThrough is already an existing database model that is very much in use currently, in conjunction with the Shelf model. This refactor will tweak it in order to make it the single source of truth for reading history.

For example, in #2058 the user complained that a book couldn't be in "To Read" and "Read" at the same time. With this proposed model, only using ReadThroughs, a Book can hold a reference to more than one ReadThrough at a time?

Yes - a book can have an arbitrary number of ReadThroughs, but only one can be an active/currently reading status. For example, if I have read a book 3 times, am currently reading it, and want to read it again, that could be represented by 5 unique ReadThroughs.

  1. If it does, it would solve When rereading a book, it no longer shows up in the "Read" shelf (but it should) #2058's problem by allowing a user to see many ReadThroughs of the same book, correct? The user would be able to consult, let's say by clicking "Detailed ReadThroughs" in a particular book, all the times they've read the book and the dates associated with those times?

Yes, this is one of the issues this refactor will address. The book would show up on both their "Read" and "Currently Reading" shelves in the UI. Since recording reading dates with ReadThrough is already supported, you can currently see a history of read dates for a book on the book page. For example, I've read Piranesi twice, and I see those dates when I visit that page:
Screen Shot 2022-07-03 at 2 09 03 PM

On another note, if we take away the field stopped_date from the ReadThrough model, then, how can one distinguish between a book that has been read and one that has been stopped? It looks like based on #2170 that these two cases would like exactly the same: a book that has been read would have a start/stop date, just like stopped_reading would, without an is_active field.

Good question - each ReadThrough has a read_status which indicates whether it is stopped or finished. The possible values for read_status are to-read, reading, read, and stopped-reading. A ReadThrough's read_status can progress from "want to read" to "finished" or "stopped", but wouldn't, for example, change from finished to stopped (which wouldn't make sense).

The one weirdness there is how it should behave if you resume a book after marking it as stopped. The simplest approach would be to create a new ReadThrough when the the book was resumed, and I expect that would more or less work for people (this is how it works currently). A more nuanced/complex solution would be to include a resumed date, but I don't think there's quite enough value in that to justify the complexity.

Does that answer your questions?

@franbarra
Copy link
Copy Markdown

Thank you for the thorough answer! It really did answer my doubts about how the model worked.

Regarding your last paragraph, I wager that most people (based on my experience) use the "Stopped Reading" option as an explicit way of marking that they are no longer interested in continuing to read that particular book. (I doubt that someone would annotate the intervals of time where they started/stopped reading a particular book as a way of keeping track of their ReadThrough time, but the possibility exists).
Therefore, at least in my opinion, going the more complex route of adding a resumed_date would probably imply a lot of effort for minimal returns. Creating a new ReadThrough in case someone does resume the book sounds like the best approach.

(PD: Piranesi does look really interesting, thanks for the book recommendation I guess).

@arkhi
Copy link
Copy Markdown
Contributor

arkhi commented Jul 7, 2022

  • Is ReadThrough the best name for this model? Not the most pressing issue but if it's going to change, now would be the time.

I wasn’t sure of what ReadThroughs would imply but the following conversation really helped getting rid of dark areas. in my understanding of it. It might be a good thing to document it properly for non‑native speakers.

  • What is the fate of user-created shelves

From my perspective, they always have been special lists. ;)

  • How does reathrough privacy work now? Should ReadThrough objects have individual privacy settings (which makes sense from an ActivityPub perspective)? Or should there be shelf settings somewhere that governs every object (a much weirder solution from a code perspective)?

Consistency calls for individual privacy settings (extensively used or not). Maybe those settings can be set globally in the user settings, applied by default, and changed occasionally if needed through the “More options” menu?

@arkhi
Copy link
Copy Markdown
Contributor

arkhi commented Jul 7, 2022

I’m glad to see this discussion happening again, obviously. Thank you. :)

@hughrun
Copy link
Copy Markdown
Member

hughrun commented Jul 8, 2022

Thanks for this @mouse-reeve - I think this is definitely the right move.

Is ReadThrough the best name for this model? Not the most pressing issue but if it's going to change, now would be the time.

Seems as good as any, and it is already used throughout docs etc. Alternatively, we might want to call it ReadingActivity

What should the ActivityPub serialization look like for ReadThrough objects

I'm not exactly sure what this question is asking, but it makes most sense to me that a readthrough would be an ActivityPub Event which is Announced.

What is the fate of user-created shelves

Having somewhere to see your read, currently reading, completed, and stopped books is useful. But it's confusing for them to be displayed as "shelves". Perhaps a new tab called "Reading Activity"?

How does reathrough privacy work now? Should ReadThrough objects have individual privacy settings (which makes sense from an ActivityPub perspective)? Or should there be shelf settings somewhere that governs every object (a much weirder solution from a code perspective)?

Readthrough privacy on a individual readthrough level is definitely the way to go. I'm thinking of the equivalent in a public library context - a user may be quite happy for the world to know about most of their reading history but maybe not that one book on [insert embarrassing-to-the-reader title here]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment