Add default_versions.num_versions column
#10519
An admin could use the following SQL to set the initial values:

WITH filtered AS (
    SELECT crate_id
    FROM default_versions
    WHERE num_versions IS NULL
    ORDER BY crate_id
    -- the limit can be tuned or removed as needed.
    LIMIT 10000
), to_update AS (
    SELECT crate_id, count(*) AS num_versions
    FROM versions
    JOIN filtered USING (crate_id)
    GROUP BY crate_id
)
UPDATE default_versions
SET num_versions = to_update.num_versions
FROM to_update
WHERE default_versions.crate_id = to_update.crate_id;
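Since the query above only touches rows where num_versions is still NULL, it can be rerun until nothing is left. A minimal sketch of that loop follows, using Python's built-in sqlite3 with an in-memory database as a stand-in for the real PostgreSQL database. The helper name backfill_num_versions, the simplified two-column schema, and the subquery form of the UPDATE (used here for portability instead of the CTE plus UPDATE ... FROM) are assumptions for illustration, not the project's actual tooling.

```python
import sqlite3


def backfill_num_versions(conn, batch_size=10000):
    """Fill NULL num_versions in batches; return how many batches changed rows."""
    batches = 0
    while True:
        cur = conn.execute(
            """
            UPDATE default_versions
            SET num_versions = (
                SELECT count(*) FROM versions
                WHERE versions.crate_id = default_versions.crate_id
            )
            WHERE crate_id IN (
                SELECT crate_id FROM default_versions
                WHERE num_versions IS NULL
                ORDER BY crate_id
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:  # no uninitialized rows left
            return batches
        batches += 1


# Hypothetical, simplified schema and data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE versions (id INTEGER PRIMARY KEY, crate_id INTEGER NOT NULL);
    CREATE TABLE default_versions (crate_id INTEGER PRIMARY KEY, num_versions INTEGER);
    INSERT INTO versions (crate_id) VALUES (1), (1), (1), (2), (3), (3);
    INSERT INTO default_versions (crate_id) VALUES (1), (2), (3);
""")

batches = backfill_num_versions(conn, batch_size=2)
counts = dict(conn.execute("SELECT crate_id, num_versions FROM default_versions"))
```

Committing after each batch keeps every transaction short, which is presumably the point of the LIMIT in the original query.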
Title changed: "default_versions.total column" to "default_versions.num_versions column"
I think I'm good with this, but would appreciate a quick glance from @LawnGnome before we move forward :)
LawnGnome
left a comment
None of this is blocking, but here are my thoughts:
Firstly, default_versions is now not as accurate a name for the table. It's probably not worth renaming right now, but computed_versions might be closer to what we're doing.
Secondly, I don't love the extra complexity in the publish endpoint, honestly. I know some of that is transitional, but I get slightly twitchy any time denormalised table updates happen in one place outside of the model layer, because that makes it easy to not do it in unusual circumstances.
For default version calculation, we don't really have a choice (since it's — at best — challenging to do in SQL), but it feels like we could calculate num_versions with a trigger on the versions table, which would get rid of that potential issue.
I'm not saying we must use a trigger, and I wouldn't block the merging of this PR as-is, but @eth3lbert, I'm interested in whether you considered that here?
That's a good point. If we can do this within the database, then we could avoid a lot of the application code complexity. I assume an "insert version => increment num_versions, delete version => decrement num_versions" trigger might be sufficient, with a one-time query to set the initial counts.
Good suggestion! I totally forgot about the trigger function here. Triggers can be a bit tricky sometimes, but we already have a similar one for categories, so I'm happy to use a trigger. Thanks again!
Yeah, this is basically what I initially thought. With this, a default value of 0 seems more suitable!
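The "insert version => increment, delete version => decrement" idea can be sketched as a pair of triggers. The example below is only an illustration: it uses SQLite through Python's sqlite3 module rather than PostgreSQL, so the trigger syntax differs from what a real migration would contain. The trigger names, the simplified schema, and the NOT NULL DEFAULT 0 column are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE versions (id INTEGER PRIMARY KEY, crate_id INTEGER NOT NULL);
    CREATE TABLE default_versions (
        crate_id INTEGER PRIMARY KEY,
        num_versions INTEGER NOT NULL DEFAULT 0
    );

    -- insert version => increment num_versions
    CREATE TRIGGER count_version_insert AFTER INSERT ON versions
    BEGIN
        UPDATE default_versions SET num_versions = num_versions + 1
        WHERE crate_id = NEW.crate_id;
    END;

    -- delete version => decrement num_versions
    CREATE TRIGGER count_version_delete AFTER DELETE ON versions
    BEGIN
        UPDATE default_versions SET num_versions = num_versions - 1
        WHERE crate_id = OLD.crate_id;
    END;

    -- In this sketch the crate's row exists before any version is inserted.
    INSERT INTO default_versions (crate_id) VALUES (1);
""")


def num_versions(crate_id):
    """Read the denormalized counter for one crate."""
    row = conn.execute(
        "SELECT num_versions FROM default_versions WHERE crate_id = ?",
        (crate_id,),
    ).fetchone()
    return row[0]


# Three publishes, then one deletion; no application code touches the counter.
conn.executemany("INSERT INTO versions (crate_id) VALUES (?)", [(1,), (1,), (1,)])
after_inserts = num_versions(1)

conn.execute(
    "DELETE FROM versions WHERE id = (SELECT min(id) FROM versions WHERE crate_id = 1)"
)
after_delete = num_versions(1)
```

The counter stays in sync however a row in versions is added or removed, which is the property the review comment was asking for.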
Do we still need the second commit?
I see this as a last resort, but it should be okay to remove.
Probably fine to drop it, but I guess we should adjust the publish endpoint to insert
I added a test for this to determine if a default value of 0 is sufficient.
Yeah, it looks like we must default to 1 or set it to 1 in the endpoint for the first version. I actually lean towards setting it in the publish endpoint and making the default value
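One possible reading of this finding, sketched below, rests on an assumption that is not confirmed in this thread: that on a crate's first publish the versions row is inserted before the default_versions row exists. In that case an increment trigger has nothing to update, and the row later starts at its default. The example again uses SQLite via Python's sqlite3 as a stand-in for PostgreSQL; the helper publish_two_versions and the schema are hypothetical.

```python
import sqlite3


def publish_two_versions(first_count=None):
    """Publish two versions of a new crate and return the resulting counter.

    first_count=None relies on the column default (0); otherwise the
    "publish endpoint" sets the value explicitly on the first insert.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE versions (id INTEGER PRIMARY KEY, crate_id INTEGER NOT NULL);
        CREATE TABLE default_versions (
            crate_id INTEGER PRIMARY KEY,
            version_id INTEGER NOT NULL REFERENCES versions (id),
            num_versions INTEGER NOT NULL DEFAULT 0
        );
        CREATE TRIGGER count_version_insert AFTER INSERT ON versions
        BEGIN
            UPDATE default_versions SET num_versions = num_versions + 1
            WHERE crate_id = NEW.crate_id;
        END;
    """)

    # First publish: the trigger fires, but no default_versions row exists yet.
    conn.execute("INSERT INTO versions (id, crate_id) VALUES (1, 42)")
    if first_count is None:
        conn.execute(
            "INSERT INTO default_versions (crate_id, version_id) VALUES (42, 1)"
        )
    else:
        conn.execute(
            "INSERT INTO default_versions (crate_id, version_id, num_versions) "
            "VALUES (42, 1, ?)",
            (first_count,),
        )

    # Second publish: now the trigger finds the row and increments it.
    conn.execute("INSERT INTO versions (id, crate_id) VALUES (2, 42)")
    return conn.execute(
        "SELECT num_versions FROM default_versions WHERE crate_id = 42"
    ).fetchone()[0]


with_default_zero = publish_two_versions()
with_explicit_one = publish_two_versions(first_count=1)
```

Under that ordering assumption, relying on a default of 0 leaves the counter one short, while setting 1 on the first insert gives the correct count of 2.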
The only reason If that's the case, then I'd rather set a default, make it
You're absolutely right. This is exactly what I expect for the deployment process. The main reason for exposing APIs in separate PRs is to ensure that the post-migration process is completed before exposing them, as you mentioned.
Yeah, basically! And it also allows the update script to determine the initialization status by checking for null. On the other hand, with a default non-null value (like 1) for existing crates and a new version released before initialization, it would increment the value to 2. This means the update script loses the ability to determine the initialization status using a null value. However, this isn't a major issue. It only requires adjusting the update script to update all records in a range instead of relying on the initialization state. I'm fine with either approach. I can make the change if needed! @Turbo87 Do you also suggest using a non-null default value?
Turbo87
left a comment
Do you also suggest using a non-null default value?
I think I prefer nullable for now
✅ |
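The nullable versus non-null trade-off discussed above can be made concrete with a small simulation. This is a sketch under stated assumptions: SQLite via Python's sqlite3 stands in for PostgreSQL, the trigger does a plain num_versions + 1 (so NULL stays NULL, as in standard SQL arithmetic), and the helper simulate and the crate with three pre-existing versions are made up for illustration.

```python
import sqlite3


def simulate(column_def):
    """Add num_versions with the given definition, publish one version before
    any backfill, and return (stored value, rows the backfill can still find)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(f"""
        CREATE TABLE versions (id INTEGER PRIMARY KEY, crate_id INTEGER NOT NULL);
        CREATE TABLE default_versions (crate_id INTEGER PRIMARY KEY);

        -- Pre-migration data: crate 7 already has three versions.
        INSERT INTO versions (crate_id) VALUES (7), (7), (7);
        INSERT INTO default_versions (crate_id) VALUES (7);

        -- The migration adds the column and the increment trigger.
        ALTER TABLE default_versions ADD COLUMN num_versions {column_def};
        CREATE TRIGGER count_version_insert AFTER INSERT ON versions
        BEGIN
            UPDATE default_versions SET num_versions = num_versions + 1
            WHERE crate_id = NEW.crate_id;
        END;

        -- A fourth version is published before the backfill has run.
        INSERT INTO versions (crate_id) VALUES (7);
    """)
    value = conn.execute(
        "SELECT num_versions FROM default_versions WHERE crate_id = 7"
    ).fetchone()[0]
    pending = conn.execute(
        "SELECT count(*) FROM default_versions WHERE num_versions IS NULL"
    ).fetchone()[0]
    return value, pending


nullable = simulate("INTEGER")
non_null = simulate("INTEGER NOT NULL DEFAULT 1")
```

With the nullable column the value stays NULL and an "IS NULL" backfill still finds the row. With a default of 1 the stored value becomes 2 while the crate really has four versions, and a backfill filtered on NULL finds nothing, so it would have to update all records in a range instead.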
The total number of versions is one of the blockers for avoiding loading full versions within the application. To address this, we can either send a separate request to retrieve the number or, as suggested by @Turbo87, store the value in the database and include it in the crate response. The latter approach seems more promising, as it eliminates the need for an extra request solely to obtain the total.
To ensure the value is up-to-date and maintain deployment convenience, the implementation will be separated into multiple PRs:
num_versions column on the API #10581)