Track dedupe index file size in org storage stats#3207

Merged
ikreymer merged 13 commits into main from issue-3206-dedupe-index-file-storage-tracking
Mar 4, 2026

tw4l commented Mar 2, 2026

Fixes #3206

Backend changes

  • Adds size of dedupe index files to org storage stats
  • Subtracts the index file's bytes from org storage stats when a dedupe index is deleted, if a file had been saved
  • Adds new stats to org metrics endpoint
  • Adds await that was missing in method for updating index state
  • Updates (and fixes) test for org metrics endpoint
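
The add-on-save, subtract-on-delete accounting described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `OrgStorage`, `DedupeIndex`, the field names, and both functions are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrgStorage:
    """Hypothetical per-org byte counters; field names are illustrative only."""

    bytes_stored: int = 0
    bytes_stored_dedupe_indexes: int = 0


@dataclass
class DedupeIndex:
    """Hypothetical index record; file_size stays None until a file is saved."""

    file_size: Optional[int] = None


def record_index_file(org: OrgStorage, index: DedupeIndex, size: int) -> None:
    """Count a newly saved index file toward the org's storage totals."""
    index.file_size = size
    org.bytes_stored += size
    org.bytes_stored_dedupe_indexes += size


def remove_index(org: OrgStorage, index: DedupeIndex) -> None:
    """Subtract the file's bytes on deletion, but only if a file was saved."""
    if index.file_size is None:
        return  # nothing was ever counted for this index
    org.bytes_stored -= index.file_size
    org.bytes_stored_dedupe_indexes -= index.file_size
    index.file_size = None
```

Guarding on `file_size` keeps the totals from going negative when an index is deleted before any file was uploaded.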

Frontend changes

  • Includes dedupe index files in Misc storage in dashboard storage meter
  • Updates type for org metrics

tw4l requested review from emma-sg and ikreymer March 2, 2026 20:58
 ):
     """update dedupe index stats for specified collection"""
-    self.collections.find_one_and_update(
+    await self.collections.find_one_and_update(
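
For context, here is a minimal sketch of why a missing `await` can silently drop an update. It assumes the method is a plain `async def` coroutine function; a driver that returns an already-scheduled future would behave differently. `FakeCollection` and the helper functions are hypothetical stand-ins, not code from this repository.

```python
import asyncio
from typing import Any, Dict, Optional, Tuple


class FakeCollection:
    """Hypothetical stand-in for an async database collection."""

    def __init__(self) -> None:
        self.doc: Dict[str, Any] = {"indexStats": None}

    async def find_one_and_update(self, update: Dict[str, Any]) -> None:
        # Simulate an async round trip before applying the update.
        await asyncio.sleep(0)
        self.doc.update(update)


async def update_without_await(coll: FakeCollection) -> None:
    # Bug: calling the method only creates a coroutine object; its body never runs.
    coro = coll.find_one_and_update({"indexStats": {"size": 1024}})
    # Closed explicitly here only to avoid the "never awaited" RuntimeWarning.
    coro.close()


async def update_with_await(coll: FakeCollection) -> None:
    # Fix: awaiting the coroutine actually performs the update.
    await coll.find_one_and_update({"indexStats": {"size": 1024}})


async def demo() -> Tuple[Optional[dict], Optional[dict]]:
    broken, fixed = FakeCollection(), FakeCollection()
    await update_without_await(broken)
    await update_with_await(fixed)
    return broken.doc["indexStats"], fixed.doc["indexStats"]
```

Running `asyncio.run(demo())` leaves the first collection's `indexStats` untouched and updates only the second.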
Member

whoa, i'm surprised this was missed before by mypy, and no warnings about it either?

Member Author

I think it's because there's no return to check the type of. mypy doesn't seem to be making sure that async methods are awaited generally, at least with how we set it up.

Member Author

Seems to be a contentious issue over in the mypy repo issues: python/mypy#2499

Member Author

This might require some changes in our codebase (there are lots of places where we await without doing anything with the return value of async methods) but this could help: https://mypy.readthedocs.io/en/stable/error_code_list2.html#check-that-awaitable-return-value-is-used-unused-awaitable
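
As a rough sketch of what opting in could look like, the error code can be enabled for a single module with mypy's inline configuration comment, which makes it possible to try it on one file before changing the project config. The helper names below are hypothetical. As far as the linked documentation describes it, the check targets awaitable values that are discarded without being awaited, so an `await` statement whose result is ignored should still pass.

```python
# mypy: enable-error-code="unused-awaitable"
import asyncio
from typing import Awaitable


async def _store(value: int) -> int:
    await asyncio.sleep(0)
    return value


def save_size(value: int) -> Awaitable[int]:
    """Hypothetical helper: a regular function that returns an awaitable."""
    return _store(value)


async def main() -> int:
    # With the error code enabled, mypy is expected to reject a bare call like
    #     save_size(1)
    # because the awaitable it returns is thrown away.
    # Awaiting the call is accepted whether or not the result is used:
    await save_size(1)
    return await save_size(2)
```

Note that a call into an untyped library is seen by mypy as `Any`, so this check would not catch a missing `await` there.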

Member

Hm, this may have more unintended consequences. Does adding a return type help generally?

tw4l force-pushed the issue-3206-dedupe-index-file-storage-tracking branch from 793048b to 0e18f0b March 4, 2026 16:14

tw4l commented Mar 4, 2026

Had to resolve some merge conflicts rebasing on main but should be set for re-review now

tw4l requested a review from ikreymer March 4, 2026 16:50

ikreymer left a comment

Looks good! Thanks for additional type fixes as well!

ikreymer merged commit b7a1106 into main Mar 4, 2026
29 checks passed
ikreymer deleted the issue-3206-dedupe-index-file-storage-tracking branch March 4, 2026 22:21

Development

Successfully merging this pull request may close these issues.

[Task]: Track storage of uploaded dedupe index files in org storage

2 participants