Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4210 +/- ##
==========================================
- Coverage 76.75% 76.61% -0.14%
==========================================
Files 166 167 +1
Lines 48277 48332 +55
Branches 48277 48332 +55
==========================================
- Hits 37053 37030 -23
- Misses 9365 9444 +79
+ Partials 1859 1858 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I'm gonna review today |
ethan-tyler
left a comment
There was a problem hiding this comment.
Love this - added a couple of comments for your consideration.
| """ | ||
| self._table.create_checkpoint() | ||
|
|
||
| def compact_logs(self, starting_version: int, ending_version: int) -> None: |
There was a problem hiding this comment.
Do we need docs for this new pub api?
There was a problem hiding this comment.
We can add that later stage ;) for now its a hidden gem
rtyler
left a comment
There was a problem hiding this comment.
Quite straight-forward and simple. I do think @ethan-tyler has made some points worth addressing.
I don't really understand the point of these log compaction files compared to checkpoints, but that's not an @ion-elgreco question to resolve 😆 Seems like more over complication of the protocol. 🤷 🙈
If these do provide reader benefits, what if we fire-and-forget a task that tries to generate on of these files in the background after a write on version % 25 or something? If we succeed in writing a log compaction file, great, if we don't, who cares 😆
I assume kernel log replay makes use of these if it encounters them. I can do this in a commit hook in a follow up pr |
2d0479f to
1e51705
Compare
1e51705 to
5308c94
Compare
|
Exact matching of our json logs vs this compaction log is not possible because in the compaction log we drop null columns, suggested by kernel, but our own action to bytes serializer does not. Maybe a small optimization we can do in the future is to also drop null columns
|
ethan-tyler
left a comment
There was a problem hiding this comment.
LGTM - Thanks for taking my feedback :)
This will eventually manifest as a bug of some sort since we're doing different things from kernel. |
I see your concern but I don't think this is a correctness bug. Both encodings are semantically equivalent for Delta replay. You should be able to validate this with replay/state equivalence and not byte matching. Worth a follow up to unify serializers across paths, but IMO not a blocker. |
Yeah, it's probably fine, it's more we don't know what we don't know. I'm just saying there is prolly some corner case that'll come up as a bug eventually on this. This still LGTM |
Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com> Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
Co-authored-by: Ethan Urbanski <ethanurbanski@gmail.com> Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com> Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
a0d6d9f to
2e3aac9
Compare
Description
The description of the main changes of your pull request
Related Issue(s)
Documentation