-
Notifications
You must be signed in to change notification settings - Fork 725
Fix line_shift bug in but-hunk-dependency
#11368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
|
Thanks @jonathantanmy2, this is incredibly observant! There are probably many bugs and not enough tests, so this definitely is a step in the right direction. In any case, @mtsgrd might be able to comment on it, maybe showing a path forward as well. On another note, something I think this needs (at some time) is a better way to test it, and of course, a port to new non-legacy data structures. Ideally, this becomes plumbing that doesn't know workspaces at all, which should also improve its testability. |
4d96dfd to
dbe79bd
Compare
line_shift bug in but-hunk-dependencyline_shift bug in but-hunk-dependency
|
This PR now contains the fix. PTAL. See the commit message for a more complete description of the problem and the fix. |
|
I thought I could skip the review if @mtsgrd is going to take a much more proficient look anyway, but I just wanted to welcome our first "Git level" commit message , which I reproduce here for ease of consumption:
Thank you @jonathantanmy2 , this really is something I'd like to copy. And maybe one day GitButler will also be the tool that helps to unearth such messages when people are wondering why it is what it is (-> Git archaeology) |
|
(adding Mattias as a reviewer since he authored the original implementation and may remember certain nuances about the functionality) |
The `line_shift` value is used when combining diff hunks from two or more stacks together, and has an important invariant: at any point where a diff hunk from another stack could be inserted, the cumulative `line_shift` value must be the net lines (lines added less lines removed) of all the diff hunks prior to that point. This is so that the combiner knows how to shift the diff hunks. For example, suppose the combiner needs to combine two stacks; the second stack has a diff that adds a line at line 100. Suppose the combined effect of all diff hunks in the first stack prior to line 100 is a net reduction of 42 lines: the total `line_shift` value of all those diff hunks must thus be -42, so that the combiner knows that the change that the second stack has must be added at line 58 instead of line 100. Point A: Note that this invariant only needs to apply at any point where a diff hunk from another stack could be inserted. If, in a stack, there are two hunks adjacent to each other, no diff hunk may be inserted between them, so as long as their total `line_shift` value is correct, they may have any `line_shift` value they want. (In fact, it is sometimes not possible to determine what each `line_shift` value should be.) This invariant is not met by the current algorithm, so I switched the calculation for something that does. The main principles are: - If applying a hunk causes other hunks to completely disappear, the incoming hunk must bear responsibility for the `line_shift` values of the hunks that disappear by adding their values to itself. - If a hunk splits into two (only possible when applying a hunk in the middle of an existing hunk), the two resulting hunks must split the original `line_shift` value between them. Due to Point A above, the exact proportion does not matter (the two resulting hunks sandwich the hunk that split them, and all three are adjacent to each other), so I have chosen to distribute the `line_shift` value based on their sizes. - If applying a hunk causes another hunk to be reduced in size, but not completely disappear, the exact distribution of `line_shift` values in between these two hunks does not really matter, since they are adjacent (and thus Point A applies). But I have chosen to take some `line_shift` from the reduced-size hunk to give to the incoming hunk, analogous to how a completely disappearing hunk cedes all its `line_shift` to the incoming hunk, to make the `line_shift` values more reasonable (well, reasonable to me, at least). There is some code duplication due to how the diff hunks of an individual stack are combined. I thought of rewriting the combining algorithm before writing this commit (to reduce or eliminate the code duplication needed), but there were some inconsistencies in how zero-line hunks were handled, so I thought it best to correct the `line_shift` issue before making further changes.
This is a WIP. I need to make another pass to make the identifier names consistent, add documentation, and so on, but uploading this first to check if I'm on the right track. Second most noteworthy is the fix in input_hunk_from_unified_diff. `old_start` has a special meaning when `old_lines` is 0 (same for `new_start` and `new_lines`). Most noteworthy are the bug fixes as can be seen in some tests. In crates/but-hunk-dependency/src/ranges/tests/path.rs: In removing_line_updates_range, the file goes from ? 1 1 to (in commit 1) ? a b c to (in commit 2) ? a c It can be readily seen why the existing results are wrong. In shift_is_correct_after_multiple_changes, the file ends up being 1 2 update 3 add line 1 add line 2 add line 4 4 6 update 7 add line 8 9 10 added lines at the bottom Once again, it can be readily seen why the existing results are wrong. DO NOT SUBMIT There are some insta tests that I blindly accepted the review for that I need to look into. There might be some special handling regarding deletion and recreation of files.
5685230 to
2435e41
Compare
|
I continued looking at the code and drastically simplifying it (note the reduction in lines of code, not only in tests but in production code as well), fixing a few bugs in the process. But it's more involved than I thought - there might be some special handling of file deletion and re-creation. If anyone has any pointers regarding those, feel free to let me know - otherwise I'll look at it myself. |
|
OMG! Standing on the sidelines, cheering and clapping! Unfortunately, that's all the support I can provide with this one. |
|
@mtsgrd could you take a look? Regarding the path forward, it may help if someone answers part or all of these questions:
|
Currently, this PR just has a demonstration of a bug. Once I've confirmed my understanding of the situation, I'll update this PR to contain a fix.