-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add a kill switch for SEEK_HOLE/SEEK_DATA #16000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Maybe have |
The problem, and reason I put it here, is that I don't believe FreeBSD has an equivalent - or at a minimum there's no code currently calling it. Specifically, in FreeBSD, I believe the codepath looks like (I see references to Linux's generic handlers, but they appear to all be in code that's shared with Linux in various places.) I could be wrong, and I'll go look once I'm either confident this doesn't break in the form I have locally which is one more commit ahead of this, or that this approach can't be correct. |
Should |
No. If you read #15933 you'll see this still happens even if you have that set. |
Per openzfs#15933, there's currently an inconsistency in behavior in our special-case SEEK_HOLE behavior. So let's not use it for now. Signed-off-by: Rich Ercolani <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
I just shoved a Linux-specific tunable that kills the non- I also included a separate commit for disabling BRT by default, because I am of the opinion that we shouldn't leave git master in a known losing data state by default, and it seems likely to take more than a little bit of investigation to figure out the correct way to resolve this for both cases. |
I missed it, thanks. |
Good news: I can't reproduce any failures after with Also the PR as-written looks correct to me, though I might suggest we have another think about disabling at head. Maybe a good compromise is to both:
This way bleeding edge testers can test things we think work and continue to exercise #15571 and the rest of bclone while these problems are sorted AND we can protect normal release users and distros from future unknown problems with both features. But I'll leave it at that and let you and the other maintainers decide. FYI- these ZTS functional tests are newly failing for me with your patch (I ran functional tests in linux.run and common.run; I don't really know how ZTS is used for ZFS tests/releases so hope this helps while you wait for CI runs):
(cp_files_002 requires bclone, mmap_seek_001 is testing SEEK_HOLE/SEEK_DATA for mmap) |
Yes, I expected those to fail, I was figuring I'd clean it up after agreeing on what to do. And my general thinking on this, is to have a simple thing that is safe and ensures correct behavior, get that merged, and then refine it with more precision, because we've had a lot of cases of "I'm sure this'll work fine" optimizations being wrong, so I'd like to start with "this absolutely cannot fail", and then refine and possibly revert the refinements if they prove to be incorrect. In part on the basis that the friction of getting a change in can be very high, so I'd like to be able to say "run this version" rather than "apply this patch to head". |
Sorry, I meant only those tests failed. The rest passed which is a good sign. SG otherwise. I support merging this PR and hope to see PRs re-enabling incrementally at head once the root cause(s) are fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand now it is for experiments, but lets not merge it this way by mistake. It should not change default, we are not that desperate now. It should be cross-platform, at least include FreeBSD. Plus one more cosmetics:
static loff_t | ||
zpl_llseek(struct file *filp, loff_t offset, int whence) | ||
{ | ||
if (!(zfs_disable_holey)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To not mess with formatting his could be moved into if
below. And lets remove extra parentheses.
None of it was a mistake, Alexander.
Since having such a kill switch doesn't resolve this here, I changed the PR
to a minimum case to stop losing people's data.
FreeBSD hasn't managed to reproduce either of those bugs so far, and the
originally intended kill switch doesn't help with this bug, so I left it be.
…On Mon, Mar 18, 2024, 3:38 PM Alexander Motin ***@***.***> wrote:
***@***.**** requested changes on this pull request.
I understand now it is for experiments, but lets not merge it this way by
mistake. It should not change default, we are not that desperate now. It
should be cross-platform, at least include FreeBSD. Plus one more cosmetics:
------------------------------
In module/os/linux/zfs/zpl_file.c
<#16000 (comment)>:
> @@ -550,6 +552,7 @@ zpl_direct_IO(int rw, struct kiocb *kiocb, struct iov_iter *iter, loff_t pos)
static loff_t
zpl_llseek(struct file *filp, loff_t offset, int whence)
{
+if (!(zfs_disable_holey)) {
To not mess with formatting his could be moved into if below. And lets
remove extra parentheses.
—
Reply to this email directly, view it on GitHub
<#16000 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABUI7J2P3CFXXWKJJBURKLYY47CJAVCNFSM6AAAAABEZN5WKGVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNBUGAYTCNRWGE>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
@rincebrain You are proposing to disable a big chunk of functionality in master branch being there for many years without any understanding of the problem and killing any hope or motivation to fix it. This is NOT the way! Lets work on something more constructive! |
I'm opening a PR that stops two rare data loss edge cases, given that we have a history of them in this functionality, with the intent we can iterate on a safe baseline. |
It not just stops two DAMN RARE data loss edge cases, but also disables significant chunks of functionality. And you are doing it on master branch, where development is going on any way, and that is not used in any Linux distro AFAIK. Considering Brian haven't merged anything in few weeks (I don't know where he is), I am not certain any fix can be merged nearest days. And if we have few days/week, I would prefer something more creative rather than destructive. Speaking about two cases, I don't think anybody argue that we should flush mapped pages before block cloning. That is the patch (or some of its variation) I'd like to see merged ASAP. After that the only question I see is what the hell is wrong with zn_flush_cached_data() on Linux? Why does it not flush everything, or what else is wrong with mmap? That is where we should focus instead of this discussion. |
If you are eager to disable something, you can do it specifically in case of zn_has_cached_data() == B_TRUE and Linux. It should specifically limit the scope to mmap(), where we suspect the problem. |
It's behind a tunable, you can just adjust it. I too would like to see changes for that. But I'd like to have a start that isn't breaking anything. If Brian is just gone, and we have no prospect of merging anything, then we may have nicer patches available by the time he's back. But I'd like to have a solid mitigation to start, which people can turn off if they wish, and then attempt more refined approaches from there, once it's no longer an acute issue. And Arch and Gentoo ship git packages, off the top of my head, though since this also breaks on pre-2.2, that needs a solution too. (I thought Nix did too but I don't immediately see a listing.) |
Lets close this in context of #16019 . |
We've had a number of varyingly rare races with trying to cheat and not copy certain file regions from userland come up.
Let's add a kill switch which should hopefully allow people to just hit it and keep running without those optimizations rather than hoping that we've caught them all This Time.
Motivation and Context
#15994 #15933 #15526 #11900 #14753
Description
I'd like a safe option while we debug things further in the future, rather than having to push out a release saying "we think this fixes it but since it's a rare edge case it's hard to be certain".
How Has This Been Tested?
Again, isn't that what CI is for?
(Really, I'm expecting to need to go modify the tests to set the tunable off to pass, but I wanted to see if anything else breaks first.)
Types of changes
Checklist:
Signed-off-by
.