I/O errors on zvol, but no errors reported by zpool scrub #15720
-
I am currently destroying all the datasets/zvols affected by this issue on the backup machine. However, I'll leave at least one intact for debugging.
-
You should probably read #12014 before destroying everything in a fire. Assuming you're using native encryption, of course.
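For what it's worth, a quick way to confirm whether native encryption is in play at all (a minimal sketch; `backuppool` stands in for the actual pool name):

```sh
# "off" for every dataset/zvol means the native-encryption issues
# discussed in #12014 can be ruled out.
zfs get -r -t filesystem,volume encryption backuppool
```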
-
Thanks, @rincebrain, for this zdb test/investigation. I decided to use a dataset which I can share. If sharing the output of … Also, somewhere in the past I've changed … Output of …

Contents of …

… with … Output of … Finding out the offset… I assume … From part of …

At the end of …

So, the L0 DVA in this case will be …
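For context, the zdb walk being described is roughly the following (a sketch only; the pool name, dataset path, and DVA are placeholders - a zvol's data lives in object 1 of its dataset):

```sh
# Dump the block-pointer tree (indirect and L0 blocks) of the zvol's
# data object; each L0 entry lists its DVA as vdev:offset.
zdb -dddddd backuppool/vms/example/root 1

# Read one block back by DVA (vdev:offset:psize, in hex) and let zdb
# decompress it (the trailing :d flag), assuming the block is compressed.
zdb -R backuppool 0:1a2b3c0000:20000:d
```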
-
Here is the link to the output of …
-
OK, one less thing to suspect.

Yes, there are 63 such zvols presently. I'm in the process of manually `zfs send` / `zfs recv`-ing some of them, for testing.
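The per-zvol test looks roughly like this (a sketch; `backuppool`, the dataset names, and the `prod` host are placeholders, and it assumes the same snapshot exists on both machines):

```sh
# On the backup machine: clone one affected snapshot and hash the zvol contents.
zfs clone backuppool/vms/example/root@snap backuppool/testclone
dd if=/dev/zvol/backuppool/testclone bs=1M status=none | sha256sum

# Send the same snapshot to the production machine and read it there;
# an I/O error from dd (or a differing hash) reproduces the problem.
zfs send -Lec backuppool/vms/example/root@snap | ssh prod zfs recv belt/testrecv
ssh prod 'dd if=/dev/zvol/belt/testrecv bs=1M status=none | sha256sum'
```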
-
Thanks, I understand that this is exceptionally weird. I really appreciate your help.
-
Another problem caused by …
-
System information
Describe the problem you're observing
I have a backup machine, to which backups are sent using the `zrepl` tool. Recently, due to a power failure (which also disabled the UPS - it was a "weird" power failure that forced the UPS to shut down), I was faced with the need to restore some ZFS ZVOLs from backup. I used `zfs send | zfs recv` to accomplish this goal. (The backup machine was connected to the same UPS and was also shut down abruptly.)

`zfs send -Lec | zfs recv ...` completed without issue; however, the result on the production machine was a ZVOL which exhibited I/O errors when trying to `dd` it (it was also impossible to `fsck` it, again with I/O errors). `zpool status -v` on the production machine reported the errors correctly:

The names under `errors:` are lost, because I deleted the affected ZVOLs - earlier it was showing `belt/vms/vdi-proxy/root`, which is the path to the ZVOL being restored.

Weirdly, when trying `dd` on the backup machine, using the same snapshots, and even after `zfs clone`-ing them to ZVOLs and `dd`-ing those to `/dev/null`, no errors were reported by `zpool status -v`.
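The check that surfaces the errors on the production machine is roughly the following (a sketch; the dataset name is taken from the example above):

```sh
# Read the restored zvol end to end; on the production machine this fails
# with I/O errors, while the same read on the backup machine is clean.
dd if=/dev/zvol/belt/vms/vdi-proxy/root of=/dev/null bs=1M status=progress

# Then inspect what ZFS recorded about the failed reads.
zpool status -v belt
zpool events -v
```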
Describe how to reproduce the problem
I do not know how to reproduce this specific state, since apparently it took a long time to occur without any error messages being reported. The last "intact" snapshot (without I/O errors) is from 2023-11-08.
Include any warning/errors/backtraces from the system logs
Output of `zpool events -v` on the production machine (restore target) after `dd` from the broken ZVOL:

Output from `dmesg` after `dd`-ing the affected ZVOL:

Output of `dmesg` during `e2fsck` on the affected ZVOL:

`e2fsck` failed due to write I/O errors.

Changed ZFS kernel params (`modprobe.d/zfs.conf`):

Questions - most important
- `zfs destroy`-ing affected ZVOLs?

Possible causes

- `zrepl`? However, `zrepl` uses ZFS shell utils.
- `fstrim -a` in all VMs, which makes all ZVOLs receive TRIM/DISCARD/UNMAP daily (see the sketch below)
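To confirm the daily TRIMs actually reach the zvols, something along these lines could be used (a sketch under the assumption that the VMs run under libvirt/QEMU; the dataset and VM names are placeholders):

```sh
# Space accounting on the zvol should drop after an in-guest fstrim
# if discard requests really get passed through to ZFS.
zfs get -p used,referenced,logicalreferenced belt/vms/example/root

# For libvirt/QEMU guests, passthrough additionally requires
# discard='unmap' on the virtual disk definition.
virsh dumpxml example-vm | grep -i discard
```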
Non-causes

- RAM (checked with `memtest86+`)
- `zpool scrub` on the backup machine
- `sync=never` - not used

Goals of this report
Non-goals of this report
Questions and doubts
- Why didn't `zpool scrub` on the backup machine detect those errors?

EDIT1: