Inactive hot spare checking/failing #12248
AeonJJohnson started this conversation in Ideas
Hey friends, before I go off into the weeds and try to create something, I wanted to check with the group to see whether I'm missing something that already exists.
I'm seeing field (hardware) problems where inactive hot spare drives die of boredom. It's the vendor's issue to unscrew, but it has exposed a need to check or test inactive hot spares to make sure they are in good condition and ready to go in the event of a data drive failure. I can't write to the drive (dd, badblocks) because of the ZFS label on it, so I have to find a test method within the ZFS construct.
Just checking the label with zdb -l seems a bit weak as a test. Even better would be to exercise the inactive hot spare a bit.
Running a dd read of the inactive hot spare would work, but a read kicked off from a cronjob or similar can't be interrupted by ZFS operations, and I don't like operating on drives outside of the ZFS construct.
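For reference, the dd-style idea could be sketched in Python roughly as below. This is only an illustration of a read-only pass that yields when the pool is busy, and it still runs outside of ZFS, so ZFS never learns about any errors it hits. The names `exercise_read` and `pool_is_busy` are made up for this sketch, and the "in progress" strings matched in the `zpool status` output are an assumption about its wording.

```python
import os
import subprocess
import time
from typing import Callable, Optional


def pool_is_busy(pool: str) -> bool:
    """Hypothetical helper: True if `zpool status` reports a running scrub/resilver.

    Assumption: the scan line contains "scrub in progress" or "resilver in progress".
    """
    out = subprocess.run(["zpool", "status", pool],
                         capture_output=True, text=True, check=True).stdout
    return "scrub in progress" in out or "resilver in progress" in out


def exercise_read(path: str,
                  chunk_size: int = 1 << 20,
                  should_pause: Callable[[], bool] = lambda: False,
                  pause_seconds: float = 5.0,
                  max_bytes: Optional[int] = None) -> int:
    """Read `path` sequentially (never writes) and return the bytes read.

    Before every chunk, `should_pause` is polled; while it returns True the
    read backs off. Any I/O error surfaces as OSError to the caller.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while max_bytes is None or total < max_bytes:
            while should_pause():
                time.sleep(pause_seconds)
            want = chunk_size if max_bytes is None else min(chunk_size, max_bytes - total)
            data = os.read(fd, want)
            if not data:  # end of device/file
                break
            total += len(data)
    finally:
        os.close(fd)
    return total
```

Usage would be something like `exercise_read("/dev/disk/by-id/<spare>", should_pause=lambda: pool_is_busy("tank"))`, where the device path and pool name are placeholders. These are ordinary buffered reads, so some of them may be served from the page cache rather than the drive.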
Does anyone know of anything that already exists, before I go and make something up? Like a scrub function for hot spares or something?
I want to use something other than SMART tests as I think the interface needs activity as much as the media does.
Plus, it should be a test where, if a failure occurs, ZFS fails the hot spare and shows the failed status in zpool status.
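On the reporting side, whatever ends up doing the test, a monitoring script can already pull the spare states out of `zpool status`. A minimal sketch, assuming the usual layout where a `spares` heading is followed by one `name STATE ...` line per spare and the section ends at a blank line; `parse_spares` and `spare_states` are hypothetical names, not part of any ZFS tooling:

```python
import subprocess
from typing import Dict


def parse_spares(status_text: str) -> Dict[str, str]:
    """Return {spare_name: state} from `zpool status` text.

    Assumption: the section starts at a line containing only "spares" and
    each following line is "<name> <STATE> [notes...]" until a blank line.
    """
    spares: Dict[str, str] = {}
    in_spares = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_spares:
            in_spares = fields == ["spares"]
            continue
        # A blank line, a bare heading, or "errors:" ends the section.
        if len(fields) < 2 or fields[0].endswith(":"):
            break
        spares[fields[0]] = fields[1]
    return spares


def spare_states(pool: str) -> Dict[str, str]:
    """Run `zpool status <pool>` and parse its spares section."""
    out = subprocess.run(["zpool", "status", pool],
                         capture_output=True, text=True, check=True).stdout
    return parse_spares(out)
```

Anything other than `AVAIL` or `INUSE` for a spare would be the cue to alert. This only reads what ZFS already knows; it does not make ZFS test the spare.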