zfs fails to detect ZFS-8000-8A corruption: Reading file causes ZFS-8000-8A, scrub claims OK, repeat #16520

@haraldrudell

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Ubuntu |
| Distribution Version | 22.04.4 LTS jammy |
| Kernel Version | 6.5.0-45-generic |
| Architecture | x86_64 |
| OpenZFS Version | `zfs-2.1.5-1ubuntu6~22.04.4`, `zfs-kmod-2.2.0-0ubuntu1~23.10.3` |

Describe the problem you're observing

  1. Every time a particular file is read, the read fails with an I/O error.
  2. zpool status -xv reports https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A: all error counters are zero, and that file is listed with a permanent error.
  3. Two scrubs back to back, date --rfc-3339=second && zpool scrub -w z2023 && date --rfc-3339=second && zpool scrub -w z2023 && date --rfc-3339=second, complete, and the pool then claims everything is fixed.
  4. Back to step 1.
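
The cycle above can be sketched as a small shell function. This is only an illustration under assumptions: the function name repro_cycle and the RUN variable are made up here (RUN=echo prints the commands instead of executing them), dd stands in for any full read of the file, and a portable date format replaces --rfc-3339.

```shell
# Sketch of the read / status / scrub / status cycle from the steps above.
# Usage (hypothetical): repro_cycle z2023 /mnt/w/2024/Media/filename
# Set RUN=echo for a dry run that only prints the zpool and dd commands.
repro_cycle() {
    pool=$1
    file=$2
    # Step 1: read the file end to end; on the affected file this read fails.
    ${RUN:-} dd if="$file" of=/dev/null bs=1M || echo "read failed: $file"
    # Step 2: show pool problems, including files with permanent errors.
    ${RUN:-} zpool status -xv "$pool"
    # Step 3: two scrubs back to back, each waiting for completion (-w),
    # with a timestamp before each scrub and after the last one.
    for pass in 1 2; do
        date '+%Y-%m-%d %H:%M:%S%z'
        ${RUN:-} zpool scrub -w "$pool"
    done
    date '+%Y-%m-%d %H:%M:%S%z'
    # Step 4: status again; in this report the pool now looks clean
    # until the file is read once more.
    ${RUN:-} zpool status -xv "$pool"
}
```

Running it with RUN=echo first is a cheap way to check the command sequence before starting scrubs on a real pool.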

.

BUG: scrub and zfs should not claim everything is fine when it isn’t
BUG: there is no way to have zfs admit that there is corruption

.

QUESTION: Is zfs-2.1.5 OK when paired with zfs-kmod-2.2.0? The userland and kernel module versions differ.
Fresh installs have matching versions; another host also has the same mismatch.
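
A minimal sketch for spotting this kind of mismatch, assuming version strings shaped like the two in the table above. The function names zfs_major_minor and zfs_versions_match are hypothetical helpers, not part of OpenZFS. The check only detects a difference in major.minor; it does not say whether the pairing is supported.

```shell
# Print "major.minor" for strings such as "zfs-2.1.5-1ubuntu6~22.04.4"
# or "zfs-kmod-2.2.0-0ubuntu1~23.10.3".
zfs_major_minor() {
    v=${1#zfs-}        # drop the "zfs-" prefix
    v=${v#kmod-}       # drop "kmod-" if this is the kernel module string
    major=${v%%.*}     # text before the first dot
    rest=${v#*.}       # text after the first dot
    minor=${rest%%.*}  # text before the next dot
    printf '%s.%s\n' "$major" "$minor"
}

# Succeed only when userland and kernel module agree on major.minor.
zfs_versions_match() {
    [ "$(zfs_major_minor "$1")" = "$(zfs_major_minor "$2")" ]
}
```

On this host the inputs would be zfs-2.1.5-1ubuntu6~22.04.4 and zfs-kmod-2.2.0-0ubuntu1~23.10.3, which yield 2.1 and 2.2, so zfs_versions_match returns non-zero.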

.

  • The disk is a good SSD with little use
  • Reading the disk surface is error-free
  • smartctl has never reported any errors

Describe how to reproduce the problem

cp -avn /mnt/w/2024/Media/filename .
'/mnt/w/2024/Media/filename' -> './filename'
cp: error reading '/mnt/w/2024/Media/filename': Input/output error

The software that wrote this file:
  • first wrote the file, verifying there were no errors
  • then read the file back, verifying there were no errors, and validated the checksum

Meaning: immediately after writing, the file could be read.
The first bookmark event (I/O error) came 6 days later.
Nothing in particular happened to the host or the disk during that time: no reboots or the like.
This particular host has operated this pool for two years.

zfs came up with this error all by itself: it can’t read what it wrote to disk, and
it can’t figure out ahead of time that the data is unreadable.
Hundreds of other files written around the same time read back fine.

syslog:

Sep  9 05:11:58 c68z zed: eid=6871 class=authentication pool='z2023' bookmark=3243:4600:0:450
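
To make the bookmark in that line easier to read, here is a small sketch that splits it into its four fields. The field order objset:object:level:blkid is an assumption based on how the zed syslog zedlet appears to compose the bookmark, and parse_zed_bookmark is a hypothetical helper name.

```shell
# Print the bookmark fields from a zed syslog line, or return 1 if the
# line has no well-formed bookmark=a:b:c:d field.
parse_zed_bookmark() {
    line=$1
    case $line in
        *bookmark=*) ;;
        *) return 1 ;;
    esac
    bm=${line##*bookmark=}  # keep everything after "bookmark="
    bm=${bm%% *}            # cut any fields that follow the bookmark
    old_ifs=$IFS
    IFS=:
    set -- $bm              # split on ":" into positional parameters
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    # Assumed order: objset, object, level, blkid.
    printf 'objset=%s object=%s level=%s blkid=%s\n' "$1" "$2" "$3" "$4"
}
```

Applied to the line above, this prints objset=3243 object=4600 level=0 blkid=450.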

There is no zed logging between September 2, when this file was written, and the event above, nor any syslog entry from the time the file was written.
zfs came up with this issue all by itself; there was no power outage or tripping over cables.

Every time a scrub completes and the I/O error then occurs again, the bookmark log statement is printed.

Include any warning/errors/backtraces from the system logs

Labels: Type: Defect (incorrect behavior, e.g. crash, hang)
