ZFS CKSUM and smart UDMA_CRC_Error_Count relation #14369
Unanswered
FFourtyTwo
asked this question in
Q&A
-
If you have checksum errors and non-ECC RAM, I'd recommend running memtest before possible memory corruption eats your pool. SATA should indeed recover from corruption on the link. The same goes for PCIe. The reallocated sector count should not be directly related, since HDD firmware should either handle it transparently or explicitly report errors, not just set something in SMART. But memory corruption may happen at any point, corrupting data, checksums, or even the code processing them. There are reasons why all TrueNAS systems ship with ECC RAM.
-
Hello!
I have 4x WD DC HC320 in a stripe-of-mirrors config, 64GB of non-ECC RAM, a B450-based MB and a Ryzen 2700X CPU.
I just replaced my old 4x2TB HGST disks with new 4x8TB WD disks and found that I had a small number of CKSUM errors while the mirrors resilvered. No data was lost and no problems occurred in general, but I started to google it. After reading the whole internet I understand the possible problems, but I'm still interested in whether CKSUM is related to UDMA_CRC_Error_Count from smartctl.
Most users mention that if you have CKSUM errors and no reallocated sectors, you should check cabling, RAM and clock speeds first. My experience tells me the same, no questions here. But some users also suggest checking UDMA_CRC_Error_Count: if it shows no errors, the cabling is OK and should not be under suspicion. That seems legit, because UDMA (as I understand it, as a non-expert) provides reliable data transmission over the bus and the SATA cable. ZFS, on the other hand (again, as I understand it), compares checksums after the data has been transferred from the HDD to RAM or so. In general we can say that UDMA checks integrity between HDD<>SATA controller, and ZFS checks integrity between HDD<>RAM after the data has gone the whole distance.
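To make the two checkpoints concrete, here is a toy model of one read. It is only a sketch under my own assumptions, not how the SATA link layer or ZFS are actually implemented: `zlib.crc32` stands in for the link CRC, SHA-256 stands in for the ZFS block checksum, a failed link CRC is assumed to be counted and followed by a clean retransmission, and `flip_bit` and `read_block` are hypothetical names.

```python
import hashlib
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def read_block(block, stored_checksum, corrupt_on_cable=False, corrupt_in_ram=False):
    """Simulate one read; return (link_crc_errors, end_to_end_checksum_ok)."""
    link_crc_errors = 0

    # Drive side: the payload is sent together with a link-level CRC.
    frame_crc = zlib.crc32(block)

    # Hop 1: the cable. A bit flip here is visible to the link CRC.
    received = flip_bit(block, 3) if corrupt_on_cable else block
    if zlib.crc32(received) != frame_crc:
        link_crc_errors += 1   # what a CRC error counter would record
        received = block       # assumption: the retransmission arrives clean

    # Hop 2: controller to RAM. The link CRC no longer covers the data here.
    in_ram = flip_bit(received, 11) if corrupt_in_ram else received

    # End-to-end check against the checksum stored when the block was written.
    checksum_ok = hashlib.sha256(in_ram).digest() == stored_checksum
    return link_crc_errors, checksum_ok


block = b"zfs block payload" * 8
stored = hashlib.sha256(block).digest()

print(read_block(block, stored))                         # (0, True)
print(read_block(block, stored, corrupt_on_cable=True))  # (1, True)
print(read_block(block, stored, corrupt_in_ram=True))    # (0, False)
```

In this model the last case is the interesting one: the link counter stays at zero while the end-to-end checksum fails, so a clean link counter says nothing about what happened after the controller.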
Summing this up, we can say that:
Am I right in this? Thanks!