Avoiding Data Loss and Outages Using Solaris ZFS Self Healing and Checksumming Capabilities
(Doc ID 1355155.1)
Last updated on MAY 29, 2018
Applies to: Solaris Operating System - Version 10 10/08 U6 to 11.3 [Release 10.0 to 11.0]
Information in this document applies to any platform.
ZFS uses several techniques, such as copy-on-write and end-to-end checksumming, to keep on-disk data self-consistent and to eliminate silent data corruption. Data is written to a new block on the media before the pointers to the old data are changed and the write is committed. Because the file system is always consistent on disk, time-consuming recovery procedures such as fsck are not required if the system is shut down uncleanly.
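The copy-on-write ordering described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how ZFS is implemented: the class `CowStore`, its methods, and the `crash_before_commit` flag are hypothetical names invented for this example. It shows why an interrupted write leaves the previously committed data intact.

```python
class CowStore:
    """Toy copy-on-write store (hypothetical; not ZFS internals)."""

    def __init__(self):
        self.blocks = {}     # block address -> data
        self.next_addr = 0   # next free block address
        self.root = {}       # committed pointers: name -> block address

    def _allocate(self, data):
        # Always place data in a fresh block; existing blocks are never overwritten.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        return addr

    def write(self, name, data, crash_before_commit=False):
        # Step 1: write the new data to a new block. The old block is untouched.
        new_addr = self._allocate(data)
        if crash_before_commit:
            # Simulated unclean shutdown: the pointer was never switched.
            return
        # Step 2: commit by swapping in a pointer table that references the new block.
        new_root = dict(self.root)
        new_root[name] = new_addr
        self.root = new_root

    def read(self, name):
        return self.blocks[self.root[name]]


store = CowStore()
store.write("file", b"version 1")
store.write("file", b"version 2", crash_before_commit=True)
print(store.read("file"))  # b'version 1'
```

After the simulated crash, the store still returns the last committed version, so no repair pass is needed to reach a consistent state.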
ZFS writes data to the media and continues as soon as it receives an "ok" that the data was written. It does not re-read the data to check that it was written correctly. Instead, when the data is next read, ZFS calculates the checksum and compares it with the stored checksum. If the two do not match, an FMA event is generated. This is an advantage of ZFS: data corruption is detected rather than going unnoticed. To have a chance of correcting such errors, however, ZFS requires redundancy at the ZFS layer (mirror or raidz). For this reason, it is always recommended to configure redundancy at the ZFS level. The storage side may have its own redundancy, but this does not replace redundancy at the file system level, since corruption can happen anywhere on the path from the HBA to the disk.
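The following toy model sketches the verify-on-read behavior and why redundancy at the checksumming layer matters. It is an illustration under stated assumptions, not ZFS code: `Mirror`, `ChecksumError`, and `checksum` are hypothetical names, SHA-256 is used only for convenience, and the checksums are kept in a simple table separate from the data.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when no copy of a block matches its stored checksum."""


def checksum(data):
    return hashlib.sha256(data).digest()


class Mirror:
    """Toy N-way mirror with checksums stored apart from the data (hypothetical)."""

    def __init__(self, sides=2):
        self.disks = [dict() for _ in range(sides)]  # one address->data map per disk
        self.checksums = {}                          # address -> expected checksum

    def write(self, addr, data):
        # Record the checksum and write every side; nothing is read back here.
        self.checksums[addr] = checksum(data)
        for disk in self.disks:
            disk[addr] = data

    def read(self, addr):
        expected = self.checksums[addr]
        bad_sides = []
        for index, disk in enumerate(self.disks):
            data = disk[addr]
            if checksum(data) == expected:
                # Self-heal: rewrite the sides that failed verification.
                for bad in bad_sides:
                    self.disks[bad][addr] = data
                return data
            bad_sides.append(index)
        # Corruption detected, but no good copy exists to repair from.
        raise ChecksumError(f"block {addr}: all {len(self.disks)} copies are corrupt")


mirror = Mirror(sides=2)
mirror.write(0, b"payroll")
mirror.disks[0][0] = b"pbyroll"    # silent corruption on one side
print(mirror.read(0))              # b'payroll'
print(mirror.disks[0][0])          # b'payroll'

single = Mirror(sides=1)
single.write(0, b"payroll")
single.disks[0][0] = b"pbyroll"
try:
    single.read(0)
except ChecksumError as err:
    print("detected but not repairable:", err)
```

With two sides, the bad copy is both detected and repaired from the good one. With a single side, the corruption is still detected, but there is nothing to repair it from.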
ZFS stores all metadata in duplicate or triplicate, so it can often recover from corruption of metadata blocks even on non-redundant ZFS storage. Application data blocks, by contrast, typically exist as a single copy in a non-redundant zpool configuration, so corruption there is not recoverable. It is still detected, however, so you know that something is wrong in your system.
You can, however, increase the ZFS dataset property "copies" to protect a subset of the pool that is considered to be of higher value. This property controls the number of copies of data blocks stored for the dataset. These copies are in addition to any redundancy provided by the pool, for example mirroring or raidz. The copies are stored on different disks, if possible. Changing this property affects only newly written data, so set it at file system creation time by using the "-o copies=" option.
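As a minimal sketch of setting the property at creation time, the helper below only builds the command line and prints it. The function name `zfs_create_argv` and the dataset name `tank/important` are hypothetical, and the check assumes the documented values of 1, 2, or 3 for "copies".

```python
def zfs_create_argv(dataset, copies):
    """Build the argument list for creating a dataset with a given 'copies' value."""
    # Assumption: 'copies' accepts 1, 2, or 3.
    if copies not in (1, 2, 3):
        raise ValueError("copies must be 1, 2, or 3")
    return ["zfs", "create", "-o", f"copies={copies}", dataset]


argv = zfs_create_argv("tank/important", 2)
print(" ".join(argv))  # zfs create -o copies=2 tank/important

# To actually run it on a host with ZFS and sufficient privileges:
#   import subprocess
#   subprocess.run(argv, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if the dataset name is ever taken from user input.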