Solaris 11.3 and older releases can exhibit a kernel memory leak triggered by a large number of ZFS checksum errors
(Doc ID 2655088.1)
Last updated on JULY 07, 2023
Applies to:
Solaris Operating System - Version 10 3/05 to 11.3 [Release 10.0 to 11.0]
Information in this document applies to any platform.
Symptoms
Two symptoms that should be sufficient to confirm the issue are:
1. The kstatkma.out file collected by the GUDS script will show the zio_cache consuming a large amount of kernel memory.
The example below shows the zio_cache consuming well over 20GB of memory:
--------------------------------------------------------------------------------------------------------
kernel physical memory caches
--------------------------------------------------------------------------------------------------------
cache                                buf       buf       buf                 memory        alloc   alloc
name                                size    in use     total                 in use      succeed    fail
-------------------------------- ------- --------- --------- ---------------------- ------------ -------
zio_cache                            888  26484801  26486406            23519928528   4400523775       0
2. The zpool-status-v.out file collected by the GUDS script will show a large number of ZFS checksum errors logged for one or more zpools.
The example below shows that the pool 'rpool' has accrued 25.6 million checksum errors:
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://support.oracle.com/msg/ZFS-8000-8A
  scan: resilvered 226K in 0h0m with 1952 errors on Wed Mar 18 01:00:53 2020
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0 25.6M
          c3d0      DEGRADED     0     0     0
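
The two checks above can be sketched as a small script run against the collected GUDS files. This is a rough illustration, not an Oracle-supplied tool. The function names and both thresholds are hypothetical, since this document gives no numeric cutoffs. The suffix multipliers are assumed to be decimal, to match the reading of 25.6M as 25.6 million; the tool that prints the counters may scale them differently. In the sample above, buf size multiplied by buf total (888 x 26486406) equals the "memory in use" column, 23519928528 bytes, or about 21.9 GiB.

```python
import re

# Hypothetical thresholds for illustration only; the document states no cutoffs.
ZIO_CACHE_BYTES_LIMIT = 1 * 1024**3   # flag zio_cache above 1 GiB
CKSUM_ERRORS_LIMIT = 1_000_000        # flag counters above one million

# Assumed decimal multipliers (25.6M read as 25.6 million).
SUFFIXES = {"K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}


def zio_cache_bytes(kmastat_text):
    """Return buf size * buf total for the zio_cache row, or None if absent."""
    for line in kmastat_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "zio_cache":
            buf_size, buf_total = int(fields[1]), int(fields[3])
            return buf_size * buf_total
    return None


def parse_count(token):
    """Convert a counter such as '0' or '25.6M' to an approximate number."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", token)
    if not match:
        raise ValueError(f"unrecognised counter: {token!r}")
    value, suffix = match.groups()
    return float(value) * SUFFIXES.get(suffix, 1)


def cksum_counts(zpool_status_text):
    """Map each NAME in the config table(s) to its CKSUM counter."""
    counts = {}
    in_config = False
    for line in zpool_status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if not in_config:
            continue
        if not fields or fields[0].endswith(":"):
            in_config = False      # blank line or "errors:" ends the table
            continue
        if len(fields) >= 5:
            try:
                counts[fields[0]] = parse_count(fields[4])
            except ValueError:
                pass               # not a device row; ignore it
    return counts


def both_symptoms_present(kmastat_text, zpool_status_text):
    """True when zio_cache is large AND some counter shows many checksum errors."""
    cache_bytes = zio_cache_bytes(kmastat_text)
    big_cache = cache_bytes is not None and cache_bytes > ZIO_CACHE_BYTES_LIMIT
    many_errors = any(
        n > CKSUM_ERRORS_LIMIT for n in cksum_counts(zpool_status_text).values()
    )
    return big_cache and many_errors
```

To use it, read the contents of the kstatkma.out and zpool-status-v.out files into strings and pass them to both_symptoms_present(). A True result only indicates that the output resembles the two symptoms described here; it is not a diagnosis.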
Cause