Oracle HSM / SAM-QFS - Server Becomes Non-Responsive When Reading Large File Via Buffered I/O

(Doc ID 2033628.1)

Last updated on JANUARY 03, 2018

Applies to:

Oracle Hierarchical Storage Manager (HSM) and StorageTek QFS Software - Version 5.4 to 5.4 [Release 5.0]
Oracle Solaris on x86-64 (64-bit)


After upgrading the metadata server (MDS) machine from Solaris 11.1 with SAM-QFS 5.3 to Solaris 11.2 with SAM-QFS 5.4.9, the entire system becomes extremely unresponsive whenever a large file that is online in the disk cache of a QFS filesystem is read using buffered I/O. The system stays in this state until the process performing the I/O is killed.

The problem does not appear immediately. It is triggered only after roughly 50-100 GBytes of data have been read, and it does not occur when the file is only a few tens of GBytes in size.
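To make the access pattern concrete, the following is a minimal sketch of a sequential buffered read that reports how much data has been read so far. It is an illustration only: the article does not say which program performed the read, and the function name `read_buffered`, the 1 MiB chunk size, and the 10 GiB reporting interval are assumptions chosen for this example. "Buffered" here means ordinary read calls that go through the operating system's page cache rather than direct I/O.

```python
import sys


def read_buffered(path, chunk_size=1024 * 1024, report_every=10 * 1024**3):
    """Read `path` sequentially with ordinary (page-cached) reads.

    Returns the total number of bytes read. The name and the default
    sizes are hypothetical choices for this sketch, not values taken
    from the article.
    """
    total = 0
    next_report = report_every
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
            if total >= next_report:
                # Progress marker, useful for noting how far the read
                # got before the system started to degrade.
                print(f"{total / 1024**3:.0f} GiB read", flush=True)
                next_report += report_every
    return total


if __name__ == "__main__" and len(sys.argv) > 1:
    print(f"total bytes read: {read_buffered(sys.argv[1])}")
```

Run against a file of well over 100 GBytes, the progress lines would show roughly where in the 50-100 GByte range the slowdown begins. Running it on a production MDS is not advisable, since the symptom described above affects the entire system.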

While the system is degraded, no obvious problems are reported by top (lots of free memory), iostat (no significant I/O to the QFS filesystem), or intrstat. However, the output of "lockstat uptime" shows a very large amount of kernel lock activity.

Another observation is that whenever the problem occurs, the amount of free memory reported by /usr/bin/top jumps from a small value, such as 4 GBytes, to a large value, as in the following output:

Memory: 48G phys mem, 41G free mem, 4096M total swap, 3802M free swap

This is unusual: a traditional performance problem arises when little free memory is available, whereas this problem coincides with a large amount of memory being freed.
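Because the jump in free memory is a distinctive marker of this problem, it can be tracked by parsing the "Memory:" summary line that top prints. The sketch below is one way to do that, assuming the line has the same layout as the sample output in this article; the function names `parse_top_memory` and `free_fraction` are hypothetical helpers, and treating K/M/G/T as powers of 1024 is an assumption about how top scales its figures.

```python
import re

# Assumed binary scaling for the unit suffixes printed by top.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Matches items such as "48G phys mem" or "3802M free swap".
_ITEM = re.compile(r"(\d+(?:\.\d+)?)([KMGT])\s+([a-z ]+?)(?:,|$)")


def parse_top_memory(line):
    """Return a dict mapping labels like 'free mem' to a size in bytes."""
    fields = {}
    for number, unit, label in _ITEM.findall(line.strip()):
        fields[label.strip()] = float(number) * UNITS[unit]
    return fields


def free_fraction(line):
    """Fraction of physical memory that top reports as free."""
    fields = parse_top_memory(line)
    return fields["free mem"] / fields["phys mem"]


sample = "Memory: 48G phys mem, 41G free mem, 4096M total swap, 3802M free swap"
print(f"free fraction: {free_fraction(sample):.2f}")
```

Sampling this value at intervals during a large read would show the moment the free fraction rises sharply, which can then be correlated with the lockstat output.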


MDS machines were upgraded from S11.1/SAM-QFS 5.3 to S11.2/SAM-QFS 5.4.9.

