Kernel I/O Scheduler and OCFS2 issues
Last updated on SEPTEMBER 27, 2013
Applies to: Linux OS - Version 1.2.0-1 to 1.2.5-1 [Release OCFS2]
When a server runs with the default CFQ I/O scheduler, a process doing heavy I/O can temporarily prevent other processes from running. While this is not fatal in most environments, it is for OCFS2, which expects the heartbeat thread to read from and write to the heartbeat area at least once every 12 seconds (default).
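To see which I/O scheduler a block device is currently using, the kernel exposes the list of schedulers in sysfs with the active one in square brackets. The following is a minimal sketch, assuming the standard /sys/block/<device>/queue/scheduler file is present; the function names and the device name "sda" are illustrative only and are not part of OCFS2 or any Oracle tool.

```python
import re
from pathlib import Path


def active_scheduler(sysfs_text: str) -> str:
    """Return the scheduler name marked with square brackets.

    The sysfs file lists every available scheduler and brackets the
    active one, for example: "noop anticipatory deadline [cfq]".
    """
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    if match is None:
        raise ValueError(f"no active scheduler marked in: {sysfs_text!r}")
    return match.group(1)


def read_active_scheduler(device: str) -> str:
    """Read the active scheduler for a block device (device name is an assumption)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())
```

Calling read_active_scheduler("sda") on a server returns the name of the scheduler in use for that device; a result of "cfq" indicates the configuration described above.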
2. OCFS2 fencing issues
A node fences itself if it fails to update its timestamp for "(O2CB_HEARTBEAT_THRESHOLD - 1) * 2" seconds. After every timestamp write, the [o2hb-xx] kernel thread sets a timer to panic the system after that duration. If the next timestamp is written within that duration, as it should be, the thread first cancels that timer before setting up a new one. This ensures that the system self-fences if for some reason the [o2hb-xx] kernel thread is unable to update the timestamp and is thus deemed dead by the other nodes in the cluster.
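The arithmetic behind the fencing timeout can be sketched as follows. This is only an illustration of the formula quoted above, not OCFS2 code; the function names are hypothetical. By that formula, a threshold of 7 corresponds to the 12-second interval mentioned earlier.

```python
def fence_timeout_seconds(heartbeat_threshold: int) -> int:
    """Seconds a node may go without a timestamp update before it self-fences.

    Implements (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 from the text above.
    """
    if heartbeat_threshold < 2:
        raise ValueError("threshold must be at least 2 to give a positive timeout")
    return (heartbeat_threshold - 1) * 2


def threshold_for_timeout(seconds: int) -> int:
    """Smallest threshold whose fencing timeout is at least `seconds`."""
    if seconds < 1:
        raise ValueError("timeout must be at least 1 second")
    half_rounded_up = -(-seconds // 2)  # ceiling of seconds / 2
    return half_rounded_up + 1
```

For example, fence_timeout_seconds(7) evaluates to 12, and threshold_for_timeout(12) evaluates to 7.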
This bug has been addressed by Red Hat in RHEL4 U4 (2.6.9-42.EL) and Novell in SLES9 SP3 (2.6.5-7.257).