Oracle VM: DomU/Guest VM Nodes OCFS2 Cluster Failure Causing The Oracle DBs To Crash (Doc ID 2623118.1)

Last updated on DECEMBER 24, 2019

Applies to:

Oracle VM - Version 3.2.10 to 3.4.6 [Release OVM32 to OVM34]
Linux x86-64

Symptoms

Oracle Linux DomU (guest VM) nodes in an OCFS2 cluster keep failing, causing the Oracle database to crash.

Further details are found in /var/log/messages on each node:

Node01:/var/log/messages:

Dec 17 11:12:29 guestvm_node01 kernel: [27982.108620] o2hb: receive negotiate timeout message from node 1 on region AFD92117A4044E7A96716C5332A9F263 (xvde).
Dec 17 11:12:41 guestvm_node01 kernel: [27994.261074] o2hb: node 0 hb write hung for 30s on region AFD92117A4044E7A96716C5332A9F263 (xvde).
Dec 17 11:12:41 guestvm_node01 kernel: [27994.264246] o2hb: all nodes hb write hung, maybe region AFD92117A4044E7A96716C5332A9F263 (xvde) is down.
Dec 17 11:12:50 guestvm_node01 o2hbmonitor: Last ping 38708 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:12:52 guestvm_node01 o2hbmonitor: Last ping 40708 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:12:54 guestvm_node01 o2hbmonitor: Last ping 42708 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:12:56 guestvm_node01 o2hbmonitor: Last ping 44709 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:12:58 guestvm_node01 o2hbmonitor: Last ping 46709 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:00 guestvm_node01 o2hbmonitor: Last ping 48709 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:02 guestvm_node01 o2hbmonitor: Last ping 50709 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:04 guestvm_node01 o2hbmonitor: Last ping 52710 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:06 guestvm_node01 o2hbmonitor: Last ping 54710 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:08 guestvm_node01 o2hbmonitor: Last ping 56710 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:10 guestvm_node01 o2hbmonitor: Last ping 58710 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 17 11:13:11 guestvm_node01 kernel: [28024.267054] o2hb: node 0 hb write hung for 30s on region AFD92117A4044E7A96716C5332A9F263 (xvde).
Dec 17 11:13:12 guestvm_node01 o2hbmonitor: Last ping 60711 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263

Node02:/var/log/messages:

Dec 18 00:06:03 guestvm_node02 kernel: [329951.699026] o2net: No longer connected to node guestvm_node01 (num 0) at xxx.xxx.xxx.xxx:7777
Dec 18 00:06:03 guestvm_node02 kernel: [329951.699071] o2cb: o2dlm has evicted node 0 from domain AFD92117A4044E7A96716C5332A9F263
Dec 18 00:06:04 guestvm_node02 kernel: [329952.594386] ocfs2: Begin replay journal (node 0, slot 1) on device (202,64)
Dec 18 00:06:06 guestvm_node02 kernel: [329954.542053] o2dlm: Begin recovery on domain AFD92117A4044E7A96716C5332A9F263 for node 0
Dec 18 00:06:06 guestvm_node02 kernel: [329954.542079] o2dlm: Node 1 (me) is the Recovery Master for the dead node 0 in domain AFD92117A4044E7A96716C5332A9F263
Dec 18 00:06:06 guestvm_node02 kernel: [329954.542347] o2dlm: End recovery on domain AFD92117A4044E7A96716C5332A9F263
Dec 18 00:07:33 guestvm_node02 kernel: [330041.414336] ocfs2: End replay journal (node 0, slot 1) on device (202,64)
Dec 18 00:07:34 guestvm_node02 kernel: [330042.470642] ocfs2: Beginning quota recovery on device (202,64) for slot 1
Dec 18 00:07:35 guestvm_node02 kernel: [330043.432872] ocfs2: Finishing quota recovery on device (202,64) for slot 1
Dec 18 00:11:00 guestvm_node02 kernel: [330248.101794] o2net: Connection to node guestvm_node01 (num 0) at xxx.xxx.xxx.xxx:7777 shutdown, state 7
Dec 18 00:11:02 guestvm_node02 kernel: [330250.089676] o2net: Connected to node guestvm_node01 (num 0) at xxx.xxx.xxx.xxx:7777
Dec 18 00:11:11 guestvm_node02 kernel: [330259.195196] o2dlm: Node 0 joins domain AFD92117A4044E7A96716C5332A9F263 ( 0 1 ) 2 nodes
Dec 18 03:06:57 guestvm_node02 kernel: [340805.329060] o2hb: node 1 hb write hung for 30s on region AFD92117A4044E7A96716C5332A9F263 (xvde), negotiate timeout with node 0.
Dec 18 03:06:59 guestvm_node02 o2hbmonitor: Last ping 31871 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 18 03:07:01 guestvm_node02 o2hbmonitor: Last ping 33873 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 18 03:07:03 guestvm_node02 o2hbmonitor: Last ping 35873 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
Dec 18 03:10:53 guestvm_node02 kernel: [341041.467052] o2hb: node 1 hb write hung for 30s on region AFD92117A4044E7A96716C5332A9F263 (xvde), negotiate timeout with node 0.
Dec 18 03:16:03 guestvm_node02 kernel: [341351.512040] o2hb: node 1 hb write hung for 30s on region AFD92117A4044E7A96716C5332A9F263 (xvde), negotiate timeout with node 0.
Dec 18 03:16:34 guestvm_node02 kernel: [341382.076053] o2hb: node 1 hb write hung for 30s on region AFD92117A4044E7A96716C5332A9F263 (xvde), negotiate timeout with node 0.
Dec 18 03:16:35 guestvm_node02 o2hbmonitor: Last ping 31140 msecs ago on /dev/xvde, AFD92117A4044E7A96716C5332A9F263
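
When reviewing long /var/log/messages files for these symptoms, it can help to summarize the heartbeat messages rather than read them line by line. The following is a minimal Python sketch, not an Oracle-provided tool. The function name `summarize` and both regular expressions are assumptions modeled only on the log lines shown above; other OCFS2 or kernel versions may word these messages differently.

```python
import re

# Patterns modeled on the excerpts in this document (assumption: the
# message wording matches these samples; adjust for other versions).
PING_RE = re.compile(
    r"(?P<host>\S+) o2hbmonitor: Last ping (?P<ms>\d+) msecs ago "
    r"on (?P<dev>[^\s,]+)"
)
HUNG_RE = re.compile(
    r"o2hb: node (?P<node>\d+) hb write hung for (?P<secs>\d+)s "
    r"on region (?P<region>[0-9A-F]+)"
)


def summarize(lines):
    """Summarize OCFS2 heartbeat messages from an iterable of log lines.

    Returns a tuple (max_ping, hung):
      max_ping maps (host, device) to the largest "Last ping" delay in ms;
      hung maps a node number to its count of "hb write hung" messages.
    """
    max_ping = {}
    hung = {}
    for line in lines:
        m = PING_RE.search(line)
        if m:
            key = (m["host"], m["dev"])
            max_ping[key] = max(max_ping.get(key, 0), int(m["ms"]))
            continue
        m = HUNG_RE.search(line)
        if m:
            node = int(m["node"])
            hung[node] = hung.get(node, 0) + 1
    return max_ping, hung
```

For example, passing an open file object for a node's /var/log/messages to `summarize` returns the worst heartbeat delay seen per device and the number of hung-write events per node, which makes it easier to compare the two nodes side by side.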

Changes

A newly provisioned OCFS2 cluster file system.

Cause

To view full details, sign in with your My Oracle Support account.


In this Document
Symptoms
Changes
Cause
Solution
References

