Concurrent Shutdown of Both Data Nodes in One Node Group Causes Crash and Potential Corruption of the Cluster File System - Error: 2341 ; Error Object: SUMA (Doc ID 1601180.1)

Last updated on NOVEMBER 23, 2015

Applies to:

MySQL Cluster - Version 7.1 and later
Information in this document applies to any platform.

Symptoms

Shutting down two data nodes in the same node group concurrently, for example with the following commands (assuming node id 1 and node id 2 are in the same node group):

shell> ndb_mgm -e "1 STOP -f"
...
shell> ndb_mgm -e "2 STOP -f"

can cause a crash during the handover process between the two nodes:

Time: Tuesday 12 November 2013 - 15:05:49
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: Suma.cpp
Error object: SUMA (Line: 4895) 0x00000006
Program: ndbmtd
Pid: 23246 thr: 1
Version: mysql-5.5.34 ndb-7.2.14
Trace: /opt/mysql/Cluster-7.2.14/data/node_2/ndb_2_trace.log.1 [t1..t7]
***EOM***

One of the two nodes may crash in the same way, or with the following error:

Time: Tuesday 12 November 2013 - 15:05:49
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbdihMain.cpp
Error object: DBDIH (Line: 9324) 0x00000006
Program: ndbmtd
Pid: 24684 thr: 0
Version: mysql-5.5.34 ndb-7.2.14
Trace: /opt/mysql/Cluster-7.2.14/data/node_1/ndb_1_trace.log.1 [t1..t7]
***EOM***
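To check whether a data node has logged this failure, the error log can be filtered for entries with error code 2341. The following is a minimal sketch, not an official tool: the function name find_2341 is hypothetical, and it assumes each entry has the Time / Error / Error object / ***EOM*** layout shown above.

```shell
# Read an NDB error log on stdin and print the time and error object
# of every entry whose error code is 2341.
# Assumes entries are laid out as in the examples above and end with ***EOM***.
find_2341() {
    awk '
        /^Time:/           { time = $0 }      # remember when the entry was logged
        /^Error: /         { code = $2 }      # numeric error code of the entry
        /^Error object: /  { if (code == "2341") print time " | " $0 }
        $0 == "***EOM***"  { code = ""; time = "" }   # end of entry, reset state
    '
}
```

A possible invocation, assuming the error log is named ndb_2_error.log and sits next to the trace file listed above:

shell> find_2341 < /opt/mysql/Cluster-7.2.14/data/node_2/ndb_2_error.log

An error object of SUMA or DBDIH in the output matches the symptoms described in this document.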

 

<node_id> STOP -f is used here as an example of how to initiate the shutdown. However, the issue can occur whenever two data nodes in the same node group are shutting down in such a way that each node attempts to hand over to its peer while the peer is also shutting down. For example, this can also happen during a cluster crash.
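Because the issue only affects data nodes that share a node group, it can help to confirm the grouping before stopping nodes. The following is a minimal sketch under stated assumptions: the function name same_nodegroup is hypothetical, and it assumes that the SHOW command of ndb_mgm lists each started data node on a line of the form "id=1 @host (..., Nodegroup: 0)". Nodes that are not started report no node group and are treated as not matching.

```shell
# Read the output of `ndb_mgm -e "SHOW"` on stdin and succeed (exit 0)
# only if both given node ids report the same node group.
# Assumes data node lines start with "id=<n>" and contain "Nodegroup: <g>".
same_nodegroup() {
    awk -v a="$1" -v b="$2" '
        /^id=/ && /Nodegroup:/ {
            id = $1; sub(/^id=/, "", id)             # node id without the prefix
            ng = $0; sub(/.*Nodegroup: */, "", ng)   # drop everything up to the group
            sub(/[^0-9].*/, "", ng)                  # keep only the group number
            group[id] = ng
        }
        END { exit !((a in group) && (b in group) && group[a] == group[b]) }
    '
}
```

A possible invocation for the node ids used in this document:

shell> ndb_mgm -e "SHOW" | same_nodegroup 1 2 && echo "nodes 1 and 2 are in the same node group"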

 

Upon restarting the cluster, one or more data nodes may require an --initial restart to recover.

The error message during a failed restart varies; the following error messages are examples:

Cause
