Cellsrv fails to start ORA-07445: exception encountered: core dump _Z17raiseStartupErrorP5kgsmpPKcj10assertType11restartType() (Doc ID 1384448.1)

Last updated on AUGUST 19, 2014

Applies to:

Exadata Database Machine V2
Linux OS - Version Oracle Linux 5.0 to Oracle Linux 5.0 [Release OL5]
Information in this document applies to any platform.

Symptoms

One of the Exadata storage cells is unable to start the cellsrv service.


/etc/init.d/celld status
rsStatus: running
msStatus: running
cellsrvStatus: stopped
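
The status check above can be scripted when several cells need to be inspected. The following is a minimal sketch, not an Oracle-supplied tool: the function name `stopped_services` is hypothetical, and it assumes the output keeps the one-per-line `name: state` layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the cell software): reads
# "celld status"-style output on stdin and prints the name of every
# service whose state is anything other than "running".
# Assumption: one "name: state" pair per line, as in the output above.
stopped_services() {
    awk -F': *' 'NF == 2 && $2 != "running" { print $1 }'
}

# Example usage on a storage cell (left commented out here):
#   /etc/init.d/celld status | stopped_services
```

Fed the three status lines above, the function prints only `cellsrvStatus`.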


The cell alert history shows:

2011-12-02T19:45:28+01:00 critical "ORA-07445: exception encountered: core dump [_Z17raiseStartupErrorP5kgsmpPKcj10assertType11restartType()+317] [11] [0x000000000] [] [] []"


The cell alert log shows:

[RS] Started monitoring process /opt/oracle/cell11.2.2.3.2_LINUX.X64_110520/cellsrv/bin/cellrssmt with pid 9922
Fri Dec 02 19:45:28 2011
Successfully setting event parameter -
CELLSRV process id=9919
CELLSRV cell host name=fraartou01cel05.de.db.com
CELLSRV version=11.2.2.3.2,label=OSS_11.2.0.3.0_LINUX.X64_110520,Fri_May_20_05:23:52_PDT_2011
No Infiniband device found.
CELLSRV could not initialize infiniband context
CELLSRV failed to start due to the error 15 ((null))
Errors in file /opt/oracle/cell11.2.2.3.2_LINUX.X64_110520/log/diag/asm/cell/fraartou01cel05/trace/svtrc_9919_0.trc (incident=17):
ORA-07445: exception encountered: core dump [_Z17raiseStartupErrorP5kgsmpPKcj10assertType11restartType()+317] [11] [0x000000000] [] [] []
Incident details in: /opt/oracle/cell11.2.2.3.2_LINUX.X64_110520/log/diag/asm/cell/fraartou01cel05/incident/incdir_17/svtrc_9919_0_i17.trc


The diag trace shows:


Trace file /opt/oracle/cell11.2.2.3.2_LINUX.X64_110520/log/diag/asm/cell/fraartou01cel05/trace/svtrc_9919_0.trc
ORACLE_HOME = /opt/oracle/cell11.2.2.3.2_LINUX.X64_110520
System name: Linux
Node name: fraartou01cel05.de.db.com
Release: 2.6.18-194.3.1.0.4.el5
Version: #1 SMP Sat Feb 19 03:38:37 EST 2011
Machine: x86_64
CELL SW Version: OSS_11.2.0.3.0_LINUX.X64_110520

*** 2011-12-02 19:45:28.441
skgzib_load_ib_symbols: loaded infiniband libraries from libibumad.so.2.0.0 and libibmad.so.2.2.0
skgzib_ini: SKGZIB is running in debug mode.
skgzib_query_node_guid: No Infiniband device found.
skgzib_ini: Fail to query GUIDs on the current system.
ossp_initskgzib: Fail to init skgzib context, retcode = 56881
Writing message type OSS_PIPE_ERR_DNR to OSS->RS pipe
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
DDE: Flood control is not active
Incident 17 created, dump file: /opt/oracle/cell11.2.2.3.2_LINUX.X64_110520/log/diag/asm/cell/fraartou01cel05/incident/incdir_17/svtrc_9919_0_i17.trc
ORA-07445: exception encountered: core dump [_Z17raiseStartupErrorP5kgsmpPKcj10assertType11restartType()+317] [11] [0x000000000] [] [] []
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe



/var/log/messages shows:


Dec 2 21:50:41 fraartou01cel05 kernel: Registered RDS/iwarp transport
Dec 2 21:50:41 fraartou01cel05 kernel: Registered RDS/infiniband transport
Dec 2 21:50:41 fraartou01cel05 kernel: Ethernet Channel Bonding Driver: v3.2.3 (December 6, 2007)
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0 is being created...
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0: setting mode to active-backup (1).
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0: Setting MII monitoring interval to 100.
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0: Setting down delay to 5000.
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0: Setting up delay to 5000.
Dec 2 21:50:41 fraartou01cel05 kernel: ADDRCONF(NETDEV_UP): bondib0: link is not ready
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0: Adding slave ib0.
Dec 2 21:50:41 fraartou01cel05 kernel: bonding: bondib0: Interface ib0 does not exist!
Dec 2 21:50:41 fraartou01cel05 kernel: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
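
The telling line in the excerpt above is the bonding driver reporting that slave interface ib0 does not exist. A minimal sketch for pulling such lines out of a larger syslog follows; the function name `missing_bond_slaves` is hypothetical, and the pattern assumes the exact message wording seen above, which may differ in other kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog lines on stdin and prints
# "<bond> <interface>" for every bonding-driver message of the form
#   bonding: <bond>: Interface <interface> does not exist!
# Assumption: the message wording matches the excerpt above.
missing_bond_slaves() {
    sed -n 's/.*bonding: \([^:]*\): Interface \([^ ]*\) does not exist!.*/\1 \2/p'
}

# Example usage (left commented out here):
#   missing_bond_slaves < /var/log/messages
```

For the excerpt above this yields `bondib0 ib0`, confirming that the bond was created but its InfiniBand slave interface was never registered by the kernel.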


Running the InfiniBand diagnostic commands fails:



# ibstatus
Fatal error: device '*': sys files not found (/sys/class/infiniband/*/ports)
# ibv_devinfo
No IB devices found
# ibdiagnet
Loading IBDIAGNET from: /usr/lib64/ibdiagnet1.2
-W- Topology file is not specified.
Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib64/ibdm1.2

-E- IBIS: No HCA was found on local machine.
Exiting.
# ibhosts
ibwarn: [10306] mad_rpc_open_port: can't open UMAD port ((null):0)
/usr/sbin/ibnetdiscover: iberror: failed: Failed to open (null) port 0


No InfiniBand device is found on the system:

lspci | grep InfiniBand
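
As a complement to the lspci check, the sysfs directory named in the ibstatus error above can be inspected directly. This is a minimal sketch under stated assumptions: the function name `count_ib_devices` is hypothetical, and the default path `/sys/class/infiniband` is taken from the ibstatus message rather than from any Oracle procedure.

```shell
#!/bin/sh
# Hypothetical helper: counts the entries under a sysfs InfiniBand
# class directory. The optional first argument overrides the default
# path, which is taken from the ibstatus error message above.
count_ib_devices() {
    dir=${1:-/sys/class/infiniband}
    count=0
    for entry in "$dir"/*; do
        # An unmatched glob stays literal, so confirm the entry exists.
        if [ -e "$entry" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example usage (left commented out here):
#   count_ib_devices
```

A result of 0 is consistent with the failures above: the kernel has registered no InfiniBand device, so the user-space tools have nothing to open.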



Changes

The HBA battery was recently replaced.

Cause
