GI CRSD Crashes After Message T_HA_FINISH_TAG (Doc ID 1928364.1)

Last updated on SEPTEMBER 26, 2014

Applies to:

Oracle Database - Enterprise Edition - Version 12.1.0.1 to 12.1.0.1 [Release 12.1]
Information in this document applies to any platform.

Symptoms

Grid Infrastructure CRSD crashes after the following messages are printed:

 

<ORACLE_BASE>/diag/crs/<node>/crs/trace/crsd.trc 

2014-09-20 09:21:42.135: [ CRSPE][1166063936] {3:28695:157} Cmd : 0x2b1e480bc2f0 : flags: EVENT_TAG | HOST_TAG | QUEUE_TAG | T_HA_FINISH_TAG
2014-09-20 09:21:42.135: [ CRSPE][1166063936] {3:28695:157} Processing PE command id=214. Description: [Start Resource : 0x2b1e480bc2f0]
2014-09-20 09:21:42.135: [ CRSPE][1166063936] {3:28695:157} Attribute overrides for the command: USR_ORA_OPI = true;
2014-09-20 09:21:42.135: [ CRSPE][1166063936] {3:28695:157} Start cmd 0x2b1e480bc2f0 creating ops...
2014-09-20 09:21:42.135: [ CRSD][1166063936] {3:28695:157} Dump State Starting ...
2014-09-20 09:21:42.135: [ CRSD][1166063936] {3:28695:157} State Dump for RTILock
2014-09-20 09:21:42.155: [ CRSPE][1166063936] {3:28695:157} Dumping PE Data Model...:DM has [90 resources][46 types][4 servers][42 spools][0 categories]
------------- RESOURCES:
....
....
2014-09-20 09:21:42.158: [ CRSPE][1166063936] {3:28695:157} Dumping ICE contents...:ICE operation count: 0
2014-09-20 09:21:42.158: [ CRSCCL][1166063936]--- Dumping CCL state ---
2014-09-20 09:21:42.158: [ CRSCCL][1166063936]Dumping member data ------------------
2014-09-20 09:21:42.158: [ CRSCCL][1166063936]Member (1, 7196984) on node racnode01 port=.
2014-09-20 09:21:42.158: [ CRSCCL][1166063936]Member (2, 7111054) on node racnode02 port=.
2014-09-20 09:21:42.158: [ CRSCCL][1166063936]Done ------------------
2014-09-20 09:21:42.158: [ CRSCCL][1166063936]--- End CCL state ---
2014-09-20 09:21:42.158: [CLSFRAME][1166063936] {3:28695:157} Thread Model [PolicyEngineTM] owns the thread where crash happened. Will try to dump the active message:0x2b1e386a2bb0
2014-09-20 09:21:42.158: [CLSFRAME][1166063936] {3:28695:157} MIDTo:2|OpID:2|FromA:{Absolute|Node:3|Process:7172004|Type:1}|ToA:{Absolute|Node:4|Process:7212514|Type:1}|MIDFrom:6|Type:1|Pri2|Id:917:Ver:2Value params:UI_CONTAINER=c13|8!UI_STARTk11|API_HDR_VERt1|2k9|ASYNC_TAGt1|1k6|CLIENTt0|k11|CLIENT_NAMEt37|/u02/oracle/product/1120GL/bin/oraclek10|CLIENT_PIDt4|2406k20|CLIENT_PRIMARY_GROUPt3|dbak9|EVENT_TAGt1|1k4|HOSTt10|njcracpr03k8|HOST_TAGt1|1k6|LOCALEt25|AMERICAN_AMERICA.AL32UTF8k9|QUEUE_TAGt1|1k8|RESOURCEt50|ora.glprod.applsys.wf_control.svcUSR_ORA_OPI=truek15|T_HA_FINISH_TAGt1|1|String params:CMD=UI_START|FROM_SN=njcracpr03|UI_USERNAME=oraclep|Int params:CMD_ID=214|RCTX_VER=0|UICMDID=114|UI_ELEM_COUNT=13|UI_PROD_MAJOR=11|UI_PROD_MINOR=2|UI_PROT_MAJOR=2|UI_PROT_MINOR=0|_RC=0|
2014-09-20 09:21:42.158: [ CRSD][1166063936] {3:28695:157} Dump State Done.
2014-09-20 09:21:43.355: [ OCRSRV][1038133568]ath_conn_manage: Invalidate_con_table[0] is NULL
2014-09-20 09:21:43.356: [ OCRSRV][1038133568]ath_conn_manage: Invalidate_con_table[0] is NULL
2014-09-20 09:21:43.356: [ OCRSRV][1038133568]ath_conn_manage: Invalidate_con_table[0] is NULL
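
An existing crsd.trc can be checked for this signature by searching for the messages shown above. The following is a minimal sketch, assuming the default 12.1 trace location quoted above and a POSIX shell; substitute the actual ORACLE_BASE and node directory for the placeholders:

# Confirm that the last start command processed by the Policy Engine carried T_HA_FINISH_TAG
grep -n "T_HA_FINISH_TAG" <ORACLE_BASE>/diag/crs/<node>/crs/trace/crsd.trc | tail -5

# Confirm that the trace ends with the PE state dump triggered by the crash
grep -n -e "Dump State Starting" -e "Dump State Done" <ORACLE_BASE>/diag/crs/<node>/crs/trace/crsd.trc | tail -5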

 

<ORACLE_BASE>/crsdata/<node>/output/crsdOUT.trc

2014-09-20 09:20:49 
CRSD REBOOT
CRSD handling signal 11
Dumping CRSD state 
Dumping CRSD stack trace


----- Call Stack Trace -----
(one frame per entry: calling location and source reference, call type, entry point;
 argument values in hex on the indented line below, ? means dubious value)

sclssutl_sigdump()+980  sclssutl.c:271   call   kgdsdst()
    2B1E458018C0 ? 000000000 ? 2B1E457E4180 ? 2B1E457E4298 ? 2B1E45801F78 ? 2B1E45801870 ?
sclssutl_signalhandler()+119  sclssutl.c:305   call   sclssutl_sigdump()
    00000000B ? 000000000 ? 2B1E457E4180 ? 2B1E457E4298 ? 2B1E45801F78 ? 2B1E45801870 ?
__sighandler()   call   sclssutl_signalhandler()
    00000000B ? 000000000 ? 2B1E457E4180 ? 2B1E457E4298 ? 2B1E45801F78 ? 2B1E45801870 ?
_ZN8cls_pe129UiCommand19finishThaProcessingEv()+697  clsPeUiCmd12.cpp:2690   signal   __sighandler()
    2B1E45803158 ? 2B1E45803148 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN8cls_pe1222StartResourceUiCommand19createIceOperationsEv()+4143  clsPeStartResUiCmd12.cpp:286   call   _ZN8cls_pe129UiCommand19finishThaProcessingEv()
    2B1E480BC2F0 ? 2B1E45803148 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN8cls_pe127Command7processEv()+1337  clsPeCmd12.cpp:269   call   _ZN8cls_pe1222StartResourceUiCommand19createIceOperationsEv()
    2B1E480BC2F0 ? 2B1E45803148 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN8cls_pe1218PolicyEngineModule15invokeUiCommandEPNS_7CommandE()+944  clsPeModule12.cpp:2467   call   _ZN8cls_pe127Command7processEv()
    2B1E480BC2F0 ? 2B1E45803148 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN8cls_pe1218PolicyEngineModule23processUiCommandHandlerEPNS_14PeUiCmdMessageE()+5612  clsPeModule12.cpp:2437   call   _ZN8cls_pe1218PolicyEngineModule15invokeUiCommandEPNS_7CommandE()
    01FCB89A0 ? 2B1E480BC2F0 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN3cls11ThreadModel12processQueueEP7sltstid()+2540  clsThreadModel.cpp:276   call   _ZN8cls_pe1218PolicyEngineModule23processUiCommandHandlerEPNS_14PeUiCmdMessageE()
    01FCB89A0 ? 2B1E386A2BB0 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN3cls11ThreadModel5runTMEPv()+152  clsThreadModel.cpp:10   call   _ZN3cls11ThreadModel12processQueueEP7sltstid()
    2B1E381AD400 ? 2B1E45806F80 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
_ZN13CLS_Threading13CLSthreadMain8cppStartEPv()+49  clsThreadMain.cpp:97   call   _ZN3cls11ThreadModel5runTMEPv()
    2B1E381AD400 ? 2B1E45806F80 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
start_thread()+221   call   _ZN13CLS_Threading13CLSthreadMain8cppStartEPv()
    2B1E4003BD70 ? 2B1E45806F80 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
clone()+109   call   start_thread()
    2B1E4580B940 ? 2B1E45806F80 ? 000000080 ? 000000001 ? 2B1E387CC668 ? 000000021 ?
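
The stack above shows a SIGSEGV (signal 11) taken inside _ZN8cls_pe129UiCommand19finishThaProcessingEv, i.e. cls_pe12::UiCommand::finishThaProcessing(), called from StartResourceUiCommand::createIceOperations() while the Policy Engine builds the start operations. A quick way to confirm the same stack on another node is to search crsdOUT.trc for the signal message and those frames. A minimal sketch, assuming the crsdOUT.trc location shown above (short pattern fragments are used because the trace column-wraps long symbol names):

# Confirm the signal 11 / reboot markers
grep -n -e "CRSD REBOOT" -e "handling signal" <ORACLE_BASE>/crsdata/<node>/output/crsdOUT.trc | tail -5

# Confirm the crashing frames
grep -n -e "finishThaProce" -e "createIceOperations" <ORACLE_BASE>/crsdata/<node>/output/crsdOUT.trc | tail -10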

 

If the issue occurs during rootupgrade.sh, the root script log will likely contain the following:

<NEW_GI_HOME>/cfgtoollogs/crsconfig/rootcrs_<node>_<timestamp>.log 

2014-09-20 09:21:44: Executing cmd: /u02/grid/product/12101GI/bin/srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last
2014-09-20 09:21:55: Command output:
> PRCR-1115 : Failed to find entities of type resource type that match filters null and contain attributes
> Cannot communicate with crsd
>End Command output
2014-09-20 09:21:55: "/u02/grid/product/12101GI/bin/srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last" failed with status 1.
2014-09-20 09:21:55: Executing cmd: /u02/grid/product/12101GI/bin/clsecho -p has -f clsrsc -m 180 "srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last" "0"
2014-09-20 09:21:55: Command output:
> CLSRSC-180: An error occurred while executing the command 'srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last' (error code 0)
>End Command output
2014-09-20 09:21:55: Executing cmd: /u02/grid/product/12101GI/bin/clsecho -p has -f clsrsc -m 180 "srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last" "0"
2014-09-20 09:21:55: Command output:
> CLSRSC-180: An error occurred while executing the command 'srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last' (error code 0)
>End Command output
2014-09-20 09:21:55: CLSRSC-180: An error occurred while executing the command 'srvctl upgrade model -s 11.2.0.3.0 -d 12.1.0.1.0 -p last' (error code 0)
2014-09-20 09:21:55: Running as user oraclep: /u02/grid/product/12101GI/bin/cluutil -ckpt -oraclebase /u02/oracle -writeckpt -name ROOTCRS_NODECONFIG -state FAIL
2014-09-20 09:21:55: s_run_as_user2: Running /bin/su oraclep -c ' /u02/grid/product/12101GI/bin/cluutil -ckpt -oraclebase /u02/oracle -writeckpt -name ROOTCRS_NODECONFIG -state FAIL '
2014-09-20 09:21:57: Removing file /tmp/fileWG9mLv
2014-09-20 09:21:57: Successfully removed file: /tmp/fileWG9mLv
2014-09-20 09:21:57: /bin/su successfully executed

2014-09-20 09:21:57: Succeeded in writing the checkpoint:'ROOTCRS_NODECONFIG' with status:FAIL
2014-09-20 09:21:57: Executing cmd: /u02/grid/product/12101GI/bin/clsecho -p has -f clsrsc -m 284
2014-09-20 09:21:57: Command output:
> CLSRSC-284: Failed to perform last node tasks for cluster modeling upgrade
>End Command output
2014-09-20 09:21:57: Executing cmd: /u02/grid/product/12101GI/bin/clsecho -p has -f clsrsc -m 284
2014-09-20 09:21:57: Command output:
> CLSRSC-284: Failed to perform last node tasks for cluster modeling upgrade
>End Command output
2014-09-20 09:21:57: CLSRSC-284: Failed to perform last node tasks for cluster modeling upgrade
2014-09-20 09:21:57: ###### Begin DIE Stack Trace ######
2014-09-20 09:21:57: Package File Line Calling
2014-09-20 09:21:57: --------------- -------------------- ---- ----------
2014-09-20 09:21:57: 1: main rootcrs.pl 211 crsutils::dietrap
2014-09-20 09:21:57: 2: crsupgrade crsupgrade.pm 3319 main::__ANON__
2014-09-20 09:21:57: 3: crsupgrade crsupgrade.pm 239 crsupgrade::upgrade_node
2014-09-20 09:21:57: 4: crsupgrade crsupgrade.pm 190 crsupgrade::CRSUpgrade
2014-09-20 09:21:57: 5: main rootcrs.pl 220 crsupgrade::new
2014-09-20 09:21:57: ####### End DIE Stack Trace #######
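
In the rootupgrade.sh case, the root script fails at the 'srvctl upgrade model' step because CRSD is no longer available. The state can be cross-checked before retrying or engaging Support. A minimal sketch, run as root on the affected node, assuming the new 12.1 Grid home denoted <NEW_GI_HOME> above:

# Cluster Ready Services will report as unreachable after the crash
<NEW_GI_HOME>/bin/crsctl check crs

# Locate the failed upgrade step and the resulting CLSRSC errors in the root script log
grep -n -e "srvctl upgrade model" -e "CLSRSC-180" -e "CLSRSC-284" <NEW_GI_HOME>/cfgtoollogs/crsconfig/rootcrs_<node>_<timestamp>.log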

 

 

Cause
