AIX: OSYSMOND.BIN CORE Dumps & 'ora.crf' Fails (Doc ID 1549693.1)

Last updated on OCTOBER 13, 2016

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.3 to 11.2.0.4 [Release 11.2]
IBM AIX on POWER Systems (64-bit)

Symptoms

In 11gR2 Grid Infrastructure (GI), osysmond.bin fails in one of the following two ways:

Case I

crfmond.log

2013-03-13 04:46:10.310: [ CRFMOND][1802]OSD error: op="stat64" loc="getdevid1" other="" dep="2"
2013-03-13 04:46:10.410: [ CRFMOND][1802]crfmond_refresh: mark oracle disks encountered errors
2013-03-13 04:53:48.640: [    CRFM][1288]SYNC crfm_send: send fail  datasize 122 sbuf 1137d51f0, msglen 166, olen 0, ret 1
2013-03-13 04:53:48.640: [ CRFMOND][1288]ipcmsghdlr: crfm_send failed
2013-03-13 04:55:33.571: [    CRFM][1288]SYNC crfm_send: send fail  datasize 122 sbuf 114facf70, msglen 166, olen 0, ret 1
2013-03-13 04:55:33.571: [ CRFMOND][1288]ipcmsghdlr: crfm_send failed
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_answer:gipcWait failed with 1
2013-03-13 04:55:33.572: [    CRFM][1288]crfmctx dump follows
2013-03-13 04:55:33.572: [    CRFM][1288]****************************
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: connection local name: ipc://spn15gdb1oraclepn15gdb-clusterCRFM_SIPC
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: connection peer name:  
..
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: connaddr:  ipc://spn15gdb1oraclepn15gdb-clusterCRFM_SIPC
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: ctype:  1
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: mytype:  1
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: hostname  pn15gdb1
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: myport:  0
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: rhostname  
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: rport:  
2013-03-13 04:55:33.572: [    CRFM][1288]crfm_dumpctx: flags:  5
2013-03-13 04:55:33.572: [    CRFM][1288]****************************
2013-03-13 04:56:28.523: [    CRFM][1288]SYNC crfm_send: send fail  datasize 122 sbuf 111b9a870, msglen 166, olen 0, ret 1
2013-03-13 04:56:28.523: [ CRFMOND][1288]ipcmsghdlr: crfm_send failed
2013-03-13 04:56:48.521: [    CRFM][1288]crfm_answer: Wait failed2 timeout 2147483645. GIPC return values is 1

 

crfmondOUT.log

2013-03-13 04:58:03  
.
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
sclssutl_sigdump()+  bl       _ptrgl()             F100070F00038400 ?
440                                                900000000D0FD10 ?
                                                   8001000A0320578 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   01025F0C3 ? FFFFFFFFFFFBCF0 ?
sclssutl_signalhand  call     sclssutl_sigdump()   B0000000B ? 8001000A0320578 ?
ler()+84                                           00000000B ? 000000000 ?
                                                   000000000 ? 01025F0C3 ?
                                                   01025F0C3 ? 900000000D0FC7C ?
44f0                 call     sclssutl_signalhand  B0000000B ? 000000468 ?
                              ler()                000000001 ? 130D67AD4 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
                                                   7076000000000000 ?
odm_get_list()+556   call     44f0                 1613000016EB0 ?
                                                   1D5EC00020938 ?
                                                   20DA40002ED68 ?
                                                   37E61000514AE ?
                                                   8ABBFFFFAF64 ? 43A600000000 ?
                                                   4530000000000000 ?
                                                   190000743D ?
get_odm_vgname_from  call     odm_get_list()       FFFFFFFFFFFCE80 ?
_pvid()+68                                         9001000A01CEFF0 ?
                                                   FFFFFFFFFFFDB50 ? 000000000 ?
                                                   000000000 ? 01025F0C3 ?
                                                   01025F0C3 ? 011B96620 ?
__get_lvm_lsvg_root  call     get_odm_vgname_from  FFFFFFFFFFFD4A0 ?
()+196                        _pvid()              9001000A01CF038 ?
                                                   FFFFFFFFFFFDC10 ? 000000000 ?
                                                   000000000 ? 01025F0C3 ?
                                                   01025F0C3 ? 000000144 ?
get_lvm_metrics_wit  call     __get_lvm_lsvg_root  9000000004526A4 ? 000000000 ?
h_disk_dictionary()           ()                   000000000 ? 9001000A01D0AB0 ?
+412                                               000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
perfstat_disk()+856  call     get_lvm_metrics_wit  11BC1A4B8 ? 9001000A01CF564 ?
                              h_disk_dictionary()  11BC472A8 ? 000000002 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
scrfosm_num_disk_in  call     perfstat_disk()      008969DA0 ? 00E957489 ?
terfaces()+252                                     000000000 ? 000000000 ?
                                                   000000000 ? 0001DEDFF ?
                                                   000000000 ? 013E76F49 ?
scrfosm_get_all_dis  call     scrfosm_num_disk_in  FFFFFFFFFFFDFF0 ?
k_info()+296                  terfaces()           FFFFFFFFFFFDE40 ?
                                                   FFFFFFFFFFFDE58 ?
                                                   8001000A02CE228 ?
                                                   FFFFFFFFFFFDE80 ?
                                                   8244422200000000 ?
                                                   8000000000148F4 ? 700000000 ?
crfmond_loop()+2196  call     scrfosm_get_all_dis  FFFFFFFFFFFDFF0 ? 11093FED0 ?
                              k_info()             110BD0950 ? 1110FB030 ?

The ora.crf resource may fail on the server, and osysmond.bin generates a core dump.
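
The Case I signatures in crfmond.log can be counted with a short script. The following is a minimal sketch, not an Oracle-supplied tool: the function name `count_case1_signatures` and the pattern labels are illustrative assumptions, and the patterns are copied from the log excerpt above.

```python
import re

# Signatures taken from the crfmond.log excerpt in this note.
# The labels on the left are hypothetical names chosen for this sketch.
CASE1_PATTERNS = {
    "stat64_error": re.compile(r'OSD error: op="stat64" loc="getdevid1"'),
    "disk_mark_error": re.compile(r"mark oracle disks encountered errors"),
    "send_failed": re.compile(r"crfm_send failed"),
}


def count_case1_signatures(lines):
    """Count how many lines contain each Case I signature."""
    counts = {name: 0 for name in CASE1_PATTERNS}
    for line in lines:
        for name, pattern in CASE1_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, pass any iterable of log lines, for example an open file handle on a copy of crfmond.log. Non-zero counts only indicate that the log resembles Case I; they do not confirm the cause.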

Case II

osysmond.bin dumps core, but no information is written to crflogd.log or crfmond.log.

ohasd/orarootagent_root.log shows:

2012-09-28 20:08:44.719: [ora.crf][3355] {0:6:229} [check] Check return = 0, state detail = NULL
2012-09-28 20:09:10.245: [ USRTHRD][2923] {0:6:229} Check action requested by agent etnry point for ora.crf
2012-09-28 20:09:10.276: [    AGFW][2314] {0:6:229} Agent received the message: RESOURCE_PROBE[ora.crf 1 1] ID 4097:2830454
2012-09-28 20:09:10.277: [    AGFW][2314] {0:6:229} Preparing CHECK command for: ora.crf 1 1
2012-09-28 20:09:10.326: [ora.crf][1909] {0:6:229} [check] Error = error 11 encountered when sending messages to MOND
2012-09-28 20:09:10.326: [ora.crf][1909] {0:6:229} [check] Calling PID check for daemon
2012-09-28 20:09:10.326: [ora.crf][1909] {0:6:229} [check] Process id 21823648 translated to
2012-09-28 20:09:10.327: [ora.crf][1909] {0:6:229} [check] Error = error 6 encountered when sending messages to MOND
2012-09-28 20:09:10.327: [ora.crf][1909] {0:6:229} [check] Check return = 1, state detail = NULL
2012-09-28 20:09:10.345: [    AGFW][2314] {0:6:229} ora.crf 1 1 state changed from: ONLINE to: OFFLINE
2012-09-28 20:09:10.393: [    AGFW][2314] {0:6:231} Agent received the message: RESOURCE_START[ora.crf 1 1] ID 4098:1078090
2012-09-28 20:09:10.393: [    AGFW][2314] {0:6:231} Preparing START command for: ora.crf 1 1
2012-09-28 20:09:10.393: [    AGFW][2314] {0:6:231} ora.crf 1 1 state changed from: OFFLINE to: STARTING
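
The sequence in this excerpt (MOND send errors, followed by ora.crf going from ONLINE to OFFLINE and then STARTING) can be extracted from a long orarootagent_root.log with a small parser. This is a sketch under assumptions: the function name `summarize_crf_events` and the event labels are hypothetical, and the regular expressions are modeled only on the lines shown above.

```python
import re

# Timestamp prefix as it appears in the excerpt, e.g. "2012-09-28 20:09:10.345".
TS = r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"

# Agent framework line reporting a state transition of ora.crf.
STATE_CHANGE = re.compile(
    TS + r": .*ora\.crf 1 1 state changed from: (?P<old>\w+) to: (?P<new>\w+)"
)

# Check action line reporting a failure to reach the monitor daemon.
MOND_ERROR = re.compile(
    TS + r": .*\[check\] Error = error (?P<code>\d+) "
    r"encountered when sending messages to MOND"
)


def summarize_crf_events(lines):
    """Return (timestamp, kind, detail) tuples in the order they appear."""
    events = []
    for line in lines:
        m = STATE_CHANGE.match(line)
        if m:
            events.append((m["ts"], "state", f'{m["old"]} -> {m["new"]}'))
            continue
        m = MOND_ERROR.match(line)
        if m:
            events.append((m["ts"], "mond_error", int(m["code"])))
    return events
```

A run of `mond_error` events immediately before an `ONLINE -> OFFLINE` transition matches the pattern shown in this case.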

Stack trace from the core dump shows:

pthread_kill
_p_raise
sclssutl_signalhandler(sig = 6)
pthread_kill
_p_raise
sclssutl_signalhandler(sig = 11)
crfmond_fill_disk_info_fields
crfmond_compute_disk_stats
crfmond_compute_stats
crfmond_loop
clsmond_main
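
The stack lists frames innermost first, so the signal handler entry nearest the bottom marks the first signal raised. In this trace that is signal 11 (conventionally SIGSEGV) in crfmond_fill_disk_info_fields, followed by signal 6 (conventionally SIGABRT) raised from within the handler. The sketch below locates that first fault in a frame list of this shape; the function name `first_fault` is a hypothetical helper, and the input is assumed to be one frame per line as shown above.

```python
import re

# Handler frame as printed in the trace, e.g. "sclssutl_signalhandler(sig = 11)".
HANDLER = re.compile(r"sclssutl_signalhandler\(sig = (\d+)\)")


def first_fault(frames):
    """Return (signal_number, faulting_frame) for the earliest signal.

    Frames are expected innermost first. The search runs from the outermost
    frame inward, so the first handler found is the first signal raised, and
    the frame listed right after it is where execution was interrupted.
    Returns None when no handler frame is present.
    """
    for i in range(len(frames) - 1, -1, -1):
        m = HANDLER.search(frames[i])
        if m:
            below = frames[i + 1].strip() if i + 1 < len(frames) else None
            return int(m.group(1)), below
    return None
```

Applied to the trace above, this reports signal 11 with crfmond_fill_disk_info_fields as the interrupted frame.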

 

Cause
