
Core dump with segmentation fault while starting a system with AFD in use (Doc ID 2549749.1)

Last updated on APRIL 17, 2023

Applies to:

Oracle Database - Enterprise Edition - Version 12.1.0.2 to 12.2.0.1 [Release 12.1 to 12.2]
Information in this document applies to any platform.

Symptoms

+ AFD is enabled on a Linux system, either clustered or standalone.
+ The Grid home is at the October 2018 PSU or the January 2019 PSU.

---------------
Case 1:
---------------

crsctl commands fail with a fatal signal, and crsd goes into an INTERMEDIATE state.

# crsctl stat res -t -init
Oracle Clusterware infrastructure error in CRSCTL (OS PID 4248): Fatal signal
11 has occurred in program crsctl thread 3005124224; nested signal count is 1
CRS-8503: Oracle Clusterware CRSCTL process with operating system process ID
4248 experienced fatal signal or exception code 11
Oracle Clusterware infrastructure fatal error in CRSCTL (OS PID 4248):
Clusterware dumping stack for fatal signal 11...
Signal details: [si_signo=11] [si_errno=0] [si_code=1] [si_int=1868767346]
[si_ptr=0x202c65646f632072] [si_addr=(nil)]

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
clsbStackDump()+351  call     kgdsdst()            7FFD16C96C58 000000003
clsb0.c:3523                                       7FFD16C776C0 ? 7FFD16C777D8 ?
                                                   7FFD16C954B8 ? 000000083 ?
clsbFatalAction()+9  call     clsbStackDump()      003878D60 7FFD16C980B0
73 clsb0.c:1541                                    7FCFB2BE7844 000002137

$ crsctl stat res -t -init
Oracle Clusterware infrastructure error in CRSCTL (OS PID 37503): Fatal
signal 11 has occurred in program crsctl thread 2916195968; nested signal
count is 1
CRS-8503: Oracle Clusterware CRSCTL process with operating system process ID
37503 experienced fatal signal or exception code 11
Oracle Clusterware infrastructure fatal error in CRSCTL (OS PID 37503):
Clusterware dumping stack for fatal signal 11...
Signal details: [si_signo=11] [si_errno=0] [si_code=1] [si_code=1]
[si_int=1371721216] [si_ptr=0x7ffc51c2ce00] [si_addr=(nil)]
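
The "Signal details" line in the crsctl output above can be decoded mechanically. The following is a minimal sketch, not an Oracle tool: the function names `parse_signal_details` and `describe` are hypothetical, and the `si_code` table covers only the Linux values for SIGSEGV. On Linux, signal 11 with `si_code=1` corresponds to SEGV_MAPERR (address not mapped), which together with a nil `si_addr` is consistent with an access through a null pointer.

```python
import re
import signal

# Linux si_code values for SIGSEGV only; other signals use different tables.
SEGV_CODES = {1: "SEGV_MAPERR", 2: "SEGV_ACCERR"}


def parse_signal_details(text):
    """Collect the [key=value] fields of a 'Signal details:' message into a dict."""
    return dict(re.findall(r"\[(\w+)=([^\]]*)\]", text))


def describe(fields):
    """Return a short description: signal name, si_code name, fault address."""
    signo = int(fields["si_signo"])
    code = int(fields["si_code"])
    name = signal.Signals(signo).name
    if name == "SIGSEGV":
        code_name = SEGV_CODES.get(code, str(code))
    else:
        code_name = str(code)
    return f"{name} / {code_name} / si_addr={fields.get('si_addr')}"


if __name__ == "__main__":
    sample = (
        "Signal details: [si_signo=11] [si_errno=0] [si_code=1] [si_int=1868767346]\n"
        "[si_ptr=0x202c65646f632072] [si_addr=(nil)]"
    )
    print(describe(parse_signal_details(sample)))
```

The regular expression tolerates the line wrap in the middle of the message, so the text can be pasted as it appears in the trace.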

gdb $GI_HOME/bin/crsctl.bin core.8724
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7

Reading symbols from $GI_HOME/bin/crsctl.bin...done.
[New LWP 8724]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by '$GI_HOME/bin/crsctl.bin status resource diagsnap'.
Program terminated with signal 6, Aborted.
#0 0x00007f6e5c569207 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
compat-libcap1-1.10-7.el7.x86_64 glibc-2.17-222.el7.x86_64
libaio-0.3.109-13.el7.x86_64 libgcc-4.8.5-28.el7.x86_64
libstdc++-4.8.5-28.el7.x86_64
(gdb) bt
#0 0x00007f6e5c569207 in raise () from /lib64/libc.so.6
#1 0x00007f6e5c56a8f8 in abort () from /lib64/libc.so.6
#2 0x00007f6e621f0d7a in skgdbgcra () from
$GI_HOME/lib/libclntsh.so.12.1
#3 0x00007f6e63e72b5a in clsbSigErrCB (sig=0x7ffe5a59db70, ctx=0x2fc5d60) at
clsb0.c:2332
#4 0x00007f6e621e91d5 in skgesig_sigactionHandler () from
$GI_HOME/lib/libclntsh.so.12.1
#5 <signal handler called>
#6 0x00007f6e624253ca in dbgc_cra () from
$GI_HOME/lib/libclntsh.so.12.1
#7 0x00007f6e60ba4c61 in kgepop () from
$GI_HOME/lib/libclntsh.so.12.1
#8 0x00007f6e61f409ea in kgeasnmierr () from
$GI_HOME/lib/libclntsh.so.12.1
#9 0x00007f6e62850a35 in dbgfcsIlcsRegister () from
$GI_HOME/lib/libclntsh.so.12.1
#10 0x00007f6e60ea1476 in dbgfilcsRegister () from
$GI_HOME/lib/libclntsh.so.12.1
#11 0x00007f6e60ea1276 in dbgfcsInitDiagCtx () from
$GI_HOME/lib/libclntsh.so.12.1
#12 0x00007f6e60e9eeb1 in dbgc_init_all () from
$GI_HOME/lib/libclntsh.so.12.1
#13 0x00007f6e63eda74f in clsdAdrInit (oracle_base=0x2c26450 "$ORACLE_BASE",
prod_type=CRS_dbgfcsAdrProdId, host_name=0x2c35de0 "<RACNODE1>",
instance_id=0x7ffe5a5a3470 "crs",
initfile=0x7ffe5a5a1470
"$ORACLE_BASE/crsdata/<RACNODE1>/crsdiag/crsctl.ini", prefix=0x2d9e6c0
"crsctl", pidstr=0x2c4d250 "8724", suffix=0x0, adr_parms=0x7ffe5a5a44c0,
adr_cbs=0x7ffe5a5a4490, flags=1,
adr_initialized=0x7ffe5a5a458c, adr_flags=0) at clsdadr.c:1235
#14 0x00007f6e63e73570 in clsbInitDiag (ps=0x2fc5d60, iopts=0x7ffe5a5a4740)
at clsb0.c:2662
#15 0x00007f6e63e6f4dc in clsbProcessInit (ps=0x2fc5d60,
iopts=0x7ffe5a5a4740) at clsb0.c:775
#16 0x00007f6e63e6cdab in clsbCInit (iopts=0x7ffe5a5a4740,
cbtok=0x7ffe5a5a4820, trcb=0x0, trctx=0x0, alcb=0x0, alctx=0x0) at clsb.c:546
#17 0x00007f6e63e6c8ca in clsbCMain (iopts=0x7ffe5a5a4880, pmain=0x48df24
<crsctl_main(int, unsigned char**)>, argc=4, argv=0x7ffe5a5a4a28,
pmainret=0x7ffe5a5a4918) at clsb.c:409
#18 0x000000000044c087 in main (argc=4, argv=0x7ffe5a5a4a28) at
s0crsctl.cpp:95
(gdb) q
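
In a backtrace like the one above, the frame of interest is usually the one directly below `<signal handler called>`, because the frames above it belong to the error handler itself. The following sketch, with hypothetical helper names `frames` and `faulting_frame`, pulls that frame out of pasted `bt` output. It assumes the frame format shown in this document and is not a general gdb parser.

```python
import re

# A frame starts with "#N", optionally followed by "0x... in ", then either a
# function name or the literal "<signal handler called>" marker.
FRAME = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\w+|<signal handler called>)"
)


def frames(bt_text):
    """Return (frame_number, name) pairs in the order gdb printed them."""
    return [(int(num), name) for num, name in FRAME.findall(bt_text)]


def faulting_frame(bt_text):
    """Return the frame printed right after '<signal handler called>', or None."""
    parsed = frames(bt_text)
    for index, (_, name) in enumerate(parsed):
        if name == "<signal handler called>" and index + 1 < len(parsed):
            return parsed[index + 1]
    return None


if __name__ == "__main__":
    excerpt = """
#4 0x00007f6e621e91d5 in skgesig_sigactionHandler () from
$GI_HOME/lib/libclntsh.so.12.1
#5 <signal handler called>
#6 0x00007f6e624253ca in dbgc_cra () from
$GI_HOME/lib/libclntsh.so.12.1
#7 0x00007f6e60ba4c61 in kgepop () from
$GI_HOME/lib/libclntsh.so.12.1
"""
    print(faulting_frame(excerpt))
```

Applied to the excerpt, the helper points at frame #6 (`dbgc_cra`), which is reached from `clsdAdrInit` during diagnostic initialization in the full backtrace.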

Oct 18 14:07:05 <RACNODE1> kernel: oracle[9156] general protection ip:bce6ebf
sp:7ffde7f53900 error:0 in oracle[400000+efdb000]
Oct 18 14:07:05 <RACNODE1> kernel: oracle[9166] general protection ip:bce6ebf
sp:7ffc54a5c200 error:0 in oracle[400000+efdb000]
Oct 18 14:07:05 <RACNODE1> kernel: oracle[9168] general protection ip:bce6ebf
sp:7ffd5bb53c00 error:0 in oracle[400000+efdb000]
Oct 18 14:07:05 <RACNODE1> kernel: oracle[9170] general protection ip:bce6ebf
sp:7fff9109cb80 error:0 in oracle[400000+efdb000]
Oct 18 14:07:05 <RACNODE1> kernel: oracle[9175] general protection ip:bce6ebf
sp:7ffd6bb67480 error:0 in oracle[400000+efdb000]
Oct 18 14:07:05 <RACNODE1> kernel: oracle[9180] general protection ip:bce6ebf
sp:7ffedffb4300 error:0 in oracle[400000+efdb000]
Oct 18 14:07:05 <RACNODE1> kernel: oracle[9187] general protection ip:bce6ebf
sp:7ffffe371600 error:0 in oracle[400000+efdb000]
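
The kernel messages above all report the same instruction pointer for different oracle processes. A short script can confirm that pattern across a larger messages file. This is a sketch under assumptions: the regular expression is fitted only to the two message shapes quoted in this document ("general protection" and "segfault at"), and the name `summarize_faults` is hypothetical.

```python
import re
from collections import Counter

# Fitted to the two kernel message shapes quoted in this document:
#   oracle[9156] general protection ip:bce6ebf sp:... error:0 in oracle[...]
#   evmlogger.bin[18386]: segfault at 10 ip 00007f... sp ... error 4 in lib...[...]
FAULT = re.compile(
    r"kernel:\s+(?P<proc>[\w.\-]+)\[(?P<pid>\d+)\]:?\s+"
    r"(?P<kind>general protection|segfault at [0-9a-f]+)\s+"
    r"ip[: ](?P<ip>[0-9a-f]+)\s+sp[: ][0-9a-f]+\s+error[: ]\d+"
    r"(?:\s+in\s+(?P<module>[^\s\[]+)\[)?"
)


def summarize_faults(log_text):
    """Count fault messages per (process name, instruction pointer)."""
    counts = Counter()
    for match in FAULT.finditer(log_text):
        counts[(match["proc"], match["ip"])] += 1
    return counts


if __name__ == "__main__":
    excerpt = (
        "Oct 18 14:07:05 <RACNODE1> kernel: oracle[9156] general protection ip:bce6ebf\n"
        "sp:7ffde7f53900 error:0 in oracle[400000+efdb000]\n"
        "Oct 18 14:07:05 <RACNODE1> kernel: oracle[9166] general protection ip:bce6ebf\n"
        "sp:7ffc54a5c200 error:0 in oracle[400000+efdb000]\n"
    )
    for (proc, ip), count in summarize_faults(excerpt).items():
        print(f"{proc} ip={ip} count={count}")
```

Because the pattern uses `\s+` between fields, it also matches messages that were wrapped across two lines when pasted.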

---------------
Case 2:
---------------

+ Standalone system without ACFS; only AFD is in use.

+ The database is in the MOUNT stage.

Reread of blocknum=255, file=+DG/db_sid/DATAFILE/db_sid.278.988472743. found valid data
Hex dump of (file 14, block 255) in trace file $ORACLE_BASE/diag/rdbms/db_sid/db_sid/trace/db_sid_ora_22540.trc

Corrupt block relative dba: 0x038000ff (file 14, block 255)
Bad header found during validation
Data in bad block:
type: 0 format: 1 rdba: 0x3a006500
last change scn: 0x7400.70003800 seq: 0x0 flg: 0x22
spare1: 0x0 spare2: 0x7a spare3: 0x3e00
consistency value in tail: 0xc64e2802
check value in block header: 0x2f00
block checksum disabled

Reread of blocknum=255, file=+DG/db_sid/DATAFILE/db_sid.279.988472769. found valid data
Sun Jun 02 12:40:03 2019
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10EB3060] [PC:0x3ACFA52, {empty}] [flags: 0x8, count: 2]
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 3]
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10EB3060] [PC:0x3ACE1C6, {empty}] [flags: 0xA, count: 4]
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 4]
Sun Jun 02 12:40:04 2019
Instance Critical Process (pid: 25, ospid: 22581, ASMB) died unexpectedly
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10EB3060] [PC:0x3ACFA52, {empty}] [flags: 0x8, count: 2]
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 3]
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10EB3060] [PC:0x3ACE1C6, {empty}] [flags: 0xA, count: 4]
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 4]
Sun Jun 02 12:40:15 2019
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10EB3060] [PC:0x3ACFA52, {empty}] [flags: 0x8, count: 2]
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 3]
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x10EB3060] [PC:0x3ACE1C6, {empty}] [flags: 0xA, count: 4]
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 4]
Sun Jun 02 12:40:15 2019
System state dump requested by (instance=1, osid=21334 (PSP0)), summary=[abnormal instance termination].
System State dumped to trace file $ORACLE_BASE/diag/rdbms/db_sid/db_sid/trace/db_sid_diag_21384_20190602124015.trc
Sun Jun 02 12:40:16 2019
USER (ospid: 22597): terminating the instance due to error 1092
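
The alert log excerpt above repeats the same few SIGSEGV exceptions at a small set of program counters. The sketch below tallies them so the repetition is easy to see in a longer log. The name `tally_exceptions` is hypothetical, and the pattern assumes the `Exception [type: ...] [ADDR:...] [PC:...]` layout shown here.

```python
import re
from collections import Counter

# Matches alert log lines of the form:
#   Exception [type: SIGSEGV, <detail>] [ADDR:0x...] [PC:0x..., {empty}] ...
EXCEPTION = re.compile(
    r"Exception \[type: (?P<type>\w+), (?P<detail>[^\]]+)\] "
    r"\[ADDR:(?P<addr>0x[0-9A-Fa-f]+)\] "
    r"\[PC:(?P<pc>0x[0-9A-Fa-f]+)"
)


def tally_exceptions(alert_text):
    """Count exceptions per (signal type, detail, program counter)."""
    counts = Counter()
    for match in EXCEPTION.finditer(alert_text):
        counts[(match["type"], match["detail"], match["pc"])] += 1
    return counts


if __name__ == "__main__":
    excerpt = (
        "Exception [type: SIGSEGV, Address not mapped to object] "
        "[ADDR:0x10EB3060] [PC:0x3ACFA52, {empty}] [flags: 0x8, count: 2]\n"
        "Exception [type: SIGSEGV, SI_KERNEL(general_protection)] "
        "[ADDR:0x0] [PC:0xCEDB000, {empty}] [flags: 0xA, count: 3]\n"
    )
    for key, count in tally_exceptions(excerpt).most_common():
        print(count, key)
```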

The OS log file shows:

2019-06-02T05:54:02.507544-04:00 <RACNODE1> kernel: auditd[18384] general protection ip:7f12a27289f6 sp:7ffc52ee8f48 error:0 in libc-2.12.so[7f12a26ad000+18b000]
Jun 2 05:54:02 <RACNODE1> kernel: auditd[18384] general protection ip:7f12a27289f6 sp:7ffc52ee8f48 error:0 in libc-2.12.so[7f12a26ad000+18b000]
2019-06-02T05:54:13.515579-04:00 <RACNODE1> kernel: evmlogger.bin[18386]: segfault at 10 ip 00007fa8f6406470 sp 00007ffdd4160e38 error 4 in libpthread-2.12.so[7fa8f63fd000+17000]
Jun 2 05:54:13 <RACNODE1> kernel: evmlogger.bin[18386]: segfault at 10 ip 00007fa8f6406470 sp 00007ffdd4160e38 error 4 in libpthread-2.12.so[7fa8f63fd000+17000]
2019-06-02T05:55:13.556564-04:00 <RACNODE1> kernel: evmlogger.bin[18575]: segfault at 10 ip 00007f30e21a9470 sp 00007fffbffda5a8 error 4 in libpthread-2.12.so[7f30e21a0000+17000]
Jun 2 05:55:13 <RACNODE1> kernel: evmlogger.bin[18575]: segfault at 10 ip 00007f30e21a9470 sp 00007fffbffda5a8 error 4 in libpthread-2.12.so[7f30e21a0000+17000]

File name: messages
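
Each kernel message above ends with `module[base+size]`, so the faulting offset inside the module is the instruction pointer minus the base. The following worked sketch does that subtraction; the name `fault_offset` is hypothetical and the pattern is fitted to the message shapes quoted in this document. For the auditd line the offset in libc-2.12.so is 0x7b9f6, and both evmlogger.bin lines resolve to the same offset in libpthread-2.12.so, 0x9470, even though the processes were loaded at different base addresses.

```python
import re

# Captures the instruction pointer and the trailing "module[base+size]" part.
MODULE_FAULT = re.compile(
    r"ip[: ](?P<ip>[0-9a-f]+)\s.*?\sin\s"
    r"(?P<module>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (module, offset_in_module) for one kernel fault line, or None."""
    match = MODULE_FAULT.search(line)
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        # The ip lies outside the reported mapping; do not guess an offset.
        return None
    return match["module"], offset


if __name__ == "__main__":
    line = (
        "Jun 2 05:54:02 <RACNODE1> kernel: auditd[18384] general protection "
        "ip:7f12a27289f6 sp:7ffc52ee8f48 error:0 in libc-2.12.so[7f12a26ad000+18b000]"
    )
    module, offset = fault_offset(line)
    print(module, hex(offset))
```

An offset obtained this way can be given to a symbol lookup tool together with the matching library build, provided debug information for that library is available.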

==============================================================================

Cause
