Oracle Linux: Panic in qla2xxx.ko, or Fail to Change Active Path When Path Down (Doc ID 2644579.1)

Last updated on MARCH 03, 2020

Applies to:

Linux OS - Version Oracle Linux 7.5 with Unbreakable Enterprise Kernel [4.1.12] to Oracle Linux 7.5 with Unbreakable Enterprise Kernel [4.1.12] [Release OL7U5]
Oracle VM - Version 3.4.5 to 3.4.5 [Release OVM34]
x86_64

Symptoms

An OS panic occurs, or switching the active path fails, when the current physical active path goes down on Oracle Linux 7.5 or Oracle VM Server (Dom0) 3.4.5.

Panic message example:

PID: 0      TASK: ffff880954182a00  CPU: 1   COMMAND: "swapper/1"
 #0 [ffff88097fc83a80] machine_kexec at ffffffff8106126b
 #1 [ffff88097fc83af0] crash_kexec at ffffffff81119fa2
 #2 [ffff88097fc83bc0] oops_end at ffffffff8101b958
 #3 [ffff88097fc83bf0] no_context at ffffffff81741324
 #4 [ffff88097fc83c50] __bad_area_nosemaphore at ffffffff8174140c
 #5 [ffff88097fc83ca0] bad_area_nosemaphore at ffffffff81741578
 #6 [ffff88097fc83cb0] __do_page_fault at ffffffff810700a6
 #7 [ffff88097fc83d20] do_page_fault at ffffffff810704b0
 #8 [ffff88097fc83d60] page_fault at ffffffff8175571f
    [exception RIP: unknown or invalid address]
    RIP: 0000000000000000  RSP: ffff88097fc83e10  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: ffff88074335a180  RCX: 0000000000001fdc
    RDX: 0000000000000000  RSI: ffffffffa006dc00  RDI: ffff88074335a180
    RBP: ffff88097fc83e38   R8: ffff88097fc91828   R9: ffff88097fc91700
    R10: 0000000000000002  R11: 0000000000000005  R12: ffff88094debb7f8
    R13: 0000000000000296  R14: ffffffffa006dc00  R15: ffff88074335a180
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff88097fc83e10] qla2x00_sp_timeout at ffffffffa006dc5d [qla2xxx]
#10 [ffff88097fc83e40] call_timer_fn at ffffffff810f4db0
#11 [ffff88097fc83e80] run_timer_softirq at ffffffff810f6c38
#12 [ffff88097fc83f00] __do_softirq at ffffffff8108d6b0
#13 [ffff88097fc83f70] irq_exit at ffffffff8108dbd5
#14 [ffff88097fc83f90] smp_apic_timer_interrupt at ffffffff817570fc
#15 [ffff88097fc83fb0] apic_timer_interrupt at ffffffff817514e2
--- <IRQ stack> ---
#16 [ffff880954197da8] apic_timer_interrupt at ffffffff817514e2
    [exception RIP: cpuidle_enter_state+209]
    RIP: ffffffff815cf3a1  RSP: ffff880954197e58  RFLAGS: 00000246
    RAX: 000037aca7acbf78  RBX: ffff88097fc93bc0  RCX: 0000000000000018
    RDX: 0000000000666669  RSI: ffff880954198000  RDI: ffffffff81b55240
    RBP: ffff880954197e98   R8: 0000000000000018   R9: 0000000000001b88
    R10: 0000000000006f79  R11: 0000000000000000  R12: ffffffff81751393
    R13: ffffffff8175139a  R14: ffffffff817513a1  R15: ffffffff817513a8
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
#17 [ffff880954197ea0] cpuidle_enter at ffffffff815cf547
#18 [ffff880954197eb0] cpu_startup_entry at ffffffff810d0996
#19 [ffff880954197f20] start_secondary at ffffffff81054d38
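
When triaging a vmcore, it can help to check mechanically whether a backtrace resembles the example above. The following is a minimal Python sketch, not an Oracle-provided tool. The helper name matches_panic_signature is hypothetical, and the two criteria it tests (a qla2x00_sp_timeout frame in the qla2xxx module, and an exception RIP of zero) are assumptions taken only from this single example backtrace.

```python
import re

# Heuristics derived only from the example backtrace in this document;
# they are assumptions, not an official signature definition.
FRAME_RE = re.compile(r"qla2x00_sp_timeout at [0-9a-f]+ \[qla2xxx\]")
NULL_RIP_RE = re.compile(r"RIP:\s+0{16}\b")


def matches_panic_signature(backtrace: str) -> bool:
    """Return True if the text has both a qla2x00_sp_timeout frame
    from qla2xxx and an exception RIP of all zeros."""
    has_frame = FRAME_RE.search(backtrace) is not None
    has_null_rip = NULL_RIP_RE.search(backtrace) is not None
    return has_frame and has_null_rip
```

The function takes the text of a backtrace (for example, the saved output of the crash utility's bt command) and reports only whether both markers are present. A match suggests, but does not prove, that the panic is the one described here.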

Example /var/log/messages output when switching the active path fails; it typically includes "Async done-gpnid fail":

Nov 26 05:48:12 <hostname> kernel: [10853315.078175] qla2xxx[0000:af:00.0]-####:15: DPC handler sleeping.
Nov 26 05:49:31 <hostname> kernel: [10853394.022306] qla2xxx[0000:af:00.0]-####:15: RSCN database changed -- #### #### ####.
Nov 26 05:49:31 <hostname> kernel: [10853394.022317] qla2xxx[0000:af:00.0]-####:15: RSCN: Domain 0x###### was affected
Nov 26 05:49:31 <hostname> kernel: [10853394.022323] qla2xxx[0000:af:00.0]-####:15: qla24xx_handle_rscn_event ##:##:##:##:##:##:##:## DS0 LS 7
Nov 26 05:49:31 <hostname> kernel: [10853394.022333] qla2xxx[0000:af:00.0]-####:15: qla24xx_handle_rscn_event ##:##:##:##:##:##:##:## DS0 LS 7
Nov 26 05:49:31 <hostname> kernel: [10853394.022376] qla2xxx[0000:af:00.0]-####:15: Async-gpnid hdl=### ID ##:##:##.
Nov 26 05:49:31 <hostname> kernel: [10853394.022396] qla2xxx[0000:af:00.0]-####:15: DPC handler sleeping.
Nov 26 05:49:31 <hostname> fcoemon: FC_HOST_EVENT 6 at 1574714971 secs on host15:code 5=rscn datalen 4 data=48693248
Nov 26 05:49:31 <hostname> kernel: [10853394.048632] qla2xxx[0000:af:00.0]-####:15: Async done-gpnid fail res 1 rscn gen 1 ID ##:##:##:##:##:##:##:##:##:##
Nov 26 05:49:31 <hostname> kernel: [10853394.048643] qla2xxx[0000:af:00.0]-####:15: qla24xx_handle_gpnid_event 3192 port_id: ######
Nov 26 05:49:31 <hostname> kernel: [10853394.048735] qla2xxx[0000:af:00.0]-####:15: DPC handler sleeping.
Nov 26 05:49:31 <hostname> kernel: [10853394.050047] qla2xxx[0000:af:00.0]-####:15: Port logout #### #### ####.
Nov 26 05:49:31 <hostname> kernel: [10853394.051157] qla2xxx[0000:af:00.0]-####:15: Port logout #### #### ####.
Nov 26 05:53:09 <hostname> kernel: [10853611.835097] qla2xxx[0000:d8:00.0]-####:17: RSCN database changed -- #### #### ####.
Nov 26 05:53:09 <hostname> kernel: [10853611.835124] qla2xxx[0000:d8:00.0]-####:17: RSCN: Domain 0x###### was affected
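
To find out which adapters logged the "Async done-gpnid fail" message, the log can be scanned with a short script. The following is a minimal Python sketch. The function name find_gpnid_failures is hypothetical, and the regular expression assumes the line layout shown in the example above (a qla2xxx[<PCI address>] prefix followed later on the same line by the failure text).

```python
import re
from typing import Iterable, List

# Assumes the log line layout shown in the example above:
# "qla2xxx[<PCI address>]-...: Async done-gpnid fail ..."
GPNID_FAIL_RE = re.compile(
    r"qla2xxx\[(?P<pci>[0-9a-f:.]+)\].*Async done-gpnid fail"
)


def find_gpnid_failures(lines: Iterable[str]) -> List[str]:
    """Return the PCI address from each line that reports the failure."""
    hits = []
    for line in lines:
        match = GPNID_FAIL_RE.search(line)
        if match:
            hits.append(match.group("pci"))
    return hits
```

A caller could pass an open file object, for example open("/var/log/messages", errors="replace"), since iterating over a file yields its lines. Reading that file usually requires root privileges.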

 

Changes

The current active path goes down (by pulling out a cable, or because an FC switch is down).

Cause

To view full details, sign in with your My Oracle Support account.
