Solaris 10 May Panic With a Page Fault in Module mpt_sas Due to a Null Pointer Dereference (Doc ID 1359856.1)

Last updated on JULY 29, 2016

Applies to:

Solaris Operating System - Version 10 10/08 U6 to 10 9/10 U9 [Release 10.0]
Information in this document applies to any platform.

Symptoms

SAS cards with firmware v5.250.09.00 or older can send the TASK QUEUE FULL event. A diag reset event can cause the system to panic with messages like the following:

Aug 30 00:28:34 dibbler unix: [ID 836849 kern.notice]
Aug 30 00:28:34 dibbler ^Mpanic[cpu4]/thread=fffffe8000a69c60:
Aug 30 00:28:34 dibbler genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=fffffe8000a699e0 addr=20 occurred in module "unix" due to a NULL pointer dereference
Aug 30 00:28:34 dibbler unix: [ID 100000 kern.notice]
Aug 30 00:28:34 dibbler unix: [ID 839527 kern.notice] sched:
Aug 30 00:28:34 dibbler unix: [ID 753105 kern.notice] #pf Page fault
Aug 30 00:28:34 dibbler unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x20
Aug 30 00:28:34 dibbler unix: [ID 243837 kern.notice] pid=0, pc=0xfffffffffb83d58b, sp=0xfffffe8000a69ad8, eflags=0x10246
Aug 30 00:28:34 dibbler unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
Aug 30 00:28:34 dibbler unix: [ID 354241 kern.notice] cr2: 20 cr3: 11051000 cr8: c
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   rdi:               20 rsi: fffffe87007cd778 rdx: fffffe8000a69c60
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   rcx:         4c87f607  r8:         4c000000  r9:  0 
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   rax:                0 rbx:       1d3050ec6d rbp: fffffe8000a69bc0
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   r10:        a30313630 r11: fffffffffbcf3420 r12:  0  
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   r13:                0 r14: ffffffff91997b28 r15:   0       
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   fsb: fffffd7ffe6f0200 gsb: ffffffff8cbdc800  ds: 0    
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]    es:                0  fs:              1bb  gs:             0  
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]   trp:                e err:                2 rip:fffffffffb83d58b
Aug 30 00:28:34 dibbler unix: [ID 592667 kern.notice]    cs:               28 rfl:            10246 rsp:  fffffe8000a69ad8
Aug 30 00:28:34 dibbler unix: [ID 266532 kern.notice]    ss:               30
Aug 30 00:28:34 dibbler unix: [ID 100000 kern.notice]
Aug 30 00:28:34 dibbler genunix: [ID 655072 kern.notice] fffffe8000a698f0 unix:die+da ()
Aug 30 00:28:34 dibbler genunix: [ID 655072 kern.notice] fffffe8000a699d0 unix:trap+5e6 ()
Aug 30 00:28:34 dibbler genunix: [ID 655072 kern.notice] fffffe8000a699e0 unix:_cmntrap+140 ()
Aug 30 00:28:34 dibbler genunix: [ID 655072 kern.notice] fffffe8000a69bc0 unix:mutex_enter+b ()
Aug 30 00:28:34 dibbler genunix: [ID 655072 kern.notice] fffffe8000a69c40 genunix:taskq_thread+1d5 ()
Aug 30 00:28:34 dibbler genunix: [ID 655072 kern.notice] fffffe8000a69c50 unix:thread_start+8 ()
Aug 30 00:28:34 dibbler unix: [ID 100000 kern.notice]
Aug 30 00:28:34 dibbler genunix: [ID 672855 kern.notice] syncing file systems...
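
To check a saved messages file for the same signature, the panic output above can be matched mechanically. The following is a minimal sketch, not an Oracle-supplied tool: the function name and regular expressions are illustrative assumptions derived only from the log lines quoted above. A match means only that the trap and the mutex_enter/taskq_thread frames look alike; it does not by itself confirm that mpt_sas is involved.

```python
import re

# Patterns derived from the panic messages quoted above (illustrative only).
TRAP_RE = re.compile(r"BAD TRAP: type=e \(#pf Page fault\).*NULL pointer dereference")
# Matches stack frames such as "unix:mutex_enter+b ()".
FRAME_RE = re.compile(r"\b(\w+):(\w+)\+[0-9a-f]+ \(\)")


def matches_panic_signature(lines):
    """Return True if a page-fault trap is followed by a stack in which
    unix:mutex_enter is called directly from genunix:taskq_thread."""
    saw_trap = False
    frames = []
    for line in lines:
        if TRAP_RE.search(line):
            # Start collecting frames after the most recent trap line.
            saw_trap = True
            frames = []
        elif saw_trap:
            m = FRAME_RE.search(line)
            if m:
                frames.append(f"{m.group(1)}:{m.group(2)}")
    # Frames are printed innermost first, so the caller follows the callee.
    for callee, caller in zip(frames, frames[1:]):
        if callee == "unix:mutex_enter" and caller == "genunix:taskq_thread":
            return True
    return False
```

The function takes any iterable of log lines, for example the lines of an open messages file.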

Analyzing the crash dump with SCAT shows mutex_enter and mptsas_handle_event on the stack. The command:

CAT(vmcore.5/10X)> stack -l

produces output that includes a sequence like this:

  0xfffffe8000ae2aa8 -0x118(%rbp) = 0xfffffffffb83d58b (unix:mutex_enter+0xb)
  0xfffffe8000ae2ab0 -0x110(%rbp) = 0x28
  0xfffffe8000ae2ab8 -0x108(%rbp) = 0x10246
  0xfffffe8000ae2ac0 -0x100(%rbp) = 0xfffffe8000ae2ad8
  0xfffffe8000ae2ac8 -0xf8(%rbp) = 0x0
  0xfffffe8000ae2ad0 -0xf0(%rbp) = 0xfffffe8000ae2bc0
  0xfffffe8000ae2ad8 -0xe8(%rbp) = 0xffffffffef5e5481 (mpt_sas:mptsas_handle_event+0x31)
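
If the `stack -l` output has been saved to a text file, the check for these two routines can be scripted. This is a rough sketch under stated assumptions: the helper names are hypothetical, and the pattern assumes symbols appear in the `(module:function+0xoffset)` form shown above.

```python
import re

# Matches annotations such as "(unix:mutex_enter+0xb)" in stack -l output.
SYMBOL_RE = re.compile(r"\((\w+):(\w+)\+0x[0-9a-f]+\)")


def stack_symbols(text):
    """Return the module:function symbols in the order they appear."""
    return [f"{mod}:{fn}" for mod, fn in SYMBOL_RE.findall(text)]


def shows_mptsas_fault(text):
    """Return True if unix:mutex_enter appears and is followed later in the
    listing by mpt_sas:mptsas_handle_event."""
    syms = stack_symbols(text)
    try:
        first = syms.index("unix:mutex_enter")
    except ValueError:
        return False
    return "mpt_sas:mptsas_handle_event" in syms[first + 1:]
```

Entries without a symbol annotation, such as saved register values, are ignored, so only the ordering of the resolved routines is tested.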

Cause
