System Panic Occurs When Closing SDP Socket After IPMP Failback - Solaris 10 (Doc ID 2011156.1)

Last updated on MAY 24, 2015

Applies to:

Solaris SPARC Operating System - Version 10 11/06 U3 to 10 1/13 U11 [Release 10.0]
Information in this document applies to any platform.

Symptoms

A host running Solaris 10 panics when closing a Sockets Direct Protocol (SDP) socket after IPMP failback of an InfiniBand (IB) link. The panic has the following stack:

 

CAT(vmcore.14/10V)> panic
panic on CPU 20
panic string:   BAD TRAP: type=31 rp=2a102a61570 addr=38 mmu_fsr=0 occurred
in module "sdpib" due to a NULL pointer dereference
==== panic user (LWP_SYS) thread: 0x30052062120  PID: 1488  on CPU: 20 ====
cmd: ./srv 3333
t_procp: 0x30047d23398
   p_as: 0x6003603a6a8  size: 2441216  RSS: 1679360
      a_hat: 0x3000470bac0
      cnum: CPU20:1/181
      cpusran: 20
   p_zone: 0x1a1e138 (global)
t_stk: 0x2a102a61ae0  sp: 0x198aab1  t_stkbase: 0x2a102a5c000
t_pri: 59 (TS)  pctcpu: 0.001756
t_transience: 10 (TRANSIENT)  t_wkld_flags: 0
t_lwp: 0x3004df31390  t_tid: 1
   machpcb: 0x2a102a61ae0
   lwp_ap:   0x2a102a61bd0
   t_mstate: LMS_SYSTEM  ms_prev: LMS_KFAULT
   ms_state_start: 3 days 15 hours 2 minutes 56.673202039 seconds earlier
   ms_start: 3 days 15 hours 3 minutes 10.252330487 seconds earlier
t_cpupart: 0x198b9c0(0)  last CPU: 20
idle: 13318 ticks (2m13.18s)
start: Tue Sep 16 09:02:03 2014
age: 135 seconds (2 minutes 15 seconds)
t_state:     TS_ONPROC
t_flag:      0x1800 (T_PANIC|T_LWPREUSE)
t_proc_flag: 0x104 (TP_TWAIT|TP_MSACCT)
t_schedflag: 3 (TS_LOAD|TS_DONT_SWAP)
p_flag:      0x4a004000 (SEXECED|SMSACCT|SAUTOLPG|SMSFORK)

pc:      unix:panicsys+0x48:   call     unix:setjmp

void unix:panicsys+0x48((const char *)0x10c5f28, (va_list)0x2a102a61318,
(struct regs *)0x198b480, (int)1, 0x9900001604, , , , , , , , 0x10c5f28,
0x2a102a61318)
unix:vpanic_common+0x78(0x10c5f28, 0x2a102a61318, 8, 0x80000000, 0x7fff6d2b,
0x80000000)
void unix:panic+0x1c((const char *)0x10c5f28, (void *)0x31, 0x2a102a61570,
0x38, 0, 0x60035be29f8, 0x186f098, ...)
int unix:die+0x78((unsigned)0x31, (struct regs *)0x2a102a61570,
(caddr_t)0x38, (uint_t)0)
void unix:trap+0xa20((struct regs *)0x2a102a61570, (caddr_t)0x38, (uint32_t),
(uint32_t))
unix:ktl0+0x64()
-- trap data  type: 0x31 (data access MMU miss)  rp: 0x2a102a61570  --
  addr: 0x38
pc:  0x7b396930 sdpib:sdp_conn_deallocate_ib+0xa0:   lduw       [%o0 + 0x38],
%o4
npc: 0x7b396934 sdpib:sdp_conn_deallocate_ib+0xa4:   sub  %o4, 0x1, %o3
  global:                       %g1                  0
        %g2                  0  %g3          0x18726a0
        %g4          0x1088610  %g5               0x32
        %g6                  0  %g7      0x30052062120
  out:  %o0                  0  %o1                  0
        %o2            0x10d05  %o3         0x701ea400
        %o4                  0  %o5      0x60034918fd0
        %sp      0x2a102a60e11  %o7         0x7b396928
  loc:  %l0             0x12d0  %l1                  0
        %l2                  5  %l3                  0
        %l4      0x6003aab6000  %l5         0x701f3bf0
        %l6         0x701f3bb8  %l7      0x6004e7ad858
  in:   %i0                  2  %i1      0x6004e7ad5f8
        %i2      0x6004e7ad858  %i3                  0
        %i4                  0  %i5 0xffffffffffffffff
        %fp      0x2a102a60ec1  %i7         0x7b396a94
<trap>void sdpib:sdp_conn_deallocate_ib+0xa0((sdp_conn_t *)0x6004e7ad5f8)
int32_t sdpib:sdp_conn_destruct+0x28((sdp_conn_t *)0x6004e7ad5f8)
unix:stubs_common_code+0x70()
unix:sdp_close(0) - frame recycled
void sockfs:socksdpv_inactive+0x54((struct vnode *)0x6003e662f80?, (struct
cred *)0x600308822a8)
genunix:vn_rele((vnode_t *)0x6003e662f80) - frame recycled
int genunix:closef+0x7c((file_t *)0x300544fce38)
int genunix:closeandsetf+0x4b0((int)3, (file_t *)0)
int genunix:close+8((int))
unix:syscall_trap32+0xcc()
-- switch to user thread's user stack --
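
The trap data above agrees with the "NULL pointer dereference" in the panic string: the faulting instruction is `lduw [%o0 + 0x38], %o4`, register `%o0` holds 0, and the reported fault address is 0x38. The following is a minimal sketch of that arithmetic, using only values copied from the dump. The helper name `effective_address` is illustrative and is not part of CAT or any Solaris tool.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF  # SPARC V9 addresses are 64 bits wide


def effective_address(base: int, offset: int) -> int:
    """Address accessed by a [reg + imm] load, wrapped to 64 bits."""
    return (base + offset) & MASK_64


# Values copied from the trap data in the dump above.
o0 = 0x0           # %o0 at the time of the trap
offset = 0x38      # immediate in: lduw [%o0 + 0x38], %o4
fault_addr = 0x38  # "addr" reported with the trap

computed = effective_address(o0, offset)
print(hex(computed))           # 0x38
print(computed == fault_addr)  # True
```

A base register of 0 plus a small offset is the usual signature of reading a structure member through a NULL pointer. Which structure `%o0` was expected to point to inside `sdp_conn_deallocate_ib` cannot be determined from the dump excerpt alone.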

 

This issue is not seen in Solaris 11.

 

Changes

The Solaris 10 system uses Sockets Direct Protocol (SDP) over InfiniBand (IB) links with IPMP failover. The problem is seen when an open SDP socket is closed after IPMP fails back following restoration of an IB link.

 

Cause
