InfiniBand switch: "Local SM enabled and running, state DISCOVER" or possibly "BOOT IN PROGRESS"
(Doc ID 2819434.1)
Last updated on NOVEMBER 10, 2021
Applies to:
Zero Data Loss Recovery Appliance X7 Hardware - Version All Versions and later
Sun Network QDR InfiniBand Gateway Switch - Version All Versions and later
Sun Datacenter InfiniBand Switch 36 - Version All Versions and later
Information in this document applies to any platform.
Symptoms
[root@sw-ibX ~]# getmaster
Local SM enabled and running, state DISCOVER
Last change in Master SubnetManager status detected at: Thu Oct 14 10:06:14 EDT 2021
Master SubnetManager on sm lid 12 sm guid 0x10e0cdd128a0a0 : SUN DCS 36P QDR ib36p-bur09-b-sp 10.152.230.98
Master SubnetManager Activity Count: 1169011 Priority: 14
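The getmaster output above can be checked mechanically. The following is a minimal sketch, assuming the output keeps the exact wording shown in this document. The function names parse_getmaster and matches_symptom are hypothetical helpers for illustration and are not part of the switch firmware.

```python
import re


def parse_getmaster(output):
    """Extract the local SM state and master details from getmaster text.

    Assumes the line formats shown in this document; fields that are
    not found are left as None.
    """
    info = {
        "local_state": None,
        "boot_in_progress": False,
        "master_lid": None,
        "master_guid": None,
        "priority": None,
    }
    m = re.search(r"Local SM enabled and running, state (\w+)", output)
    if m:
        info["local_state"] = m.group(1)
    # A rebooted switch may print this extra line (see Symptoms).
    info["boot_in_progress"] = "BOOT IN PROGRESS" in output
    m = re.search(r"sm lid (\d+) sm guid (0x[0-9a-fA-F]+)", output)
    if m:
        info["master_lid"] = int(m.group(1))
        info["master_guid"] = m.group(2)
    m = re.search(r"Priority:\s*(\d+)", output)
    if m:
        info["priority"] = int(m.group(1))
    return info


def matches_symptom(info):
    """True if the parsed output shows the symptom described here."""
    return info["local_state"] == "DISCOVER" or info["boot_in_progress"]
```

The helpers only read text that has already been captured, for example by pasting the getmaster output into a string, so they can be run away from the switch.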
If the switch has been rebooted, you will see the following:
[root@sw-ibX ~]# getmaster
Local SM enabled and running, state DISCOVER
BOOT IN PROGRESS
[root@sw-ibX ~]# service ilom status
ILOM stack is partly started with 13 processes.
ILOM daemons that failed to start are : IPMIMain MsgHndlr PEF LAN
However, ILOM stack subsystem is locked..... OK!
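The failed ILOM daemons can be pulled out of the "service ilom status" text for logging or comparison between switches. This is a minimal sketch, assuming the status lines keep the wording shown above. The function name parse_ilom_status is a hypothetical helper, not an Oracle tool.

```python
import re


def parse_ilom_status(output):
    """Return (process_count, failed_daemons) from 'service ilom status' text.

    process_count is None when the 'partly started' line is absent;
    failed_daemons is an empty list when no failures are reported.
    """
    process_count = None
    failed = []
    m = re.search(r"partly started with (\d+) processes", output)
    if m:
        process_count = int(m.group(1))
    # The daemon names follow the colon on a single line,
    # separated by whitespace.
    m = re.search(r"ILOM daemons that failed to start are\s*:\s*(.*)", output)
    if m:
        failed = m.group(1).split()
    return process_count, failed
```

For the output shown in this document, the helper is expected to report the four daemons IPMIMain, MsgHndlr, PEF and LAN.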
When you run disablesm, you get the following:
[root@sw-ibX ~]# disablesm
Stopping partcfgchk [ OK ]
Stopping partitiond-daemon [ OK ]
Traceback (most recent call last):
  File "/usr/local/util/smcommand", line 259, in <module>
    retval=handover(log)
  File "/usr/local/util/smcommand", line 185, in handover
    master=connection(None, log, handlehandover)
  File "/usr/local/util/smcommand", line 112, in connection
    tn.read_until(smprompt, 1)
  File "/usr/lib/python2.6/telnetlib.py", line 319, in read_until
    return self.read_very_lazy()
  File "/usr/lib/python2.6/telnetlib.py", line 395, in read_very_lazy
    raise EOFError, 'telnet connection closed'
EOFError: telnet connection closed
Stopping IB Subnet Manager..-. [ OK ]
If the switch has not been rebooted since the unsupported configuration changes were made, you may see the following:
[root@sw-ibX ~]# enablesm
Starting IB Subnet Manager. [ OK ]
Starting partitiond-daemon [ OK ]
Starting partcfgchk [ OK ]
Running "rpcinfo -p localhost" produces the following:
[root@sw-ibX ~]# rpcinfo -p localhost
rpcinfo: can't contact portmapper: RPC: Authentication error; why = Client credential too weak
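The rpcinfo failure above carries two pieces of information: the RPC error class and the stated reason. The sketch below separates them, assuming the error line keeps the format shown in this document. The function name parse_rpcinfo_error is a hypothetical helper for illustration.

```python
import re


def parse_rpcinfo_error(output):
    """Return (error, reason) from an rpcinfo portmapper failure line.

    Returns None when the text does not contain a line of the form
    "rpcinfo: can't contact portmapper: RPC: <error>; why = <reason>".
    """
    m = re.search(
        r"rpcinfo: can't contact portmapper: RPC: (.+?); why = (.+)",
        output,
    )
    if not m:
        return None
    return m.group(1).strip(), m.group(2).strip()


def is_weak_credential_symptom(output):
    """True if the text shows the authentication symptom described here."""
    parsed = parse_rpcinfo_error(output)
    return parsed == ("Authentication error", "Client credential too weak")
```

Matching on both fields, instead of on the whole line, keeps the check from triggering on other portmapper failures that share the same prefix.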
Changes
Cause
In this Document
Symptoms
Changes
Cause
Solution
References