Qlge mpi link down! and qlc Link OFFLINE / ONLINE Events (Doc ID 1948259.1)

Last updated on JULY 29, 2016

Applies to:

Solaris Operating System - Version 10 10/08 U6 and later
Information in this document applies to any platform.

Symptoms

On SunOS 5.10 Generic 148889-01 x86

'Dual 10-Gigabit Ethernet SFP+ FCoE - DA' | SG-XPCIEFCOE2-Q-TA | 375-3682

   Id    Size Info Rev Module Name
  -----------------------------------------------------------------
  106   1af48  196   1 fp (SunFC Port v20121113-1.87)
  107   18e98  201   1 fcp (SunFC FCP v20121113-1.151)
  108    a1d0    -   1 fctl (SunFC Transport v20121113-1.62)
  110  1bc020  231   1 qlc (SunFC Qlogic FCA v20120717-4.01)
  
The FC (qlc) and qlge links on the CNA adapter frequently go down.

- IP and FCoE link down/up events have been occurring for weeks on several systems. The messages always appear at exact 5-minute boundaries, even on systems without NTP (e.g. 09:05:00, 13:00:00, 20:25:00).
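
The 5-minute alignment described above can be checked mechanically against a list of event times. The following is a minimal sketch, not part of the original note: the function name `on_five_minute_boundary` and the `tolerance_s` parameter are hypothetical, and the sample times are the three examples quoted in the symptom description.

```python
from datetime import datetime


def on_five_minute_boundary(ts: str, tolerance_s: int = 0) -> bool:
    """Return True if an HH:MM:SS time lies on a 5-minute boundary.

    tolerance_s (hypothetical parameter) allows for a few seconds of
    logging delay on either side of the boundary.
    """
    t = datetime.strptime(ts, "%H:%M:%S")
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    offset = seconds % 300  # seconds past the previous 5-minute mark
    # Distance to the nearest boundary, before or after the timestamp.
    return min(offset, 300 - offset) <= tolerance_s


# Example times taken from the symptom description.
samples = ["09:05:00", "13:00:00", "20:25:00"]
for ts in samples:
    print(ts, on_five_minute_boundary(ts))
```

If every link event in the system log passes this check, that would point towards something running on a fixed schedule rather than a random link fault. That reading is an inference from the timing pattern, not a finding stated in this document.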

We see this behaviour on several systems with different patch levels.
The host for which this SR was opened is running qlc 3.05 (Patch 146490-05).
The issue is also present on hosts with qlc 3.06 (Patch 146490-06). We have just patched a cluster to the latest recommended patchset (qlc 4.01), but the problem persists:

qlc (SunFC Qlogic FCA v20110321-3.05)
qlge (GLDv3 QLogic 81XX 101203-v1.10)
Firmware Version: 05.04.03

qlc (SunFC Qlogic FCA v20110825-3.06)
qlge (GLDv3 QLogic 81XX 101203-v1.10)
Firmware Version: 05.06.00

qlc (SunFC Qlogic FCA v20120717-4.01)
qlge (GLDv3 QLogic 81XX 101203-v1.10)
Firmware Version: 05.06.04

 

- The systems are connected via two QLC CNA adapters (375-3682-01), with one port on each adapter in use, grouped using IPMP. The customer is using tagged VLANs.
- One onboard interface (igb0) and one QGE interface (ixgbe0) are used for the cluster interconnect.
- The issue is visible on clustered and non-clustered systems.
- It seems that only hosts with running zones are affected.
- On a cluster node with many installed flying zones we saw the messages several times a day, but when we switched the zones to a different node, everything was fine.
- All systems where we have seen this so far have non-global zones running on them.
- The systems are monitored by third-party management software (HP OpenView (OVO)); no SNMP is used.

- The switch connected to the servers is a Cisco N2K-C2232PP-10GE, which is connected to a Cisco Nexus 5000.

- Oracle-supported FCoE switches are:
* Cisco Nexus 5000
* Brocade 8000

- Below you can see the symptom and events from messages:

 

Cause
