
Servers Keep Losing EMC XtremIO Storage (Doc ID 2513523.1)

Last updated on MAY 24, 2020

Applies to:

Linux OS - Version Oracle Linux 7.0 and later
Linux x86-64

Symptoms

The /u01 file systems repeatedly switch to read-only on multiple Oracle Linux 7.5 servers. /u01 is an ext3 file system on EMC storage.
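
To confirm which file systems are currently in the read-only state described above, the mount table can be inspected. The following is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, comma-separated options). The function name and the sample data are illustrative and do not come from the original note.

```python
def read_only_mounts(mounts_text, fs_types=("ext3", "ext4")):
    """Return (device, mount_point) pairs that are mounted read-only.

    mounts_text is the content of /proc/mounts, where each line is
    'device mount_point fs_type options dump pass'.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        if fs_type in fs_types and "ro" in options.split(","):
            found.append((device, mount_point))
    return found


# Hypothetical sample data; the device names are made up for illustration.
SAMPLE = (
    "/dev/mapper/mpatha1 /u01 ext3 ro,relatime,data=ordered 0 0\n"
    "/dev/sda1 /boot ext4 rw,relatime 0 0\n"
)

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE))
```

On a live server the same function could be fed the output of open("/proc/mounts").read() instead of the sample string.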

kernel: lpfc 0000:81:00.0: 2:1305 Link Down Event x2 received Data: x2 x20 x800110 x0 x0
kernel: lpfc 0000:03:00.0: 0:1305 Link Down Event x2 received Data: x2 x20 x800110 x0 x0
kernel: lpfc 0000:81:00.0: 2:1303 Link Up Event x3 received Data: x3 x0 x80 x0 x0 x0 0
kernel: lpfc 0000:03:00.0: 0:1303 Link Up Event x3 received Data: x3 x0 x80 x0 x0 x0 0
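
When many of these link events are logged, it can help to count them per HBA. The following is a minimal sketch that matches only the two message formats quoted above (lpfc messages 1305 and 1303). The regular expression and function name are assumptions made for illustration, not an Oracle-provided tool.

```python
import re
from collections import Counter

# Matches the lpfc link messages quoted above: 1305 (Link Down) and 1303 (Link Up).
LINK_RE = re.compile(
    r"lpfc (?P<pci>[0-9a-f:.]+): \d+:(?:1305|1303) Link (?P<state>Down|Up) Event"
)


def count_link_events(log_text):
    """Count Link Down / Link Up events per PCI address."""
    counts = Counter()
    for match in LINK_RE.finditer(log_text):
        counts[(match.group("pci"), match.group("state"))] += 1
    return counts


# Two of the lines from the excerpt above, used as sample input.
SAMPLE_LOG = (
    "kernel: lpfc 0000:81:00.0: 2:1305 Link Down Event x2 received "
    "Data: x2 x20 x800110 x0 x0\n"
    "kernel: lpfc 0000:81:00.0: 2:1303 Link Up Event x3 received "
    "Data: x3 x0 x80 x0 x0 x0 0\n"
)

if __name__ == "__main__":
    for (pci, state), n in sorted(count_link_events(SAMPLE_LOG).items()):
        print(pci, state, n)
```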

kernel: lpfc 0000:81:00.0: 2:(0):0748 abort handler timed out waiting for aborting I/O (xri:x19a7) to complete: ret 0x2003, ID 0, LUN 234
kernel: lpfc 0000:81:00.0: 2:(0):0748 abort handler timed out waiting for aborting I/O (xri:x19f4) to complete: ret 0x2003, ID 0, LUN 147
kernel: lpfc 0000:81:00.0: 2:(0):0713 SCSI layer issued Device Reset (0, 1) return x2002
kernel: lpfc 0000:81:00.0: 2:(0):0724 I/O flush failure for context LUN : cnt x5
kernel: lpfc 0000:81:00.0: 2:0338 IOCB wait timeout error - no wake response Data x3c
kernel: lpfc 0000:81:00.0: 2:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 100 failed (0, 0) iocb_flag x206
kernel: lpfc 0000:81:00.0: 2:(0):0713 SCSI layer issued Device Reset (0, 100) return x2007
kernel: lpfc 0000:81:00.0: 2:0338 IOCB wait timeout error - no wake response Data x3c
kernel: lpfc 0000:81:00.0: 2:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 103 failed (0, 0) iocb_flag x206
kernel: lpfc 0000:81:00.0: 2:(0):0713 SCSI layer issued Device Reset (0, 103) return x2007

kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x0 failed: x2 SNS x70000600 x29000000 Data: x2 x0 x12 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):0710 Iodone <0/375> cmd ffff887bb6203b80, error x2 SNS x60070 x29 Data: x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):9030 FCP cmd x0 failed <1/375> status: x1 result: x0 sid: xbb0e42 did: xbb0dc0 oxid: x14d Data: x5 x853
kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x0 failed: x2 SNS x70000600 x29000000 Data: x2 x0 x12 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):0710 Iodone <1/375> cmd ffff887bb62001c0, error x2 SNS x60070 x29 Data: x0 x0
kernel: F 7445013.084/181114022913 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 141100 msec.(inflt=1)
kernel: F 7445016.086/181114022916 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 144102 msec.(inflt=1)
kernel: F 7445016.086/181114022916 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 144102 msec.(inflt=1)
kernel: F 7445016.086/181114022916 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 144102 msec.(inflt=1)
kernel: F 7445019.089/181114022919 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 147105 msec.(inflt=1)
kernel: lpfc 0000:03:00.0: 0:(0):0724 I/O flush failure for context LUN : cnt x1
kernel: lpfc 0000:03:00.0: 0:(0):0702 Issue FCP_LUN_RESET to TGT 2 LUN 100 rpi x4 nlp_flag x80000000 Data: x1bf x4
kernel: lpfc 0000:03:00.0: 0:(0):0706 fcp_rsp valid 0x1, rsp len=8 code 0x0
kernel: lpfc 0000:03:00.0: 0:(0):0715 Task Mgmt No Failure
kernel: lpfc 0000:03:00.0: 0:(0):0713 SCSI layer issued Device Reset (2, 100) return x2002
kernel: F 7445022.089/181114022922 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 150105 msec.(inflt=1)
kernel: F 7445025.092/181114022925 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 153108 msec.(inflt=1)
kernel: lpfc 0000:81:00.0: 2:(0):9030 FCP cmd x0 failed <0/100> status: x1 result: x0 sid: xbb0e42 did: xbb0de0 oxid: x17d Data: x6 x883
kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x0 failed: x2 SNS x70000600 x29000000 Data: x2 x0 x12 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):0710 Iodone <0/100> cmd ffff887bb6200a80, error x2 SNS x60070 x29 Data: x0 x0
kernel: F 7445028.095/181114022928 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 156111 msec.(inflt=1)
kernel: lpfc 0000:81:00.0: 2:(0):9030 FCP cmd x0 failed <1/100> status: x1 result: x0 sid: xbb0e42 did: xbb0dc0 oxid: x16e Data: x5 x874
kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x0 failed: x2 SNS x70000600 x29000000 Data: x2 x0 x12 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):0710 Iodone <1/100> cmd ffff887bb6201a40, error x2 SNS x60070 x29 Data: x0 x0
kernel: F 7445031.098/181114022931 ora_ctwr_ovgccx[88703] afdq_batch_collect: Request for dev 249/25 incomplete for 159114 msec.(inflt=1)
kernel: INFO: task multipathd:38743 blocked for more than 120 seconds.
kernel:      Tainted: P           OE  Z 4.1.12-124.18.6.el7uek.x86_64 #2
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: multipathd      D ffff887e7cc58140     0 38743      1 0x00000080
kernel: ffff8824b1003d38 0000000000000086 ffff8824b1003d68 ffff883d04cfc600
kernel: ffff8824b1003d58 ffff8824b1004000 ffff883e248e96c0 ffff883e36251000
kernel: 000000000002001d 00007f0c4ac45d10 ffff8824b1003d58 ffffffff8174e8d7
kernel: Call Trace:
kernel: [<ffffffff8174e8d7>] schedule+0x37/0x90
kernel: [<ffffffff814dccb5>] scsi_block_when_processing_errors+0xb5/0x130
kernel: [<ffffffff810d1e30>] ? prepare_to_wait_event+0xf0/0xf0
kernel: [<ffffffff814da93d>] scsi_ioctl_block_when_processing_errors+0x4d/0x60
kernel: [<ffffffffa00f7b85>] sd_ioctl+0x65/0x120 [sd_mod]
kernel: [<ffffffff81329fda>] blkdev_ioctl+0x2aa/0x990
kernel: [<ffffffff81257141>] block_ioctl+0x41/0x50
kernel: [<ffffffff81230213>] do_vfs_ioctl+0x323/0x530
kernel: [<ffffffff81136714>] ? __audit_syscall_entry+0xb4/0x110
kernel: [<ffffffff8102714c>] ? do_audit_syscall_entry+0x6c/0x70
kernel: [<ffffffff812304a1>] SyS_ioctl+0x81/0xa0
kernel: [<ffffffff81753c1f>] ? system_call_after_swapgs+0xe9/0x190
kernel: [<ffffffff81753c18>] ? system_call_after_swapgs+0xe2/0x190
kernel: [<ffffffff81753c11>] ? system_call_after_swapgs+0xdb/0x190
kernel: [<ffffffff81753cde>] system_call_fastpath+0x18/0xd7
kernel: INFO: task multipathd:38744 blocked for more than 120 seconds.
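
The afdq_batch_collect lines above report how long a request has been outstanding on a device. A short sketch for extracting the longest reported stall per device follows; it assumes only the message format shown in this excerpt, and the function name is hypothetical.

```python
import re

# Matches e.g. "afdq_batch_collect: Request for dev 249/25 incomplete for 141100 msec."
STALL_RE = re.compile(
    r"afdq_batch_collect: Request for dev (?P<dev>\d+/\d+) "
    r"incomplete for (?P<msec>\d+) msec"
)


def longest_stalls(log_text):
    """Return {device major/minor: longest reported stall in milliseconds}."""
    worst = {}
    for match in STALL_RE.finditer(log_text):
        dev = match.group("dev")
        msec = int(match.group("msec"))
        worst[dev] = max(msec, worst.get(dev, 0))
    return worst


# The first and last afdq lines from the excerpt above, used as sample input.
SAMPLE_LOG = (
    "kernel: F 7445013.084/181114022913 ora_ctwr_ovgccx[88703] "
    "afdq_batch_collect: Request for dev 249/25 incomplete for 141100 msec.(inflt=1)\n"
    "kernel: F 7445031.098/181114022931 ora_ctwr_ovgccx[88703] "
    "afdq_batch_collect: Request for dev 249/25 incomplete for 159114 msec.(inflt=1)\n"
)

if __name__ == "__main__":
    print(longest_stalls(SAMPLE_LOG))
```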

kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x12 failed: x0 SNS x0 x0 Data: x8 xf1 x0 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):9030 FCP cmd x4d failed <2/307> status: x1 result: x4 sid: xb90c70 did: xb90e60 oxid: x1902 Data: x1806 x808
kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x4d failed: x2 SNS x70000500 x24000000 Data: xa x4 x12 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):0710 Iodone <2/307> cmd ffff883b904ab100, error x2 SNS x50070 x24 Data: x0 x4
kernel: lpfc 0000:81:00.0: 2:(0):9030 FCP cmd x4d failed <2/307> status: x1 result: x44 sid: xb90c70 did: xb90e60 oxid: x193a Data: x1806 x840
kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x4d failed: x2 SNS x70000500 x24000000 Data: xa x44 x12 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):0710 Iodone <2/307> cmd ffff883b904aaa00, error x2 SNS x50070 x24 Data: x0 x44
kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x12 failed: x0 SNS x0 x0 Data: x8 xf1 x0 x0 x0
kernel: lpfc 0000:81:00.0: 2:(0):9030 FCP cmd x4d failed <1/299> status: x1 res
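
The 9024 messages above carry the failed SCSI opcode and two words of sense data. The sketch below decodes them under one assumption that should be checked against the lpfc driver source for the kernel in use: that the two values after "SNS" are sense bytes 0-3 and 12-15 of fixed-format sense data. Under that assumption, x70000600 x29000000 reads as UNIT ATTENTION with ASC 0x29 (power on, reset, or bus device reset occurred), and x70000500 x24000000 reads as ILLEGAL REQUEST with ASC 0x24 (invalid field in CDB). The lookup tables are deliberately partial and the function name is hypothetical.

```python
import re

# Partial tables: only a few common SCSI sense keys and the opcodes seen above.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}
OPCODES = {0x00: "TEST UNIT READY", 0x12: "INQUIRY", 0x4D: "LOG SENSE"}

MSG_9024 = re.compile(
    r"9024 FCP command x([0-9a-f]+) failed: x([0-9a-f]+) "
    r"SNS x([0-9a-f]+) x([0-9a-f]+)"
)


def decode_9024(line):
    """Decode an lpfc 9024 message; return None if the line does not match."""
    match = MSG_9024.search(line)
    if match is None:
        return None
    opcode, status, sns0, sns3 = (int(g, 16) for g in match.groups())
    key = (sns0 >> 8) & 0x0F  # assumed: sense byte 2, low nibble
    return {
        "opcode": OPCODES.get(opcode, hex(opcode)),
        "scsi_status": status,  # 0x2 is CHECK CONDITION
        "response_code": (sns0 >> 24) & 0x7F,  # assumed: sense byte 0
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "asc": (sns3 >> 24) & 0xFF,  # assumed: sense byte 12
        "ascq": (sns3 >> 16) & 0xFF,  # assumed: sense byte 13
    }


if __name__ == "__main__":
    print(decode_9024(
        "kernel: lpfc 0000:81:00.0: 2:(0):9024 FCP command x0 failed: x2 "
        "SNS x70000600 x29000000 Data: x2 x0 x12 x0 x0"
    ))
```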

Changes

The network configuration was changed and the kernel was updated to 4.1.12-124.18.6.el7uek.x86_64.


