ASM hangs when One Disk from the Fail-Group Goes Down (Doc ID 2292847.1)

Last updated on AUGUST 04, 2017

Applies to:

Oracle VM - Version 3.0.1 and later
Linux OS - Version Oracle Linux 5.10 and later
Linux x86-64

Symptoms

ASM is configured for disk redundancy (disk mirroring), and the ASM disks are accessed through multipath.
When one disk goes down, ASM is expected to switch to the redundant (mirror) disk and continue working. In this case the switch does not happen.

ASM hangs with the following call trace:

Jun 8 09:56:02 xxxx kernel: INFO: task asm_dbw0_+asm:8145 blocked for more than 120 seconds.
Jun 8 09:56:02 xxxx kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 8 09:56:02 xxxx kernel: asm_dbw0_+asm D ffff8830d399e668 0 8145 1 0x00000080
Jun 8 09:56:02 xxxx kernel: ffff8830cce1b918 0000000000000082 0000000000000000 ffff883000000001
Jun 8 09:56:02 xxxx kernel: 0000000000012180 ffff8830cce1bfd8 ffff8830cce1a010 0000000000012180
Jun 8 09:56:02 xxxx kernel: ffff8830cce1bfd8 0000000000012180 ffff88002fba0080 ffff8830d399e0c0
Jun 8 09:56:02 xxxx kernel: Call Trace:
Jun 8 09:56:02 xxxx kernel: [<ffffffff8150e0cf>] schedule+0x3f/0x60
Jun 8 09:56:02 xxxx kernel: [<ffffffff8150e17c>] io_schedule+0x8c/0xd0
Jun 8 09:56:02 xxxx kernel: [<ffffffff812413ea>] get_request_wait+0xca/0x190
Jun 8 09:56:02 xxxx kernel: [<ffffffff81091440>] ? wake_up_bit+0x40/0x40
Jun 8 09:56:02 xxxx kernel: [<ffffffff8123a0d4>] ? elv_merge+0x84/0xe0
Jun 8 09:56:02 xxxx kernel: [<ffffffff812415d1>] __make_request+0x121/0x300
Jun 8 09:56:02 xxxx kernel: [<ffffffff81240160>] generic_make_request+0x2f0/0x5e0
Jun 8 09:56:02 xxxx kernel: [<ffffffff8104ce5d>] ? get_user_pages_fast+0xdd/0x1c0
Jun 8 09:56:02 xxxx kernel: [<ffffffff811a4c4f>] ? __bio_map_user_iov+0x25f/0x310
Jun 8 09:56:02 xxxx kernel: [<ffffffff812404d6>] submit_bio+0x86/0x110
Jun 8 09:56:02 xxxx kernel: [<ffffffffa00fa52f>] asm_submit_io+0x81f/0xbc0 [oracleasm]
Jun 8 09:56:02 xxxx kernel: [<ffffffffa00fb56a>] asm_submit_io_native+0x9a/0x1b0 [oracleasm]
Jun 8 09:56:02 xxxx kernel: [<ffffffff815100df>] ? _raw_spin_lock_irqsave+0x2f/0x40
Jun 8 09:56:02 xxxx kernel: [<ffffffffa00fc76b>] asm_do_io+0xa1b/0xb20 [oracleasm]
Jun 8 09:56:02 xxxx kernel: [<ffffffff811f6230>] ? sys_semtimedop+0x490/0x620
Jun 8 09:56:02 xxxx kernel: [<ffffffffa00f6240>] ? asmfs_put_super+0x20/0x20 [oracleasm]
Jun 8 09:56:02 xxxx kernel: [<ffffffffa00fc9ba>] asmfs_svc_io64+0x14a/0x2b0 [oracleasm]
Jun 8 09:56:02 xxxx kernel: [<ffffffff811a3c2b>] ? bio_put+0x2b/0x40
Jun 8 09:56:02 xxxx kernel: [<ffffffffa00fd436>] asmfs_file_read+0x66/0x160 [oracleasm]
Jun 8 09:56:02 xxxx kernel: [<ffffffff811725a5>] vfs_read+0xc5/0x190
Jun 8 09:56:02 xxxx kernel: [<ffffffff81172771>] sys_read+0x51/0x90
Jun 8 09:56:02 xxxx kernel: [<ffffffff810cfdab>] ? audit_syscall_exit+0x25b/0x290
Jun 8 09:56:02 xxxx kernel: [<ffffffff81518582>] system_call_fastpath+0x16/0x1b
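
To check whether a system log contains the same kind of hung-task report, the first line of the trace can be matched mechanically. The following is a minimal sketch, not an Oracle-provided tool: the function name find_hung_tasks and the sample list are hypothetical, and the pattern assumes the message wording shown in the trace above.

```python
import re

# Matches the kernel hung-task report line, for example:
# "INFO: task asm_dbw0_+asm:8145 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a (task_name, pid, seconds) tuple for each hung-task line."""
    hits = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# Hypothetical sample input: two lines copied from the trace above.
sample = [
    "Jun 8 09:56:02 xxxx kernel: INFO: task asm_dbw0_+asm:8145 "
    "blocked for more than 120 seconds.",
    "Jun 8 09:56:02 xxxx kernel: Call Trace:",
]
print(find_hung_tasks(sample))
```

In practice the lines would come from the system log file instead of the inline sample; the log location varies by system and is not stated in this note.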

Changes

The value of the "no_path_retry" parameter in multipath.conf is set to queue.
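
Whether a given configuration carries this setting can be checked with a short script. The sketch below makes several assumptions: the helper names no_path_retry_values and queues_forever are hypothetical, the example configuration text is invented for illustration (placing the parameter in a defaults section is an assumption, since this note does not say where it was set), and the check is a plain text scan, not a full multipath.conf parser. According to the multipath.conf man page, the value queue means I/O is queued indefinitely when all paths have failed.

```python
import re

# Matches an uncommented 'no_path_retry <value>' line; the value may be quoted.
NO_PATH_RETRY_RE = re.compile(
    r'^[ \t]*no_path_retry[ \t]+"?(\w+)"?', re.MULTILINE
)


def no_path_retry_values(conf_text):
    """Return every no_path_retry value found in the configuration text."""
    return NO_PATH_RETRY_RE.findall(conf_text)


def queues_forever(conf_text):
    """True if any no_path_retry entry is set to 'queue'."""
    return any(value == "queue" for value in no_path_retry_values(conf_text))


# Hypothetical configuration text, for illustration only.
example_conf = """
defaults {
    user_friendly_names yes
    # no_path_retry 5
    no_path_retry queue
}
"""
print(no_path_retry_values(example_conf))
print(queues_forever(example_conf))
```

To inspect a real system, read the file contents (commonly /etc/multipath.conf) and pass the text to queues_forever. Commented-out entries are ignored because the pattern only accepts whitespace before the parameter name.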

Cause
