Bug 33689598 ZFS Storage Appliance Thread Deadlock in Appliance Kit Daemon After Replication Reversal, Sever, or Cluster Operations (Doc ID 2832628.1)

Last updated on OCTOBER 14, 2022

Applies to:

Internal ZFS ZS5-4 Rack - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS5-4 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage Appliance Racked System ZS5-4 - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage ZS7-2 Mid-Range TAA Compliant - Version All Versions to All Versions [Release All Releases]
Oracle ZFS Storage Appliance Racked System ZS7-2 for PCA/CatC - Version All Versions to All Versions [Release All Releases]
7000 Appliance OS (Fishworks)
All ZFS Storage Appliance Products, Exceptions Noted Below.





Symptoms

A replication sever, replication reversal, cluster takeover, or cluster failback operation does not complete. Management operations in the Browser User Interface (BUI) and the CLI for clustering, shares, replication, analytics, status, and any configuration change may also hang and not complete.

This issue can only occur on a system that is a replication target. It can happen after the replication source has changed its IP address, or when the source has multiple IP addresses and routing selects a different address than the one used for the previous replication update.

The issue can occur while the source initiates a replication update. It can also occur during a takeover if the source IP address could not be resolved to a hostname before the takeover and, as a result of the takeover, becomes resolvable.
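
The trigger conditions above can be illustrated with a minimal sketch. This is not Oracle tooling and does not run on the appliance. It only shows, from a general-purpose host, how one might check whether a source IP address resolves to a hostname and whether two successive replication updates match either condition. The function names `reverse_lookup` and `matches_trigger` are hypothetical, and the addresses used with them are placeholders. Matching a condition does not mean the deadlock has occurred.

```python
import socket
from typing import Callable, Optional


def reverse_lookup(ip: str, resolver: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Return the hostname for ip, or None if the reverse lookup fails.

    The resolver argument is an illustration aid (hypothetical) so the
    lookup can be replaced with a stub when no DNS is available.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        return None


def matches_trigger(prev_ip: str, cur_ip: str,
                    prev_host: Optional[str], cur_host: Optional[str]) -> bool:
    """True if two successive updates match either condition described above.

    Condition 1: the source address differs from the previous update.
    Condition 2: the source address was unresolvable before and resolves now.
    """
    ip_changed = prev_ip != cur_ip
    became_resolvable = prev_host is None and cur_host is not None
    return ip_changed or became_resolvable
```

For example, `matches_trigger("192.0.2.10", "192.0.2.10", None, reverse_lookup("192.0.2.10"))` is true only if the address, assumed unresolvable at the previous update, now resolves.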

The issue could occur during a rolling upgrade as follows:

  1. Upgrade the first head to the affected release.
  2. Upgrade the second head to the same release.
  3. The first head performs a takeover because the second head reboots to complete the upgrade. If a replication receive request arrives while the takeover is in progress and the takeover does not complete, this issue may have occurred.

To further identify this issue, collect a support bundle. The issue happens when the Appliance Kit Daemon (akd) converts the IP address of the source to a hostname. The peer log in the bundle will contain a line as follows:

Changes

Issue Found In:

OS8.8.39, ak-2013.06.05.8.39

Cause



