Jgroup Initialization Hanging While Restarting IC Servers

(Doc ID 2362337.1)

Last updated on FEBRUARY 16, 2018

Applies to:

Oracle Knowledge - Version 8.5.1.4 and later
Information in this document applies to any platform.

Symptoms

When restarting a few IC instances out of a large number of instances, some occasionally fail to restart and appear to hang at JGroups initialization:

2018/01/30 14:14:47 | INFO | Property type=VARCHAR2
2018/01/30 14:14:47 | INFO | Property name=varchar500
2018/01/30 14:14:47 | INFO | Property type=VARCHAR2
2018/01/30 14:14:48 | INFO |
2018/01/30 14:14:48 | INFO | -------------------------------------------------------------------
2018/01/30 14:14:48 | INFO | GMS: address=moic-wc-a4p-39866, cluster=EOFGroup851PROD, physical address=172.28.66.141:62604
2018/01/30 14:14:48 | INFO | -------------------------------------------------------------------

Other IC instances in the same cluster then start logging the error messages shown below and have to be shut down.

In an environment with 80 production IC instances, perhaps one or two start logging these messages. The messages make the log grow very large very quickly, and the affected instances can no longer host users, so they are effectively incapacitated. The effect on the user side is not known.
2018/01/30 00:00:00 | INFO | 1157942897 [INT-2,EOFGroup851PROD,moic-wc-a4p-13313] ERROR org.jgroups.protocols.UDP - JGRP000029: moic-wc-a4p-13313: failed sending message to moic-wc-a37p-41692 (99898 bytes): java.lang.Exception: message size (99898) is greater than max bundling size (64000). Set the fragmentation/bundle size in FRAG/FRAG2 and TP correctly, headers: GMS: GmsHeader[VIEW], NAKACK2: [XMIT_RSP, seqno=712], UDP: [cluster_name=EOFGroup851PROD]
2018/01/30 00:00:00 | INFO | 1157942897 [INT-1,EOFGroup851PROD,moic-wc-a4p-13313] ERROR org.jgroups.protocols.UDP - JGRP000029: moic-wc-a4p-13313: failed sending message to moic-wc-a3p-25450 (874118 bytes): java.lang.Exception: message size (874118) is greater than max bundling size (64000). Set the fragmentation/bundle size in FRAG/FRAG2 and TP correctly, headers: GMS: GmsHeader[INSTALL_MERGE_VIEW], UNICAST3: DATA, seqno=11, conn_id=1255, UDP: [cluster_name=EOFGroup851PROD]

While these messages are being logged, no other instances can be restarted until the affected instances are stopped.
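
Since only one or two instances out of many are typically affected, it can help to scan the IC logs for the JGRP000029 error and see which instance is emitting it and how far the message size exceeds the limit. The following is a minimal sketch in Python, assuming the log line layout shown in the excerpts above; the function names and the sample list are hypothetical and are not part of Oracle Knowledge or JGroups.

```python
import re
from collections import Counter

# Matches the JGRP000029 error as it appears in the log excerpts above.
# Assumption: other log files use the same message layout.
PATTERN = re.compile(
    r"JGRP000029: (?P<sender>\S+): failed sending message to (?P<target>\S+) "
    r"\((?P<size>\d+) bytes\).*?max bundling size \((?P<limit>\d+)\)"
)

# Two sample lines, shortened from the excerpts above for illustration.
SAMPLE_LINES = [
    "2018/01/30 00:00:00 | INFO | 1157942897 [INT-2,EOFGroup851PROD,moic-wc-a4p-13313] "
    "ERROR org.jgroups.protocols.UDP - JGRP000029: moic-wc-a4p-13313: failed sending "
    "message to moic-wc-a37p-41692 (99898 bytes): java.lang.Exception: message size "
    "(99898) is greater than max bundling size (64000).",
    "2018/01/30 00:00:00 | INFO | 1157942897 [INT-1,EOFGroup851PROD,moic-wc-a4p-13313] "
    "ERROR org.jgroups.protocols.UDP - JGRP000029: moic-wc-a4p-13313: failed sending "
    "message to moic-wc-a3p-25450 (874118 bytes): java.lang.Exception: message size "
    "(874118) is greater than max bundling size (64000).",
]


def find_oversized_sends(lines):
    """Return (sender, target, size, limit) for every JGRP000029 line."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append(
                (
                    match["sender"],
                    match["target"],
                    int(match["size"]),
                    int(match["limit"]),
                )
            )
    return hits


def count_by_sender(hits):
    """Count failed sends per sending instance."""
    return Counter(sender for sender, _target, _size, _limit in hits)
```

Passing the lines of a real log file (for example, an open file object) to `find_oversized_sends` and then calling `count_by_sender` on the result would list the instances that are failing to send, which are the candidates to stop before restarting others.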

Cause
