RP/TUX 8.0 - ISL hangs in infinite loop at ocm_get_entry() with a circular OCM entry hashlist. (Doc ID 769241.1)

Last updated on NOVEMBER 04, 2016

Applies to:

Oracle Tuxedo / Tuxedo / 8.0
Information in this document applies to any platform

Goal

The UBB settings and the ISL hang scenario are as follows.
ISL entry in the UBB file:
"ISL"	SRVGRP="SYS_GRP4"	SRVID=15
	CLOPT="-A -- -n //193.168.2.17:5000 -m50 -M100 -x50 -O -o150 -C warn -d /dev/tcp" 
	RQPERM=0666	REPLYQ=N	RPPERM=0666	MIN=1	MAX=1	CONV=N
	SYSTEM_ACCESS=PROTECTED
	MAXGEN=10	GRACE=86400	RESTART=Y
Note that -o is set to 150 and -u is left at its default of 10.

When the ISL hangs in the tight loop in ocm_get_entry(), the ocm_print_ocm() output is as follows:
(for complete printout, see Clarifypc:\attachments\222xxx\222700\new_ISL_hang\ULOG.042701)
005634.ndcetms17!ISL.632288: OCM CORRUPTED bkt = 757
005634.ndcetms17!ISL.632288: ocmtab = 1073747412
005634.ndcetms17!ISL.632288: ocmhsh = 1073797812
005634.ndcetms17!ISL.632288: ocmfreehead = 8
005634.ndcetms17!ISL.632288: ocminuse = 2
005634.ndcetms17!ISL.632288: ocmhw = 9
005634.ndcetms17!ISL.632288: ------------------------------------------
005634.ndcetms17!ISL.632288: 	OCM ENTRY 0	
005634.ndcetms17!ISL.632288: fhlink = 2
005634.ndcetms17!ISL.632288: bhlink = 2
005634.ndcetms17!ISL.632288: rlink = 0
005634.ndcetms17!ISL.632288: host = iii.ppp.168.190
005634.ndcetms17!ISL.632288: port = 2539
005634.ndcetms17!ISL.632288: refcount = 10
005634.ndcetms17!ISL.632288: core ocmflags = 0x4
005634.ndcetms17!ISL.632288: -------------------------------------------
005634.ndcetms17!ISL.632288: ------------------------------------------
005634.ndcetms17!ISL.632288: 	OCM ENTRY 1	
005634.ndcetms17!ISL.632288: fhlink = 9
005634.ndcetms17!ISL.632288: bhlink = -1
005634.ndcetms17!ISL.632288: rlink = 1
005634.ndcetms17!ISL.632288: host = 
005634.ndcetms17!ISL.632288: port = 0
005634.ndcetms17!ISL.632288: refcount = 0
005634.ndcetms17!ISL.632288: core ocmflags = 0x10
005634.ndcetms17!ISL.632288: -------------------------------------------
005634.ndcetms17!ISL.632288: ------------------------------------------
005634.ndcetms17!ISL.632288: 	OCM ENTRY 2	
005634.ndcetms17!ISL.632288: fhlink = 0
005634.ndcetms17!ISL.632288: bhlink = 0
005634.ndcetms17!ISL.632288: rlink = 2
005634.ndcetms17!ISL.632288: host = iii.ppp.168.190
005634.ndcetms17!ISL.632288: port = 2539
005634.ndcetms17!ISL.632288: refcount = 10
005634.ndcetms17!ISL.632288: core ocmflags = 0x4
005634.ndcetms17!ISL.632288: -------------------------------------------
005634.ndcetms17!ISL.632288: ------------------------------------------
005634.ndcetms17!ISL.632288: 	OCM ENTRY 3	
005634.ndcetms17!ISL.632288: fhlink = 7
005634.ndcetms17!ISL.632288: bhlink = 8
005634.ndcetms17!ISL.632288: rlink = 3
005634.ndcetms17!ISL.632288: host = 
005634.ndcetms17!ISL.632288: port = 0
005634.ndcetms17!ISL.632288: refcount = 0
005634.ndcetms17!ISL.632288: core ocmflags = 0x10
005634.ndcetms17!ISL.632288: -------------------------------------------

Bucket #757 is corrupted: it contains OCM ENTRY #0 and OCM ENTRY #2, whose fhlink/bhlink values point at each
other (entry #0 has fhlink = 2 and bhlink = 2; entry #2 has fhlink = 0 and bhlink = 0), so the hash list is circular.
Both entries also have an ocmrefcount of 10, which is the default maximum set by -u.
The ocm_get_entry() function in isl_ocm.c contains this tight loop:
for (idx=OCMHASH(bkt); idx !=NIL; idx=ocme->ocmhashlist.fhlink) {
...
if ((ocme->ocmport == port) && (strcmp(ocme->ocmhost,host)==0) &&
    (ocme->ocmrefcount < ISLT->maxcltspconn)) 
   { break; }
}
Because both entries have ocmrefcount equal to maxcltspconn (the default -u value of 10 in this case), the test
(ocme->ocmrefcount < ISLT->maxcltspconn) is always false and the break is never taken. With OCM ENTRY #0 and #2
pointing to each other, fhlink never reaches NIL, so the for loop runs forever.
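
The failure mode can be reproduced outside Tuxedo with a small stand-in table. The sketch below is illustrative only and is not the ISL source or Oracle's fix: the struct ocm_entry layout, the NIL value of -1, the CHAIN_CORRUPT return code, the bounded_lookup() helper, and the choice of entry #0 as the bucket head are all assumptions made for the example. It walks the chain with the same match test as the excerpt above, but stops after as many steps as there are table entries, so a circular list is reported instead of looping forever.

```c
#include <assert.h>
#include <string.h>

#define NIL            (-1)  /* assumed end-of-chain marker */
#define CHAIN_CORRUPT  (-2)  /* hypothetical code: circular or out-of-range link */
#define MAXCLTSPCONN   10    /* default -u value quoted in the article */

/* Hypothetical, simplified entry holding only the fields shown in the dump. */
struct ocm_entry {
    int fhlink;              /* forward hash link */
    int bhlink;              /* backward hash link */
    const char *host;
    int port;
    int refcount;
};

/* OCM ENTRY 0..3 as printed in the ULOG excerpt (host is masked there too). */
static const struct ocm_entry dump_tab[4] = {
    { 2,  2, "iii.ppp.168.190", 2539, 10 },
    { 9, -1, "",                0,     0 },
    { 0,  0, "iii.ppp.168.190", 2539, 10 },
    { 7,  8, "",                0,     0 },
};

/*
 * Walk the hash chain starting at head. A healthy chain can hold at most
 * nentries nodes, so visiting more than that means the list is circular.
 * Returns the index of a matching entry with spare capacity, NIL if the
 * chain ends without a match, or CHAIN_CORRUPT if the chain is damaged.
 */
static int bounded_lookup(const struct ocm_entry *tab, int nentries, int head,
                          const char *host, int port)
{
    int idx;
    int steps = 0;

    for (idx = head; idx != NIL; idx = tab[idx].fhlink) {
        if (idx < 0 || idx >= nentries)
            return CHAIN_CORRUPT;        /* link points outside the table */
        if (steps++ >= nentries)
            return CHAIN_CORRUPT;        /* more steps than entries: a cycle */
        if (tab[idx].port == port && strcmp(tab[idx].host, host) == 0 &&
            tab[idx].refcount < MAXCLTSPCONN)
            return idx;
    }
    return NIL;
}
```

Calling bounded_lookup(dump_tab, 4, 0, "iii.ppp.168.190", 2539) visits entries 0, 2, 0, 2 and then returns CHAIN_CORRUPT, whereas an unbounded loop over the same data never terminates. A step bound was chosen over pointer-chasing cycle detection because the table size is already known and the check costs one counter per lookup.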

Solution
