Bad Performance in a Web Cache Cluster when the Network Interface Card of One of the Members is Marked Down
(Doc ID 1931261.1)
Last updated on OCTOBER 02, 2023
Applies to:
Portal - Version 11.1.1.1.0 to 11.1.1.6.0 [Release FMW11g]
Web Cache - Version 11.1.1.0.0 to 11.1.1.7.0 [Release Oracle11g]
Information in this document applies to any platform.
Symptoms
In a high availability setup with multiple Portal midtiers, performance of invalidation requests degrades when the network interface card (NIC) of one of the Web Cache cluster members is no longer reachable on the network. When the NIC is made available on the network again, performance is restored immediately. Typical scenarios that can lead to the performance degradation are:
- A Web Cache cluster member has been shut down for maintenance
- The network cable was detached from the NIC on one of the Web Cache cluster members
- A Web Cache cluster member has crashed
The performance degradation can be demonstrated with the Web Cache Invalidation API. When all cluster members are up and running, invalidation requests through the API complete within milliseconds. When the NIC is no longer available, the same requests take seconds (0.01 seconds versus 5.69 seconds in the test results below).
Test results
Invalidation performance when the NIC is up:
<!DOCTYPE INVALIDATIONRESULT SYSTEM "http://www.oracle.com/webcache/90400/WCSinvalidation.dtd">
<INVALIDATIONRESULT VERSION="WCS-1.1">
<SYSTEM>
<SYSTEMINFO NAME="WCS_CACHE_NAME" VALUE="mt1.acme.org-webcache1"/>
<SYSTEMINFO NAME="WCS_NUM_OBJECT" VALUE="1"/>
</SYSTEM>
<OBJECTRESULT>
<BASICSELECTOR URI="http://servername:80/portal/page/portal/Design_Time_PG/Welcome" />
<RESULT ID="1" STATUS="SUCCESS" NUMINV="1"/>
</OBJECTRESULT>
</INVALIDATIONRESULT>
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
Invalidation performance when the NIC is down:
<!DOCTYPE INVALIDATIONRESULT SYSTEM "http://www.oracle.com/webcache/90400/WCSinvalidation.dtd">
<INVALIDATIONRESULT VERSION="WCS-1.1">
<SYSTEM>
<SYSTEMINFO NAME="WCS_CACHE_NAME" VALUE="servername-webcache1"/>
</SYSTEM>
<OBJECTRESULT>
<BASICSELECTOR URI=""/>
<RESULT ID="1" STATUS="Cannot connect to WebCache invalidation port"
NUMINV="0"/>
</OBJECTRESULT>
</INVALIDATIONRESULT>
</INVALIDATIONRESULTDETAIL>
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.69
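When the test is scripted, the XML response can be checked programmatically instead of by eye. The following is a minimal Python sketch and not part of the Web Cache toolkit: the function name summarize_invalidation and the sample constants are hypothetical, and the samples are copied from the test results above (the NIC-down sample omits the stray closing tag so that it parses as XML).

```python
import xml.etree.ElementTree as ET

# Responses copied from the test results in this note.
SAMPLE_NIC_UP = """
<!DOCTYPE INVALIDATIONRESULT SYSTEM "http://www.oracle.com/webcache/90400/WCSinvalidation.dtd">
<INVALIDATIONRESULT VERSION="WCS-1.1">
<SYSTEM>
<SYSTEMINFO NAME="WCS_CACHE_NAME" VALUE="mt1.acme.org-webcache1"/>
<SYSTEMINFO NAME="WCS_NUM_OBJECT" VALUE="1"/>
</SYSTEM>
<OBJECTRESULT>
<BASICSELECTOR URI="http://servername:80/portal/page/portal/Design_Time_PG/Welcome" />
<RESULT ID="1" STATUS="SUCCESS" NUMINV="1"/>
</OBJECTRESULT>
</INVALIDATIONRESULT>
"""

SAMPLE_NIC_DOWN = """
<!DOCTYPE INVALIDATIONRESULT SYSTEM "http://www.oracle.com/webcache/90400/WCSinvalidation.dtd">
<INVALIDATIONRESULT VERSION="WCS-1.1">
<SYSTEM>
<SYSTEMINFO NAME="WCS_CACHE_NAME" VALUE="servername-webcache1"/>
</SYSTEM>
<OBJECTRESULT>
<BASICSELECTOR URI=""/>
<RESULT ID="1" STATUS="Cannot connect to WebCache invalidation port"
NUMINV="0"/>
</OBJECTRESULT>
</INVALIDATIONRESULT>
"""


def summarize_invalidation(xml_text):
    """Return (cache_name, [(status, numinv), ...]) for one INVALIDATIONRESULT."""
    # The parser reads the DOCTYPE line but does not download the external DTD.
    root = ET.fromstring(xml_text.strip())
    cache_name = None
    for info in root.iter("SYSTEMINFO"):
        if info.get("NAME") == "WCS_CACHE_NAME":
            cache_name = info.get("VALUE")
    results = [(r.get("STATUS"), int(r.get("NUMINV", "0")))
               for r in root.iter("RESULT")]
    return cache_name, results


def all_succeeded(xml_text):
    """True only if every RESULT element reports STATUS="SUCCESS"."""
    _, results = summarize_invalidation(xml_text)
    return bool(results) and all(status == "SUCCESS" for status, _ in results)
```

For the two samples, all_succeeded returns True for the NIC-up response and False for the NIC-down response, where the status reports the connection failure and NUMINV is 0.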
Symptoms
- Performance spikes are seen in Oracle Portal
- Performance is not affected when one of the web cache processes is not available. The performance degradation is only seen when the NIC is down
- Other requests for content are still fast
- In the database, the wait event "TCP Socket (KGAS)" occurs
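The difference between a stopped Web Cache process and an unreachable NIC can also be observed outside the database with a plain TCP connection attempt to the invalidation port. The sketch below is an illustration only and rests on an assumption that this note does not confirm: that a reachable host without a listener rejects the connection at once, while an unreachable host leaves the attempt waiting until a timeout expires. The function name probe_port and the 3-second timeout are hypothetical choices.

```python
import socket
import time


def probe_port(host, port, timeout=3.0):
    """Attempt one TCP connect and return (outcome, elapsed_seconds).

    outcome is "open", "refused", "timeout", or "error: <reason>".
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "open"          # a listener accepted the connection
    except ConnectionRefusedError:
        outcome = "refused"           # host reachable, nothing listening
    except socket.timeout:
        outcome = "timeout"           # no answer within the timeout
    except OSError as exc:
        outcome = f"error: {exc}"     # e.g. no route to host, name lookup failure
    return outcome, time.monotonic() - start
```

Calling probe_port against the invalidation port of each cluster member from the database host shows which members answer or refuse quickly and which attempts only return after the timeout.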
Environment
The issue was tested in a six-node Portal high availability cluster with Fusion Middleware 11g R1 (11.1.1.7).
Test suite setup
- On the Portal midtier server, change directory to $ORACLE_HOME/webcache/toolkit. This directory contains the code for the Web Cache Invalidation API (wxvutil.sql)
- From this directory, start SQL*Plus and connect to the database, preferably as a dummy user or another account reserved for testing
- Connect as a database administrator and grant the test user the privileges to use the network layer from within the database:
DECLARE
BEGIN
  -- Only uncomment the following line if ACL "network_services.xml" has already been created
  --DBMS_NETWORK_ACL_ADMIN.DROP_ACL('network_services.xml');
  DBMS_NETWORK_ACL_ADMIN.CREATE_ACL(
    acl => 'network_services.xml',
    description => 'NETWORK ACL',
    principal => '<USERNAME>',
    is_grant => true,
    privilege => 'connect');
  DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE(
    acl => 'network_services.xml',
    principal => '<USERNAME>',
    is_grant => true,
    privilege => 'resolve');
  DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL(
    acl => 'network_services.xml',
    host => '*');
  COMMIT;
END;
/
(Replace <USERNAME> in the above code with the schema used for testing)
- Connect as the test user and run wxvutil.sql to create the Invalidation API
You are now set up to use the Invalidation API. The next step is to test the invalidation.
- Create a script file to invalidate the Portal Design Time page:
set serveroutput on
set timing on
exec wxvutil.invalidate_reset;
exec wxvutil.invalidate_uri('http://<servername>:<entry port>/portal/page/portal/Design_Time_PG/Welcome',0,null);
exec wxvutil.invalidate_exec('<servername>',<invalidation_port>,'<invalidation_password>');
(Replace the strings between <> with the values for your environment)
Note: If the invalidation password is unknown, use the instructions in section A2 of Note 1075472.1 - "XML Parsing Error: Syntax Error, Line Number 1, Column 1" and "Oracle-Auth-Token Does Not Match Error" When Accessing Portal 11g.
- Run the script file from the test account when all nodes are available. The output should be similar to this:
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
<?xml version="1.0"?>
<!DOCTYPE INVALIDATIONRESULT SYSTEM
"http://www.oracle.com/webcache/90400/WCSinvalidation.dtd">
<INVALIDATIONRESULT VERSION="WCS-1.1">
<SYSTEM>
<SYSTEMINFO NAME="WCS_CACHE_NAME" VALUE="servername-webcache1"/>
<SYSTEMINFO NAME="WCS_NUM_OBJECT" VALUE="1"/>
</SYSTEM>
<OBJECTRESULT>
<BASICSELECTOR URI="http://servername:80/portal/page/portal/Design_Time_PG/Welcome" />
<RESULT ID="1" STATUS="SUCCESS" NUMINV="1"/>
</OBJECTRESULT>
</INVALIDATIONRESULT>
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
- Repeat the same test with the NIC of one of the cluster members down. The reported elapsed time for the invalidation request will be higher.
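To compare the two runs without reading the SQL*Plus output by hand, the Elapsed lines can be extracted and converted to seconds. This is a minimal Python sketch; the function names and the 1-second threshold are hypothetical, and it assumes the output was captured as text with set timing on.

```python
import re

# Matches the timing lines printed by SQL*Plus, e.g. "Elapsed: 00:00:05.69".
ELAPSED_RE = re.compile(r"Elapsed:\s*(\d+):(\d+):(\d+(?:\.\d+)?)")


def elapsed_seconds(sqlplus_output):
    """Return every 'Elapsed: HH:MM:SS.ss' value in the text, in seconds."""
    return [int(h) * 3600 + int(m) * 60 + float(s)
            for h, m, s in ELAPSED_RE.findall(sqlplus_output)]


def slow_steps(sqlplus_output, threshold=1.0):
    """Return the elapsed values at or above the threshold (assumed: 1 second)."""
    return [t for t in elapsed_seconds(sqlplus_output) if t >= threshold]
```

Applied to the output of both runs, slow_steps should return an empty list while all nodes are available and the time of the invalidate_exec step once the NIC of a cluster member is down.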
Changes
Cause