Cannot Connect To Production Database On Exadata
(Doc ID 1520686.1)
Last updated on JUNE 05, 2019
Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Oracle Exadata Hardware - Version 11.2.0.2 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
Goal
The database frequently hangs, and clients are either unable to connect or take a long time to establish a connection.
The trace file from the cell node at the time of the problem shows the following:
---------------------------
/opt/oracle/cell11.2.2.3.5.0.DIAG13800389_LINUX.X64_110815/log/diag/asm/cell/exacel05/trace/svtrc_5033_*.trc
*** 2012-04-08 21:46:28.135
UserThread: LWPID: 5669 userId: 51 kernelId: 51 pthreadID: 0x62561940
SKGXP:[22ebc228.0]{0}: skgxpdocon: warning outstand accept handle count has
reached new high water mark 1000
SKGXP:[230fcaa8.1]{E}: (5033 <- 11376)SKGXPLOSTACK: message seqno: 32890
expected seqno: 32890 accept connection number 0x4cd9ead1
SKGXP:[230fcaa8.2]{E}: SKGXPLOSTACK: port 0x2304db20 msgsz=1048856 nreqs=50
free=0 hwm=0
SKGXP:[230fcaa8.3]{E}: SKGXP: WARN: send attempted to portwith no buffer <========
SKGXP:[230fcaa8.4]{E}: SKGXP: WARN: message seq 32890 expecting 32890
SKGXP:[230fcaa8.5]{E}: SKGXPGPID Internet address <IP Address> RDS port
number 16955
SKGXP:[230fcaa8.10]{{E}}: SSKGXP_PERFORMZCPY: sendmsg failed: (errno 105) - <=========
dest [<IP Address>:49739 - fd 203] RD rval 4
Trace file
/opt/oracle/cell11.2.2.3.5.0.DIAG13800389_LINUX.X64_110815/log/diag/asm/cell/exacel05/trace/svtrc_5033_*.trc
*** 2012-04-08 21:46:28.142
UserThread: LWPID: 5716 userId: 97 kernelId: 97 pthreadID: 0x7f18f940
SKGXP:[230fcaa8.7]{{E}}: SSKGXP_PERFORMZCPY: sendmsg failed: (errno 105) - <=======
dest [<IP Address>:49739 - fd 203] RD rval 4
SKGXP:[230fcaa8.8]{{E}}: SSKGXP_PERFORMZCPY: sendmsg failed: (errno 105) - <=======
dest [<IP Address>:49739 - fd 203] RD rval 4
------------------------------------------------
The trace suggests that a buffer sizing problem or an RDS problem occurred. It shows many errno 105 (ENOBUFS, no buffer space available) failures, which indicate congestion on both
the receive and send ports. Congestion on the receive port (rdma_pull stalled due to congestion) can cause a stall.
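To gauge how widespread the errno 105 failures are, the cell trace text can be scanned for the two messages highlighted above. The following is a minimal sketch, not an Oracle-supplied tool: the function name summarize_trace, the regular expressions, and the SAMPLE text (copied from the excerpt above) are assumptions for illustration, and the patterns assume the message wording shown in this note.

```python
import re
from collections import Counter

# Assumed pattern for lines such as:
#   SSKGXP_PERFORMZCPY: sendmsg failed: (errno 105) -
SENDMSG_FAILED = re.compile(r"sendmsg failed:\s*\(errno\s+(\d+)\)")

# Assumed pattern for the accept-handle warning. \s+ lets the message
# wrap across lines, as it does in the excerpt in this note.
HIGH_WATER_MARK = re.compile(
    r"outstand accept handle count has\s+reached new high water mark\s+(\d+)"
)


def summarize_trace(text):
    """Return (Counter mapping errno -> occurrences, list of high water marks)."""
    errno_counts = Counter(
        int(match.group(1)) for match in SENDMSG_FAILED.finditer(text)
    )
    high_water_marks = [
        int(match.group(1)) for match in HIGH_WATER_MARK.finditer(text)
    ]
    return errno_counts, high_water_marks


# Illustrative input taken from the trace excerpt in this note.
SAMPLE = """\
SKGXP:[22ebc228.0]{0}: skgxpdocon: warning outstand accept handle count has
reached new high water mark 1000
SKGXP:[230fcaa8.10]{{E}}: SSKGXP_PERFORMZCPY: sendmsg failed: (errno 105) -
dest [<IP Address>:49739 - fd 203] RD rval 4
SKGXP:[230fcaa8.7]{{E}}: SSKGXP_PERFORMZCPY: sendmsg failed: (errno 105) -
dest [<IP Address>:49739 - fd 203] RD rval 4
"""

if __name__ == "__main__":
    counts, marks = summarize_trace(SAMPLE)
    for err, count in sorted(counts.items()):
        print(f"errno {err}: {count} sendmsg failure(s)")
    print("accept handle high water marks:", marks)
```

To use it on a real trace, read the file contents into a string and pass that string to summarize_trace in place of SAMPLE. A large count for errno 105 concentrated in a short time window would be consistent with the congestion described above.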
Solution