RP/MSQ 4.0A(VMS) -RP35 - SBS server accvios in AVLtree_find
(Doc ID 776241.1)
Last updated on NOVEMBER 27, 2023
Applies to:
Oracle MessageQ - Version 4.0 and later
Information in this document applies to any platform.
Goal
Product: MessageQ, V4.0A-RP35
Component: SBS server
Operating system: OpenVMS Alpha V7.3-1

PROBLEM DESCRIPTION

The SBS server is failing with access violations (ACCVIO) on the customer's production system. To date, three crashes have been reported on different nodes.

SBS Server Access Violation:

$ SET NOVERIFY
%DMQ-S-SETLNM, Set to MessageQ LNM table DMQ$LNM_5001_14011

(LNM$PROCESS_TABLE)

  "DMQ$ENTRYRTL" = "DMQ$EXE:DMQ$<CODE>.EXE"
  "DMQ$EXECRTL" = "DMQ$EXE:DMQ$<CODE>.EXE"
  "DMQ$GROUP_OUTPUT" = "EVL_LOG"
          = "CONSOLE"
  "DMQ$PROCESS_OUTPUT" = "SYSOUT"
  "DMQ$PSSRTL" = "DMQ$EXE:DMQ$<CODE>.EXE"
  "DMQ$TRACE_OUTPUT" = "SYSOUT"

(DMQ$LNM_5001_14011)

  "DMQ$ACCESS" = "$1$DGA0:[DMQ$V40.USER.5001_14011]"
  "DMQ$BUS_GROUP" = "5001_14011"
  "DMQ$CHKPT_FILE" = "DMQ$USER:DMQ$<CODE>.DAT"
  "DMQ$COM_SERVER_UP" = "YES"
  "DMQ$DISK" = "DISK$AXPVMS73:"
  "DMQ$DOC" = "DMQ$DISK:[DMQ$V40.DOC]"
  "DMQ$ENTRYRTL" = "DMQ$EXE:DMQ$<CODE>.EXE"
  "DMQ$EVENT_LOGGER_MBX" = "MBA91:"
  "DMQ$EXAMPLES" = "DMQ$DISK:[DMQ$V40.EXAMPLES]"
  "DMQ$EXE" = "DMQ$DISK:[DMQ$V40.EXE]"
  "DMQ$EXECRTL" = "DMQ$EXE:DMQ$<CODE>.EXE"
  "DMQ$INIT_FILE" = "DMQ$USER:DMQ$INIT.TXT"
  "DMQ$LIB" = "DMQ$DISK:[DMQ$V40.LIB]"
  "DMQ$LOG" = "DMQ$DISK:[DMQ$V40.LOG.5001_14011]"
  "DMQ$MRS" = "DMQ$MRS_DISK:[DMQ$V40.MRS.5001_14011]"
  "DMQ$MSGSHR" = "DMQ$EXE:DMQ$MSGSHRV40.EXE"
  "DMQ$PSSRTL" = "DMQ$EXE:DMQ$<CODE>.EXE"
  "DMQ$ROOT" = "DMQ$V40"
  "DMQ$SET_LNM" = "DISK$AXPVMS73:[DMQ$V40.EXE]DMQ$SET_LNM_TABLEV40.EXE"
  "DMQ$TCPIP_LD" = "DEC"
  "DMQ$TERMINATION_MBX" = "MBA92:"
  "DMQ$USER" = "DMQ$DISK:[DMQ$V40.USER.5001_14011]"
  "DMQ$VERSION" = "V4.0A-111(RP35)"

(LNM$JOB_81AF9F00)

(LNM$GROUP_000001)

(LNM$SYSTEM_TABLE)

  "DMQ$DISK" = "SYS$SYSDEVICE"
  "DMQ$EXE" = "DMQ$DISK:[DMQ$V40.EXE]"
  "DMQ$MRS_DISK" = "DISK11"

(LNM$SYSCLUSTER_TABLE)

(DECW$LOGICAL_NAMES)

15-NOV-2004 18:47:09.64   User: <SYSTEM_USER>   Process ID: <PID>
                          Node: <NODE>   Process name: "DMQ_S_500114011"

Terminal:
User Identifier: [<SYSTEM_USER>]
Base priority: 9
Default file spec: SYS$SYSROOT:[SYSMGR]
Number of Kthreads: 1

Process Quotas:
  Account name: <SYSTEM_USER>
  CPU limit: Infinite                       Direct I/O limit: 100
  Buffered I/O byte count quota: 698464     Buffered I/O limit: 100
  Timer queue entry quota: 499              Open file quota: 497
  Paging file quota: 394688                 Subprocess quota: 10
  Default page fault cluster: 64            AST quota: 499
  Enqueue quota: 375                        Shared file limit: 0
  Max detached processes: 0                 Max active jobs: 0

Accounting information:
  Buffered I/O count: 57      Peak working set size: 2512
  Direct I/O count: 19        Peak virtual size: 185984
  Page faults: 259            Mounted volumes: 0
  Images activated: 3
  Elapsed CPU time: 0 00:00:00.07
  Connect time: 0 00:00:00.13

Authorized privileges:
  CMKRNL EXQUOTA NETMBX OPER SYSGBL SYSLCK SYSNAM SYSPRV TMPMBX WORLD

Process privileges:
  CMKRNL   may change mode to kernel
  EXQUOTA  may exceed disk quota
  NETMBX   may create network device
  OPER     may perform operator functions
  SYSGBL   may create system wide global sections
  SYSLCK   may lock system wide resources
  SYSNAM   may insert in system logical name table
  SYSPRV   may access objects via system protection
  TMPMBX   may create temporary mailbox
  WORLD    may affect other processes in the world

Process rights:
  SYSTEM resource
  BATCH
  NET$MANAGE

System rights:
  SYS$<NODE>

Auto-unshelve: on
Image Dump: off
Soft CPU Affinity: off
Parse Style: Traditional
Case Lookup: Blind
Home RAD: 0
Scheduling class name: none

Process Dynamic Memory Area
  Current Size (KB) 128.00         Current Size (Pagelets) 256
  Free Space (KB) 113.15           Space in Use (KB) 14.84
  Largest Var Block (KB) 112.57    Smallest Var Block (By) 16.00
  Number of Free Blocks 9          Free Blocks LEQU 64 bytes 5

There is 1 process in this job:
  DMQ_S_500114011 (*)

------------------------------------------------
~~~~~~~~~~~~ System Description ~~~~~~~~~~~~~~~~
------------------------------------------------
System name: <HOST_NAME>
Harware type: AlphaServer 4100 5/533 4MB
Software type: OpenVMS Alpha V7.3-1
Physical memory: 6291456 pagelets (3072Mb)
CPUs (total/active): 4/4
Cluster: Yes, 2 nodes
Global pages free: 4623520
Global sections free: 1827
Pagefile free: 249984
Global page file: 300000
Bug check fatal: FALSE
Virtual page count: 2147483647
------------------------------------------------

$ fcmd := $DMQ$EXE:DMQ$SBS_SERVER.EXE
$ fcmd
Copyright ) BEA Systems, Inc. 1998. All rights reserved.
DMQ Server starting...
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=0000000000000000, PC=000000000023020C, PS=0000001B
%TRACE-F-TRACEBACK, symbolic stack dump follows
image           module          routine                  line   rel PC            abs PC
DMQ$SBS_SERVER  SBS_CPI_SUBS    sbs_cpi_avail_compare    <PID>  000000000000926C  000000000023020C
DMQ$SBS_SERVER  AVL_TREE        AVLtree_find             <PID>  00000000000009B0  000000000026E5C0
DMQ$SBS_SERVER  SBS_CPI_AVAIL   sbs_cpi_add_avail_list   <PID>  000000000000034C  000000000024C42C
DMQ$SBS_SERVER  SBS_CPI_MSGS    handle_msg_event_reg     <PID>  0000000000015850  0000000000246300
DMQ$SBS_SERVER  SBS_CPI_MSGS    sbs_cpi_handle_sbs_msg   <PID>  0000000000000B7C  000000000023162C
DMQ$SBS_SERVER  DMQ$SBS_SERVER  main                     <PID>  00000000000008FC  00000000002208FC
DMQ$SBS_SERVER  DMQ$SBS_SERVER  __main                   0      000000000000006C  000000000022006C
                                                         0      FFFFFFFF8028B63C  FFFFFFFF8028B63C
DmQ I 19:57.7 Time Stamp - 18-JAN-2005 07:19:57.72
DmQ I 19:57.7 MSGPURGED, The MessageQ exit handler has purged 2 incoming messages
SYSTEM job terminated at 18-JAN-2005 07:19:57.76

Accounting information:
  Buffered I/O count: 114     Peak working set size: 8064
  Direct I/O count: 54        Peak virtual size: 542512
  Page faults: 790            Mounted volumes: 0

---------------

Other stack traces (all around the AVLtree_find area) are:

DMQ Server starting...
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=000000005336C601, PC=000000000023020C, PS=0000001B
%TRACE-F-TRACEBACK, symbolic stack dump follows
image           module          routine                  line   rel PC            abs PC
DMQ$SBS_SERVER  SBS_CPI_SUBS    sbs_cpi_avail_compare    28880  000000000000926C  000000000023020C
DMQ$SBS_SERVER  AVL_TREE        AVLtree_find             8823   00000000000009B0  000000000026E5C0
DMQ$SBS_SERVER  SBS_CPI_AVAIL   sbs_cpi_add_avail_list   25910  000000000000034C  000000000024C42C
DMQ$SBS_SERVER  SBS_CPI_MSGS    handle_msg_avail_reg     33403  0000000000013A84  0000000000244534
DMQ$SBS_SERVER  SBS_CPI_MSGS    sbs_cpi_handle_sbs_msg   27853  00000000000006FC  00000000002311AC
DMQ$SBS_SERVER  DMQ$SBS_SERVER  main                     26282  00000000000008FC  00000000002208FC
DMQ$SBS_SERVER  DMQ$SBS_SERVER  __main                   0      000000000000006C  000000000022006C
                                                         0      FFFFFFFF8028B63C  FFFFFFFF8028B63C
DmQ I 03:47.9 Time Stamp - 18-JAN-2005 08:03:47.94
DmQ I 03:47.9 MSGPURGED, The MessageQ exit handler has purged 3 incoming messages
SYSTEM job terminated at 18-JAN-2005 08:03:47.98

Accounting information:
  Buffered I/O count: 114     Peak working set size: 6608
  Direct I/O count: 54        Peak virtual size: 917184
  Page faults: 1000           Mounted volumes: 0

-----------------------

Copyright ) BEA Systems, Inc. 1998. All rights reserved.
DMQ Server starting...
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=0000000000000000, PC=000000000023013C, PS=0000001B
%TRACE-F-TRACEBACK, symbolic stack dump follows
image           module          routine                      line   rel PC            abs PC
DMQ$SBS_SERVER  SBS_CPI_SUBS    sbs_cpi_compare_by_process   28789  000000000000919C  000000000023013C
DMQ$SBS_SERVER  AVL_TREE        AVLtree_find                 8823   00000000000009B0  000000000026E5C0
DMQ$SBS_SERVER  SBS_CPI_SUBS    sbs_cpi_fnd_process_entry    26084  000000000000094C  00000000002278EC
DMQ$SBS_SERVER  SBS_PPI_SUBS    sbs_ppi_msg_gen              25789  00000000000003F8  00000000002591E8
DMQ$SBS_SERVER  SBS_CPI_MSGS    msg_gen_go                   32348  000000000000F14C  000000000023FBFC
DMQ$SBS_SERVER  SBS_CPI_MSGS    sbs_cpi_handle_mot_msg       31732  000000000000D7E0  000000000023E290
DMQ$SBS_SERVER  DMQ$SBS_SERVER  main                         26330  0000000000000B2C  0000000000220B2C
DMQ$SBS_SERVER  DMQ$SBS_SERVER  __main                       0      000000000000006C  000000000022006C
                                                             0      FFFFFFFF8028B63C  FFFFFFFF8028B63C
DmQ I 22:59.0 Time Stamp - 19-JAN-2005 07:22:59.07
DmQ I 22:59.0 MSGPURGED, The MessageQ exit handler has purged 1 incoming message
<MSG_ID> job terminated at 19-JAN-2005 07:22:59.12

Accounting information:
  Buffered I/O count: 125     Peak working set size: 34816
  Direct I/O count: 201       Peak virtual size: 542384
  Page faults: 23449          Mounted volumes: 0
  Charged CPU time: 0 00:00:08.84   Elapsed time: 0 22:40:55.01

----------------------

Notes:
------
1) The latest kit is RP68; the customer has been made aware of this.
2) An SBS accvio was fixed in RP49, but it is not related to this problem.
3) I have asked the customer what has changed in the system/environment leading to these crashes, but they have not yet been able to provide any information.
4) Customer indicates that "Tracing was turned on for the SBS server on the <NODE> node only (the application runs only on <NODE>). The application was then failed over from the <NODE2> cluster to the <NODE2> cluster. We will run the application on <NODE> until the SBS server accvio's again".
5) Hardened the code around the AVLtree_find area so that even if the tree becomes corrupted, the SBS server doesn't accvio.
6) Also considering adding some debug statements to give more information when the SBS server fails.
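
The hardening described in note 5 might look roughly like the sketch below. This is an illustration only: the actual MessageQ AVL_TREE source is not shown in this note, so the node layout, avl_find_checked, avl_compare_int, and AVL_MAX_DEPTH are all hypothetical names. The idea is to check each pointer before handing it to the compare callback and to cap the descent depth, so that a damaged tree yields an error status instead of an access violation inside the compare routine.

```c
#include <stddef.h>

/* Hypothetical node layout; the real MessageQ AVL_TREE structures are
 * not shown in the note, so this is an assumption for illustration. */
typedef struct avl_node {
    struct avl_node *left;
    struct avl_node *right;
    void *data;
} avl_node;

/* Returns <0, 0, >0 like strcmp: key compared against stored data. */
typedef int (*avl_compare_fn)(const void *key, const void *data);

typedef enum { AVL_FOUND, AVL_NOT_FOUND, AVL_CORRUPT } avl_status;

/* A balanced AVL tree is far shallower than this for any realistic node
 * count, so exceeding the cap suggests a cycle or a broken link. */
#define AVL_MAX_DEPTH 64

/* Example comparator for int keys (illustrative only). */
int avl_compare_int(const void *key, const void *data)
{
    int k = *(const int *)key;
    int d = *(const int *)data;
    return (k > d) - (k < d);
}

/* Search that reports corruption instead of dereferencing bad state. */
avl_status avl_find_checked(const avl_node *root, const void *key,
                            avl_compare_fn compare, void **out)
{
    const avl_node *node = root;
    int depth = 0;

    if (out != NULL)
        *out = NULL;
    if (key == NULL || compare == NULL)
        return AVL_CORRUPT;

    while (node != NULL) {
        if (++depth > AVL_MAX_DEPTH)
            return AVL_CORRUPT;   /* runaway descent: likely a cycle */
        if (node->data == NULL)
            return AVL_CORRUPT;   /* never pass NULL to the comparator */

        int c = compare(key, node->data);
        if (c == 0) {
            if (out != NULL)
                *out = node->data;
            return AVL_FOUND;
        }
        node = (c < 0) ? node->left : node->right;
    }
    return AVL_NOT_FOUND;
}
```

A caller could log the AVL_CORRUPT status (in the spirit of note 6) and skip the request rather than crash. Checks of this kind only guard against NULL pointers and cycles; a non-NULL pointer to invalid memory, such as the virtual address reported in the second traceback, cannot be detected by pointer comparison alone.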
Solution