
Cell reboot due to OOM while patching to 20.1 after flash replacement (Doc ID 2766322.1)

Last updated on JUNE 08, 2021

Applies to:

Oracle Database - Enterprise Edition - Version 20.3.0.0.0 Preview and later
Exadata X6-2 Hardware - Version All Versions and later
Exadata X6-8 Hardware - Version All Versions and later
Zero Data Loss Recovery Appliance X6 Hardware - Version All Versions and later
Oracle SuperCluster M6-32 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms

While upgrading cells from the 19.3 image to the 20.1 image, after the first few cells had been patched,
one of the cells rebooted because it ran out of memory (OOM).

Flash had recently been replaced on that cell.
Aura6 flash replaced the existing Aura3 flash.
The Aura6 flash size is different from the Aura3 flash size.

Starting with 20.1, the hugepages size is calculated based on the flash size.

Working:


Memory reserved for cellsrv: 125503 MB Memory for other processes: 2900 MB
_cell_fc_persistence_state=WriteBack (ossp_conf=2)
Adjusted fc_columnar_size_limit 131072
Configured Scratch Buffer size 36MB
FLASH: Reserve hugepage memory 43224 MB
                               ^^^^^^^^
Successfully allocated 3328 MB for Storage Index. Storage Index memory usage

 

 

Non-working:


Memory reserved for cellsrv: 125503 MB Memory for other processes: 2900 MB
_cell_fc_persistence_state=WriteBack (ossp_conf=2)
Adjusted fc_columnar_size_limit 131072
Configured Scratch Buffer size 36MB
FLASH: Reserve hugepage memory 108066 MB
                               ^^^^^^^^^
Successfully allocated 3328 MB for Storage Index. Storage Index memory usage
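
The two log excerpts above can be compared with a short script. The following is only a minimal sketch: the function name hugepage_share is hypothetical, the regular expressions are assumptions based solely on the two log lines shown in this note, and the comparison of the hugepage reservation against "Memory reserved for cellsrv" is an illustration rather than the check cellsrv itself performs.

```python
import re

def hugepage_share(log_text):
    """Parse a cellsrv startup log excerpt.

    Returns (hugepage_mb, cellsrv_mb, share), where share is the hugepage
    reservation divided by the memory reserved for cellsrv.
    The line formats are assumed from the excerpts in this note.
    """
    reserved = re.search(r"Memory reserved for cellsrv:\s*(\d+)\s*MB", log_text)
    huge = re.search(r"FLASH: Reserve hugepage memory\s+(\d+)\s*MB", log_text)
    if reserved is None or huge is None:
        raise ValueError("expected log lines not found")
    cellsrv_mb = int(reserved.group(1))
    hugepage_mb = int(huge.group(1))
    return hugepage_mb, cellsrv_mb, hugepage_mb / cellsrv_mb

# Lines copied from the working and non-working excerpts above.
working = (
    "Memory reserved for cellsrv: 125503 MB Memory for other processes: 2900 MB\n"
    "FLASH: Reserve hugepage memory 43224 MB\n"
)
non_working = (
    "Memory reserved for cellsrv: 125503 MB Memory for other processes: 2900 MB\n"
    "FLASH: Reserve hugepage memory 108066 MB\n"
)

for label, text in (("working", working), ("non-working", non_working)):
    huge_mb, total_mb, share = hugepage_share(text)
    print(f"{label}: {huge_mb} MB of {total_mb} MB ({share:.0%})")
```

With the values shown in this note, the working cell reserves roughly 34% of the cellsrv memory as hugepages and the non-working cell roughly 86%.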

[ 85.122766] MST:: : get_space_support_status 441: Device 0x1003
(0:b0:0.0) doesn't support CR_SPACE capability.
[ 87.788387] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 97.564977] RDS/IB: Active conn ffff96433cfee000 i_cm_id
ffff963ecb99d000, frag 16KB, connected
<::ffff:192.168.10.40,::ffff:192.168.10.40,0> version 4.1
[ 97.565892] RDS/IB: Passive conn ffff96415efc4000 i_cm_id
ffff96454b705c00, frag 16KB, connected
<::ffff:192.168.10.40,::ffff:192.168.10.40,0> version 4.1
[ 98.588103] RDS/IB: Active conn ffff963c87632000 i_cm_id
ffff964b5c03f400, frag 16KB, connected
<::ffff:192.168.10.39,::ffff:192.168.10.39,0> version 4.1
[ 98.588879] RDS/IB: Passive conn ffff96427c602000 i_cm_id
ffff96454b7d6800, frag 16KB, connected
<::ffff:192.168.10.39,::ffff:192.168.10.39,0> version 4.1
[ 113.195939] cellrssrm invoked oom-killer:
gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null),
order=0, oom_score_adj=0
[ 113.195942] cellrssrm cpuset=/ mems_allowed=0
[ 113.195948] CPU: 9 PID: 39876 Comm: cellrssrm Tainted: G O
4.14.35-1902.306.2.2.el7uek.x86_64 #2
[ 113.195949] Hardware name: Oracle Corporation ORACLE SERVER X6-2L/ASM,MOBO
TRAY,2U, BIOS 39320100 04/15/2020
[ 113.195950] Call Trace:
[ 113.195959] dump_stack+0x63/0x7f
[ 113.195965] dump_header+0x9f/0x233
[ 113.195967] ? get_page_from_freelist+0x11d/0xadd
[ 113.195970] out_of_memory+0x450/0x499
[ 113.195972] __alloc_pages_slowpath+0x946/0xb95
[ 113.195974] __alloc_pages_nodemask+0x2b1/0x2fc
[ 113.195977] ? __radix_tree_lookup+0x84/0xef
[ 113.195981] alloc_pages_current+0x6a/0xb0
[ 113.195983] __page_cache_alloc+0x85/0x8e
[ 113.195985] filemap_fault+0x402/0x7bb
[ 113.195990] ? page_add_file_rmap+0x127/0x176
[ 113.195992] ? filemap_map_pages+0x187/0x410
[ 113.196018] ext4_filemap_fault+0x31/0x50 [ext4]
[ 113.196020] __do_fault+0x24/0x75
[ 113.196022] __handle_mm_fault+0xcab/0xf1e
[ 113.196024] handle_mm_fault+0xcc/0x1d5
[ 113.196027] __do_page_fault+0x264/0x519
[ 113.196030] do_page_fault+0x38/0x150
[ 113.196032] ? page_fault+0x137/0x152
[ 113.196034] page_fault+0x14d/0x152
[ 113.196035] RIP: 0033:0x7fe2216c4119
[ 113.196036] RSP: 002b:00007ffcb8453518 EFLAGS: 00010246
[ 113.196038] RAX: eb4ad0b6d4ce0689 RBX: 0000000027c561cf RCX:
0000000000000000
[ 113.196039] RDX: 0000000000000000 RSI: 00007fe2218deaac RDI:
0000000000000000
[ 113.196039] RBP: 00007ffcb8453520 R08: 0000000000000000 R09:
0000000000000000
[ 113.196040] R10: 00007ffcb84535a0 R11: 0000000000000000 R12:
0000000001d99ea0
[ 113.196041] R13: 0000000001dc6300 R14: 0000000000000006 R15:
00000000123e93f8
[ 113.196043] Mem-Info:
[ 113.196048] active_anon:36479 inactive_anon:70203 isolated_anon:0
active_file:260 inactive_file:172 isolated_file:0
unevictable:3326 dirty:2 writeback:0 unstable:0
slab_reclaimable:11802 slab_unreclaimable:37461
mapped:36374 shmem:36746 pagetables:3897 bounce:0

File name: vmcore-dmesg.txt
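
To confirm the same symptom on another cell, the dmesg output can be scanned for the oom-killer message. This is a minimal sketch under stated assumptions: the helper name find_oom_events is hypothetical, and the pattern assumes the "[ timestamp] process invoked oom-killer" layout seen in the excerpt above.

```python
import re

# Matches lines such as "[ 113.195939] cellrssrm invoked oom-killer:".
# The layout is assumed from the vmcore-dmesg.txt excerpt in this note.
OOM_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\S+) invoked oom-killer", re.MULTILINE)

def find_oom_events(dmesg_text):
    """Return a list of (seconds_since_boot, process_name) for each OOM invocation."""
    return [(float(ts), comm) for ts, comm in OOM_RE.findall(dmesg_text)]

# Two lines copied from the excerpt above.
sample = (
    "[ 97.564977] RDS/IB: Active conn ffff96433cfee000 i_cm_id\n"
    "[ 113.195939] cellrssrm invoked oom-killer:\n"
)

for seconds, process in find_oom_events(sample):
    print(f"oom-killer invoked by {process} at {seconds} seconds after boot")
```

In the excerpt above the oom-killer fires about 113 seconds after boot, shortly after cellsrv startup.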

 

 

Cause


