[Linux OS] System Hung with Large Numbers of Page Allocation Failures with "order:5" on Exadata Environments
Last updated on MAY 04, 2017
Applies to: Linux OS - Version Oracle Linux 5.5 and later
This is commonly seen on Exadata platforms that use Infiniband for cluster communications.
The system is under memory pressure. Exadata Infiniband using IPoIB has an MTU of 65520 bytes. This MTU plus overhead exceeds 64 KB, so the memory allocation for an IP-based packet is rounded up to 32 contiguous 4 KB pages (order 5, i.e. 2^5 pages or 128 KB). When the system is under memory pressure, memory becomes fragmented and it can be very difficult to find 32 contiguous free pages. Note that the same failure can also occur with lower-order allocation requests.
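The order arithmetic above can be checked with a short sketch. The 4 KB page size and the 65520-byte MTU come from the text; the overhead value is a hypothetical placeholder chosen only for illustration (the real per-packet overhead depends on the kernel version), and `allocation_order` is an illustrative helper, not a kernel function.

```python
PAGE_SIZE = 4096  # 4 KB pages, as stated in the text


def allocation_order(size_bytes, page_size=PAGE_SIZE):
    """Return the smallest order n such that page_size * 2**n >= size_bytes."""
    pages = -(-size_bytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


IPOIB_MTU = 65520       # from the text
ASSUMED_OVERHEAD = 512  # hypothetical value; actual overhead varies by kernel

order = allocation_order(IPOIB_MTU + ASSUMED_OVERHEAD)
print("order:", order, "contiguous pages:", 1 << order)
```

The MTU alone would just fit in 16 pages (order 4); any overhead larger than 16 bytes pushes the request past 64 KB and into the next power of two, which is order 5.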
/var/log/messages or the serial console will show numerous "page allocation failure" messages, each with an associated call trace through the network code (*sock*, *tcp*, etc.). The key entry in the trace is the __alloc_skb() call, as this is the call where the memory allocation is being attempted.
Along with the "page allocation failure" messages in the system log, the server may hang (become unresponsive or freeze for any action), and sometimes the server reboots after these failures.
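To gauge how often these failures occur, the log can be scanned for the message text. The following is a minimal sketch, assuming that the "page allocation failure" text and the "order:N" field appear on the same log line; the function names are illustrative, and the exact message layout may differ between kernel versions.

```python
import re
from collections import Counter

# Assumption: "page allocation failure" and "order:N" appear on one line.
FAILURE_RE = re.compile(r"page allocation failure.*?order:\s*(\d+)")


def count_failures_by_order(lines):
    """Count page allocation failure messages, grouped by allocation order."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def scan_log(path="/var/log/messages"):
    """Scan a log file (reading it usually requires root) and return counts."""
    with open(path, errors="replace") as log:
        return count_failures_by_order(log)
```

Calling `scan_log()` returns a mapping from order to number of failures; a large count under key 5 matches the symptom described in this note.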