Oracle Linux: System Crashes with "pcieport PCIe Bus Error: AER / bnxt_en [ n] Bad TLP" Messages in Dmesg (Doc ID 2645910.1)

Last updated on MARCH 14, 2020

Applies to:

Linux OS - Version Oracle Linux 7.7 and later
Linux x86-64

Symptoms

Before the system panics, the following messages appear in the kernel log recovered from the vmcore (crash dump):

[308002.015648] bnxt_en 0000:18:00.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1801(Transmitter ID)
[308002.015648] bnxt_en 0000:18:00.1: device [14e4:16d9] error status/mask=000010c0/00000000
[308002.015649] bnxt_en 0000:18:00.1: [ 6] Bad TLP                 
[308002.015650] bnxt_en 0000:18:00.1: [ 7] Bad DLLP               
[308002.015651] bnxt_en 0000:18:00.1: [12] Replay Timer Timeout
[308002.015656] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[313876.144532] pcieport 0000:17:00.0: AER: Multiple Corrected error received: id=1700
[313876.144539] pcieport 0000:17:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1700(Transmitter ID) <<<<<<<<<
[313876.144541] pcieport 0000:17:00.0: device [8086:2030] error status/mask=00001100/00000000
[313876.144543] pcieport 0000:17:00.0: [ 8] RELAY_NUM Rollover
[313876.144544] pcieport 0000:17:00.0: [12] Replay Timer Timeout
[313876.144545] pcieport 0000:17:00.0: Error of this Agent(1700) is reported first
[313876.144547] bnxt_en 0000:18:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1800(Transmitter ID)
[313876.144548] bnxt_en 0000:18:00.0: device [14e4:16d9] error status/mask=00001000/00000000
[313876.144549] bnxt_en 0000:18:00.0: [12] Replay Timer Timeout
[313876.144551] bnxt_en 0000:18:00.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1801(Transmitter ID)
[313876.144552] bnxt_en 0000:18:00.1: device [14e4:16d9] error status/mask=00001000/00000000
[313876.144553] bnxt_en 0000:18:00.1: [12] Replay Timer Timeout
[313876.144558] pcieport 0000:17:00.0: AER: Multiple Corrected error received: id=1700
[313876.814380] pcieport 0000:17:00.0: AER: Multiple Corrected error received: id=1800
[313876.814388] bnxt_en 0000:18:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1800(Transmitter ID)
[313876.814390] bnxt_en 0000:18:00.0: device [14e4:16d9] error status/mask=00001100/00000000
[313876.814392] bnxt_en 0000:18:00.0: [ 8] RELAY_NUM Rollover
[313876.814393] bnxt_en 0000:18:00.0: [12] Replay Timer Timeout
[313876.814393] bnxt_en 0000:18:00.0: Error of this Agent(1800) is reported first
[313876.814396] bnxt_en 0000:18:00.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1801(Transmitter ID)
[313876.814397] bnxt_en 0000:18:00.1: device [14e4:16d9] error status/mask=00001100/00000000
[313876.814398] bnxt_en 0000:18:00.1: [ 8] RELAY_NUM Rollover
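
The bracketed numbers in the messages above are bit positions in the PCIe AER Correctable Error Status register, and the id= field is a 16-bit bus/device/function identifier. The following Python sketch is an illustration only and is not part of the original note. It shows how a value such as status 000010c0 maps to the reported errors and how id=1801 maps to 18:00.1. The function names are hypothetical, and the bit names follow the PCIe specification, so bit 8 appears as REPLAY_NUM Rollover where this kernel log prints RELAY_NUM Rollover.

```python
# Bit positions defined for the PCIe AER Correctable Error Status register.
AER_CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_status(status):
    """Return (bit, name) pairs for every bit set in the status word."""
    return [
        (bit, AER_CORRECTABLE_BITS.get(bit, "Unknown"))
        for bit in range(32)
        if status & (1 << bit)
    ]


def decode_agent_id(agent_id):
    """Split a 16-bit id into bus (8 bits), device (5 bits), function (3 bits)."""
    bus = agent_id >> 8
    device = (agent_id >> 3) & 0x1F
    function = agent_id & 0x7
    return f"{bus:02x}:{device:02x}.{function}"


if __name__ == "__main__":
    # Values taken from the first bnxt_en message in the log above.
    for bit, name in decode_status(0x000010C0):
        print(f"[{bit:2d}] {name}")
    print(decode_agent_id(0x1801))
```

Applied to the root port message with status 00001100, the same function reports bits 8 and 12, which matches the two lines printed for pcieport 0000:17:00.0.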

// Kernel Panic:

[354601.207239] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[354601.207251] IP: bnxt_tx_int+0xe5/0x330 [bnxt_en]
[354601.207252] PGD 0 P4D 0
[354601.207255] Oops: 0000 [#1] SMP NOPTI
[354601.207256] Modules linked in: macsec sctp_diag sctp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag fuse macvtap tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache macvlan vfat fat rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp kvm_intel bnxt_re kvm ipmi_ssif cdc_ether usbnet ib_core mii irqbypass crct10dif_pclmul crc32_pclmul
[354601.207298] ghash_clmulni_intel pcbc sg aesni_intel crypto_simd glue_helper cryptd pcspkr ipmi_si i2c_i801 lpc_ich ipmi_devintf wmi ipmi_msghandler ioatdma shpchp pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace
[354601.207310] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207312] sunrpc binfmt_misc
[354601.207314] bnxt_en 0000:18:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1800(Receiver ID)
[354601.207315] ip_tables xfs libcrc32c
[354601.207318] bnxt_en 0000:18:00.0: device [14e4:16d9] error status/mask=00000040/00000000
[354601.207319] uas usb_storage sd_mod mgag200 drm_kms_helper
[354601.207323] bnxt_en 0000:18:00.0: [ 6] Bad TLP
[354601.207323] syscopyarea sysfillrect sysimgblt fb_sys_fops ttm igb drm ahci megaraid_sas libahci
[354601.207329] pcieport 0000:17:00.0: AER: Corrected error received: id=1700
[354601.207329] crc32c_intel libata ptp bnxt_en pps_core
[354601.207332] pcieport 0000:17:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1700(Transmitter ID)
[354601.207333] dca i2c_algo_bit dm_mirror
[354601.207335] pcieport 0000:17:00.0: device [8086:2030] error status/mask=00001000/00000000
[354601.207336] pcieport 0000:17:00.0: [12] Replay Timer Timeout
[354601.207338] dm_region_hash dm_log dm_mod
[354601.207340] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207342] CPU: 1 PID: 30686 Comm: kworker/1:2 Not tainted 4.14.35-1902.10.7.el7uek.x86_64 #2
[354601.207344] Hardware name: Oracle Corporation ORACLE SERVER X8-2/ASM,MB,X8-2, BIOS 51021000 01/09/2020
[354601.207348] Workqueue: events macvlan_process_broadcast [macvlan]
[354601.207350] task: ffffa01b7906dc40 task.stack: ffffbd6361b34000
[354601.207355] RIP: 0010:bnxt_tx_int+0xe5/0x330 [bnxt_en]
[354601.207355] bnxt_en 0000:18:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1800(Receiver ID)
[354601.207357] bnxt_en 0000:18:00.0: device [14e4:16d9] error status/mask=00000040/00000000
[354601.207358] bnxt_en 0000:18:00.0: [ 6] Bad TLP
[354601.207360] RSP: 0018:ffffa01b7f643e40 EFLAGS: 00010246
[354601.207361] RAX: ffffa01b78468900 RBX: ffffa01b7f36e0a0 RCX: 00000000000003d8
[354601.207362] RDX: ffffbd634ddf1000 RSI: ffffa01b7d988040 RDI: ffffa01b78468900
[354601.207363] RBP: ffffa01b7f643e90 R08: 0000000000000003 R09: ffffffffa86fbb10
[354601.207364] R10: ffffa01b7f669fb0 R11: ffffa01b7d988100 R12: 0000000000000149
[354601.207365] R13: ffffa01b7cae4000 R14: ffffbd634ddf2ec0 R15: 0000000000000000
[354601.207366] FS: 0000000000000000(0000) GS:ffffa01b7f640000(0000) knlGS:0000000000000000
[354601.207367] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[354601.207368] CR2: 0000000000000080 CR3: 000000016f40a004 CR4: 00000000007626e0
[354601.207369] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[354601.207370] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207371] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[354601.207372] PKRU: 55555554
[354601.207373] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207374] Call Trace:
[354601.207376] <IRQ>
[354601.207376] pcieport 0000:17:00.0: AER: Corrected error received: id=1800 <<<<<<<
[354601.207379] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207381] __bnxt_poll_work_done+0x71/0xe0 [bnxt_en]
[354601.207384] bnxt_poll+0xc0/0x5e0 [bnxt_en]
[354601.207389] net_rx_action+0x289/0x3f4
[354601.207393] __do_softirq+0xd9/0x28d
[354601.207394] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207397] do_softirq_own_stack+0x2a/0x35
[354601.207399] </IRQ>
[354601.207402] do_softirq+0x4d/0x6a
[354601.207403] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207404] netif_rx_ni+0x38/0x85
[354601.207406] macvlan_broadcast+0x107/0x170 [macvlan]
[354601.207408] macvlan_process_broadcast+0x14b/0x160 [macvlan]
[354601.207410] process_one_work+0x169/0x399
[354601.207412] worker_thread+0x4d/0x3e5
[354601.207415] kthread+0x105/0x138
[354601.207416] bnxt_en 0000:18:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1800(Receiver ID)
[354601.207417] bnxt_en 0000:18:00.0: device [14e4:16d9] error status/mask=00000040/00000000
[354601.207418] bnxt_en 0000:18:00.0: [ 6] Bad TLP
[354601.207420] ? rescuer_thread+0x380/0x375
[354601.207422] ? kthread_bind+0x20/0x15
[354601.207424] ret_from_fork+0x24/0x49
[354601.207425] Code:
[354601.207426] pcieport 0000:17:00.0: AER: Corrected error received: id=1700
[354601.207427] 60 01 48 8b 45 d0
[354601.207430] pcieport 0000:17:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1700(Transmitter ID)
[354601.207431] 48 8d 0c
[354601.207434] pcieport 0000:17:00.0: device [8086:2030] error status/mask=00001000/00000000
[354601.207435] pcieport 0000:17:00.0: [12] Replay Timer Timeout
[354601.207436] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207437] 52 49 8b 55 60 66 44 23 a0 b4 00
[354601.207441] pcieport 0000:17:00.0: AER: Corrected error received: id=1700
[354601.207441] 00 00 4c 8d 34 ca 41 80 7e 11 00 4d 8b 3e 49 c7 06 00 00 00 00 75 8e <41> 8b 97 80 00
[354601.207451] pcieport 0000:17:00.0: AER: Corrected error received: id=1800
[354601.207453] bnxt_en 0000:18:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=1800(Receiver ID)
[354601.207453] 00 00 41
[354601.207455] bnxt_en 0000:18:00.0: device [14e4:16d9] error status/mask=00000040/00000000
[354601.207456] 2b 97 84 00
[354601.207458] bnxt_en 0000:18:00.0: [ 6] Bad TLP
[354601.207459] 00 00 48 85 db 49 8b 76 08
[354601.207466] RIP: bnxt_tx_int+0xe5/0x330 [bnxt_en] RSP: ffffa01b7f643e40
[354601.207467] CR2: 0000000000000080
[354601.207469] ---[ end trace 34f93ec590ce63dc ]---
[354601.273863] Kernel panic - not syncing: Fatal exception in interrupt
[354601.274879] Kernel Offset: 0x27000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
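
The Code: bytes in the oops are split across several lines by the interleaved AER messages. Reassembled, the bytes starting at the faulting position (marked <41>) are 41 8b 97 80 00 00 00. The Python sketch below is an illustration only and is not part of the original note. It decodes just this one instruction form (optional REX prefix, opcode 0x8B, 32-bit displacement, no SIB byte) to show why the fault address in CR2 is 0000000000000080: the load uses R15 as its base register, and R15 is zero in the register dump. The function name is hypothetical, and the log alone does not show which structure member sits at offset 0x80.

```python
import struct

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code):
    """Decode 'MOV reg, [base + disp32]' and return (base_register, disp32).

    Only handles: optional REX prefix, opcode 0x8B, ModRM mod=10, no SIB.
    """
    pos = 0
    rex = 0
    if 0x40 <= code[pos] <= 0x4F:      # REX prefix present
        rex = code[pos]
        pos += 1
    if code[pos] != 0x8B:
        raise ValueError("not a MOV r, r/m instruction")
    modrm = code[pos + 1]
    mod = modrm >> 6
    rm = modrm & 0x7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("unsupported addressing form")
    # REX.B (bit 0) extends the base register number to r8..r15.
    base = REG64[rm | ((rex & 0x1) << 3)]
    disp = struct.unpack_from("<i", code, pos + 2)[0]
    return base, disp


if __name__ == "__main__":
    faulting = bytes.fromhex("41 8b 97 80 00 00 00")
    registers = {"r15": 0x0}           # R15 from the register dump above
    base, disp = decode_mov_load(faulting)
    address = (registers[base] + disp) & 0xFFFFFFFFFFFFFFFF
    print(f"load from [{base} + {disp:#x}] = {address:#018x}")
```

The computed address equals the CR2 value in the oops, which is consistent with bnxt_tx_int dereferencing a NULL pointer held in R15.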

 
