RHEL8: kernel panic at svc_rdma_encode_write_chunk()

Solution Verified

Issue

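  • The kernel panics in an nfsd thread with a general protection fault in svc_rdma_encode_write_chunk() [rpcrdma]. The following messages are logged: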
[55567.293206] VFS: file-max limit 105542640 reached
[55567.293207] VFS: file-max limit 105542640 reached
[55628.447719] rpcbind: server rpc.statd not responding, timed out
[55628.447763] lockd: cannot unmonitor 1.1.1.1
[55689.886737] rpcbind: server rpc.statd not responding, timed out
[55689.886783] lockd: cannot unmonitor 1.1.1.1
[55847.541471] general protection fault, probably for non-canonical address 0xde7db016e6b8646e: 0000 [#1] SMP NOPTI
[55847.541511] CPU: 2 PID: 72581 Comm: nfsd Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-513.11.1.el8_9.x86_64 #1
[55847.541539] Hardware name: GIGABYTE R272-Z32-00/MZ32-AR0-00, BIOS R23 03/30/2021
[55847.541557] RIP: 0010:svc_rdma_encode_write_chunk+0x1f/0x190 [rpcrdma]
[55847.541588] Code: 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 41 57 48 89 f8 41 56 48 05 a8 00 00 00 41 55 41 54 49 89 f4 55 53 48 83 ec 20 <8b> 6e 18 be 04 00 00 00 48 89 7c 24 08 48 89 c7 48 89 04 24 e8 88
[55847.541631] RSP: 0018:ffffc15206f6fe10 EFLAGS: 00010282
[55847.541645] RAX: ffff9b6f90e63ea8 RBX: ffff9b741a8f0600 RCX: ffff9b741a8f0700
[55847.541663] RDX: de7db016e6b8646e RSI: de7db016e6b8646e RDI: ffff9b6f90e63e00
[55847.541681] RBP: ffff9b6f90e63e00 R08: ffff9b61c1955000 R09: ffff9b61c1956000
[55847.541698] R10: 0000000000000000 R11: 0000000000000246 R12: de7db016e6b8646e
[55847.541715] R13: ffff9bb42cf54000 R14: ffff9b6f90e63ea8 R15: 0000000000000000
[55847.541980] FS:  0000000000000000(0000) GS:ffff9b6aaee80000(0000) knlGS:0000000000000000
[55847.542147] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[55847.542314] CR2: 000000000622b9b8 CR3: 00000088e90f8000 CR4: 0000000000350ee0
[55847.542480] Call Trace:
[55847.542636]  ? __die_body+0x1a/0x60
[55847.542794]  ? die_addr+0x38/0x51
[55847.542947]  ? do_general_protection+0x135/0x280
[55847.543102]  ? general_protection+0x1e/0x30
[55847.543254]  ? svc_rdma_encode_write_chunk+0x1f/0x190 [rpcrdma]
[55847.543418]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[55847.543582]  ? svc_rdma_send_ctxt_alloc+0x149/0x2a0 [rpcrdma]
[55847.543743]  svc_rdma_sendto+0x143/0x360 [rpcrdma]
[55847.543901]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[55847.544061]  svc_send+0x51/0x170 [sunrpc]
[55847.544231]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[55847.544391]  nfsd+0xe3/0x140 [nfsd]
[55847.544546]  kthread+0x134/0x150
[55847.544688]  ? set_kthread_struct+0x50/0x50
[55847.544830]  ret_from_fork+0x35/0x40
[55847.544970] Modules linked in: binfmt_misc tcp_diag udp_diag inet_diag overlay nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache bonding vfat fat dm_service_time dm_multipath dm_mod intel_rapl_msr intel_rapl_common rpcrdma rdma_ucm ib_srpt amd64_edac_mod ib_isert edac_mce_amd iscsi_target_mod amd_energy target_core_mod kvm irqbypass ib_iser crct10dif_pclmul libiscsi crc32_pclmul ghash_clmulni_intel scsi_transport_iscsi ib_umad rapl pcspkr ccp rdma_cm ib_ipoib iw_cm ib_cm ses enclosure zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) ipmi_ssif zcommon(POE) znvpair(POE) spl(OE) mlx5_ib ib_uverbs ib_core sp5100_tco ptdma i2c_piix4 k10temp acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c raid1 sd_mod sg crc32c_intel mlx5_core ast drm_shmem_helper drm_kms_helper syscopyarea sysfillrect ahci sysimgblt libahci drm igb mpt3sas libata dca mlxfw i2c_algo_bit nvme pci_hyperv_intf tls raid_class nvme_core
[55847.545028]  scsi_transport_sas psample t10_pi fuse
  • In a second variant, the general protection fault occurs in __list_del_entry_valid(), called from pcl_free() [rpcrdma]:
[ 2937.236876] rpcbind: server rpc.statd not responding, timed out
[ 2937.236942] lockd: cannot monitor 1.1.1.1
[ 2998.675847] lockd: cannot monitor 1.1.1.1
[ 3048.850929] general protection fault, probably for non-canonical address 0x6425fbba1adc94f7: 0000 [#1] SMP NOPTI
[ 3048.850968] CPU: 20 PID: 90591 Comm: nfsd Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-513.11.1.el8_9.x86_64 #1
[ 3048.850998] Hardware name: GIGABYTE R272-Z32-00/MZ32-AR0-00, BIOS R23 03/30/2021
[ 3048.851017] RIP: 0010:__list_del_entry_valid+0x0/0x50
[ 3048.851036] Code: b8 01 00 00 00 e9 30 22 52 00 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 78 d9 b3 8e e8 8f 7e c7 ff 0f 0b 66 0f 1f 84 00 00 00 00 00 <48> 8b 07 48 8b 57 08 48 b9 00 01 00 00 00 00 ad de 48 39 c8 0f 84
[ 3048.851079] RSP: 0018:ffffafdc54083e68 EFLAGS: 00010296
[ 3048.851094] RAX: 6425fbba1adc94f7 RBX: 6425fbba1adc94f7 RCX: 0000000000009949
[ 3048.851113] RDX: 00000000000000ca RSI: ffffa15de419da00 RDI: 6425fbba1adc94f7
[ 3048.851131] RBP: ffffa15de419daf8 R08: 0000000080000000 R09: 0000000000000276
[ 3048.851149] R10: 0000000000000001 R11: ffffa1182ef31dc4 R12: ffffa15de419db00
[ 3048.851168] R13: dead000000000200 R14: dead000000000100 R15: ffffa0db0e684000
[ 3048.851186] FS:  0000000000000000(0000) GS:ffffa1182ef00000(0000) knlGS:0000000000000000
[ 3048.851206] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3048.851221] CR2: 000000c000dfe000 CR3: 0000010032dba000 CR4: 0000000000350ee0
[ 3048.851239] Call Trace:
[ 3048.851249]  ? __die_body+0x1a/0x60
[ 3048.851263]  ? die_addr+0x38/0x51
[ 3048.851274]  ? do_general_protection+0x135/0x280
[ 3048.851288]  ? general_protection+0x1e/0x30
[ 3048.851301]  ? __list_add_valid+0x50/0x50
[ 3048.851316]  pcl_free+0x46/0x90 [rpcrdma]
[ 3048.851341]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 3048.851369]  svc_rdma_recv_ctxt_put+0x31/0x70 [rpcrdma]
[ 3048.851393]  svc_xprt_release+0x20/0x150 [sunrpc]
[ 3048.851431]  svc_process+0xe1/0xf0 [sunrpc]
[ 3048.851464]  nfsd+0xe3/0x140 [nfsd]
[ 3048.851488]  kthread+0x134/0x150
[ 3048.851500]  ? set_kthread_struct+0x50/0x50
[ 3048.851513]  ret_from_fork+0x35/0x40
[ 3048.851526] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sr_mod cdrom joydev cdc_ether usbnet mii uas usb_storage bonding vfat fat dm_service_time dm_multipath dm_mod intel_rapl_msr intel_rapl_common amd64_edac_mod edac_mce_amd amd_energy kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl pcspkr rpcrdma rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_umad ccp rdma_cm ib_ipoib iw_cm ib_cm ses enclosure zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) ipmi_ssif zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) mlx5_ib ib_uverbs acpi_ipmi ib_core sp5100_tco i2c_piix4 ptdma k10temp ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c raid1 sd_mod sg crc32c_intel ast mlx5_core drm_shmem_helper drm_kms_helper syscopyarea sysfillrect ahci sysimgblt igb libahci drm mpt3sas libata dca mlxfw nvme i2c_algo_bit pci_hyperv_intf tls nvme_core raid_class
[ 3048.851586]  scsi_transport_sas psample t10_pi fuse
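
To compare a panic from another system against the two traces above, it can help to reduce the oops to its faulting address, RIP symbol, and reliable call-trace frames. The following is a minimal sketch, not a Red Hat tool: the function names summarize_oops() and involves_rpcrdma() are hypothetical, and the "involves rpcrdma" check is only an assumed heuristic based on the frames seen in this article.

```python
import re

# Patterns for the fault line, the RIP line and call-trace frames of an x86_64 oops.
GPF_RE = re.compile(r"general protection fault.*non-canonical address (0x[0-9a-f]+)")
RIP_RE = re.compile(r"RIP: \d+:(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")
FRAME_RE = re.compile(r"\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")


def summarize_oops(text):
    """Extract the fault address, RIP symbol/module and reliable frames."""
    summary = {"address": None, "rip": None, "module": None, "frames": []}
    in_trace = False
    for line in text.splitlines():
        m = GPF_RE.search(line)
        if m:
            summary["address"] = m.group(1)
        m = RIP_RE.search(line)
        if m and summary["rip"] is None:
            summary["rip"], summary["module"] = m.group(1), m.group(2)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if "Modules linked in:" in line:
            in_trace = False
        if in_trace:
            m = FRAME_RE.search(line)
            # Frames prefixed with "?" are unreliable stack leftovers; skip them.
            if m and not m.group(1):
                summary["frames"].append((m.group(2), m.group(3)))
    return summary


def involves_rpcrdma(summary):
    """Assumed heuristic: a GPF whose RIP or reliable frames are in rpcrdma."""
    if summary["address"] is None:
        return False
    if summary["module"] == "rpcrdma":
        return True
    return any(module == "rpcrdma" for _, module in summary["frames"])
```

Passing the text of a saved console log or vmcore-dmesg file to summarize_oops() returns a dictionary that can be compared with the traces above. Both variants in this article would satisfy involves_rpcrdma(): the first through its RIP, the second through the pcl_free() frame.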

Environment

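  • Red Hat Enterprise Linux 8 (both traces above are from kernel 4.18.0-513.11.1.el8_9.x86_64)
  • NFS server (nfsd) using the RDMA transport (rpcrdma)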
