RHEL 8/9: Hard LOCKUP at blkcg_iostat_update() or cgroup_rstat_flush_locked()
Issue
- The system hits a hard LOCKUP and logs the following:
[1225791.947622] NMI watchdog: Watchdog detected hard LOCKUP on cpu 32
[1225791.947653] CPU: 32 PID: 0 Comm: swapper/32 Kdump: loaded Not tainted 4.18.0-513.24.1.el8_9.x86_64 #1
[1225791.947668] <IRQ>
[1225791.947668] __blkcg_rstat_flush.isra.29+0xd2/0x120
[1225791.947668] __blkg_release+0x46/0xf0
[1225791.947669] rcu_do_batch+0x1c5/0x4b0
[1225791.947669] rcu_core+0x14c/0x210
[1225791.947669] __do_softirq+0xdc/0x2cf
[1225791.947670] irq_exit_rcu+0xc6/0xd0
[1225791.947670] irq_exit+0xa/0x10
[1225791.947670] smp_apic_timer_interrupt+0x74/0x130
[1225791.947670] apic_timer_interrupt+0xf/0x20
[1225791.947671] </IRQ>
- Another pattern of the hard LOCKUP:
[5105610.937988] NMI watchdog: Watchdog detected hard LOCKUP on cpu 7
...
[5105610.938012] CPU: 7 PID: 1166250 Comm: kworker/u1025:1 Kdump: loaded Tainted: G OE --------- - - 4.18.0-477.27.1.el8_8.x86_64 #1
[5105610.938012] Hardware name: XXX, BIOS F12 08/31/2023
[5105610.938013] Workqueue: events_unbound flush_memcg_stats_dwork
[5105610.938013] RIP: 0010:cgroup_rstat_flush_locked+0x122/0x280
[5105610.938013] Code: 00 00 4d 85 f6 0f 84 01 01 00 00 49 8b 96 28 01 00 00 49 8b 86 78 03 00 00 48 85 d2 0f 84 9b 00 00 00 4b 03 44 e5 00 48 8b 38 <48> 8b 70 08 48 2b 78 18 48 2b 70 20 48 8b 48 10 48 2b 48 28 49 01
[5105610.938014] RSP: 0018:ff32a1643623fe20 EFLAGS: 00000086
[5105610.938014] RAX: ff64a163fc0d3be8 RBX: ff1e2ce3ce366318 RCX: ff1e283c0deb5000
[5105610.938014] RDX: ff1e2778fb2e0000 RSI: ff1e2d776bec0000 RDI: 0000000000000000
[5105610.938015] RBP: 00000000000000b3 R08: ff1e2778fb2e0000 R09: ff1e2778fb2e0000
[5105610.938015] R10: 0000000000000400 R11: 000000000000000a R12: 00000000000000b3
[5105610.938015] R13: ffffffffb5db6840 R14: ff1e283c0deb5000 R15: ff1e2ce3ce366380
[5105610.938015] FS: 0000000000000000(0000) GS:ff1e28344f1c0000(0000) knlGS:0000000000000000
[5105610.938016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[5105610.938016] CR2: 0000148b54a0a010 CR3: 000001a5d3e10003 CR4: 0000000000771ee0
[5105610.938016] PKRU: 55555554
[5105610.938016] Call Trace:
[5105610.938016] cgroup_rstat_flush_irqsafe+0x23/0x40
[5105610.938017] __mem_cgroup_flush_stats+0x5a/0x80
[5105610.938017] flush_memcg_stats_dwork+0xa/0x30
[5105610.938017] process_one_work+0x1a7/0x360
[5105610.938017] ? create_worker+0x1a0/0x1a0
[5105610.938017] worker_thread+0x30/0x390
[5105610.938018] ? create_worker+0x1a0/0x1a0
[5105610.938018] kthread+0x134/0x150
[5105610.938018] ? set_kthread_struct+0x50/0x50
[5105610.938018] ret_from_fork+0x35/0x40
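To check whether a saved kernel log resembles either pattern above, a small script can look for the NMI watchdog line and then for one of the two function names in the lines that follow. This is only a minimal sketch: the helper name `find_matching_lockups` and the 60-line search window are assumptions for illustration, and a match shows resemblance to these traces, not a confirmed diagnosis.

```python
import re

# Signature line printed by the NMI watchdog, as seen in the logs above.
LOCKUP_RE = re.compile(r"NMI watchdog: Watchdog detected hard LOCKUP on cpu (\d+)")

# Function names taken from the two traces in this article.
SUSPECT_FUNCS = ("__blkcg_rstat_flush", "cgroup_rstat_flush_locked")


def find_matching_lockups(log_text, window=60):
    """Return (cpu, function) pairs for hard LOCKUP reports whose
    following lines mention one of the suspect functions.

    `window` (an assumed value) limits how many lines after the
    watchdog message are searched.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        match = LOCKUP_RE.search(line)
        if not match:
            continue
        # Collect the lines belonging to this report, stopping early
        # if another hard LOCKUP report begins.
        following = []
        for nxt in lines[i + 1 : i + 1 + window]:
            if LOCKUP_RE.search(nxt):
                break
            following.append(nxt)
        for func in SUSPECT_FUNCS:
            if any(func in entry for entry in following):
                hits.append((int(match.group(1)), func))
                break
    return hits
```

The function takes the log as a string, for example the contents of a saved `dmesg` capture, and returns an empty list when no report matches.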
Environment
- Red Hat Enterprise Linux 9
- Red Hat Enterprise Linux 8
- Seen on kernel-4.18.0-553.5.1.el8_10
- Seen on kernel-4.18.0-513.24.1.el8_9
- Seen on kernel-4.18.0-477.55.1.el8_8
- Seen on
  - Supermicro Super Server/X13DEI, BIOS 2.1 12/13/2023
  - Intel(R) Xeon(R) Gold 6416H 0x2b0004d0