After updating to RHEL 7.5 or later, the kernel panics due to an invalid memory access triggered by the "KsFree()" function of the "oracleoks" kernel module.

Solution Verified - Updated -

Environment

  • kernel-3.10.0-862.el7 or later
  • Red Hat Enterprise Linux 7.5 or later
  • oracleoks third-party kernel module

Issue

  • After updating to RHEL 7.5 or later, the kernel panics due to an invalid memory access triggered by the "KsFree()" function of the "oracleoks" kernel module. The backtrace shows the panic occurring while the module is being unloaded:
crash> bt
PID: 33293  TASK: ffff982bc5e3af70  CPU: 4   COMMAND: "modprobe"
 #0 [ffff982caff9fad8] machine_kexec at ffffffffbde629da
 #1 [ffff982caff9fb38] __crash_kexec at ffffffffbdf16692
 #2 [ffff982caff9fc08] crash_kexec at ffffffffbdf16780
 #3 [ffff982caff9fc20] oops_end at ffffffffbe51d728
 #4 [ffff982caff9fc48] no_context at ffffffffbe50c6cd
 #5 [ffff982caff9fc98] __bad_area_nosemaphore at ffffffffbe50c764
 #6 [ffff982caff9fce8] bad_area_nosemaphore at ffffffffbe50c8d5
 #7 [ffff982caff9fcf8] __do_page_fault at ffffffffbe5206e0
 #8 [ffff982caff9fd60] do_page_fault at ffffffffbe5208d5
 #9 [ffff982caff9fd90] page_fault at ffffffffbe51c758
    [exception RIP: kfree+85]
    RIP: ffffffffbdffa5f5  RSP: ffff982caff9fe40  RFLAGS: 00010282
    RAX: ffffdbf1b0600040  RBX: ffffb2f758001000  RCX: 0000000000000004
    RDX: 000067e440000000  RSI: 0000000003d09000  RDI: ffffb2f758001000
    RBP: ffff982caff9fe58   R8: 000000000001bb20   R9: ffffffffc081e9dd
    R10: ffff983afe71bb20  R11: ffffdb868549fe00  R12: 0000000003d09000
    R13: ffffffffc081e9dd  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff982caff9fe60] KsFree at ffffffffc081e9dd [oracleoks]
#11 [ffff982caff9fe80] odlm_rsb_free_tbl at ffffffffc082bb4f [oracleoks]
#12 [ffff982caff9fe90] odlm_subsys_unconfigure at ffffffffc082b52b [oracleoks]
#13 [ffff982caff9feb8] cleanup_module at ffffffffc08538ae [oracleoks]
#14 [ffff982caff9fec8] sys_delete_module at ffffffffbdf0fe8e
#15 [ffff982caff9ff50] system_call_fastpath at ffffffffbe52579b
    RIP: 00007fc51c8f5027  RSP: 00007ffd37575f08  RFLAGS: 00000246
    RAX: 00000000000000b0  RBX: 00000000025ca6b0  RCX: ffffffffffffffff
    RDX: 0000000000000000  RSI: 0000000000000800  RDI: 00000000025ca718
    RBP: 0000000000000000   R8: 00007fc51cbbd060   R9: 00007fc51c9691a0
    R10: 0000000000000000  R11: 0000000000000206  R12: 00000000025c8210
    R13: 0000000000000000  R14: 00000000025c8508  R15: 0000000000000000
    ORIG_RAX: 00000000000000b0  CS: 0033  SS: 002b

Resolution

  • Contact Oracle Support for further investigation, as the panic occurs inside the oracleoks driver. Access to the oracleoks source code is required to troubleshoot further.
  • If Oracle Support has feedback or questions for Red Hat, add this information to the Red Hat case.
  • If required, Red Hat and Oracle can collaborate on this issue.

Root Cause

  • The vmcore shows that the panic was caused by an invalid kernel paging request at address ffffdbf1b0600040.
  • The fault occurs in the kernel function kfree().
  • kfree() is called by the "oracleoks" function KsFree().
  • kfree() expects a single argument: a pointer to an object allocated with kmalloc(). Here, however, the argument (ffffb2f758001000) does not point to any object in a slab cache.
  • Instead, this address marks the start of a vm_struct address range, that is, a vmalloc area. This address range was allocated by the "oracleoks" function KsMalloc().
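
The reasoning in the bullets above can be sketched as a simple range check. The following Python snippet is an illustrative model only, not kernel code: it takes the pointer, start address, and size reported in the Diagnostic Steps for this vmcore and tests whether the freed pointer falls inside the vmalloc area. The helper names in_vm_area and matching_free are hypothetical. Upstream Linux provides is_vmalloc_addr() for this kind of test, and upstream kvfree() uses it to choose between vfree() and kfree().

```python
# Values copied from the crash output in this article.
FREED_PTR = 0xffffb2f758001000  # RDI/RBX at kfree+85: the pointer handed to kfree()
VM_START = 0xffffb2f758001000   # vm_struct.addr
VM_SIZE = 64004096              # vm_struct.size in bytes


def in_vm_area(ptr, start, size):
    """Return True if ptr lies inside the half-open range [start, start + size)."""
    return start <= ptr < start + size


def matching_free(ptr):
    """Hypothetical helper: name the free routine that matches the pointer's origin."""
    return "vfree" if in_vm_area(ptr, VM_START, VM_SIZE) else "kfree"


if __name__ == "__main__":
    print(matching_free(FREED_PTR))
```

For this vmcore the check selects vfree, which is consistent with the panic being caused by a kfree() call on vmalloc-backed memory. Whether KsFree() is meant to make such a distinction can only be confirmed from the oracleoks source code.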

Diagnostic Steps

  • Vmcore findings.

        CPUS: 6
        DATE: Fri Sep 28 16:04:49 2018
      UPTIME: 00:54:41
LOAD AVERAGE: 1.48, 1.43, 1.29
       TASKS: 362
     RELEASE: 3.10.0-862.11.6.el7.x86_64
     VERSION: #1 SMP Fri Aug 10 16:55:11 UTC 2018
     MACHINE: x86_64  (3099 Mhz)
      MEMORY: 128 GB
       PANIC: "BUG: unable to handle kernel paging request at ffffdbf1b0600040"

crash> bt
PID: 33293  TASK: ffff982bc5e3af70  CPU: 4   COMMAND: "modprobe"
 #0 [ffff982caff9fad8] machine_kexec at ffffffffbde629da
 #1 [ffff982caff9fb38] __crash_kexec at ffffffffbdf16692
 #2 [ffff982caff9fc08] crash_kexec at ffffffffbdf16780
 #3 [ffff982caff9fc20] oops_end at ffffffffbe51d728
 #4 [ffff982caff9fc48] no_context at ffffffffbe50c6cd
 #5 [ffff982caff9fc98] __bad_area_nosemaphore at ffffffffbe50c764
 #6 [ffff982caff9fce8] bad_area_nosemaphore at ffffffffbe50c8d5
 #7 [ffff982caff9fcf8] __do_page_fault at ffffffffbe5206e0
 #8 [ffff982caff9fd60] do_page_fault at ffffffffbe5208d5
 #9 [ffff982caff9fd90] page_fault at ffffffffbe51c758
    [exception RIP: kfree+85]
    RIP: ffffffffbdffa5f5  RSP: ffff982caff9fe40  RFLAGS: 00010282
    RAX: ffffdbf1b0600040  RBX: ffffb2f758001000  RCX: 0000000000000004
    RDX: 000067e440000000  RSI: 0000000003d09000  RDI: ffffb2f758001000
    RBP: ffff982caff9fe58   R8: 000000000001bb20   R9: ffffffffc081e9dd
    R10: ffff983afe71bb20  R11: ffffdb868549fe00  R12: 0000000003d09000
    R13: ffffffffc081e9dd  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff982caff9fe60] KsFree at ffffffffc081e9dd [oracleoks]
#11 [ffff982caff9fe80] odlm_rsb_free_tbl at ffffffffc082bb4f [oracleoks]
#12 [ffff982caff9fe90] odlm_subsys_unconfigure at ffffffffc082b52b [oracleoks]
#13 [ffff982caff9feb8] cleanup_module at ffffffffc08538ae [oracleoks]
#14 [ffff982caff9fec8] sys_delete_module at ffffffffbdf0fe8e
#15 [ffff982caff9ff50] system_call_fastpath at ffffffffbe52579b
    RIP: 00007fc51c8f5027  RSP: 00007ffd37575f08  RFLAGS: 00000246
    RAX: 00000000000000b0  RBX: 00000000025ca6b0  RCX: ffffffffffffffff
    RDX: 0000000000000000  RSI: 0000000000000800  RDI: 00000000025ca718
    RBP: 0000000000000000   R8: 00007fc51cbbd060   R9: 00007fc51c9691a0
    R10: 0000000000000000  R11: 0000000000000206  R12: 00000000025c8210
    R13: 0000000000000000  R14: 00000000025c8508  R15: 0000000000000000
    ORIG_RAX: 00000000000000b0  CS: 0033  SS: 002b

crash> mod -t
NAME       TAINTS
oracleoks  POE

crash> kmem ffffdbf1b0600040
kmem: WARNING: cannot make virtual-to-physical translation: ffffdbf1b0600040
ffffdbf1b0600040: kernel virtual address not found in mem map

- There's only one variable here - the object pointer passed into kfree.

crash> kmem ffffb2f758001000
VMAP_AREA         VM_STRUCT         ADDRESS RANGE                        SIZE
ffff982cdbe42500  ffff982cdfbb4380  ffffb2f758001000 - ffffb2f75bd0b000  64004096

PAGE              PHYSICAL    MAPPING  INDEX  CNT  FLAGS
ffffdb86863cf400  118f3d0000  0        0      1    2fffff00000000

- This address range was originally allocated via KsMalloc -> vmalloc

crash> vm_struct ffff982cdfbb4380
struct vm_struct {
  next = 0x0,
  addr = 0xffffb2f758001000,
  size = 64004096,
  flags = 18,
  pages = 0xffffb2f74c655000,
  nr_pages = 15625,
  phys_addr = 0,
  caller = 0xffffffffc081caf8
}

crash> sym 0xffffffffc081caf8
ffffffffc081caf8 (w) KsMalloc+392 [oracleoks]
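
The figures in the output above can be cross-checked with a few lines of arithmetic. The sketch below is a hedged illustration in Python. Every value is copied from the crash output; the VM_ALLOC bit value (0x2) is taken from upstream include/linux/vmalloc.h and is assumed to be unchanged in this kernel; and reading RSI/R12 as an allocation size is an assumption, because the oracleoks source code is not available.

```python
PAGE_SIZE = 4096  # base page size on x86_64

# Values copied from the crash output.
vm_start = 0xffffb2f758001000  # start of ADDRESS RANGE, also vm_struct.addr
vm_end = 0xffffb2f75bd0b000    # end of ADDRESS RANGE
vm_size = 64004096             # vm_struct.size
nr_pages = 15625               # vm_struct.nr_pages
vm_flags = 18                  # vm_struct.flags, printed in decimal
rsi_r12 = 0x3d09000            # RSI and R12 at kfree+85

# Assumption: flag value from upstream include/linux/vmalloc.h.
VM_ALLOC = 0x2

span = vm_end - vm_start          # bytes covered by the address range
payload = nr_pages * PAGE_SIZE    # bytes backed by the page array
guard = vm_size - payload         # remainder beyond the backed pages

if __name__ == "__main__":
    print("range matches vm_struct.size:", span == vm_size)
    print("bytes beyond backed pages:", guard)
    print("RSI/R12 equals nr_pages * PAGE_SIZE:", rsi_r12 == payload)
    print("VM_ALLOC bit set:", bool(vm_flags & VM_ALLOC))
```

The address range matches vm_struct.size, and the size exceeds the 15625 backed pages by exactly one page, which fits the guard page that vmalloc adds to each area. RSI/R12 equals 64,000,000 bytes (15625 pages), which suggests, without proving it, that a size was passed along the KsFree() path. The VM_ALLOC bit being set agrees with the note that the range was allocated via KsMalloc -> vmalloc.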

