linux-kernel kernel u-boot panic

Calling user space application from kernel panic


I need a way to notify U-Boot that a kernel panic occurred on my system. I have configured all the related applications, such as fw_setenv, and they work when launched manually. Now I need to automate this: when a kernel panic occurs, the kernel should change a U-Boot variable. To do that I'm trying to use call_usermodehelper(), but it isn't working: the call returns 0, yet nothing is launched. I also tried simply running touch through call_usermodehelper() to create a file on panic, but that doesn't work either; the file is never created.

I isolated the relevant code in a kernel module just to test the behavior. In that module I simply call call_usermodehelper() and it works perfectly, but when I move the code into the panic function nothing happens. I read that call_usermodehelper() does not work from IRQ handlers, so I tried a worker too, without success. The code below is my latest attempt; any help would be really appreciated.

struct work_cont {
        struct work_struct real_work;
        char cmd[250];
};

static struct work_cont execwq;

void cmdexec_worker(struct work_struct *work)
{
        static char *envp[] = { "HOME=/", "TERM=linux", "PATH=/sbin:/usr/sbin:/bin:/usr/bin", NULL };
        char *argv[] = { "/bin/touch", "/a.txt", NULL };
        // struct work_cont *c_ptr = container_of(work, struct work_cont, real_work);
        set_current_state(TASK_INTERRUPTIBLE);
        printk(KERN_ERR "Executing worker\n");
        call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);

        return;
}

void panic(const char *fmt, ...)
{
    schedule_work(&execwq.real_work);
...
}

static int __init setup_crash_kexec_post_notifiers(char *s)
{
        INIT_WORK(&execwq.real_work, cmdexec_worker);
...
}

Loading the kernel module manually works, both with and without the worker. From the panic function, no program or script runs: the printk() output appears, so the code is being executed, but the external application is never called.


Solution

  • Calling user space application from kernel panic

    You seem to fail to grasp that a kernel panic is not a normal situation where you can access the rootfs to load & execute a userspace application (which may require linking with shared objects) to perform even more I/O. You're expecting a lot of system features to be intact and available even though the kernel has panicked and essentially declared the system to be unstable.
    What if the panic is related to the rootfs? How can you then access that rootfs to load a program? Could this cause an infinite panic loop? IMO you need to rethink this overly complex scheme, which also puts U-Boot's saved environment at risk.

    I need a way to notify that a kernel panic occurred to U-Boot on my system

    So you actually have an XY problem.

    Your proposed solution for Y has U-Boot simply test one of its environment variables, but requires the kernel to perform the impossible task of initiating a user program to (1) read a file (or raw sector), (2) modify the contents and perform a CRC32 calculation, and then (3) write those contents back. All of these steps would be performed while the system is unstable because of a kernel panic. Note that if step 3 is initiated but fails to complete successfully, then U-Boot will have to either use a backup copy (if available) or revert to the default version of the environment.
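
To make the cost of steps (1) to (3) concrete, here is a minimal user-space sketch of the read-modify-checksum cycle on an in-memory copy of the environment. It is not the fw_setenv source. The layout is an assumption for illustration: a non-redundant image with a 4-byte CRC32 stored little-endian, followed by "name=value" strings that end with an empty string. The function names (crc32_calc, env_is_valid, env_append) are hypothetical, and the actual storage I/O of steps (1) and (3) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: [4-byte CRC32, little-endian][name=value\0 ... \0] */
#define ENV_CRC_SIZE 4u

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (zlib-compatible). */
uint32_t crc32_calc(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Returns 1 if the stored CRC matches the data area, 0 otherwise. */
int env_is_valid(const uint8_t *img, size_t size)
{
    assert(img != NULL);
    if (size <= ENV_CRC_SIZE)
        return 0;
    return load_le32(img) ==
           crc32_calc(img + ENV_CRC_SIZE, size - ENV_CRC_SIZE);
}

/* Appends "name=value" to the image and refreshes the CRC.
 * Returns 0 on success, -1 if the image is invalid or has no room. */
int env_append(uint8_t *img, size_t size, const char *entry)
{
    assert(img != NULL && entry != NULL);
    if (!env_is_valid(img, size))
        return -1;

    uint8_t *data = img + ENV_CRC_SIZE;
    size_t dlen = size - ENV_CRC_SIZE;
    size_t end = 0;

    /* Skip existing strings until the empty string that ends the list. */
    while (end < dlen && data[end] != '\0') {
        while (end < dlen && data[end] != '\0')
            end++;
        end++; /* step over the string terminator */
    }

    size_t elen = strlen(entry);
    if (end + elen + 2 > dlen) /* entry + its NUL + list terminator */
        return -1;

    memcpy(data + end, entry, elen + 1);
    data[end + elen + 1] = '\0';
    store_le32(img, crc32_calc(data, dlen));
    return 0;
}
```

A caller would read the whole environment into a buffer, call env_append(buf, size, "panicked=1") (a hypothetical variable name), and write the buffer back. Any byte that changes without a matching CRC update makes env_is_valid() fail, which is the partially completed write described above.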

    One possible and simpler (from a kernel perspective) solution for X would utilize existing kernel capabilities. The system can be built to dump memory (aka "core") when a panic occurs. The Ubuntu documentation describes this capability as:

    When a kernel panic occurs, the kernel relies on the kexec mechanism to quickly reboot a new instance of the kernel in a pre-reserved section of memory that had been allocated when the system booted (see xxx). This permits the existing memory area to remain untouched in order to safely copy its contents to storage.

    The bulk of new development will now be on the U-Boot side, which will have to determine if a (new) kernel dump has been written.


    So I need a way to ensure that when something crashes the system is able to recover itself ...

    Then at least be sure to change CONFIG_PANIC_TIMEOUT to something other than its default value of 0. With a timeout of 0 the kernel waits forever after a panic instead of rebooting.
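
The timeout that is actually in effect can be checked at run time, because the kernel exposes it as the kernel.panic sysctl in /proc/sys/kernel/panic (it can also be set with panic=N on the kernel command line). The following is a small user-space sketch with hypothetical helper names (read_panic_timeout, panic_will_reboot); it takes the path as a parameter so it can be pointed at any file with the same format.

```c
#include <assert.h>
#include <stdio.h>

/* Reads an integer timeout (in seconds) from a sysctl-style file such as
 * /proc/sys/kernel/panic. Returns 0 on success, -1 on any error. */
int read_panic_timeout(const char *path, int *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* 0 means "wait forever"; any other value leads to a reboot after a panic
 * (a negative value requests an immediate reboot). */
int panic_will_reboot(int timeout)
{
    return timeout != 0;
}
```

On a target system, calling read_panic_timeout("/proc/sys/kernel/panic", &t) early in boot and logging a warning when panic_will_reboot(t) is false would catch a configuration that hangs on panic rather than recovering.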