OpenBSD Journal

[Ask OBSDJ] What makes a process unkillable?

Contributed by Dengue on from the what-about-kill--9 dept.

Matt Van Mater writes:
"A week ago some sharity-light mounts died on my OpenBSD box, which then hung my Apache daemon (it was serving files from the mounted drives). I couldn't kill any of the processes and had to reboot the box after 65 days of uptime (boo hoo, I know). What I want to know is: why couldn't I kill those processes, and how can I prevent something like that from happening again?"



Comments
  1. By panda () panda@NeOpSiPtAaM.fr on mailto:panda@NeOpSiPtAaM.fr

    Well, basically, you were facing the problem of zombie processes.
    A Unix C program can create a copy of itself with the fork(2) syscall; the copy is called a child process. The child process can pretty much have a life of its own, but is bound, eventually, to die. Once it's dead, the parent has to issue a wait(2) syscall to collect its exit status. Until the parent does so, the dead child stays in zombie state, waiting for its parent to wait(2).

    In your particular case, Apache and sharity-light acted in such a way that they would never wait(2), leaving those unkillable processes.

    As for the solution, well, I don't think there's any way to kill zombie processes: they are already dead, and they linger only because of a programming error in the parent. You simply hit a bug.
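
    The fork(2)/wait(2) life cycle described above can be sketched in a few lines. This is only an illustration, written in Python for brevity and assuming a Unix-like system; the names pid_exists and zombie_demo are made up for the example. It shows that a dead child keeps its process-table entry until the parent waits for it, and that sending SIGKILL to it in the meantime changes nothing.

```python
import os
import signal
import time


def pid_exists(pid):
    """True while pid still has a process-table entry (zombies included)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permission
    except ProcessLookupError:
        return False
    return True


def zombie_demo():
    """Fork a child that exits at once; observe it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: die immediately with exit status 7

    time.sleep(0.5)  # parent: let the child exit; it is now a zombie
    os.kill(pid, signal.SIGKILL)  # accepted, but nothing is left to kill
    before = pid_exists(pid)  # the table entry is still there

    _, status = os.waitpid(pid, 0)  # wait(2): collect status, free the entry
    after = pid_exists(pid)
    return before, os.WEXITSTATUS(status), after


if __name__ == "__main__":
    print(zombie_demo())
```

    On a typical Unix system zombie_demo() should return (True, 7, False): the entry exists before the wait, the exit status is 7, and the entry is gone afterwards.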

  2. By sangdrax () lucifer@vengeance.et.tudelft.nl on http://i.die.ms

    I haven't had the following problem with (Open)BSD because I use that mainly for firewalling, but under Linux, if a process hangs inside a kernel call (say, waiting for SCSI I/O that keeps timing out), the process is unkillable too.

    I see this with 'cdparanoia', for instance, when trying to read a damaged CD while the CD-ROM drive keeps failing. Cdparanoia then just 'eats' every kill -9 I send it.

    Would any BSD guru like to comment on this? :-)

  3. By ruben () ruben@su.cx on http://www.su.cx/~ruben/

    Be careful with kill -9; use it only as a last resort. I once killed some processes that way and they never removed their entries from /var/run/utmp, so the system still showed 3 users logged in who actually weren't. The only ways to fix that are to reboot or to edit /var/run/utmp by hand, and since it is a binary file I don't think you want to do that. Just a tip; it may be useful for somebody.
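
    One way to follow this advice is to send the default SIGTERM first and reach for SIGKILL only after a grace period. SIGTERM can be caught, so a program gets the chance to tidy up after itself (its utmp entry, for example); SIGKILL cannot. A minimal sketch in Python, where stop_gently and the 5-second grace period are assumptions made for the example, not anything from the original post:

```python
import subprocess


def stop_gently(proc, grace=5.0):
    """Ask proc (a subprocess.Popen) to exit via SIGTERM; fall back to SIGKILL.

    Returns the name of the signal that was finally needed.
    """
    proc.terminate()  # SIGTERM: can be caught, so the program may clean up
    try:
        proc.wait(timeout=grace)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught, so no cleanup code runs
        proc.wait()
        return "SIGKILL"
```

    Neither signal helps against a process stuck in an uninterruptible kernel call; in that case the final proc.wait() would block as well.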

  4. By Anonymous Coward () on

    It sounds like it's blocking on disk wait. ps is your friend (IIRC a 'D' in the STAT column indicates a disk wait).

    I noticed a similar thing when a remote NFS mount crashed.
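
    A rough sketch of that ps check in Python. parse_ps, disk_waiters and list_disk_waiters are hypothetical helper names, and the exact ps(1) options used here (-A -o pid,stat,command) are an assumption that should be checked against the local man page:

```python
import subprocess


def parse_ps(text):
    """Turn 'PID STAT COMMAND' lines into (pid, stat, command) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        # The header line is skipped because its first field is not a number.
        if len(fields) == 3 and fields[0].isdigit():
            rows.append((int(fields[0]), fields[1], fields[2].rstrip()))
    return rows


def disk_waiters(rows):
    """Keep only processes whose STAT begins with 'D' (uninterruptible wait)."""
    return [row for row in rows if row[1].startswith("D")]


def list_disk_waiters():
    """Run ps(1) and report processes in state D (options are an assumption)."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid,stat,command"],
        capture_output=True, text=True, check=True,
    ).stdout
    return disk_waiters(parse_ps(out))


if __name__ == "__main__":
    for pid, stat, command in list_disk_waiters():
        print(pid, stat, command)
```

    A process that shows up here on every run, rather than for an instant, is the kind that will ignore kill -9 until its I/O completes or fails.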

  5. By Chris () on http://www.dejection.org.uk/

    You could just do a power cycle like I do with my SPARC LX... :D

    /me awaits flames...

  6. By Vincent Keßler () kessler at nscit dot com on mailto:kessler at nscit dot com

    I experienced similar behavior with sharity-light and was able to kill it using the debugger, even though it wasn't killable with kill -9.

  7. By Matt Van Mater () vanmatmm@jmu.edu on mailto:vanmatmm@jmu.edu

    I neglected to mention that I tried kill -9 PID on each of the processes (both sharity-light and httpd), and of course it didn't work. Any attempt to access the share afterwards (e.g. ls -al /mnt/share) froze that process as well. After a short while of trying to figure out what was wrong, I had a handful of other processes sitting there that couldn't be killed because they were all blocking on I/O.

    Someone mentioned that when you kill a process and some children are left behind, they are adopted by init. If this is the case, then there is no way to kill the processes (because killing init is death to the system... right?). Could I have sent init a HUP instead in this situation?

    Someone else mentioned having the same problem with NFS shares. sharity-light actually puts the Samba shares in an 'NFS wrapper' so the kernel thinks it's mounting an NFS share. This makes me suspect that the problem lies in NFS not working properly rather than in sharity-light, as I originally thought. I suppose if a process is blocking on I/O then there really isn't much we can do to kill it, since it won't accept an interrupt. It's the same idea as a CD-ROM mount dying on you.
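
    Since a process blocked on a dead mount cannot be interrupted, one partial precaution is to never touch a suspect path from a process you care about, and instead let a disposable child do it under a timeout. A minimal sketch in Python; probe_path, the 5-second default and the three result strings are all assumptions for illustration, and it does nothing to fix the mount itself:

```python
import subprocess
import sys

# Tiny helper program run in the child: stat(2) the path given as argv[1].
_PROBE = "import os, sys; os.stat(sys.argv[1])"


def probe_path(path, timeout=5.0):
    """Return 'ok', 'error' or 'hung' for path without touching it directly."""
    child = subprocess.Popen(
        [sys.executable, "-c", _PROBE, path],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    try:
        code = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Deliberately no kill-and-wait here: a child stuck in uninterruptible
        # I/O would not die, and waiting for it would hang this process too.
        return "hung"
    return "ok" if code == 0 else "error"


if __name__ == "__main__":
    print(probe_path("/mnt/share"))
```

    If the probe reports 'hung', the child stays behind in disk wait just like the processes described above, but the caller (a cron job or a start-up script, say) is still free to skip the share or raise an alarm.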

Credits

Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]