Dylan Gardner

Trying to kill a UNIX Hydra Process

The joys of trying to kill a process that keeps respawning.

September 22nd, 2020

Earlier today I was doing some dirty scripting with rsync and ssh: launching an interactive command on a remote machine over SSH while syncing the temporary files created by that command down to my MacBook. The script looked a little like this:

#!/bin/bash
target="$1"
remotecmd="..."
sync() {
    while true; do
        rsync -a "$target:/tmp/.remotecmd/" "$PWD"
        sleep 3
    done
}

echo "Spawning off syncing agent..."
sync &

echo "Launching $remotecmd..."
ssh -t "$target" "cd /tmp/.remotecmd && $remotecmd"

It’s not the prettiest, but you get the gist of it. It worked for my purposes and I continued on my merry way.

Which brings me to not too long ago, when I was closing out my Terminal sessions for the day and decided to clean up some of those pesky files that had been synced down. A simple rm of the files will do, I thought, so I did this:

$ ls
tmpfile1
tmpfile2
tmpfile3
$ rm tmpfile1 tmpfile2 tmpfile3

Then I ran ls again to make sure I got everything:

$ ls
tmpfile1
tmpfile2
tmpfile3

Huh? My first instinct, without thinking much, was just to try harder, so I blindly ran rm -f, automatically assuming that of course that would fix it.

$ rm -f tmpfile1 tmpfile2 tmpfile3
$ ls
tmpfile1
tmpfile2
tmpfile3

Nope. I then thought for a second about why that might be and realized that I had probably left rsync running in the background when I Ctrl-C’d my dirty script. Knowing a thing or two about UNIX commands, I naturally reached for pkill -9 to clean up the leftover processes:

$ pkill -9 -U $UID rsync
$ rm tmpfile1 tmpfile2 tmpfile3

(The -U flag causes pkill to only kill processes owned by the given UID)

It didn’t work. The strange thing is that the rsync processes are like Hydra heads, reappearing right after I kill them!

$ ps aux | grep rsync
dylngg           18728   0.0  0.0  4298384    620 s006  U+    now      0:00.00 grep rsync
root             18669   0.0  0.0  4287656   2696   ??  Ss    5s ago   0:00.01 /opt/bin/daemondo --label=rsyncd --start-cmd ...
dylngg           18398   0.0  0.0  4298268   2300   ??  S     5s ago   0:00.01 ssh -l pi 10.0.0.3 rsync --server ...
dylngg           18396   0.0  0.0  4291992   1644   ??  S     5s ago   0:00.01 rsync -a pi@10.0.0.3:/tmp/.remotecmd/ ...

Perhaps since pkill defaults to matching against the process name, rather than the full command line, I wasn’t killing everything? I see an ssh ... rsync --server in my ps output, so I tried pkill -f -9 -U $UID rsync and obviously that also didn’t work.

Okay. I’m starting to get paranoid. I know there’s a thing called rsyncd that does rsync stuff as a root daemon, so maybe that’s it? On my MacBook that’s the /opt/bin/daemondo process in the ps output because I use MacPorts. (Hint: I literally have no idea what rsyncd does) My solution:

$ sudo pkill -9 -f rsync
$ sudo rm -rf tmpfile1 tmpfile2 tmpfile3

It’s a bit heavy-handed since I’ll kill any other rsync processes, but at this point I’m fine with that. sudo certainly has my back, right!?

$ ls
tmpfile1
tmpfile2
tmpfile3

Nope.


At this point I’m almost at a loss, nearly resigned to letting the almighty rsync keep putting garbage on my disk until I reboot the machine and can clean up the mess I have made. That is, until I remembered an important lesson taught to me by a mentor a year or so back: processes in UNIX are hierarchical. Each one is spawned by a parent process, and when that parent dies, the orphaned process is adopted by the root of all processes: init (launchd on macOS). So in my case, once my script has exited, either init keeps respawning rsync (unlikely) or some other parent process is respawning it.

It’s at this point that I realize that my script runs the sync function in the background and never kills it when the main script exits. It dawns on me that a function launched with & runs in a subshell, a forked copy of the script that is its own process and can outlive the main one. This means that on exit the sync function simply gets adopted by init and keeps running. Worse, attempting to kill rsync will always fail, because the parent process (the sync bash function) starts a new rsync 3 seconds after the previous one dies! The neat pstree command shows this:

$ pstree -s rsync
-+= 00001 root /sbin/launchd
 |-+- 04722 dylngg /bin/bash ./dirty.sh pi@10.0.0.3
 | \-+- 20513 dylngg rsync -a pi@10.0.0.3:/tmp/.remotecmd/ ...
 |   \--- 20516 dylngg ssh -l pi 10.0.0.3 rsync --server ...
 \-+= 20382 root /opt/bin/daemondo --label=rsyncd --start-cmd ...
   \--- 20383 root (bash)

(The -s flag filters down the tree to only include rsync and its ancestors)
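
The same thing can be seen in miniature. The following is only a minimal sketch: nap and bg_pid are names made up for the illustration, and sleep stands in for real work. It backgrounds a function and prints the PID of the process that ends up running it next to the script's own PID.

```shell
#!/bin/bash
# nap and bg_pid are hypothetical names used only for this illustration.
nap() {
    sleep 1   # stand-in for real work, such as the rsync loop
}

nap &         # "&" makes bash run the function in a forked subshell
bg_pid=$!     # PID of that subshell, not of the main script

echo "script PID:              $$"
echo "background function PID: $bg_pid"

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$bg_pid" 2>/dev/null; then
    echo "the function is running as its own process"
fi

wait "$bg_pid"   # let the background function finish before exiting
```

The two PIDs should differ. Without the final wait, the script could exit while the function was still running, which is the situation my dirty script got itself into.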

So the super simple and now plainly obvious solution (while still blindly killing all rsync processes) is to do the following:

$ pkill -9 -f dirty.sh
$ pkill -9 rsync

Which fixed the problem! Yay!

I guess the lesson here is that you should a) clean up all your background processes in shell scripts, because the shell won’t do that for you, and b) be wary of the fact that a function run in the background of a bash script is its own process, and it too keeps running after the script exits. That lesson was quite informative, albeit very annoying.
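
For what it’s worth, here is a minimal sketch of lesson a), assuming a trap on EXIT is an acceptable way to stop the loop. It is not what my script did at the time. The sleep calls stand in for the rsync and ssh commands, and sync_loop, sync_pid, and cleanup are names invented for the sketch.

```shell
#!/bin/bash
# Sketch only: sleep stands in for the rsync and ssh commands.
# sync_loop, sync_pid and cleanup are hypothetical names.

sync_loop() {
    while true; do
        sleep 1   # placeholder for the rsync call
    done
}

cleanup() {
    # Stop the background loop so it cannot start anything new.
    kill "$sync_pid" 2>/dev/null || true
    # Collect its exit status so no zombie is left behind.
    wait "$sync_pid" 2>/dev/null || true
}

sync_loop &
sync_pid=$!               # remember which process to stop later

trap cleanup EXIT         # run cleanup whenever the script exits
trap 'exit 1' INT TERM    # turn Ctrl-C or a plain kill into a normal exit

sleep 1   # placeholder for the foreground ssh command
```

Killing the loop this way stops new syncs from starting. As far as I can tell, an rsync that is already mid-transfer would be left to finish on its own, so a stricter version might also kill the loop's children.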
