Unplugging external drive doesn't trigger a shutdown

Hello Kicksecure folks,

I installed Debian on my brand new Western Digital external SSD drive. Then I proceeded to distro morph it following the guide (probably should have gone the ISO route), and everything works as intended.

However, when I unplug my SSD, or it comes unplugged somehow (which has happened before), it doesn’t trigger a full shutdown.

Why doesn’t it fully shut down when unplugged?
I do want to add that I also set up LUKS when I installed Debian.

Do I need to create a udev rule that runs a shutdown script (unmount, cryptsetup luksClose, and sync)?

Functionality like that discussed in “panic button / panic shutdown / BusKill - The USB kill cord for your laptop” (Development, Whonix Forum) is not a standard, default feature of Debian (or of any operating system that I am aware of).

Adding such a feature is planned but will take time.


Ok, in the meantime though is there a udev starting point I could try to test?

I’d be willing to test anything related to implementation of such a feature.

@NTH9R6 I think you might be confusing Tails and Kicksecure. Tails is the only OS that I know of that is designed to trigger an emergency shutdown when the USB stick that it’s installed on is removed.

But, as Patrick pointed out, probably the easiest way to get your system to shut down when a USB device is ejected is BusKill, which can be installed on Debian-based systems:

sudo apt-get install buskill
buskill

But if you prefer to do this with udev, we have a guide on that too.

Update: it actually looks like privacy__tech_tips@tube.tchncs.de just published a video today showing how to do this on Kicksecure.

They show how to do the self-destruct trigger, but you can trivially change that to lock the screen instead, if you prefer.

Feel free to experiment with this.

“Tails is the only OS that I know of that is designed to trigger an emergency shutdown when the USB stick that it’s installed on is removed”

“but you can trivially change that to lock the screen instead, if you prefer.”

Yeah, to be clear, I’m not looking to shred the LUKS header; I don’t want to destroy my data. While I do understand that such an emergency shutdown might lose some data (like that of the currently booted session), I’m okay with that.

What I’m trying to achieve is that in an emergency, or if my SSD gets unplugged, a full shutdown is triggered, meaning that the LUKS volume ends up in a closed state. Ideally I would like to go further with something similar to Tails that also wipes RAM, so that in scenarios like those mentioned by @Penthouse the LUKS password would be cleared from system memory.

I need to look more into all this and report back. The emergency shutdown seems like it could be implemented, but the clearing of RAM would need to be figured out more due to the different design (Tails vs dracut). BTW I was not aware of the ram-wipe package until I made this reply.

www.kicksecure.com/wiki/Dev/RAM_Wipe#Differences_of_ram-wipe_versus_Tails_Memory_Erasure

@maltfield
Does the shutdown -h now command close the LUKS volume, or should an unmount and cryptsetup luksClose be run before that command?
If so, I wonder whether replacing RUN+="shutdown -h now" with one of the dracut ram-wipe module hooks would be sufficient for me to test in non-live mode.

Take a look at luksSuspend

I would execute a luksSuspend followed by shutdown. If you want a faster shutdown, try echo o > /proc/sysrq-trigger &. See how we do this at the end of the BusKill self-destruct script:

#################################
# WIPE DECRYPTION KEYS FROM RAM #
#################################

# suspend each currently-decrypted LUKS volume
${ECHO} "INFO: removing decryption keys from memory"
for device in $( ${LS} -1 "/dev/mapper" ); do

	${ECHO} -e "\t${device}";
	${CRYPTSETUP} luksSuspend "${device}" &

	# clear page caches in memory (again)
	sync; echo 3 > /proc/sys/vm/drop_caches

done

#############################
# (IMMEDIATE) HARD SHUTDOWN #
#############################

# do whatever works; this is important.
echo o > /proc/sysrq-trigger &
sleep 1
shutdown -h now &
sleep 1
poweroff --force --no-sync &
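
One detail about the loop above: it walks every entry in /dev/mapper, which normally includes the device-mapper control node and may include non-LUKS mappings such as LVM volumes. Below is a minimal Python sketch of narrowing the list to dm-crypt mappings first. It is not part of the BusKill script. It assumes that `dmsetup ls --target crypt` prints the mapping name as the first column of each line, and the helper names are made up for illustration. Plain (non-LUKS) dm-crypt mappings would still show up here, so luksSuspend can still fail on some entries.

```python
import subprocess


def parse_crypt_mappings(dmsetup_output):
    """Extract mapping names from `dmsetup ls --target crypt` output.

    Assumption: each line starts with the mapping name, followed by
    whitespace and the device numbers. "No devices found" means none.
    """
    names = []
    for line in dmsetup_output.splitlines():
        line = line.strip()
        if not line or line == "No devices found":
            continue
        names.append(line.split()[0])
    return names


def suspend_commands(names):
    """Build one `cryptsetup luksSuspend <name>` argv list per mapping."""
    return [["cryptsetup", "luksSuspend", name] for name in names]


def list_crypt_mappings():
    """Ask dmsetup (needs root) which mappings use the crypt target."""
    result = subprocess.run(
        ["dmsetup", "ls", "--target", "crypt"],
        capture_output=True, text=True, check=True,
    )
    return parse_crypt_mappings(result.stdout)
```

The command lists from `suspend_commands(list_crypt_mappings())` could then be started in the background, in the same spirit as the `&` in the shell loop, before the hard shutdown is triggered.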

This is definitely a good idea. I took a look at the linked video about BusKill. The idea seems good, but I don’t think that’s the method we should use for Kicksecure, because it’s bound to the USB product ID: you have to custom-write the udev rule for the specific USB drive in use. We could in theory detect this on boot, I guess, but that seems a bit clunky.

I also don’t trust USB drive manufacturers to choose truly unique IDs, as opposed to just putting in some example ID they found in documentation somewhere. (If even AMD can make this kind of mistake with encryption keys related to microcode security, I have no reason to trust off-brand USB drive makers to do any better.) The last thing a user needs is to have their system suddenly shut down because they removed a drive that happened to look suspiciously similar to the main drive.

I looked at how Tails has this implemented, and basically:

  • They have a custom C executable, udev-watchdog, which appears to watch for a specified device to be removed and send some sort of signal to whatever called it when that happens.
  • There’s then a wrapper around it that figures out which device is the boot drive, uses udev-watchdog to watch for when it gets removed, and then triggers a hard shutdown when that happens.

This is a somewhat better solution IMO, and the code behind it is open-source, so in theory we could just integrate it directly into Kicksecure. It does have some possible downsides though. Most notably, I found some reports from three years ago that the feature didn’t always work, while the last code change was four years ago. Maybe that was just user error though.

For Kicksecure’s implementation, I’d like to avoid custom C code since Kicksecure tries to use architecture-independent code wherever possible. We can use a udev-based approach similar to Tails, but written using Python and pyudev instead. I’ll try to implement this probably tonight and see what happens.
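
For anyone who wants to experiment in the meantime, here is a rough sketch of what a pyudev-based watcher could look like. This is not the Kicksecure implementation, just an illustration of the idea. The function names and the `/dev/sdb` example are hypothetical, and pyudev is a third-party package that has to be installed separately.

```python
def is_boot_device_removal(action, device_node, watched_node):
    """True when a udev event reports that the watched node went away."""
    return action == "remove" and device_node == watched_node


def watch_for_removal(watched_node, on_removed):
    """Block until the watched block device is removed, then call on_removed."""
    # Third-party dependency, imported lazily so the helper above can be
    # used (and tested) without pyudev installed.
    import pyudev

    context = pyudev.Context()
    monitor = pyudev.Monitor.from_netlink(context)
    monitor.filter_by(subsystem="block")
    # monitor.poll() blocks until the next event arrives.
    for device in iter(monitor.poll, None):
        if is_boot_device_removal(device.action, device.device_node, watched_node):
            on_removed()
            return
```

Calling `watch_for_removal("/dev/sdb", callback)` would run `callback` once the kernel reports that `/dev/sdb` is gone. Working out which node to watch is a separate problem.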


Won’t be able to finish this tonight; determining the block device that’s actually underneath the filesystem is a bit of a challenge. One of the comments I wrote in my current code reads:

    # ... the real underlying device may be many
    # layers deep. For instance, with the Kicksecure ISO, the root filesystem
    # is mounted from an overlayfs which uses a lower directory mounted from a
    # loop device backed by a squashfs located in a directory under a
    # mountpoint backed by /dev/sr0 (when booting from a CD-ROM drive that
    # is).

There’s also some higher-priority work for me to tackle before I finish this. I have most of the research done though, including how to find underlying devices or mountpoints in a wide variety of situations, and how to monitor the real device once you find it. Also wrote a rather thorough “force shutdown” script based partially on @maltfield’s suggestions above.
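
To give a feel for the simpler layers of that lookup, here is a hedged Python sketch (not the code mentioned above). It reads the source of the `/` mount from mountinfo-formatted text and then follows the sysfs `slaves` directories to reach the devices at the bottom of a device-mapper stack. Function names are made up for illustration. It does not handle the overlayfs, loop, and squashfs layers from the ISO case, and it stops at a partition rather than mapping it to its parent disk.

```python
import os


def root_mount(mountinfo_text):
    """Return (fstype, source) of the mount currently visible at "/".

    Later mountinfo lines shadow earlier ones at the same mount point,
    so the last matching line wins. Returns None if nothing matches.
    """
    found = None
    for line in mountinfo_text.splitlines():
        # A lone "-" separates the per-mount fields from the fs fields.
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) >= 5 and len(right_fields) >= 2 and left_fields[4] == "/":
            found = (right_fields[0], right_fields[1])
    return found


def leaf_devices(dev_name, sys_block="/sys/class/block"):
    """Follow sysfs "slaves" entries down to devices with nothing beneath them."""
    slaves_dir = os.path.join(sys_block, dev_name, "slaves")
    slaves = sorted(os.listdir(slaves_dir)) if os.path.isdir(slaves_dir) else []
    if not slaves:
        return {dev_name}
    leaves = set()
    for slave in slaves:
        leaves |= leaf_devices(slave, sys_block)
    return leaves
```

On a LUKS install one might feed it the contents of `/proc/self/mountinfo`, resolve the returned source with `os.path.basename(os.path.realpath(source))` to get a kernel name such as `dm-1`, and pass that to `leaf_devices()`.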


Did a lot of research into this. The approach I was taking technically worked, but was not at all guaranteed to work. I documented the issues (and how to work around them) pretty well in a code comment in the program I’m currently writing to implement this:

/*
 * This program is designed specifically to immediately and forcibly power off
 * the system in the event the device providing the root filesystem is
 * abruptly removed from the system. The idea is that a user can shut down
 * a portable installation of Kicksecure by simply yanking the USB drive
 * containing the installation from the computer. Tails provides essentially
 * the same feature, however it is known for occasionally failing to do its
 * job properly.
 *
 * The fact that we're triggering a shutdown when the device containing the
 * root filesystem vanishes presents a number of significant challenges:
 *
 * - The device providing the entire operating system is gone. The only things
 *   we will still have left are the kernel, files loaded into RAM (for
 *   instance under /run), and anything that happens to still be in the
 *   system's disk cache.
 * - Virtually any process on the system may abruptly crash at any time. This
 *   isn't just because applications may be unable to access files. The Linux
 *   kernel's virtual memory subsystem doesn't just page out RAM contents to a
 *   swap file, it will sometimes simply erase pages containing executable
 *   code from memory if it can reload that code from disk later when needed.
 *   If part of a program isn't present in memory, and then the root device
 *   vanishes, any attempt to use code in the absent part of the application
 *   will result in the application crashing. (Attempts to access data in RAM
 *   that happened to be paged out will result in a similar crash.)
 * - We have no control over what is and isn't in the disk cache, which makes
 *   it unsafe to launch any dynamically linked executable. What happens if we
 *   need to load a missing part of libc? What if the dynamic linker itself
 *   needs to be loaded from disk?
 * - Systemd could lock up at any time, since the init process isn't immune to
 *   having bits of it erased from RAM to free up memory. If systemd receives
 *   a SIGSEGV, rather than crashing (which would panic the kernel), it goes
 *   into an "emergency mode" that tries to keep the system as operational as
 *   possible even though PID 1 is now out of service.
 *
 * Circumventing this set of difficulties is not easy, and it might not even
 * be entirely possible. To give our feature the highest chance of success:
 *
 * - We use memlockd to lock systemd into memory. It can hold its own pretty
 *   well in the event of a segfault, but if its crash handler ends up
 *   re-segfaulting, that could get ugly. For good measure, we lock libc,
 *   libsystemd-core, and libsystemd-shared into memory too so that the crash
 *   handler has the highest possible chance of not re-crashing.
 * - We compile the utility at boot time, statically link it against all of
 *   its dependencies (really only one, glibc), and load it into /run. This
 *   allows for decent architecture independence while removing any dependency
 *   on anything that isn't in RAM, thus (hopefully!) making the process
 *   crash-immune.
 * - Because we're static-linking against glibc, we cannot call anything
 *   defined in stdio.h. This is because glibc uses dlopen() to load iconv
 *   modules, which are used internally by glibc for locale support. Things
 *   defined in stdio.h may use iconv, so calling anything there will
 *   basically make our static-linked executable become dynamically linked,
 *   which could segfault it since the root filesystem is gone. We can't call
 *   anything that could touch Name Service Switch (NSS) either, but we have
 *   no need to do so, so we should be safe there. See
 *   https://stackoverflow.com/questions/57476533/why-is-statically-linking-glibc-discouraged
 * - We can't use udev either because libudev is only available as a dynamic
 *   library. That means we have to listen to kernel uevents directly to
 *   determine when the root device vanishes. Thankfully this isn't as much of
 *   a pain as it might sound like.
 * - We don't call out to any external process, since those external processes
 *   could segfault.
 *
 * This is likely superior to Tails' implementation, which uses udev (and thus
 * dynamic linking), uses an interpreter-driven script to shut down the system
 * when the root device vanishes, and calls out to external executables to
 * actually shut the system down. These issues are likely why Tails'
 * implementation of emergency shutdown occasionally fails. See
 * https://www.reddit.com/r/tails/comments/xh8njn/tails_wont_shutdown_when_i_pull_usb_stick/
 * (there are other similar posts as well).
 */

The rest of my research was focused mainly on figuring out how to read kernel uevents without involving libudev. This took a while, partially because libudev itself was sticking its head in the way and confusing my program’s output, and partially because man 7 netlink was extraordinarily misleading (it told me the data from the kernel I was reading was going to be in one format, it turned out to be in a completely different format). However, I’ve gotten to the point where I can read kernel uevents, which will let me detect when the root filesystem drive is removed without needing any other libraries or executables (which is important for the reasons described above).
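
For readers curious what reading uevents without libudev involves, here is a small Python sketch of the message handling. The actual program is written in C and statically linked for the reasons given in the comment above, so this only illustrates the data format. It assumes that kernel uevents arrive on netlink multicast group 1 as plain NUL-separated text of the form `action@devpath`, followed by `KEY=VALUE` pairs, and that `DEVNAME` for a block device is the bare kernel name such as `sdb`. The function names are hypothetical.

```python
import socket

NETLINK_KOBJECT_UEVENT = 15  # protocol number from <linux/netlink.h>
KERNEL_UEVENT_GROUP = 1      # multicast group carrying the kernel's own uevents


def open_uevent_socket():
    """Open a netlink socket subscribed to kernel uevents (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_UEVENT_GROUP))  # pid 0 lets the kernel pick our address
    return sock


def parse_kernel_uevent(payload):
    """Turn a NUL-separated kernel uevent into a dict of its KEY=VALUE pairs.

    Returns None for anything that lacks the leading "action@devpath" header.
    """
    parts = payload.split(b"\0")
    if b"@" not in parts[0]:
        return None
    fields = {}
    for part in parts[1:]:
        key, sep, value = part.partition(b"=")
        if sep:
            fields[key.decode(errors="replace")] = value.decode(errors="replace")
    return fields


def is_removal_of(fields, dev_name):
    """True if the parsed uevent says block device dev_name was removed."""
    return (
        fields is not None
        and fields.get("ACTION") == "remove"
        and fields.get("SUBSYSTEM") == "block"
        and fields.get("DEVNAME") == dev_name
    )


def wait_for_removal(dev_name):
    """Block until the kernel reports that dev_name has been removed."""
    sock = open_uevent_socket()
    try:
        while True:
            if is_removal_of(parse_kernel_uevent(sock.recv(65536)), dev_name):
                return
    finally:
        sock.close()
```

`wait_for_removal("sdb")` would return once the drive is pulled. Subscribing only to group 1 matters here: group 2 carries the messages that udev rebroadcasts after processing, which use a different framing.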

Once this is done and working, it might be worth trying to contribute it back to Tails (or at least let them know about our work) so they can improve their implementation as well.
