Dracut compression research

Patrick asked me to look into the compression options offered by Dracut for initramfs compression. These are the results of that research, written up here since they are a bit lengthy and I needed somewhere I could post images to share the charts I made.

tl;dr: I think we should use xz compression. It’s acceptably fast, it compresses way better, and there doesn’t seem to be any compelling reason to avoid it.

The initramfs has to be decompressed by the kernel on boot, so I only researched compression algorithms supported by the kernel in Debian Bookworm. (Trixie's kernel supports the exact same compression algorithms, so this research should apply to Trixie as well, assuming the compression utilities perform similarly there to how they perform in Bookworm, which I find likely given the maturity of these utilities.)

Bookworm’s kernel and Dracut both support the following compression algorithms:

  • gzip (this is what we’re using now)
  • lz4
  • lzma
  • lzo
  • xz
  • zstd
  • cat (uncompressed, taken into consideration to provide a best-case scenario for time and a worst-case scenario for size)

Dracut also supports bzip2, but neither Bookworm's nor Trixie's kernel appears to support it.
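
One quick way to verify which decompressors a given kernel build supports (not something from the research notes above, just a handy check) is to look at the CONFIG_RD_* options in its build config:

```
# List the initramfs decompressors enabled in the installed kernel's config.
# CONFIG_RD_GZIP, CONFIG_RD_XZ, CONFIG_RD_ZSTD, etc. map to the algorithms
# the kernel can unpack at boot.
grep 'CONFIG_RD_' /boot/config-"$(uname -r)"
```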

To determine which algorithm was most likely desirable, I benchmarked them against each other. The following testing methodology was used:

  • Create a file, /etc/dracut.conf.d/99-compress.conf, in which to set the compression algorithm (see the example config after this list).
  • For each compression algorithm, change 99-compress.conf to specify the desired algorithm, then run time sudo dracut --force three times.
  • Record the output of time after each dracut invocation.
  • Record the size of the output initramfs after each dracut invocation.
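
For illustration, the config file only needs to set dracut's compress option; a minimal sketch of what 99-compress.conf might look like (the exact file contents used in the tests aren't reproduced here):

```
# /etc/dracut.conf.d/99-compress.conf
# Set to gzip, lz4, lzma, lzo, xz, zstd, or cat for each test run.
compress="xz"
```

Swapping the value between the algorithms listed above and re-running time sudo dracut --force produces the timings below.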

In all instances, the file size of the output initramfs was identical (to within a kilobyte at least) across all three runs of Dracut when using a single compression algorithm, so I only recorded file size once per algorithm. All tests were run in a KVM/QEMU virtual machine with 4 virtual CPUs, 4 GB RAM, and all host CPU features passed through to the guest (-cpu host). The host system has an i9-14900HX processor. The raw results of the benchmarks are as follows:

  • gzip
    • run 1: 0.06s user 0.10s system 1% cpu 12.240 total
    • run 2: 0.01s user 0.02s system 0% cpu 10.907 total
    • run 3: 0.01s user 0.02s system 0% cpu 10.912 total
    • file size: 39724K
  • lz4
    • run 1: 0.02s user 0.01s system 0% cpu 7.476 total
    • run 2: 0.02s user 0.01s system 0% cpu 7.408 total
    • run 3: 0.01s user 0.02s system 0% cpu 7.541 total
    • file size: 48064K
  • lzma
    • run 1: 0.09s user 0.07s system 0% cpu 46.595 total
    • run 2: 0.01s user 0.02s system 0% cpu 45.677 total
    • run 3: 0.02s user 0.01s system 0% cpu 45.339 total
    • file size: 26292K
  • lzo (lzop)
    • run 1: 0.09s user 0.08s system 0% cpu 38.810 total
    • run 2: 0.02s user 0.01s system 0% cpu 37.862 total
    • run 3: 0.01s user 0.02s system 0% cpu 37.837 total
    • file size: 44880K
  • xz
    • run 1: 0.09s user 0.08s system 1% cpu 11.655 total
    • run 2: 0.01s user 0.01s system 0% cpu 10.542 total
    • run 3: 0.02s user 0.01s system 0% cpu 10.571 total
    • file size: 28288K
  • zstd
    • run 1: 0.10s user 0.07s system 1% cpu 8.676 total
    • run 2: 0.08s user 0.09s system 2% cpu 7.680 total
    • run 3: 0.01s user 0.01s system 0% cpu 7.404 total
    • file size: 33248K
  • cat (no compression, baseline)
    • run 1: 0.09s user 0.08s system 3% cpu 5.279 total
    • run 2: 0.01s user 0.02s system 0% cpu 4.189 total
    • run 3: 0.01s user 0.02s system 0% cpu 4.252 total
    • file size: 136088K

I did not benchmark boot speed with each of the different algorithms used, although I did verify that the virtual machine booted with an initramfs made with each algorithm. The boot speed seemed pretty much the same to me with each algorithm, and would have been difficult to measure in an objective, reliable fashion. I will note that I may have noticed a very slight speedup during boot with the zstd algorithm.
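
If someone does want a rough number later, systemd-analyze prints the time spent in the initrd as part of its boot-time breakdown; single runs fluctuate, so small differences should be treated as noise:

```
# Breakdown of boot time, including the time spent in the initrd.
systemd-analyze
```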

The following two graphs are provided to visualize the data above (created using LibreOffice):

[Charts not reproduced here: initramfs file size per algorithm, and dracut generation time per algorithm.]

Takeaways from the above data:

  • From a size standpoint, lzma performed the best and lz4 performed the worst.
  • From a speed standpoint, zstd and lz4 are about tied for best, while lzma is the worst.
  • lzo/lzop is certainly a bad choice - it approaches lzma in slowness while also producing a file larger than gzip's and almost as large as lz4's. It's the worst of all worlds combined.
  • xz is probably the most compelling as far as a balance between good speed and good compression - it's almost as good as lzma on size, while being just a hair faster than gzip.
  • zstd is a real improvement over gzip in size (roughly 16% smaller), while also being very, very fast.

I am of the opinion that xz is the best choice here based on this data, as our existing compression speed has been acceptable and shaving off 2-3 seconds per initramfs generation doesn’t seem that compelling except perhaps when doing ARM builds of Kicksecure on x86_64 hardware.

It is worth noting, people who know much more about compression tools under the hood than I do have complaints about design flaws in xz, which are documented here. I do not believe the issues mentioned in this article are of concern for the following reasons:

  • The article primarily relates to xz’s suitability for long-term archival. Kernel initramfs files aren’t really something where “long-term archival” is a concern.
  • Most of the article focuses on xz's lack of resiliency in the face of partial archive corruption. But we don't care about this at all; we assume the initramfs is bit-for-bit identical to when it was created, and indeed in the future we will likely be signing the initramfs as part of Verified Boot (which will mandate that the initramfs be bit-for-bit identical to when it was created).
  • Other parts of the article focus on compatibility issues with different versions of xz. This also is not a concern - as long as dracut produces an initramfs that the Linux kernel can read and boot with, things are compatible enough for us.
  • The rest of the article appears to focus on various design decisions in xz that could have been made better. None of that is relevant for us though, since whatever feature set Dracut is using in xz is good enough to give it a very acceptable compression speed while also providing the second-smallest file size of any of the compression algorithms documented here. Even if xz could be better than it is, right now it’s better than everything else except maybe zstd if you really care about speed.

Someone else did similar compression performance benchmarking and concluded that zstd was the best general-purpose algorithm, though the data they present shows that parallel xz actually performed better than zstd's best compression in both compression size and speed. One concerning thing this article does point out, though, is that xz can use a lot of RAM. For this reason, I tried generating a dracut initramfs using xz compression in a Kicksecure-CLI VirtualBox VM with only 512 MB RAM. Memory consumption during the initramfs generation rose to a maximum of 318M according to htop, and went back down to 263M once the initramfs was generated, meaning that dracut and whatever tools it ran (including xz) used about 55M of memory during the generation process. Only about 1.97M of swap ended up used. This is much more than gzip and zstd need (both require only about 5M of memory), but seems acceptable to me. 55M is not that much, especially given that fwupd is just sitting there eating 97M while doing basically nothing.
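
As a suggestion for anyone repeating this measurement: GNU time's verbose mode (from the time package, not the shell builtin) reports the peak resident set size directly, which is less fiddly than watching htop. This is not the method used above, just an alternative:

```
# Report peak memory usage ("Maximum resident set size") for the dracut run.
# Note this reflects the largest process in the tree, not the sum of all
# child tools.
sudo /usr/bin/time -v dracut --force
```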

As a final note, I checked cve.org to see if security vulnerabilities were found in either the xz or zstd compressors in the Linux kernel. I found no vulnerabilities for either algorithm (using the search terms "linux kernel xz" and "linux kernel zstd").

In conclusion, I believe xz is the best compression method for us to use with dracut, due to its slightly better performance and much better compression compared to our current default, gzip. zstd is my second choice, and may be what we want to use if we have problems with xz’s speed or memory consumption.

Fedora was using xz and did a comparison of xz vs. zstd; the conclusion reached was that zstd will overall give better performance for users:

https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression#Use_case:_Firefox_installation

Another comparison by a different project (not against xz, but still useful):

A second issue with xz is the Nebraska issue (very important software depending on a single dev or very few devs doing too much work over a long time):

The XZ maintainer experienced burnout and consequently scaled back some of the contribution reviews, which led to the XZ backdoor issue. GNOME discussed this issue as well:

Bottom line is that zstd is the way to go.

Good thing that you reached the conclusion that it's the second-best choice.

It's worth noting that Fedora switched to zstd specifically for RPM compression. The initramfs use case is substantially different - size is more important than speed (to a point) because a too-big initramfs might fail to boot entirely. This may become a real risk once we start using Dracut's "sloppy hostonly" improvements, which include a lot more kernel modules and thus increase the size of the initramfs. The performance impact of zstd is benchmarked for initramfs compression above, and the impact is arguably negligible.

We shouldn't let XZ's reputation be tarnished after the xz-utils backdoor, in my opinion. The attack's level of sophistication was high enough that I doubt most people would have handled things any better than the primary maintainer did, and after this experience I don't believe he'll make the same mistakes again. XZ's source code is, to the best of our knowledge, clean currently, so I don't believe the attack is a good reason to avoid it.

As for the Nebraska issue, that is a real problem. Zstd has the full force of Facebook backing it up, so it's in a much better position to remain maintained. (Arguably it's in a good position to be backdoored by a nation-state too, but that depends on how much you trust Meta I guess :P) That being said, as long as xz works reliably for initramfs compression when Debian 13 rolls around, I don't see an issue - Debian will be responsible for keeping it safe and maintained at that point, and we can change our minds for a future release of Kicksecure. If we were using a rolling form of Debian, I'd probably pick zstd due to this concern, but since we're using stable releases, I don't see any substantial danger in using xz.

It was done as a "change everything to zstd":

Lol, yeah, I thought about it from that perspective, but if we look at the Linux kernel itself, these companies' hands are already in there anyway (Intel, NVIDIA, etc.). So either we go with free software (that is reliably secure) regardless of where it comes from, or it's a lost cause.

Yeah, guess who discovered the xz backdoor? Not a Debian maintainer or the like, but a Microsoft engineer…

So if there are no big issues with zstd + Debian, I don't see it as a bad choice, imho.

OK, good to know.

Indeed, I made that comment mostly as a joke, thus the “:P”.

The XZ backdoor never affected any stable release of Debian, because Debian doesn’t introduce new versions of software into older releases. So while the backdoor was not discovered by Debian, Debian did a very good job of defending the vast majority of their users from it even before it was known to be a thing.

I believe I pointed out a potential issue when mentioning overly large initramfs files causing boot failures on some machines. Is it a big issue? I guess that depends, but I know from experience that there are systems that, when faced with certain kernel drivers, will fail to boot with an initramfs compressed with a less efficient algorithm, and will succeed in booting when that same initramfs is compressed with xz.

I am not worried about xz. What happened to xz can happen to any project. After the xz backdoor was spotted, there have been a lot of eyes on xz. As a result, xz is now one of the projects that has received extensive code review.

I was wondering what should be the priority here. These are the choices that I could identify:

  • A) small size: optimize Kicksecure images for the smallest possible size?
  • B) fastest initrd creation time: the smallest amount of time spent on initrd creation during upgrades?
  • C) boot speed?
  • D) best boot compatibility for low RAM systems?

C) and D) seem more important than A) and B).

Decompression at boot probably means that, for a brief moment, both the compressed image and the uncompressed (expanded) contents reside in RAM at the same time. On systems with very limited memory, this could cause a RAM usage spike: one would need enough RAM to hold both the compressed image and the fully decompressed contents simultaneously, at least during the extraction phase.

But I don’t know if that is a realistic consideration as user space might need more memory than this spike.
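
As a back-of-the-envelope check using the numbers above (and assuming the unpacked contents are roughly the size of the cat image): with xz, the spike would be about 28288K compressed + 136088K unpacked ≈ 164 MB, whereas an uncompressed image would need about 136088K + 136088K ≈ 272 MB while the archive is being extracted. So, if this spike is real, a smaller compressed image reduces the worst-case memory requirement rather than adding to it.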

Keeping the default, uncompressed, also remains an option.


https://www.reddit.com/r/archlinux/comments/ytk6t3/blog_should_i_compress_my_initramfs/


Debian, quoting the bug report "initramfs-tools: [RFC] Compress initramfs file with zstd":

  • [742c8ee] Use zstd as default compression for initramfs (Closes: #976054)

But we’re not getting this change because we’re using dracut instead of initramfs-tools.
(replacing initramfs-tools with dracut - Development - Whonix Forum)

related:


According to the following configuration file, dracut allegedly uses zstd by default. I wonder why ours is uncompressed by default.

The default is gzip-compressed, not uncompressed. Dracut will indicate that it is using an automatically detected compression method (“pigz”, which is parallel gzip as I understand it) when the compression method is not set. So right now we’re using gzip compression in all instances.
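
For anyone who wants to double-check, a quick generic way to confirm what an installed initramfs is actually compressed with:

```
# Identify the compression format of the installed initramfs.
# If an early-microcode CPIO is prepended, file(1) may just report
# "ASCII cpio archive"; dracut's lsinitrd copes with that layout.
file /boot/initrd.img-"$(uname -r)"
```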

Systems with multiple tens of gigabytes of RAM can error out with “no available memory” if trying to boot with an initramfs that is too large. The issue is not “total amount of memory”, the issue is “total amount of contiguous memory that GRUB can see” (if I’m understanding correctly), and on some hardware, GRUB can’t find a very large contiguous block.

xz isn’t the fastest, but it’s faster than gzip, and much more efficient size-wise. I didn’t notice any impact on boot speed, and smaller size provides better compatibility, so that’s why I think xz strikes a good balance between all four points above. zstd sacrifices size (and therefore possibly compatibility) in favor of speed.

Testing the settings from Fedora CoreOS:

  • zstd with max compression:
    • run 1: 0.01s user 0.03s system 0% cpu 18.719 total
    • run 2: 0.02s user 0.02s system 0% cpu 17.553 total
    • run 3: 0.01s user 0.03s system 0% cpu 17.478 total
    • file size: 29628K

We get a slightly (though only very slightly) larger file than xz would produce, and it takes about 7 seconds longer to produce it than xz or gzip would have taken. xz wins for both size and performance.
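
For reference, max-level zstd goes through the same dracut compress option; I'm not reproducing Fedora CoreOS's exact flags here, so the invocation below is only illustrative:

```
# /etc/dracut.conf.d/99-compress.conf
# Illustrative max-compression zstd settings; Fedora CoreOS's exact flags
# may differ. -19 is zstd's highest standard level, -T0 uses all cores.
compress="zstd -19 -T0"
```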

Alright. Please implement xz.
