Disabling SMT in security-misc may make security worse on some POWER9 systems

security-misc's policy of disabling SMT may actually make security worse on POWER9 systems that use paired cores (e.g. the 18-core and 22-core CPUs sold by Raptor). It would be useful to figure out whether we want to disable that Kicksecure feature only on POWER9 systems with paired cores (I don’t know how to detect this programmatically), or perhaps disable that feature completely on POWER9 (given that with SMT enabled on POWER9, there are either 4 or 8 threads per cache, and the documented attacks on SMT only work with 2 threads per cache). Maybe factor out that specific policy into a separate optional package, and advise that POWER9 users should only install it if they have a CPU with unpaired cores? I don’t know whether it’s possible to disable paired cores in software, but if so, that would also be worth considering.

OK, so this is a little more complex than I thought.

  • On both Intel/AMD x86 and POWER9, L1 cache is per-core.
  • On Intel/AMD x86, L2 cache is per-core, while on POWER9, it is per-chiplet (a chiplet is 2 cores).
  • On Intel x86, L3 is shared between all cores; apparently some newer AMD x86 CPUs have two L3 caches, each of which is shared between 4 cores; on POWER9, it is per-chiplet.
  • This information is available to both the kernel and userspace; the lstopo utility will show it on GNU/Linux.
  • It is possible to disable arbitrary logical CPUs from userspace if you have root privileges (I tested this on Intel x86 and POWER9; I'm not sure whether it’s supported on all Linux architectures); you can combine this with lstopo's output to make sure that caches are not shared between the logical CPUs that stay online.
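To make the last two bullets concrete, here is a minimal Python sketch (not the script mentioned later in this thread) that reads the same topology lstopo displays, straight from the Linux sysfs files under /sys/devices/system/cpu, and that can take a logical CPU offline. The function names and the `sysfs_root` parameter are my own, chosen for illustration; writing to the `online` file requires root, and on some machines cpu0 has no `online` file at all.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_text(path):
    with open(path) as handle:
        return handle.read().strip()


def cache_sharing(sysfs_root="/sys/devices/system/cpu"):
    """Map (level, type) to the set of CPU groups that share one such cache."""
    groups = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cache", "index[0-9]*")
    for index_dir in glob.glob(pattern):
        try:
            level = int(read_text(os.path.join(index_dir, "level")))
            cache_type = read_text(os.path.join(index_dir, "type"))
            shared = read_text(os.path.join(index_dir, "shared_cpu_list"))
        except (OSError, ValueError):
            continue  # incomplete entry; skip it rather than guess
        key = (level, cache_type)
        groups.setdefault(key, set()).add(tuple(parse_cpu_list(shared)))
    return groups


def set_cpu_online(cpu, online, sysfs_root="/sys/devices/system/cpu"):
    """Write 1 or 0 to cpuN/online. Needs root on a real system."""
    path = os.path.join(sysfs_root, "cpu%d" % cpu, "online")
    with open(path, "w") as handle:
        handle.write("1" if online else "0")


if __name__ == "__main__":
    # Read-only report of which logical CPUs share each cache.
    for (level, cache_type), cpu_groups in sorted(cache_sharing().items()):
        print("L%d %s: %s" % (level, cache_type, sorted(cpu_groups)))
```

Running it without root only prints the sharing groups; nothing is taken offline unless `set_cpu_online` is called explicitly.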

So, here are some questions to ponder:

  1. Is L2 sharing between threads a security risk that Kicksecure wants to prevent, even if it damages performance? If so, Intel/AMD x86 is already fine, as are POWER9 CPUs with unpaired cores, but we will want to add a mitigation for POWER9 CPUs with paired cores. The POWER9 paired core mitigation will cut performance by a factor of 2. Without an additional mitigation, disabling SMT, as Kicksecure currently does, probably magnifies the risk.
  2. Is L3 sharing between threads a security risk that Kicksecure wants to prevent, even if it damages performance? If so, POWER9 CPUs with unpaired cores are already fine, but both Intel/AMD x86 and POWER9 CPUs with paired cores will need mitigations. The L2 POWER9 paired core mitigation will also cover L3. The Intel x86 mitigation will cut performance down to 1 thread; the AMD x86 mitigation will cut it to either 1 or 2 threads. Without an additional mitigation, disabling SMT, as Kicksecure currently does, probably magnifies the risk.
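The thread counts in the two questions above can be checked with a small sketch. It assumes the mitigation is "keep at most one logical CPU online per shared cache", which is my reading of the proposal, not a confirmed design; `cpus_to_keep` is a hypothetical helper, and the CPU numberings in the examples are made up for illustration.

```python
def cpus_to_keep(cache_groups):
    """Pick logical CPUs so that no two kept CPUs appear in the same group.

    cache_groups is an iterable of CPU collections, one per shared cache.
    The lowest-numbered CPU of each group wins (simple greedy choice).
    """
    groups = [frozenset(group) for group in cache_groups]
    kept = []
    used_groups = set()
    for cpu in sorted(set().union(*groups)):
        mine = {i for i, group in enumerate(groups) if cpu in group}
        if mine & used_groups:
            continue  # shares a cache with a CPU we already kept
        kept.append(cpu)
        used_groups |= mine
    return kept


if __name__ == "__main__":
    # Hypothetical 4-core paired POWER9 with SMT already off:
    # cores 0 and 4 share one L2, cores 8 and 12 share another.
    print(cpus_to_keep([{0, 4}, {8, 12}]))              # [0, 8]
    # Hypothetical Intel x86: one L3 shared by every core.
    print(cpus_to_keep([{0, 1, 2, 3}]))                 # [0]
    # Hypothetical AMD x86: two L3 caches, 4 cores each.
    print(cpus_to_keep([{0, 1, 2, 3}, {4, 5, 6, 7}]))   # [0, 4]
```

Under that assumption the paired POWER9 case keeps half of its cores, the single-L3 case keeps one thread, and the two-L3 case keeps two, which matches the figures given in the questions.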

I’m not totally clear on how risky L2 and L3 (or whatever other state that is correlated with them) are. Maybe madaidan would be able to comment?

I have a Bash+Python script that works for disabling cache-sharing cores on my Intel x86 machine. If Kicksecure is interested in mitigating L2/L3 sharing vulnerabilities, I’m happy to post the code as GPLv3+ and contribute it to Kicksecure.

Thank you for bringing this up!

That’s currently far deeper down the rabbit hole than I am ready to research. There are far more fundamental tasks to finish before the Kicksecure project is fully bootstrapped: the ISO, then BIOS booting, EFI booting, and the generally messy situation of Linux versus SecureBoot RestrictedBoot.

Could you please consider asking these general security questions in general Linux computer security venues, and/or pointing us to security researchers or experts whom we could ask what is best here?

Related to:
“Decades of research by the security community lead to many best practices. Kicksecure implements the consensus of these reasonable security measures.”

Serious support for platforms other than Intel/AMD64 is difficult and might be impossible without a dedicated maintainer.

As for performance versus security, Kicksecure should prefer security over performance as long as that’s reasonable.

(The line here might be blurry. For a clear case: if in theory there were a security setting that, for example, made any use of a graphical user interface impossible, then of course we shouldn’t apply it.)

Not sure it can be used, but it might help to understand what extent of settings changes is currently required.

For a clean implementation, a declarative expression (static config files) should be preferred over a functional (scripted) approach.

Or perhaps just user documentation as a first step?

It seems that this would require some functional (scripted) approach anyhow: first determine the state of the CPU (paired vs. unpaired), and then apply different settings based on that? That seems kind of awful and complex. Perhaps it is worth a bug report / feature request against the Linux kernel?
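For what it's worth, the "determine the state of the CPU" step might be smaller than it sounds. The sketch below is an untested-on-POWER9 assumption, not a confirmed detection method: it treats a machine as having "paired" cores when some L2 cache is shared by logical CPUs that are not SMT siblings of each other, using the standard sysfs files `cache/index*/shared_cpu_list` and `topology/thread_siblings_list`. The function name and the `sysfs_root` parameter are hypothetical, and the check is not POWER9-specific: it reports any machine whose L2 spans more than one core.

```python
import glob
import os


def _cpu_set(path):
    """Read a sysfs CPU list file such as '0-3,8' into a set of ints."""
    with open(path) as handle:
        text = handle.read().strip()
    cpus = set()
    for part in text.split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def l2_spans_multiple_cores(sysfs_root="/sys/devices/system/cpu"):
    """True if any L2 cache is shared beyond one core's SMT siblings."""
    for cpu_dir in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        siblings_path = os.path.join(cpu_dir, "topology", "thread_siblings_list")
        for index_dir in glob.glob(os.path.join(cpu_dir, "cache", "index[0-9]*")):
            try:
                with open(os.path.join(index_dir, "level")) as handle:
                    if handle.read().strip() != "2":
                        continue
                l2_cpus = _cpu_set(os.path.join(index_dir, "shared_cpu_list"))
                siblings = _cpu_set(siblings_path)
            except (OSError, ValueError):
                continue  # missing or malformed entry; skip it
            if l2_cpus - siblings:
                return True  # L2 reaches a CPU outside this core
    return False


if __name__ == "__main__":
    print("L2 shared across cores:", l2_spans_multiple_cores())
```

A declarative config file could not express this check, so it would still be a scripted step, but it is read-only and needs no root privileges.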
