Tetragon: a case study of a security product’s self-protection
Story background
On May 16, 2022, the CTO of the cloud-native security company Isovalent announced that Tetragon (WayBackMachine 20220516), their eBPF-based Security Observability and Runtime Enforcement solution, had become open source after years of development.
Tetragon drew some attention because it claimed to be able to defeat container escapes that exploit Linux kernel vulnerabilities. Security researcher Felix Wilhelm questioned its rule-based detection/prevention, and Tetragon seems to have implemented a defense mechanism only for the post-exploitation stage. It caught the attention of more security researchers in the following days. Pawel Wieczorkiewicz, a member of the PaX/GRsecurity team, defeated Tetragon’s defense after two hours of research by adding a couple of lines of source code to a public exploit for CVE-2021-22555. The PaX/GRsecurity team revealed the technical details and explained why this type of mitigation is bound to fail.
Why Tetragon (and other security products) are doomed
Mitigations against exploit methods usually target one of three phases: pre-exploitation, exploitation, and post-exploitation. Mitigations for the earlier stages (e.g. pre-exploitation) are harder to design and implement, but they are also harder for attackers to bypass, e.g. PaX/GRsecurity’s KERNSEAL/AUTOSLAB. Mitigations for the later stages (more precisely, detection mechanisms) are easier to develop, but they are also easier for attackers to bypass. Here is the mitigation assessment overview from our previous write-up:
Mitigation | Pre-exploitation | Exploitation | Post-exploitation | Bypassable |
---|---|---|---|---|
PaX RAP | N | Y | N | L4: Hardcore |
PaX KERNSEAL/AUTOSLAB | Y | N | N | L5: Nightmare |
VED wCFI | N | Y | N | L3: Hurt Me Plenty |
Metadata integrity | N | N | Y, AKO/LKRG/VED | L2: Bring It On |
VED self-protection | N | Y | N | L3: Hurt Me Plenty |
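The assessment above can also be read as data. The following is only a minimal sketch in Python: it assumes that the number in the Bypassable column (the N in "LN") orders mitigations from easiest to hardest to bypass, and the names `Mitigation`, `TABLE` and `hardest_first` are ours, chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mitigation:
    name: str
    phase: str          # the phase marked "Y" in the table
    bypass_level: int   # the N in "LN: ..." from the Bypassable column


# Rows copied from the table above.
TABLE: List[Mitigation] = [
    Mitigation("PaX RAP", "exploitation", 4),
    Mitigation("PaX KERNSEAL/AUTOSLAB", "pre-exploitation", 5),
    Mitigation("VED wCFI", "exploitation", 3),
    Mitigation("Metadata integrity", "post-exploitation", 2),
    Mitigation("VED self-protection", "exploitation", 3),
]


def hardest_first(rows: List[Mitigation]) -> List[Mitigation]:
    """Sort by bypass level, hardest to bypass first (ties keep table order)."""
    return sorted(rows, key=lambda m: m.bypass_level, reverse=True)


ranked = hardest_first(TABLE)
for m in ranked:
    print(f"L{m.bypass_level}  {m.phase:<17}  {m.name}")
```

Under that assumption, the only pre-exploitation entry sorts to the top and the only post-exploitation entry to the bottom, which matches the trade-off described before the table.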
The self-protection of typical security products (like EDR/XDR) usually only reaches “L1: I can win!”. Tetragon’s defense capabilities are at almost the same level, but its threat model is badly wrong, because Tetragon needs to protect against attacks on the Linux kernel rather than on the application layer. The PaX/GRsecurity team’s public bypass method for kprobes can also be applied to any hook-based implementation, which means most LKM-based security products (and rootkits) are affected by it. Fortunately, VED’s self-protection is immune to this attack method. A security product must have some level of self-protection against advanced threats, otherwise it will probably fall into the repeating cycle of the “who’s watching the watcher” problem.
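To make the “who’s watching the watcher” point concrete, here is a toy model in Python. It is not kernel code and it is not the PaX/GRsecurity bypass, whose details are in their own write-up. It only assumes a monitor whose hook table lives in memory the attacker can already write to, and every name in it (`Monitor`, `deny_escape`, the `namespace_change` event) is hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical hook table: event name -> policy callback.
HookTable = Dict[str, Callable[[str], bool]]


def deny_escape(detail: str) -> bool:
    """Policy callback: returning False blocks the event."""
    return False


class Monitor:
    def __init__(self) -> None:
        self.hooks: HookTable = {"namespace_change": deny_escape}
        # Snapshot of the hooks the monitor expects to have installed.
        self._expected = frozenset(self.hooks)

    def allow(self, event: str, detail: str) -> bool:
        """An event with no hook installed is allowed by default."""
        hook = self.hooks.get(event)
        return True if hook is None else hook(detail)

    def missing_hooks(self) -> List[str]:
        """Naive self-check: which expected hooks are gone?"""
        return sorted(self._expected - set(self.hooks))


monitor = Monitor()
blocked_before = not monitor.allow("namespace_change", "setns to host")

# An attacker with write access to the monitor's memory does not evade
# the rule; they simply remove the hook that enforces it.
monitor.hooks.pop("namespace_change")
allowed_after = monitor.allow("namespace_change", "setns to host")

print(blocked_before, allowed_after, monitor.missing_hooks())
```

In this toy the self-check still reports the missing hook, but only because the attacker left `_expected` alone. It lives in the same writable memory as the hooks, so it could be emptied the same way, which is the watcher problem in miniature.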
Why VED is immune to this method
In our previous write-up we mentioned how important VSPP (Vault self-protection project) can be when multiple players come into play:
In the realm of advanced threat protection, this type of mitigation is just the tip of the iceberg.
Issues in eBPF/LKM-based security implementations
- Any security mitigation introduced into the Linux kernel may have side effects such as locking issues. In more severe cases these lead to a race condition, which is likely to turn into an exploitable bug.
- Typical regression testing cannot reproduce some bugs, e.g. special signal ops that needed to be re-implemented.
- Performance impact: the security mechanism must take the target scenario into account from the very beginning of the design.
- Self-protection
- Threat model: a wrong threat model is worse than no threat model. It may cause a butterfly effect across the whole process of “design -> implementation -> testing/QA -> delivery -> integration -> deployment -> production” and end up eaten by complexity. If your adversaries are skiddies, a typical GNU/Linux deployment following CIS/STIG best practices should be good enough to deal with them. If your adversary is the hidden player (we’ll pray for you and hope you’ll never have to face them), well, from a Linux kernel security perspective, even our VED (Vault Exploit Defense) might not be a good option. You have only one option: PaX/GRsecurity.
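The first item in the list above, a locking issue that becomes a race condition, usually comes down to a check and an update that are not done atomically. The sketch below is a userspace analogy in Python, not kernel code. `Quota` and `run` are hypothetical names, and the example shows the safe form, where the check and the decrement share one lock.

```python
import threading
from typing import List


class Quota:
    """Hypothetical permission counter shared between threads."""

    def __init__(self, remaining: int) -> None:
        self.remaining = remaining
        self._lock = threading.Lock()

    def try_take(self) -> bool:
        # The check and the decrement happen under one lock, so no other
        # thread can run between "remaining > 0" and "remaining -= 1".
        with self._lock:
            if self.remaining > 0:
                self.remaining -= 1
                return True
            return False


def run(workers: int, quota: Quota) -> int:
    """Start `workers` threads that each try to take one unit."""
    granted: List[int] = []

    def worker() -> None:
        if quota.try_take():
            granted.append(1)  # list.append is thread-safe in CPython

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(granted)


granted = run(workers=50, quota=Quota(remaining=5))
print(granted)
```

If the lock were dropped, or taken separately around the check and around the decrement, two threads could both pass the check before either decrements, and more units would be granted than exist. That is the kind of side effect a mitigation can introduce without anyone noticing.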
Conclusion
Linux kernel security has become an unavoidable problem as cloud native grows in popularity. A successful container escape compromises the host, which means your whole cluster could be in danger. You don’t want a security product that cost millions of dollars and years of development to be defeated by two hours of “research” from a guy you never met (well, you should thank the guy who found the problem) and get pwned, do you? If the answer is “no”, then you should consider building a “proper” threat model from the beginning.
Btw, from Vault Labs’s perspective, eBPF is better suited to monitoring than to prevention. You should let the SIEM/SOC team do the rest. The KISS principle works for infosec management as well.