Avoiding Single-Point-of-Failure and securing the Root Infrastructure: TCG TPM 2.0

March 7, 2023 | 7 min Read

What is TCG TPM 2.0?

Trusted Computing Group (TCG) Trusted Platform Module (TPM) 2.0 is a hardware-based security solution that provides a secure foundation for computing devices. A TPM 2.0 is a chip installed on or embedded in the motherboard of a device, such as a laptop, desktop, workstation or server, and is designed to help establish that the device is running in a trusted state. It supports a range of security services, including measured boot, device authentication, key management, and cryptographic operations. TPM 2.0 is designed to be interoperable across different platforms and operating systems, and is supported by a wide range of hardware and software vendors. Used properly, TPM 2.0 helps protect devices against a range of threats, including malware, phishing attacks, and unauthorized access.

The vulnerabilities in the TCG TPM 2.0 reference implementation

The recent full disclosure of two vulnerabilities, CVE-2023-1017 and CVE-2023-1018, has had a significant impact on the security of systems that deploy TPM devices widely. Their impact is broad because both flaws are in the reference implementation, which other implementations build on. (Update: in March 2023 Quarkslab released a technical analysis, "Vulnerabilities in the TPM 2.0 reference implementation code".)

The TCG standard has already been fixed, and some OEM vendors have released fixes for their implementations. In the TPM2 reference code, the boundary checks performed when processing parameter encryption were incorrect, allowing the host side to read and write out of bounds, two bytes past the end of the command buffer. The incorrect boundary check exists in both the host-side protocol stack and the TPM2 implementation (including the firmware of TPM2 chips and software implementations such as swtpm). Updating only the host-side software is therefore not enough to eliminate the vulnerability. The TPM2 implementation must also be updated; otherwise, an attacker can still use vulnerable host-side software to talk to the vulnerable TPM2 and trigger the bug.

This type of out-of-bounds access allows an attacker to read two bytes past the end of the command buffer through carefully constructed TPM2 commands. These bytes may contain sensitive data left over from previous calculations (especially since internal resources in a hardware TPM2 are very limited, and memory region permission control is usually not implemented). An attacker can also overwrite those two bytes to affect the internal calculations of the TPM2. However, using this vulnerability alone, an attacker cannot easily choose or predict where the command buffer is located in the address space of the TPM2 implementation.
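The flawed pattern can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the reference implementation's code: MEMORY stands in for the TPM's address space, COMMAND and LEFTOVER are made-up bytes, and both function names are hypothetical. The unchecked reader parses a 2-byte size field without first confirming that two bytes remain inside the command, so it picks up whatever happens to sit right after it.

```python
import struct

# Hypothetical 6-byte command followed by unrelated leftover data in memory.
COMMAND = bytes.fromhex("80020000000a")
LEFTOVER = b"\xde\xad"
MEMORY = COMMAND + LEFTOVER


def read_size_unchecked(memory: bytes, param_offset: int) -> int:
    """Flawed pattern: assumes two bytes of the command remain at param_offset."""
    return struct.unpack_from(">H", memory, param_offset)[0]


def read_size_checked(memory: bytes, param_offset: int, command_len: int) -> int:
    """Fixed pattern: refuse to parse unless the size field lies inside the command."""
    if param_offset < 0 or command_len - param_offset < 2:
        raise ValueError("parameter area too short for a 2-byte size field")
    return struct.unpack_from(">H", memory, param_offset)[0]
```

With an empty parameter area (param_offset equal to len(COMMAND)), the unchecked reader returns the leftover bytes, while the checked reader raises ValueError.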

The design principles of TPM2 are as follows. TPM2 can generate keys and seal data, but by default it does not store the generated keys and sealed data internally. Instead, they are held in a temporary storage area and referenced through the corresponding handle files. Data objects inside TPM2 are organized into hierarchies: a lower-level object is encrypted using the seed of its upper-level object and a policy (such as PCR status), and it can only be used when the upper-level object is usable and the current state is consistent with the policy defined at its creation. At the top of each hierarchy is the primary key, which is restricted only by its associated policy.

Any valid handle file can transfer the object it references to the persistent storage area inside TPM2. Handle files of unpersisted objects will become invalid after a restart.
The primary key is special: as long as the seed stored inside TPM2 at its reset and the parameters (including policies) supplied when generating the primary key do not change, the generated primary key will be exactly the same. The primary key can therefore be recreated each time it is needed without being persisted.
Lower-level keys are generated with randomness, but their private parts (encrypted by the upper-level object and the policy) and public parts can be saved to the file system, loaded into TPM2 when needed, and referenced with handle files.
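The difference between reproducible primary keys and randomized lower-level keys can be illustrated with a toy model. The HMAC-SHA256 construction below is an assumption chosen for brevity; the actual derivation is defined by the TCG specification and is not reproduced here. The names derive_primary and create_child_key are hypothetical.

```python
import hashlib
import hmac
import os


def derive_primary(seed: bytes, template: bytes) -> bytes:
    """Toy derivation: the same seed and template always give the same key."""
    return hmac.new(seed, template, hashlib.sha256).digest()


def create_child_key() -> bytes:
    """Toy lower-level key: fresh randomness on every call, so it must be saved."""
    return os.urandom(32)
```

Because derive_primary depends only on its inputs, the result never needs to be persisted, whereas the output of create_child_key is lost unless its (wrapped) form is written to the file system.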

As for sealing, TPM1 uses an SRK to encrypt all sealed data, and the set of PCR values that allows unsealing can differ for each sealed item. The SRK is regenerated and stored internally when TPM1 ownership is taken. TPM2 treats sealed data as a special kind of lower-level key, which can be encrypted by a policy and by any upper-level object usable for encryption. To unseal, it must be loaded and referenced through a handle file, just like a lower-level key (and it can also be persisted). When the same upper-level object exists and the policy is satisfied, the data can be decrypted.
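A PCR-based policy can be pictured as follows. Each PCR is updated only by the extend operation (new value = hash of the old value concatenated with a digest), so its final value depends on every measurement and on their order. The sketch below is a host-side model using SHA-256 and an all-zero initial value as illustrative assumptions; in a real TPM2 the comparison happens inside the chip as part of a policy session, and the function names here are hypothetical.

```python
import hashlib
import hmac


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Extend: new PCR = SHA-256(old PCR || SHA-256(measurement))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()


def replay(measurements) -> bytes:
    """Fold a sequence of measurements into a final PCR value."""
    pcr = bytes(32)  # illustrative initial value of all zeros
    for m in measurements:
        pcr = pcr_extend(pcr, m)
    return pcr


def unseal_allowed(expected_pcr: bytes, current_pcr: bytes) -> bool:
    """Toy policy gate: release the data only if the PCR matches the sealed-to value."""
    return hmac.compare_digest(expected_pcr, current_pcr)
```

If any measured component changes, for example a modified bootloader, the replayed value differs and the gate stays closed.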

In theory, attacks such as code injection and return-oriented programming (ROP) could obtain sensitive data from objects loaded in the TPM2, although carrying out such attacks inside the TPM2 is difficult. Even so, if an attacker has physical access to the host and the TPM2 device, any object that the attacker can successfully load and use should be considered at risk of data leakage. The risk to objects that the attacker cannot successfully use depends on whether the TPM2 implementation ensures that, even if higher-level objects are leaked, lower-level objects cannot be decrypted when the policy is not satisfied, i.e., whether policies are deeply involved in the encryption of lower-level objects or merely serve as authentication. This type of attack is, however, ineffective against objects that do not exist in the TPM2, i.e., objects that are neither persisted nor loaded into the TPM2.

First in-the-wild UEFI bootkit bypassing UEFI Secure Boot

ESET uncovered a bootkit named “BlackLotus” that has been widely used. It is not the first bootkit to be detected, but it is the first to leverage a vulnerability (PoC/CVE-2022-21894) to bypass UEFI Secure Boot and to be exploited on a large scale.

UEFI Secure Boot is a firmware feature designed to prevent unauthorized software from loading before the operating system. By exploiting this vulnerability, the bootkit can bypass that protection and infect the system at a very early stage, making it extremely difficult to detect and remove. What makes this attack particularly concerning is the potential to combine it with other vulnerabilities, such as the TPM2 flaws discussed above. If a RaaS (Ransomware as a Service) framework were combined with a vulnerability like the one in TPM2, the consequences could be disastrous for both individuals and enterprises. Attackers could gain complete control over the system, steal sensitive data, or launch attacks that cripple the system or the entire network. This makes it all the more important for organizations and individuals to adopt a defense-in-depth strategy.

Mitigation

To mitigate the risks posed by such vulnerabilities:

Limit access to the TPM, ensuring that only specific GIDs/UIDs and root can access it.
Deploy OS runtime protection, such as VED (Vault Exploit Defense), PaX/grsecurity, etc.
Confine applications with mandatory access control (AppArmor) and syscall filtering (seccomp).
Use the TPM only for measurement, remote attestation, and unsealing other keys stored in the file system.
Audit the process control logic in the TPM, and give priority to TPMs whose policies are deeply involved in the encryption of lower-level objects rather than merely serving as authentication.
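The first item can be checked with a short script. The sketch below assumes a Linux host where the TPM is exposed as /dev/tpm0 and /dev/tpmrm0; the function names and the allow-lists are hypothetical, and the right set of UIDs and GIDs depends on the distribution and on which services legitimately use the TPM.

```python
import os
import stat


def node_findings(mode, uid, gid,
                  allowed_uids=frozenset({0}), allowed_gids=frozenset({0})):
    """Return a list of problems with a device node's ownership and mode bits."""
    findings = []
    if mode & stat.S_IRWXO:
        findings.append("world-accessible")
    if uid not in allowed_uids:
        findings.append("unexpected owner uid %d" % uid)
    if mode & stat.S_IRWXG and gid not in allowed_gids:
        findings.append("unexpected group gid %d" % gid)
    return findings


def audit_tpm_nodes(paths=("/dev/tpm0", "/dev/tpmrm0"), **allowed):
    """Check each node that exists; missing nodes are skipped."""
    report = {}
    for path in paths:
        try:
            st = os.stat(path)
        except FileNotFoundError:
            continue
        report[path] = node_findings(st.st_mode, st.st_uid, st.st_gid, **allowed)
    return report
```

Calling audit_tpm_nodes(allowed_gids={0, tss_gid}), where tss_gid is a placeholder for the GID of the group permitted to use the TPM on the host, returns an empty list for each node that is correctly restricted.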

The key takeaway for building a proper threat model is that the most dangerous keys are those that already exist in the TPM and are usable (ready to be operated on or unsealed), while the safest are those that have not been loaded into the TPM. It is also recommended that decentralized nodes that do not rely on the physical protections of Tier 3 or Tier 4 data centers consider combining OS runtime protection with attestation via TCG TPM 2.0. For more information on the evil maid scenario and how it can be addressed, please feel free to contact us.