To trust a computer application, you must also be able to trust all the hidden infrastructure supporting it. Some of this is obvious: if your operating system contains a rootkit, application-level security becomes almost useless. To fix this we try to ensure that all privileged code is under our control, which, if we want to prevent offline attacks, involves signing the binaries and verifying those signatures at load time.

As with all security, the weakest link determines the strength of such a system: as soon as one stage of the boot process fails to verify the next, an attacker can install malicious code without the later stages knowing. This is why code signing applications while running an unsigned operating system offers very little protection against sophisticated attacks, and similarly why code signing the operating system and boot loader does not work without signed firmware.

Intel’s solution to this problem is Boot Guard, where the chain of trust starts in the CPU microcode and Management Engine firmware. There have been various attacks on this technology (Ermolov, 2016; “Bypassing Intel Boot Guard,” 2015), most of which target configuration mistakes by vendors or vendor code that runs after verification is done.

The first element of Boot Guard that executes as normal code on the host CPU is the Authenticated Code Module (ACM), an Intel binary blob. It gets the security policy and keys from the Management Engine and verifies the vendor firmware, after which it sets up a safe environment for the vendor Initial Boot Block (IBB) to execute in.

At this point in the boot process no DRAM is available yet, so the ACM configures the CPU cache not to evict any lines, which allows the firmware to use the cache as RAM. This cache-as-RAM (CAR) is used to store state for the ACM and to provide a secure copy of the IBB to execute.

Usually any form of verification will look somewhat like this:

	memcpy( safe_buffer, data, data_size );
	if ( !verify_signature( safe_buffer, data_size ) )
		goto error;
	use_data( safe_buffer, data_size );

in order to prevent time-of-check/time-of-use (TOCTOU) attacks. The ACM code does no such thing: it simply verifies the data in place and proceeds to run it! However, things are not what they seem: the cache is in no-evict mode, so the verification itself implicitly copied the data into the cache. When the code is executed, the fetches hit the cache and the TOCTOU risk is averted.

This clever strategy does have some downsides: the implicit protection is easily overlooked and, to make matters worse, it is not fail-safe. Any issue that forces the code out of the cache or makes fetches bypass it will silently open up the system to a TOCTOU attack on the data source.
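The difference between the two patterns can be illustrated with a small simulation. This is only a sketch under stated assumptions: `flaky_source` is a hypothetical stand-in for a flash chip that an attacker can swap between reads, and `verify` is a toy comparison rather than a real signature check. None of these names come from the ACM or from any firmware code base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical data source: serves one image on the first pass and a
 * different one on every later pass, like a ROM that gets swapped out. */
typedef struct {
    const uint8_t *first;   /* contents served on the first read */
    const uint8_t *later;   /* contents served on all later reads */
    size_t size;            /* image size in bytes (at most 16 here) */
    unsigned reads;         /* number of reads served so far */
} flaky_source;

void source_read(flaky_source *src, uint8_t *out)
{
    memcpy(out, src->reads == 0 ? src->first : src->later, src->size);
    src->reads++;
}

/* Toy stand-in for a signature check: accepts only the expected image. */
int verify(const uint8_t *buf, size_t size, const uint8_t *expected)
{
    return memcmp(buf, expected, size) == 0;
}

/* Verify in place, then fetch again to "execute": two reads of the source. */
int run_in_place(flaky_source *src, const uint8_t *expected, uint8_t *executed)
{
    uint8_t tmp[16];
    assert(src->size <= sizeof tmp);
    source_read(src, tmp);                  /* time of check */
    if (!verify(tmp, src->size, expected))
        return -1;
    source_read(src, executed);             /* time of use: fetched again */
    return 0;
}

/* Copy once, verify the copy, use the copy: one read of the source. */
int run_from_copy(flaky_source *src, const uint8_t *expected, uint8_t *executed)
{
    uint8_t safe[16];
    assert(src->size <= sizeof safe);
    source_read(src, safe);
    if (!verify(safe, src->size, expected))
        return -1;
    memcpy(executed, safe, src->size);
    return 0;
}
```

With a source that changes after the first read, `run_in_place` reports success while handing back the swapped image, whereas `run_from_copy` hands back exactly the bytes that were verified. The no-evict cache gives the in-place variant the behaviour of the copying variant, but only for as long as every fetch actually hits the cache.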

In the case of Boot Guard that data source is usually a flash ROM on the motherboard, attached via the relatively simple SPI bus. This bus is easy to monitor because it has a low clock speed (below 100 MHz) and a low pin count. Capturing this bus allows watching ROM fetches (and thus cache misses) from outside the system.
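Once a capture has been decoded into a list of read commands, a fetch of data that was already read earlier stands out as a read overlapping a previous one. The following is a minimal sketch of such a scan. The `spi_read` record layout and the function name are assumptions made for illustration, not the format of any particular logic analyzer or of the tooling used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded capture record: one SPI read command. */
typedef struct {
    uint32_t addr;  /* flash address taken from the read command */
    uint32_t len;   /* number of bytes clocked out */
} spi_read;

/* Returns the index of the first read that overlaps any earlier read,
 * or -1 if every byte in the trace was fetched at most once.
 * Quadratic in the trace length, which is fine for a sketch. */
long first_repeated_read(const spi_read *trace, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        uint32_t a0 = trace[i].addr;
        uint32_t a1 = a0 + trace[i].len;      /* half-open end */
        for (size_t j = 0; j < i; j++) {
            uint32_t b0 = trace[j].addr;
            uint32_t b1 = b0 + trace[j].len;  /* half-open end */
            if (a0 < b1 && b0 < a1)
                return (long)i;
        }
    }
    return -1;
}
```

A repeated read is only a candidate: it shows that the data was fetched from ROM more than once, and it still has to be checked whether the later fetch happens after verification.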

To find a vulnerable region of data it suffices to look for an address being read multiple times. One such address was 0xffcc40:

[Figure: the SPI capture]

This address turns out to be part of the SecCore EFI module, which is responsible for early initialization and security during EFI bringup. The code at those addresses turned out to be:

	# SecCore::PeiTemporaryRamDone
	FFFFCC42    mov ecx, IA32_MTRR_DEF_TYPE
	FFFFCC47    rdmsr
	FFFFCC49    and eax, ~IA32_MTRR_ENABLE
	FFFFCC5A    mov ecx, IA32_MTRR_DEF_TYPE
	FFFFCC5F    wrmsr

This shows precisely the risk of relying on implicit protection: it is easy to forget it is being used and to accidentally break it. In this case the system has initialized DRAM and starts to tear down CAR, which among other things involves disabling the caches. With the caches disabled, the code now executes in place from memory-mapped ROM.
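The effect of the listing on the register value can be written out in C. This is a sketch, not firmware code: the MSR index and the position of the enable flag are taken from the Intel architecture manuals, and the function name is hypothetical. Actually reading or writing the MSR requires privileged `rdmsr`/`wrmsr` instructions, so only the bit manipulation is shown.

```c
#include <stdint.h>

#define IA32_MTRR_DEF_TYPE  0x2FFu       /* MSR index loaded into ECX */
#define IA32_MTRR_ENABLE    (1u << 11)   /* E flag: MTRRs enabled */

/* Given the low half of the MSR as returned in EAX by rdmsr, compute the
 * value the listing writes back: the same bits with the E flag cleared. */
uint32_t mtrr_def_type_after_teardown(uint32_t eax)
{
    return eax & ~IA32_MTRR_ENABLE;
}
```

With the E flag cleared, the processor treats all of physical memory as uncacheable, so every following instruction fetch goes out to the memory-mapped ROM instead of being served from the verified cache contents.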

A simple attack against this flaw is to have a circuit monitor the SPI bus for the end of the verification and then switch the bus from the real ROM to a second one that contains malicious code. This is easy to do because SPI by design shares all but one of its signals between multiple targets on the bus; only the chip select (CS) signal is not shared, and it selects which device is being addressed. The proof of concept attack used an FPGA to intercept that signal and route it to a second chip when needed.

The trigger for the override was an address range chosen from the SPI capture, which was seen to be accessed only after the code had been verified.

The proof of concept setup is shown below, with an inset showing the modifications to the Lenovo T460 motherboard.

[Figure: POC]

This allowed code execution, but not yet booting the system. In order to boot the system and not alert any security mechanism in the vendor firmware, the device would have to hide itself again. This was implemented by having a second range of addresses that would deactivate the device. As execution is under attacker control, this address can be one that is otherwise never read, to prevent accidentally disabling the device.
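The arm and disarm behaviour described above amounts to a small state machine driven by the read addresses seen on the bus. The sketch below models it in C for illustration only. The actual proof of concept was implemented in an FPGA, and the type names, the three-state layout and the choice to stay hidden permanently after disarming are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
typedef struct { uint32_t start, end; } addr_range;

typedef enum { WAITING, OVERRIDING, HIDDEN } cs_state;

typedef struct {
    addr_range arm;     /* a read here routes CS to the second chip */
    addr_range disarm;  /* a read here routes CS back to the real ROM */
    cs_state state;
} cs_switch;

static bool in_range(addr_range r, uint32_t addr)
{
    return addr >= r.start && addr < r.end;
}

/* Feed one observed read address. Returns true if this read should be
 * answered by the second chip, false if the real ROM should answer. */
bool cs_switch_on_read(cs_switch *sw, uint32_t addr)
{
    if (sw->state == WAITING && in_range(sw->arm, addr))
        sw->state = OVERRIDING;
    else if (sw->state == OVERRIDING && in_range(sw->disarm, addr))
        sw->state = HIDDEN;   /* assumption: never re-arm after hiding */
    return sw->state == OVERRIDING;
}
```

In this model the read that hits the arm range is already served by the second chip, and the read that hits the disarm range is served by the real ROM again. Whether real hardware can switch within the same transaction depends on how quickly it decodes the address, which the sketch does not model.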

By doing this, the system can be made to boot as normal and will not show any signs of tampering: The TPM measurement registers are not affected and the system is thus entirely unaware of the attack.

This specific vulnerability yields control before most of the sensitive MSRs are locked and thus allows installing a persistent backdoor before transferring control back to the vendor firmware.

The POC may seem awfully impractical, but a much simpler approach is possible: instead of intercepting the CS signal, one can simply override it with a stronger output. A series resistor on the motherboard protects the chipset from being damaged by this. This makes it possible to simply clip on the exploit device.

Another improvement is to not switch to a second ROM but instead serve the data directly from the FPGA.

These improvements were suggested and implemented by Trammell Hudson, who independently discovered the TOCTOU while working on his SPI flash emulator (Hudson, 2019). He helped me turn my POC into a much more realistic attack and report the issue to Intel Corp.

Intel Corporation recognized that even though the vulnerable code was in the UEFI reference implementation and not the ACM, it still impacted all Boot Guard implementations and thus considered the report to be in scope.

They not only fixed this specific issue, but also addressed the general class of ROM TOCTOU vulnerabilities by:

requiring EFI firmware to migrate all code and data to RAM
enabling paging after DRAM init and marking the IBB flash area not present
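The second mitigation can be pictured as clearing the present bit on the page-table entries that cover the IBB flash area, so that any later access to it faults instead of silently fetching from ROM. The following is a rough sketch and not the EDK2 code: it assumes a single flat table of 4 KiB entries, and the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE    4096u
#define PTE_PRESENT  1ull   /* bit 0 of an x86 page-table entry */

/* ptes[i] is assumed to map the 4 KiB page at table_base + i * PAGE_SIZE.
 * Clears the present bit on every entry whose page overlaps [start, end). */
void mark_not_present(uint64_t *ptes, size_t n, uint64_t table_base,
                      uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t page = table_base + (uint64_t)i * PAGE_SIZE;
        if (page + PAGE_SIZE > start && page < end)
            ptes[i] &= ~PTE_PRESENT;
    }
}
```

Real firmware also has to flush stale translations and walk a multi-level page table, both of which this sketch leaves out.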

The mitigation code can be found at the EDK2 staging repository.

Trammell Hudson and I presented this work at Hack in the Box Amsterdam. Our slides are available at the conference site and the talk itself is on YouTube.

I would like to thank

Trammell Hudson for guiding me through the vulnerability disclosure
RevSpace for providing me with a workshop and tools to develop the POC and test my findings
Intel Corp for recognizing the vulnerability, and going to great lengths to fix it while also allowing me and Trammell to present our work at HITB during the disclosure process.

References

Bypassing Intel Boot Guard. (2015). Embedi. Retrieved from https://embedi.org/blog/bypassing-intel-boot-guard/
Ermolov, A. (2016). Safeguarding rootkits: Intel BootGuard. ZeroNights. Retrieved from https://2016.zeronights.ru/wp-content/uploads/2017/03/Intel-BootGuard.pdf
Hudson, T. (2019). Spispy. Retrieved from https://trmm.net/Spispy