Posts

Showing posts from 2018

Processor micro-architecture internals (branch prediction, branch predictor and indirect branch)

Introduction

Recently, our security research has used the Performance Monitoring Unit (PMU) as a technique for monitoring function calls and control-flow integrity. While working with the following perf events, we ran into an interesting problem.

Figure[1]

One of these events is almost always counted by the performance counter:

BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL

The interesting question is why such an instruction is almost always mispredicted. There are a couple of things we need to clarify and dig into.

Indirect Branch

jmp rax ; indirect jump
call rax ; indirect call

Branch Target Buffer

The BTB is a table inside the processor that speeds up branch decisions (taken / not taken). It is indexed by the current RIP (instruction pointer), and the stored value is the branch target address. The BTB's structure is shown in the following figure.

Figure[2]

Branch Predictor

The branch predictor leverages the BTB and performs as
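The BTB lookup described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the table size, the direct-mapped indexing by `rip % BTB_ENTRIES`, and the names `btb_predict`, `btb_update`, and `count_mispredicts` are all hypothetical, and real predictors use more context than the branch address alone. It shows one way a "last target" table behaves when the same indirect call keeps changing its destination.

```c
#include <stddef.h>
#include <stdint.h>

#define BTB_ENTRIES 16  /* hypothetical size, chosen only for illustration */

struct btb_entry {
    uint64_t tag;     /* full branch address, used to detect aliasing */
    uint64_t target;  /* last observed branch target */
    int valid;        /* non-zero once the entry has been filled */
};

static struct btb_entry btb[BTB_ENTRIES];

/* Clear every entry so a new experiment starts from a cold table. */
static void btb_reset(void)
{
    for (size_t i = 0; i < BTB_ENTRIES; i++) {
        btb[i].tag = 0;
        btb[i].target = 0;
        btb[i].valid = 0;
    }
}

/* Look up the predicted target for the branch at rip.
 * Returns 1 and writes *target on a hit, 0 when there is no entry. */
static int btb_predict(uint64_t rip, uint64_t *target)
{
    const struct btb_entry *e = &btb[rip % BTB_ENTRIES];
    if (e->valid && e->tag == rip) {
        *target = e->target;
        return 1;
    }
    return 0;
}

/* Record the target the branch at rip actually went to. */
static void btb_update(uint64_t rip, uint64_t target)
{
    struct btb_entry *e = &btb[rip % BTB_ENTRIES];
    e->tag = rip;
    e->target = target;
    e->valid = 1;
}

/* Replay a sequence of actual targets for one indirect branch and
 * count how many times the toy BTB guessed wrong or had no guess. */
static unsigned count_mispredicts(uint64_t rip, const uint64_t *targets, size_t n)
{
    unsigned misses = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t guess;
        if (!btb_predict(rip, &guess) || guess != targets[i])
            misses++;
        btb_update(rip, targets[i]);
    }
    return misses;
}
```

In this model, a `call rax` that always goes to the same address misses only on its first execution, while one whose target alternates between two addresses misses every time. Whether that matches what the counter above is reporting on real hardware is a separate question that the toy does not answer.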

Dig into IRQL

1. Introduction
-------------------------------------------------------------------------------------------------------------------
Interrupt Request Level (IRQL) is a software concept provided by Windows that manages interrupts and hides their low-level complexity. However, for a kernel enthusiast or security researcher, it is necessary to understand what it hides and how it works. This article aims to provide the simplest possible explanation of IRQL.

2. Exception vs. Trap vs. Interrupt from the processor's perspective
-------------------------------------------------------------------------------------------------------------------
There are many differences between these three, but they share one characteristic: they are all delivered through the Interrupt Descriptor Table (IDT). Exceptions and traps are officially documented in the Intel SDM, and are fixed in the IDT at interrupt vectors 0 to 20 (including trap,

How does Nested Virtualization work?

What is Nested Virtualization?
-----------------------------------------------------------------------------------------------------
Software security is becoming an increasingly important criterion in the industry, and in recent years virtualization has become a popular technique for both protecting and attacking software. However, most virtualization frameworks (bluepill-like) do not let a guest virtualize one more layer. We call that ability "Nested Virtualization", or level 2.

Basic Virtual Machine Monitor Architecture
------------------------------------------------------------------------------------------------------
Figure[1]

The host VMM traps any type of event it wants to monitor, such as interrupts, exceptions, and privileged register accesses. One of these events is the execution of a VMX instruction: once the VMM is loaded, it can always monitor any of the VMX instructions, which gives us a good opportunity, as shown in the following chart:

Figure[2]

VMM Life Cycle

About Spectre

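Introduction

Modern operating system (OS) design is generally split into an application layer (Ring 3) and a kernel layer (Ring 0). The application layer is where ordinary programs run, while the kernel layer is the OS's own code. After the Intel CPU Spectre vulnerability was disclosed, a large number of non-technical articles appeared, but the Google Project Zero write-up is fairly hard to follow, so here I give a simpler explanation and set of definitions.

Theoretical Background

Out-of-Order Execution (OoOE): assembly-language instructions are not executed in their program order. This is possible because the execution units inside the processor can work asynchronously and do not need to execute instructions in lockstep the way older CPUs such as the 8086 did. Common examples of such units are the cache load/store unit and the ALU.

Indirect Branch Prediction (branch prediction): when the CPU encounters a branch during execution, it first predicts the outcome. The familiar if statement is one kind of branch, and its corresponding assembly instructions generally do not wait for the comparison to finish before continuing. If the CPU finds that the comparison will take a long time, it executes the body of the if ahead of time. That work does not necessarily affect the final result, but the cache lines it touched (L1/L2 DCache) are not rolled back.
See: The Intel Optimization Reference Manual, section 2.3.2.3 ("Branch Prediction"):

The Spectre vulnerability arises from the two processor optimizations above. Consider the following code. During execution, if arr1->length is not in the cache, the CPU does not wait for the condition to be resolved before executing the body of if {..}. If the condition turns out to be false, the effects on registers are rolled back, but the arr1->data that was accessed stays in the L1 DCache.

struct array {
  unsigned long length;
  unsigned char data[];
};
struct array *arr1 = ...;
unsigned long untrusted_offset_from_caller = ...;
if (untrusted_offset_from_caller < arr1->length) {
  unsi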