DOWN TO THE BARE METAL: USING PROCESSOR FEATURES FOR BINARY ANALYSIS
Carsten Willems (1), Ralf Hund (1), Andreas Fobian (1), Thorsten Holz (1), Amit Vasudevan (2)
(1) Ruhr-University Bochum, Germany; (2) Carnegie Mellon University
Annual Computer Security Applications Conference (ACSAC) 2012
左昌國, 2013/02/25, Seminar @ ADLab, NCU-CSIE
• Introduction
• Software Emulators
• Delusion Attacks
• Binary Analysis with Branch Tracing
• Experiments
• Limitations
• Conclusion
Outline
• Binary (malware or vulnerable software) analysis
  • Static
  • Dynamic
    • Number of execution paths
    • (For behavior analysis) every instruction or critical point
    • Native machine or emulation/virtualization
Introduction
• Native machine
  • The analysis result must be unaffected by malicious code
  • Reverting to clean states
  • Lack of monitoring abilities
• Emulator
  • Artificial environment detection
  • Delusion attacks
    • No explicit test
Introduction
• Contributions:
  • Introducing several delusion attacks
  • An approach to perform behavior analysis
    • Branch tracing feature of x86 CPU
  • Implementing a prototype that shows the usefulness of this approach
Introduction
• BOCHS
• QEMU
  • Dynamic translation
    • Guest code block (up to a branch) → intermediate code → optimization → translated to a host instruction code block (Translation Block, TB) → TBs saved in the code cache
  • Isolated memory
• BitBlaze and Anubis
  • Taint propagation tracking
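The dynamic translation scheme listed for QEMU can be pictured with a toy translation cache. This is only a sketch under stated assumptions: the guest program is a list of mnemonic strings, "translation" merely wraps each string, and the names `TranslationCache`, `BRANCH_OPS`, and `lookup` are hypothetical and are not QEMU's API.

```python
# Toy model of block-wise dynamic translation with a code cache.
# All names here are illustrative; nothing below is taken from QEMU itself.
BRANCH_OPS = {"jmp", "jz", "jnz", "call", "ret"}


class TranslationCache:
    def __init__(self, guest_code):
        self.guest_code = guest_code  # list of mnemonics, indexed by guest pc
        self.cache = {}               # guest pc -> translated block (TB)
        self.translations = 0         # how often real translation work was done

    def lookup(self, pc):
        """Return the TB starting at pc, translating it only on a cache miss."""
        tb = self.cache.get(pc)
        if tb is None:
            tb = self._translate(pc)
            self.cache[pc] = tb
        return tb

    def _translate(self, pc):
        """Translate guest instructions from pc up to and including a branch."""
        self.translations += 1
        block = []
        for insn in self.guest_code[pc:]:
            block.append(f"host<{insn}>")
            if insn.split()[0] in BRANCH_OPS:
                break
        return tuple(block)


guest = ["mov eax, 1", "add eax, 2", "jmp 0", "nop", "ret"]
tc = TranslationCache(guest)
first = tc.lookup(0)
second = tc.lookup(0)  # served from the cache, no second translation
tail = tc.lookup(3)
```

The point of the cache is that a block executed many times is translated once; it is also why an emulator has to notice writes to already translated guest code.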
Software Emulators
• Current emulator detection techniques consist of 2 steps:
  (1) Probing for the existence of a non-native system environment
  (2) Depending on the outcome of (1), performing different actions
• These techniques are easy to spot and mitigate
  • Powerful analysis methods like multi-path execution
• This paper proposes detection methods that use no explicit check and no conditional branch
Delusion Attacks - Motivation
• Self-Modifying Code (SMC)
  • On a native system, handling SMC correctly is complicated
    • Instruction prefetch
    • Multi-processor environment
  • Modern CPUs can handle these problems correctly
  • In an emulator, the CPU facilities for SMC detection cannot be utilized
    • SMC detection is implemented in software
    • Preparing a list of addresses of instructions → huge overhead
    • Most emulators (like QEMU) use page fault handling for SMC detection
      • All executable memory pages are set read-only
      • A memory write to executable memory triggers the page fault handler
      • In the handler, if the target memory should be writable (it is writable in the guest OS):
        1. Memory protection is changed to writable
        2. The memory write instruction is executed again
        3. Memory protection is changed back to read-only
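The three handler steps above can be sketched as a small simulation. It is a minimal model, assuming byte-granular writes and a flat address space; `EmulatedMemory`, `map_code_page`, and `smc_events` are hypothetical names chosen for illustration, not part of any emulator.

```python
# Toy model of SMC detection through write-protected code pages.
PAGE_SIZE = 4096


class EmulatedMemory:
    def __init__(self, size):
        self.data = bytearray(size)
        self.guest_writable = set()   # code pages the guest OS considers writable
        self.write_protected = set()  # pages the emulator keeps read-only
        self.smc_events = []          # addresses where code was overwritten

    def map_code_page(self, page, guest_writable=True):
        """Register an executable page; the emulator write-protects it."""
        if guest_writable:
            self.guest_writable.add(page)
        self.write_protected.add(page)

    def write(self, addr, value):
        page = addr // PAGE_SIZE
        if page in self.write_protected:
            self._page_fault(page, addr, value)
        else:
            self.data[addr] = value

    def _page_fault(self, page, addr, value):
        if page not in self.guest_writable:
            raise PermissionError(f"guest write to read-only page {page}")
        self.smc_events.append(addr)        # self-modifying code detected
        self.write_protected.discard(page)  # 1. make the page writable
        self.write(addr, value)             # 2. execute the write again
        self.write_protected.add(page)      # 3. make the page read-only again


mem = EmulatedMemory(2 * PAGE_SIZE)
mem.map_code_page(0)
mem.write(0x10, 0x90)           # write into code: goes through the fault path
mem.write(PAGE_SIZE + 4, 0x41)  # write into plain data: stored directly
```

Because the page is read-only again after every single write, each further write to code repeats the whole fault path, which is the behavior the delusion attacks rely on.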
Delusion Attacks – Basic Principle
• rep movs instruction
  • Copies a number of bytes, words, or double words within an implicit loop
    • esi: source memory location
    • edi: destination memory location
    • ecx: loop counter, decremented by 1 per iteration; the loop stops at 0
  • On a real machine, the copy loop is executed atomically
  • In an emulator, if the destination is a code address,
    • The first loop iteration triggers the page fault handler
    • The handler makes the page writable, re-executes the write operation, and makes the page read-only again
    • The instruction is re-read from memory (second loop iteration)
    • …
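The difference described above can be replayed with a tiny interpreter. This is a sketch under stated assumptions: it only understands the four encodings used in the example (`F3 A5` rep movsd, `90` nop, `FF D0` call eax, `FF D3` call ebx), and the `refetch` flag is a hypothetical stand-in for "the emulator re-reads the instruction after the page fault".

```python
# Replay the rep movsd example: OLD is overwritten with NEW while it executes.
REP_MOVSD = bytes([0xF3, 0xA5])
NOP = 0x90


def run(refetch):
    """Return the name of the first call target reached in OLD."""
    old = bytearray([0xF3, 0xA5, 0x90, 0x90, 0xFF, 0xD0, 0x90, 0x90, 0xC3])
    new = bytes([0x90, 0x90, 0x90, 0x90, 0xFF, 0xD3, 0x90, 0x90])
    ecx, esi, edi, eip = 2, 0, 0, 0
    while True:
        if old[eip:eip + 2] == REP_MOVSD:
            while ecx > 0:
                old[edi:edi + 4] = new[esi:esi + 4]  # copy one double word
                esi += 4
                edi += 4
                ecx -= 1
                if refetch and old[eip:eip + 2] != REP_MOVSD:
                    break  # instruction bytes changed: decode again at same eip
            else:
                eip += 2   # loop ran to completion, continue after rep movsd
        elif old[eip] == NOP:
            eip += 1
        elif old[eip] == 0xFF and old[eip + 1] == 0xD0:
            return "BENIGNCODE"     # call eax
        elif old[eip] == 0xFF and old[eip + 1] == 0xD3:
            return "MALICIOUSCODE"  # call ebx
        else:
            raise ValueError(f"unexpected byte at OLD+{eip:#x}")


native_result = run(refetch=False)
emulated_result = run(refetch=True)
```

No instruction in the snippet tests for an emulator; the two outcomes differ only because the second double word is or is not copied.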
Delusion Attacks – REP MOVS
Delusion Attacks – REP MOVS
On a real machine

lea eax, BENIGNCODE
lea ebx, MALICIOUSCODE
lea esi, NEW
lea edi, OLD
mov ecx, 2
OLD+0x0: rep movsd
OLD+0x2: nop
OLD+0x3: nop
OLD+0x4: call eax //BENIGNCODE
OLD+0x6: nop
OLD+0x7: nop
ret

NEW+0x0: nop
NEW+0x1: nop
NEW+0x2: nop
NEW+0x3: nop
NEW+0x4: call ebx //MALICIOUSCODE
NEW+0x6: nop
NEW+0x7: nop

Before the copy: ecx = 2
After iteration 1 (double word NEW+0x0..NEW+0x3 copied): ecx = 1, eip = OLD+0x0
After iteration 2 (double word NEW+0x4..NEW+0x7 copied): ecx = 0, eip = OLD+0x2
Delusion Attacks – REP MOVS
In QEMU

lea eax, BENIGNCODE
lea ebx, MALICIOUSCODE
lea esi, NEW
lea edi, OLD
mov ecx, 2
OLD+0x0: rep movsd
OLD+0x2: nop
OLD+0x3: nop
OLD+0x4: call eax //BENIGNCODE
OLD+0x6: nop
OLD+0x7: nop
ret

NEW+0x0: nop
NEW+0x1: nop
NEW+0x2: nop
NEW+0x3: nop
NEW+0x4: call ebx //MALICIOUSCODE
NEW+0x6: nop
NEW+0x7: nop

Before the copy: ecx = 2
Iteration 1 (double word NEW+0x0..NEW+0x3): read-only → page fault → writable → read-only; afterwards ecx = 1, eip = OLD+0x0
The instruction is re-read from memory: OLD+0x0 now holds a nop, so ecx stays 1 and eip moves on to OLD+0x1
The second double word (NEW+0x4..NEW+0x7) is never copied, so call eax remains at OLD+0x4
• Many kinds of caches are available on a contemporary system
• In an emulator, there is no explicit cache support, and all cache-related instructions have no effect
• On a real machine
  • A modification in the cache is not written back to memory immediately
• On an emulated machine
  • The modification is written directly to RAM
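The contrast between a write-back cache and an emulator without cache support can be sketched as follows. This is a deliberately simplified model: real caches work on lines and may write dirty data back at any time, and the class names `WriteBackMachine` and `CachelessEmulator` are hypothetical. The two-byte program is `FF D3` (call ebx), whose second byte the snippet tries to change to `D0` (call eax).

```python
# Toy comparison of invd on a write-back cache versus a cacheless emulator.
class WriteBackMachine:
    def __init__(self, memory):
        self.memory = memory
        self.dirty = {}  # addr -> byte that exists only in the cache

    def write(self, addr, value):
        self.dirty[addr] = value  # stays in the cache for now

    def read(self, addr):
        return self.dirty.get(addr, self.memory[addr])

    def wbinvd(self):
        """Write dirty data back to memory, then invalidate the cache."""
        for addr, value in self.dirty.items():
            self.memory[addr] = value
        self.dirty.clear()

    def invd(self):
        """Invalidate the cache without writing dirty data back."""
        self.dirty.clear()


class CachelessEmulator:
    def __init__(self, memory):
        self.memory = memory

    def write(self, addr, value):
        self.memory[addr] = value  # goes straight to RAM

    def read(self, addr):
        return self.memory[addr]

    def wbinvd(self):
        pass  # cache instructions have no effect

    def invd(self):
        pass


def run_snippet(machine_cls):
    machine = machine_cls(bytearray([0xFF, 0xD3]))  # A: call ebx
    machine.wbinvd()
    machine.write(1, 0xD0)  # try to patch the instruction into call eax
    machine.invd()
    return {0xD3: "MALICIOUSCODE", 0xD0: "BENIGNCODE"}[machine.read(1)]


native_result = run_snippet(WriteBackMachine)
emulated_result = run_snippet(CachelessEmulator)
```

In this model the patch is discarded on the write-back machine and survives in the emulator, which matches the outcome described for the INVD example.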
Delusion Attacks - INVD
Delusion Attacks - INVD
On a real machine

lea eax, BENIGNCODE
lea ebx, MALICIOUSCODE
lea esi, A // esi = A+0x0
inc esi // esi = A+0x1
wbinvd
mov byte ptr [esi], 0xD0 // the modification is done in the cache, not yet written back to memory
invd // the cache is now invalidated
A:
call ebx // FF D3 = call ebx, FF D0 = call eax

Result: MALICIOUSCODE is called
Delusion Attacks - INVD
In QEMU

lea eax, BENIGNCODE
lea ebx, MALICIOUSCODE
lea esi, A // esi = A+0x0
inc esi // esi = A+0x1
wbinvd
mov byte ptr [esi], 0xD0 // the modification is written directly to memory
invd
A:
call ebx // FF D3 = call ebx, FF D0 = call eax; now executed as call eax

Result: BENIGNCODE is called instead of MALICIOUSCODE
• On x86/64 architectures from Intel and AMD, the branch tracing (BT) facilities can record all pairs of the source address and the destination address of branch operations
• The information can be used to reconstruct the execution/decision path taken during execution
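How a list of (source, destination) pairs yields an execution path can be sketched in a few lines. The record format here is an idealized assumption (plain integer pairs and a known entry address); the function name `reconstruct_path` is hypothetical and the actual hardware log layout is not modeled.

```python
# Rebuild the straight-line code ranges executed between recorded branches.
def reconstruct_path(entry, branches):
    """Return (ranges, resume): each range is (first address, branch source).

    entry:    address where tracing started
    branches: list of (src, dst) pairs in the order they were recorded
    resume:   address where execution continued after the last branch
    """
    ranges = []
    start = entry
    for src, dst in branches:
        ranges.append((start, src))  # code ran linearly from start up to src
        start = dst                  # then control jumped to dst
    return ranges, start


ranges, resume = reconstruct_path(
    0x1000,
    [(0x1010, 0x2000), (0x2008, 0x1014)],
)
```

Everything between two records is straight-line code, so the pairs alone are enough to know which ranges ran and in which order.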
Binary Analysis with Branch Tracing
• Fuzzing, a kind of automated vulnerability analysis, produces a large number of crash reports
• Binning: a technique that groups crash reports by similar root cause
  • This technique can also be used to group a set of exploits by the category of the exploited vulnerability
  • Comparing the control paths reconstructed from BT logs makes binning easy to realize
Experiments 1: Binning of Malicious PDF Documents
• CWXDetector
  • A tool that is capable of detecting exploitation attempts and extracting shellcode
  • It does not become active before the execution of the first shellcode instruction
    → no information can be gained about the underlying vulnerability
• Combining BT with CWXDetector makes it possible to trace back from the execution of the first shellcode instruction to the root cause of the vulnerability
• The experiment
  • 4,869 malicious PDF documents
  • Each file exploits some kind of vulnerability in Acrobat Reader 9.00
Experiments 1: Binning of Malicious PDF Documents
• Normalization
  • Because of ASLR, the branch addresses are recorded in the form of relative addresses
  • Collapsing loops
  • Removing internal exception handling of the Windows system
  • Ignoring the shellcode part
• Clustering algorithm
  • DBSCAN
  • Jaro-Winkler distance
    • Measures the difference between two strings
      • Similar strings → higher score
      • Similar prefixes → higher score
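The normalization and the string measure above can be sketched together. This is an illustration, not the authors' implementation: the module map, the rule of dropping records outside known modules, and the collapsing of only immediately repeated records are simplifying assumptions, and `normalize`, `jaro`, and `jaro_winkler` are names chosen here. The Jaro-Winkler part follows the standard definition with prefix weight 0.1 and a prefix of at most 4.

```python
# Sketch: normalize a branch trace, then score two sequences with Jaro-Winkler.
def normalize(trace, modules):
    """trace: list of (src, dst); modules: name -> (base, size), hypothetical."""
    def relative(addr):
        for name, (base, size) in modules.items():
            if base <= addr < base + size:
                return (name, addr - base)  # ASLR-independent form
        return None                         # outside every known module

    result = []
    for src, dst in trace:
        record = (relative(src), relative(dst))
        if None in record:
            continue  # assumption: drop records that leave the known modules
        if result and result[-1] == record:
            continue  # crude loop collapsing: keep one copy of a repeat
        result.append(record)
    return result


def jaro(s, t):
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit = [False] * len(s)
    t_hit = [False] * len(t)
    matches = 0
    for i, item in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == item:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [s[i] for i in range(len(s)) if s_hit[i]]
    t_seq = [t[j] for j in range(len(t)) if t_hit[j]]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s, t, prefix_weight=0.1, max_prefix=4):
    """Similarity in [0, 1]; a shared prefix raises the score."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s, t):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_weight * (1.0 - base)


modules = {"a.dll": (0x10000000, 0x1000)}
raw = [(0x10000010, 0x10000020), (0x10000010, 0x10000020), (0x10000030, 0x00400000)]
normalized = normalize(raw, modules)
score = jaro_winkler("MARTHA", "MARHTA")
```

A distance for clustering can be taken as `1 - jaro_winkler(a, b)`. The functions accept any sequences, so normalized branch records can be compared the same way as characters.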
Experiments 1: Binning of Malicious PDF Documents
Experiments 1: Binning of Malicious PDF Documents
k: minimum cluster size
ε: maximum distance between two objects that belong to the same cluster
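The role of the two parameters can be seen in a minimal DBSCAN sketch. It assumes that k corresponds to the usual DBSCAN minimum-points threshold (here `min_pts`) and ε to the neighborhood radius (`eps`); the function signature is hypothetical, and the quadratic neighbor search is chosen for brevity, not speed.

```python
# Minimal DBSCAN over an arbitrary distance function; label -1 means noise.
NOISE = -1


def dbscan(items, distance, eps, min_pts):
    labels = [None] * len(items)

    def neighbors(i):
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE        # may still become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: grow the cluster
        cluster += 1
    return labels


points = [0, 1, 2, 50, 51, 52, 90]
labels = dbscan(points, lambda a, b: abs(a - b), eps=1, min_pts=2)
```

With a trace distance such as one minus a string similarity in place of `abs(a - b)`, the same routine groups traces. A larger `eps` merges clusters, and a larger `min_pts` turns small groups into noise.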
• Comparing with Wepawet
  • 5 different vulnerability signatures (only addressing exploits of Acrobat Reader 9.00)
  • A small number of samples are not detected by Wepawet as exploiting Acrobat Reader 9.00
    → manually verified: Wepawet is wrong
  • Some samples are labeled incorrectly
    → manually verified: Wepawet is wrong
• Performance
  • Time from opening the document to the execution of shellcode
  • Min: 11s (2s w/o BT)
  • Max: 406s (117s w/o BT)
  • Avg: 129s (11s w/o BT)
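The timing figures above translate into slowdown factors with a few divisions. Note the assumption: the minimum and maximum with and without BT may come from different samples, so only the average ratio should be read as a typical overhead.

```python
# Slowdown factors computed from the reported timings (seconds).
timings = {"min": (11, 2), "max": (406, 117), "avg": (129, 11)}

slowdown = {
    name: with_bt / without_bt
    for name, (with_bt, without_bt) in timings.items()
}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.1f}x")  # min: 5.5x, max: 3.5x, avg: 11.7x
```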
Experiments 1: Binning of Malicious PDF Documents
• See Appendix B of the technical report
• In Anubis, this sample behaved normally
Experiment 3: Practical Delusion Attack with a PDF File
• The data from BT logs is coarse
• The prototype could be detected by timing measurements
• An attacker in ring 0 is capable of disabling BT
  • Could be combined with a hardware-assisted hypervisor
Limitations
• Many analysis techniques utilize software emulators
• Attackers still have methods to evade analysis in an emulated environment
• A new approach for dynamic code analysis that uses CPU-assisted branch tracing offers a granularity between instruction- and function-level monitoring with reasonable overhead
• Practical results show that the BT traces contain enough information to assist some tasks in malware and vulnerability analysis
Conclusion