Special Topics in Computer Security
Malware Analysis (1/2)
Cho, Seong-je
Spring, 2020
Computer Security & OS Lab.
Dankook University
References
A general definition of malware, S. Kramer and J. C. Bradfield, J Comput Virol (2010) 6:105–114
Malware Incident Response - Static Analysis, CIS 6395, Incident Response Technologies, Fall 2016, Dr. Cliff Zou, UCF
Practical Malware Analysis, Kris Kendall and Chad McMillan, Mandiant (Intelligent Information Security, Black Hat
COEN 252 Computer Forensics, Investigating Hacker Tools
CS155: Computer and Network Security (Stanford Univ.)
Introduction to Malware, Murat Kantarcioglu, UT Dallas
Wikipedia
Many slides come from the references above. Please do not replicate, distribute, upload, or post these lecture notes.
Computer Security & OS Lab, DKU
Contents
Malware Detection
● Signature-based, Fingerprint-based
● Behavior-based
● Heuristic-based
Statistical Structures: Fingerprinting Malware for Classification and Analysis
Malware Analysis
● Static Analysis
● Dynamic Analysis
Computer Virus
Program that inserts itself into one or more files and performs some action
● Insertion phase: the virus inserts itself into a file
● Execution phase: the virus performs some (possibly null) action
Pseudocode
beginvirus:
    if spread-condition then begin
        for some set of target files do begin
            if target is not infected then begin
                determine where to place virus instructions
                copy instructions from beginvirus to endvirus into target
                alter target to execute added instructions
            end;
        end;
    end;
    perform some action(s)
    goto beginning of infected program
endvirus:
Malware Detection
Anti-virus software (= virus scanner) typically employs a variety of methods to detect malware programs
● Signature-based scanning
− Fingerprinting method
− A virus signature is the fingerprint of a virus. It is a set of unique data, or bits of code, that allow it to be identified. (source: Computer Hope)
− A virus signature is a continuous sequence of bytes that is common for a certain malware sample. (source: Kaspersky.com)
− Anti-virus software uses a virus signature to find a virus in a computer file system, allowing it to detect, quarantine, and remove the virus.
− Anti-virus software uses the virus signature to scan for the presence of malicious code.
● Heuristic-based detection
● Behavioral detection
Quarantine: to isolate; isolation
Fingerprint
A fingerprinting algorithm is a procedure that maps an arbitrarily large data item (such as a computer file) to a much shorter bit string, its fingerprint
● The fingerprint uniquely identifies the original data for all practical purposes, just as human fingerprints uniquely identify people for practical purposes.
● This fingerprint may be used for data deduplication purposes.
● This is also referred to as file fingerprinting, data fingerprinting, or structured data fingerprinting.
File fingerprinting
● Fingerprints serve as unique identifiers for their corresponding data and files
● Fingerprinting does not always work for certain file types, including documents that are encrypted or password protected, images and videos, and data in which the text does not perfectly match a predefined document fingerprint.
Fingerprint functions
Cryptographic hash functions generally can serve as high-quality fingerprint functions, are subject to intense scrutiny from cryptanalysts, and have the advantage that they are believed to be safe against malicious attacks.
Source: Wikipedia and https://hackaday.com/2015/11/10/your-unhashable-fingerprints-secure-nothing/
• Hash value
• One-way
• Collision
File Fingerprinting
As a first step, fingerprint the files you are examining so you will know if they change during analysis
Use md5deep, md5sum, etc.
When you have completed your analysis, or at various points during it, re-check the files against the recorded fingerprints (for example, with md5sum -c)
● The check verifies the contents of each file
● If nothing has changed, the output is hello.c: OK
● After changing the contents of the file md5sum_hello_files.txt, the output will be hello.c: FAILED
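The fingerprint-then-verify workflow above can be sketched in Python with the standard hashlib module. This is only an illustration, not how md5sum is implemented: the helper names fingerprint and verify are hypothetical, the demo file contents are made up, and SHA-256 is used instead of MD5 as an assumption, since practical MD5 collisions are known.

```python
import hashlib
import os
import tempfile

def fingerprint(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(manifest):
    """Map each path in {path: recorded digest} to 'OK' or 'FAILED'."""
    return {
        path: "OK" if fingerprint(path) == recorded else "FAILED"
        for path, recorded in manifest.items()
    }

# Demo on a throwaway file (hypothetical contents).
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "hello.c")
    with open(sample, "w") as handle:
        handle.write("int main(void) { return 0; }\n")
    manifest = {sample: fingerprint(sample)}  # record before analysis
    before = verify(manifest)[sample]         # file is still unchanged
    with open(sample, "a") as handle:
        handle.write("/* modified */\n")
    after = verify(manifest)[sample]          # file was modified
```

Recording the manifest before analysis and re-running verify afterwards mirrors the OK / FAILED check shown on the slide.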
Virus Scan
Multiple viruses may have the same virus signature, which allows antivirus programs to detect multiple viruses when looking for a single virus signature.
● Because of this sharing of the same virus signature between multiple viruses, antivirus programs can sometimes detect an unknown virus.
● A new virus has a virus signature that is not used by other viruses, but new "strains" of a known virus sometimes use the same virus signature as earlier strains.
● Source: What is a Virus Signature? - Computer Hope
Virus Scan
● Always scan new malware with an up-to-date virus scanner
● If the code is not sensitive, consider submitting it to https://www.virustotal.com/
Strain: a type or variety (of an animal, plant, disease, etc.)
Malware Defenses (Detection & Analysis)
Distinguish between data and instructions
Limit objects accessible to processes
Inhibit sharing
Detect altering of files
Detect actions beyond specifications
Analyze statistical characteristics
https://personal.utdallas.edu/~muratk/courses/dbsec09s_files/Malware-Intro.pdf
Statistical Structures: Fingerprinting Malware for Classification and Analysis
Daniel Bilar
Wellesley College (Wellesley, MA)
Colby College (Waterville, ME)
Proceedings of Black Hat Federal 2006 (2006).
Why Structural Fingerprinting?
Goal: Identifying and classifying malware
Problem: For any single fingerprint, a balance between over-fitting (type II error) and under-fitting (type I error) is hard to achieve.
● Type I error: the rejection of a true null hypothesis (FP)
● Type II error: the non-rejection of a false null hypothesis (FN)
The counts below treat goodware as the positive class:
FN: The number of benign software cases incorrectly detected as malware
FP: The number of malware cases misclassified as benign software
TP: The number of goodware cases correctly classified
TN: The number of malware cases correctly classified
Approach: View binaries simultaneously from different structural perspectives and perform statistical analysis on these ‘structural fingerprints’
Different Definitions: TP, TN, FP, FN
One definition
● True Positive (TP): Number of goodware applications correctly identified as goodware.
● True Negative (TN): Number of malware applications correctly identified as malware.
● False Positive (FP): Number of malware applications wrongly identified as goodware.
● False Negative (FN): Number of goodware applications wrongly identified as malware.
Source: “Permission-Based Android Malware Detection”
The other definition
● TP: The number of malware samples correctly classified
− Number of dirty files classified as dirty
● TN: The number of benign data correctly classified
− Number of clean files classified as clean
● FP: The number of benign samples classified as malicious
− Number of clean files classified as dirty
− The number of benign apps that are incorrectly classified as malware
● FN: The number of malware samples classified as benign
− Number of dirty files classified as clean
Source: “Selecting Features to Classify Malware”
“PUMA: Permission Usage to detect Malware in Android”
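The two conventions differ only in which class is called "positive". A minimal Python sketch makes this concrete; the function name confusion_counts and the sample labels are hypothetical, chosen only for illustration.

```python
def confusion_counts(actual, predicted, positive="malware"):
    """Count TP, TN, FP, FN, treating `positive` as the positive class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts

# Hypothetical ground truth and scanner verdicts for seven files.
actual    = ["malware", "malware", "malware", "malware", "benign", "benign", "benign"]
predicted = ["malware", "malware", "benign", "benign", "malware", "benign", "benign"]

# Second definition on the slide: malware is the positive class.
malware_positive = confusion_counts(actual, predicted, positive="malware")
# First definition on the slide: goodware is the positive class.
goodware_positive = confusion_counts(actual, predicted, positive="benign")
```

With the same verdicts, the FP and FN counts swap between the two conventions, so a reported false positive rate is only meaningful once the positive class is stated.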
Different Perspectives
Idea: Multiple perspectives may increase likelihood of correct identification and classification
Static + Dynamic
Fingerprint: Opcode frequency distribution
Synopsis: Statically disassemble the binary, tabulate the opcode frequencies and construct a statistical fingerprint with a subset of said opcodes.
Goal: Compare opcode fingerprint across non-malicious software and malware classes for quick identification and classification purposes.
Main result: ‘Rare’ opcodes explain more data variation than common ones
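The tabulation step can be sketched as follows, assuming the disassembler output has already been reduced to a list of opcode mnemonics. The function name opcode_fingerprint, the sample listing, and the tracked subset are hypothetical; the original study used IDA with an InstructionCounter plugin rather than this code.

```python
from collections import Counter

def opcode_fingerprint(mnemonics, tracked):
    """Return the relative frequency of each tracked opcode in a listing."""
    counts = Counter(m.lower() for m in mnemonics)
    total = sum(counts.values())
    if total == 0:
        return {op: 0.0 for op in tracked}
    return {op: counts[op] / total for op in tracked}

# Hypothetical mnemonics taken from a disassembly listing.
listing = ["push", "mov", "mov", "call", "pop", "mov", "ret", "xor"]
fingerprint = opcode_fingerprint(listing, tracked=["mov", "push", "call", "int"])
```

Two binaries can then be compared by comparing their frequency dictionaries over the same tracked subset.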
Example of Disassembled Code
Source: https://www.fireeye.com/blog/threat-research/2017/12/recognizing-and-avoiding-disassembled-junk.html
Goodware: Opcode Distribution
Procedure:
1. Inventoried PEs (EXE, DLL, etc.) on XP box with Advanced Disk Catalog
2. Chose random EXE samples with MS Excel and Index your Files
3. Ran IDA with modified InstructionCounter plugin on sample PEs
4. Augmented IDA output files with PEiD results (compiler) and general ‘functionality class’ (e.g. file utility, IDE, network utility, etc.)
5. Wrote Java parser for raw data files and fed JAMA’ed matrix into Excel for analysis
Inventory: to make a list of …
JAMA is a basic linear algebra package for Java. It provides user-level classes for manipulating real, dense matrices.
Malware: Opcode Distribution
Procedure:
1. Booted VMPlayer with XP image
2. Inventoried PEs from Chris Ries’ malware collection with Advanced Disk Catalog
3. Fixed 7 classes (e.g. virus, rootkit, etc.), chose random PE samples with MS Excel and Index your Files
4. Ran IDA with modified InstructionCounter plugin on sample PEs
5. Augmented IDA output files with PEiD results (compiler, packer) and ‘class’
6. Wrote Java parser for raw data files and fed JAMA’ed matrix into Excel for analysis
VMPlayer: VMware Workstation Player
The authors did joint work with Chris Ries
Aggregate (Goodware): Opcode Breakdown
20 EXEs
(size-blocked random samples from 538 inventoried EXEs)
~1,520,000 opcodes read
192 out of 398 possible opcodes found
72 opcodes in pie chart account for >99.8%
14 opcodes labelled account for ~90%
Top 5 opcodes account for ~64 %
Aggregate (Malware): Opcode Breakdown
67 PEs
(class-blocked random samples from 250 inventoried PEs)
~665,000 opcodes read
141 out of 398 possible opcodes found (two undocumented)
60 opcodes in pie chart account for >99.8%
14 opcodes labelled account for >92%
Top 5 opcodes account for ~65%
Top 14 Opcodes: Frequency
RK = Rootkit
Comparison Opcode Frequencies
Perform distribution tests for top 14 opcodes on 7 classes of malware:
● Rootkit (kernel + user)
● Virus and Worms
● Trojan and Tools
● Bots
Investigate: Which, if any, opcode frequency is significantly different for malware?
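The slides do not say which statistical test produced the z-scores. As one plausible illustration only, a two-proportion z-test compares how often an opcode appears in a malware class against how often it appears in goodware. The function name and the counts are hypothetical and are not taken from the study.

```python
import math

def two_proportion_z(count_a, total_a, count_b, total_b):
    """z-score for the difference between two opcode proportions.

    count_a / total_a is the opcode's share in sample A (e.g. a malware class),
    count_b / total_b its share in sample B (e.g. goodware).
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 0.0  # opcode absent from (or filling) both samples
    return (p_a - p_b) / std_err

# Hypothetical counts: 300 of 1000 opcodes in A, 250 of 1000 in B.
z = two_proportion_z(300, 1000, 250, 1000)
```

A large positive z suggests the opcode is more frequent in sample A, a large negative z suggests it is less frequent, and a value near zero suggests no clear difference.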
Top 14 Opcode Testing (z-scores)
Tests suggest that opcode frequencies in malware vs. goodware are roughly:
● 1/3 same
● 1/3 lower
● 1/3 higher
Top 14 Opcodes Results Interpretation
Rare 14 Opcodes (parts per million)
Rare 14 Opcode Testing (z-scores)
Tests suggest that opcode frequencies in malware vs. goodware are roughly:
● 1/10 lower
● 1/5 higher
● 7/10 same
Rare 14 Opcodes: Interpretation
Summary: Opcode Distribution
Compare opcode fingerprints against various software classes for quick identification and classification
Malware opcode frequency distribution seems to deviate significantly from non-malicious software
‘Rare’ opcodes explain more frequency variation than common ones
Opcodes: Further directions
Acquire more samples and software class differentiation
Investigate sophisticated tests for stronger control of false discovery rate and type I error
● Type I error (FP)
− FP: The number of malware cases misclassified as benign software
Study n-way association with more factors (compiler, type of opcodes, size)
Go beyond isolated opcodes to semantic ‘nuggets’ (size-wise between isolated opcodes and basic blocks)
Investigate equivalent opcode substitution effects
Nugget: a small but valuable idea or fact (= snippet)
Fingerprint: Win 32 API calls
Synopsis: Observe and record Win32 API calls made by malicious code during execution, then compare them to calls made by other malicious code to find similarities
− Dynamic analysis
Goal: Classify malware quickly into a family (set of variants make up a family)
Main result: Simple model yields > 80% correct classification; call vectors seem robust towards different packers
Win 32 API call: System overview
Data Collection: Run malicious code, recording Win32 API calls it makes
Vector Builder: Build count vector from collected API call data and store in database
Comparison: Compare vector to all other vectors in the database to see if it is related to any of them
Win 32 API Call: Data Collection
Malware runs for a short period of time on a VMware machine and can interact with a fake network
API calls recorded by logger, passed on to Relayer
Relayer forwards logs to file, console
Win 32 API Call: Call Recording
Malicious process is started in suspended state
DLL is injected into process’s address space
When DLL’s DllMain() function is executed, it hooks the Win32 API function
Hook records the call’s time and arguments, calls the target, records the return value, and then returns the target’s return value to the calling function.
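The record / call / record / return sequence of a hook can be mimicked in Python with a wrapper function. This is only an analogy for the control flow: it does not perform DLL injection or touch the Win32 API, and create_file and call_log are hypothetical stand-ins.

```python
import functools
import time

call_log = []  # hypothetical in-memory stand-in for the logger

def hook(target):
    """Wrap `target` so every call is recorded around the real call."""
    @functools.wraps(target)
    def hooked(*args, **kwargs):
        entry = {"name": target.__name__, "time": time.time(),
                 "args": args, "kwargs": kwargs}   # record time and arguments
        result = target(*args, **kwargs)           # call the real target
        entry["return"] = result                   # record the return value
        call_log.append(entry)
        return result                              # hand it back to the caller
    return hooked

@hook
def create_file(name):  # hypothetical stand-in for a hooked API function
    return len(name)

value = create_file("a.txt")
```

The caller still receives the target's return value, so the hooked program behaves as before while every call leaves a log entry.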
Figure: function call before hooking vs. function call after hooking
Win 32 API call: Call Vector
Each column of the vector represents a hooked function and holds the number of times it was called
1200+ different functions recorded during execution
For each malware specimen, vector values recorded to database
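Building such a count vector can be sketched as below, assuming the recorded log has been reduced to a list of function names. The helper name build_call_vector and the short log are hypothetical, and the real system tracked more than 1200 functions rather than four.

```python
from collections import Counter

def build_call_vector(call_log, hooked_functions):
    """Return one count per hooked function, in a fixed column order."""
    counts = Counter(call_log)
    return [counts[name] for name in hooked_functions]

# Fixed column order shared by every specimen (tiny illustrative subset).
hooked_functions = ["CreateFileA", "RegSetValueExA", "connect", "send"]
# Hypothetical log of calls observed while one specimen ran.
observed_calls = ["CreateFileA", "connect", "send", "send", "CreateFileA", "send"]

vector = build_call_vector(observed_calls, hooked_functions)
```

Keeping the column order fixed across specimens is what makes the vectors comparable with one another.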
Win 32 API call: Comparison
Computes cosine similarity measure csm between vector and each vector in the database
If csm(vector, most similar vector in the database) > threshold, vector is classified as a member of the family of the most similar vector
Otherwise, vector is classified as a member of a new family (no-variants-yet)
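The comparison rule can be sketched in Python. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The threshold of 0.8, the family names, and the vectors are assumptions for illustration, since the slides do not give the actual values.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

def classify(vector, database, threshold=0.8):
    """Return the family of the most similar stored vector, if close enough."""
    best_family, best_score = None, -1.0
    for family, known in database:
        score = cosine_similarity(vector, known)
        if score > best_score:
            best_family, best_score = family, score
    if best_score > threshold:
        return best_family
    return "no-variants-yet"

# Hypothetical database of (family, call vector) pairs.
database = [("family-a", [2, 0, 1, 3]), ("family-b", [0, 5, 0, 1])]
label_known = classify([4, 0, 2, 7], database)  # close to family-a
label_new = classify([0, 0, 9, 0], database)    # unlike either family
```

Because cosine similarity ignores vector length, a specimen that makes the same mix of calls but runs for longer still lands near its family.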
Win 32 API call: Results
Win 32 API call: Packers
Wide variety of different packers used within same families
Dynamic Win 32 API call fingerprint seems robust towards packer
8 Netsky variants in sample, 7 identified
Summary: Win 32 API calls
Allows researchers and analysts to quickly identify variants reasonably well, without manual analysis
Simple model yields > 80% correct classification
Resolved discrepancies between some AV scanners
Dynamic API call vectors seem robust towards different packers
API call : Further directions
Acquire more malware samples for better variant classification
Explore resiliency to obfuscation techniques (substitutions of Win 32 API calls, call spamming)
− Call spamming == Obfuscated API calls
Investigate patterns of ‘call bundles’ instead of just isolated calls for richer identification
Replace VSM with finite state automaton that captures rich set of call relations
● VSM: presumably the Vector Space Model, which matches the count vectors and cosine similarity used for comparison
− Other expansions (Virtual State Machine, Value Stream Mapping) do not fit this context
Fingerprint: PDG measures
Program Dependence Graph
● Control dependence
● Data dependence
Synopsis: Represent binaries as a System Dependence Graph, extract graph features to construct ‘graph-structural’ fingerprints for particular software classes
Goal: Compare ‘graph structure’ fingerprint of unknown binaries across non-malicious software and malware classes for identification, classification and prediction purposes
Main result: Work in progress
Program Dependence Graph
A PDG models intra-procedural dependences
Data Dependence:
Program statements compute data that are used by other statements.
Control Dependence:
Arises from the ordered flow of control in a program.
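Data dependence can be illustrated for straight-line code with a short Python sketch. It links each use of a variable to the statement that last defined it. The representation of statements as (defined variable, used variables) pairs is an assumption for illustration, and control dependence and branching are deliberately left out.

```python
def data_dependences(statements):
    """Return edges (i, j) meaning statement j uses a value defined by i.

    `statements` is a list of (defined_variable, used_variables) pairs in
    program order; only straight-line code is handled.
    """
    last_definition = {}
    edges = []
    for index, (defined, used) in enumerate(statements):
        for variable in used:
            if variable in last_definition:
                edges.append((last_definition[variable], index))
        if defined is not None:
            last_definition[defined] = index
    return edges

program = [
    ("a", []),          # 0: a = 1
    ("b", ["a"]),       # 1: b = a + 2
    ("a", ["a", "b"]),  # 2: a = a * b
    (None, ["a"]),      # 3: print(a)
]
edges = data_dependences(program)
```

Graph features such as the number of edges or the in-degree of each statement could then be read off the edge list.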
Picture from J. Stafford (Colorado, Boulder)
Fingerprint: PDG measures
For more info., please visit
● https://blackhat.com/presentations/bh-usa-06/BH-US-06-Bilar.pdf
● http://muhaz.org/program-slicing-theory-and-practice-tibor-gyimthy.html
Malware Analysis
Static feature (?)
Dynamic feature (?)
for machine learning
Practical Malware Analysis, K. Kendall & C. McMillan, Mandiant
Program Analysis
Given an executable, how do we find out what it does?
● Try to find the program online.
− Analyze source code to find clues.
− Search for the name of the program.
● Perform (source) code review.
● Execute the program in a sandbox.
− Some programs can break out of a sandbox / jail.
☞ Program Compilation
● Stripping: Removes all human-readable symbols from object code.
− Combats reverse engineering.
● Packing with UPX, etc.
− Compresses code (achieves ratios of 20%~40%)
Cheat Sheet for Analyzing Malicious Software
General Approach
1. Set up a controlled, isolated laboratory in which to examine the malware specimen.
2. Perform behavioral analysis to examine the specimen’s interactions with its environment.
3. Perform static code analysis to further understand the specimen’s inner workings.
4. Perform dynamic code analysis to understand the more difficult aspects of the code.
5. If necessary, unpack the specimen.
6. Repeat steps 2, 3, and 4 (order may vary) until sufficient analysis objectives are met.
7. Document findings and clean up the laboratory for future analysis.
https://zeltser.com/reverse-malware-cheat-sheet/
Specimen: a sample or example; a sample taken for medical testing
Static vs. Dynamic Analysis
Static analysis
● Code is Not executed
● Static reverse engineering
− Viewers or Editors for executables: Binary viewer, PEiD, Hex Workshop, …
• Hex Workshop: Hex editor, Sector editor, Base converter and Hex calculator for Windows
− Disassembling, Decompiling
● Autopsy or Dissection of “Dead” Code
Dynamic analysis
● Observing and controlling running (“live”) code
● Dynamic reverse engineering
− Debugging
− Emulator (or VM)
● Ant farm
The fastest path to the Best answers will usually involve a combination of Both
Static Analysis
Static Analysis
● Determine the type of executable.
− ELF file in Unix
− PE file (Exe-type) in Windows
− DEX file in Android
● Symbol Extraction:
− Use a program like strings to find symbols left in object code.
− Names give hints on program.
− Will not work for stripped files.
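Determining the executable type usually starts with the file's leading magic bytes. The sketch below checks the well-known prefixes for ELF, PE (the "MZ" DOS header), and DEX. The function names are hypothetical, and a thorough PE check would also follow the DOS header to the "PE" signature, which this sketch skips.

```python
# Leading bytes that identify each executable format.
MAGIC_NUMBERS = [
    (b"\x7fELF", "ELF"),  # Unix/Linux executables and shared objects
    (b"MZ", "PE"),        # DOS header that begins Windows PE files
    (b"dex\n", "DEX"),    # Android Dalvik executables
]

def identify_executable(header):
    """Guess the executable type from the first bytes of a file."""
    for magic, kind in MAGIC_NUMBERS:
        if header.startswith(magic):
            return kind
    return "unknown"

def identify_file(path):
    """Read only the first few bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return identify_executable(handle.read(8))
```

Because only the header is read and nothing is executed, this check is safe to run on a suspicious file.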
Static analysis is Safer
● Since we aren’t actually running malicious code, we don’t have to worry (as much) about creating a safe environment.
● If possible, perform static analysis in a different OS than your malware targets
− Analyst can reduce Risk using Platform Diversity
− IDA Pro for OS X (?)
What is Static Analysis?
Analysis of malware performed without actually executing the rogue code
Analysis can be performed on any platform because you do not intend to run the malware, which may be platform-specific (e.g., a Win32 executable)
Some questions to be answered include:
◦ What type of file is this? (batch file, shell script, Windows executable, Android DEX, Linux ELF, JavaScript, etc.)
◦ What does it do?
◦ Does it spread itself via physical media or network resources?
◦ Does it steal, alter, or delete information?
Rogue: (adjective) living apart from the herd (and therefore possibly dangerous); (noun) a swindler, criminal (= rascal), villain
General Procedures of Static analysis
Determine the type of file you are examining, its internal structures (sections and headers)
Review the ASCII and Unicode strings contained within the binary file
Submit the code to an anti-virus program or an online scanner such as https://www.virustotal.com; signature analysis may help determine the name and functionality of the malware
Perform additional online research to determine the malware’s purpose and capabilities
• string_ids: table of the strings used by the code (addresses only)
• type_ids: table of the class and method types (addresses only)
• proto_ids: table of method prototypes, i.e. parameter and return type info (addresses only)
• Method, Class, ...: the actual method, class, field, and data containers
Source: Inc0gnito 2015 Android DEX Analysis Technique, 김남준
Disassembly
Disassembler:
● Decodes binary machine code into a readable assembly language text
Automated disassemblers can take machine code and “reverse” it to a slightly higher level
Many tools can disassemble x86 code
● objdump, Python w/ libdisassemble, IDA Pro
● ILDasm (Microsoft .Net IL disassembler)
But IDA Pro is what most analysts use
Manual examination of disassembly is somewhat painstaking, slow, and can be hard
● Keep your goals in mind and don’t get bogged down
Bog: to sink or get stuck in a mire; to stall. Bog down: to become stuck or deadlocked.
Reverse Engineering Android: Disassembly
Unzip APK & disassemble classes.dex
Source: Reverse Engineering Android: Disassembly & Code Injection, Thanasis Petsas, SYSSEC-Project.eu
● http://www.syssec-project.eu/m/page-media/158/syssec-summer-school-Android-Code-Injection.pdf
smali/baksmali is an assembler/disassembler for the dex format used by Dalvik.
● baksmali takes a dex file and produces human-readable assembly, and smali takes the human-readable assembly and produces a dex file.
Apktool is a more general tool for unpacking and repacking an apk.
● It uses smali/baksmali under the hood in order to assemble/disassemble the dex file.
● It unpacks the binary resources and binary xml files back into the standard textual format.
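Since an APK is a ZIP archive, the unzip step can be sketched with Python's standard zipfile module. The sketch only locates and reads the dex entries; disassembling them still requires a tool such as baksmali or Apktool. The helper names are hypothetical, and the archive built here is a tiny in-memory stand-in, not a real APK.

```python
import io
import zipfile

def list_dex_entries(apk_bytes):
    """Return the names of all .dex entries inside an APK (a ZIP archive)."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(name for name in apk.namelist() if name.endswith(".dex"))

def read_entry(apk_bytes, name):
    """Return the raw bytes of one archive entry."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return apk.read(name)

# Build a small stand-in archive in memory (placeholder contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as fake_apk:
    fake_apk.writestr("AndroidManifest.xml", b"<binary xml placeholder>")
    fake_apk.writestr("classes.dex", b"dex\n035\x00placeholder")
fake_apk_bytes = buffer.getvalue()

dex_names = list_dex_entries(fake_apk_bytes)
dex_header = read_entry(fake_apk_bytes, "classes.dex")[:4]
```

Reading entries into memory rather than extracting them to disk avoids writing attacker-controlled file names into the analysis machine's file system.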
Decompile, Dump
Decompilers
● Attempt to produce a high-level language source-code-like representation from a binary.
● Never completely possible because
− The compiler removes some information,
− The compiler optimizes the code.
Executable-dumping
● Dumpbin (MS)
● PEView
● PEBrowse Professional
PEiD (PE iDentifier)
PEiD detects most common packers, cryptors and compilers for PE files.
It can currently detect more than 470 different signatures in PE files.
It seems that the official website (www.peid.info) has been discontinued.
● Hence, the tool is no longer available from the official website, but it is still hosted on other sites.
Source: https://www.aldeid.com/wiki/PEiD, https://tuts4you.com/e107_plugins/download/download.php?view.398
Signature-based malware scanning
● Extremely low false positive (FP) rate
− Probability of mistaking a goodware program for a malware program is very low.
● Less proactive than desired
Most signatures used in existing signature-based malware scanners are hash signatures, each of which is the hash of a malware file.
● The number of malware samples covered by each hash signature is low – typically one.
One possible solution is to replace hash signatures with string signatures, each of which corresponds to a short, contiguous byte sequence from a malware binary.
● Thus, each string signature can cover many malware files.
● Hancock is an automatic string signature generation system
− It generates high-quality string signatures with minimal FPs and maximal malware coverage.
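The difference in coverage between hash signatures and string signatures can be shown with a short Python sketch. It is not Hancock's generation algorithm: the byte sequences are made up, the function names are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib

def hash_signature(sample):
    """Hash signature: a digest of the entire file."""
    return hashlib.sha256(sample).hexdigest()

def matches_hash(sample, known_hashes):
    """True only if the whole file is byte-for-byte a known sample."""
    return hash_signature(sample) in known_hashes

def matches_string(sample, string_signatures):
    """True if any known contiguous byte sequence occurs in the file."""
    return any(signature in sample for signature in string_signatures)

# Made-up byte sequence shared by two variants of one hypothetical family.
shared_code = b"\x66\xb8\x34\x12\x66\xbb\x78\x56"
variant_1 = b"HEADER-1" + shared_code + b"payload-one"
variant_2 = b"HEADER-2" + shared_code + b"payload-two"
benign = b"HEADER-3" + b"ordinary program bytes"

known_hashes = {hash_signature(variant_1)}  # covers variant_1 only
string_signatures = [shared_code]           # covers both variants
```

The hash signature misses variant_2 because any changed byte changes the digest, while the string signature matches both variants as long as the shared sequence survives.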
Source: “Automatic Generation of String Signatures for Malware Detection”, Sep. 2009
Signature-based malware scanning
Good signature on Aug. 2008
● First, it uses 16-bit registers, which is quite rare in goodware.
● Second, it has 8 constants with different, unusual values.
Source: “Automatic Generation of String Signatures for Malware Detection”, Sep. 2009
Strings
Sometimes things are easy
First look at the obvious – strings
Utilities: strings, BinText, Hex Workshop, IDA Pro
● BinText:
− a file text scanner / extractor that helps find character strings buried in binary files
− It finds ASCII, Unicode and Resource strings in a file
Source: https://www.howtoforge.com/linux-strings-command/
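What these string utilities do can be approximated in a few lines of Python: scan the raw bytes for runs of printable characters, in both ASCII and UTF-16LE (the "Unicode" strings common in Windows binaries). The function name extract_strings, the minimum length of 4, and the sample bytes are assumptions for illustration.

```python
import re

def extract_strings(data, min_length=4):
    """Return printable ASCII runs, then printable UTF-16LE runs, in `data`."""
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    utf16_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_pattern.finditer(data)]
    return found

# Hypothetical binary blob with one ASCII and one UTF-16LE string.
sample = (
    b"\x00\x01MZ\x90\x00"
    + b"http://example.test/c2"
    + b"\x00\xff"
    + "cmd.exe".encode("utf-16-le")
    + b"\x00\x00"
)
strings_found = extract_strings(sample)
```

Raising min_length reduces noise from random byte runs, at the cost of missing short but meaningful strings.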
Strings
Be careful about drawing conclusions
There is nothing stopping the attacker from planting strings meant to deceive the analyst
However, strings are a good first step and can sometimes even provide attribution
No Strings may be Attached
● Point-and-click “packers” make it easy for intruders to obfuscate the contents of binary tools
Point-and-click: usable with just a mouse
Conducting Web Research
Look at unique strings, email addresses, network info
● But! The intruder/author could be watching for you
Search the web
● Be careful … Google cache != Anonymous
● You might find other victims, or complete analysis
● Don’t forget newsgroups
It helps if you know Chinese (or Russian, or Spanish)
● https://www.google.com/language_tools?hl=en
Dynamic Analysis
Dynamic Program Analysis
Run the program and see what it is doing.
Requires security mechanisms:
● Dedicated machine.
● Not connected to the internet.
● Or: Virtual machine.
− However: Code can recognize whether it is running in VMware.
• E.g. by the internal MAC addresses, …
Transport malware on a non-writable CD / DVD
Dynamic Analysis
Static analysis will reveal some immediate information
Exhaustive static analysis could theoretically answer any question, but it is slow and hard
Usually you care more about “what” malware is doing than “how” it is being accomplished
Dynamic analysis is conducted by observing and manipulating malware as it runs.
● Analyst needs to intercept communication of program.
− Need to generate a fake network in a safe environment.
Safe Environment
A nice, safe analytical environment wasn’t that important during static analysis
As soon as you run an unknown piece of code on your system, nothing that’s writable can be trusted
In general, we will need to run the program many times
● Snapshots make life easier
− Snapshot:
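• A saved copy of the current state of memory, including all memory bytes, hardware registers, and status indicators
• A set of computer files and directories as they existed and were preserved at some point in the past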
− A VMware snapshot is a copy of the virtual machine's disk file (VMDK) at a given point in time.
• Snapshots provide a change log for the virtual disk and are used to restore a VM to a particular point in time when a failure or system error occurs.
Creating a Safe Environment
Do not run malware on your computer!
Old and Busted
● Shove several PCs in a room on an isolated network, create disk images, re-image a target machine to return to pristine state
The (not so) new hotness
● Use virtualization to make things fast and safe
● VMware (Workstation, Server [free])
● Parallels (cheap)
● Microsoft Virtual PC (free)
● Xen (free)
Bust: to break, to put out of order; busted: caught (doing something wrong)
Shove: to push (roughly); to put something somewhere carelessly
Pristine: like new, very clean (= immaculate), unspoiled
Avoidance techniques of malware
• Emulator detection
• Anti-debugging
Creating a Safe Environment
It is easier to perform analysis if you allow the malware to “call home” …
However:
● The attacker might change his behavior
● By allowing malware to connect to a controlling server, you may be entering a real-time battle with an actual human for control of your analysis (virtual) machine
● Your IP might become the target for additional attacks
● You may end up attacking other people
End up: to eventually find oneself (in some situation)
Creating a Safe Environment
Therefore, we usually do not allow malware to touch the real network
● Use the host-only networking feature of your virtualization platform
● Establish real services (DNS, Web, etc.) on your host OS or other virtual machines
● Use netcat to create listening ports and interact with text-based client
− netcat ( = nc)
• a computer networking utility for reading from and writing to network connections using TCP or UDP.
• It is a feature-rich network debugging and investigation tool.
● Build custom controlling servers as required (usually in a high-level scripting language)
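A custom stand-in service of the kind described above can be sketched with Python's standard socket module. It accepts one connection, logs whatever the client sends, and answers with a fixed banner. The banner text, the helper name serve_once, and the test client are hypothetical, and the listener binds to the loopback address only so that nothing is exposed to a real network.

```python
import socket
import threading

def serve_once(listener, log):
    """Accept one client, record what it sends, and reply with a banner."""
    connection, address = listener.accept()
    with connection:
        data = connection.recv(4096)
        log.append((address[0], data))
        connection.sendall(b"220 fake service ready\r\n")

# Listen on an ephemeral loopback port chosen by the OS.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

captured = []
worker = threading.Thread(target=serve_once, args=(listener, captured))
worker.start()

# Stand-in for the specimen: connect, send one line, read the reply.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"HELO analyst\r\n")
    reply = client.recv(4096)

worker.join(timeout=5)
listener.close()
```

In a real lab the listener would sit on the host-only network interface, and the reply logic would be adapted to whatever protocol the specimen appears to speak.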
Source: https://security.stackexchange.com/questions/205802/netcat-reverseshell-hanging-after-connection & Wikipedia
Virtualization Considerations
Using a virtual machine helps, but …
Set up the “victim” with no network or host-only networking
Your virtualization software is not perfect
Malicious code can detect that it is running in a virtual machine
A 0-day worm that can exploit a listening service on your host OS will escape the sandbox
● Even if you are using host-only networking!
Dynamic Program Analysis
strace, systrace:
● Run the program, but keep track of the system calls that it makes, along with their parameters.
− More relevant calls (Unix):
• open
• read
• write
• unlink
• lstat
• socket
• close
− strace has an option that intercepts all network-related calls.
Use fport, netstat, … to determine ports opened by the program.
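Once a trace has been saved, the relevant calls can be tallied with a short Python sketch. It assumes each line starts with the system call name followed by an opening parenthesis, which is how plain strace output typically looks, so lines with a process-id prefix would need extra handling. The sample trace lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the system call name at the start of a line such as: open("/x", ...) = 3
SYSCALL_LINE = re.compile(r"^(\w+)\(")
RELEVANT = {"open", "read", "write", "unlink", "lstat", "socket", "close"}

def tally_syscalls(trace_lines, relevant=RELEVANT):
    """Count how often each relevant system call appears in a trace."""
    counts = Counter()
    for line in trace_lines:
        match = SYSCALL_LINE.match(line)
        if match and match.group(1) in relevant:
            counts[match.group(1)] += 1
    return counts

# Hypothetical trace lines in the usual strace layout.
sample_trace = [
    'open("/etc/passwd", O_RDONLY) = 3',
    'read(3, "root:x:0:0:", 4096) = 11',
    "socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 4",
    'write(4, "hello", 5) = 5',
    "close(3) = 0",
    "close(4) = 0",
    "getpid() = 1234",
]
counts = tally_syscalls(sample_trace)
```

A file opened and read shortly before a socket write, as in this sample, is the kind of pattern worth following up by inspecting the call arguments.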
System Monitoring
What we are after
● Registry Activity
● File Activity
● Process Activity
● Network Traffic
The tools
● SysInternals Process Monitor
− It records information about File system, Registry, and Process/Thread activity
● Wireshark
● + a whole bunch of other stuff
Anti-malware evasion techniques
Malware writers can use anti-reversing techniques.
● Eliminate symbolic information.
● Encrypt code.
● Code obfuscation.
− Make HLL constructs difficult to understand.
● Anti-debugger Methods:
− Use the IsDebuggerPresent API to protect against user-level debuggers.
− Use the NtQuerySystemInformation API to determine if a kernel debugger is attached to the system.
− Set a trap flag and check whether it is still there.
• A debugger would “swallow” it.
− Put in bogus bytes over which the code jumps.
• Does not work for all disassemblers.
Summary
Malware Detection
● Anti-malware software = Malware scanner
● Signature(fingerprint)-based detection, Behavioral detection, …
Statistical Structures: Fingerprinting Malware for Classification and Analysis
● Opcode frequency distribution
● API call vector
● Graph structural properties: PDG
Malware Analysis
● Static analysis: Disassemble, Decompile, Hex edit, Binary viewer, …
− Executable file format: PE, ELF, DEX, …
● Dynamic analysis: Debugging on a secure VM
Anti-malware evasion techniques