In this third post in our blog series on process snapshotting (see previous posts on PlugX and Shiz’ code injection), we will show how to dissect exploit payloads using the LLama full-process snapshot functionality.
Document exploits are particularly tedious to analyze with traditional analysis tools, because the vast majority of the code and data located in the exploited process’ memory is benign (that is, unrelated to the actual exploit). Lastline’s high-resolution malware analysis engine tracks all data generated while a document is opened and rendered, and, in turn, limits the process snapshots exported for analysis to the parts relevant to the exploit (and the subsequent shellcode and payload).
LLama versus CVE-2012-0158
As in previous posts, we will walk through the concrete analysis steps by looking at a real-world exploit:
Malware Family: Exploit.Win32.CVE-2012-0158
md5: 0ad65802ba0dc1ae0fa871907f83b729
VT link: https://www.virustotal.com/en/file/faced273353acea599d96e46c47d352beded0cc753867f02aa5f32744b692c1d/analysis/
Full analysis result link: https://user.lastline.com/malscape#/task/9c7452b11aa64204ad9e30cd7d1688fc (accessible to Lastline customers only, sign-up now)
The analysis engine identifies the execution of malicious code at runtime using various mechanisms. This allows an analyst, for example, to look at code executing on the stack right after a vulnerability has been exploited, as well as to follow the entire infection process after the exploit code has run. Additionally, when the malicious payload runs in multiple phases, the analysis engine keeps track of every stage of the exploit, as one can see in the example below.
To start, let’s first look at the behavior observed during the dynamic analysis run, as seen in the behavior overview:
Behavior summary of exploit and dropped file
Analyzing Multi-Stage Document Exploits
The behavior summary already gives us a good idea of what the exploit, and the malware sample it subsequently dropped, did. But for this blog post, we want to focus on the how rather than the what, so let’s look at the stages of the exploit from the perspective of the process snapshots (or “dumps”) generated inside the vulnerable program (in this case Microsoft Word 2003/2007, labeled Analysis Subject 1 below):
Exported process snapshots
As one can see, three process snapshots have been generated for the MS Office program, each representing one stage of the exploit. So, let’s download all three process snapshots for the primary analysis subject and analyze them in IDA Pro!
The first snapshot was taken when the analysis engine detected code execution in stack memory, which already hints at the first stage of the exploit (executed right after triggering the vulnerability).
Opening the downloaded snapshot in IDA Pro, the overview points us right to the position-independent shellcode executed in the first stage:
NOP sled and first stage of the exploit
As one can see, the shellcode uses function name hashing to obfuscate the real semantics of the code. When using this trick, malicious code finds API functions indirectly by iterating through the functions exported by loaded libraries and hashing the functions’ names. The idea is to avoid the static API-name strings found in many exploits, and thereby evade static analysis techniques.
Clearly, this trick does not work on a dynamic analysis system, as the executed code still has to jump to the API functions of interest, revealing the true behavior of the shellcode. Below, we highlight the individual functions used by the code:
NOP sled and first stage of the exploit (after de-obfuscation)
So, with very little work for identifying hashed function names, we can see the first stage of the exploit opening the original document to read, decrypt, and finally execute the second stage in memory.
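To illustrate how an analyst might identify hashed function names, here is a minimal Python sketch. It assumes a ROR-13 additive hash, a scheme commonly seen in shellcode; this post does not state which hash the sample uses, so `ror13_hash` and the `KNOWN_EXPORTS` list are illustrative assumptions rather than the sample's actual routine.

```python
# Sketch only: the hash routine and export list are assumptions, not taken
# from the analyzed sample. Swap in the routine seen in the disassembly.

def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by `bits` positions."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Hash an export name: rotate the accumulator right by 13, add each byte."""
    acc = 0
    for byte in name.encode("ascii"):
        acc = (ror32(acc, 13) + byte) & 0xFFFFFFFF
    return acc


def build_reverse_table(export_names):
    """Map hash -> name so constants found in shellcode can be labeled."""
    return {ror13_hash(name): name for name in export_names}


# Hypothetical subset of kernel32.dll export names to hash ahead of time.
KNOWN_EXPORTS = ["CreateFileA", "ReadFile", "VirtualAlloc", "WinExec"]
REVERSE_TABLE = build_reverse_table(KNOWN_EXPORTS)


def name_for(hash_value: int) -> str:
    """Return the export name for a hash constant, or a placeholder."""
    return REVERSE_TABLE.get(hash_value, f"unknown_{hash_value:08x}")
```

With such a table, a hash constant found in the shellcode can be passed to `name_for` and the result used to rename the corresponding call in the disassembly.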
This same information is also shown in the process snapshot overview: The second snapshot was taken after observing the execution of untrusted code in memory - exactly as described above. After opening this snapshot in IDA Pro, we see the second stage of the exploit:
First, the code obtains the base address of kernel32.dll from the PEB
which it then uses to read a list of API function addresses
used for executing API functions later on.
After obtaining a list of function names, the code hashes each function name, and builds a mapping table between hashed function name and function address. This way, it can later execute individual functions by looking them up via the hashed function ID in the shellcode.
Function name hash to address mapping built by second stage
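The table-building step described above can be sketched in Python as follows. This is a simulation under stated assumptions: the export names, the addresses in `EXPORTS`, and the use of `zlib.crc32` as a stand-in hash are all hypothetical, since this post does not name the sample's hash routine or list the functions it resolves.

```python
import zlib
from typing import Callable, Dict

# Hypothetical export names and addresses; a real second stage would read
# these from the export table of kernel32.dll.
EXPORTS: Dict[str, int] = {
    "CreateFileA": 0x7C801A28,
    "WriteFile": 0x7C810E27,
    "CloseHandle": 0x7C809BE7,
    "WinExec": 0x7C8623AD,
}


def build_hash_table(exports: Dict[str, int],
                     hash_fn: Callable[[bytes], int]) -> Dict[int, int]:
    """Hash every export name and map the 32-bit hash to the address."""
    table: Dict[int, int] = {}
    for name, address in exports.items():
        table[hash_fn(name.encode("ascii")) & 0xFFFFFFFF] = address
    return table


def resolve(table: Dict[int, int], hash_id: int) -> int:
    """Look up an address by hashed function ID, as the shellcode would."""
    if hash_id not in table:
        raise LookupError(f"no export matches hash 0x{hash_id:08x}")
    return table[hash_id]


# Stand-in hash (an assumption): CRC32 of the ASCII name.
HASH_TABLE = build_hash_table(EXPORTS, zlib.crc32)

# The shellcode only carries the hash constant, never the name string.
WINEXEC_ID = zlib.crc32(b"WinExec")
print(hex(resolve(HASH_TABLE, WINEXEC_ID)))
```

Passing the hash function as a parameter keeps the sketch independent of any particular hashing scheme, so the routine recovered from the disassembly can be dropped in instead of `zlib.crc32`.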
The invocation of API functionality can be seen in the third process snapshot, labeled Observed API function invocation from untrusted memory region. When the analysis engine observes a call to an API function, it triggers another snapshot, allowing us to inspect the second exploit stage after the function hash table has been populated.
After opening this last snapshot in IDA Pro, the overview immediately highlights function tables (as mentioned in our previous blog post):
Function tables highlighted in process snapshot
Since the snapshot was taken after the malware sample had already built the hash-to-address mapping, this table contains API function addresses. Because the shellcode references this table for all API function invocations, we simplify further analysis by creating a struct whose members are the API function addresses in this table.
Extracted function table struct
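The same idea can be sketched outside IDA Pro in Python using `ctypes`. The member names, their order, and the addresses below are hypothetical placeholders (this post does not list the table's actual contents); the sketch only shows how naming each 4-byte slot turns an indirect call through the table into a readable API name.

```python
import ctypes


class ApiTable(ctypes.LittleEndianStructure):
    """One 32-bit function pointer per resolved API (hypothetical layout)."""
    _fields_ = [
        ("CreateFileA", ctypes.c_uint32),
        ("WriteFile", ctypes.c_uint32),
        ("CloseHandle", ctypes.c_uint32),
        ("WinExec", ctypes.c_uint32),
    ]


def parse_api_table(memory: bytes, offset: int) -> ApiTable:
    """Overlay the struct on snapshot bytes starting at `offset`."""
    return ApiTable.from_buffer_copy(memory, offset)


def member_at(displacement: int) -> str:
    """Name the member that an indirect call such as [reg+displacement] uses."""
    for name, _ctype in ApiTable._fields_:
        if getattr(ApiTable, name).offset == displacement:
            return name
    raise KeyError(f"no member at displacement 0x{displacement:x}")


# Hypothetical snapshot fragment: 8 bytes of padding, then four addresses.
ADDRESSES = [0x7C801A28, 0x7C810E27, 0x7C809BE7, 0x7C8623AD]
SNAPSHOT = bytes(8) + b"".join(a.to_bytes(4, "little") for a in ADDRESSES)

table = parse_api_table(SNAPSHOT, 8)
print(member_at(0x0C), hex(table.WinExec))
```

In this sketch, a call through displacement `0x0C` resolves to the `WinExec` member, which mirrors what applying the struct in the disassembler does for each indirect call through the table.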
With this in place, it is easy to recognize the remaining behavior:
As one can see from the code above, the shellcode has the ability to drop executable files (EXEs as well as DLLs) and subsequently invoke them:
In the analysis run that we have looked at for this post, the shellcode restarts MS Word with a dropped fake document to avoid raising suspicion with the user under attack. The rest of the analysis report contains details on the payload dropped by the exploit. Just like any other analysis report, it also contains full process snapshots for the processes spawned after successful exploitation.
Summary
Full process snapshots generated by Lastline’s high-resolution analysis engine provide an easy way to observe all stages of document exploits. Each snapshot contains only information relevant to the exploit, the executed shellcode, or the subsequently executed payload.
In combination with the full analysis report, which summarizes interesting behavior as well as the full activity of the dropped malware, the process snapshots give an analyst an effective tool for analyzing the full spectrum of malware behavior, from high-level API calls to low-level shellcode, without requiring a lot of manual work.
