- Challenge Link:
Your enterprise network is experiencing a malware infection, and your SOC L1 colleague escalated the case for you to investigate. As an experienced L2/L3 SOC analyst, analyze the malware sample, figure out what it does and extract C2 server and other important IOCs.
I will do a full analysis first and then answer the challenge questions.
1- SHA256 hash:
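Any hashing tool works for this step. As a minimal sketch, assuming the sample is saved locally (the function name and the file name `sample.bin` in the usage note are hypothetical), the hash can be computed like this:

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size blocks so large samples are not loaded at once.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Calling `sha256_of_file("sample.bin")` returns the digest as a lowercase hex string that can be searched on threat-intelligence platforms.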
I didn’t find many useful strings apart from these:
Sleep OutputDebugStringA ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
Just a few APIs, such as OutputDebugStringA, which may be used for anti-debugging, and the Base64 index string.
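The same kind of string triage can be sketched in Python. This is only an illustration of what a strings tool does, assuming printable ASCII runs of at least four characters; the function names are hypothetical:

```python
import re

# The standard Base64 alphabet, as seen in the strings output.
BASE64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def extract_ascii_strings(data, min_len=4):
    """Return all runs of printable ASCII of at least `min_len` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def contains_base64_table(strings):
    """Return True if any extracted string embeds the Base64 alphabet."""
    return any(BASE64_ALPHABET in s for s in strings)
```

Running `extract_ascii_strings` on the raw bytes of the sample and then `contains_base64_table` on the result flags the Base64 index string without eyeballing the whole list.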
3- PE header parsing:
If we open this malware in a tool like PE-bear, we can get some useful information about it.
If we use a tool like the KANAL plug-in to check whether this malware uses any known encryption or encoding algorithms, we find that it uses CRC32 hashing and contains a Base64 table, whose index string we already saw in the strings output.
I didn’t get any more information, so I started the advanced analysis.
I got a hint from the challenge questions that this malware uses a technique called API hashing to hide its API calls at runtime. It was my first time facing this technique, but I searched the internet and found two incredible articles about it, which I list in the Reference section below.
From reading the articles, the API hashing technique uses a function that takes hashed names and loops over the modules loaded in the process, hashing every module name to find the correct module, and then does the same thing over that module’s exports to find the correct API name.
When I loaded this malware into IDA, I saw a similar technique performed in the main.
As we can see, the function sub_6015C0 is called twice with the same first parameter, and its return value is then called as a function. So this function is probably responsible for resolving API names, and we can rename it Resolving_APIs. Let’s dive into it and see what happens.
Inside this function there are some comparisons, but the first argument is passed to the function sub_607564, which may be responsible for finding the correct module name, so we can rename it to
Inside this function, the hashed DLL name is moved to the local variable v2, so we can rename it to DLL_hashed_name. Some more comparisons follow, and then the function accesses the ProcessEnvironmentBlock, especially the
I searched MSDN for this structure but didn’t find anything useful, so I put links in the Reference section for understanding this structure and its nested structures.
After applying IDA structs, the code becomes more readable, like this.
The logic for getting the DLL name is very simple. First, the malware accesses the ldr structure, which contains two nested structures, one of them InMemoryOrderModuleList, both of type LIST_ENTRY. Each LIST_ENTRY points into the LDR_DATA_TABLE_ENTRY structures of the previous and next module, so it is simply a linked list of all loaded modules.
If we look inside the LDR_DATA_TABLE_ENTRY structure, we find information about each module.
If we look at the malware again, we see that it accesses the fifth element of this structure, BaseDllName. This field is itself a structure of type _UNICODE_STRING, which contains three fields: Length, MaximumLength, and Buffer.
The malware will access the
Buffer field which contains the module name and the
Then, the malware loops over this name and converts each character to lowercase.
Next, the malware passes the lowercase name as a parameter to a function and XORs the return value with 0x38BA5C7B. If we examine this function, we find some strange logic, and from the KANAL output in the basic analysis we can confirm that this function does the CRC32 hashing.
The rest of the function completes the API hashing algorithm: the malware takes every module name, performs the same hashing algorithm on it, and compares the output with the hashed DLL name. If they are identical, this is our targeted DLL and the malware returns its Base Address; if not, the malware gets the next module by using
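The module lookup described above can be sketched in Python. This is a simplified model, not the decompiled code: it assumes the hash is the standard CRC-32 (as implemented by `zlib.crc32`) over the ASCII bytes of the lowercase name, XORed with the constant 0x38BA5C7B, and it replaces the PEB linked list with a plain list of hypothetical `(name, base)` pairs.

```python
import zlib

XOR_KEY = 0x38BA5C7B  # constant observed in the sample


def hash_name(name, key=XOR_KEY):
    """CRC-32 of the ASCII name, XORed with the key (assumed encoding)."""
    return (zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF) ^ key


def find_module_base(loaded_modules, target_hash):
    """Walk (dll_name, base_address) pairs and return the matching base.

    `loaded_modules` stands in for the loader's linked list of modules.
    """
    for dll_name, base_address in loaded_modules:
        # The malware lowercases the name before hashing it.
        if hash_name(dll_name.lower()) == target_hash:
            return base_address
    return None  # no loaded module matches the hash
```

For example, with `modules = [("ntdll.dll", 0x7FF0), ("KERNEL32.DLL", 0x7FE0)]` (made-up addresses), `find_module_base(modules, hash_name("kernel32.dll"))` returns the second base address, because the comparison is done on the lowercase name.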
Now back to the Resolving_APIs function: its return value comes from sub_6067C8, which takes two parameters [DllBase, API hashed name].
This is a strong indicator that this function resolves the API name, so let’s dig into it.
In this function, we see that the malware accesses the NT header, especially the DataDirectory and the Export_Directory inside it. “There are many nested things inside Windows :(”
The malware does this to get all exports from the targeted DLL and loop over them, finding the API with the same technique it uses to find the targeted DLL’s base address.
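The export walk can be modeled in a few lines of Python. In a real PE file, the export directory holds three parallel tables (AddressOfNames, AddressOfNameOrdinals, AddressOfFunctions); here they are replaced by plain Python lists, and the helper names are hypothetical. The sketch also assumes that API names are hashed as-is with CRC-32 XOR 0x38BA5C7B, which the analysis suggests but does not prove.

```python
import zlib

XOR_KEY = 0x38BA5C7B  # constant observed in the sample


def hash_name(name, key=XOR_KEY):
    """CRC-32 of the ASCII name, XORed with the key (assumed scheme)."""
    return (zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF) ^ key


def resolve_export(dll_base, names, name_ordinals, functions, target_hash):
    """Return the address of the export whose name hash matches.

    names[i] is the i-th exported name, name_ordinals[i] is the index
    into `functions` for that name, and functions[k] is an RVA.
    """
    for i, name in enumerate(names):
        if hash_name(name) == target_hash:
            rva = functions[name_ordinals[i]]
            return dll_base + rva  # RVA is relative to the module base
    return None
```

The extra hop through `name_ordinals` matters: the name table and the function table are not in the same order, so indexing `functions` directly with `i` would return the wrong address.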
I will also put links for PE internals in the Reference section.
Using the HashDB plug-in in IDA, configured for CRC32 with our XOR key, we can see all the imports.
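The idea behind such a lookup can be illustrated offline with a small reverse table. This is not how the plug-in is implemented; it is only a sketch that assumes the CRC-32 XOR 0x38BA5C7B scheme, and the candidate API list and function names are hypothetical.

```python
import zlib

XOR_KEY = 0x38BA5C7B  # constant observed in the sample


def hash_name(name, key=XOR_KEY):
    """CRC-32 of the ASCII name, XORed with the key (assumed scheme)."""
    return (zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF) ^ key


def build_lookup(candidate_names):
    """Precompute hash -> name for a wordlist of API names."""
    return {hash_name(name): name for name in candidate_names}


def annotate(hashes, lookup):
    """Map each hash seen in the disassembly to a name, if known."""
    return [lookup.get(value, "unknown") for value in hashes]
```

With a large enough wordlist of Windows API names, every hash constant passed to the resolver can be replaced by a readable name in the disassembly.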
We now know the mechanism this malware uses to resolve its API calls dynamically.
The challenge questions after this stage ask us to dive into sub_607980, which is the first call in the DllEntryPoint, so let’s see what is inside it.
Viewing this function in IDA Pro, there are a lot of comparisons, but the interesting part is that the malware will get
and with this DLL the malware will get
If we search for this function, and especially for the Vectored Exception Handling (VEH) technique, we learn that the malware registers its own custom handler.
And also as we can see the handler will be
Also, if we dig deeper into the code, we find that the malware calls the resolved APIs by using int 3, retn, which is an anti-debugging technique: the int 3 raises a breakpoint exception, and the vectored exception handler then makes the call.
The next part of the challenge for me was the hardest, but I searched a lot to solve it.
In this part, we need to understand the decryption routine used by the malware.
I took a hint to use a tool called capa. This tool is from the FLARE team; it analyzes samples and reveals a lot about their capabilities.
Capa told me that this malware uses the RC4 algorithm in sub_61E5D0. From searching, I found that the malware stores the key and the encrypted data in .rdata: in every chunk, the first 40 bytes are the key, stored in reverse order, followed by the encrypted data.
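The same decryption can be reproduced offline. Below is a minimal sketch of standard RC4 plus a helper for one chunk. The chunk layout (40-byte key first, then ciphertext) follows the description above; reading "reverse order" as "the key bytes are reversed" is my assumption, so it is exposed as a flag. The helper name `decrypt_chunk` is hypothetical.

```python
KEY_LEN = 40  # key size per chunk, as described in the write-up


def rc4(key, data):
    """Standard RC4: key scheduling followed by keystream XOR."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def decrypt_chunk(chunk, key_len=KEY_LEN, reverse_key=True):
    """Split a chunk into key and ciphertext, then RC4-decrypt it.

    reverse_key=True assumes the stored key bytes must be reversed
    before use; set it to False if that assumption does not hold.
    """
    key = chunk[:key_len]
    if reverse_key:
        key = key[::-1]
    return rc4(key, chunk[key_len:])
```

Because RC4 is symmetric, the same `rc4` function encrypts and decrypts, which makes it easy to cross-check the result against CyberChef.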
We then take this key and the data to CyberChef to decrypt them.
After resolving APIs, we can find that the malware will try to resolve
sub_623820, so this is an indicator that this function contains the network IOCs.
While searching, I found a tool that extracts network indicators from this malware, such as C2 IPs and their associated ports.
Now I have finished this challenge. I really learned a lot, discovered my weaknesses, and will work on fixing them.
1- API hashing:
2- Advanced Imports Obfuscation with source code:
3- PEB internals:
4- PE format internals:
- PE file structure “NT Headers”: https://0xrick.github.io/win-internals/pe4/
- Data Directories: https://0xrick.github.io/win-internals/pe5/
- Export Directory: https://resources.infosecinstitute.com/topic/the-export-directory/
5- VEH with Dridex:
6- Tools and plugins:
- HashDB: https://github.com/OALabs/hashdb-ida
- CAPA Explorer: https://github.com/mandiant/capa/blob/master/capa/ida/plugin/README.md
- AppGateLabs to extract network IOCs: https://github.com/mandiant/capa/blob/master/capa/ida/plugin/README.md