Understanding doesn't happen overnight; this article introduces the preliminary concepts of file types and signature recognition to help lay the foundation for dynamic analysis in malware analysis.
The objective of malware analysis is to understand how malware operates and how to detect and eliminate it.
Malware is a broad term referring to different types of malicious programs:
Virus or Worm: Self-replicates and spreads across networks
Trojan
Backdoor / Remote Access Trojan (RAT)
Adware: Intrusive pop-up advertising
Botnet: A network of bot programs that wait, over a C2 channel, for attackers to launch DDoS or other malicious activities. Having a large number of bots call back to the control server is a high-concurrency problem of the kind that servers such as Nginx were built to handle.
Information Stealer: Malware that steals sensitive data such as banking information; includes keyloggers, form grabbers, spyware, and sniffers.
Ransomware: Encrypts data and demands a ransom
Rootkit: A highly stealthy backdoor designed to persist with elevated privileges
Downloader or Dropper: Like a scout sent ahead, it releases the actual malware afterwards, so the malware is deployed and delivered in stages.
Basic terminology related to malware analysis, including but not limited to: https://blog.malwarebytes.com/glossary/
Malware Analysis Techniques:
Static Analysis: Examining the binary without executing it (for example, through disassembly) to get an approximate idea of what it is.
Dynamic Analysis (Behavioral Analysis): Executing a suspicious binary in an isolated environment and monitoring its behavior. By the end of dynamic analysis you have actually run the malware once, observing it step by step in a debugger.
Code Analysis
Memory Analysis (Memory Forensics)
VMware Workstation, VMware Fusion (Mac OS), VirtualBox
Shared Folders Information:
https://www.virtualbox.org/manual/ch04.html#sharedfolders
Disable Windows Defender to avoid interference with malicious samples.
Analyzing very sophisticated malware requires a more complex lab environment with servers, clients, and other applications. A high-fidelity user environment is needed to convince the malware, during its staging and reconnaissance phase, to execute its real payload.
Security Awareness Tips:
Keep the virtualization software up to date, and use multi-layer nested setups to increase environment complexity (for instance, a base cloud virtual platform, then a host, then VirtualBox). Malware able to break through such a layered environment would likely take nation-state-level effort to build.
Network isolation: use host-only networking.
Do not connect removable media to the physical machine.
A Linux environment can be used to analyze Windows malware; if the sample escapes the virtual machine, it is unlikely to be able to infect the Linux host.
Recommended tools for Ubuntu:
http://releases.ubuntu.com/ (select an LTS version, desktop or server edition)
INetSim (http://www.inetsim.org/index.html) simulates common Internet services (e.g., DNS, HTTP) so that malware under analysis has services to talk to.
Installation Documentation: http://www.inetsim.org/packages.html
Setting Static IP Addresses
Configuring INetSim
Hybrid Analysis:
https://www.hybrid-analysis.com
KernelMode.info:
http://www.kernelmode.info/forum/viewforum.php?f=16
VirusBay:
Contagio malware dump:
http://contagiodump.blogspot.com/
AVCaesar:
Malwr:
VirusShare:
theZoo:
http://thezoo.morirt.com/
The initial analysis method when encountering suspicious binary files is to use static analysis to extract useful information.
Determining the file type of a suspicious binary file will help identify the target operating system (Windows, Linux, etc.) and architecture (32-bit or 64-bit platforms) of the malware.
For example, if the file type of the suspicious binary is Portable Executable (PE), which is the file format for Windows executables (.exe, .dll, .sys, .drv, .com, .ocx, etc.), then you can infer that the file is intended for the Windows operating system.
Most Windows-based malware consists of executable files with extensions like .exe, .dll, .sys, etc. Attackers use various techniques to conceal their files, modifying file extensions and changing their appearance to trick users into executing them. File signatures can be used to determine the file type instead of relying on file extensions.
A file signature is a unique byte sequence written in the file header. Different file types have different signatures, which can be used to identify the file type. Windows executable files, also known as PE files (e.g., files ending in .exe, .dll, .com, .drv, .sys, etc.), start with the two bytes MZ, or 4D 5A in hexadecimal.
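The signature check described above can be sketched in a few lines of Python. This is a minimal illustration, not a full file-type detector: the `SIGNATURES` table and the function names are made up for this example, and only a handful of well-known signatures are listed.

```python
# A few well-known file signatures (magic bytes) found at offset 0.
# The table is illustrative and far from complete.
SIGNATURES = {
    b"MZ": "Windows PE/DOS executable",
    b"\x7fELF": "Linux ELF executable",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive (also docx, jar, apk)",
}


def identify_by_signature(data: bytes) -> str:
    """Guess the file type from the leading bytes, ignoring the extension."""
    for magic, description in SIGNATURES.items():
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and identify it."""
    with open(path, "rb") as f:
        return identify_by_signature(f.read(8))
```

A file named invoice.pdf whose first two bytes are 4D 5A would be reported as a Windows executable, which is exactly the kind of extension trick the signature is meant to expose.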
http://www.filesignatures.net/
A lookup there confirms that many PE file types share exactly the same signature as .exe files.
Locate the file signature by opening the file in a hex editor.
Hex Editor
HxD hex editor (https://mh-nexus.de/en/hxd/)
On Linux, use the xxd tool to view a file in hexadecimal (install with: yum install vim-common)
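If neither HxD nor xxd is at hand, a rough equivalent can be written in Python. The sketch below is only loosely modeled on a hex editor's view (offset, hex bytes, printable characters); the function name `hexdump` is hypothetical and the layout is not identical to xxd's.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable ASCII per row."""
    pad = width * 3 - 1  # width of the hex column when a row is full
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as hex editors usually do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    import sys

    # Dump only the first 64 bytes: enough to see the file signature.
    with open(sys.argv[1], "rb") as f:
        print(hexdump(f.read(64)))
```

For a PE file, the first row starts with 4d 5a and the ASCII column begins with MZ.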
Linux: use the file command
Windows: use the CFF Explorer tool from the Explorer Suite package
http://www.ntcore.com/exsuite.php
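Part of what these tools report, namely whether a PE file targets a 32-bit or 64-bit platform, can be read directly from the PE header. The sketch below assumes the standard PE layout (the offset of the PE header stored at 0x3C, followed by the "PE\0\0" signature and the Machine field); the function name and the short `MACHINE_TYPES` table are illustrative only.

```python
import struct

# Common values of the Machine field in the COFF header.
MACHINE_TYPES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_architecture(data: bytes) -> str:
    """Return the target architecture of a PE file, or a note if it is not PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not a PE file"
    # e_lfanew: 4-byte little-endian offset of the PE header, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6 or data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return "not a PE file"
    # The 2-byte Machine field follows the 4-byte PE signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_TYPES.get(machine, f"unknown machine 0x{machine:04x}")
```

Reading the first 4 KB of a sample is normally enough for this check, since the PE header sits near the start of the file.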
Signature recognition (fingerprinting) aims to generate a cryptographic hash value for the content of a suspicious binary file.
Note: A hash based on file content is only an identity number; you still have to ask a "police station" (such as VirusTotal) whether that identity has a record of criminal activity. Finding nothing doesn't mean the file is safe; it only means it has no "criminal record" for now. Changing even a single bit of the file content changes the hash completely, so the file shows up with an entirely different identity.
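The effect of a single-bit change is easy to demonstrate. The snippet below is a small sketch using SHA-256 from Python's hashlib; the sample bytes and the helper names are invented for the demonstration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of the data with exactly one bit flipped."""
    modified = bytearray(data)
    modified[byte_index] ^= 1 << bit
    return bytes(modified)


# A stand-in for file content; any bytes would do.
original = b"MZ" + b"\x00" * 62
patched = flip_bit(original, byte_index=10, bit=0)

print(sha256_hex(original))
print(sha256_hex(patched))  # a completely different digest
```

This is why a hash lookup only catches samples that have been seen before: recompiling or padding a sample is enough to give it a new identity.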
The same principle applies to the payloads of team collaboration tools such as Cobalt Strike (CS), including the APIs they load and invoke by default. If you don't modify them, someone has already found them and generated signatures to detect and remove them.
See the following article, in which the API function hashes were fingerprinted: https://decoded.avast.io/threatintel/decoding-cobalt-strike-understanding-payloads/
Website: VirusTotal https://www.virustotal.com/gui/
Windows HashMyFiles
(http://www.nirsoft.net/utils/hash_my_files.html)
Hashing with Python (environment: Kali)
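A minimal sketch of hashing a file in Python with the standard hashlib module follows. The function name `hash_file` is illustrative; the file is read in chunks so that large samples do not have to fit in memory.

```python
import hashlib


def hash_file(path: str, chunk_size: int = 65536) -> dict:
    """Compute MD5, SHA-1, and SHA-256 of a file in a single pass."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


if __name__ == "__main__":
    import sys

    for name, value in hash_file(sys.argv[1]).items():
        print(f"{name}: {value}")
```

Any of the three values can then be pasted into the VirusTotal search box.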
pestudio (https://www.winitor.com/) or PPEE (https://www.mzrst.com/): when a binary file is loaded, the tool automatically queries the VirusTotal database for its hash value and displays the result.
After downloading, remember to verify the hash
[Please Note] If antivirus scanning engines do not detect a suspicious binary file, it does not necessarily mean that the suspicious binary file is safe. These antivirus engines rely on signatures and heuristics to detect malicious files.
Malware authors can easily modify their code and use obfuscation techniques to bypass detection. This is why it is necessary to also rely on SOCs, EDRs, and other platforms that correlate more data across multiple dimensions for emergency response.
Reason for querying by hash value instead of uploading the file: it prevents confidential files from being uploaded to public platforms, which matters especially if you are a red team member.
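A hash-only lookup can also be scripted, so the file itself never leaves the analysis machine. The sketch below assumes the VirusTotal API v3 file endpoint and its x-apikey header; the function names are hypothetical, you need your own API key, and the response fields used here should be checked against the current VirusTotal documentation.

```python
import json
import urllib.request

# Assumed VirusTotal API v3 endpoint for looking up a file report by hash.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"


def build_vt_request(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying only the hash, never the file content."""
    return urllib.request.Request(
        VT_FILE_URL.format(file_hash),
        headers={"x-apikey": api_key, "Accept": "application/json"},
    )


def lookup_hash(file_hash: str, api_key: str) -> dict:
    """Return the per-verdict engine counts for a known hash.

    An unknown hash makes the server answer with an HTTP error, which
    urllib raises as urllib.error.HTTPError.
    """
    request = build_vt_request(file_hash, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        report = json.load(response)
    return report["data"]["attributes"]["last_analysis_stats"]
```

An error for an unknown hash only means the sample has no record yet, not that it is safe.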
VirusTotal (http://www.virustotal.com)
Habo: habo.qq.com
Maldun: https://maldun.com/
Online scanners, e.g., VirSCAN (http://www.virscan.org/)
Jotti Malware Scan (https://virusscan.jotti.org/)
OPSWAT's Metadefender (https://www.metadefender.com/#!/scan-file)
Online sandbox: https://app.any.run/
QiAnXin Sandbox: https://sandbox.ti.qianxin.com/sandbox/page
Offense-Defense Awareness: Strings are the printable ASCII and Unicode character sequences embedded in a file. Malware has many functional requirements, such as file, network, process, and registry behavior, and implementing them needs strings: a callback IP address, a domain name, registry entries. String extraction is designed to find exactly these spots.
With this in mind, you can learn to tell benign strings from malicious ones, drawing on open-source resources (for example, using YARA to exclude all known-good strings).
For instance, if malware creates a file, the file name is stored as a string in the binary. If malware resolves an attacker-controlled domain name, that name is stored as a string too. Strings extracted from binary files can include references to filenames, URLs, domain names, IP addresses, attack commands, registry entries, etc.
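String extraction itself is simple enough to sketch in Python. The function below is a rough approximation of what strings-style tools do, not a replacement for them: it looks for runs of printable ASCII, and for the same characters stored as UTF-16LE (the usual "Unicode" encoding in Windows binaries). The function name and the minimum length of 4 are choices made for this example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Extract printable ASCII and UTF-16LE strings of at least min_len chars."""
    # Runs of printable ASCII bytes.
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters, each followed by a zero byte (UTF-16LE).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        for s in extract_strings(f.read()):
            print(s)
```

Searching for the wide form matters because strings such as registry paths are often stored as UTF-16 and would be missed by an ASCII-only scan.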
pestudio (https://www.winitor.com/) tool for string extraction
Using strings to detect msf payloads
The strings reveal that the msf-generated payload is built on ab.exe (ApacheBench), the default msfpayload template.
Aha, the payload program's functionality and parameter options are all printed out as well.
The API calls are printed too, and they can serve as threat indicators if you want to take a deeper look at API strings.
Note in particular the strings that show how malware persists through registry modification and adds itself to the firewall whitelist.
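Once strings are extracted, the interesting ones can be filtered out automatically. The sketch below uses a few deliberately simple regular expressions for IP addresses, URLs, registry paths, and firewall commands; the patterns and names are assumptions made for illustration and will produce both misses and false positives on real samples.

```python
import re

# Deliberately simple patterns; real-world rules need to be much stricter.
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE),
    "registry": re.compile(
        r"\b(?:HKLM|HKCU|HKEY_[A-Z_]+|SOFTWARE)\\[^\s\"']+", re.IGNORECASE
    ),
    "firewall": re.compile(r"netsh\s+(?:adv)?firewall", re.IGNORECASE),
}


def find_indicators(strings: list[str]) -> dict[str, list[str]]:
    """Group every pattern match found in the extracted strings by category."""
    hits = {name: [] for name in IOC_PATTERNS}
    for s in strings:
        for name, pattern in IOC_PATTERNS.items():
            hits[name].extend(pattern.findall(s))
    return hits
```

A string containing a Run key path or a netsh firewall command is a hint to look at persistence and firewall whitelisting during dynamic analysis.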
Other Tools
Other tools for extracting ASCII and Unicode strings include Mark Russinovich's strings utility for Windows (https://technet.microsoft.com/en-us/sysinternals/strings.aspx) and PPEE (https://www.mzrst.com/).
Using FLOSS to Decode Obfuscated Strings
Malware authors often use simple string obfuscation techniques to evade detection. In such cases, the hidden strings do not show up in ordinary string extraction. FireEye Labs' Obfuscated String Solver (FLOSS) helps identify and extract obfuscated strings from malware, revealing the strings that malware authors wanted to hide from string extraction tools.
Reference:
Software Package Download
https://github.com/fireeye/flare-floss/releases
User Manual
https://github.com/fireeye/flare-floss/blob/master/doc/usage.md
Download and unpack; the tool is then ready to use from cmd.
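To see why obfuscated strings escape ordinary string extraction, consider the simplest case: a string XORed with a single byte. The sketch below brute-forces all 255 keys and keeps results that contain suspicious keywords. This is not how FLOSS works internally; it is a much cruder technique shown only to illustrate the problem, and the names and keyword list are invented for this example.

```python
import re

# Runs of at least six printable ASCII characters.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_xor(data: bytes, keywords=(b"http", b".exe", b"cmd")) -> dict:
    """Try every single-byte key; keep printable runs containing a keyword."""
    results = {}
    for key in range(1, 256):
        decoded = xor_bytes(data, key)
        hits = [
            m.group().decode("ascii")
            for m in PRINTABLE_RUN.finditer(decoded)
            if any(k in m.group().lower() for k in keywords)
        ]
        if hits:
            results[key] = hits
    return results
```

A URL encoded this way is invisible to plain string extraction but reappears under the right key. Real samples often use stronger or custom encodings, which is where a tool like FLOSS is needed.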
For real-world scenarios involving strings in malware, see the case disclosed by the DeadEye Security team: a supply-chain attack targeting red teams.