1. Introduction
IDA Pro, the well-known disassembler and decompiler, has a signature matching mechanism called IDA F.L.I.R.T. [1] that identifies library functions statically linked into an executable file. On Windows, executables are usually linked against runtime libraries built by Microsoft, which come in limited variety. On UNIX-like operating systems, by contrast, the source code is publicly available and each distributor builds its own runtime libraries, so many different variants exist. It is therefore difficult to identify runtime library functions with the signatures bundled with IDA Pro.
In fact, there are published studies and tools [2][3] that address this problem by identifying functions in runtime libraries. However, no IDA F.L.I.R.T. signatures or tools that can be used directly in IDA Pro are available. We therefore created “idaflirt-detector” [4], a set of tools that identifies runtime library functions using IDA F.L.I.R.T. signatures generated from existing binary images of runtime libraries built by distributors, and published it on GitHub.
GitHub URL: https://github.com/SecureBrain/idaflirt-detector
We assume that readers have some knowledge of IDA Pro, Python, and static analysis of Windows malware, and that the IoT malware in question runs on a UNIX-like operating system. This article addresses one of the problems encountered when starting a static analysis of IoT malware: identifying statically linked runtime library functions.
2. Executable File
The executable file format of IoT malware is ELF (Executable and Linkable Format). In terms of identifying runtime library functions, ELF executable files can be classified as follows.
- Dynamically linked
- Statically linked
  - Not stripped
  - Stripped
First, as with Windows executable files, there are two kinds of linking: dynamic and static. For a dynamically linked file, function names do not need to be identified, because libraries are loaded at runtime and function addresses are resolved from their names. For a statically linked file that is not stripped, function names do not need to be identified either, because the symbol information has not been removed and the function names are present in the executable file. Therefore, function names need to be identified only when the file is statically linked and stripped. (The exception is when the symbol information has been spoofed or the file has otherwise been altered to obstruct static analysis.)
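The classification above can be approximated directly from the ELF headers. The following is a minimal sketch, not part of idaflirt-detector: the function name `classify_elf` is hypothetical, and the heuristics are assumptions (a PT_INTERP program header is treated as "dynamically linked", and a SHT_SYMTAB section as "not stripped"). It handles both 32-bit and 64-bit files in either byte order, since IoT malware targets many architectures.

```python
import struct

PT_INTERP = 3    # program header type: path of the dynamic loader
SHT_SYMTAB = 2   # section header type: full symbol table


def classify_elf(data: bytes) -> str:
    """Heuristically classify an ELF image (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian
    if is64:
        e_phoff, e_shoff = struct.unpack_from(end + "QQ", data, 0x20)
        sizes = struct.unpack_from(end + "HHHH", data, 0x36)
    else:
        e_phoff, e_shoff = struct.unpack_from(end + "II", data, 0x1C)
        sizes = struct.unpack_from(end + "HHHH", data, 0x2A)
    e_phentsize, e_phnum, e_shentsize, e_shnum = sizes

    # p_type is the first 32-bit field of every program header.
    p_types = [struct.unpack_from(end + "I", data, e_phoff + i * e_phentsize)[0]
               for i in range(e_phnum)]
    # sh_type is the second 32-bit field of every section header.
    sh_types = [struct.unpack_from(end + "I", data, e_shoff + i * e_shentsize + 4)[0]
                for i in range(e_shnum)]

    if PT_INTERP in p_types:
        return "dynamically linked"
    if SHT_SYMTAB in sh_types:
        return "statically linked, not stripped"
    return "statically linked, stripped"
```

For a file on disk, `classify_elf(open(path, "rb").read())` returns one of the three labels. Only the last case requires signature-based identification. Samples with a deliberately corrupted section header table may be misclassified by this sketch.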
3. Usage
3.1. Install
“pkg2sig.py” in the “script” folder downloads the packages and generates the IDA F.L.I.R.T. signatures, but it is easier to use the pre-generated signatures in the “deliverable” folder. Copy the entire contents of the “sig” folder inside the “deliverable” folder into the “sig” folder of the IDA Pro installation folder shown in Figure 1 (usually “C:\Program Files\IDA Pro 7.6\sig”). This completes the installation.
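The copy step can also be scripted. The sketch below is only an illustration: `install_signatures` is a hypothetical helper that is not shipped with idaflirt-detector, and it assumes Python 3.8 or later for the `dirs_exist_ok` parameter of `shutil.copytree`.

```python
import shutil
from pathlib import Path


def install_signatures(deliverable_dir, ida_dir):
    """Merge <deliverable_dir>/sig into <ida_dir>/sig (hypothetical helper)."""
    src = Path(deliverable_dir) / "sig"
    dst = Path(ida_dir) / "sig"
    if not src.is_dir():
        raise FileNotFoundError(src)
    # dirs_exist_ok=True merges into the existing IDA "sig" folder
    # instead of failing because the destination already exists.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

A call such as `install_signatures("deliverable", r"C:\Program Files\IDA Pro 7.6")` would perform the copy described above. Writing under “C:\Program Files” usually requires administrator privileges.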
3.2. Identify runtime library
“chksig.py” in the “script” folder identifies the runtime library. If the executable file is named “sample”, run it from the command line as follows.
- python chksig.py <sample’s path>
“chksig.py” runs the command-line version of IDA Pro several times and then creates “sample_chksig.json” (the output name is the target file name with “_chksig.json” appended).
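When processing many samples, the command above can be driven from Python. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the relative script path “script/chksig.py” is assumed from the folder layout described earlier, and the result file is assumed to be written next to the sample. Only the “_chksig.json” suffix rule comes from the description above.

```python
import subprocess
import sys
from pathlib import Path


def chksig_result_path(sample):
    """Expected result file: the target file name with "_chksig.json" appended."""
    sample = Path(sample)
    # Assumption: the JSON file is created in the same folder as the sample.
    return sample.with_name(sample.name + "_chksig.json")


def run_chksig(sample, script="script/chksig.py"):
    """Run chksig.py on one sample and return the expected result path."""
    # Equivalent to: python chksig.py <sample's path>
    subprocess.run([sys.executable, script, str(sample)], check=True)
    return chksig_result_path(sample)
```

Because “chksig.py” launches IDA Pro several times, each call may take a while. `check=True` raises an exception if the script exits with an error.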
3.3. Apply the identified runtime library
Figure 2 shows a statically linked and stripped executable file opened in IDA Pro. The runtime library functions have not been identified, so their names are unknown.
In IDA Pro, “prepare.py” in the “script” folder loads the identified runtime library based on “sample_chksig.json”. “prepare.py” also performs the following:
- create functions from independent code
- normalize function name
- detect “main” function
- load type library
- apply function declaration
Figure 3 shows IDA Pro after executing “prepare.py”. The runtime library functions have been identified and their color is now cyan.
4. Detail
4.1. Distribution and Runtime Library
Since a prior study [2] has shown that many IoT malware build environments are Firmware Linux and Aboriginal Linux (Firmware Linux was later renamed Aboriginal Linux), “pkg2sig.py” supports only these distributions. The supported runtime libraries are libc and libgcc.
4.2. chksig.py
When “chksig.py” is executed for the first time, it launches IDA Pro and applies all signatures to the sample. The library flag of each function is cleared before a signature is applied, but the state cannot be fully reset. As a result, the same signature gives different results the first time it is applied and on later applications (after other signatures have been applied). “chksig.py” therefore stores the result of applying all signatures as an estimated value. It then applies the signatures one by one, in order of increasing estimated value, to obtain a determined value for each. If a determined value is greater than all the estimated values, the corresponding runtime library is identified as the library statically linked into the executable.
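The selection logic can be illustrated without IDA Pro. The sketch below is one literal reading of the description above, not the actual code of “chksig.py”: `identify_library` and `apply_alone` are hypothetical names, and each value is assumed to be a count of functions matched by a signature.

```python
from typing import Callable, Dict, Optional


def identify_library(estimated: Dict[str, int],
                     apply_alone: Callable[[str], int]) -> Optional[str]:
    """Return the signature whose determined value beats every estimate.

    estimated   -- value per signature from the pass that applies all signatures
    apply_alone -- hypothetical callback that applies one signature by itself
                   and returns its determined value
    """
    # "Greater than all the estimated values" means greater than the maximum.
    threshold = max(estimated.values(), default=0)
    # Visit the signatures in order of increasing estimated value.
    for sig in sorted(estimated, key=lambda s: estimated[s]):
        determined = apply_alone(sig)
        if determined > threshold:
            return sig
    return None  # no signature stands out; the library is not identified
```

In practice `apply_alone` would start IDA Pro in command-line mode with a single signature, which is why the tool runs IDA Pro several times.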
4.3. Normalize Function Name
A library function may have more than one name. For example, “strcmp” may also appear as “strcoll”, “__GI_strcmp”, and “__GI_strcoll”. If the same function is identified under different names, analysis becomes difficult. “prepare.py” therefore replaces the other names with the shortest name, using the first in lexical order to break ties. “pkg2sig.py” generates “name_alternate.csv” for this replacement when it generates the IDA F.L.I.R.T. signatures. In this file, each line lists the names of one function, sorted by length and then in lexical order.
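The naming rule can be expressed as a sort key of (length, name). The following sketch is illustrative only: the helper names are hypothetical, and the exact layout of “name_alternate.csv” (comma-separated names, one function per line) is an assumption.

```python
import csv
import io


def canonical_name(names):
    """Pick the shortest name; ties are broken by lexical order."""
    return min(names, key=lambda n: (len(n), n))


def load_name_map(csv_text):
    """Map every alternate name to its canonical name (assumed CSV layout)."""
    mapping = {}
    for row in csv.reader(io.StringIO(csv_text)):
        names = [n for n in row if n]
        if not names:
            continue  # skip blank lines
        canonical = canonical_name(names)
        for name in names:
            if name != canonical:
                mapping[name] = canonical
    return mapping
```

With the example above, `canonical_name(["strcoll", "__GI_strcmp", "strcmp", "__GI_strcoll"])` selects “strcmp”, because it is the shortest of the four names.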
4.4. Apply Function Declaration
“prepare.py” applies function declarations based on “prepare.txt”. Currently, “prepare.txt” does not contain enough declarations, so more need to be added as required.
5. Acknowledgement
This research was conducted under a contract of “MITIGATE” among “Research and Development for Expansion of Radio Wave Resources(JPJ000254)”, which was supported by the Ministry of Internal Affairs and Communications, Japan.
6. References
[1] Hex-Rays, “IDA F.L.I.R.T. technology: in-depth”, https://www.hex-rays.com/products/ida/tech/flirt/in_depth/, (accessed 2022-01-24).
[2] Akabane, S. and Okamoto, T., “Identification of library functions statically linked to Linux malware without symbols”, Procedia Computer Science, Vol. 176, pp. 3436-3445 (2020), DOI: https://doi.org/10.1016/j.procs.2020.09.053.
[3] Akabane, S. and Okamoto, T., “stelftools”, https://github.com/shuakabane/stelftools, (accessed 2022-01-24).
[4] SecureBrain, “idaflirt-detector”, https://github.com/SecureBrain/idaflirt-detector, (accessed 2022-01-24).