Windows API Primer for Malware Analysis
The Windows API is a common way for Windows-based malware to interface with the operating system and carry out its functionality. Functions are required to ‘do something’, such as decrypting strings, making connections to IP addresses or enumerating processes on the victim machine.
These functions are exported by DLLs (Dynamic Link Libraries). Let’s take URLDownloadToFile() as an example: referring to MSDN (Microsoft Developer Network), we can see that URLDownloadToFile is described as a function that “Downloads bits from the Internet and saves them to a file.”
We can also see the arguments (parameters) it takes to perform correctly, along with the possible return values.
Note: Further analysis is required beyond looking at the IAT (Import Address Table) to confirm whether a function is ever actually called. It could be there as a placeholder for a later update, or just to throw you off!
How to use this article
I’ve created this document as a point of reference for Windows API function calls. So maybe you’re seeing URLDownloadToFile() and ShellExecute() connected in two branches of code, but don’t understand what purpose they fulfil from the point of view of a malicious actor. You can check this document and its descriptions for hints. Hopefully my hints add up to what you’re seeing but, as always, apply your own due diligence, as the behaviour in your sample could differ from the hints I have listed.
Cross-reference your findings with this document and MSDN to get a full picture. Check MSDN for the description of the function call: what does it do? What parameters does it take? What does it return, and what does the malware do with the return value(s)? Theorise how it could be used maliciously, or search online for malware using that function to see how threat actors have used it.
NOTE: Some parts will be incomplete; I can only fill in the areas I’ve previously come across in my analysis.
Malware stages and their common Function calls
This is a condensed list of function calls, mapped to stages of malware, i.e. Downloaders, Droppers, Unpacking, Gaining Persistence etc. The table below then gives extra info on an individual function.
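As a rough illustration of how a list like this can be used, here is a minimal Python sketch that groups the imports seen in a sample by stage. The mapping is deliberately partial and only uses function names from this article; the stage labels, the helper names and the sample import list are hypothetical, and a match is only a hint, not proof that the function is ever called.

```python
# Partial, illustrative mapping of API names to the stages used in this article.
STAGE_BY_API = {
    "URLDownloadToFile": "Downloader",
    "ShellExecute": "Downloader",
    "WinExec": "Downloader",
    "FindResource": "Dropper",
    "LoadResource": "Dropper",
    "LockResource": "Dropper",
    "SizeofResource": "Dropper",
    "VirtualAlloc": "Unpacking",
    "VirtualAllocEx": "Injection",
    "CreateRemoteThread": "Injection",
    "RegCreateKeyEx": "Persistence",
    "RegSetValueEx": "Persistence",
    "SetWindowsHookEx": "Keylogging",
    "GetAsyncKeyState": "Keylogging",
}


def base_name(imported):
    """Strip the ANSI/Unicode suffix, e.g. ShellExecuteW -> ShellExecute."""
    if imported not in STAGE_BY_API and imported[-1:] in ("A", "W"):
        return imported[:-1]
    return imported


def group_by_stage(imports):
    """Return {stage: [imported names]} for every import the mapping knows."""
    stages = {}
    for name in imports:
        stage = STAGE_BY_API.get(base_name(name))
        if stage:
            stages.setdefault(stage, []).append(name)
    return stages


# Hypothetical import list, as it might be copied out of a PE viewer.
sample_imports = ["URLDownloadToFileA", "ShellExecuteW", "RegSetValueExA", "ExitProcess"]
hints = group_by_stage(sample_imports)
print(hints)
```

Imports that the mapping does not know (ExitProcess here) are simply ignored, so an empty result says nothing about whether a sample is benign.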
Downloaders
URLDownloadToFile() — Downloads the file to disk.
ShellExecute() — Execute the downloaded specimen
WinExec() — Execute the downloaded specimen
CreateProcess() — Execute the downloaded specimen
Droppers
FindResource() — Payloads and strings may be stored in the .rsrc section of a PE to avoid static analysis tooling. FindResource looks in this section for a particular resource; the returned handle is stored in EAX and is normally passed as the second argument to LoadResource.
LoadResource() — Retrieves a handle to the data associated with the resource. The return value of LoadResource is then passed as the argument to LockResource, which turns the handle into a pointer.
LockResource() — Takes the handle returned by LoadResource and returns a pointer to the resource data itself. Set a breakpoint on this call, step over it and examine the hex dump at the address in EAX: you’ll often see an MZ header (or possibly other executable code).
SizeofResource() — Once the resource is retrieved, the malware determines the size of the resource (PE file/executable code).
CreateFile() — Once the resource is located and its size is known, CreateFile creates the file that the data will be written to. The parameter of most interest is the first, lpFileName, which holds the path and name of the file. The size is not passed here; it goes to WriteFile. The function returns a handle to the file.
WriteFile() — The data is then written to disk with this function. The first parameter is the handle to the file, the second is the address of the resource data and the third is the number of bytes to write, which was determined by SizeofResource.
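When checking what LockResource handed back, it helps to confirm that the dumped bytes really are a PE file and not just data that happens to start with "MZ". The following is a minimal Python sketch of that check; the function name and the synthetic buffer are my own, standing in for bytes you would dump from the debugger.

```python
import struct


def looks_like_pe(buf):
    """True if buf starts with an MZ header whose e_lfanew field points at a PE signature."""
    if len(buf) < 0x40 or buf[:2] != b"MZ":
        return False
    # e_lfanew (offset of the PE header) is a little-endian DWORD at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    return buf[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


# Synthetic stand-in for bytes dumped from the address returned by LockResource.
dump = bytearray(0x80)
dump[0:2] = b"MZ"
struct.pack_into("<I", dump, 0x3C, 0x40)
dump[0x40:0x44] = b"PE\x00\x00"

is_pe = looks_like_pe(bytes(dump))
is_not_pe = looks_like_pe(b"not an exe")
print(is_pe, is_not_pe)
```

A resource that fails this check may still be executable code (shellcode, or an encrypted PE), so treat a negative result as a prompt to look closer.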
Unpacking
VirtualAlloc() — This is the most common import/function call you’ll see when looking at packed malware samples. VirtualAlloc reserves, commits, or changes the state of a region of pages in the virtual address space of the calling process. Memory allocated by this function is automatically initialised to zero. The first parameter is lpAddress, the starting address of the region to allocate, and the second is dwSize, the size of the region in bytes. The return value, if successful, is the base address of the allocated region of pages, which is then likely used by the unpacking subroutine.
GlobalAlloc() — Another way to allocate memory, but most malware prefers VirtualAlloc because the memory address can be specified as a parameter. If a memory block can be allocated at the desired address, there is no need to perform address relocation on the unpacked code.
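When you break on VirtualAlloc in a debugger, the arguments show up as raw numbers. The sketch below decodes them in Python. Beyond lpAddress and dwSize it also covers the third and fourth parameters of the documented signature, flAllocationType and flProtect, which are not discussed above; the constant tables are partial and the helper name is hypothetical.

```python
# Partial tables of documented constants; extend as needed.
ALLOCATION_TYPES = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PROTECTIONS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}


def describe_virtualalloc(lp_address, dw_size, fl_allocation_type, fl_protect):
    """Turn the four raw VirtualAlloc arguments into something readable."""
    return {
        "address": "chosen by the system" if lp_address == 0 else hex(lp_address),
        "size": dw_size,
        "allocation": [name for bit, name in ALLOCATION_TYPES.items()
                       if fl_allocation_type & bit],
        "protect": PROTECTIONS.get(fl_protect, hex(fl_protect)),
        # Writable and executable memory is what an unpacking stub typically needs.
        "executable_and_writable": fl_protect == 0x40,
    }


# Hypothetical argument values read off the stack at a breakpoint.
call = describe_virtualalloc(0, 0x5000, 0x3000, 0x40)
print(call)
```

A non-zero lpAddress is worth noting, as it suggests the code wants a specific base address and may be skipping relocation.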
Process Injection
EnumProcessModules() — Enumerates the loaded modules (executables and DLLs) of a given process. Malware enumerates modules when performing injection.
VirtualAllocEx() and CreateRemoteThread() — A process cannot simply write to an area of memory belonging to another process (for example, a running Notepad), as the virtual address mappings used by each process differ. VirtualAllocEx(), however, allows a block of memory to be allocated inside a different process; code can then be copied into this area and executed by way of CreateRemoteThread().
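If you have an API trace from a sandbox or debugger log, the order of calls matters more than their mere presence. This is a small Python sketch that checks whether VirtualAllocEx is followed later by CreateRemoteThread. The trace is made up for illustration, and OpenProcess and WriteProcessMemory appear in it only as assumed surrounding calls, not as something taken from the text above.

```python
def contains_in_order(trace, pattern):
    """True if every call in pattern appears in trace in the same relative order."""
    remaining = iter(trace)
    # Each any() consumes the iterator up to the match, so order is enforced.
    return all(any(call == wanted for call in remaining) for wanted in pattern)


INJECTION_PATTERN = ["VirtualAllocEx", "CreateRemoteThread"]

# Hypothetical call trace for illustration.
trace = ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]

in_order = contains_in_order(trace, INJECTION_PATTERN)
out_of_order = contains_in_order(["CreateRemoteThread", "VirtualAllocEx"], INJECTION_PATTERN)
print(in_order, out_of_order)
```

Other calls are allowed in between, which is what you want for real traces where the interesting calls are rarely adjacent.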
Encryption
CryptEncrypt() — Encrypts data. The algorithm used is designated by the key held by the CSP (Cryptographic Service Provider) module, which is referenced by hKey, the first parameter passed to the function.
CryptGenKey() — Generates a random cryptographic session key or a public/private key pair. A handle to the key or key pair is returned in phKey.
CryptImportKey() — Transfers a cryptographic key from a key BLOB into a cryptographic service provider (CSP). Key BLOBs hold keys outside the CSP.
CryptDecrypt() — Decrypts data previously encrypted by CryptEncrypt. The first parameter, hKey, is a handle to the key used for decryption. The fifth parameter, pbData, is also of interest: it is a pointer to the buffer holding the data to be decrypted, and after decryption the plaintext is placed back into the same buffer.
API Obfuscation Routines
GetProcAddress() — Because it is easy to discover which API functions a program calls by examining the IAT, GetProcAddress and LoadLibrary are used to resolve the addresses of the API calls made by the body of the malware at runtime (as opposed to importing and calling them directly). GetProcAddress retrieves the address of a function exported by a DLL loaded into memory. This lets the malware use functions from other DLLs in addition to the functions imported in the PE file header.
LoadLibrary() — Loads the specified module into the address space of the calling process. The only parameter is lpLibFileName, the name of the module, which can be either a library module (a .dll file) or an executable module (an .exe file). It returns a handle to the module if successful.
LdrLoadDll() — A low-level function in ntdll.dll that loads a DLL into a process, just like LoadLibrary.
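To make the idea concrete, here is a Python analogy of runtime resolution using ctypes. It is not the Windows calls themselves: loading the C runtime stands in for LoadLibrary, and looking up a symbol by name stands in for GetProcAddress. The single-byte XOR encoding of the name is an assumption chosen for illustration, as are the key and the helper names.

```python
import ctypes
import os

# "strlen" XOR-encoded with 0x5A, so the plain name never appears as a string.
KEY = 0x5A
ENCODED_NAME = bytes([0x29, 0x2E, 0x28, 0x36, 0x3F, 0x34])


def load_c_runtime():
    """LoadLibrary analogue: msvcrt on Windows, the already loaded C library elsewhere."""
    return ctypes.CDLL("msvcrt") if os.name == "nt" else ctypes.CDLL(None)


def resolve(library, encoded_name, key):
    """Decode the name at runtime, then look it up (GetProcAddress analogue)."""
    name = bytes(b ^ key for b in encoded_name).decode("ascii")
    return name, getattr(library, name)


name, func = resolve(load_c_runtime(), ENCODED_NAME, KEY)
func.argtypes = [ctypes.c_char_p]
func.restype = ctypes.c_size_t
length = func(b"kernel32.dll")
print(name, length)
```

The point for analysis is that nothing in the import table or the strings output mentions the resolved function, so you have to break on the resolver and read the decoded name from its arguments.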
Gaining Persistence (Registry)
RegCreateKeyExA() — Creates the specified registry key; if it already exists, the function just opens it.
RegOpenKey() — Opens a handle to a registry key for reading and editing. The first parameter, hKey, is a handle to an already open parent key, typically one returned by RegCreateKeyEx or RegOpenKeyEx. Otherwise, it can be one of the following predefined keys: HKEY_CLASSES_ROOT, HKEY_CURRENT_CONFIG, HKEY_CURRENT_USER, HKEY_LOCAL_MACHINE, HKEY_USERS.
RegEnumKey() — Once the key is opened, malware might use this to enumerate the subkeys of the open key. In my experience, I have seen it do this as a method to detect whether the machine is already infected, either by its own malware or by other strains.
RegQueryValue() — Part of the registry enumeration process. This function queries the value of a specified registry key; malware can use it to check whether a value is already set.
RegSetValue() — As it says on the tin, this function sets the value of a specified registry key. This is where you will see the persistence being implemented.
RegCloseKey() — This function closes the handle to the registry key, presumably once the malware has set the values.
Keylogging
GetAsyncKeyState() — Checks the state of a key (whether it is pressed or released) at the time of the call, which lets the malware determine when a key is being pressed.
SetWindowsHookEx() — Installs a hook procedure to monitor events, in this case keyboard events. The malware registers a function that will be notified when keyboard events occur; you might see some CMP instructions with conditional jumps that decide whether to send the data somewhere, locally or over the network. SetWindowsHookEx takes a parameter called ‘idHook’ which selects the type of hook procedure to be installed. These range from monitoring messages posted to a message queue (WH_GETMESSAGE, or constant 3) and monitoring the keyboard (WH_KEYBOARD, or constant 2) to mouse events (WH_MOUSE, or constant 7). Be aware that a mouse hook receives separate messages for mouse movement, left button down and up, and right button down and up.
GetKeyState() — Another API to get the state of keys on a keyboard.
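In disassembly, idHook and the key state return values appear as plain numbers, so a small decoder saves some lookups. The Python sketch below is a minimal version. Note that WH_KEYBOARD_LL (13) and WH_MOUSE_LL (14) are documented low-level hook types that I have added and that are not covered above; the function names are my own.

```python
HOOK_TYPES = {
    2: "WH_KEYBOARD",
    3: "WH_GETMESSAGE",
    7: "WH_MOUSE",
    13: "WH_KEYBOARD_LL",  # not covered above, added for completeness
    14: "WH_MOUSE_LL",     # not covered above, added for completeness
}


def hook_name(id_hook):
    """Name the idHook value pushed to SetWindowsHookEx."""
    return HOOK_TYPES.get(id_hook, f"unknown ({id_hook})")


def key_is_down(state):
    """GetAsyncKeyState and GetKeyState return a SHORT; the top bit is set while the key is down."""
    return bool(state & 0x8000)


print(hook_name(2), hook_name(7), hook_name(99))
print(key_is_down(0x8001), key_is_down(0x0001))
```

The 0x8000 mask is why keyloggers built on GetAsyncKeyState often contain a TEST or AND against that value right after the call.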
Note: cover the keyloggers in SANS booklets.
Network sniffing/traffic monitoring
WSASocket() or socket() — Creates a socket.
bind() — Binds a socket to an interface (NIC).
WSAIoctl() or ioctlsocket() — Puts the interface (NIC) into promiscuous mode. The control code is set to SIO_RCVALL, which tells the OS to put the NIC into promiscuous mode. The control code is the second parameter in both functions (dwIoControlCode in WSAIoctl, cmd in ioctlsocket), after the socket itself.
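A disassembler will not always resolve SIO_RCVALL to its name, so it is useful to recognise the raw value. This Python sketch rebuilds it the way the Winsock headers define it, as _WSAIOW(IOC_VENDOR, 1); the helper names are my own, and RCVALL_ON is the documented option value that switches receive-all on.

```python
IOC_IN = 0x80000000      # data is passed in to the call
IOC_VENDOR = 0x18000000  # vendor-specific group of control codes


def wsaiow(group, code):
    """Mirror of the _WSAIOW(x, y) macro: IOC_IN | x | y."""
    return IOC_IN | group | code


SIO_RCVALL = wsaiow(IOC_VENDOR, 1)
RCVALL_ON = 1  # option value passed in the input buffer


def is_promiscuous_request(control_code, option_value):
    """True if the observed control code and option enable receive-all."""
    return control_code == SIO_RCVALL and option_value == RCVALL_ON


print(hex(SIO_RCVALL))  # 0x98000001
```

Searching a sample for the immediate value 0x98000001 is a quick way to find candidate sniffing code before reading the surrounding calls.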
Replication via Removable Media
GetLogicalDriveStrings() — Typically reads/lists all drives connected to the machine.
GetDriveType() — Determines the type of each drive, which tells the malware which ones are removable.
CreateDirectory() — Creates a directory (in this scenario, on the target removable media). The param of interest here is lpPathName (the path of the directory to be created). You might notice malware using the Unicode version of this function, ‘CreateDirectoryW’, as a trick: it does this because Windows will not display certain accent characters in a directory name, which makes the directory effectively invisible. The return value is nonzero on success and zero on failure. Typically, if this fails, the directory already exists, so check where the jump goes on failure: does the malware kill itself off, knowing the machine is already infected?
SetFileAttributes() — Sets the attributes for a file or directory. The function takes only two params: lpFileName (the name of the file whose attributes are to be set) and dwFileAttributes (the file attributes to be set). There are a number of attributes; the one you’ll typically see is the following (and don’t take this as gospel, again cross-reference your analysis):
- FILE_ATTRIBUTE_HIDDEN — The file or directory is hidden and is not included in an ordinary directory listing, so the user will not see it.
CreateFile() — With the directory made and hidden, the malware will now create an empty file for the executable code to be dropped into. The function will return a handle that can be used to access the file for I/O operations.
WriteFile() — Takes the handle returned by CreateFile as its first parameter. The second parameter, lpBuffer, is a pointer to the buffer containing the data to be written to the file. This may be an address elsewhere in the virtual memory of the process, or it may have been handed to WriteFile from somewhere else, so trace back how the data was retrieved and where the pointer was created.
MoveFile() — In my own analysis of malware targeting removable media, the malware typically has to trick the user into executing the payload. Up to Windows Vista, the Autorun.inf file was an ideal attack vector, as the autorun feature would run the commands in that file, i.e. the payload! From Vista onwards this method is defunct, so the malware instead tricks the user by moving everything into the hidden directory. Sometimes a file inside the hidden folder will be called desktop.ini, used to change the icon of the hidden folder to something that resembles a drive. The malware would then be expected to create a shortcut to rundll32.exe with the malware payload (badfile.exe) as its parameter. Instead of opening the hidden folder, the shortcut launches rundll32.exe and passes the payload to it as a launch parameter, thus executing the malware.
CopyFile() — This might be used to copy all files from the removable media into the aforementioned hidden folder, leaving the user with only one option to click.
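The first steps of this routine can be followed more easily if you decode the raw values yourself. The Python sketch below splits a GetLogicalDriveStrings-style buffer (null-separated roots ending in a double null), picks out the removable drives from GetDriveType return values, and tests for the hidden attribute. The buffer contents and the observed drive types are made up, and the helper names are hypothetical.

```python
DRIVE_TYPES = {
    0: "DRIVE_UNKNOWN",
    1: "DRIVE_NO_ROOT_DIR",
    2: "DRIVE_REMOVABLE",
    3: "DRIVE_FIXED",
    4: "DRIVE_REMOTE",
    5: "DRIVE_CDROM",
    6: "DRIVE_RAMDISK",
}
FILE_ATTRIBUTE_HIDDEN = 0x2


def parse_drive_strings(buffer):
    """Split a null-separated, double-null-terminated list of drive roots."""
    return [root for root in buffer.split("\0") if root]


def removable_roots(buffer, drive_type_of):
    """Keep only the roots whose GetDriveType value means DRIVE_REMOVABLE."""
    return [root for root in parse_drive_strings(buffer)
            if DRIVE_TYPES.get(drive_type_of.get(root)) == "DRIVE_REMOVABLE"]


def is_hidden(dw_file_attributes):
    """True if the attribute mask passed to SetFileAttributes hides the file."""
    return bool(dw_file_attributes & FILE_ATTRIBUTE_HIDDEN)


# Hypothetical values as they might be read from memory in a debugger.
buffer = "\0".join(["C:\\", "D:\\", "E:\\"]) + "\0\0"
observed_types = {"C:\\": 3, "D:\\": 5, "E:\\": 2}

targets = removable_roots(buffer, observed_types)
print(targets, is_hidden(0x6))
```

In a sample, the equivalent of removable_roots is usually a loop that compares the GetDriveType return value against 2 before calling CreateDirectory.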
Command and Control (C2)
InternetOpen() — Initialises the malware’s use of the WinINet functions. The param of interest here is lpszAgent, the user agent string, which is also a good IOC.
GetComputerName() — During analysis I have seen C2 malware send the victim’s machine name up to the C2 as part of the User Agent. As always, double-check how the sample you’re looking at really uses this API call.
InternetConnect() — Opens an FTP or HTTP session for a given site. The hInternet parameter takes the handle returned by a previous call to InternetOpen(). The second parameter, lpszServerName, is a null-terminated string that specifies the host name of the server to connect to. Typically this will contain an IP address, sometimes obfuscated in one way or another to protect against basic analysis and to throw the analyst off during code analysis.
HttpOpenRequest() — Creates an HTTP request handle. It is part of building the HTTP request and takes the session handle from InternetConnect(), which by now carries things like the User Agent, IP address, protocol etc.
HttpSendRequest() — As it says on the tin, this function sends the request built by the previous functions up to the HTTP server. The amount of data that can be sent depends on whether the malware uses HttpSendRequest() or HttpSendRequestEx(), so be aware of that.
InternetOpenUrl() — Queries a resource specified by a given URL. The first argument, hInternet, is the session handle from InternetOpen(), and the second, lpszUrl, is a pointer to a null-terminated string holding the URL. You may also want to look at the third argument, lpszHeaders, which specifies the headers, and at dwHeadersLength, as these can be a behavioural indicator if you’re tracking malware over time: if the header length stays the same across samples, they may be using the same command set. The dwFlags parameter is interesting as it can control the discretion of the connection, i.e. some flags ignore redirects to HTTPS or HTTP, while others disable the checking of SSL certificates for valid dates.
InternetReadFile() — Reads data from a previously opened URL. The first argument, hFile, is a handle returned by InternetOpenUrl, FtpOpenFile, or HttpOpenRequest. The second, lpBuffer, is a pointer to a buffer that receives the data, and the third, dwNumberOfBytesToRead, is the number of bytes to read. Finally, lpdwNumberOfBytesRead is a pointer to a variable that receives the number of bytes read. You will have to debug this function to see the real value, as InternetReadFile sets it to zero before doing any work or error checking.
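The dwFlags value passed to the WinINet calls is a bit mask, so it is worth decoding rather than guessing. Below is a minimal Python sketch with a partial table of flag values as I understand them from wininet.h; please verify them against the header or MSDN before relying on them. The function name and the example value are hypothetical.

```python
# Partial table; values should be checked against wininet.h.
INTERNET_FLAGS = {
    0x00001000: "INTERNET_FLAG_IGNORE_CERT_CN_INVALID",
    0x00002000: "INTERNET_FLAG_IGNORE_CERT_DATE_INVALID",
    0x00004000: "INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS",
    0x00008000: "INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP",
    0x00800000: "INTERNET_FLAG_SECURE",
    0x80000000: "INTERNET_FLAG_RELOAD",
}


def decode_flags(dw_flags):
    """List the known flag names set in dw_flags, plus any leftover bits as hex."""
    names = [name for bit, name in INTERNET_FLAGS.items() if dw_flags & bit]
    known_mask = 0
    for bit in INTERNET_FLAGS:
        known_mask |= bit
    leftover = dw_flags & ~known_mask
    if leftover:
        names.append(hex(leftover))
    return names


# Hypothetical dwFlags value observed at a breakpoint.
decoded = decode_flags(0x80003000)
print(decoded)
```

Seeing both certificate flags set is a useful behavioural note, as it suggests the operator expects to run C2 behind a certificate that would not validate.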
WSAStartup() — Initialises low-level network functionality. Finding calls to WSAStartup can often be a way to identify the start of network-related functionality.
socket() — Creates a socket that is bound to a specific transport service provider. The first argument is the address family; the most common value is ‘AF_INET’, which specifies the use of IPv4 and may be displayed as its constant value 2. The second parameter is ‘type’, which typically specifies TCP or UDP. The values range from 1 to 5: TCP is ‘1’ or ‘SOCK_STREAM’ and UDP is ‘2’ or ‘SOCK_DGRAM’.
connect() — Establishes a connection from the specified socket to a remote address.
send() — Sends data on a connected socket.
recv() — Receives data from a connected socket.
gethostbyname() — Resolves a hostname to an IP address. It performs a DNS lookup on a particular hostname prior to making an IP connection. The hostname used here makes a good IOC.
inet_ntoa() — Converts an IPv4 address into a string in Internet standard dotted-decimal format.
inet_addr() — Converts a string containing an IPv4 dotted-decimal address into a proper address for the IN_ADDR structure.
CreateThread() — Creates a thread for a function to run in. In this case you might see a push instruction placing the offset of another function onto the stack; that function performs extra C2 activities.
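The Winsock values above are easy to misread in a debugger, especially IP addresses, which sit in memory in network byte order but are often shown as a little-endian DWORD. This Python sketch uses the standard socket module, whose inet_aton and inet_ntoa behave like inet_addr and inet_ntoa for this purpose. The address is from the documentation range 192.0.2.0/24 and the helper names are my own.

```python
import socket
import struct

ADDRESS_FAMILIES = {2: "AF_INET"}
SOCKET_TYPES = {1: "SOCK_STREAM (TCP)", 2: "SOCK_DGRAM (UDP)"}


def describe_socket_args(af, sock_type):
    """Name the first two arguments pushed to socket()."""
    return (ADDRESS_FAMILIES.get(af, f"unknown ({af})"),
            SOCKET_TYPES.get(sock_type, f"unknown ({sock_type})"))


def dword_to_ip(dword):
    """Convert an in_addr value, read as a little-endian DWORD, to dotted format."""
    return socket.inet_ntoa(struct.pack("<I", dword))


packed = socket.inet_aton("192.0.2.10")      # like inet_addr: string -> 4 bytes
as_dword = struct.unpack("<I", packed)[0]    # how a debugger may show those bytes
round_trip = dword_to_ip(as_dword)           # like inet_ntoa: back to a string

print(describe_socket_args(2, 1))
print(hex(as_dword), round_trip)
```

If a sample pushes a constant such as 0x0A0200C0 shortly before connect(), running it through dword_to_ip is a quick way to recover a hard-coded C2 address.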