Analyzing a stealer from Vietnam

Konoha
Jul 27, 2023


1/ About Stealer

Last week, my team lead came across an article about a stealer originating from Vietnam. After finishing my other tasks, I decided to analyze it, mainly to learn how to decompile a binary written in Node.js (this is the second time I have analyzed one).

2/ Analyze

Now, let’s analyze it. I have downloaded a sample from tria.ge as mentioned in the Twitter post.

Thanks to my experience from the previous analysis, the start went quite smoothly this time.

a/ Initial identification

How can I recognize if this is a program written in Node.js?

I loaded it into IDA and found the string "Node.js is only supported…" in the main function, which is a good indicator.

Because it is a Node.js program packaged into an executable, the file contains its configuration files and code in plain text. I searched for "dependencies": { to locate all package.json files. After some searching, I found the one that needed my attention.
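The search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact method I used in IDA: the function name find_package_json_offsets and the sample bytes are made up for the example.

```python
import re

def find_package_json_offsets(data: bytes, marker: bytes = b'"dependencies"') -> list[int]:
    """Return the byte offsets where the marker is followed by ':' and '{'."""
    pattern = re.compile(re.escape(marker) + rb'\s*:\s*\{')
    return [m.start() for m in pattern.finditer(data)]

# Made-up bytes standing in for the packaged executable
sample = b'\x00\x01{"name":"demo","dependencies": {"axios":"1.0"}}\x00{"dependencies":{}}'
offsets = find_package_json_offsets(sample)
print(offsets)
```

On a real sample, the bytes would come from reading the executable in binary mode, and each offset points near the start of an embedded package.json.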

A question popped into my mind: “How do you compile Node.js into an exe file?” So I searched and found a post discussing how to compile a Node.js application into an exe file:

This post mentioned two methods to compile into an exe file:

  • nexe
  • pkg

But in the package.json file I mentioned earlier, there is a line: "build": "pkg package.json". So, I now know that this stealer is compiled using pkg.

I have researched more information about pkg, and as a result, I found:

I understand that pkg is a packer that can package the entire Node.js project into an exe file.

To unpack it, I found a tool on GitHub:

This is the result when I use the above tool:

The code in app.js (the file that package.json tells pkg to use) has been obfuscated, which is very frustrating. So the next step is to clean the code and make it more readable.

b/ Clean code

To clean the code, I first used Visual Studio Code's “Format Document”. Then I used a tool to deobfuscate it:

The result I obtained is as follows:

Another issue: as we can see, in some places strings are not used directly but are resolved through a function, like this:
string = array_string[Input_number - 468]. Before that, array_string has been rearranged by a function on line 9.
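To make the mechanism concrete, here is a toy Python model of this pattern. The offset 468 comes from the sample. The strings and the rotation count are made up, and the rotation is only my assumption of what "rearranged" looks like, so treat it as a sketch rather than the sample's actual routine.

```python
from collections import deque

OFFSET = 468  # offset used by the resolver in the sample

def rotate(strings, steps):
    """Rotate left: move the first element to the end, `steps` times (assumed scheme)."""
    d = deque(strings)
    d.rotate(-steps)
    return list(d)

def resolve(array_string, input_number):
    """Model of: string = array_string[Input_number - 468]."""
    return array_string[input_number - OFFSET]

original = ["log", "readFile", "exit"]   # made-up strings
array_string = rotate(original, 1)       # ["readFile", "exit", "log"]
print(resolve(array_string, 468))        # readFile
print(resolve(array_string, 470))        # log
```

Because the array is rearranged at startup, the indexes only make sense against the rearranged array, which is why it has to be dumped after that function has run.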

I debugged the function on line 9 to obtain array_string, and then wrote a Python script that replaces each call to the resolve function with the resolved string:

import re

INPUT = "clean.js"
OUTPUT = "clean-ouput.js"

# Resolve an index to its string, mirroring the obfuscated resolver
def getText(index):
    arr = [
        ... # obtained array_string
    ]
    return arr[index-468]

with open(INPUT, "rb") as f:
    text = f.read()

f = open("log.txt","wb")  # log of every replacement made

codelines = text.decode().split("\n")
_vars = ["tinelle"] # <== name of the string-resolving function goes first (ex: tinelle)
regex = r'const\s+(\w+)\s+='

# Collect every alias of the resolver (const x = tinelle; const y = x; ...)
isbreak = False
while not isbreak:
    isbreak = True
    for i in range(len(codelines)):
        for v in _vars:
            if ("= %s"%v) in codelines[i]:
                text = re.search(regex, codelines[i])
                if text is None:
                    continue
                tmp = text.group(1)
                if tmp not in _vars:
                    _vars.append(tmp)
                    isbreak = False
                    break

print(_vars)
regexPattern = r'\b(\w+)\((\d+)\)'  # group 1: function name, group 2: index

# Replace each call such as tinelle(470) with the string it resolves to
for i in range(len(codelines)):
    for v in _vars:
        if ("%s("%v) in codelines[i]:
            tmpLine = codelines[i]
            while True:
                text = re.search(regexPattern, tmpLine)
                if text is None:
                    break

                tmpLine = tmpLine[text.end():]  # keep scanning after this match
                if text.group(1) != v:
                    continue

                tmpText = getText(int(text.group(2)))
                # Use the quote character that does not clash with the string
                quote = "'" if '"' in tmpText else '"'
                f.write(("%30s -> %30s \n"%(text[0], tmpText)).encode("utf-8"))
                codelines[i] = codelines[i].replace(text[0], quote + tmpText + quote)

with open(OUTPUT, "w", encoding="utf-8") as f1:
    f1.write("\n".join(codelines))
f.close()

The result:

c/ Analyze app.js (cleaned)

Below the resolve function, it hides the console window (something malware very often does). I changed hideConsole to showConsole so that if I accidentally execute it, I can still see it and close it.

Next, here is the function that decodes the cipher into a string (ignore the ‘require’):

To explain this algorithm for decoding the cipher, I will demonstrate the workflow when it is executed with the parameter “57414841…”.

I decoded it:

Next, I decoded the base64 to obtain the C2 server domains. The stealer randomly chooses 1 out of the 3 servers.
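As a rough illustration of this step (decode three base64 values, then pick one at random), here is a small Python sketch. The URLs are placeholders under example.com, not the real C2 domains, and the variable names are mine.

```python
import base64
import random

# Placeholder URLs (not the real C2 domains), encoded the way a config might store them
placeholders = ["https://one.example.com", "https://two.example.com", "https://three.example.com"]
encoded = [base64.b64encode(u.encode()).decode() for u in placeholders]

# Decoding step: base64 text -> URL
servers = [base64.b64decode(e).decode() for e in encoded]

# Selection step: one server out of the three, chosen at random
server = random.choice(servers)
print(servers)
print("chosen:", server)
```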

Up to this point, I have noticed that this stealer sends data via the Telegram API. It already has the bot token and two destinations configured (chatID: -694092486, -929725005).

Information about the bot:

Information about chatID -694092486:

Information about chatID -929725005:

Continuing the analysis, I will move on to the function named “main”. From here on, the code contains only function declarations, none of which are executed directly (they are all called from main), and I will decode all encoded values beforehand.

The first thing the main function does is sleep for 1 second using setTimeout.

Here, you might have the same questions I did: “Why is there a sleep of 1 second?” Also, 1e3 looks like a hex number, but converted from hex to decimal it would be 483.

I found the answer through a link on Stack Overflow. 1e3 is not a hex number; it is written in scientific notation. 1e3 means 1 * 10³, which equals 1000. So it sleeps for 1000 milliseconds, i.e. 1 second.
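The two readings of 1e3 are easy to check. Python uses the same scientific notation for float literals as JavaScript, so it works as a quick calculator here:

```python
# Scientific notation: 1e3 means 1 * 10**3
as_scientific = 1e3
print(as_scientific)            # 1000.0

# Misreading the same characters as hexadecimal digits (0x1e3)
as_hex = int("1e3", 16)
print(as_hex)                   # 483

# setTimeout takes milliseconds, so 1e3 ms is 1 second
print(as_scientific / 1000)     # 1.0
```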

Afterward, it checks whether the binary path of the current process is located within the user directory. If it is, it changes the variable “hanhDong” (Vietnamese for “action”) from “Tắt máy” (“shut down”) to “Reset”; otherwise, it makes no change.
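The check above can be modelled as follows. This is a sketch of the logic only: the helper is_inside and the example paths are hypothetical, and the sample itself is JavaScript.

```python
from pathlib import PureWindowsPath

def is_inside(path: str, directory: str) -> bool:
    """True if `path` equals `directory` or lies somewhere below it."""
    p = PureWindowsPath(path)
    d = PureWindowsPath(directory)
    return p == d or d in p.parents

def choose_action(exe_path: str, user_dir: str) -> str:
    # Mirrors the described behaviour: "Tắt máy" by default, "Reset" when run from the user directory
    return "Reset" if is_inside(exe_path, user_dir) else "Tắt máy"

print(choose_action(r"C:\Users\bob\Downloads\pdf.exe", r"C:\Users\bob"))
print(choose_action(r"C:\Temp\pdf.exe", r"C:\Users\bob"))
```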

It takes the executing path of the process, extracts the filename, and concatenates it with the user directory path into the variable atie. The variable albine is assigned the path %APPDATA%/../local/pdf.exe.

The purpose is to create persistence on the victim’s machine by copying the running stealer file into the current user directory and to %APPDATA%/../local/pdf.exe. These copies (and the currently running stealer file) are then set to launch at Windows startup using "auto_launch".

Before creating persistence, the stealer reads the time of its last execution from a file written during the previous run and checks whether more than 1890 seconds (a little over half an hour) have passed since then. If not, it goes to sleep (20.5s * 2880000) before proceeding with its series of actions; otherwise, it skips the sleep (on the first execution it does not sleep either). It then overwrites the file (creating it if it doesn’t exist) with the current time. Note that the file is named “/zcxzcd.txt”, it is located in the current user directory, and the timestamp is written before the sleep starts. I don’t understand the purpose of sleeping for more than 683 days here.
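The numbers in this paragraph can be verified with a little arithmetic. The variable names are mine; the values come from the sample.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Threshold between two executions
threshold_seconds = 1890
threshold_minutes = threshold_seconds / 60
print(threshold_minutes)        # 31.5 minutes, i.e. just over half an hour

# Sleep duration: 20.5 s repeated 2880000 times
sleep_seconds = 20.5 * 2880000
sleep_days = sleep_seconds / SECONDS_PER_DAY
print(sleep_seconds)            # 59040000.0 seconds
print(round(sleep_days, 2))     # 683.33 days
```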

Whether or not it receives any data, the result is saved in base64 format to a file named bcbcvbsde.txt in the user directory. Note: the received data is in JSON format.

It gets the victim’s IP address and country through whoer.net.

An overview of getBW: this function extracts a specific piece of data from the large string passed as the first parameter. It uses the signatures (the second and third parameters) to cut away everything before and after the portion to be extracted. For example, to extract “abc” from the string .....!@#<"abc'........., the call looks like this: getBW(".....!@#<\"abc'.........", '!@#<"', "'").
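A Python equivalent of this idea might look like the sketch below. It is my own reimplementation of the behaviour described above, not the stealer's code: the name get_between and the choice to return an empty string when a signature is missing are assumptions.

```python
def get_between(data: str, before: str, after: str) -> str:
    """Return the text between the first `before` and the next `after`."""
    start = data.find(before)
    if start == -1:
        return ""          # assumption: empty result when the left signature is missing
    start += len(before)
    end = data.find(after, start)
    if end == -1:
        return ""          # assumption: empty result when the right signature is missing
    return data[start:end]

result = get_between('.....!@#<"abc\'.........', '!@#<"', "'")
print(result)              # abc
```

This style of extraction is common when scraping values out of HTML or JSON responses without parsing them properly.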

After obtaining the victim’s IP, it creates a new data record through GraphQL over a WebSocket at the URL “wss://heheimage.xyz/graphql”, and the ID of that record is the victim’s IP address. In addition, I found that all of its domains also expose a GraphQL URL (server2 + "/graphql").

I used studio.apollographql to view their GraphQL schema.

Here is the tree map of their GraphQL schema:

Query
|-------businesses(…): Businesses
|-------businessess(…): [Businesses]
|-------countDuLieu(…): Float
|-------duLieu(…): DuLieu
|-------duLieus(…): [DuLieu]
|-------existImageInS3(…): Boolean
|-------fanpage(…): Fanpage
|-------fanpages(…): [Fanpage]
|-------imageInS3: [String]
|-------images(…): [Image]!
|-------me: User
|-------taiKhoanAdss(…): [TaiKhoanAds]
|-------users: [User]
Mutation
|-------capNhatTrangThaiDuLieu(…): DuLieu
|-------createBusinesses(…): Businesses
|-------createDuLieu(…): DuLieu
|-------createFanpage(…): Fanpage
|-------createTaiKhoanAds(…): TaiKhoanAds
|-------createUser(…): User
|-------deleteBusinesses(…): Businesses
|-------deleteFanpage(…): Fanpage
|-------deleteTaiKhoanAds(…): TaiKhoanAds
|-------deletesBusinesses(…): [Businesses]
|-------deletesDuLieu(…): [DuLieu]
|-------deletesFanpage(…): [Fanpage]
|-------deletesTaiKhoanAds(…): [TaiKhoanAds]
|-------imageCreate(…): Image
|-------imageDelete(…): Boolean!
|-------imageDeleteByUrl(…): Boolean!
|-------imageDeletes(…): Boolean!
|-------imageUpdate(…): Image
|-------newPasswordResetToken(…): Boolean
|-------newVerificationToken(…): ReNewVe
|-------removeUser(…): Boolean
|-------signedLinkUpload(…): signedS3
|-------signupUser(…): User
|-------updateBusinesses(…): Businesses
|-------updateDuLieu(…): DuLieu
|-------updateFanpage(…): Fanpage
|-------updateProfileUser(…): User
|-------updateTaiKhoanAds(…): TaiKhoanAds
|-------updateUser(…): User
|-------verifiForGot(…): String
|-------verifiLogin(…): String
|-------yeucaudulieumoiDuLieu(…): DuL
Subscription
|-------yeucaudulieumoi(…): yeucaudulieumoiSubType

Not only does it create new data, but it also performs data updates.

Now, let’s proceed with the final analysis of how this stealer collected data from the victim and transferred it back to the stealer’s authors.

Before proceeding with the analysis, I will display the predefined values of browser paths used by this stealer:

I decoded it:

ChromePaths: {
  name: 'Chrome',
  productName: 'Google Chrome',
  pa: '\\AppData\\Local\\Google\\Chrome\\User Data',
  local: '\\AppData\\Local\\Google\\Chrome\\User Data\\Local State',
  cookie: '\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Cookies',
  login: '\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Login Data'
},
OperaGXPaths: {
  name: 'Opera',
  productName: 'Opera Browser',
  pa: '\\AppData\\Roaming\\Opera Software\\Opera GX Stable',
  local: '\\AppData\\Roaming\\Opera Software\\Opera GX Stable\\Local State',
  cookie: '\\AppData\\Roaming\\Opera Software\\Opera GX Stable\\Cookies',
  login: '\\AppData\\Roaming\\Opera Software\\Opera GX Stable\\Login Data'
},
OperaDefaultPaths: {
  name: 'Opera',
  productName: 'Opera Browser',
  pa: '\\AppData\\Roaming\\Opera Software\\Opera Stable',
  local: '\\AppData\\Roaming\\Opera Software\\Opera Stable\\Local State',
  cookie: '\\AppData\\Roaming\\Opera Software\\Opera Stable\\Cookies',
  login: '\\AppData\\Roaming\\Opera Software\\Opera Stable\\Login Data'
},
MicrosoftEdgePaths: {
  name: 'Edge',
  productName: 'Microsoft Edge',
  pa: '\\AppData\\Local\\Microsoft\\Edge\\User Data',
  local: '\\AppData\\Local\\Microsoft\\Edge\\User Data\\Local State',
  cookie: '\\AppData\\Local\\Microsoft\\Edge\\User Data\\Default\\Cookies',
  login: '\\AppData\\Local\\Microsoft\\Edge\\User Data\\Default\\Login Data'
},
BravePaths: {
  name: 'Brave',
  productName: 'Brave Browser',
  pa: '\\AppData\\Local\\BraveSoftware\\Brave-Browser\\User Data',
  local: '\\AppData\\Local\\BraveSoftware\\Brave-Browser\\User Data\\Local State',
  cookie: '\\AppData\\Local\\BraveSoftware\\Brave-Browser\\User Data\\Default\\Cookies',
  login: '\\AppData\\Local\\BraveSoftware\\Brave-Browser\\User Data\\Default\\Login Data'
}

The variables above are the parameters passed into the function that retrieves the victim’s data (johanan).

Inside the johanan function, it first retrieves information about the browser passed as a parameter (delauren) via the funGetProfile function.

The information that funGetProfile retrieves includes the user’s directory path, the path of the browser’s user data, and the browser version. If the browser does not exist, it returns the default version, which is “108.0.0.0”.

Result of funGetProfile

The purpose of obtaining the path of the user data browser is to retrieve the encrypted_key from the ‘Local State’ file and decrypt it.

The demo output after decryption
‘local state’ file path

After decrypting the encrypted_key, it proceeds to kill all Chrome and Edge processes. At this point, I was not sure of the deeper purpose behind this action; it seemed to be more about causing disruption than serving a specific purpose.

Even within the loop, it’s the same.

After killing all Chrome and Edge processes, it starts using sqlite3 to retrieve cookies from the browser (e.g., the cookie file path for Google Chrome: C:\Users\<USER>\AppData\Local\Google\Chrome\User Data\Default\Network\Cookies). If it fails to retrieve the cookies, it kills all the processes again as before and retries. At this point, I understand that the purpose of killing the Chrome and Edge processes is to be able to read the browser’s cookies, presumably because a running browser keeps the cookie database locked.

But I still don’t know why only Chrome and Edge are killed, while the other browsers are left running.

After successfully opening the “Cookies” file with sqlite3, it executes a query to retrieve all cookies.

The format of each element in array cookies is as follows:

After obtaining all the cookies, it checks whether a Facebook cookie exists. If one is found, it retrieves all cookies for the domains [“*.facebook.com”, “*.google.com”, “*.live.com”] and decrypts the encrypted_value of each cookie with the AES algorithm, using the decrypted encrypted_key obtained from the “Local State” file. In each iteration of the for loop, after successfully decrypting the encrypted_value, it pushes that cookie into solange, with the encrypted_value field replaced by a value field that contains the decrypted data.

Following the same steps, the stealer uses sqlite3 to retrieve username and password information for domains like [“*.facebook.com”, “*.google.com”, “*.live.com”], and the password_value field is decrypted using the AES algorithm.

Next, it creates a zip file and adds all the files of the “nkbihfbeogaeaoehlefnkodbefgpgknn” extension to it. The file is named following the format: `<victim IP> + "-" + <current iteration count> + "-meta.zip"` (for example: 127.0.0.1-2-meta.zip). Once it has finished creating the compressed file containing the files of that extension, it sends it to the stealer’s author via a Telegram bot.

So now I have a question: what is the “nkbihfbeogaeaoehlefnkodbefgpgknn” extension?

I searched Google for `"nkbihfbeogaeaoehlefnkodbefgpgknn" extension` and found two noteworthy results.

At this point, I have discovered that it is MetaMask. You can look up what MetaMask is on your own; I won’t cover it in this article to avoid going off-topic.

Returning to the analysis: after the stealer collects the information (cookies, usernames and passwords, MetaMask extension data), the result is passed into a function called “getToken” (maran: current key of the loop, haelynn: only the Facebook cookies, solange: cookies, gerben: account information, ellynor.version: browser version, delauren: the passed parameter (JSON of browser paths), jocob: all decrypted passwords).

Inside getToken, the list of Facebook cookies is used to build a cookie string (raymundo), which is then placed in the header of the functions that send requests to Facebook. “ari” uses the POST method, while “quinnlee” uses the GET method.

Stealing the Facebook access token:

If stealing the access token fails or the access token is invalid, it sends a request to notify the server about the failure. Now I have another URL of the stealer’s author: server + "/image/XTISWcQyPrNgX4haIpvoexpgdqp7Oa3u3w9ZGE.png", which is used to push (update) the information stolen from the victim. After completing the notification, it ends the function with a return value of false.

The format of the data it sends is:

{
  ten: ,
  uid: ,
  email: ,
  birthday: ,
  location: ,
  fa: ,
  step: ,
  trinhDuyet: ,
  cookie: ,
  password: ,
  usergmail: ,
  passgmail: ,
  useroutlook: ,
  passoutlook: ,
  hanhDong: ,
  token: ,
  businesses: [],
  fanpage: [],
  maquocgia: ,
  ip: ,
  taiKhoanAds: [],
  cookieOutlook: ,
  cookieGg:
}

When it is about to send data back to the stealer author’s server, it uses a function named Encipher to encrypt the base64-encoded data and inserts the result into the Authorization header. I will briefly explain how this data encryption function works.

The Encipher function encrypts a string into a cipher string. It can be understood as the reverse of the decipher function (De), but that doesn't mean they use the same key (De uses "haha123444" while Encipher uses "haha123").

key Encipher: “haha123”, key De: “haha123444”

To demonstrate the relationship between Encipher and De, I have provided the following examples. The image on the left shows the result when both functions use different keys, while the image on the right shows the result when both functions use the same key. As you can see, when using the same key, the decrypted result matches the original string (“abc”).
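The same relationship can be reproduced with a toy cipher. To be clear, this is not the algorithm used by Encipher and De: I am swapping in a simple XOR with a SHA-256-derived keystream purely to show that decryption only recovers the input when both sides use the same key. The function names are made up; the two keys are the ones found in the sample.

```python
import hashlib
from itertools import cycle

def _keystream(key: str):
    """Endless stream of bytes derived from the key (toy construction, not the sample's)."""
    return cycle(hashlib.sha256(key.encode()).digest())

def toy_encipher(plain: str, key: str) -> str:
    """XOR the plaintext with the keystream and return the result as hex."""
    return bytes(b ^ k for b, k in zip(plain.encode(), _keystream(key))).hex()

def toy_decipher(cipher_hex: str, key: str) -> bytes:
    """XOR the hex-encoded ciphertext with the keystream again."""
    return bytes(b ^ k for b, k in zip(bytes.fromhex(cipher_hex), _keystream(key)))

cipher = toy_encipher("abc", "haha123")
same_key = toy_decipher(cipher, "haha123")       # recovers b'abc'
other_key = toy_decipher(cipher, "haha123444")   # does not recover the original
print(cipher, same_key, other_key)
```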

Continuing with the analysis, once it successfully obtains the access token, it will retrieve (fbid, userFullName), and then send this information (along with the Facebook access token) back to the stealer’s author server.

Continuing, it will also update the Outlook and Gmail tokens using the same method of request.

update gmail + outlook token

It sends a request to mbasic Facebook, gets information (email, location, birthday), and updates it on the stealer author’s server.

It uses the Facebook Graph API to retrieve the ad account limit for this Facebook account. By following this Facebook link, I discovered that a Facebook account can only have a maximum of 25 ad accounts. You can try the Facebook Graph API through https://developers.facebook.com/tools/explorer/.

Then it pushes the result to the server.

Getting and pushing the fanpage access token:

This part of the code is a bit long, but it mainly focuses on retrieving information about all payment methods associated with this Facebook account. Then, it proceeds to fetch detailed information about each payment method, including the administrator, account type, spending limit, time zone, invoice date, account threshold, account balance, payment card, etc. After collecting this information, it updates it on the stealer’s author server.

Finally, it will print the collected information to a file and send it to a Telegram group through the Telegram Bot API.

3/ Summary

The information that the stealer steals includes:

  1. Cookies from browsers like Chrome and Edge.
  2. Login information (username and password) from websites such as Facebook, Google, and Live.
  3. Data from the MetaMask extension.
  4. Phishing information related to credit card and payment methods.
  5. Data on ad account limits and payment methods from Facebook.
  6. Other sensitive data related to payment accounts, such as account limits, time zones, invoice creation dates, balance, and payment cards.

This data is collected stealthily from the victim’s system, then encrypted and sent to the stealer’s author through several channels, including the server and the Telegram Bot API. Details:

  • Url stealer’s server:
https://toimageai.top
https://toimageai.top/graphql
https://toimageai.top/image/XTISWcQyPrNgX4haIpvoexpgdqp7Oa3u3w9ZGE.png
https://toimageai.top/bk/map.txt
https://editorimage.info
https://editorimage.top/graphql
https://editorimage.top/image/XTISWcQyPrNgX4haIpvoexpgdqp7Oa3u3w9ZGE.png
https://editorimage.top/bk/map.txt
https://avatarcloud.top
https://avatarcloud.top/graphql
https://avatarcloud.top/image/XTISWcQyPrNgX4haIpvoexpgdqp7Oa3u3w9ZGE.png
https://avatarcloud.top/bk/map.txt
wss://heheimage.xyz/graphql
  • Bot telegram:
Token: 6096165622:AAHmqcRH4azKCvxg2uXNShyYrsnG6Xx-lnE
chatId: -694092486
chatId2: -929725005

Indicators for checking whether the stealer is present (check whether these paths exist):

  • %APPDATA%/../local/pdf.exe
  • %APPDATA%/../../zcxzcd.txt
  • %APPDATA%/../../bcbcvbsde.txt
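For convenience, here is a minimal Python sketch that checks for the three paths above. It assumes it is run on the Windows machine being inspected, with the APPDATA environment variable set for the user in question. The function name find_indicators is my own, and a hit is only a hint that needs further investigation.

```python
import os
from pathlib import Path

# Paths from the list above, relative to %APPDATA%
IOC_RELATIVE_PATHS = ["../local/pdf.exe", "../../zcxzcd.txt", "../../bcbcvbsde.txt"]

def find_indicators(appdata: str) -> list[Path]:
    """Return the indicator paths that exist, resolved relative to `appdata`."""
    hits = []
    for rel in IOC_RELATIVE_PATHS:
        candidate = Path(os.path.normpath(os.path.join(appdata, rel)))
        if candidate.exists():
            hits.append(candidate)
    return hits

appdata = os.environ.get("APPDATA")
if appdata:
    found = find_indicators(appdata)
    for hit in found:
        print("indicator found:", hit)
    if not found:
        print("no indicator paths found")
else:
    print("APPDATA is not set; run this on the Windows host to inspect")
```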
