Introduction to Web Application Penetration Testing

Through this writeup, I intend to explain, in simple words, how penetration testing of a web application is performed. I won’t go in depth into each and every aspect, but I will give you an overview and mention some of the tools used in the process. You will have to google the rest :P

Why is Web Application security necessary?

Imagine that you own a company which developed an app ‘A’. This app provides a unique feature which none of the other apps out there provide. It is bound to fetch you a good fortune, and it will. But to cut costs, you decide not to focus on the security aspect of the app. Later on, the app becomes a big hit; people are talking about it and it is making headlines. Yay.. :D Keep in mind that you were not careful with how your app handles data. One day you wake up and your app has made it to the main headlines again, but this time it reads,

“Data of Thousands of users of app ‘A’ put up for sale in the Dark Web”

This is a big slap in the face. Would this have happened if you had spent a little more time on enhancing the security of the app?

What else could have happened to you? Attackers could have taken advantage of functional loopholes in the application and caused you a huge financial loss, or they could have denied other users access to the application just for fun. Whatever the case may be, it leaves a bad mark on your app and costs you business.

I can go on, but I’m gonna stop right here and say, this is why security is necessary.

Steps of Web Application Penetration Testing

I’m trying to keep it simple here; the actual process is massive.

There are 3 steps for Penetration Testing:

  1. Planning and Reconnaissance.
  2. Identifying Vulnerabilities.
  3. Exploitation.

You should also document each step properly. This eliminates a lot of confusion and saves time, because you don’t have to repeat a step to see the result it produces; you can just refer to the document. Let’s dive into the interesting content. :D


1. Planning and Reconnaissance.

Planning

Before starting the test, make sure that you understand the web app very well. You should have a clear idea of what you intend to achieve through this penetration test. You should also be aware of what might crash the server, what might crash the app etc. If you are doing this for a client, they might want you to perform Black Box, White Box or Grey Box testing.

  • Black Box Testing: You test the application without the client giving you any credentials or inside knowledge of it, just like an outside attacker would.
  • White Box Testing: You test the application with full knowledge of the code underneath it. You are expected to analyse the source code and find vulnerabilities.
  • Grey Box Testing: You test the application with limited access provided by the client, such as login credentials for their backend software.

Reconnaissance

Reconnaissance is the very first step of penetration testing. Its passive form is also known as Open Source Intelligence (OSINT) gathering. In this step you are expected to gather all the information that you possibly can about the web application or the scope you are testing.

Reconnaissance is classified into two types:

  1. Passive Reconnaissance
  2. Active Reconnaissance

Passive Reconnaissance

In Passive Reconnaissance, you gather all the information without actively interacting with the target. Some of the methods of Passive Reconnaissance are mentioned below:

Finding out the technology of the target

It is important to know what technology the target system is using so that we can narrow down the number of tests that we need to perform on that target. For example, consider a web application that runs on Angular and Flask. While testing, we need not check whether it has vulnerabilities that are only present in PHP. (In short, the site ain’t PHP, so why would you wanna test for PHP vulnerabilities? Duh!)

We can find out the technology used by the web app with a simple browser extension called Wappalyzer. It is available for both Chrome and Firefox.

If you’re scared that these plugins might have backdoors in them, you can use a site called BuiltWith to extract the same information.
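Part of what such tools look at is plain HTTP response headers and cookie names. Here is a minimal sketch of that idea. The header names are real, but the small lookup tables and the function name `technology_hints` are my own illustrative assumptions and nowhere near a complete fingerprint database. Also note that fetching headers yourself means sending a request to the target, so do that only within your agreed scope.

```python
# Headers that commonly reveal the server or framework (illustrative subset).
HEADER_HINTS = ("server", "x-powered-by", "x-generator", "x-aspnet-version")

# Session cookie names and the technology they usually point to (assumed mapping).
COOKIE_HINTS = {
    "phpsessid": "PHP",
    "jsessionid": "Java servlet container",
    "asp.net_sessionid": "ASP.NET",
}


def technology_hints(headers):
    """Return a sorted list of technology hints found in response headers."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    hints = set()
    for name in HEADER_HINTS:
        if name in lowered:
            hints.add(f"{name}: {lowered[name]}")
    cookies = lowered.get("set-cookie", "").lower()
    for cookie_name, tech in COOKIE_HINTS.items():
        if cookie_name + "=" in cookies:
            hints.add(f"cookie suggests {tech}")
    return sorted(hints)
```

You would pass in the headers of any response you have already captured, for example `technology_hints({"Server": "nginx/1.18.0"})`. A missing header proves nothing, since many sites strip or fake these values.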

Finding out the sub-domains

There is no point in increasing your attacking skills without increasing your attack surface. Enumerating the sub-domains is one way of doing so.

We can enumerate the sub-domains of a web app through crt.sh.

Enter the domain for which you need to find out the sub-domains, and crt.sh will give you the result.
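If you prefer a script over the web page, the same lookup can be sketched in Python. This assumes that crt.sh returns JSON when `output=json` is added to the query and that each record carries the certificate's host names in a `name_value` field separated by newlines. That matches what the site returned when I last looked, but treat it as an assumption. The function names are mine.

```python
import json
import urllib.parse
import urllib.request


def crtsh_url(domain):
    """Build a crt.sh query URL for certificates issued under `domain`."""
    # "%" is the wildcard crt.sh uses; urlencode turns it into "%25".
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def extract_subdomains(records, domain):
    """Collect unique host names under `domain` from crt.sh-style records."""
    found = set()
    suffix = "." + domain.lower()
    for record in records:
        # One certificate can list several names, one per line.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name.endswith(suffix):
                found.add(name)
    return sorted(found)


def fetch_subdomains(domain, timeout=30):
    """Query crt.sh over the network and return the sub-domains it knows."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=timeout) as response:
        records = json.load(response)
    return extract_subdomains(records, domain)
```

Calling `fetch_subdomains("example.com")` needs network access. The parsing is kept in its own function so you can also feed it a response you saved earlier.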

Data from search engines

Search engines are powerful tools to find out information about a target. They crawl web pages and index data which might be hard to find while manually browsing the website. We can force the search engine to include particular types of files in the search results, like log files. This can be achieved by using Google Dorks. If you have used ‘index of /<series / movie name >’ to download TV series or movies, you have already used Google Dorks.

Many more Google Dorks can be found on Null Byte.

New Google Dorks are regularly added to the Google Hacking Database on exploit-db.com.
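A dork is nothing more than a search query built from operators such as `site:`, `filetype:`, `intitle:` and `inurl:`. As a small sketch, the helper below (the name `build_dork` and its parameters are my own) just glues those operators together so you can generate queries for every domain in your scope:

```python
def build_dork(site=None, filetype=None, intitle=None, inurl=None, text=None):
    """Combine common search operators into a single query string."""
    parts = []
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # only files with this extension
    if intitle:
        parts.append(f'intitle:"{intitle}"')  # phrase must be in the page title
    if inurl:
        parts.append(f"inurl:{inurl}")        # word must appear in the URL
    if text:
        parts.append(f'"{text}"')             # exact phrase anywhere on the page
    return " ".join(parts)
```

For example, `build_dork(site="example.com", filetype="log")` gives `site:example.com filetype:log`, which asks for indexed log files on that one domain.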

Data from social media websites

Social media websites are another great way to gather information about something or someone. This is a section I don’t need to say much about; everyone is an expert in this. :P

If you’re not one, here is some of the data that you should consider gathering:

  • Facebook - Date of Birth, Email ID, Hometown, First car etc.
  • LinkedIn - First Place of Work, Official Email ID etc.

Web archives

Did you know that once your website is web-facing, its pages may be archived and stored, and can be viewed later even if you take the site off the Internet? To make it clearer, did you know that you can look up how a website looked way back in the year 2000? And I’m not talking about googling “<website name> in 2000”. There’s another way.

Go to Web Archive and then search for the website you want to see an older version of.

How is this relevant here? Imagine that you had unknowingly typed your login credentials on your web app as a title (idk why I chose an example which is very unlikely, this sounded funny :P). You realise it and remove it in the next update. Now if I, as an attacker, have heard on some public forum that those credentials were there and want to see them, I can just go to one of those web archives and see if your page was saved.
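The archive lookup can also be scripted. The sketch below assumes the Internet Archive's Wayback availability endpoint (`https://archive.org/wayback/available`) and its JSON shape with an `archived_snapshots.closest` entry, as I understand them from its public documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def availability_url(site, timestamp="20000101"):
    """Build the lookup URL for the snapshot closest to `timestamp` (YYYYMMDD)."""
    query = urllib.parse.urlencode({"url": site, "timestamp": timestamp})
    return f"https://archive.org/wayback/available?{query}"


def closest_snapshot(payload):
    """Return (timestamp, url) of the closest snapshot, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(site, timestamp="20000101", timeout=30):
    """Ask the archive over the network for the snapshot nearest to `timestamp`."""
    url = availability_url(site, timestamp)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

`lookup("example.com")` needs network access and returns `None` when the page was never saved. Open the returned URL in a browser to see the old version of the page.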

Active Reconnaissance

In Active reconnaissance, we actively engage with the target to gather information.

Some of the methods are:

Port Scanning

No, not the USB port kind of thing. Although, now that I think of it, a USB port makes a good analogy. Suppose you were given a computer with lots of sensitive information and you wanted to copy that information and put it out into the world. You would look for a USB port so that you could plug in your pen drive and copy the data. Port scanning is somewhat the same. But instead of a physical port, you search for a logical port which exists at the software level. A port is an endpoint that a service on the server uses to communicate with the outside world. So if there are open ports on your server, they might attract some interested people. These ports can be discovered with a tool called Nmap.

Nmap can also be used for more than just discovering ports. Feel free to explore.
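To get a feel for what a port scanner does at its simplest, here is a minimal sketch of a TCP connect scan using only Python's standard `socket` module. It is not how Nmap is implemented (Nmap has many faster and stealthier scan types); it only shows the basic idea of "try to connect and see what happens". The function names are mine, and you should run this only against hosts you are authorised to test.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise,
        # instead of raising an exception for a refused connection.
        return sock.connect_ex((host, port)) == 0


def scan_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accepted a TCP connection."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

For example, `scan_ports("127.0.0.1", [22, 80, 443])` lists which of those three ports are listening on your own machine.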

Identifying Server OS

This is also one of the crucial steps of reconnaissance. Knowing the OS helps you determine whether the server has some OS-specific behaviour which you can take advantage of. Knowing the OS version lets you search for publicly disclosed vulnerabilities which could be used to compromise the server.

Web Application Scanning

In this method we scan the target to get more information. One of the main things to look for here is open directories. In the last-minute rush to complete the deployment of a web application, developers often forget to restrict access to other files on the server, which can then be accessed by explicitly typing their path in the address bar. A tool which can be used to find these is dirsearch.

You might be wondering what can be found. Sensitive files, admin portals, login pages not meant to be public... The list goes on.

Gentle reminder: what is the status of the notes that you were keeping? :P
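The idea behind such directory scanners can be sketched in a few lines: take a wordlist, request each candidate path, and keep the ones that do not come back as 404. This is a simplified illustration and not dirsearch's actual code. `http_status` and `probe_paths` are names I made up, and real tools add threading, extensions, recursion and smarter filtering.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code for `url`, or None if the request fails."""
    try:
        # urlopen follows redirects, so this is the status of the final response.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...


def probe_paths(base_url, words, get_status=http_status):
    """Return {url: status} for every candidate path that does not return 404."""
    hits = {}
    for word in words:
        url = base_url.rstrip("/") + "/" + word.lstrip("/")
        status = get_status(url)
        if status is not None and status != 404:
            hits[url] = status
    return hits
```

A 403 is still worth noting down: it tells you the path exists even though you are not allowed in. As always, only run this against targets in your scope.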


2. Identifying Vulnerabilities

Now, let’s assume that our recon has concluded. The next step is to find the vulnerabilities of the application. Vulnerabilities are weak points which an attacker can take advantage of. Some of the steps for identifying vulnerabilities are:

Searching for Vulnerabilities with software versions

Glad you kept notes. Now you can refer to them to see the different software and firmware that runs the application and its infrastructure, and then search for known vulnerabilities in them. One of the places where you can search for vulnerabilities is exploit-db. You can enter the software name and version to see if there are any known vulnerabilities.

Manually surf the web application

It is necessary to know how the web application functions. You should know every nook and corner of the web application that you are testing. Visit all the links. Use all the functionalities. Fill out text fields. Click all the buttons. Do everything that is possible to fully understand the application. If you remember, we did enumerate some subdomains and directories in our reconnaissance part. Now is the time to explore those. Explore functionalities, if any, in those pages too.

Notice the behaviour of the Web Application

Now that you are familiar with the web application, see if you can get the application to respond weirdly. If you enter values like ‘, ” or > and the web app returns an error, there you go, you triggered a weird action. Note these down.
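One way to make this less manual is to send each odd character and look for well-known error text in the response. The sketch below assumes a `send` callable that you supply yourself (it takes the test value and returns the response body as a string), so no particular HTTP library is implied. The short signature list is only an illustrative sample of database and stack-trace messages.

```python
# Characters that often break badly written input handling.
PROBES = ["'", '"', ">"]

# Lower-cased fragments of common error messages (illustrative, not complete).
ERROR_SIGNATURES = [
    "you have an error in your sql syntax",  # MySQL
    "unterminated quoted string",            # PostgreSQL
    "unclosed quotation mark",               # Microsoft SQL Server
    "traceback (most recent call last)",     # Python stack trace
]


def find_error_signatures(body):
    """Return the known error fragments that appear in a response body."""
    lowered = body.lower()
    return [signature for signature in ERROR_SIGNATURES if signature in lowered]


def odd_responses(send, probes=PROBES):
    """Map each probe value to the error signatures its response contained."""
    findings = {}
    for probe in probes:
        matches = find_error_signatures(send(probe))
        if matches:
            findings[probe] = matches
    return findings
```

An empty result does not mean the input is handled safely. Many apps hide their errors, so also compare status codes, response lengths and timing by hand.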

Two tools which can be used to analyse weird behaviour are:

  1. Burp Suite

Burp Suite is an application that sits between the client and the server as a proxy and captures the traffic they exchange. With the help of Burp Suite, you can:

  • Edit the parameters
  • Brute force login pages
  • Repeat requests
  • Decode some basic encodings

And much more. Extensions are available which take the usability to another level.
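To show what "decoding some basic encodings" means in practice, here is a small sketch that undoes two layers you will see constantly in captured requests: URL encoding on the outside and Base64 underneath. It uses only Python's standard library and is not tied to Burp in any way. The function name is mine.

```python
import base64
import urllib.parse


def decode_layers(value):
    """URL-decode `value`, then Base64-decode the result into text."""
    url_decoded = urllib.parse.unquote(value)  # "%3D" becomes "="
    # validate=True rejects characters outside the Base64 alphabet.
    raw = base64.b64decode(url_decoded, validate=True)
    return raw.decode("utf-8")
```

For example, a cookie value of `dXNlcj1hZG1pbg%3D%3D` decodes to `user=admin`. Spotting values like that is often the first hint that a parameter is worth tampering with.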

  2. Wireshark

Wireshark is not that relevant here. It is used to capture and analyse network traffic. If you’ve been told not to connect to public Wi-Fi and visit a site without HTTPS because hackers will steal your information, this is how they do it.

OWASP Top 10

OWASP (Open Web Application Security Project) is an international non-profit organisation which focuses on web application security. One of their main principles is that all of their materials are freely available. OWASP maintains a regularly updated list of the most critical vulnerabilities that might be present in web applications. These vulnerabilities can be a result of:

  • A developer’s mistake in coding a functionality
  • Poor judgement of the cryptography used
  • Improper handling of data

I’m stopping the list here.

Below are the OWASP Top 10 vulnerabilities (2017 edition, the latest one as of 2020):

  1. Injection.
  2. Broken Authentication.
  3. Sensitive Data Exposure.
  4. XML External Entities (XXE).
  5. Broken Access Control.
  6. Security Misconfiguration.
  7. Cross-Site Scripting (XSS).
  8. Insecure Deserialization.
  9. Using Components with Known Vulnerabilities.
  10. Insufficient Logging and Monitoring.

3. Exploitation

At last, we are here. The information that we gathered, the vulnerabilities that we identified and the notes that we kept were all meant to achieve this goal: to exploit. Exploitation is the process of taking advantage of the vulnerabilities to do some serious damage or to find out the extent to which a vulnerability can be misused. Exploitation might not be that easy. Sometimes, multiple vulnerabilities need to be chained together to perform an exploit. Perfect execution of these attacks might require a good amount of experience. Some of the tools which can be used for exploitation include:

  1. Burp Suite

Yes, Burp Suite can be used here also.

  2. SQLMap

There is a vulnerability called SQL Injection which allows a malicious user to directly interact with the database connected to the web application. The consequences? They can read everything in the database, which might contain details like user credentials, credit card information, contact details etc. of everyone who has registered on that website. (Data Breach Alert!!!!) Exploiting SQL Injection successfully takes a lot of trial and error. SQLMap helps us automate the process.
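To see why SQL Injection works at all, here is a tiny self-contained demo using Python's built-in `sqlite3` module and an in-memory database. The table, the users and the function names are invented for illustration. The vulnerable function pastes user input straight into the query text, while the safe one passes it as a parameter.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def find_user_vulnerable(conn, name):
    """DANGEROUS: the input becomes part of the SQL statement itself."""
    query = f"SELECT name, password FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    """The input is bound as data, so quotes in it cannot change the query."""
    query = "SELECT name, password FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

Feeding the payload `' OR '1'='1` to `find_user_vulnerable` turns the condition into `name = '' OR '1'='1'`, which is true for every row, so the whole table comes back. The same payload given to `find_user_safe` is treated as a literal (and non-existent) user name and returns nothing.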

  3. Metasploit

This is a dream tool for hackers. With thousands of scripts and exploits, Metasploit is surely one of the most dangerous tools if it falls into the wrong hands. Don’t know how to write scripts to exploit a particular vulnerability? Or don’t have time for that? Wanna build your own custom scripts? Fear not, Metasploit has got you covered. It can be used to create backdoors, malicious payloads etc. It can also be used in the reconnaissance stage, because it can run port scans, banner grabbing etc.

  4. Hydra

There are many password-cracking techniques. Brute-forcing with a list of common passwords (yes, even in 2020 people use passwords like ‘qwerty123’) is one of them. But these lists contain thousands, or even millions, of candidates. Manually trying all of those on a login page is impossible. That’s where Hydra comes into play. It is one of the fastest login-cracking tools out there.
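The core loop of a dictionary attack is short enough to sketch here. This is not how Hydra itself is written (Hydra speaks many network protocols and runs attempts in parallel); it only shows the idea. `try_login` is a hypothetical callable that you would implement for the login you are authorised to test, returning True when the credentials are accepted.

```python
def dictionary_attack(try_login, username, candidates, max_attempts=None):
    """Try each candidate password in order.

    Returns (password, attempts) for the first accepted password,
    or (None, attempts) if nothing worked or the attempt limit was hit.
    """
    attempts = 0
    for password in candidates:
        if max_attempts is not None and attempts >= max_attempts:
            break  # stop early, e.g. to stay below an account lockout limit
        attempts += 1
        if try_login(username, password):
            return password, attempts
    return None, attempts
```

The `max_attempts` cap is there because real login pages often lock an account after a few failures, and locking out a client's users is not a finding anyone will thank you for.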


Documentation

No one likes this part. But this is another crucial step because your findings need to be understood by the developers in order to patch them. The notes will help you here as well. :)


Conclusion

This is a skill which cannot be acquired overnight. It takes a good amount of dedication and hard work. The testing is not limited to what I mentioned in this writeup. Keep yourself updated about new methodologies to find bugs. I hope this article served its purpose.

Thank you. :)

(I ought to mention. Articles are polished by Praveen G Anand)