Hello and welcome again!
In today’s blog we will be discussing the following topics when it comes to mapping applications:
- What is mapping?
- What information is important to gather?
- Techniques and tools to use for your mapping
What is Mapping?
When you want to attack another server or machine, it is vital to do some recon and gather information about who and what you are attacking. Let's think about it from a logical standpoint: you get all your tools and goodies set up to hack this database server, but there's one problem (actually many problems, but let's just say there's one): you don't know what the hell you're attacking, so how are you supposed to know which techniques and tools to use to exploit that machine's vulnerabilities? This is why, once you identify a target, you need to map the machine before you actually start attacking. Mapping is the act of extracting information about an application or machine to help you better understand what software it runs, what ports are open, what functionality the application performs, what security headers it uses, and so on.
The first step when attacking a machine is mapping out the entire machine/application to the best of your ability: the more information you find out about it, the more avenues you open up to gain access to the machine.
Techniques for Mapping
Web Spidering
Web spiders essentially crawl websites to map out the entire site. For example, Google uses many web spiders to crawl as many websites as possible to add more information to its search engine. Web spiders follow any link they can and input random data into forms to get feedback from the server until no new content is discovered. We can use automated web crawlers to map out the site for us (a minimal sketch of one follows the list below), but they come with many limitations:
- Sometimes they miss whole areas of an application that are not standardized. Ex: the crawler may miss a highly customized menu pane because of its custom functionality.
- Links that are buried within client-side objects may not be picked up.
- When a spider tries to submit a string into a form, it may input an invalid value that the application rejects. Ex: it may enter a random string in an email field that is obviously not in the format of an email address, and the website will respond with its usual validation error instead of new content.
- If you are in an authenticated session, the spider will eventually request the logout function and log itself out.
- They can get stuck in infinite loops.
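For a concrete picture of what an automated spider actually does, here is a minimal sketch using only Python's standard library. It only follows anchor links on a single host, so it runs into exactly the limitations listed above; the target URL in the usage comment is a made-up placeholder.

```python
# Minimal breadth-first crawler sketch (standard library only).
# Illustrative only: a real spider also handles forms, robots.txt,
# rate limiting, and the limitations described above.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=50):
    """Breadth-first crawl of pages on the same host as start_url."""
    host = urlparse(start_url).netloc
    seen, queue = {start_url}, deque([start_url])
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except Exception:
            continue  # unreachable page or non-HTML resource
        print(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            # Stay on the target host and avoid revisiting pages
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen

# Example (placeholder target): crawl("http://testsite.local/")
```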
As we can see, an automated spider can help us identify an application's functionality, but it comes with the limitations listed above. Another technique, user-directed spidering, is exactly what it sounds like: a hybrid approach where the tool automates some of the work while the user drives the browsing and exercises the functionality. This technique is more thorough and largely solves the limitations of a fully automated spider, because you analyze every HTTP request going in and out and can pick up on key clues that an automated spider would miss.
Tip:
In order to analyze the HTTP requests, you need an intercepting proxy that captures them. I personally use Burp Suite Free Edition to do this.
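If you want to see what this looks like programmatically, below is a small sketch (using the third-party requests library) that routes traffic through a local intercepting proxy so each request shows up in its HTTP history. The 127.0.0.1:8080 address is Burp's default listener; the target URL is a made-up placeholder.

```python
# Sketch: send a request through a local intercepting proxy (e.g. Burp's
# default listener on 127.0.0.1:8080) so it appears in the proxy history.
import requests

proxies = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}

# verify=False only matters for HTTPS targets, where the proxy re-signs
# TLS with its own CA certificate.
resp = requests.get("http://testsite.local/", proxies=proxies,
                    verify=False, timeout=10)
print(resp.status_code, resp.headers.get("Server"))
```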
Brute-Force Techniques
Brute forcing is a trial-and-error process we use to uncover any information that is useful to us. After we use a spider to build the site map, we can brute-force different aspects of the website to reveal hidden information such as:
- Login credentials
- Hidden directories
- Hidden URL parameters (which partially overlaps hidden directories)
Burp Intruder, part of the Burp Suite platform I mentioned earlier, is a brute-forcing tool that takes in a word list (just like in a dictionary attack), sends the server numerous requests, and lets you analyze the responses. Depending on the response, you can pick up clues that lead you to hidden directories and parameters; a small scripted sketch of the same idea follows the list below. The following response codes are what the server may send back once you make a request:
- 302 Found – a redirect response. Usually your request gets redirected when you try to access an authenticated area (one you can only reach while logged into the system). Definitely look into this if you get that code.
- 400 Bad Request – the application expects a certain syntax or word and the word list you used does not comply with it. Try different word lists and experiment with what works and what doesn't.
- 401 Unauthorized and 403 Forbidden – a flag that the resource exists but access is denied, either because you are not authenticated (401) or because you are not allowed to access it (403).
- 500 Internal Server Error – this can indicate that the web application is expecting parameters you did not supply.
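To show the idea outside of Burp Intruder, here is a rough sketch of a word-list directory brute-forcer that flags the status codes above. It uses the third-party requests library; the base URL and word-list file name are placeholders.

```python
import requests

# Response codes worth a second look (see the list above)
INTERESTING = {
    200: "exists",
    302: "redirect (authenticated area?)",
    401: "unauthorized",
    403: "forbidden",
    500: "server error (missing parameters?)",
}

def brute_dirs(base_url, wordlist_path):
    """Request base_url/<word> for every word and report non-404 responses."""
    with open(wordlist_path) as f:
        words = [line.strip() for line in f if line.strip()]
    found = []
    for word in words:
        url = f"{base_url.rstrip('/')}/{word}"
        try:
            resp = requests.get(url, allow_redirects=False, timeout=5)
        except requests.RequestException:
            continue  # host unreachable or request timed out
        if resp.status_code != 404:
            hint = INTERESTING.get(resp.status_code, "look closer")
            print(f"{resp.status_code}  {url}  ({hint})")
            found.append((url, resp.status_code))
    return found

# Example (placeholders): brute_dirs("http://testsite.local", "common-dirs.txt")
```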
Another tool that specifically brute-forces directories is DirBuster. I prefer it over the free edition of Burp Suite because you can adjust the number of threads to use. Just know that the more threads you use, the quicker any brute-forcing tool will be (as long as you have the computing power to support it). This tool comes with Kali Linux.
Steps for mapping with spidering and brute-forcing:
1. Use your proxy to crawl the website (user-guided) and obtain as much information as you can
2. Analyze the naming scheme of the website. Meaning, if a website's directories are all lowercase (ex. /prepare/anywhere), find a word list that matches the scheme they implement. Organized developers follow a particular naming scheme across most, if not all, of their application.
3. After you have found a word list that matches the naming scheme, start brute-forcing the website. For any directories/subdirectories you find, re-run the brute-force inside them. Most tools have some sort of "recursive brute-force" option that does this automatically without you having to start a new session (a rough sketch of the idea follows this list).
4. Investigate your findings further: modify and send custom requests to the directories you found and see what you get back as a response. Manipulating requests and seeing how the server reacts gives you clues about what logic it uses and what software it may be running, especially when you get verbose error messages (we will talk more about this in just a jiffy).
5. Brute-force with other naming schemes as well, because there is always a chance that multiple schemes are used on different pages of the website.
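Here is a rough sketch of the "recursive brute-force" idea from step 3, with simple threading added since thread count matters for speed (as mentioned for DirBuster). It reuses the same probing approach as the earlier sketch and the requests library; the target and word list are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import requests

def probe(url):
    """Return (url, status) if the server answers with anything other than 404."""
    try:
        resp = requests.get(url, allow_redirects=False, timeout=5)
        if resp.status_code != 404:
            return url, resp.status_code
    except requests.RequestException:
        pass
    return None

def recursive_brute(base_url, words, depth=2, threads=10):
    """Brute-force base_url, then brute-force every path that responded."""
    to_scan, found = [base_url.rstrip("/")], []
    for _ in range(depth):
        candidates = [f"{d}/{w}" for d in to_scan for w in words]
        with ThreadPoolExecutor(max_workers=threads) as pool:
            hits = [hit for hit in pool.map(probe, candidates) if hit]
        found.extend(hits)
        # Only recurse into paths that actually responded
        to_scan = [url for url, _ in hits]
    return found

# Example (placeholders):
# recursive_brute("http://testsite.local", ["admin", "backup", "config"])
```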
Public Information – Your Best Friend
There are a few methods of finding public (and sometimes private, but publicly exposed) information on your victim:
- Search engines – I know I don't need to explain this; just know that you can find a lot of information with Google's advanced search operators (a few examples follow this list).
- Web archives – sites such as the Internet Archive's Wayback Machine keep older versions of websites going back several years; these snapshots may reveal important information about the website you are attacking.
- Forums – you can find developers of the victim site posting questions on coding forums about how they went about creating certain functionality. For example, say I build an authentication feature on my web app and ask on a coding forum, "How should I securely transport the login credentials to the server once the user has entered them?", and someone suggests that flagging those parameters as hidden is all I need to do. If I go that route, a hacker viewing that thread can exploit the vulnerabilities that come with how I just programmed it. You can also get contact information for these developers and use it to your advantage.
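To make the first two bullets concrete, here are a few example search operators plus a small sketch that pulls archived URLs for a domain from the Wayback Machine's CDX API. The operators and the API endpoint are written as I understand them; target.example is a placeholder domain.

```python
# A few Google advanced-search operators worth trying (typed into the search box):
#   site:target.example                -- limit results to the target domain
#   site:target.example filetype:pdf   -- indexed documents that may leak details
#   site:target.example inurl:admin    -- interesting paths the search engine picked up
#
# Sketch: list archived URLs for a domain via the Wayback Machine CDX API.
# Endpoint and parameters are to the best of my knowledge.
import json
from urllib.request import urlopen

def wayback_urls(domain, limit=50):
    api = (f"http://web.archive.org/cdx/search/cdx?url={domain}/*"
           f"&output=json&collapse=urlkey&limit={limit}")
    rows = json.load(urlopen(api, timeout=10))
    # The first row is the header (urlkey, timestamp, original, ...)
    return [row[2] for row in rows[1:]]

# Example (placeholder domain): print("\n".join(wayback_urls("target.example")))
```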
Conclusion
Mapping is one of the most important steps when pen-testing a machine. It is the process of gathering information about an application or machine; the more info you have on it, the easier it is to attack. In my opinion, it should be done up front and exhaustively to rule out any possibility of letting something slip by. In the next blog, we will discuss another set of techniques for further mapping out the application.
I will catch you all next week! Stay tuned!
Peter