Web applications and web servers are critical to our online presence, and attacks against them constitute more than 70% of the total attacks observed on the Internet. These attacks attempt to turn trusted websites into malicious ones. For this reason, web server and web application pen testing plays an important role.
Why do we need to consider the safety of web servers? With the rapid growth of the e-commerce industry, web servers have become a prime target for attackers. For web server pentesting, we must know about the web server, its hosting software and operating system, and the applications running on it. Gathering such information about a web server is called footprinting of the web server.
In the subsequent sections, we will discuss different methods for footprinting a web server.
Web servers are server software or hardware dedicated to handling requests and serving responses. This is a key area for a pentester to focus on while doing penetration testing of web servers.
Let us now discuss a few methods, implemented in Python, which can be used for footprinting a web server −
A very good practice for a penetration tester is to start by listing the various available HTTP methods. Following is a Python script with the help of which we can connect to the target web server and enumerate the available HTTP methods −
To begin with, we need to import the requests library −
import requests
After importing the requests library, create a list of the HTTP methods we are going to send. We will make use of some standard methods like 'GET', 'POST', 'PUT', 'DELETE', 'OPTIONS' and 'TRACE', along with a non-standard method 'TEST' to check how a web server handles unexpected input.
method_list = ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS', 'TRACE', 'TEST']
The following lines of code form the main loop of the script, which will send the HTTP requests to the web server and print the method and the status code.
for method in method_list:
   req = requests.request(method, 'Enter the URL')
   print(method, req.status_code, req.reason)
The next lines, placed inside the loop, will test for the possibility of cross site tracing (XST) whenever the TRACE method is sent.
   if method == 'TRACE' and 'TRACE / HTTP/1.1' in req.text:
      print('Cross Site Tracing(XST) is possible')
After running the above script against a particular web server, we will get a 200 OK response for each method the web server accepts. We will get a 403 Forbidden or 405 Method Not Allowed response if the web server explicitly denies a method. If the server permits the TRACE method and echoes the request back in its response, the script prints the message 'Cross Site Tracing(XST) is possible'.
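Assembled into a single runnable sketch, the script might look like this (http://example.com is a placeholder target; substitute a host you are authorized to test) −

import requests

# Placeholder target; replace with a host you are authorized to test
target_url = 'http://example.com'
method_list = ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS', 'TRACE', 'TEST']

for method in method_list:
   # Send each HTTP method and report the server's status code
   req = requests.request(method, target_url)
   print(method, req.status_code, req.reason)
   # A TRACE response that echoes the request suggests XST is possible
   if method == 'TRACE' and 'TRACE / HTTP/1.1' in req.text:
      print('Cross Site Tracing(XST) is possible')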
HTTP headers are found in both requests and responses from the web server. They also carry very important information about servers. That is why a penetration tester is always interested in parsing information from HTTP headers. Following is a Python script for getting information about the headers of the web server −
To begin with, let us import the requests library −
import requests
We need to send a GET request to the web server. The following line of code makes a simple GET request through the requests library.
request = requests.get('enter the URL')
Next, we will generate a list of the headers whose values we want to inspect.
header_list = ['Server', 'Date', 'Via', 'X-Powered-By', 'X-Country-Code', 'Connection', 'Content-Length']
Next is a loop with a try and except block, which reads each header from request.headers −
for header in header_list:
   try:
      result = request.headers[header]
      print('%s: %s' % (header, result))
   except KeyError:
      print('%s: No Details Found' % header)
After running the above script for a particular web server, we will get the information about the headers provided in the header list. If there is no information for a particular header, it will give the message 'No Details Found'. You can also learn more about HTTP header fields from the link — https://www.howcodex.com/http/http_header_fields.htm.
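As a side note, requests stores response headers in a case-insensitive dictionary, so the lookup could also be written with .get() to avoid the exception entirely; a small sketch −

# headers is case-insensitive; .get() returns None for a missing field
result = request.headers.get('server')
print('Server: %s' % (result if result else 'No Details Found'))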
We can use HTTP header information to test for insecure web server configurations. In the following Python script, we are going to use try/except blocks to test for insecure web server headers for a number of URLs saved in a text file named websites.txt −
import requests

urls = open("websites.txt", "r")
for url in urls:
   url = url.strip()
   req = requests.get(url)
   print(url, 'report:')
   try:
      protection_xss = req.headers['X-XSS-Protection']
      if protection_xss != '1; mode=block':
         print('X-XSS-Protection not set properly, XSS may be possible:', protection_xss)
   except KeyError:
      print('X-XSS-Protection not set, XSS may be possible')
   try:
      options_content_type = req.headers['X-Content-Type-Options']
      if options_content_type != 'nosniff':
         print('X-Content-Type-Options not set properly:', options_content_type)
   except KeyError:
      print('X-Content-Type-Options not set')
   try:
      transport_security = req.headers['Strict-Transport-Security']
   except KeyError:
      print('HSTS header not set, man-in-the-middle attacks may be possible')
   try:
      content_security = req.headers['Content-Security-Policy']
      print('Content-Security-Policy set:', content_security)
   except KeyError:
      print('Content-Security-Policy missing')
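For reference, websites.txt is expected to contain one URL per line; a hypothetical example of its contents −

https://example.com
https://example.org

Each URL will be fetched in turn, and a report of any missing or misconfigured security headers will be printed.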
In our previous section, we discussed footprinting of a web server. Similarly, footprinting of a web application is also considered important from the point of view of a penetration tester.
In the subsequent sections, we will learn about the different methods for footprinting a web application.
A web application is a client-server program in which the client part runs in a web browser and the server part runs on a web server. This is another key area for a pentester to focus on while doing penetration testing of web applications.
Let us now discuss the different methods, implemented in Python, which can be used for footprinting of a web application −
Suppose we want to collect all the hyperlinks from a web page; we can make use of a parser called BeautifulSoup. It is a Python library for pulling data out of HTML and XML files. It is used together with urllib because it needs an input (a document or a URL) to create a soup object, as it cannot fetch a web page by itself.
To begin with, let us import the necessary packages. We will import urllib and BeautifulSoup. Remember that before importing BeautifulSoup, we need to install it (for example, with pip install beautifulsoup4).
import urllib.request
from bs4 import BeautifulSoup
The Python script given below will gather the title of the web page and its hyperlinks −
Now, we need a variable to store the URL of the website; here, we will use a variable named 'url'. We will fetch the page with urlopen() and use the page.read() method to store the web page's contents in the variable html_page.
url = input("Enter the URL ")
page = urllib.request.urlopen(url)
html_page = page.read()
The html_page is then passed as input to create the soup object, with the 'html.parser' backend specified explicitly −
soup_object = BeautifulSoup(html_page, 'html.parser')
The following two lines will print the title with tags and without tags, respectively.
print(soup_object.title)
print(soup_object.title.text)
The lines of code shown below will print all the hyperlinks on the page.
for link in soup_object.find_all('a'):
   print(link.get('href'))
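Many href values are relative paths; if absolute URLs are needed, they can be resolved against the base URL with urljoin. A small sketch, assuming the url and soup_object variables from the script above −

from urllib.parse import urljoin

for link in soup_object.find_all('a'):
   href = link.get('href')
   if href:
      # Resolve relative paths such as /about against the base URL
      print(urljoin(url, href))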
A banner is like a text message that contains information about the server, and banner grabbing is the process of fetching the information that the banner provides. The banner comes from the headers of the response the server sends: when a client connects to a port and sends a request, the server replies with headers that contain information about the server software.
The following Python script helps grab the banner using socket programming −
import socket

def grab_banner(sock, host):
   try:
      # Send a minimal HTTP request so the server replies with its headers
      request = 'GET / HTTP/1.1\r\nHost: {}\r\n\r\n'.format(host)
      sock.send(request.encode())
      ret = sock.recv(1024)
      print('[+] ' + ret.decode(errors='replace'))
   except Exception as error:
      print('[-] No information grabbed: ' + str(error))

targethost = input("Enter the host name: ")
targetport = int(input("Enter Port: "))
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((targethost, targetport))
grab_banner(s, targethost)
s.close()
After running the above script, we will get a similar kind of header information to what we got from the HTTP header footprinting script in the previous section.
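As a cross-check, a similar banner can often be retrieved with Python's built-in http.client module, which builds the HTTP request for us; a minimal sketch, assuming example.com as a placeholder host −

import http.client

# Placeholder host; replace with a server you are authorized to probe
conn = http.client.HTTPConnection('example.com', 80, timeout=5)
conn.request('HEAD', '/')
response = conn.getresponse()
# The Server header is the most commonly grabbed banner field
print('Server banner:', response.getheader('Server'))
conn.close()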