Let us discuss some terms commonly used with regard to the Internet.
WWW is the acronym for World Wide Web. WWW is an information space inhabited by interlinked documents and other media that can be accessed via the Internet. It was invented by British scientist Tim Berners-Lee in 1989; he developed the first web browser in 1990 to facilitate the exchange of information through interlinked hypertexts.
Text that contains a link to another piece of text is called hypertext. Each web resource is identified by a unique name called a URL to avoid confusion.
World Wide Web has revolutionized the way we create, store and exchange information. Success of WWW can be attributed to these factors −
HTML stands for Hypertext Markup Language. A markup language is one in which parts of a text are marked to specify their structure, layout and style in the context of the whole page. Its primary function is defining, processing and presenting text.
HTML is the standard language for creating web pages and web applications that are loaded in web browsers. Like the WWW, it was created by Tim Berners-Lee, to let users reach other pages easily from any page.
When you send a request for a page, the web server sends back a file in HTML form. The web browser interprets this HTML file and displays it.
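To make "interpreting an HTML file" a little more concrete, here is a minimal Python sketch that reads the markup and recovers the structure, text and links it describes. A real browser does far more (layout, styling, scripting); the page content and the names `PAGE`, `OutlineParser` and `outline` are made up for illustration.

```python
from html.parser import HTMLParser

# A tiny, made-up page used only for illustration.
PAGE = (
    "<html><body><h1>Magazines</h1>"
    '<p>See the <a href="catalog.htm">catalog</a>.</p>'
    "</body></html>"
)


class OutlineParser(HTMLParser):
    """Collects the tags, text and links found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = []    # opening tags, in document order
        self.text = []    # non-empty text fragments
        self.links = []   # href values of <a> elements (the hypertext links)

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self.links.extend(value for name, value in attrs if name == "href")

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def outline(html_source):
    """Return (tags, text, links) for the given HTML source."""
    parser = OutlineParser()
    parser.feed(html_source)
    parser.close()
    return parser.tags, parser.text, parser.links


if __name__ == "__main__":
    tags, text, links = outline(PAGE)
    print("Structure:", tags)
    print("Text:", text)
    print("Links:", links)
```

The markup (tags) carries the structure, while the text between the tags is what the reader eventually sees; the `href` attribute is what turns plain text into hypertext.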
XML stands for eXtensible Markup Language. It is a markup language designed to store and transport data in a structured, self-describing form that both people and programs can read. As the word extensible indicates, XML lets users define their own tags, and thus their own markup language, for the data they want to describe.
Any XML document has two parts – structure and content. Let’s take an example to understand this. Suppose your school library wants to create a database of magazines it subscribes to. This is the CATALOG XML file that needs to be created.
<CATALOG>
   <MAGAZINE>
      <TITLE>Magic Pot</TITLE>
      <PUBLISHER>MM Publications</PUBLISHER>
      <FREQUENCY>Weekly</FREQUENCY>
      <PRICE>15</PRICE>
   </MAGAZINE>
   <MAGAZINE>
      <TITLE>Competition Refresher</TITLE>
      <PUBLISHER>Bright Publications</PUBLISHER>
      <FREQUENCY>Monthly</FREQUENCY>
      <PRICE>100</PRICE>
   </MAGAZINE>
</CATALOG>
Each magazine has title, publisher, frequency and price information stored about it. This is the structure of the catalog. Values like Magic Pot, MM Publications, Monthly, Weekly, etc. are the content.
This XML file has information about all the magazines available in the library. Remember that this file will not do anything on its own. But another piece of code can be easily written to extract, analyze and present data stored here.
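As an illustration of such a piece of code, the following Python sketch uses the standard `xml.etree.ElementTree` module to extract, analyze and present the catalog data. It assumes the catalog is well-formed (every opening tag has a matching closing tag, which a parser insists on) and that prices are whole numbers; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The library catalog from the example, written with matching closing tags.
CATALOG_XML = """<CATALOG>
  <MAGAZINE>
    <TITLE>Magic Pot</TITLE>
    <PUBLISHER>MM Publications</PUBLISHER>
    <FREQUENCY>Weekly</FREQUENCY>
    <PRICE>15</PRICE>
  </MAGAZINE>
  <MAGAZINE>
    <TITLE>Competition Refresher</TITLE>
    <PUBLISHER>Bright Publications</PUBLISHER>
    <FREQUENCY>Monthly</FREQUENCY>
    <PRICE>100</PRICE>
  </MAGAZINE>
</CATALOG>"""


def load_magazines(xml_text):
    """Extract: turn each MAGAZINE element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    magazines = []
    for node in root.findall("MAGAZINE"):
        magazines.append({
            "title": node.findtext("TITLE"),
            "publisher": node.findtext("PUBLISHER"),
            "frequency": node.findtext("FREQUENCY"),
            "price": int(node.findtext("PRICE")),  # assumes integer prices
        })
    return magazines


def total_price(magazines):
    """Analyze: add up the price of one issue of every magazine."""
    return sum(m["price"] for m in magazines)


def titles_by_frequency(magazines, frequency):
    """Analyze: list the titles published at the given frequency."""
    return [m["title"] for m in magazines if m["frequency"] == frequency]


if __name__ == "__main__":
    # Present: print one line per magazine, then a summary.
    catalog = load_magazines(CATALOG_XML)
    for m in catalog:
        print(f'{m["title"]} ({m["publisher"]}), {m["frequency"]}, price {m["price"]}')
    print("Total for one issue of each:", total_price(catalog))
```

The tag names give the structure the code navigates by, and the values between the tags are the content it reads out.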
HTTP stands for Hypertext Transfer Protocol. It is the fundamental protocol used for transferring text, graphics, images, video and other multimedia files on the World Wide Web. HTTP is an application layer protocol of the TCP/IP suite, follows the client-server networking model, and was first outlined by Tim Berners-Lee, the father of the World Wide Web.
HTTP is a request-response protocol. Here is how it functions −
The client prepares an HTTP request for the server.
A TCP connection is established with the server, and the request is sent over it.
After the necessary processing, the server sends back a status code along with a message. The message may contain the requested content or an error message.
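The request-response cycle described above can be sketched with Python's standard library. This is only an illustration under stated assumptions: the server is a toy that runs on the local machine, `DemoHandler`, `fetch` and `run_demo` are hypothetical names, and the page content is made up.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Video tutorials</h1></body></html>"
NOT_FOUND = b"<html><body>Not found</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Toy server side: serves one page, answers 404 for anything else."""

    def do_GET(self):
        if self.path == "/videotutorials/index.htm":
            status, body = 200, PAGE
        else:
            status, body = 404, NOT_FOUND
        self.send_response(status)                 # status line
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                     # message body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    """Client side: open a TCP connection, send GET, read the response."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)      # connects, then sends the request
        response = conn.getresponse()  # status code plus message
        return response.status, response.read()
    finally:
        conn.close()


def run_demo():
    """Start the toy server, make two requests, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        found = fetch(port, "/videotutorials/index.htm")
        missing = fetch(port, "/missing.htm")
        return found, missing
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for status, body in run_demo():
        print(status, body.decode())
```

The first request returns the requested content with a success status, and the second returns an error message with status 404, matching the two outcomes mentioned in the last step.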
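An HTTP request specifies a method, which tells the server what action to perform on the resource. Some of the most popular methods are GET, PUT, POST, CONNECT, etc. Methods that only retrieve information and are not expected to change anything on the server, such as GET, are called safe methods, while others, such as PUT and POST, are called unsafe. HTTPS is the secure version of HTTP, where S stands for secure. Here the whole exchange, whatever the method, travels over an encrypted connection.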
An example of use of HTTP protocol is −
https://www.howcodex.com/videotutorials/index.htm
The user is requesting (by clicking on a link) the index page of video tutorials on the howcodex.com website. Other parts of the request are discussed later in the chapter.
A domain name is a unique name given to a server to identify it on the World Wide Web. In the example request given earlier −
https://www.howcodex.com/videotutorials/index.htm
howcodex.com is the domain name. A domain name has multiple parts called labels, separated by dots. Let us discuss the labels of this domain name. The rightmost label, .com, is called the top level domain (TLD). Other examples of TLDs include .net, .org, .co, .au, etc.
The label to the left of the TLD, i.e. howcodex, is the second level domain. In a domain name ending in .co.uk, the .co label is the second level domain and .uk is the TLD. www is simply a label used to create a subdomain of howcodex.com. Another label could be ftp, to create the subdomain ftp.howcodex.com.
This logical tree structure of domain names, starting from the top level domain and moving down to lower level domain names, is called the domain name hierarchy. The root of the domain name hierarchy is nameless. The maximum length of a complete domain name is 253 ASCII characters.
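The way a domain name breaks down into labels can be shown with a few lines of Python. This is a minimal sketch that reads labels purely by position, from right to left, as described above; `split_domain` is a hypothetical helper, not part of any library.

```python
MAX_NAME_LENGTH = 253  # maximum length of a complete domain name


def split_domain(name):
    """Split a domain name into its labels, read from right to left."""
    name = name.rstrip(".")  # a trailing dot stands for the nameless root
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError("domain name must be 1 to 253 characters long")
    labels = name.split(".")
    if any(label == "" for label in labels):
        raise ValueError("labels must not be empty")
    return {
        "labels": labels,
        "tld": labels[-1],                                    # rightmost label
        "second_level": labels[-2] if len(labels) >= 2 else None,
        "subdomain_labels": labels[:-2],                      # e.g. www, ftp
    }


if __name__ == "__main__":
    for name in ("www.howcodex.com", "ftp.howcodex.com"):
        parts = split_domain(name)
        print(name, "->", parts["tld"], parts["second_level"], parts["subdomain_labels"])
```

Because the split is positional, a name ending in .co.uk yields uk as the TLD and co as the second level domain, in line with the description above.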
URL stands for Uniform Resource Locator. A URL refers to the location of a web resource on a computer network and the mechanism for retrieving it. Let us continue with the above example −
https://www.howcodex.com/videotutorials/index.htm
This complete string is a URL. Let’s discuss its parts −
index.htm is the resource (web page in this case) that needs to be retrieved
www.howcodex.com is the server on which this page is located
videotutorials is the folder on server where the resource is located
www.howcodex.com/videotutorials is the complete location (server plus folder) of the resource; the path part of the URL itself is /videotutorials/index.htm
https is the protocol to be used to retrieve the resource
URL is displayed in the address bar of the web browser.
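The same breakdown can be obtained programmatically. The sketch below uses `urlsplit` from Python's standard `urllib.parse` module; the `describe` helper and the key names in its result are chosen here for illustration only.

```python
import posixpath
from urllib.parse import urlsplit


def describe(url):
    """Break a URL into the parts discussed above."""
    parts = urlsplit(url)
    # Split the path into the folder and the final resource name.
    folder, resource = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,   # how to retrieve the resource
        "server": parts.netloc,     # where the resource lives
        "path": parts.path,         # folder plus resource name
        "folder": folder,
        "resource": resource,
    }


if __name__ == "__main__":
    info = describe("https://www.howcodex.com/videotutorials/index.htm")
    for key, value in info.items():
        print(f"{key}: {value}")
```

Note that `urlsplit` only analyzes the text of the URL; it does not contact the server.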
A website is a set of web pages under a single domain name. A web page is a text document located on a server and connected to the World Wide Web through hypertext links. Using the image depicting domain name hierarchy, these are the websites that can be constructed −
Note that there is no protocol associated with websites 3 and 4 but they will still load, using their default protocol.
A web browser is application software for accessing, retrieving, presenting and traversing any resource identified by a URL on the World Wide Web. Most popular web browsers include −
A web server is any software application, computer or networked device that serves files to users on request. Client devices send these requests over HTTP or HTTPS. Popular web server software includes Apache, Microsoft IIS, and Nginx.
Web hosting is an Internet service that enables individuals, organizations or businesses to store web pages that can be accessed on the Internet. Web hosting service providers have web servers on which they host web sites and their pages. They also provide the technologies necessary for making a web page available upon client request, as discussed in HTTP above.
A script is a set of instructions written in a programming language and interpreted (rather than compiled) by another program. Embedding scripts within web pages to make them dynamic is called web scripting.
As you know, web pages are created using HTML, stored on the server and then loaded into web browsers upon client’s request. Earlier these web pages were static in nature, i.e. what was once created was the only version displayed to the users. However, modern users as well as website owners demand some interaction with the web pages.
Examples of interaction include validating online forms filled in by users, showing messages after a user has registered a choice, etc. All this can be achieved by web scripting. Web scripting is of two types −
Client side scripting − Here the scripts embedded in a page are executed by the client computer itself, using the web browser. The most popular client side scripting languages are JavaScript and VBScript; AJAX, often listed alongside them, is a technique built on JavaScript rather than a separate language.
Server side scripting − Here the scripts are run on the server. The web page requested by the client is generated and sent after the scripts are run. The most popular server side scripting technologies are PHP, Python, ASP.NET, etc.
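As a rough sketch of server side scripting, the Python code below validates a submitted form on the server and then generates the page that would be sent back to the client. The form fields (`name`, `email`), the validation rules and the function names are assumptions made for illustration; a real site would run such code inside a web framework or server.

```python
from html import escape


def validate_form(form):
    """Return a list of error messages for the submitted form data."""
    errors = []
    if not form.get("name", "").strip():
        errors.append("Name is required.")
    if "@" not in form.get("email", ""):
        errors.append("Email address looks invalid.")  # deliberately crude check
    return errors


def render_page(form):
    """Generate the HTML page the server would send for this submission."""
    errors = validate_form(form)
    if errors:
        items = "".join(f"<li>{escape(message)}</li>" for message in errors)
        body = f"<ul>{items}</ul>"
    else:
        # escape() keeps user input from being interpreted as markup
        body = f"<p>Thank you, {escape(form['name'].strip())}!</p>"
    return f"<html><body>{body}</body></html>"


if __name__ == "__main__":
    print(render_page({"name": "Asha", "email": "asha@example.com"}))
    print(render_page({"name": "", "email": "not-an-address"}))
```

The page is different for every submission, which is exactly what distinguishes a dynamic page from a static one: the HTML is produced by running a script at request time rather than being stored ready-made.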
Web 2.0 is the second stage of development in World Wide Web where the emphasis is on dynamic and user generated content rather than static content. As discussed above, World Wide Web initially supported creation and presentation of static content using HTML. However, as the users evolved, demand for interactive content grew and web scripting was used to add this dynamism to content.
In 1999, Darcy DiNucci coined the term Web 2.0 to emphasize the paradigm shift in the way web pages were being designed and presented to the user. The term became popular around 2004.
Examples of user generated content in Web 2.0 include social media websites, virtual communities, live chats, etc. These have revolutionized the way we experience and use the Internet.