HTTP is the foundation of the World Wide Web. From buying items on Amazon to checking in with friends on Facebook, everything uses HTTP. Because it is such an integral part of our online experience, it’s worth having a basic understanding of how it works.
HTTP stands for HyperText Transfer Protocol. Let’s take a minute to break it down:
Hyper: its literal meaning is ‘too excited’ or ‘energetic’
Text: letters, words
Hyper + Text: over-excited words, or words that can jump from one document to another. On the web, such text usually appears as blue, underlined text, e.g. a link reading “Click me and I will take you to another document”
Transfer: Moving from one place to another
Protocol: Set of rules
If we put these terms together, HTTP is a way to navigate from one page (resource) to another by transferring (sending and receiving) data according to a set of rules.
So What is a URL?
A URL is what you see in a browser’s address bar, and it stands for “Uniform Resource Locator”. If we compare this to sending a letter by postal mail, then the URL is the address that you write on the envelope.
e.g. http://www.ktlsolutions.com
Similar to real-world addresses getting more specific in terms of Address Line 2, zip code, country, etc., URLs can have a specific path as well.
e.g. https://www.ktlsolutions.com/blog/
URLs can also pass extra information for the server to use. This information is called the query string. It appears after a ‘?’ and consists of name=value pairs separated by ‘&’, e.g. https://www.ktlsolutions.com/?s=Minal+Wad&submit=y
URLs can be typed into the address bar or hidden behind link text. For example, a link reading “Click me to go to Google” takes you to Google because a hidden URL ties that text to www.google.com
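The parts of a URL described above can be pulled apart programmatically. The following is a minimal sketch using Python's standard `urllib.parse` module; the helper name `describe_url` is made up for illustration, and the URL is the example from the text.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url: str) -> dict:
    """Break a URL into scheme, host, path, and decoded query parameters."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,          # e.g. "https"
        "host": parts.hostname,          # what comes after "://"
        "path": parts.path,              # the more specific location on the host
        "query": parse_qs(parts.query),  # name=value pairs after "?"
    }

info = describe_url("https://www.ktlsolutions.com/?s=Minal+Wad&submit=y")
print(info)
```

Note that `parse_qs` decodes the `+` in `Minal+Wad` to a space and returns each value as a list, because a name may legally appear more than once in a query string.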
Resource Locator
Suppose you type http://www.ktlsolutions.com. What comes after “://” is the hostname, and it points to the IP address of a computer, e.g. 192.168.0.101. The computers that map domain names to IP addresses are called DNS servers (DNS stands for Domain Name System).
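As a rough illustration of the name-to-address lookup described above, the sketch below asks the operating system's resolver for the addresses behind a name, using Python's standard `socket.getaddrinfo`. The helper name `resolve` is an assumption made for this example.

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the system resolver reports for a name."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({sockaddr[0] for *_, sockaddr in results})

# An IP literal resolves to itself, so this line works without network access.
print(resolve("127.0.0.1"))
```

Calling `resolve("www.ktlsolutions.com")` would perform a real DNS lookup. That call needs network access, and the addresses it returns depend on when and where it is run.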
Data (Request/Response)
Once you have located a resource, you may need to send additional information to the server along with your request. Parameters need to be passed if you are asking for search results, and if you fill in an online form, the values need to be carried to the server. The results that are sent back from the server are also just data. So data needs to travel back and forth, and it needs to be formatted according to the HTTP standard so that any server can interpret the request and send back a conforming response. A “Request” is the data that is sent to the server, and a “Response” is what is sent back.
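To make the request and response format more concrete, here is a small sketch that builds the text of a minimal HTTP/1.1 GET request and reads the status line of a response. The function names and the sample response are invented for illustration; a real browser sends many more headers.

```python
def build_request(host: str, path: str = "/", params: str = "") -> str:
    """Format a minimal HTTP/1.1 GET request as text."""
    target = f"{path}?{params}" if params else path
    lines = [
        f"GET {target} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",           # the Host header is required in HTTP/1.1
        "Connection: close",       # ask the server to close when it is done
        "",                        # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)

def parse_status_line(response: str) -> tuple[str, int, str]:
    """Split the first line of a response into version, status code, reason."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason

request = build_request("www.ktlsolutions.com", "/", "s=Minal+Wad&submit=y")
# A hand-written sample response, not captured from a real server.
sample_response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
print(request)
print(parse_status_line(sample_response))
```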
What lies underneath HTTP?
When you type a URL in the address bar and press Enter, two main steps take place.
Step 1
The first step is to contact a DNS server and get the IP address of the destination server. This communication can travel over Ethernet, a phone line, or some other medium such as radio or satellite. Once the IP address is obtained, the browser forms an HTTP request and opens a second connection, this time to the destination server. The server sends back an HTTP response, and the browser displays the result. All of this happens at the application level, where the browser is the client and the server is the provider. But we haven’t discussed how the messages actually cross the network and reach the server. There are layers of protocols that operate below HTTP. The layer just below HTTP is a transport-level protocol called TCP (Transmission Control Protocol). The main job of TCP is to make sure the message is delivered reliably: it splits the message into numbered segments, retransmits any that are lost or corrupted, discards duplicates, and reassembles everything in order at the destination. Next in line is a network-level protocol called IP (Internet Protocol). It wraps each segment in a packet carrying the source and destination IP addresses and routes it across the network. These packets hop from one machine to another until they reach the destination server.
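The idea of splitting a message into pieces and putting it back together without duplicates can be shown with a toy model. This is only a sketch under simplifying assumptions: real TCP numbers bytes rather than fixed-size chunks and also handles acknowledgements and retransmission, none of which appear here. The function names are made up for this example.

```python
import random

def split_into_segments(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Cut a message into (sequence_number, chunk) pairs."""
    return [(i, message[i:i + size]) for i in range(0, len(message), size)]

def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the message, ignoring duplicates and arrival order."""
    unique = {}
    for seq, chunk in segments:
        unique.setdefault(seq, chunk)  # keep only the first copy of each piece
    return b"".join(unique[seq] for seq in sorted(unique))

message = b"GET / HTTP/1.1\r\nHost: www.ktlsolutions.com\r\n\r\n"
segments = split_into_segments(message, size=8)
in_transit = segments + segments[:2]   # simulate two duplicated pieces
random.shuffle(in_transit)             # simulate out-of-order arrival
print(reassemble(in_transit) == message)
```

The sequence number is what makes both repairs possible: sorting by it restores the order, and seeing the same number twice reveals a duplicate.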
Step 2
The client (the browser in our case) then extracts the host name from the URL, opens a TCP socket to that host, and writes the request data into it. The rest is taken care of by the TCP/IP protocols. Everything we have talked about so far happens inside the computer, but the data eventually has to leave the machine and travel over a wire, an optical fiber, or a satellite link. This is the lowest level, made up of the data link layer and the physical layer beneath it. With Ethernet, the packets handed down by IP are wrapped in frames, and the frames travel as 0s and 1s encoded in electrical signals. When the data reaches the destination, the process runs in reverse: the data travels back up from IP to TCP to HTTP. The response then makes the same trip in the other direction and is interpreted by the browser, which displays the result.
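The "open a TCP socket and write data into it" step can be tried without touching the public internet. The sketch below starts a tiny web server on the local machine and then talks to it over a plain TCP socket. The handler class, the `fetch` helper, and the loopback address are assumptions made for this example, not anything a browser does literally.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """A stand-in server that answers every GET with the text 'hello'."""
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet

def fetch(host: str, port: int, path: str = "/") -> bytes:
    """Open a TCP socket, write an HTTP request into it, read the raw reply."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:       # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)

server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
raw = fetch("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()
print(raw.split(b"\r\n", 1)[0])  # the status line of the response
```

Everything below the socket calls (segments, packets, frames) is handled by the operating system, which is why the code never mentions it.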
What is HTTPS, then?
HTTPS is simply secure HTTP. One of the strengths of HTTP is that it is textual and self-describing, but this is dangerous when dealing with sensitive data such as credit card numbers or personal information, because anyone who can observe the network traffic can read it. HTTPS solves this problem by encrypting messages before they start traveling across the network. In the network protocol stack, an additional security layer sits between HTTP and TCP. It uses TLS (Transport Layer Security), the successor to the older SSL (Secure Sockets Layer). The destination server needs to obtain a certificate (still commonly called an SSL certificate) from a certificate authority, which vouches for the server’s authenticity. All HTTP traffic over HTTPS is encrypted, including the request, the response, query strings, HTTP headers, and the message body.
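The extra security layer between HTTP and TCP can be sketched with Python's standard `ssl` module. The helper names below are assumptions for illustration. `open_tls_connection` is defined but not called here, because calling it requires network access to a real HTTPS server.

```python
import socket
import ssl

def make_tls_context() -> ssl.SSLContext:
    """Create a client context that verifies the server certificate and hostname."""
    return ssl.create_default_context()

def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection, then wrap it in TLS before any HTTP is sent."""
    raw_sock = socket.create_connection((host, port), timeout=5)
    try:
        # server_hostname is checked against the name in the server certificate.
        return make_tls_context().wrap_socket(raw_sock, server_hostname=host)
    except Exception:
        raw_sock.close()
        raise

context = make_tls_context()
# The default client context requires a valid certificate and a matching hostname.
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Once the socket is wrapped, the HTTP request is written to it exactly as before. The encryption happens underneath, which matches the layering described above.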