To answer the question, we should go through the Web, HTTP, and DNS concepts in the application layer.
A Web page consists of objects. An object is simply a file, such as an HTML file, an image, a Java applet, or a video file, that is addressable by a single URL. Most Web pages consist of a base HTML file and several referenced objects. For example, consider the following URL:
Figure 1: A URL example.
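As a rough illustration of how a URL breaks down into a hostname and an object path, here is a minimal sketch using Python's standard `urllib.parse` module. The address is the example used later in this answer; the `http://` scheme prefix is an assumption added so that the parser can recognise the hostname.

```python
from urllib.parse import urlsplit

# Example address from this answer; the "http://" prefix is assumed.
url = "http://www.ourcompany.com/someDept/index.html"

parts = urlsplit(url)
hostname = parts.hostname   # the server that stores the object
path = parts.path           # the object's path name on that server

print(hostname)  # www.ourcompany.com
print(path)      # /someDept/index.html
```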
HTTP (HyperText Transfer Protocol) is the Web’s application-layer protocol. In the client-server model, the client initiates a TCP connection to the server, and the server accepts it. HTTP messages (application-layer protocol messages) are then exchanged between the browser (HTTP client) and the Web server (HTTP server). Finally, the TCP connection is closed. HTTP can use two types of connections: non-persistent and persistent. In its default mode (since HTTP/1.1), HTTP uses persistent connections.
HTTP with non-persistent connections: Each request/response pair is sent over a separate TCP connection. Let us look step by step at how a Web page is transferred from server to client in this case. Assume that the page consists of a base HTML file and references to 10 JPEG images, and that its URL is www.ourcompany.com/someDept/index.html.
Figure 2: HTTP with a non-persistent connection and round-trip time (RTT).
Here is what happens:
- 1.a The HTTP client initiates a TCP connection to the HTTP server (process) at www.ourcompany.com on port 80.
- 1.b The HTTP server at host www.ourcompany.com, which is waiting for TCP connections on port 80, accepts the connection and notifies the client.
- 2. The HTTP client sends an HTTP request message (containing the URL) into the TCP connection socket. The message indicates that the client wants the object someDept/index.html.
- 3. The HTTP server receives the request message, forms a response message containing the requested object, and sends the message into its socket.
- 4. The HTTP server closes the TCP connection.
- 5. The HTTP client receives the response message containing the HTML file and displays it. While parsing the HTML file, it finds the 10 referenced JPEG objects.
- 6. Steps 1-5 are repeated for each of the 10 JPEG objects.
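The steps above can be sketched with plain TCP sockets. This is only a minimal sketch, not how a real browser is implemented: the function names `build_request` and `fetch_once` are hypothetical, and the `Connection: close` header is an assumption used here to ask the server to close the connection after one response, which mimics non-persistent behaviour.

```python
import socket

def build_request(host, path):
    """Form an HTTP GET request that asks for a single object (step 2)."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"   # assumed: ask the server to close afterwards
        "\r\n"
    ).encode("ascii")

def fetch_once(host, path, port=80):
    """Fetch one object over its own TCP connection (steps 1 to 5)."""
    # Step 1: open a new TCP connection to the server.
    with socket.create_connection((host, port), timeout=10) as sock:
        # Step 2: send the request message into the socket.
        sock.sendall(build_request(host, path))
        # Steps 3-5: read the response until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

# Step 6 would call fetch_once() again for every referenced object, e.g.
# fetch_once("www.ourcompany.com", "/someDept/index.html")
```

Each call to `fetch_once` pays for a new TCP connection, which is exactly the per-object cost discussed next.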
RTT: The time needed for a small packet to travel from client to server and back. As Figure 2 shows, the total response time for non-persistent HTTP is two RTTs plus the transmission time of the HTML file at the server: one RTT to set up the TCP connection, and one RTT for the HTTP request and the first bytes of the response to come back.
Non-persistent HTTP issues:
- Requires 2 RTTs per object.
- OS overhead for each TCP connection.
- To shorten the response time, browsers often open parallel TCP connections to fetch the referenced objects.
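To put numbers on these points, the following sketch estimates the total fetch time for a base file plus its referenced objects over non-persistent connections. The function name, the RTT of 100 ms, the 10 ms transmission time per object, and the choice of 5 parallel connections are all illustrative assumptions; the model also assumes that parallel connections do not slow each other down.

```python
import math

def non_persistent_time(rtt, transmit, n_objects, parallel=1):
    """Estimate the time to fetch a base file plus n_objects referenced objects.

    Every object costs 2 RTTs (TCP setup + request/response) plus its
    transmission time. Referenced objects are fetched in batches of
    `parallel` simultaneous connections after the base file arrives.
    """
    per_object = 2 * rtt + transmit
    rounds = math.ceil(n_objects / parallel)
    return per_object + rounds * per_object

# Assumed values: RTT = 0.1 s, transmission time = 0.01 s per object.
serial = non_persistent_time(rtt=0.1, transmit=0.01, n_objects=10)
parallel = non_persistent_time(rtt=0.1, transmit=0.01, n_objects=10, parallel=5)

print(f"serial:   {serial:.2f} s")    # 11 objects, one after another
print(f"parallel: {parallel:.2f} s")  # base file, then 2 batches of 5
```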
HTTP with persistent connections: All the requests and their corresponding responses are sent over the same TCP connection.
Persistent HTTP connections:
- Server leaves connection open after sending response.
- Subsequent HTTP messages between the same client and server are sent over the open connection.
- The client sends requests as soon as it encounters a referenced object.
- As little as one RTT is needed for all the referenced objects.
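A similar back-of-the-envelope estimate can be made for a persistent connection. The sketch below is an assumption-laden model, not a measurement: the function name, the 100 ms RTT, and the 10 ms transmission time per object are illustrative, and "pipelined" here means the client sends all requests for referenced objects back to back without waiting for each response.

```python
def persistent_time(rtt, transmit, n_objects, pipelined=True):
    """Estimate the time to fetch a base file plus n_objects over one connection."""
    # One RTT to set up TCP, one RTT to request and receive the base file.
    base = 2 * rtt + transmit
    if pipelined:
        # All referenced objects are requested back to back: about one RTT
        # in total, plus the time to transmit every object.
        return base + rtt + n_objects * transmit
    # Without pipelining, each referenced object costs one RTT of its own.
    return base + n_objects * (rtt + transmit)

# Assumed values: RTT = 0.1 s, transmission time = 0.01 s per object.
with_pipelining = persistent_time(rtt=0.1, transmit=0.01, n_objects=10)
without_pipelining = persistent_time(rtt=0.1, transmit=0.01, n_objects=10,
                                     pipelined=False)

print(f"pipelined:     {with_pipelining:.2f} s")
print(f"not pipelined: {without_pipelining:.2f} s")
```

Under the same assumed values, fetching the 11 objects one after another over non-persistent connections would take 11 × (2 × 0.1 + 0.01) ≈ 2.31 s, so reusing the connection saves one RTT per referenced object even without pipelining.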
We can identify a person in many ways, for example by name, passport number, or social security number. Internet hosts can be identified in several ways too. A hostname such as www.google.com is easy for people to remember, but variable-length alphanumeric hostnames would be difficult for routers to process. Routers instead work with fixed-length IP addresses, which are used for addressing datagrams. Hence DNS (Domain Name System) is the solution that maps hostnames to IP addresses, and vice versa.
What happens when a browser (an HTTP client) requests the URL www.ourcompany.com/someDept/index.html?
- The client host runs the client side of the DNS application.
- The browser extracts the hostname www.ourcompany.com from the URL and passes the hostname to the client side of the DNS application.
- The DNS client sends a query containing the hostname to a DNS server.
- The DNS client eventually receives a reply, which includes the IP address for the hostname.
- Once the browser receives the IP address from the DNS client, it can initiate a TCP connection to the HTTP server process located at port 80 at that IP address.
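From an application's point of view, these steps look roughly like the sketch below, which hands the hostname to the operating system's resolver through `socket.getaddrinfo` and then connects to the returned address. The function names are hypothetical, and restricting the lookup to IPv4 is a simplifying assumption.

```python
import socket

def resolve_ipv4(hostname, port=80):
    """Ask the local DNS client (the OS resolver) for the host's IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, port,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:      # keep each address once, in order
            addresses.append(ip)
    return addresses

def connect_to_web_server(hostname, port=80):
    """Resolve the hostname, then open a TCP connection to the first address."""
    ip = resolve_ipv4(hostname, port)[0]
    return socket.create_connection((ip, port), timeout=10)

# Example (needs network access and a resolvable hostname):
# sock = connect_to_web_server("www.ourcompany.com")
# sock.close()
```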
Why not centralize DNS?
- A single point of failure. If the single DNS server crashes, so does the entire Internet.
- Traffic volume. A single DNS server would have to handle all DNS queries (for all the HTTP requests and email messages generated by a huge number of hosts).
- Distant centralized database. A single server cannot be close to every querying client, so queries from distant locations can suffer significant delays.
- Maintenance. A single server would have to keep records for all Internet hosts. This centralized database would be huge and would have to be updated frequently to account for every new host.
Reference: James F. Kurose and Keith W. Ross, “Application Layer”, in Computer Networking: A Top-Down Approach, 5th edition, Chapter 2, pp. 126-164, Pearson, 2010.