What happens after you type a URL into a browser

March 19, 2016

To answer this question, we need to walk through the Web, HTTP, and DNS concepts at the application layer.

Webpage

A Web page consists of objects. An object is simply a file, such as an HTML file, an image, a Java applet, or a video file, that is addressable by a single URL. Most Web pages consist of a base HTML file and several referenced objects. Figure 1 shows an example URL.

Figure 1: A URL example.
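As a minimal sketch of the two parts of a URL that matter for the rest of this post, the hostname and the path of the object, the snippet below splits a URL with Python's standard library. The URL is the one used later in this post; the http:// scheme in front of it and the helper name split_url are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (hostname, path) for an HTTP URL."""
    parts = urlsplit(url)
    # An empty path means the root object "/".
    return parts.hostname, parts.path or "/"


host, path = split_url("http://www.ourcompany.com/someDept/index.html")
print(host)  # www.ourcompany.com
print(path)  # /someDept/index.html
```

The hostname identifies the server to contact, and the path identifies the object on that server.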

HTTP

HTTP (HyperText Transfer Protocol) is the Web’s application-layer protocol. It follows the client-server model: the client initiates a TCP connection to the server, and the server accepts it. HTTP messages (application-layer protocol messages) are then exchanged between the browser (HTTP client) and the Web server (HTTP server). Finally, the TCP connection is closed. HTTP can use two types of connections: non-persistent and persistent. By default, HTTP/1.1 uses persistent connections.

HTTP with non-persistent connections: Each request/response pair is sent over a separate TCP connection. Let us look step by step at how a Web page is transferred from the server to the client in this case. Assume the page consists of a base HTML file and references to 10 JPEG images, and that the URL of the base HTML file is www.ourcompany.com/someDept/index.html.

Figure 2: HTTP with a non-persistent connection and round-trip time (RTT).

Here is what happens:

  • 1.a The HTTP client initiates a TCP connection to the HTTP server (process) at www.ourcompany.com on port 80.
  • 1.b The HTTP server at host www.ourcompany.com, which is waiting for TCP connections on port 80, accepts the connection and notifies the client.
  • 2. The HTTP client sends an HTTP request message (containing the URL) into the TCP connection socket. The message indicates that the client wants the object someDept/index.html.
  • 3. The HTTP server receives the request message, forms a response message containing the requested object, and sends the message into its socket.
  • 4. The HTTP server closes the TCP connection.
  • 5. The HTTP client receives the response message containing the HTML file and displays the HTML. Parsing the HTML file, it finds 10 referenced JPEG objects.
  • 6. Steps 1-5 are repeated for each of the 10 JPEG objects.
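The steps above can be sketched in Python. This is only an illustration under a few assumptions: the helper names build_get_request and fetch_once are made up for this post, the request carries just the minimum headers, and the "Connection: close" header is used to ask the server for non-persistent behaviour. fetch_once is defined but not called, because it needs network access and www.ourcompany.com is only an example host.

```python
import socket


def build_get_request(host, path, persistent=False):
    """Build the bytes of a minimal HTTP/1.1 GET request (step 2)."""
    connection = "keep-alive" if persistent else "close"
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Connection: {connection}",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_once(host, path, port=80, timeout=5.0):
    """Steps 1-5 for one object: connect, request, read until the server closes."""
    request = build_get_request(host, path, persistent=False)
    # Step 1: open a TCP connection to the server.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Step 2: send the request message into the socket.
        sock.sendall(request)
        chunks = []
        while True:
            # Steps 3 and 5: receive the response message piece by piece.
            data = sock.recv(4096)
            if not data:
                # Step 4: an empty read means the server closed the connection.
                break
            chunks.append(data)
    return b"".join(chunks)


# Show the request message for the base HTML file.
print(build_get_request("www.ourcompany.com", "/someDept/index.html").decode("ascii"))
```

With non-persistent connections, step 6 would amount to calling fetch_once again for each of the 10 images, opening a new TCP connection every time.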

RTT: the time it takes a small packet to travel from the client to the server and back. As Figure 2 shows, the total response time for non-persistent HTTP is two RTTs plus the transmission time of the HTML file at the server: one RTT to set up the TCP connection, and one RTT for the HTTP request and the first bytes of the response to return.
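To make the formula concrete, here is a small worked example. The numbers (a 100 ms RTT, a 100,000-bit file, a 1 Mbps link) and the function name response_time_ms are assumptions chosen only for illustration.

```python
def response_time_ms(rtt_ms, size_bits, rate_bps):
    """Response time for one object over a non-persistent connection."""
    # Transmission time: how long the server needs to push the file onto the link.
    transmit_ms = size_bits * 1000 / rate_bps
    # One RTT for the TCP handshake, one RTT for request and response.
    return 2 * rtt_ms + transmit_ms


print(response_time_ms(rtt_ms=100, size_bits=100_000, rate_bps=1_000_000))  # 300.0
```

Here the two RTTs account for 200 ms and the transmission of the file for the remaining 100 ms.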

Non-persistent HTTP issues:

  • Requires 2 RTTs per object.
  • OS overhead for each TCP connection.
  • Browsers often open parallel TCP connections to shorten the response time when fetching referenced objects.

HTTP with persistent connection: All the requests and their corresponding responses are sent over the same TCP connection.

Persistent HTTP connections:

  • The server leaves the connection open after sending a response.
  • Subsequent HTTP messages between the same client and server are sent over the open connection.
  • The client sends requests as soon as it encounters a referenced object.
  • As little as one RTT for all the referenced objects.
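A rough comparison of the two modes for the example page (a base HTML file plus 10 images) can be sketched as follows. The model is deliberately simplified and rests on assumptions that are not part of HTTP itself: every object has the same transmission time, the server needs no processing time, non-persistent connections are opened one after another (no parallel connections), and with a persistent connection the client sends all image requests back to back. The function names and the numbers are made up for illustration.

```python
def non_persistent_serial_ms(rtt_ms, transmit_ms, n_objects):
    """Every object pays for its own TCP handshake: 2 RTTs plus transmission."""
    return n_objects * (2 * rtt_ms + transmit_ms)


def persistent_pipelined_ms(rtt_ms, transmit_ms, n_referenced):
    """One handshake, one RTT for the base file, one RTT for all referenced objects."""
    setup = rtt_ms                      # TCP handshake, paid once
    base = rtt_ms + transmit_ms         # request and receive the base HTML file
    referenced = 0
    if n_referenced:
        # All requests go out together, so they share a single RTT.
        referenced = rtt_ms + n_referenced * transmit_ms
    return setup + base + referenced


rtt, transmit = 100, 10  # milliseconds, assumed values
print(non_persistent_serial_ms(rtt, transmit, n_objects=11))    # 2310
print(persistent_pipelined_ms(rtt, transmit, n_referenced=10))  # 410
```

Under these assumptions the persistent connection spends 3 RTTs in total, while the serial non-persistent case spends 22.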

DNS

A person can be identified in many ways, for example by name, passport number, or social security number. An Internet host can be identified in more than one way too: by its hostname, such as www.google.com, or by its IP address, which is used for addressing datagrams. Hostnames are easy for people to remember but would be difficult for routers to process, whereas routers work with fixed-length IP addresses. Hence DNS, the Domain Name System, maps hostnames to IP addresses, and vice versa.

What happens when a browser (an HTTP client) requests the URL www.ourcompany.com/someDept/index.html?

  1. The user's machine runs the client side of the DNS application.
  2. The browser extracts the hostname www.ourcompany.com from the URL and passes it to the client side of the DNS application.
  3. The DNS client sends a query containing the hostname to a DNS server.
  4. The DNS client eventually receives a reply that includes the IP address for the hostname.
  5. Once the browser receives the IP address from the DNS client, it can initiate a TCP connection to the HTTP server process located at port 80 at that IP address.
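The steps above can be sketched as a small Python function. This is an illustration, not how any particular browser is implemented: the names resolve_target and fake_resolver are hypothetical, and the address 203.0.113.10 comes from a range reserved for documentation, so it is not the real address of any host. By default the function would ask the operating system's resolver through socket.gethostbyname; the example passes in a fake resolver so that it runs without network access.

```python
import socket
from urllib.parse import urlsplit


def resolve_target(url, resolve=socket.gethostbyname):
    """Return the (ip_address, port) the browser would open a TCP connection to."""
    parts = urlsplit(url)
    host = parts.hostname        # step 2: extract the hostname from the URL
    port = parts.port or 80      # HTTP uses port 80 unless the URL says otherwise
    ip_address = resolve(host)   # steps 3-4: query DNS and receive the reply
    return ip_address, port      # step 5: where to open the TCP connection


def fake_resolver(hostname):
    """Stand-in for a DNS lookup; the table is made up for this example."""
    table = {"www.ourcompany.com": "203.0.113.10"}
    return table[hostname]


target = resolve_target("http://www.ourcompany.com/someDept/index.html",
                        resolve=fake_resolver)
print(target)  # ('203.0.113.10', 80)
```

Leaving out the resolve argument performs a real lookup, and the resulting pair can be handed to socket.create_connection to start the TCP connection.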

Why not centralize DNS?

  • A single point of failure. If the one DNS server crashes, so does the entire Internet.
  • Traffic volume. A single DNS server would have to handle all DNS queries (for all the HTTP requests and email messages generated by a huge number of hosts).
  • Distant centralized database. A single server cannot be close to all clients, so queries from distant locations can suffer significant delays.
  • Maintenance. A single server would have to keep records for all Internet hosts, and this huge centralized database would have to be updated frequently to account for every new host.

References:

  • James F. Kurose and Keith W. Ross, “Application Layer”, in Computer Networking: A Top-Down Approach, Chapter 2, pp. 126-164, Pearson, 5th edition, 2010.
