Friday, March 15, 2013

HTTP looks simple, but not ...

Recently I had a chance to take another look at the Hypertext Transfer Protocol (HTTP), one of the protocols we use almost every day. HTTP is an application-layer protocol that follows the request-response pattern of the client-server computing model. A web browser such as Chrome or Firefox, for example, may act as the client, and an application running on a computer hosting a web site may be the server. The server provides resources such as HTML files and other content, or performs other functions on behalf of the client. When the client sends an HTTP request message to the server, the server returns a response message that carries completion status information about the request and may also contain the requested content in its message body.
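To make the message exchange concrete, here is a minimal sketch in Python of what a request looks like on the wire and how a response splits into status line, headers, and body. The function names build_request and parse_response and the host www.example.com are made up for illustration; no network access is involved, and real clients handle many cases this sketch ignores.

```python
def build_request(method, path, host, version="HTTP/1.1"):
    """Assemble a bare-bones HTTP request message as bytes."""
    lines = [
        f"{method} {path} {version}",  # request line
        f"Host: {host}",               # required in HTTP/1.1
        "Connection: close",
        "",                            # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw response into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


request = build_request("GET", "/index.html", "www.example.com")
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
version, status, reason, headers, body = parse_response(sample_response)
```

The status code (200 here) is the "completion status information" mentioned above, and the body after the blank line is the requested content.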

The original version of HTTP, HTTP/0.9 (1991), was written by Sir Tim Berners-Lee. It was a simple protocol for transferring raw data across the Internet. Two versions of HTTP are in use today: HTTP/1.0 and HTTP/1.1. HTTP/1.0, defined in RFC 1945 and officially introduced in 1996, improved the protocol by allowing MIME-like messages. However, HTTP/1.0 does not adequately address proxies, caching, persistent connections, virtual hosts, and range downloads. These features were provided in HTTP/1.1, which is defined in RFC 2616.

In HTTP/1.0 and earlier, the TCP connection is closed after each request and response, so each resource to be retrieved requires its own connection. This increases the load on HTTP servers and causes congestion on the Internet. Opening and closing TCP connections takes a substantial amount of CPU time, bandwidth, and memory. In practice, most web pages consist of several files on the same server, so a client has to make multiple requests to the same server in a short amount of time. HTTP/1.1 made persistent (keep-alive) connections the default, so a connection can be reused for more than one request. Several requests and responses can be sent through a single persistent connection. Such connections experience less latency, because the client does not need to set up a new TCP connection for each request after the first.
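The reuse of one connection can be observed with Python's standard library. The sketch below starts a throwaway server on the local machine and sends two requests through a single http.client.HTTPConnection. The handler name PortEchoHandler, the function name fetch_twice_on_one_connection, and the two paths are made up for illustration. The server replies with the client's TCP port, so identical bodies suggest both requests travelled over the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the stdlib server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"client port {self.client_address[1]}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    server = HTTPServer(("127.0.0.1", 0), PortEchoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    bodies = []
    try:
        for path in ("/index.html", "/style.css"):
            conn.request("GET", path)
            bodies.append(conn.getresponse().read())
    finally:
        conn.close()        # closing the client first lets the handler finish
        server.shutdown()
        server.server_close()
    return bodies


bodies = fetch_twice_on_one_connection()
```

With an HTTP/1.0-style exchange, each request would need its own connection and would normally arrive from a different client port.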

There are many more improvements in HTTP/1.1. For example, it makes better use of bandwidth by supporting not only compression but also negotiation of the compression method between client and server (through the Accept-Encoding and Content-Encoding headers). HTTP/1.1 also allows partial transmission of objects, where a server transmits only the portion of a resource explicitly requested by a client.
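Partial transmission is driven by the Range request header, and the server answers with status 206 and a Content-Range header. The following sketch shows how a server might turn a single byte range into the slice it sends back. The function name slice_for_range is hypothetical, and the logic is simplified: it handles only one range, and falls back to the full resource for anything it does not understand.

```python
def slice_for_range(resource, range_header):
    """Return (status, content_range, body) for a single 'bytes=first-last' range."""
    size = len(resource)
    unit, _, spec = range_header.partition("=")
    first_s, _, last_s = spec.strip().partition("-")
    well_formed = (
        unit.strip() == "bytes"
        and "," not in spec                      # multiple ranges not handled here
        and (first_s.isdigit() or first_s == "")
        and (last_s.isdigit() or last_s == "")
        and (first_s or last_s)
    )
    if not well_formed:
        return 200, None, resource               # ignore the header, send everything

    if first_s == "":                            # suffix form: the last N bytes
        first = max(size - int(last_s), 0)
        last = size - 1
    else:
        first = int(first_s)
        last = min(int(last_s), size - 1) if last_s else size - 1

    if first > last or first >= size:
        return 416, f"bytes */{size}", b""       # range not satisfiable
    return 206, f"bytes {first}-{last}/{size}", resource[first:last + 1]


resource = b"0123456789"
partial = slice_for_range(resource, "bytes=2-5")
```

Here partial holds the 206 status, the Content-Range value for bytes 2 through 5 of a 10-byte resource, and just those four bytes, which is what lets a download manager resume an interrupted transfer.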

Another improvement to the protocol was HTTP pipelining. This feature further reduces latency by allowing a client to send multiple requests without waiting for each response, so a single TCP connection is used much more efficiently and the total elapsed time is lower. Pipelining does not open extra connections: the requests are queued on one connection, and the server must return the responses in the same order in which it received the requests. Some browsers even let the user configure the maximum number of pipelined requests. In Firefox, for example, the “network.http.pipelining.maxrequests” preference serves this purpose.
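A small experiment can illustrate pipelining. The sketch below, using only Python's standard library, starts a local test server, writes several requests to one raw socket before reading anything, and then reads the responses back in order. The names EchoPathHandler and pipeline are made up for illustration, and the response reader is deliberately minimal: it relies on the test server always sending Content-Length.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = self.path.encode()  # reply with the requested path
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def pipeline(paths):
    """Send all requests on one connection first, then read the responses in order."""
    server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    bodies = []
    try:
        with socket.create_connection(("127.0.0.1", server.server_port)) as sock:
            requests = b"".join(
                f"GET {p} HTTP/1.1\r\nHost: localhost\r\n\r\n".encode() for p in paths
            )
            sock.sendall(requests)  # every request leaves before any response is read
            with sock.makefile("rb") as reader:
                for _ in paths:
                    reader.readline()  # status line, ignored here
                    length = 0
                    while True:
                        line = reader.readline()
                        if line in (b"\r\n", b""):
                            break  # blank line ends the headers
                        name, _, value = line.partition(b":")
                        if name.strip().lower() == b"content-length":
                            length = int(value.decode().strip())
                    bodies.append(reader.read(length).decode())
    finally:
        server.shutdown()
        server.server_close()
    return bodies


ordered = pipeline(["/a", "/b", "/c"])
```

The responses come back in request order, which is also the weakness of pipelining: one slow response delays every response queued behind it on that connection.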

All common desktop browsers, such as Chrome, Firefox, and Internet Explorer, support HTTP/1.1 and enable it by default. However, you can disable HTTP/1.1 in these browsers. Many web sites still use HTTP/1.0, so if you are having difficulty connecting to certain web sites, you might want to turn HTTP/1.1 off. For example, in Internet Explorer you can change the HTTP 1.1 settings with the check boxes on the Advanced tab of the Internet Options dialog box. In Firefox, type about:config in the address bar, search for network.http.version, and change the value to 1.0. In Chrome, you can tweak settings such as HTTP pipelining on the chrome://flags page.