HTTP compression, otherwise known as content encoding, is a publicly defined way to compress textual content transferred from web servers to browsers. HTTP compression uses public domain compression algorithms, like gzip and compress, to compress XHTML, JavaScript, CSS, and other text files at the server. This standards-based method of delivering compressed content is built into HTTP 1.1, and most modern browsers that support HTTP 1.1 support ZLIB inflation of deflated documents. In other words, they can decompress compressed files automatically, which saves time and bandwidth.
Stephen Pierzchala, Senior Technical Performance Analyst with Gomez, said this about HTTP compression:
“When tied to other methods, such as proper caching configurations and the use of persistent connections, HTTP compression can greatly improve Web performance. In most cases, the total cost of ownership of implementing HTTP compression (which for users of some Web platforms is nothing!) is extremely low, and it will pay for itself in reduced bandwidth usage and improved customer satisfaction.”
The Browser / Server Conversation
Browsers and servers have brief conversations over what they’d like to receive and send. Using HTTP headers, they zip messages back and forth over the ether with their content shopping lists. A compression-aware browser tells servers it would prefer to receive encoded content with a message in the HTTP header like this:
GET / HTTP/1.1
Host: www.webcompression.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
An HTTP 1.1-compliant server would then deliver the requested document using an encoding accepted by the client. Here's a sample response from WebCompression.org:
HTTP/1.1 200 OK
Date: Thu, 04 Dec 2003 16:15:12 GMT
Server: Apache/2.0
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=300
Expires: Thu, 04 Dec 2003 16:20:12 GMT
X-Guru: basic-knowledge=0, general-knowledge=0.2, complete-omnipotence=0.99
Content-Length: 1533
Content-Type: text/html; charset=ISO-8859-1
Now the client knows that the server supports gzip content encoding, and it also knows the size of the compressed file (Content-Length). The client downloads the compressed file, decompresses it, and displays the page. At least, that is the way it is supposed to work.
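The exchange above can be sketched in a few lines of Python. This is a minimal illustration, not how Apache or IIS implement it: the function names `parse_accept_encoding`, `build_response`, and `read_response` are hypothetical, and only gzip is handled.

```python
import gzip


def parse_accept_encoding(header):
    """Parse an Accept-Encoding value into a {coding: q-value} dict."""
    codings = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a coding without an explicit q-value defaults to 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        codings[name.strip().lower()] = q
    return codings


def build_response(body, accept_encoding):
    """Server side: gzip the body only if the client said it accepts gzip."""
    headers = {
        "Content-Type": "text/html; charset=ISO-8859-1",
        "Vary": "Accept-Encoding",
    }
    if parse_accept_encoding(accept_encoding).get("gzip", 0.0) > 0:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # Content-Length describes the bytes actually sent (compressed or not).
    headers["Content-Length"] = str(len(body))
    return headers, body


def read_response(headers, payload):
    """Client side: undo the content encoding named in the response."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload
```

Calling `build_response(page, "gzip,deflate")` returns gzip-encoded bytes plus a `Content-Encoding: gzip` header, while an empty `Accept-Encoding` value gets the page back unchanged.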
Browsers Can Lie
Unfortunately, some early versions of Netscape 4 say they support ZLIB inflation when they really can't. Rather than rely on the content negotiation built into Apache and IIS, most webmasters install software specifically designed to make this conversation an amicable one. Products like mod_gzip, Vigos' Website Accelerator, PipeBoost, httpZip, and others offer configurable compression that can avoid browser quirks.
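As a rough sketch of the kind of check such software performs, the Python below refuses to compress for any client that looks like Netscape 4. The regular expression and the function name `should_compress` are assumptions made for illustration, and the rule is deliberately conservative: it excludes every Netscape 4.x release rather than only the early ones, and it ignores q-values.

```python
import re

# Assumed pattern: Netscape 4.x identifies itself as "Mozilla/4.x" without the
# "(compatible" token that Internet Explorer adds to its User-Agent string.
NETSCAPE_4 = re.compile(r"^Mozilla/4\.\d+ (?!\(compatible)")


def should_compress(user_agent, accept_encoding):
    """Return True only if the client asks for gzip and is not a known liar."""
    codings = [c.split(";")[0].strip().lower() for c in accept_encoding.split(",")]
    if "gzip" not in codings:
        return False
    if NETSCAPE_4.match(user_agent):
        return False  # claims gzip support but may fail to inflate the content
    return True
```

A production filter would need a tested list of user agents and content types; this only shows where the decision sits in the conversation.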
Average Compression Ratios
So what can you expect to save using HTTP compression? In tests that we ran on twenty popular sites we found that on average, content encoding saved 75% on text files (HTML, CSS, and JavaScript) and 37% overall.1 A similar study of 9,281 HTML pages of popular sites by Destounis et al. found a mean compression gain of 75.2%.2 On average, web compression reduced the text files tested to one-fourth of their original size.3 The more text-based content you have, the higher the savings.
Joe Lima, COO and Head of Product Development at Port80 Software, said this about HTTP compression:
"HTTP compression provides such a clear benefit that it appeals to all kinds of users. Our customers include consumer sites that want to improve end-users' experience, hosting providers seeking to differentiate their offering, Fortune 500's looking to make a specific extranet application as bandwidth-efficient as possible, and many others. Simply put, compression is easy to deploy, widely supported, and saves money. Who could say no to that?"
File Size Savings for Sites Using HTTP Compression
Here are three example pages from two popular sites that use HTTP compression. Google and Orbitz both use gzip compression to deliver compressed versions of their pages to HTTP 1.1-compliant browsers. Table 1 shows the size of their home pages plus one search results page before and after compression.
Page | HTML page size, uncompressed (bytes) | HTML page size, compressed (bytes) | Percentage savings |
---|---|---|---|
Google.com | 3,873 | 1,412 | 63.5% |
Google HTTP+Compression | 26,321 | 5,505 | 79.1% |
Orbitz.com | 44,183 | 9,046 | 79.5% |
Note: These figures do not include HTTP header size, just the HTML size.
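The percentage savings column follows directly from the two size columns. A short Python check, using the figures from Table 1 (the function name `percent_savings` is just an illustrative choice):

```python
def percent_savings(uncompressed, compressed):
    """Percentage of bytes saved by compression, rounded to one decimal."""
    return round(100 * (1 - compressed / uncompressed), 1)


# (uncompressed bytes, compressed bytes) taken from Table 1
table_1 = {
    "Google.com": (3873, 1412),
    "Google HTTP+Compression": (26321, 5505),
    "Orbitz.com": (44183, 9046),
}

for page, (before, after) in table_1.items():
    print(f"{page}: {percent_savings(before, after)}%")
```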
Typical savings on compressed text files range from 60% to 85%, depending on how redundant the code is. Some JavaScript files can actually be compressed by over 90%. Webmasters who have deployed HTTP compression on their servers report savings of 30% to 50% on their bandwidth bills. Compressed content also speeds up your site by requiring smaller downloads. The cost of decompressing compressed content is small compared to the cost of downloading uncompressed files. On narrowband connections with fast computers, CPU speed trumps bandwidth every time.
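To put the download savings in time terms, here is a back-of-the-envelope estimate in Python for the Orbitz home page sizes from Table 1. It assumes an idealized 56 kbps link running at full line rate and ignores latency, HTTP headers, and any compression done by the modem itself, so real numbers will differ.

```python
def transfer_seconds(size_bytes, bits_per_second=56_000):
    """Idealized transfer time: payload bits divided by link speed."""
    return size_bytes * 8 / bits_per_second


uncompressed = transfer_seconds(44_183)  # Orbitz.com HTML, uncompressed
compressed = transfer_seconds(9_046)     # Orbitz.com HTML, gzip-compressed

print(f"uncompressed: {uncompressed:.1f} s, compressed: {compressed:.1f} s")
```

Under these assumptions the HTML arrives in roughly 1.3 seconds instead of 6.3, a gap far larger than the time a desktop CPU typically needs to inflate 9 KB of gzip data.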
1. Andy King and Konstantin Balashov, Speed Up Your Site: Web Site Optimization (Indianapolis: New Riders Publishing, 2003), Chapter 18, "Compressing the Web," 412. See Table 18.2: Content Encoding Average Compression Ratios for Different Web Site Categories.
2. P. Destounis, J. Garofalakis, P. Kappos, and J. Tzimias, "Measuring the Mean Web page size and its compression to limit latency and improve download time," Internet Research: Electronic Networking Applications and Policy 11, no. 1 (2001): 15. Analyzing five popular web sites (cnn.com, disney.com, ibm.com, microsoft.com, and netscape.com), Destounis et al. found a mean compression gain of 75.2% across 9,281 HTML pages. The mean web page size was 13,540 bytes.
3. Compression efficiency depends on the repetition of content within a given file. Smaller files have fewer bytes, and therefore a lower probability of repeated bytes. As file size increases, compression ratios improve because more characters mean more opportunities for similar patterns. The above tests ranged from a 13,540 byte mean (Destounis 2001) to 44,582 bytes per HTML page (King 2003). Smaller files (5,000 bytes or less) typically compress less efficiently, while larger files typically compress more efficiently. The more redundancy you can build into your textual data (HTML, CSS, and JavaScript), the higher your potential compression ratio. That's why using all lowercase letters improves compression in XHTML.
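The effect of file size and redundancy is easy to observe with Python's standard `gzip` module. The markup below is made up for the demonstration, and repeating one snippet 200 times is far more redundant than a real page, so treat the result as a direction rather than a benchmark.

```python
import gzip


def gzip_savings(data):
    """Percentage of bytes saved by gzip (negative if the output grew)."""
    return 100 * (1 - len(gzip.compress(data)) / len(data))


# Hypothetical markup: one short list item, and the same item repeated.
snippet = b'<li class="item"><a href="/products/">Products</a></li>\n'
small_page = snippet
large_page = snippet * 200

print(f"small ({len(small_page)} bytes): {gzip_savings(small_page):.1f}% saved")
print(f"large ({len(large_page)} bytes): {gzip_savings(large_page):.1f}% saved")
```

The tiny file can even grow, because the gzip header and trailer cost more than the compression saves, while the larger, repetitive file shrinks dramatically.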
Further Reading
- Compressing the Web
- Chapter 18 of Speed Up Your Site shows how to set up HTTP compression on Apache and IIS servers and evaluates the available compression software. Lists software and hardware compression tools for web compression.
- Gomez, Inc.
- A performance management and monitoring company.
- GrabIT2
- Stephen Pierzchala's configurable URL-based page grabber shows HTTP headers and page characteristics.
- HTTP Compression Speeds the Web
- Introductory article on content encoding by Peter Cranstone.
- Overweight Travel Sites Delay Holiday Travelers
- Compares the home pages of Expedia, Orbitz, and Travelocity for speed and accessibility. Orbitz uses HTTP compression. From Optimization Week Magazine, Dec. 4, 2003.
- Performance Improvement From Caching and Compression
- Adding caching to compression can boost performance, by Stephen Pierzchala in April 2003.
- Performance Improvement From Compression
- Stephen Pierzchala tests how HTTP compression can improve web performance. Concludes that data compression works best with files over 5,000 bytes. April 2003.
- Port80 Software. Fortune 1000 Survey
- Port80 Software's Fortune 1000 Compression Survey
- Port80 Software found that only 3% of the Fortune 1000 uses HTTP compression.
- Slow Shopping Sites Delay Santa: Scrooge Response Times
- Five out of fourteen top shopping sites use HTTP compression on their home pages. By Andrew King of Optimization Week Magazine, Dec. 17, 2003.
- Speed Web delivery with HTTP compression
- A detailed look at the beneficial effects of data compression with HTTP 1.1 by Radhakrishnan Srinivasan of IBM. July 22, 2003.
- WebCompression.org
- Stephen Pierzchala's compression information resource.
Article originally published on December 4, 2003.
This was very useful as I have been seeking ways to speed up my site. I already went from a loading time of 25 sec to 17 sec and am trying to reduce it further while keeping all the essential elements in a shopping site.
I also found your web optimiser useful. Thanks
How do I configure HTTP compression on an Apache/1.3.34 (Unix) web server?
This page, https://www.websiteoptimization.com/speed/tweak/compress/, refers to Chapter 18 for setup on Apache web servers too, but I did not see any instructions or procedure for configuring Apache for compression.
“Chapter 18 of Speed Up Your Site shows how to set up HTTP compression on Apache and IIS servers[…]”
Please provide instructions for configuring compression on Apache 1.3.x and Apache 2.x.
Thanks.
Nanao,
Recommend mod_gzip for setting up HTTP compression on Apache, or Vigos.
– Andy
Hey Andy,
I’ve been searching for a way to improve some of our websites’ download speeds …and I found myself reading your information.
Since most of our sites are hosted on Linux, I was wondering if you could reveal how to configure an htaccess file to enable mod_deflate for html docs.
Thanks
Can anybody tell me where I can find a list of servers that have mod_gzip and use it?
I'm very lazy, which is why I don't want to scan them myself.
bye
Great info really helped out a lot with the dev work
mod_gzip is the most used apache module. There are other less famous ones .. but GZIP compression does not require any special browser side addon.
John,
IIS has inbuilt http compression support.
Regards,
Uday Kadam.
mod_gzip really saves a lot of bandwidth.
Is there any way to force modern browsers to use gzip encoding on usual Internet surfing? Couldn’t find it in IE7
How do I set the gzip header with PHP?
My HTML is compressed but the Javascript is not. How can I fix this issue?
Article about mod_deflate settings like on Amazon EC2 AMI
http://railsgeek.com/2008/12/16/apache2-httpd-improving-performance-mod_deflate-gzip
We can compress PHP pages (HTML code, etc.) via PHP (gzip), and we can use the same idea for JavaScript: put the JavaScript code into a PHP file and gzip it on the fly.
How can a firewall deal with compressed packets? Especially when the firewall is embedded in a cable modem or router, and decompressing packets is expensive in terms of memory and CPU usage.
Thanks
How can I reduce the size of the scripts in Blogger's XML file? My HTML is over 200 KB, which worries me a lot!
Are there any sites for online HTML compression?
If you wish to optimize your website in PHP, call ob_start with ob_gzhandler as the callback function.
Can anyone give me a good source for performing compression?
This blog post is likely incorrect. In fact, the key statistic appears to have been misinterpreted. In the post you say:
“73.3% of the top 1000 websites do not use HTTP compression on their sites”
However, in the citation of the study you say:
“As of February 2010 73.3% of the Fortune 1000 websites use HTTP compression”
The latter statement is likely more accurate. Note these metrics from Google ( http://code.google.com/speed/articles/web-metrics.html ) state that 66-89% of compressible content is compressed.
It would be better not to reference unpublished surveys. No one has the ability to validate the reference. It would be great if you posted a correction immediately. Folks like Jason Grigsby have tweeted this incorrect stat.
Good catch Steve. I corrected the cite to say “do not use” now. I’ll ask about publishing the entire Excel file.
Keep in mind that these stats are for the Fortune 1000, the web at large is no doubt different.
All,
I took the Fortune 1000 study down pending updated data.