Compression: Making the Big Smaller and Faster (Part 1)
By: Mehdi Daoudi
Aug. 24, 2017 01:00 PM
How important is data compression? The sharing of information in a fast and efficient manner has been an area of constant study and research. Companies like Google and Facebook have spent a lot of time and effort trying to develop faster and better compression algorithms. Compression algorithms have existed since the ’70s and the ongoing research to have better algorithms proves just how important compression is for the Internet and for all of us.
The Need for Data Compression
There are several best practices that help optimize page load times; here is a blog that discusses webpage optimization. In this article, we will spend some time understanding the basics of compression and how it works. In the second part of this blog, we will also cover a newer compression algorithm called "Brotli."
Encoding and Data Compression
The word “compression” comes from the Latin word compressare, which means to press together. “Encoding” is the process of placing a sequence of characters in a specialized format that allows efficient data storage as well as transmission. Per Wikipedia: “Data compression involves encoding information using fewer bits than the original representation.”
Compression plays a key role when it comes to saving bandwidth and speeding up your site. Modern-day websites involve a lot of HTTP requests and responses between the client (the browser) and the server to serve a webpage. With the overall increase in the number of HTTP requests and responses, it becomes important to ensure that these transfers are fast and efficient.
HTTP works on a request-response model: the browser requests a resource, and the server responds with it. In the first scenario, no compression method is used to compress the response being sent by the server.
With no compression involved, the server responded with a 300 KB file (the index.html page). If the file were bigger, the response would take longer to travel over the wire, and this would increase the overall page load time. Please note that we are currently looking at only a single HTTP response; modern websites receive hundreds of such HTTP responses from the server to render a webpage.
In the second scenario, the same HTTP request and response take place between the browser and the server, but this time we use compression to reduce the size of the response being sent by the server to the browser.
Today, complex and dynamic websites generate hundreds of HTTP requests and responses, making it important to have a system that ensures fast and efficient data transfer between the server and the browser. This is why compression algorithms like Deflate and Gzip came into existence.
Introduction to Gzip
Gzip is based on the DEFLATE algorithm, which in turn is a combination of LZ77 and Huffman coding. Understanding how LZ77 works is essential to understanding how compression methods like DEFLATE and Gzip work.
LZ77 compresses data by replacing a repeated occurrence of a string with a pointer back to its previous occurrence. The pointer, or backreference, is of the form <relative jump, length>, where relative jump signifies how many bytes lie between the current occurrence of the string and its last occurrence, and length is the total number of identical bytes found.
Now let us understand this better with the help of an example. Assume there is a text file with the following text:
As idle as a painted ship, upon a painted ocean.
In this file, we see the strings "as" and "painted" occurring multiple times. What the LZ77 method does is replace repeated occurrences of a string with the notation <relative jump, length>.
So using LZ77, the text will get encoded in the following way:
As idle <8,2> a painted ship, upon a <21,7> ocean.
To encode the text, we took the following steps: scan the text for strings that have appeared before; for each repeat, measure the distance back to the previous occurrence (the relative jump) and the number of matching characters (the length); then replace the repeated string with the backreference <relative jump, length>. The second "as" occurs 8 bytes after the first and is 2 bytes long, giving <8,2>; the second "painted" occurs 21 bytes after the first and is 7 bytes long, giving <21,7>.
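The replacement above can be sketched in a few lines of Python. Note that lz77_words is a hypothetical helper written for this illustration, and that it is deliberately simplified: real LZ77 matches arbitrary byte sequences inside a sliding window, not just whole words.

```python
def lz77_words(text, min_len=2):
    """Word-level sketch of the LZ77 idea: replace a repeated word with a
    <relative jump, length> backreference to its previous occurrence."""
    out = []
    last_seen = {}  # lowercased word -> byte offset of its last occurrence
    pos = 0
    for word in text.split(" "):
        key = word.lower()
        if key in last_seen and len(word) >= min_len:
            jump = pos - last_seen[key]  # bytes back to the last occurrence
            out.append(f"<{jump},{len(word)}>")
        else:
            out.append(word)  # first occurrence (or too short): emit as-is
        last_seen[key] = pos
        pos += len(word) + 1  # +1 accounts for the space separator
    return " ".join(out)

print(lz77_words("As idle as a painted ship, upon a painted ocean."))
# As idle <8,2> a painted ship, upon a <21,7> ocean.
```

The min_len cutoff mirrors why real encoders do not backreference very short matches: the pointer would take more space than the literal text it replaces (which is why the single-character "a" is left alone here).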
For the first test run, we did not request any compression: we passed the custom header Accept-Encoding: identity along with the request, and the response came back with no Content-Encoding header.
In the second test run, the browser sent Accept-Encoding: gzip, and the server responded with a gzip-compressed file.
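This negotiation can be sketched from the server's side. The make_response helper below is hypothetical (written for this illustration, not any particular server's API), but the header logic follows the behavior described above, using Python's standard gzip module:

```python
import gzip

def make_response(body: bytes, accept_encoding: str):
    """Sketch of server-side content negotiation: gzip the body only
    when the client's Accept-Encoding header allows it."""
    if "gzip" in accept_encoding:
        payload = gzip.compress(body)
        headers = {"Content-Encoding": "gzip",
                   "Content-Length": str(len(payload))}
    else:
        # 'identity' (or no gzip support): send the body uncompressed
        payload = body
        headers = {"Content-Length": str(len(payload))}
    return headers, payload

# A repetitive HTML body, like most real pages.
body = b"<html>" + b"<p>as idle as a painted ship</p>" * 500 + b"</html>"
_, plain = make_response(body, "identity")
gz_headers, small = make_response(body, "gzip")
print(len(plain), "bytes uncompressed vs", len(small), "bytes gzipped")
```

The browser then uses the Content-Encoding header on the response to know it must decompress the body before parsing it.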
We can clearly see how Gzip can drastically compress files and improve the data transmission rate over the wire.
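To get a feel for the scale of those savings, you can gzip a repetitive payload yourself with Python's standard gzip module, which implements DEFLATE (the payload here is made up for the illustration):

```python
import gzip

# A repetitive payload, like typical HTML markup full of repeated tags.
html = b"<li class='product'>A painted ship upon a painted ocean</li>\n" * 200

compressed = gzip.compress(html)
print(f"original:   {len(html)} bytes")
print(f"compressed: {len(compressed)} bytes")
```

Highly repetitive markup like this compresses to a small fraction of its original size, because the LZ77 stage turns every repeated tag and phrase into a short backreference.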
Catchpoint’s Scheduled tests also highlight the difference between compressed and uncompressed content loading on webpages.
The post Compression: Making the Big Smaller and Faster (Part 1) appeared first on Catchpoint's Blog - Web Performance Monitoring.