
Lossy Image Compression

Robert Jessop

Department of Electronics and Computer Science, University of Southampton

December 13, 2002

Abstract

Representing image files as simple arrays of pixels is generally very inefficient in terms of memory. Storage space and network bandwidth are limited, so it is necessary to compress image files. There are many methods that achieve very high compression ratios by allowing the data to be corrupted in ways that are not easily perceivable to the human eye. A new concept is described, combining fractal compression with sampling and interpolation. This could allow greater fidelity at higher compression ratios than current methods.


Contents

1 Introduction
  1.1 Background
  1.2 Overview of compression methods
  1.3 Patent Issues

2 Image Compression Methods
  2.1 JPEG
  2.2 Wavelet Methods and JPEG2000
  2.3 Binary Tree Predictive Coding and Non-Uniform Sampling and Interpolation
  2.4 Fractal Image Compression

3 Comparison of Decoded Colour Images

4 Combining Fractal Compression with Sampling and Interpolation

5 Conclusions

1 Introduction

1.1 Background

The algorithms described in this report are all for real-world or natural images such as photographs. These images contain colour gradients and complex edges, not flat areas of one colour or perfectly straight edges. All these algorithms are lossy; the compressed image is not exactly the same as the original. For most uses some loss is acceptable. In photographs the image is already an imperfect representation due to the limitations of the camera. The data is already quantized and sampled at a finite resolution and there is normally a little noise all over the image. When lossless compression is needed, or images are non-natural, such as diagrams, another format, such as TIFF, GIF or PNG, should be used.

Though I will not discuss video, it is a related topic, as advances in video compression are usually based on research in still image compression.

Image compression algorithms take advantage of the properties of human perception. Low-intensity, high-frequency noise is not important, and we are much less sensitive to differences in colour than in light intensity. These aspects can be reproduced less accurately when an image is decompressed.

1.2 Overview of compression methods

In their simplest form images are represented as a two-dimensional array (bitmap) with each element containing the colour of one pixel. Colour is usually described as three values: the intensities of red, green and blue light (RGB).
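As a quick illustration of the storage cost (a minimal sketch, assuming Python with numpy; the 512×512 size matches the test images used in section 3):

    # A 512x512 pixel image with one byte each for the red, green and blue channels.
    import numpy as np

    image = np.zeros((512, 512, 3), dtype=np.uint8)
    print(image.nbytes)           # 786432 bytes
    print(image.nbytes / 1024)    # 768.0 KB, the uncompressed size quoted in section 3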

The most widespread algorithm is JPEG, named after the Joint Photographic Experts Group that created it (Joint Photographic Experts Group). It is based on DCTs (Discrete Cosine Transforms). Its dominance is a result of wide application support, in particular web browsers. The vast majority of photographs on the Internet are JPEG files. It is described in section 2.1. The common JPEG file format is correctly called JFIF (JPEG File Interchange Format).

JPEG2000 is a relatively new standard that is intended to replace JPEG (Joint Photographic Experts Group). It is based on wavelets. It offers better results than JPEG, especially at higher compression ratios. It also offers better features, including better progressive display. It is described in section 2.2.

Binary Tree Predictive Coding (Robinson, 1994) and Non-Uniform Sampling and Interpolation (Rosenberg) are similar techniques. They are based on encoding the values of only some of the pixels in the image; the rest of the image is predicted from those values. They are described in section 2.3.

There is no standard for fractal image compression, but the nearest thing to one is Iterated Systems' FIF format (Iterated Systems Inc). Fractal compression is based on PIFS (Partitioned Iterated Function Systems). Two good books on fractal image compression are (Barnsley and Hurd, 1992) and (Fisher, 1994). One advantage of fractal methods is super resolution: decoding the image at higher resolutions leads to much better results than stretching the original image. It is described in section 2.4.

Image compression usually involves a lossy transformation on the data followed by a lossless compression of the transformed data. Entropy encoding such as Huffman or arithmetic encoding (Barnsley and Hurd, 1992) is usually used. There is information and links for most types of compression in (loup Gailly).
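As a rough illustration of the entropy encoding stage, here is a small Huffman coder sketch using Python's standard heapq module. It is a generic illustration of the principle, not the code tables or bitstream layout of any of the formats discussed.

    # A toy Huffman coder: symbols that occur often get short bit strings.
    import heapq
    from collections import Counter

    def huffman_codes(data):
        """Build a prefix code (symbol -> bit string) from symbol frequencies."""
        counts = Counter(data)
        if len(counts) == 1:                       # degenerate case: only one symbol
            return {next(iter(counts)): "0"}
        # Each heap entry is (frequency, tie-breaker, {symbol: code so far}).
        heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            n1, _, c1 = heapq.heappop(heap)        # the two rarest subtrees...
            n2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (n1 + n2, tie, merged))   # ...are merged into one
            tie += 1
        return heap[0][2]

    # Quantized transform coefficients are mostly zero, so zero gets the shortest code.
    data = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, -1, 0, 2, 0]
    codes = huffman_codes(data)
    bits = "".join(codes[s] for s in data)
    print(codes)
    print(len(bits), "bits instead of", 8 * len(data), "bits uncompressed")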

1.3 Patent Issues

Parts of the techniques described here are covered by patents. Iterated Systems' control of a patent for fractal compression has probably prevented it becoming widespread. Iterated Systems has licensed their technology for use in other applications, such as Microsoft Encarta. There are many patents covering wavelets, but the JPEG committee has managed to ensure that all patent-holding companies have agreed to allow free use of JPEG2000. JPEG was used freely for a long time but recently Forgent Networks has claimed royalties for its use (Joint Photographic Experts Group).

2 Image Compression Methods

This section will describe the main algorithms for lossy image compression in non-mathematical terms. They are described in the simplest way, and actual implementations may work slightly differently to maximize speed. Each method has references which contain the relevant equations and detail in full.

2.1 JPEG

JPEG is an effective way of compressing photographs at high quality levels and is widely used on the Internet. The basic algorithm is as follows:

1. The colours in the image are converted from RGB (red, green and blue) to YUV, which is: brightness, hue (colour) and saturation (how colourful or grey it is). The resolution of the hue and saturation components is halved in each axis because the human eye is less sensitive to colour than brightness. The three channels are then encoded separately.

2. Each channel is split into 8×8 blocks of pixels. Each block is then encoded separately.

Figure 1: The 64 coefficients from the discrete cosine transform are weights for the 64 patterns above, which when combined give the 8×8 block of pixels. Image from (Salomon)

3. A DCT (Discrete Cosine Transform) is applied to each block (Figure 1) and the 64 resulting coefficients are each divided by a number and rounded to integers. The higher frequency coefficients have bigger divisors than the lower frequencies. The actual set of divisors used depends on the quality desired. Since it is integer division this is a lossy process; it can only be reversed approximately by multiplying. (A code sketch of this step and step 5 appears after the list.)

4. The first coefficient is the average value for the whole block and is encoded as the difference from the average value in the previous block.

5. The 64 coefficients are ordered in a zig-zag pattern from top left to bottom right so the lower frequencies are read first. They are run-length encoded. Due to the dividing stage most blocks should end in a long run of zeros in the higher frequencies and will compress well.

6. Huffman encoding is applied to the run length encoded data.

To decode the process is reversed.
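The following sketch illustrates steps 3 and 5 on a single 8×8 block, assuming Python with numpy and scipy. The divisor table is an arbitrary illustrative one, not a quantization table from the JPEG standard.

    # A rough sketch of steps 3 and 5 for one 8x8 block of one channel.
    import numpy as np
    from scipy.fft import dctn, idctn

    # Higher frequencies (towards the bottom right) get bigger divisors.
    Q = 8 + 4 * (np.arange(8)[:, None] + np.arange(8)[None, :])

    def encode_block(block):
        """DCT the block, then divide by Q and round to integers (the lossy step)."""
        coeffs = dctn(block.astype(float) - 128, norm="ortho")   # -128 is JPEG's level shift
        return np.round(coeffs / Q).astype(int)

    def decode_block(quantized):
        """Approximately invert: multiply back by Q and apply the inverse DCT."""
        return idctn(quantized * Q.astype(float), norm="ortho") + 128

    def zigzag(coeffs):
        """Read the 64 values lowest frequency first, zig-zagging along anti-diagonals."""
        order = sorted(((y, x) for y in range(8) for x in range(8)),
                       key=lambda p: (p[0] + p[1],
                                      p[0] if (p[0] + p[1]) % 2 else -p[0]))
        return [int(coeffs[y, x]) for y, x in order]

    gen = np.random.default_rng(0)
    block = np.linspace(60, 200, 8)[None, :] + gen.normal(0, 3, (8, 8))  # a smooth ramp
    q = encode_block(block)
    print(zigzag(q))     # mostly zeros after the first few values, so it run-length encodes well
    print("max pixel error:", round(float(np.abs(decode_block(q) - block).max()), 1))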

Progressive display and lossless compression are supported as well but rarely used. The format also specifies an option to use arithmetic encoding instead of Huffman encoding, but due to patent royalties it is rarely used.

There is a detailed description in (Barnsley and Hurd, 1992) and the full specification is available from (Joint Photographic Experts Group).

JPEG compression can cause ripples to appear next to sharp edges.

2.2 Wavelet Methods and JPEG2000

Wavelet methods give very good performance at both high and low image fidelities and have the advantage of describing the image starting at a low resolution first and increasing the resolution until all the detail is there. This is useful for progressive viewing while transferring on slow Internet connections. The general algorithm is as follows:

Figure 2: One-dimensional wavelets. From (Dartmouth College)

1. Like JPEG, colours are converted to YUV, and the U and V components are encoded with less fidelity.

2. The image is expressed at a low resolution as a regular array of wavelets. For example a 512×512 pixel image might be scanned with a 64×64 array of wavelets. Each wavelet is given an amplitude such that they best describe the image locally. The wavelets overlap with their neighbours so each wavelet would be 16×16 pixels in this example. The height of the wavelet at each given pixel determines the weight of that pixel in determining the value for the wavelet's amplitude. There are many different wavelets but only one is used for a given compression scheme. The wavelets are in two dimensions but you can see some one-dimensional examples in Figure 2.

3. Now there is an approximation to the image as a low-resolution array of wavelets. This is subtracted from the original image to get the difference. The difference image is then encoded the same way using an array of wavelets twice the resolution in both axes. This is repeated until the final difference image has been encoded losslessly using wavelets with a width of one pixel.

4. The data is now compressed. Because the low-resolution approximations are near the original image, most of the wavelet amplitudes at higher resolutions are zero or very low. This means they can be compressed extremely well using entropy (Huffman or arithmetic) encoding. Also, for any zero-valued wavelet where the higher-resolution wavelets beneath it are zero, a special zero-tree symbol is output and the higher-resolution wavelets underneath are not output at all.

5. As described, the algorithm is lossless. Very good lossy compression is achieved by quantizing the wavelet amplitudes (like the dividing of the coefficients in the JPEG algorithm). The highest resolutions can be quantized the most coarsely. This makes more of the amplitudes zero, allowing more zero-trees and improving the effectiveness of the entropy encoding stage. (A small code sketch of the multi-resolution idea follows the list.)
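The sketch below illustrates the coarse-to-fine idea of steps 2, 3 and 5, using plain block averages in place of real wavelets (an assumption made purely to keep the example short); each level only has to describe the difference left over from the coarser levels.

    # A toy coarse-to-fine decomposition in the spirit of steps 2, 3 and 5.
    import numpy as np

    def analyse(img, levels=4):
        """Return a list of difference images, coarsest (smallest) first."""
        approx = np.zeros_like(img, dtype=float)
        diffs = []
        for level in range(levels):
            block = 2 ** (levels - 1 - level)      # e.g. 8, 4, 2, 1 pixel blocks
            h, w = img.shape
            # Average what is still unexplained over blocks of this size...
            coarse = (img - approx).reshape(h // block, block,
                                            w // block, block).mean(axis=(1, 3))
            diffs.append(coarse)
            # ...and add it, blown back up to full size, to the running approximation.
            approx = approx + np.repeat(np.repeat(coarse, block, axis=0), block, axis=1)
        return diffs

    def synthesise(diffs, shape):
        """Rebuild the image by summing the upsampled difference images."""
        levels = len(diffs)
        out = np.zeros(shape, dtype=float)
        for level, coarse in enumerate(diffs):
            block = 2 ** (levels - 1 - level)
            out += np.repeat(np.repeat(coarse, block, axis=0), block, axis=1)
        return out

    gen = np.random.default_rng(1)
    img = np.linspace(0, 255, 32)[None, :] + gen.normal(0, 4, (32, 32))
    d = analyse(img, levels=4)
    print([x.shape for x in d])      # (4, 4), (8, 8), (16, 16), (32, 32)
    print(np.abs(synthesise(d, img.shape) - img).max())   # ~0: lossless, as in step 5

Quantizing the higher-resolution difference images before storing them is what makes the real scheme lossy but very compact.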

JPEG2000 is an attempt to standardise wavelet image compression (Joint Photographic Experts Group).

2.3 Binary Tree Predictive Coding and Non-Uniform Sampling and Interpolation

These two methods are very similar. They both work by storing only the values of some of the pixels in an image and allowing the rest to be predicted by the decoder.

Non-uniform sampling and interpolation (NSI) encodes the values of the most important pixels; when decoding, the rest of the pixels are interpolated (predicted) from nearby known pixels (Rosenberg). Figure 3 shows the sample points used in an encoding of the Lena image. The method encodes large flattish areas very well but uses a lot of points for edges and textures. The method is not as good as BTPC because the positions of the sampled pixels are more complicated to encode.

Figure 3: The sample points used to encode Lena with NSI. From (Rosenberg)

Binary Tree Predictive Coding (BTPC) works in a similar way to NSI except that, instead of choosing the most important pixels, the pixels are sampled in a binary tree structure, with pixels nearer the root used to predict those below them. A zero-tree symbol is output when all pixels at lower levels can be predicted accurately enough. For details of the algorithm see (Robinson, 1994).
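A tiny sketch of the prediction principle shared by both methods: keep only a sparse set of pixel values and let the decoder fill in the rest by interpolation. For simplicity this uses a regular grid of samples and bilinear interpolation, which is much cruder than either method's actual point selection.

    # Keep one pixel in every 8x8 block and predict the rest by bilinear
    # interpolation between the kept samples.
    import numpy as np

    def sample(img, step=8):
        """Keep only every step-th pixel in each direction."""
        return img[::step, ::step]

    def predict(samples, step=8):
        """Bilinearly interpolate the kept samples back up to full resolution."""
        sh, sw = samples.shape
        out = np.zeros(((sh - 1) * step + 1, (sw - 1) * step + 1))
        for y in range(out.shape[0]):
            for x in range(out.shape[1]):
                gy, gx = y / step, x / step              # position in the sample grid
                y0, x0 = int(gy), int(gx)
                y1, x1 = min(y0 + 1, sh - 1), min(x0 + 1, sw - 1)
                fy, fx = gy - y0, gx - x0
                top = (1 - fx) * samples[y0, x0] + fx * samples[y0, x1]
                bot = (1 - fx) * samples[y1, x0] + fx * samples[y1, x1]
                out[y, x] = (1 - fy) * top + fy * bot
        return out

    yy, xx = np.mgrid[0:65, 0:65]
    smooth = 100 + 0.8 * yy + 1.2 * xx               # a smooth colour gradient
    approx = predict(sample(smooth))
    print("samples kept:", sample(smooth).size, "of", smooth.size)
    print("max error on the smooth area:", float(np.abs(approx - smooth).max()))

On a smooth gradient the prediction is essentially exact; edges and textures are where more sample points, or the adaptive choices made by NSI and BTPC, are needed.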

2.4 Fractal Image Compression

The word fractal is hard to define but is usually used to describe anything with infinite detail and self-similarity on different scales. Fractal image compression is based on partitioned IFS (Iterated Function Systems) (Barnsley and Hurd, 1992). Compressed images are described entirely in terms of themselves. Here is the simplest encoding algorithm:

1. The image is partitioned into blocks called ranges.

2. For each range, find a contractive transformation that describes it as closely as possible. Each range is defined as looking like a larger area (called a domain) anywhere in the image, with a brightness and a contrast adjustment. To guarantee that decoding will work the contrast adjustment must be less than 1 (the contrast is reduced in the transformation), but it has been found that some contrast adjustments greater than 1 can be safely allowed.

3. The transformation data is quantized. In practice the quantization is taken into account when judging how well a domain matches a range.

A very good and detailed account of fractal image compression is in (Fisher, 1994).

Figure 4: Range blocks used in the quadtree compression of Figure 6(e)

It may seem useless to define the image in terms of itself, but if you start with any image and iteratively apply the transformations, it will converge on the encoded image. This decoding process is fast and only needs a few iterations. This convergence has been mathematically proved and is due to the transformations being contractive. Since it is unlikely that every range will have a domain that matches it perfectly, fractal compression is by nature lossy and there is a limit to the fidelity of the reconstruction possible for any given image and range size.
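To make the range-domain matching and the iterative decoding concrete, here is a toy greyscale fractal codec with fixed 8×8 ranges and 16×16 domains on a coarse grid. The block sizes, the least-squares fit and the clipping of the contrast value are illustrative choices; real coders also quantize the transformation data and search over rotations and reflections of each domain, which is omitted here.

    # A toy fractal coder and decoder for a square greyscale image, following the
    # basic algorithm above: fixed 8x8 ranges, 16x16 domains on a coarse grid,
    # each domain shrunk 2x by averaging, and a least-squares fit of the
    # contrast and brightness for every range-domain pair.
    import numpy as np

    RANGE = 8      # range block size in pixels (an illustrative choice)
    DOMAIN = 16    # domain block size, twice the range size
    STEP = 16      # domain grid step; a finer step gives a larger domain pool

    def shrink(block):
        """Average 2x2 pixel groups so a DOMAIN block matches a RANGE block."""
        h, w = block.shape
        return block.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    def fit(domain, target):
        """Least-squares contrast s and brightness o so that s*domain + o ~ target."""
        d, r = domain.ravel(), target.ravel()
        var = ((d - d.mean()) ** 2).sum()
        s = 0.0 if var == 0 else ((d - d.mean()) * (r - r.mean())).sum() / var
        s = float(np.clip(s, -0.9, 0.9))   # keep the map contractive in brightness
        o = float(r.mean() - s * d.mean())
        err = ((s * d + o - r) ** 2).sum()
        return s, o, err

    def encode(img):
        """Return one (range y, range x, domain y, domain x, s, o) per range block."""
        h, w = img.shape
        domains = [(dy, dx, shrink(img[dy:dy + DOMAIN, dx:dx + DOMAIN]))
                   for dy in range(0, h - DOMAIN + 1, STEP)
                   for dx in range(0, w - DOMAIN + 1, STEP)]
        code = []
        for ry in range(0, h, RANGE):
            for rx in range(0, w, RANGE):
                target = img[ry:ry + RANGE, rx:rx + RANGE]
                # Pick the domain whose fitted transformation has the smallest error.
                s, o, _, dy, dx = min((fit(d, target) + (dy, dx)
                                       for dy, dx, d in domains),
                                      key=lambda t: t[2])
                code.append((ry, rx, dy, dx, s, o))
        return code

    def decode(code, shape, iterations=8):
        """Start from any image and apply the transformations repeatedly."""
        img = np.zeros(shape)              # an arbitrary starting image
        for _ in range(iterations):
            out = np.empty(shape)
            for ry, rx, dy, dx, s, o in code:
                d = shrink(img[dy:dy + DOMAIN, dx:dx + DOMAIN])
                out[ry:ry + RANGE, rx:rx + RANGE] = s * d + o
            img = out
        return img

    # Tiny demo on a synthetic 64x64 gradient-plus-noise image.
    gen = np.random.default_rng(0)
    test = np.linspace(0, 255, 64)[None, :] + gen.normal(0, 5, (64, 64))
    approx = decode(encode(test), test.shape)
    print("RMS error:", float(np.sqrt(((approx - test) ** 2).mean())))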

In the simplest algorithm the ranges are all the same size, but much better results are obtained by adapting the partitions to suit the image. Sometimes large areas can be described by one transformation while other areas need many ranges and many transformations. A simple method is quadtree partitioning (Figure 4). Better is HV partitioning, which starts with one range and recursively divides ranges unevenly into two (Fisher, 1994). Starting with small ranges and merging adjacent ranges gives good results as well (Ruhl et al., 1997).
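A rough sketch of the quadtree idea follows. In a real coder a block is split when no domain matches it well enough; the sketch below simply splits on the block's own variance, just to show the partitioning structure.

    # Quadtree partitioning sketch: a block becomes one range if it is already
    # nearly flat (or has reached the minimum size); otherwise it is split into
    # four quarters and the test is repeated on each quarter.
    import numpy as np

    def quadtree(img, y, x, size, min_size=4, threshold=10.0):
        """Return a list of (y, x, size) range blocks covering img[y:y+size, x:x+size]."""
        block = img[y:y + size, x:x + size]
        if size <= min_size or block.std() < threshold:
            return [(y, x, size)]
        half = size // 2
        return (quadtree(img, y,        x,        half, min_size, threshold) +
                quadtree(img, y,        x + half, half, min_size, threshold) +
                quadtree(img, y + half, x,        half, min_size, threshold) +
                quadtree(img, y + half, x + half, half, min_size, threshold))

    gen = np.random.default_rng(2)
    img = np.full((64, 64), 128.0)
    img[40:, 40:] += gen.normal(0, 30, (24, 24))   # a textured corner, flat elsewhere
    ranges = quadtree(img, 0, 0, 64)
    print(len(ranges), "ranges; sizes used:", sorted({s for _, _, s in ranges}))

The flat areas end up covered by a few large ranges while the textured corner is split down to small ones, mirroring the uneven block sizes visible in Figure 4.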

One problem with fractal image compression is that encoding takes a long time, because to find each transformation a large number of possible domains must be compared to the range. Classifying ranges and domains and only comparing those in the same class reduces the time taken (Fisher, 1994). Even faster encoding can be achieved by building a multi-dimensional search tree (Saupe, 1995) (Cardinal, 2001).

With the basic algorithm you often get visible artifacts at the edges of ranges (as in Figure 6(e)). It helps to blur the edges of ranges into each other, but this does not eliminate the problem entirely. The quality possible depends on the image encoded.

Fractal image compression can be combined with wavelet encoding, which gives better image quality without range edge artifacts. Due to the hierarchical nature of wavelet compression this allows decoding in one step instead of iterating (Levy and Wilson, 1995) (van de Walle, 1995).

Since the image is encoded by transformations rather than pixels, you can decode the image at any resolution. When decoded at a higher resolution extra detail is generated by the transformations, and this often gives better quality enlargements than stretching the original image (Figure 5). This property is called super resolution.

Figure 5: Super resolution. (a) Lena's eyes enlarged 400% by pixel resize; (b) Lena's eyes decoded at 400% size from a 20:1 FIF encoding.

3 Comparison of Decoded Colour Images

In Figures 6 and 7 you can compare decoded images from five different algorithms. There are mathematical metrics for judging the difference between two images, but for most applications the human eye is the best judge. The image used is the Lena image because it is an unofficial standard for image processing research and is used in most research papers on image compression. Using only one image makes this comparison less than thorough, but you can see the sort of artifacts and information lost when using the different algorithms.

I have used colour images and not greyscale images because colour images are more relevant to real-world uses. The images are 512×512 pixels and 24-bit colour. The Lena image and other test images are available from (Kominek).

The programs used to encode these images were:

• Jasc Paint Shop Pro 5.0 for the DCT based JPEG.

• LuraWave SmartCompress (AlgoVision-LuraTech) for the wavelet-based JPEG2000.

• Iterated Systems Fractal Imager 1.6 (Iterated Systems Inc) for FIF, which appears to be based on fractal transformations and wavelets.

• FRACOMP 1.0 (Kassler, 1996) for QIF, which is fractal based and uses quadtrees.

• BTPC 4.1 (Robinson, 1994) for Binary Tree Predictive Coding.

Figure 6: Comparison of the Lena image compressed around 38:1 using different methods. (a) The original uncompressed Lena image (768KB); (b) JPEG (20KB); (c) LuraWave JPEG2000 (20KB); (d) Iterated Systems FIF (19KB); (e) FRACOMP QIF (23KB); (f) Binary Tree Predictive Coding (19KB).

Figure 7: Comparison of the Lena image compressed around 76:1 using different methods. (a) The original uncompressed Lena image (768KB); (b) JPEG (10KB); (c) LuraWave JPEG2000 (10KB); (d) Iterated Systems FIF (10KB); (e) FRACOMP QIF (10KB); (f) Binary Tree Predictive Coding (10KB).


4 Combining Fractal Compression with Sampling and Interpolation

The biggest problem with standard fractal compression is the artifacts at the edges of ranges. Post-processing can help but not eliminate the problem. One successful way of improving it is by combining fractal compression with wavelets, as in (Levy and Wilson, 1995) and (van de Walle, 1995). However, this method restricts the size of the domain pool because the domains must be aligned with wavelets at the earlier levels. Generally, in fractal compression, the larger the domain pool the better the quality or compression ratio that can be achieved (Fisher, 1994).

Another way to improve image quality might be to use the principles of binary tree predictive coding and non-uniform sampling. This is an area I am currently researching.

My idea is to remove the brightness offset in the transformations and encode the actual colour values of the four corner pixels of each range. In applying the transformation the domain would be tilted so that the colours of its four corners map to the four corners of the range. Each corner in the transformation would effectively have its own brightness offset calculated. The brightness offset for each pixel in the range would be calculated by interpolating between the offsets of the four corners.
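As a hypothetical illustration of the interpolation step in this proposal (not a working coder, and the block size is an arbitrary choice), the per-pixel brightness offset inside one square range could be bilinearly interpolated from the four corner offsets:

    # Hypothetical sketch: given the brightness offsets at the four corners of an
    # n x n range block, interpolate an offset for every pixel in the block.
    import numpy as np

    def corner_offsets_to_block(top_left, top_right, bottom_left, bottom_right, n):
        """Bilinear interpolation of four corner offsets over an n x n block."""
        t = np.linspace(0.0, 1.0, n)                       # 0 at one edge, 1 at the other
        top = (1 - t) * top_left + t * top_right           # interpolate along the top row
        bottom = (1 - t) * bottom_left + t * bottom_right  # and along the bottom row
        return (1 - t)[:, None] * top[None, :] + t[:, None] * bottom[None, :]

    # The range would then be reconstructed as contrast * shrunken_domain + offsets,
    # and corners shared with neighbouring ranges keep the block edges consistent.
    offsets = corner_offsets_to_block(10.0, 30.0, 20.0, 60.0, n=8)
    print(offsets.round(1))
    print(offsets[0, 0], offsets[0, -1], offsets[-1, 0], offsets[-1, -1])  # the corners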

The corner pixel values would be shared between adjacent ranges. For an even grid of ranges of the same size every corner not on the image edge is shared by four ranges, so storing them would take no more space than storing brightness offsets does. With HV partitioning there are the four corners of the image to begin with, and each time a range is split two new corners are added, so there are roughly twice as many corners to encode as there are ranges.

By defining exactly the colour of each corner of each range there would be no artifacts at the corners of ranges, and artifacts across edges would be greatly reduced. Since the corners are shared by ranges, the line of pixels on the edge of two ranges could be calculated as the average of the results of the two ranges' transformations.

As well as removing the edge artifacts, the automatic tilting of the domain to fit the range would allow better domain-range matches. For example, a textured range with a constant gradient could be matched with a domain with no gradient and the same texture.

Another area in which fractal image compression could be improved is the size of domains. Most current methods use domains four times the size of ranges because shrinking them is then simple: each range pixel is just the mean of four domain pixels. Better matches might be found with domains only slightly bigger than the ranges. It would also be possible to use domains of the same size as the ranges for some ranges, as long as there are no circular references leading to a range being mapped to itself. This could be prevented by insisting on a contrast adjustment strictly less than 1 for transformations that are not spatially contractive. The pillar in the left of Figure 4 could have been encoded with fewer ranges if one part of it had been encoded with a few contractive transformations and the rest of it encoded as looking like that part.

I am currently working on implementing these ideas and will report the results in my 3rd year project final report in May 2003.

5 Conclusions

There are many good algorithms to compete with the widespread JPEG standard, but judging from Figure 6 none of them appear to provide better quality at the 38:1 compression ratio, which is suitable for web images. This is a little surprising considering the number of claims of compression better than JPEG in research papers and books. These claims are usually based on higher compression ratios, as in Figure 7, or on image quality metrics such as PSNR (Peak Signal to Noise Ratio), which does not model the properties of human vision well. It looks likely JPEG will remain the most popular format for the foreseeable future.
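For reference, PSNR compares two images only through their mean squared pixel error, which is one reason it can rank images differently from a human viewer. A minimal definition, assuming 8-bit pixel values:

    # PSNR (Peak Signal to Noise Ratio) in decibels for 8-bit images.  It depends
    # only on the mean squared error, not on where or how visible the errors are.
    import numpy as np

    def psnr(original, decoded, peak=255.0):
        mse = np.mean((original.astype(float) - decoded.astype(float)) ** 2)
        if mse == 0:
            return float("inf")        # identical images
        return 10.0 * np.log10(peak ** 2 / mse)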

Each algorithm has its own advantages and drawbacks in different areas. For example, in Figure 6 the FIF and BTPC images have the best-looking sharp edges, but in the FIF image the hat looks very bad and the BTPC image has speckles. JPEG gives the best hat texture but has minor artifacts at some edges. JPEG2000 is a good compromise between edges and textures, performing nearly as well as the best in both areas.

JPEG2000's wavelet-based format is overall probably the best available. It performs extremely well over a wide range of compression ratios. The format is also the most flexible; it supports transparency, gamma correction, progressive decoding and regions of interest (increased quality for part of an image).

There are still possibilities for further research. Fractal image compression still has a lot of potential, and sampling pixels at range block corners could lead to a very significant improvement in image quality for any given compression ratio.

References

AlgoVision-LuraTech. LuraWave SmartCompress 3.0. http://www.algovision-luratech.com/.

Iterated Systems Inc. Fractal Imager Plus 1.6. The website www.iterated.com is often referenced but it has little information and only a Photoshop plug-in to download. However, with a Google search you can find Iterated Systems' free stand-alone FIF compressor: fi16 mmx.exe.

Joint Photographic Experts Group. JPEG specifications and news. http://www.jpeg.org/.

Michael F. Barnsley and Lyman P. Hurd. Fractal Image Compression. AK Peters, 1992. ISBN 1-56881-000-8.

Jean Cardinal. Fast fractal compression of greyscale images. IEEE Transactions on Image Processing, 10(1), 2001.

Dartmouth College. Wavelets diagram. http://eamusic.dartmouth.edu/~book/MATCpages/chap.3/3.6.alts_FFT.html.

Yuval Fisher. Fractal Image Compression: Theory and Application. Springer-Verlag, 1994. ISBN 0-387-94211-4.

Andreas Kassler. FRACOMP 1.0: Fraktale Bildkompression (fractal image compression), 1996. http://www-vs.informatik.uni-ulm.de/Mitarbeiter/Kassler/fractals.htm.

John Kominek. Waterloo BragZone.

I. Levy and R. G. Wilson. A hybrid fractal-wavelet transform image data compression algorithm. Technical Report CS-RR-289, Coventry, UK, 1995. http://citeseer.nj.nec.com/329032.html.

Jean loup Gailly. comp.compression frequently asked questions. http://www.faqs.org/faqs/compression-faq/.

John A. Robinson. Binary tree predictive coding, 1994. Web pages based on a paper submitted to IEEE Transactions on Image Processing. http://www.elec.york.ac.uk/visual/jar11/btpc/btpc.html.

Chuck Rosenberg. Non-uniform sampling and interpolation for lossy image compression. http://www-2.cs.cmu.edu/~chuck/nsipg/nsi.html#nsi.

Matthias Ruhl, Hannes Hartenstein, and Dietmar Saupe. Adaptive partitionings for fractal image compression. In Proceedings ICIP-97 (IEEE International Conference on Image Processing), volume II, pages 310–313, Santa Barbara, CA, USA, 1997. http://citeseer.nj.nec.com/ruhl97adaptive.html.

David Salomon. DCT diagram. http://www.ecs.csun.edu/~dxs/DC2advertis/DComp2Ad.html.

Dietmar Saupe. Accelerating fractal image compression by multi-dimensional nearest neighbor search. In J. A. Storer and M. Cohn, editors, Proceedings DCC'95 (IEEE Data Compression Conference), pages 222–231, Snowbird, UT, USA, 1995. http://citeseer.nj.nec.com/458915.html.

Axel van de Walle. Relating fractal image compression to transform methods. Master's thesis, Waterloo, Canada, 1995. http://citeseer.nj.nec.com/walle95relating.html.