Ch5 Dictionary Coding

  • 8/10/2019 Ch5 Dictionary Coding

    1/56

    Chapter 5 DICTIONARY CODING

    Dictionary Coding

    Dictionary coding is different from Huffman coding and arithmetic coding.

    Both Huffman and arithmetic coding techniques are based on a statistical model (e.g., occurrence probabilities).

    In dictionary-based data compression techniques, a symbol or a string of symbols generated from a source alphabet is represented by an index to a dictionary constructed from the source alphabet.

    Dictionary Coding

    A dictionary is a list of symbols and strings of symbols.

    There are many examples of this in our daily lives.

    the string September vs. 9,

    a social security number vs. a person in the U.S.

    Dictionary coding is widely used in text coding.

    Dictionary Coding

    Consider English text coding. The source alphabet includes 26 English letters in both lower and upper cases, numbers, various punctuation marks, and the space bar.

    Huffman or arithmetic coding treats each symbol based on its occurrence probability. That is, the source is modeled as a memoryless source.

    It is well known, however, that this is not true in many applications.

    In text coding, structure or context plays a significant role. It is very likely that the letter u appears after the letter q.

    Likewise, it is likely that the word "concerned" will appear after "As far as the weather is".

    Dictionary Coding

    The strategy of dictionary coding is to build a dictionary that contains frequently occurring symbols and strings of symbols.

    When a symbol or a string is encountered and it is contained in the dictionary, it is encoded with an index to the dictionary.

    Otherwise, if not in the dictionary, the symbol or the string of symbols is encoded in a less efficient manner.

    Dictionary Coding

    All the dictionary schemes have an equivalent statistical scheme, which achieves exactly the same compression.

    A statistical scheme using high-order context models may outperform dictionary coding (with high computational complexity).

    But dictionary coding is now superior for fast speed and economy of memory. In the future, however, statistical schemes may be preferred [bell 1990].

    Formulation of Dictionary Coding

    Define dictionary coding in a precise manner [bell 1990].

    We denote a source alphabet by S.

    A dictionary consisting of two elements is defined as D = (P, C), where P is a finite set of phrases generated from the S, and C is a coding function mapping P onto a set of codewords.

    Formulation of Dictionary Coding

    The set P is said to be complete if any input string can be represented by a series of phrases chosen from the P.

    The coding function C is said to obey the prefix property if there is no codeword that is a prefix of any other codeword.

    For practical usage, i.e., for reversible compression of any input text, the phrase set P must be complete and the coding function C must satisfy the prefix property.
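The prefix property can be checked mechanically. The sketch below (names are mine, not from the text) tests a set of codewords by sorting them, since a prefix sorts immediately before any codeword that extends it:

```python
# A small sketch: check the prefix property of a set of codewords.

def has_prefix_property(codewords):
    """True iff no codeword is a prefix of any other codeword."""
    words = sorted(codewords)        # a prefix sorts immediately before
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

print(has_prefix_property(['10', '011', '010']))   # True
print(has_prefix_property(['01', '010']))          # False: 01 prefixes 010
```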

    Categorization of Dictionary-

    Based Coding Techniques

    The heart of dictionary coding is the formulation of the dictionary.

    A successfully built dictionary results in data compression; the opposite case may lead to data expansion.

    According to the ways in which dictionaries are constructed, dictionary coding techniques can be classified as static or adaptive.

    Static Dictionary Coding

    A fixed dictionary:
      Produced before the coding process
      Used at both the transmitting and receiving ends

    It is possible when the knowledge about the source alphabet and the related strings of symbols, also known as phrases, is sufficient.

    Merit of the static approach: its simplicity.

    Its drawbacks lie in:
      Relatively lower coding efficiency
      Less flexibility compared with adaptive dictionary techniques

    Static Dictionary Coding

    An example of a static algorithm is digram coding, a simple and fast coding technique.

    The dictionary contains all source symbols and some frequently used pairs of symbols.

    In encoding, two symbols are checked at once to see if they are in the dictionary. If so, they are replaced by the index of the pair in the dictionary, and the next pair of symbols is encoded in the next step.

    Static Dictionary Coding

    If not, then the index of the first symbol is used to encode the first symbol. The second symbol is combined with the third symbol to form a new pair, which is encoded in the next step.

    Digram coding can be straightforwardly extended to n-grams. In the extension, the size of the dictionary increases and so does its coding efficiency.
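A minimal sketch of digram coding, using a small hypothetical dictionary (the symbols and pairs below are illustrative, not from the text): the encoder tries a pair first and falls back to a single symbol, exactly as described above.

```python
# A sketch of digram coding with a hypothetical static dictionary.
# The dictionary holds every single symbol plus a few frequent pairs;
# encoding tries a pair first, and falls back to a single symbol.

def digram_encode(text, dictionary):
    """Return the list of dictionary indices encoding `text`."""
    indices = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in dictionary:
            indices.append(dictionary[pair])     # pair found: consume 2 symbols
            i += 2
        else:
            indices.append(dictionary[text[i]])  # fall back to a single symbol
            i += 1
    return indices

# Hypothetical dictionary: all source symbols, then frequent pairs.
D = {'a': 0, 'b': 1, 'c': 2, 'ab': 3, 'ba': 4}
print(digram_encode('abbac', D))   # [3, 4, 2]
```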

    Adaptive Dictionary Coding

    A completely defined dictionary does not exist prior to the encoding process, and the dictionary is not fixed.

    At the beginning of coding, only an initial dictionary exists. It adapts itself to the input during the coding process.

    All adaptive dictionary coding algorithms can be traced back to two different original works by Ziv and Lempel [ziv 1977, ziv 1978].

    The algorithms based on [ziv 1977] are referred to as the LZ77 algorithms.

    Those based on [ziv 1978] are the LZ78 algorithms.

    Parsing Strategy

    Once we have a dictionary, we need to examine the input text and find a string of symbols that matches an item in the dictionary. Then the index of the item in the dictionary is encoded.

    This process of segmenting the input text into disjoint strings (whose union equals the input text) for coding is referred to as parsing.

    Obviously, the way to segment the input text into strings is not unique.

    Parsing Strategy

    In terms of the highest coding efficiency, optimal parsing is essentially a shortest-path problem [bell 1990].

    In practice, however, a method called greedy parsing is used in all the LZ77 and LZ78 algorithms.

    With greedy parsing, the encoder searches for the longest string of symbols in the input that matches an item in the dictionary at each coding step.

    Greedy parsing may not be optimal, but it is simple to implement.

    Parsing Strategy

    Example 6.1

    Consider a dictionary, D, whose phrase set is P = {a, b, ab, ba, bb, aab, bbb}. The codewords assigned to these strings are C(a) = 10, C(b) = 011, C(ab) = 010, C(ba) = 0101, C(bb) = 01, C(aab) = 11, and C(bbb) = 0110. Now the input text: abbaab.

    Using greedy parsing, we then encode the text as C(ab).C(ba).C(ab), which is the 10-bit string 010.0101.010.

    In the above representation, the periods are used to indicate the division of segments in the parsing.

    This, however, is not an optimum solution. Obviously, the following parsing will be more efficient: C(a).C(bb).C(aab), which is the 6-bit string 10.01.11.
    Sliding Window (LZ77) Algorithms

    Introduction

    The dictionary used is actually a portion of the input text, which has been recently encoded.

    The text that needs to be encoded is compared with the strings of symbols in the dictionary.

    The longest matched string in the dictionary is characterized by a pointer (sometimes called a token), which is represented by a triple of data items.

    Note that this triple functions as an index to the dictionary.

    In this way, a variable-length string of symbols is mapped to a fixed-length pointer.

    Introduction

    There is a sliding window in the LZ77 algorithms. The window consists of two parts: a search buffer and a look-ahead buffer.

    The search buffer contains the portion of the text stream that has recently been encoded --- the dictionary.

    The look-ahead buffer contains the text to be encoded next.

    The window slides through the input text stream from beginning to end during the entire encoding process.

    The size of the search buffer is much larger than that of the look-ahead buffer.

    The sliding window: usually on the order of a few thousand symbols.

    The look-ahead buffer: on the order of several tens to one hundred symbols.

    Encoding and Decoding

    Example 6.2

    Encoding:

    Encoding and Decoding

    Decoding: Now let us see how the decoder recovers the string baccbaccgi from these three triples.

    In Figure 6.5, for each part, the last previously encoded symbol c prior to the receiving of the three triples is shaded.

    Summary of the LZ77 Approach

    The first symbol in the look-ahead buffer is the symbol, or the beginning of a string of symbols, to be encoded at the moment. Let us call it the symbol s.

    In encoding, the search pointer moves to the left, away from the symbol s, to find a match of the symbol s in the search buffer. When there are multiple matches, the match that produces the longest matched string is chosen. The match is denoted by a triple (i, j, k).

    Summary of the LZ77 Approach

    The first item in the triple, i, is the offset, which is the distance between the pointer pointing to the symbol giving the maximum match and the symbol s.

    The second item, j, is the length of the matched string.

    The third item, k, is the codeword of the symbol following the matched string in the look-ahead buffer.

    At the very beginning, the content of the search buffer can be arbitrarily selected. For instance, the symbols in the search buffer may all be the space symbol.

    Summary of the LZ77 Approach

    Denote the size of the search buffer by SB, the size of the look-ahead buffer by L, and the size of the source alphabet by A. Assume that the natural binary code is used. Then we see that the LZ77 approach encodes variable-length strings of symbols with fixed-length codewords.

    Specifically, the offset i is of coding length ⌈log2(SB)⌉, the length of the matched string j is of coding length ⌈log2(SB + L)⌉, and the codeword k is of coding length ⌈log2(A)⌉, where the sign ⌈a⌉ denotes the smallest integer larger than a.

    Summary of the LZ77 Approach

    The length of the matched string is equal tobecause the search for the maximum matchingcan enter into the look-ahead buffer.

    The decoding process is simpler than theencoding process since there is no comparisoninvolved in the decoding.

    The most recently encoded symbols in the search

    buffer serve as the dictionary used in the LZ77approach. The merit of doing so is that thedictionary is adapted to the input text well.

    )(log 2 LSB


    LZ78 Algorithms

    Introduction

    Limitations with LZ77:

    If the distance between two repeated patterns is larger than the size of the search buffer, the LZ77 algorithms cannot work efficiently.

    The fixed size of both buffers implies that the matched string cannot be longer than the sum of the sizes of the two buffers, meaning another limitation on coding efficiency.

    Increasing the sizes of the search buffer and the look-ahead buffer seemingly will resolve the problems. A close look, however, reveals that it also leads to increases in the number of bits required to encode the offset and matched-string length, as well as an increase in processing complexity.

    Introduction

    LZ78: No use of the sliding window.

    Use the encoded text as a dictionary which, potentially, does not have a fixed size.

    Each time a pointer (token) is issued, the encoded string is included in the dictionary.

    Once a preset limit to the dictionary size has been reached, either the dictionary is fixed for the future (if the coding efficiency is good), or it is reset to zero, i.e., it must be restarted.

    Instead of the triples used in the LZ77, only pairs are used in the LZ78.

    Specifically, only the position of the pointer to the matched string and the symbol following the matched string need to be encoded.

    Encoding and Decoding

    Example 6.3 Encoding:

    Consider the text stream: baccbaccacbcabccbbacc.

    In Table 6.4, in general, as the encoding proceeds, the entries in the dictionary become longer and longer. First, entries with single symbols come out; later, more and more entries with two symbols show up; after that, more and more entries with three symbols appear. This means that coding efficiency is increasing.

    Decoding: Since the decoder knows the rule applied in the encoding, it can reconstruct the dictionary and decode the input text stream from the received doubles.

    Encoding and Decoding

    Index Doubles Encoded symbols

    1 < 0, C(b) > b

    2 < 0, C(a) > a

    3 < 0, C(c) > c

    4 < 3, 1 > cb

    5 < 2, 3 > ac

    6 < 3, 2 > ca

    7 < 4, 3 > cbc

    8 < 2, 1 > ab

    9 < 3, 3 > cc

    10 < 1, 1 > bb

    11 < 5, 3 > acc

    Table 6.4 An encoding example using the LZ78 algorithm

    LZW Algorithm

    Both the LZ77 and LZ78 approaches, when published in 1977 and 1978, respectively, were theory-oriented.

    The effective and practical improvement over the LZ78 in [welch 1984] brought much attention to the LZ dictionary coding techniques. The resulting algorithm is referred to as the LZW algorithm.

    It removed the second item in the double (the index of the symbol following the longest matched string) and hence enhanced coding efficiency. In other words, the LZW only sends the indexes of the dictionary to the decoder.

    LZW Algorithm

    Example 6.4

    Consider the following input text stream: accbadaccbaccbacc. We see that the source alphabet is S = {a, b, c, d}.

    Encoding:

    LZW Algorithm

    Index   Entry   Input symbols   Encoded index

    1       a       (initial dictionary)
    2       b
    3       c
    4       d

    5       ac      a               1
    6       cc      c               3
    7       cb      c               3
    8       ba      b               2
    9       ad      a               1
    10      da      d               4
    11      acc     a,c             5
    12      cba     c,b             7
    13      accb    a,c,c           11
    14      bac     b,a             8
    --      --      c,c             6

    Table 6.5 An example of the dictionary coding using the LZW algorithm

    LZW Algorithm

    Decoding: Initially, the decoder has the same dictionary (the top four rows in Table 6.5) as that in the encoder.

    Once the first index 1 comes, the decoder decodes a symbol a.

    The second index is 3, which indicates that the next symbol is c. From the rule applied in encoding, the decoder knows further that a new entry ac has been added to the dictionary with an index 5.

    The next index is 3. It is known that the next symbol is also c. It is also known that the string cc has been added into the dictionary as the sixth entry.

    In this way, the decoder reconstructs the dictionary and decodes the input text stream.
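Example 6.4 can be reproduced with a short sketch of LZW encoding and decoding; the initial dictionary is S = {a, b, c, d}, indexed from 1 as in Table 6.5. The decoder also handles the classic corner case where an index refers to the entry about to be created, which does not occur in this example.

```python
# A sketch of LZW encoding/decoding for Example 6.4, with the initial
# dictionary built from the alphabet and indexed from 1 as on the slide.

def lzw_encode(text, alphabet):
    dictionary = {s: i for i, s in enumerate(alphabet, start=1)}
    out = []
    w = ''
    for ch in text:
        if w + ch in dictionary:
            w += ch                                  # extend the current match
        else:
            out.append(dictionary[w])                # emit index of longest match
            dictionary[w + ch] = len(dictionary) + 1 # add a new entry
            w = ch
    out.append(dictionary[w])                        # flush the final match
    return out

def lzw_decode(codes, alphabet):
    dictionary = {i: s for i, s in enumerate(alphabet, start=1)}
    w = dictionary[codes[0]]
    out = [w]
    for k in codes[1:]:
        # k may name the entry about to be created (classic LZW corner case)
        entry = dictionary[k] if k in dictionary else w + w[0]
        out.append(entry)
        dictionary[len(dictionary) + 1] = w + entry[0]
        w = entry
    return ''.join(out)

codes = lzw_encode('accbadaccbaccbacc', 'abcd')
print(codes)                       # [1, 3, 3, 2, 1, 4, 5, 7, 11, 8, 6]
print(lzw_decode(codes, 'abcd'))   # accbadaccbaccbacc
```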

    LZ78 Compression Algorithm (cont'd)

    Dictionary ← empty; Prefix ← empty; DictionaryIndex ← 1;

    while (characterStream is not empty)
    {
        Char ← next character in characterStream;
        if (Prefix + Char exists in the Dictionary)
            Prefix ← Prefix + Char;
        else
        {
            if (Prefix is empty)
                CodeWordForPrefix ← 0;
            else
                CodeWordForPrefix ← DictionaryIndex for Prefix;
            Output: (CodeWordForPrefix, Char);
            insertInDictionary((DictionaryIndex, Prefix + Char));
            DictionaryIndex++;
            Prefix ← empty;
        }
    }
    if (Prefix is not empty)
    {
        CodeWordForPrefix ← DictionaryIndex for Prefix;
        Output: (CodeWordForPrefix, );
    }
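The pseudocode above translates almost line for line into Python. This is a sketch, not the book's code; pairs are (codeword for the longest matched prefix, next character), and a match left over at the end of the stream is flushed as a pair with an empty character:

```python
# A Python sketch of the LZ78 compression pseudocode above.

def lz78_encode(text):
    dictionary = {}              # phrase -> dictionary index
    output = []
    prefix = ''
    for char in text:
        if prefix + char in dictionary:
            prefix += char                       # keep extending the match
        else:
            output.append((dictionary.get(prefix, 0), char))
            dictionary[prefix + char] = len(dictionary) + 1
            prefix = ''
    if prefix:                                   # leftover prefix at the end
        output.append((dictionary[prefix], ''))
    return output

print(lz78_encode('ABBCBCABABCAABCAAB'))
# (0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B), as in Example 1
```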

    Exercises

    1. Encode (i.e., compress) the following string using the LZ77 algorithm.

    Window size: 5
    ABBABCABBBBC

    2. Encode (i.e., compress) the following strings

    using the LZ78 algorithm.

    1.ABBCBCABABCAABCAAB

    2.BABAABRRRA

    3.AAAAAAAAA

    LZ77 - Example

    Window size : 5

    ABBABCABBBBC

    NextSeq   Code
    A         (0,0,A)
    B         (0,0,B)
    BA        (1,1,A)
    BC        (3,1,C)
    ABB       (3,2,B)
    BBC       (2,2,C)
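The triples in this table can be reproduced with a small sketch of the LZ77 encoder (my implementation, with a search buffer of 5 symbols). A match may extend into the look-ahead buffer, and scanning larger offsets first reproduces the tie-breaking shown in the table:

```python
# A sketch of LZ77 encoding with a search buffer of 5 symbols.
# Offsets are counted back from the current position; a match may run
# into the look-ahead buffer (overlap), and among equally long matches
# the largest offset is kept.

def lz77_encode(text, search_size=5):
    pos = 0
    triples = []
    while pos < len(text):
        best_off, best_len = 0, 0
        for off in range(min(pos, search_size), 0, -1):
            length = 0
            while (pos + length < len(text) - 1
                   and text[pos + length - off] == text[pos + length]):
                length += 1
            if length > best_len:                # keep the longest match
                best_off, best_len = off, length
        nxt = text[pos + best_len]               # symbol after the match
        triples.append((best_off, best_len, nxt))
        pos += best_len + 1
    return triples

print(lz77_encode('ABBABCABBBBC'))
# [(0,0,'A'), (0,0,'B'), (1,1,'A'), (3,1,'C'), (3,2,'B'), (2,2,'C')]
```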

    Example 1: LZ78 Compression

    Encode (i.e., compress) the string ABBCBCABABCAABCAAB using the LZ78 algorithm.

    The compressed message is: (0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)

    Note: the above is just a representation; the commas and parentheses are not transmitted. We will discuss the actual form of the compressed message later on.

    Example 1: LZ78 Compression (cont'd)

    1. A is not in the Dictionary; insert it.
    2. B is not in the Dictionary; insert it.
    3. B is in the Dictionary.
       BC is not in the Dictionary; insert it.
    4. B is in the Dictionary.
       BC is in the Dictionary.
       BCA is not in the Dictionary; insert it.
    5. B is in the Dictionary.
       BA is not in the Dictionary; insert it.
    6. B is in the Dictionary.
       BC is in the Dictionary.
       BCA is in the Dictionary.
       BCAA is not in the Dictionary; insert it.
    7. B is in the Dictionary.
       BC is in the Dictionary.
       BCA is in the Dictionary.
       BCAA is in the Dictionary.
       BCAAB is not in the Dictionary; insert it.

    Example 2: LZ78 Compression


    Encode (i.e., compress) the string BABAABRRRA using the LZ78 algorithm.

    The compressed message is: (0,B)(0,A)(1,A)(2,B)(0,R)(5,R)(2, )

    Example 2: LZ78 Compression (cont'd)

    1. B is not in the Dictionary; insert it.
    2. A is not in the Dictionary; insert it.
    3. B is in the Dictionary.
       BA is not in the Dictionary; insert it.
    4. A is in the Dictionary.
       AB is not in the Dictionary; insert it.
    5. R is not in the Dictionary; insert it.
    6. R is in the Dictionary.
       RR is not in the Dictionary; insert it.
    7. A is in the Dictionary and it is the last input character; output a pair containing its index: (2, )

    Example 3: LZ78 Compression

    Encode (i.e., compress) the string AAAAAAAAA using the LZ78 algorithm.

    1. A is not in the Dictionary; insert it.
    2. A is in the Dictionary.
       AA is not in the Dictionary; insert it.
    3. A is in the Dictionary.
       AA is in the Dictionary.
       AAA is not in the Dictionary; insert it.
    4. A is in the Dictionary.
       AA is in the Dictionary.
       AAA is in the Dictionary and it is the last pattern; output a pair containing its index: (3, )

    LZ78 Compression: Number of bits transmitted

    Example: Uncompressed String: ABBCBCABABCAABCAAB

    Number of bits = Total number of characters * 8

    = 18 * 8

    = 144 bits

    Suppose the codewords are indexed starting from 1:

    Compressed string (codewords): (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)

    Codeword index: 1 2 3 4 5 6 7

    Each codeword consists of an integer and a character:

    The character is represented by 8 bits.

    The number of bits n required to represent the integer part of the codeword with index i is given by n = ⌈log2(i)⌉, taking n = 1 for i = 1.

    Alternatively, the number of bits required to represent the integer part of the codeword with index i is the number of significant bits required to represent the integer i - 1.

    LZ78 Compression: Number of bits transmitted (contd)

    Codeword:  (0, A)  (0, B)  (2, C)  (3, A)  (2, A)  (4, A)  (6, B)
    Index:     1       2       3       4       5       6       7

    Bits: (1 + 8) + (1 + 8) + (2 + 8) + (2 + 8) + (3 + 8) + (3 + 8) + (3 + 8) = 71 bits

    The actual compressed message is: 0A0B10C11A010A100A110B

    where each character is replaced by its binary 8-bit ASCII code.
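The 71-bit total can be computed directly from the rule above: the integer part of the codeword with index i takes ⌈log2(i)⌉ bits (1 bit for i = 1), plus 8 bits per character. A small sketch:

```python
# Bit count for a sequence of LZ78 codewords: the integer in the codeword
# with index i is written with ceil(log2(i)) bits (1 bit for i = 1),
# and each character costs 8 bits.

from math import ceil, log2

def lz78_bit_count(num_codewords, bits_per_char=8):
    total = 0
    for i in range(1, num_codewords + 1):
        int_bits = max(1, ceil(log2(i)))     # bits for the integer part
        total += int_bits + bits_per_char    # plus the 8-bit character
    return total

print(lz78_bit_count(7))   # the 7 codewords of Example 1: 71 bits
```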

    LZ78 Decompression Algorithm

    Dictionary ← empty; DictionaryIndex ← 1;

    while (there are more (CodeWord, Char) pairs in codestream)
    {
        CodeWord ← next CodeWord in codestream;
        Char ← character corresponding to CodeWord;
        if (CodeWord == 0)
            String ← empty;
        else
            String ← string at index CodeWord in Dictionary;
        Output: String + Char;
        insertInDictionary((DictionaryIndex, String + Char));
        DictionaryIndex++;
    }

    Summary:
    input: (CW, character) pairs
    output:
        if (CW == 0)
            output: currentCharacter
        else
            output: stringAtIndex CW + currentCharacter
    insert: current output in dictionary
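The decompression pseudocode above, sketched in Python (my translation, not the book's code). Codeword 0 means an empty prefix, and the final pair may carry an empty character, as in Example 2:

```python
# A Python sketch of the LZ78 decompression pseudocode above.

def lz78_decode(pairs):
    dictionary = {}              # index -> phrase
    output = []
    for codeword, char in pairs:
        string = dictionary[codeword] if codeword else ''
        output.append(string + char)                # emit prefix + character
        dictionary[len(dictionary) + 1] = string + char  # rebuild dictionary
    return ''.join(output)

print(lz78_decode([(0, 'B'), (0, 'A'), (1, 'A'), (2, 'B'),
                   (0, 'R'), (5, 'R'), (2, '')]))   # BABAABRRRA
```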

    Example 1: LZ78 Decompression


    Decode (i.e., decompress) the sequence (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)

    The decompressed message is: ABBCBCABABCAABCAAB

    Example 2: LZ78 Decompression

    Decode (i.e., decompress) the sequence (0, B) (0, A) (1, A) (2, B) (0, R) (5, R) (2, )

    The decompressed message is: BABAABRRRA

    Example 3: LZ78 Decompression


    Decode (i.e., decompress) the sequence (0, A) (1, A) (2, A) (3, )

    The decompressed message is: AAAAAAAAA

    Applications

    The CCITT Recommendation V.42 bis is a data compression standard used in modems that connect computers with remote users via the GSTN. In the compressed mode, the LZW algorithm is recommended for data compression.

    In image compression, the LZW finds its application as well. Specifically, it is utilized in the graphic interchange format (GIF), which was created to encode graphical images. GIF is now also used to encode natural images, though it is not very efficient in this regard. For more information, readers are referred to [sayood 1996].

    The LZW algorithm is also used in the Unix Compress command.

    International Standards for

    Lossless Still Image Compression

    Lossless Bilevel Still Image Compression

    Algorithms

    As mentioned above, there are four different international standard algorithms falling into this category.

    MH (Modified Huffman coding): This algorithm is defined in CCITT Recommendation T.4 for facsimile coding. It uses the 1-D RLC technique followed by the modified Huffman coding technique.

    Algorithms

    MR (Modified READ (Relative Element Address Designate) coding): Defined in CCITT Recommendation T.4 for facsimile coding. It uses the 2-D RLC technique followed by the modified Huffman coding technique.

    MMR (Modified Modified READ coding): Defined in CCITT Recommendation T.6. It is based on MR, but is modified to maximize compression.

    JBIG (Joint Bilevel Image experts Group coding): Defined in CCITT Recommendation T.82. It uses an adaptive 2-D coding model, followed by an adaptive arithmetic coding technique.

    Performance Comparison

    The JBIG test image set was used to compare the performance of the above-mentioned algorithms. The set contains scanned business documents with different densities, graphic images, digital halftones, and mixed (document and halftone) images.

    Note that digital halftones, also named (digital) halftone images, are generated by using only binary devices.

    Some small black units are imposed on a white background. The units may assume different shapes: a circle, a square, and so on.

    The denser the black units in a spot of an image, the darker the spot appears.

    The digital halftoning method has been used for printing gray-level images in newspapers and books.

    Performance Comparison

    For bilevel images excluding digital halftones, the compression ratio achieved by these techniques ranges from 3 to 100. The compression ratio increases monotonically in the order of the following standard algorithms: MH, MR, MMR, JBIG.

    For digital halftones, MH, MR, and MMR result in data expansion, while JBIG achieves compression ratios in the range of 5 to 20.

    This demonstrates that among the techniques, JBIG is the only one suitable for the compression of digital halftones.

    Lossless Multilevel Still Image

    Compression

    Algorithms

    There are two international standards for multilevel still image compression:

    JBIG (Joint Bilevel Image experts Group coding): Defined in CCITT Recommendation T.82.

    It uses an adaptive arithmetic coding technique. To encode multilevel images, the JBIG decomposes multilevel images into bit-planes, then compresses these bit-planes using its bilevel image compression technique. To further enhance the compression ratio, it uses Gray coding to represent pixel amplitudes instead of weighted binary coding.

    Algorithms

    JPEG (Joint Photographic (image) Experts Group coding): Defined in CCITT Recommendation T.81.

    For lossless coding (Lossless JPEG), it uses the differential coding technique. The predictive error is encoded using either Huffman coding or adaptive arithmetic coding techniques.

    Performance Comparison

    A set of color test images from the JPEG standards committee was used for performance comparison.

    The luminance component (Y) is of resolution 720 × 576 pixels, while the chrominance components (U and V) are of 360 × 576 pixels.

    The compression ratios calculated are the combined results for all three components. The following observations have been reported.

    Performance Comparison

    When quantized in 8 bits/pixel, the compression ratios vary much less for multilevel images than for bilevel images, and are roughly equal to 2.

    When quantized with 5 bits/pixel down to 2 bits/pixel, compared with the lossless JPEG, the JBIG achieves an increasingly higher compression ratio, up to a maximum of 29%.

    When quantized with 6 bits/pixel, JBIG and lossless JPEG achieve similar compression ratios.

    When quantized with 7 bits/pixel to 8 bits/pixel, the lossless JPEG achieves a 2.4% to 2.6% higher compression ratio than JBIG.