
Chinese Character Output


Page 1: Chinese Character Output

Lecture 9 1

Chinese Character Output

• Character 字符: an abstract object that humans recognize in communication; it is the representation at the conceptual level. Control characters in a computer's internal code are not considered characters.

• Glyph 字形: a character in its concrete form, without regard to thickness, style, size, or the computer's internal representation (bitmap, outline, etc.).

• Font (font set) 字體 / 字型庫: a specific form of the characters, with all the attributes of the computer's internal representation.

Page 2: Chinese Character Output

Lecture 9 2

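• The three levels of representation

[Diagram. Labels recoverable from the slide: Character 字符 code in the document description (internal representation 內部表示); Glyph 字形, identified by a GID (glyph ID); Font 字型 and Image 圖像 (external representation 外部表示). Arrow labels: Association, Rendering, Human perception.]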

Page 3: Chinese Character Output

Lecture 9 3

Page 4: Chinese Character Output

Lecture 9 4

Page 5: Chinese Character Output

Lecture 9 5

Glyph Representation: Bitmaps

• A matrix of 1s and 0s represents a character

• A typical monitor displays a character using a 16 x 16 bitmap

• Typical sizes and storage demands are shown below

• Note: doubling the size quadruples the storage

• Data compression is possible (a lot of empty space)

Total chars: 87 x 94 = 8,178

Type       Size         Storage (est.)
Simple     16 x 16      262k
Common     24 x 24      589k
Common     32 x 32      1M
Detailed   64 x 64      4M
Detailed   96 x 96      9M
Detailed   128 x 128    16M
Detailed   256 x 256    64M
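The storage estimates can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one bit per pixel, no compression, and square glyphs; the function name `font_storage_bytes` is made up for illustration.

```python
TOTAL_CHARS = 87 * 94  # 8,178 characters, as stated on the slide


def font_storage_bytes(side, total_chars=TOTAL_CHARS):
    """Uncompressed size of a bitmap font with side x side glyphs."""
    bits_per_glyph = side * side               # one bit per pixel (assumption)
    bytes_per_glyph = (bits_per_glyph + 7) // 8  # round up to whole bytes
    return bytes_per_glyph * total_chars


if __name__ == "__main__":
    for side in (16, 24, 32, 64, 96, 128, 256):
        print(f"{side} x {side}: {font_storage_bytes(side):,} bytes")
```

Doubling the side from 16 to 32 multiplies the result by four, which is the "double size, quadruple storage" point on the slide.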

Page 6: Chinese Character Output

Lecture 9 6

• Usually, small bitmaps are stored and scaled up, but the quality of slanted edges suffers

• Linear scaling: from $Old(x_{old}, y_{old})$ to $New(x_{new}, y_{new})$,

where $0 \le x_{old} \le Width_{OLD} - 1$, $0 \le y_{old} \le Height_{OLD} - 1$

and $0 \le x_{new} \le Width_{NEW} - 1$, $0 \le y_{new} \le Height_{NEW} - 1$,

assuming the Height and Width values are integers

• $r_x = Width_{NEW} / Width_{OLD}$, $r_y = Height_{NEW} / Height_{OLD}$

• If $r_x > 1$ and $r_y > 1$, it is called scaling up

• $New(x_{new}, y_{new}) = New(x \cdot r_x, y \cdot r_y) = Old(x, y)$
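A minimal sketch of linear scaling on a bitmap stored as a list of rows. One assumption differs from the formula as written: instead of pushing each old pixel forward to (x·rx, y·ry), which leaves gaps when scaling up, the sketch walks over every new pixel and looks up the old pixel at floor(x_new / rx), floor(y_new / ry). The function name `scale_bitmap` is hypothetical.

```python
def scale_bitmap(old, width_new, height_new):
    """Scale a 0/1 bitmap (list of rows) to width_new x height_new."""
    height_old = len(old)
    width_old = len(old[0])
    new = []
    for y_new in range(height_new):
        # Integer form of floor(y_new / ry), with ry = height_new / height_old
        y_old = y_new * height_old // height_new
        row = []
        for x_new in range(width_new):
            # Integer form of floor(x_new / rx), with rx = width_new / width_old
            x_old = x_new * width_old // width_new
            row.append(old[y_old][x_old])
        new.append(row)
    return new


if __name__ == "__main__":
    diagonal = [[1, 0],
                [0, 1]]
    for row in scale_bitmap(diagonal, 4, 4):
        print("".join("#" if bit else "." for bit in row))
```

Scaling the 2 x 2 diagonal up to 4 x 4 turns each pixel into a 2 x 2 block, so the slanted edge becomes a staircase. That is the quality problem the smoothing techniques try to address.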

Page 7: Chinese Character Output

Lecture 9 7

Smoothing techniques for scaling

• Ad Hoc Techniques (No underlying model but cheap):

– Enlargement (Matrix manipulation)

• Thresholding: convert into bitmap (assign 1 if >= 0.4 for unidirectional)
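A minimal sketch of the thresholding step. It assumes the enlargement step has produced fractional values between 0 and 1 for each pixel (the slide does not show how they are computed) and uses the 0.4 cutoff quoted above. The name `threshold` is made up for illustration.

```python
def threshold(gray, cutoff=0.4):
    """Convert a grid of fractional values into a 0/1 bitmap."""
    return [[1 if value >= cutoff else 0 for value in row] for row in gray]


if __name__ == "__main__":
    enlarged = [[0.0, 0.39, 0.4],
                [1.0, 0.5, 0.1]]
    print(threshold(enlarged))
```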

Page 8: Chinese Character Output

Lecture 9 8

• Smoothing spline (齒形) and interpolation 嵌入法 (costly)

– Basis: character bitmaps are a coarse sample of the original character

– Approach: recover the curves of the character as continuous functions (cubic splines), then interpolate to generate the bitmaps at another size

– Optimization: minimize the roughness (lack of smoothness) of the recovered curves

Page 9: Chinese Character Output

Lecture 9 9

Bezier Curves

• $P(t) = (X(t), Y(t))$: any point on the curve ($0 \le t \le 1$)

• Cubic Bezier: 4 points

– end points coincide with the curve

– the other points control the shape (they can specify the gradient at the end points)

• $X(t) = X_0 (1-t)^3 + 3 X_1 (1-t)^2 t + 3 X_2 (1-t) t^2 + X_3 t^3$

• $Y(t) = Y_0 (1-t)^3 + 3 Y_1 (1-t)^2 t + 3 Y_2 (1-t) t^2 + Y_3 t^3$
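A minimal sketch that evaluates the cubic Bezier formulas directly. The function name `cubic_bezier` and the sample control points are made up for illustration.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Return the point (X(t), Y(t)) on a cubic Bezier curve, 0 <= t <= 1."""
    u = 1.0 - t
    # Weights of the four control points
    w0 = u ** 3
    w1 = 3 * u ** 2 * t
    w2 = 3 * u * t ** 2
    w3 = t ** 3
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)


if __name__ == "__main__":
    control = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
    for step in range(5):
        t = step / 4
        print(t, cubic_bezier(*control, t))
```

At t = 0 and t = 1 only the first and last weights are non-zero, which is why the end points lie on the curve while the two middle points only pull it towards them.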

Page 10: Chinese Character Output

Lecture 9 10

Glyph Representation: Outline

• Characters are treated as shapes enclosed by lines or curves, which are specified by parameters (i.e. the data is an ASCII file, and an interpreter generates the graphic image)

• Line: specified by 2 points

• Curve (usually a cubic Bezier): specified by 4 points

– end points coincide with the curve

– the other points control the shape

Page 11: Chinese Character Output

Lecture 9 11

• Advantages compared to bitmaps:

– Scaling does not affect quality (major)

– No need to store fonts at different sizes (in effect, a compression of extremely detailed/large fonts)

– Compression (as in standard text)

– Email transport without encoding and decoding

• Example of a PostScript description of the Chinese character 一:

Page 12: Chinese Character Output

Lecture 9 12

• Unit of measurement: 1 point = 1/72 of an inch. Coordinates start at the bottom-left corner, so coordinate translation is needed.

• A PostScript Type 1 font (base font) can handle only up to 256 characters in each set.

• It maps the 256 codes to the names of the glyphs in the set.

• PostScript Type 0 fonts: composite fonts

– Double-byte encoding:

– 1st byte: index to the base font

– 2nd byte: code within that particular base font
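The double-byte scheme of a composite font can be sketched as a two-level lookup. This is an illustration only, not PostScript: the dictionary `composite_font`, its contents, and the function names are hypothetical.

```python
def split_double_byte(data):
    """Split a byte string into (base font index, code in base font) pairs."""
    if len(data) % 2 != 0:
        raise ValueError("double-byte text must have an even number of bytes")
    return [(data[i], data[i + 1]) for i in range(0, len(data), 2)]


def lookup_glyphs(composite_font, data):
    """Resolve each byte pair through the composite font's base fonts."""
    return [composite_font[font_index][code]
            for font_index, code in split_double_byte(data)]


if __name__ == "__main__":
    # Hypothetical composite font: base font 0xA4 holds two glyph names.
    composite_font = {0xA4: {0x40: "glyph_a", 0x41: "glyph_b"}}
    print(lookup_glyphs(composite_font, b"\xa4\x40\xa4\x41"))
```

Each base font still sees only a one-byte code, so the 256-character limit per base font holds while the composite font as a whole covers far more characters.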

Page 13: Chinese Character Output

Lecture 9 13

• CID-keyed fonts (pp 288)

A technique that makes character glyph definitions independent of the codeset.

– Each character glyph is given a CID, which uniquely identifies a glyph shape.

– A CMap is a file that maps character encodings to glyphs (CIDs).

– A CIDFont file contains the pointers to the actual descriptions of the glyphs. A CIDFont file usually keeps character glyphs of the same style.

• Other outline fonts include TrueType and OpenType fonts. They differ in their data structures and header formats.
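The CID-keyed idea can be sketched as two separate tables: one CMap per encoding and one CIDFont per style. This is a toy model with made-up code values, CID numbers, and names, not the actual CMap or CIDFont file formats.

```python
# Hypothetical CMaps: two different encodings that reach the same CID.
CMAP_ENCODING_A = {0x2121: 1, 0x2122: 2}
CMAP_ENCODING_B = {0x8140: 1, 0x8141: 2}

# Hypothetical CIDFonts: one per style, keyed by CID only.
CIDFONT_REGULAR = {1: "regular outline for CID 1", 2: "regular outline for CID 2"}
CIDFONT_BOLD = {1: "bold outline for CID 1", 2: "bold outline for CID 2"}


def glyph_for(code, cmap, cidfont):
    """Character code -> CID (through the CMap) -> glyph description."""
    cid = cmap[code]
    return cidfont[cid]


if __name__ == "__main__":
    print(glyph_for(0x2121, CMAP_ENCODING_A, CIDFONT_REGULAR))
    print(glyph_for(0x8140, CMAP_ENCODING_B, CIDFONT_REGULAR))
    print(glyph_for(0x2121, CMAP_ENCODING_A, CIDFONT_BOLD))
```

Supporting another encoding only needs a new CMap, and adding another style only needs a new CIDFont; neither table has to know about the other.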

Page 14: Chinese Character Output

Lecture 9 14

Bitmap-to-Outline Conversion

• Determine the outline for all the straight lines

• Generate the curve list: a curve must begin and end at two different corners (so corners must be found first, by computing the angle between two vectors along the outline)

• Preprocessing for curve fitting: knee removal and smooth filtering, to yield finer coordinates for the sample points

• Perform curve fitting: iterations try to improve the goodness of fit (measured as the least-squares error)

• End point alignment: close end points of two consecutive splines are merged by averaging their positions
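The corner-finding step can be sketched as follows. This is only one possible reading of "compute an angle between two vectors along the outline": at each outline point, take the vectors to the points `step` positions before and after it, and call the point a corner when the angle between them is small enough. The names, the `step` of 2, and the 120 degree limit are assumptions, and the outline is assumed to be a closed list of distinct points.

```python
import math


def angle_at(prev_pt, pt, next_pt):
    """Angle in degrees between the vectors pt->prev_pt and pt->next_pt."""
    ax, ay = prev_pt[0] - pt[0], prev_pt[1] - pt[1]
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]
    dot = ax * bx + ay * by
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


def find_corners(outline, step=2, max_angle=120.0):
    """Return the outline points where the outline bends sharply."""
    n = len(outline)
    corners = []
    for i, pt in enumerate(outline):
        before = outline[(i - step) % n]  # closed outline: wrap around
        after = outline[(i + step) % n]
        if angle_at(before, pt, after) <= max_angle:
            corners.append(pt)
    return corners


if __name__ == "__main__":
    # Outline of a 4 x 4 square, traversed point by point.
    square = ([(x, 0) for x in range(4)] + [(4, y) for y in range(4)]
              + [(x, 4) for x in range(4, 0, -1)] + [(0, y) for y in range(4, 0, -1)])
    print(find_corners(square))
```

Points on a straight run give an angle near 180 degrees and are skipped, so the curve list can then be built from one detected corner to the next.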

Page 15: Chinese Character Output

Lecture 9 15

Page 16: Chinese Character Output

Lecture 9 16

Getting outline pixels through erosion

• Finding the outline of a bitmap means finding the pixels that are located inside an object but have at least one neighbour outside the object

• Basic idea

– Find the bitmap with its edge pixels removed: erosion (a smaller cross)

– Take the original bitmap with the eroded bitmap removed

Page 17: Chinese Character Output

Lecture 9 17

• This needs a few more mathematical terms and binary image operations

• Translation: the displacement in the x direction, the y direction, or both at once. It is the repositioning of the coordinate system.

• Suppose B is a binary image.

• $B_{xy}$ means B moved by the coordinates (x, y).

[Figure: the image at the origin (0,0) and the translated image at (x,y)]
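A minimal sketch of translation, assuming a binary image is stored as the set of coordinates of its black pixels (a representation chosen here for brevity, not taken from the slides). The name `translate` is made up.

```python
def translate(image, x, y):
    """Return the image moved by (x, y); image is a set of (px, py) pixels."""
    return {(px + x, py + y) for (px, py) in image}


if __name__ == "__main__":
    b = {(0, 0), (1, 0), (0, 1)}
    print(sorted(translate(b, 2, 3)))
```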

Page 18: Chinese Character Output

Lecture 9 18

• Erosion of B (a bitmap): the set of coordinates (x, y) such that S, translated by (x, y), is contained in B.

• $E = B \ominus S = \{(x, y) \mid S_{xy} \subseteq B\}$

• S (4 black pixels): [figure not recovered]

• Against [figures not recovered] and their rotations

• Returns the points in B that are not border (edge) pixels, i.e. the points whose neighbours under S all lie in B.
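A minimal sketch of erosion that follows the set definition directly. Images are stored as sets of black-pixel coordinates, and the structuring element is a five-pixel cross centred on the origin; both choices are assumptions for illustration, since the slide's own S is not recoverable here.

```python
# Assumed structuring element: a cross centred on the origin.
CROSS = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}


def translate(image, x, y):
    """Return the image moved by (x, y)."""
    return {(px + x, py + y) for (px, py) in image}


def erode(b, s):
    """All (x, y) such that s translated by (x, y) is contained in b."""
    if not s:
        raise ValueError("structuring element must not be empty")
    # Any valid (x, y) must place one chosen pixel of s onto a pixel of b,
    # so only those offsets need to be tested.
    sx0, sy0 = next(iter(s))
    candidates = {(bx - sx0, by - sy0) for (bx, by) in b}
    return {(x, y) for (x, y) in candidates if translate(s, x, y) <= b}


if __name__ == "__main__":
    block = {(x, y) for x in range(3) for y in range(3)}
    print(erode(block, CROSS))
```

For a filled 3 x 3 block only the centre survives, because every other pixel has at least one cross neighbour outside the block.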

Page 19: Chinese Character Output

Lecture 9 19

• Outline pixels:

• $B - (B \ominus S)$
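Putting the pieces together, the outline is the original bitmap minus its erosion. This is a minimal sketch under the same assumptions as before: images are sets of black-pixel coordinates, and the structuring element is a five-pixel cross chosen for illustration. All names are hypothetical.

```python
# Assumed structuring element: a cross centred on the origin.
CROSS = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}


def erode(b, s):
    """All (x, y) such that s translated by (x, y) is contained in b."""
    sx0, sy0 = next(iter(s))
    candidates = {(bx - sx0, by - sy0) for (bx, by) in b}
    return {(x, y) for (x, y) in candidates
            if all((sx + x, sy + y) in b for (sx, sy) in s)}


def outline(b, s=CROSS):
    """Outline pixels: the bitmap with its eroded version removed."""
    return b - erode(b, s)


if __name__ == "__main__":
    block = {(x, y) for x in range(4) for y in range(4)}
    edge = outline(block)
    for y in range(4):
        print("".join("#" if (x, y) in edge else "." for x in range(4)))
```

On a filled 4 x 4 block this keeps the 12 pixels around the rim and drops the 2 x 2 interior.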