Chapter 3: Lexical Analysis

Phases of a compiler, from analysis to synthesis: lexical analyser, syntax analyser, semantic analyser, intermediate code generator, code optimizer.

  • lexical analyser

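  • Groups the input characters into tokens and strips white space (blanks, tabs, newlines) and comments.

  • Tokens are recognized by pattern matching, e.g. on input that begins if ( alpha33
  • The lexical analyser is usually implemented as a subroutine or a coroutine of the parser;

    the two stand in a producer-consumer relationship.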

  • The lexical analyser (scanner) is sometimes divided into two phases: scanning and lexical analysis.

  • Why separate lexical analysis from parsing? Reasons include a simpler design and better compiler portability.

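  • Pattern: a rule that describes the set of strings belonging to one token.

    Lexeme: a sequence of source characters matched by a pattern.

    Token: a terminal symbol of the grammar.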

  • Typical tokens: keywords, identifiers, constants, literal strings, punctuation symbols.

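  • Fortran: in Fortran (as in Algol 68) blanks are not significant, so the analyser may have to look ahead before it can decide where a token ends.

  • PL/I: in PL/I keywords are not reserved words.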

  • Attributes

    Extra information attached to a token, e.g. for an identifier, a number or a relation; the attribute is derived from the lexeme.

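  • Character: a symbol of the alphabet. Alphabet ($\Sigma$): a finite set of characters, e.g. $\{0,1\}$, $\{A, \dots, Z\}$, ASCII, Unicode. String: a finite sequence of characters over the alphabet. Empty string ($\varepsilon$): the string of length 0.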

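  • For a string s: a prefix is obtained by removing 0 or more trailing characters; a suffix by removing 0 or more leading characters; a substring by removing a prefix and a suffix. A prefix, suffix or substring is proper if it is non-empty and differs from s. A subsequence is obtained by removing 0 or more characters, not necessarily contiguous.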
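
The four string relations above can be checked mechanically. The following is a minimal sketch in C; the function names are hypothetical and chosen only for this illustration.

```c
#include <stdbool.h>
#include <string.h>

/* True if p is a prefix of s (p may be empty or equal to s). */
bool is_prefix(const char *p, const char *s) {
    return strncmp(p, s, strlen(p)) == 0;
}

/* True if t is a suffix of s. */
bool is_suffix(const char *t, const char *s) {
    size_t lt = strlen(t), ls = strlen(s);
    return lt <= ls && strcmp(s + (ls - lt), t) == 0;
}

/* True if t is a substring of s (s with a prefix and a suffix removed). */
bool is_substring(const char *t, const char *s) {
    return strstr(s, t) != NULL;
}

/* True if t is a subsequence of s: t is obtained from s by deleting
   zero or more characters that need not be contiguous. */
bool is_subsequence(const char *t, const char *s) {
    for (; *s != '\0' && *t != '\0'; s++)
        if (*s == *t)
            t++;
    return *t == '\0';
}
```

For s = banana, the string ban is a prefix, nana is a suffix, nan is a substring, and baa is a subsequence but not a substring.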

  • Language: any set of strings over a fixed alphabet. Examples: $\{\varepsilon\}$, $\Sigma^*$, the set of all valid C programs.

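  • Concatenation: the concatenation of x and y is xy.

    Example: x = sub, y = string, xy = substring. The empty string is the identity for concatenation: $\varepsilon s = s \varepsilon = s$.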

  • Exponentiation: writing $ss = s^2$ generalises to $s^0 = \varepsilon$ and $s^i = s\,s^{i-1} = s^{i-1}s$ for $i > 0$. Example: $s^0 = \varepsilon$, $s^1 = s s^0 = s$, $s^2 = s s^1 = ss$, $s^3 = s s^2 = sss$.
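
The recursive definition of $s^i$ translates directly into a loop. This is a small sketch in C; `str_power` is a hypothetical helper name, and the caller is assumed to free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated string holding s^i (s repeated i times),
   or NULL if allocation fails. The caller frees the result. */
char *str_power(const char *s, unsigned i) {
    size_t len = strlen(s);
    char *r = malloc(len * i + 1);
    if (r == NULL)
        return NULL;
    for (unsigned k = 0; k < i; k++)
        memcpy(r + k * len, s, len);   /* extend s^k to s^(k+1) = s^k s */
    r[len * i] = '\0';                 /* for i == 0 this yields the empty string */
    return r;
}
```

With s = ab, `str_power(s, 0)` is the empty string and `str_power(s, 3)` is ababab.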

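  • Example: $\Sigma = \{A, \dots, Z, a, \dots, z, 0, \dots, 9\}$, with $L = \{A, \dots, Z, a, \dots, z\}$ and $D = \{0, \dots, 9\}$ viewed as languages whose strings have length 1.

    $L \cup D$: letters and digits. $LD$: a letter followed by a digit. $L^4$: strings of 4 letters. $L^*$: all strings of letters, including $\varepsilon$. $L(L \cup D)^*$: strings of letters and digits that begin with a letter. $D^+$: strings of one or more digits.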

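  • Regular expression: a notation that describes a language. A language described by a regular expression is a regular set. $L(r)$ denotes the language described by r. If $L(s) = L(r)$, then r and s are equivalent, written s = r.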

  • Precedence, from highest to lowest: *, concatenation, |

    All three operators are left-associative.

    a | b * c = ( a ) | ( ( b ) * ( c ) )

  • Examples:
    a | b denotes { a, b }
    (a | b) (a | b) = aa | ab | ba | bb denotes { aa, ab, ba, bb }
    a* denotes { ε, a, aa, aaa, ... }
    (a | b)* = (a*b*)* denotes all strings of 0 or more a's and b's
    a | a*b denotes { a, b, ab, aab, aaab, ... }

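  • Regular definition: a name d given to a regular expression r, written d → r. A regular definition is a sequence
    d1 → r1
    d2 → r2
    . . .
    dn → rn
    in which each rk may use the alphabet symbols and the names di with i < k; inside dk, each such di stands for ( ri ).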

  • Example (identifiers):
    letter → A | B | ... | Z | a | b | ... | z
    digit → 0 | 1 | ... | 9
    id → letter ( letter | digit )*
    Example (unsigned numbers):
    digit → 0 | 1 | ... | 9
    digits → digit digit*
    opt_fraction → . digits | ε
    opt_exponent → ( E ( + | - | ε ) digits ) | ε
    number → digits opt_fraction opt_exponent

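  • Shorthands:
    ( r )+ : one or more instances of r; L( r+ ) = ( L(r) )+, r+ = r r*, r* = r+ | ε
    ( r )? : zero or one instance of r; L( (r)? ) = L(r) ∪ { ε }
    + and ? have the same precedence as *
    [ abc ] (character class): L( [ abc ] ) = L( a | b | c )
    [ a-z ] (character range): L( [ a-z ] ) = L( [ abc ... z ] )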

  • digit → [ 0-9 ]
    digits → digit+
    opt_fraction → ( . digits )?
    opt_exponent → ( E ( + | - )? digits )?
    number → digits opt_fraction opt_exponent
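
To see how the definition of number reads as code, here is a hand-written matcher in C, one step per sub-definition. It is only a sketch: `match_number` and `digits` are hypothetical names, and returning the length of the longest matching prefix is a choice made for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the run of decimal digits at the start of s (the digits pattern). */
static size_t digits(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Length of the longest prefix of s that matches number, or 0 if none. */
size_t match_number(const char *s) {
    size_t n = digits(s);                  /* digits */
    if (n == 0)
        return 0;
    if (s[n] == '.') {                     /* opt_fraction: ( . digits )? */
        size_t f = digits(s + n + 1);
        if (f > 0)
            n += 1 + f;                    /* otherwise the '.' is not consumed */
    }
    if (s[n] == 'E') {                     /* opt_exponent: ( E ( + | - )? digits )? */
        size_t k = n + 1;
        if (s[k] == '+' || s[k] == '-')
            k++;
        size_t e = digits(s + k);
        if (e > 0)
            n = k + e;                     /* otherwise the 'E' is not consumed */
    }
    return n;
}
```

An optional part is accepted only when it matches completely; on the input `1. < x` only the `1` is taken, because no digit follows the dot.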

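  • Non-regular sets: some languages cannot be described by any regular expression, e.g. $L = \{ a^i b a^i \mid i \ge 0 \}$,

    $L = \{ wcw \mid w \in (a \mid b)^* \}$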

  • stmt → if expr then stmt | if expr then stmt else stmt | ε
    expr → term relop term | term
    term → id | num

  • if → if
    then → then
    else → else
    relop → < | = | >
    id → letter ( letter | digit )*
    num → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?
    letter → [ A-Za-z_ ]
    digit → [ 0-9 ]
    ws → delim+
    delim → blank | tab | newline

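  • Transition diagram: a directed graph whose nodes are the states. One state is the start state and some states are accepting states. Edges are labelled with input characters; an edge labelled other is taken on any character that labels no other edge leaving the same state.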
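
A transition diagram maps naturally onto a switch over the current state. The sketch below, in C, recognizes relational operators. It assumes the six operators suggested by the constant names LT, LE, EQ, NE, GT, GE used in the Lex example (<, <=, =, <>, >, >=); the function name `scan_relop` and the NONE value are hypothetical.

```c
enum relop { NONE, LT, LE, EQ, NE, GT, GE };

/* Scan a relational operator starting at *p.
   On success, advance *p past the lexeme and return its attribute.
   On failure, leave *p unchanged and return NONE. */
enum relop scan_relop(const char **p) {
    const char *s = *p;
    int state = 0;                          /* start state */
    for (;;) {
        char c = *s;
        switch (state) {
        case 0:
            if (c == '<')      { state = 1; s++; }
            else if (c == '=') { *p = s + 1; return EQ; }
            else if (c == '>') { state = 2; s++; }
            else               return NONE;
            break;
        case 1:                             /* seen '<' */
            if (c == '=') { *p = s + 1; return LE; }
            if (c == '>') { *p = s + 1; return NE; }
            *p = s;                         /* edge "other": c is not consumed */
            return LT;
        case 2:                             /* seen '>' */
            if (c == '=') { *p = s + 1; return GE; }
            *p = s;                         /* edge "other": c is not consumed */
            return GT;
        }
    }
}
```

The two `other` edges return without advancing past the current character, which is how the diagram hands the lookahead character back to the input.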

  • Matching is greedy: the longest possible lexeme is taken.

  • Problems arise when a diagram fails part-way through a lexeme, e.g. 1. < x versus 1.0 < x (see ASU Section 3.4, pp. 104-106).

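  • Lex: building a scanner with Lex

    scanner.l (Lex source program) → Lex compiler (lex) → scanner.c (C program)

    scanner.c → C compiler (cc) → scanner

    input text → scanner → stream of tokens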

  • Structure of a Lex program

    declarations

    %%

    translation rules

    %%

    auxiliary procedures

  • Lex example, part 1 (declarations)

    %{
    /* Definitions of the constants
     * LT, LE, EQ, NE, GT, GE,
     * IF, THEN, ELSE, ID, NUMBER, RELOP */
    %}

    /* Regular definitions */
    delim    [ \t\n]
    ws       {delim}+
    letter   [A-Za-z]
    digit    [0-9]
    id       {letter}({letter}|{digit})*
    number   {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?

    %%

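  • Lex example, part 2 (translation rules)

    %%
    {ws}      { /* no action and no return */ }

    if        { return(IF); }
    then      { return(THEN); }
    else      { return(ELSE); }

    {id}      { yylval = install_id(); return(ID); }
    {number}  { yylval = install_num(); return(NUMBER); }

    "<"       { yylval = LT; return(RELOP); }
    "<="      { yylval = LE; return(RELOP); }
    "="       { yylval = EQ; return(RELOP); }
    "<>"      { yylval = NE; return(RELOP); }
    ">"       { yylval = GT; return(RELOP); }
    ">="      { yylval = GE; return(RELOP); }
    %%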

  • Lex example, part 3 (auxiliary procedures)

    %%

    int install_id() {
        /* Retrieve the lexeme: */
        /* yytext points to the input string, */
        /* yyleng is the length of the lexeme. */
        /* Check if id is in the Symbol Table (ST). */
        /* If not, insert the new id into the ST. */
        /* Return a pointer to the symbol's entry. */
    }

    int install_num() {
        /* Handle numeric values in a similar manner. */
    }
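
The install_id skeleton above leaves its body as comments. One possible way to fill it in is sketched below in plain C. Everything beyond the comments in the skeleton is an assumption made for illustration: the lexeme is passed as parameters instead of being read from yytext and yyleng, the symbol table is a fixed-size array searched linearly, and the returned "pointer" is simply the index of the entry.

```c
#include <string.h>

#define MAX_SYMS 256                /* assumed table capacity */
#define MAX_LEN   32                /* assumed maximum lexeme length */

static char symtab[MAX_SYMS][MAX_LEN + 1];
static int  nsyms = 0;

/* Return the index of the lexeme text[0..leng-1] in the symbol table,
   inserting it first if it is not there yet.
   Return -1 if the lexeme is too long or the table is full. */
int install_id(const char *text, int leng) {
    if (leng <= 0 || leng > MAX_LEN)
        return -1;
    for (int i = 0; i < nsyms; i++)             /* is the id already in the ST? */
        if (strlen(symtab[i]) == (size_t)leng &&
            memcmp(symtab[i], text, (size_t)leng) == 0)
            return i;
    if (nsyms == MAX_SYMS)
        return -1;
    memcpy(symtab[nsyms], text, (size_t)leng);  /* insert the new id */
    symtab[nsyms][leng] = '\0';
    return nsyms++;
}
```

Inside a Lex action the call would be `install_id(yytext, yyleng)`. A production table would more likely use hashing; the linear search keeps the sketch short.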

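  • Language recognition: a recognizer for a language is a program that takes a string x as input and answers "yes" if x belongs to the language and "no" otherwise. When the answer is "yes" we say that the recognizer accepts x.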

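  • Finite Automaton: $FA = \langle S, \Sigma, \delta, s_0, F \rangle$, where $S$ is the set of states, $\Sigma$ the input alphabet, $\delta$ the transition function, $s_0$ the start (initial) state and $F$ the set of accepting (final) states.

    In a nondeterministic finite automaton (NFA), $\delta : S \times (\Sigma \cup \{\varepsilon\}) \to 2^S$, $s_0 \in S$ and $F \subseteq S$. A deterministic finite automaton (DFA) is the special case in which $\forall s \in S, a \in \Sigma : |\delta(s, a)| \le 1$ and $\forall s \in S : \delta(s, \varepsilon) = \emptyset$.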

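  • Transition graph: each state is a node and the start state is marked start; there is an edge from $s_1$ to $s_2$ labelled $a \in \Sigma \cup \{\varepsilon\}$ whenever $s_2 \in \delta(s_1, a)$. Example: an NFA for (a* | b*) c with states 0 to 3.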

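  • DFA versus NFA: every DFA is an NFA, and every NFA can be converted into a DFA that accepts the same language by the subset construction (ASU Section 3.6, Algorithm 3.2).

    Each state of the DFA stands for a set of states of the NFA.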
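
The subset construction can be sketched compactly in C when sets of NFA states are stored as bit masks. The NFA below is a hypothetical four-state automaton for (a* | b*) c, written for this illustration and not necessarily the one drawn on the slide; all names are likewise assumptions.

```c
enum { NSTATES = 4, NSYMS = 3, MAXD = 1 << NSTATES };
typedef unsigned set_t;            /* bit i set <=> NFA state i is in the set */

/* Assumed NFA for (a*|b*)c over the symbols a, b, c (columns 0, 1, 2):
   0 -eps-> 1, 0 -eps-> 2, 1 -a-> 1, 1 -c-> 3, 2 -b-> 2, 2 -c-> 3. */
static const set_t EPS[NSTATES] = { (1u << 1) | (1u << 2), 0, 0, 0 };
static const set_t NFA_DELTA[NSTATES][NSYMS] = {
    { 0,       0,       0       },
    { 1u << 1, 0,       1u << 3 },
    { 0,       1u << 2, 1u << 3 },
    { 0,       0,       0       },
};

set_t dstates[MAXD];               /* dstates[i] = NFA state set of DFA state i */
int   dtran[MAXD][NSYMS];          /* DFA transition table, -1 = undefined */

/* All states reachable from S by zero or more eps edges. */
static set_t eps_closure(set_t S) {
    set_t prev;
    do {
        prev = S;
        for (int s = 0; s < NSTATES; s++)
            if (S & (1u << s))
                S |= EPS[s];
    } while (S != prev);
    return S;
}

/* Union of the transitions on symbol k from every state in S. */
static set_t moves(set_t S, int k) {
    set_t R = 0;
    for (int s = 0; s < NSTATES; s++)
        if (S & (1u << s))
            R |= NFA_DELTA[s][k];
    return R;
}

/* Build the DFA and return its number of states; DFA state 0 is the start. */
int subset_construction(void) {
    int n = 0;
    dstates[n++] = eps_closure(1u << 0);
    for (int i = 0; i < n; i++) {              /* i is the next unprocessed DFA state */
        for (int k = 0; k < NSYMS; k++) {
            set_t U = eps_closure(moves(dstates[i], k));
            if (U == 0) { dtran[i][k] = -1; continue; }
            int j = 0;
            while (j < n && dstates[j] != U)   /* has this set been seen before? */
                j++;
            if (j == n)
                dstates[n++] = U;              /* new DFA state */
            dtran[i][k] = j;
        }
    }
    return n;
}
```

For this NFA the construction yields four DFA states, for the NFA state sets {0,1,2}, {1}, {2} and {3}. A DFA state is accepting when its set contains NFA state 3.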

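  • Transition table: a table T with $|S|$ rows and $|\Sigma| + 1$ columns ($|\Sigma|$ columns for a DFA, which has no $\varepsilon$ column), where $T[s, a] = \delta(s, a)$. Lookup is fast, but the table can waste space when most of its entries are empty.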
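
A transition table can be written down directly as a two-dimensional array. The sketch below, in C, encodes a DFA for identifiers of the form letter ( letter | digit )*. To keep the table small it indexes columns by character class instead of by individual character; that compression, the treatment of letters as alphabetic characters only, and all names are assumptions of this illustration.

```c
#include <ctype.h>
#include <stdbool.h>

enum { LETTER, DIGIT, OTHER, NCLASSES };
enum { UNDEF = -1 };

/* T[s][class] = delta(s, class). State 0 is the start state; state 1 accepts. */
static const int T[2][NCLASSES] = {
    /* state 0 */ { 1, UNDEF, UNDEF },
    /* state 1 */ { 1, 1,     UNDEF },
};
static const bool accepting[2] = { false, true };

/* Map an input character to its column in T. */
static int char_class(char c) {
    if (isalpha((unsigned char)c)) return LETTER;
    if (isdigit((unsigned char)c)) return DIGIT;
    return OTHER;
}

/* True if the DFA accepts the whole string x. */
bool dfa_accepts(const char *x) {
    int s = 0;
    for (; *x != '\0' && s != UNDEF; x++)
        s = T[s][char_class(*x)];          /* one table lookup per character */
    return s != UNDEF && accepting[s];
}
```

Each input character costs one array access, which is the fast lookup the table buys at the price of storing the undefined entries.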

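  • Simulating an automaton on the input

    DFA:
    s = s0;
    while ( defined(s) && (c = getchar()) != EOF )
        s = move(s, c);
    return s ∈ F;

    NFA:
    S = ε_closure( {s0} );
    while ( ! empty(S) && (c = getchar()) != EOF )
        S = ε_closure( moves(S, c) );
    return S ∩ F ≠ ∅;

    move(s, c) is undef when no transition is defined. moves(S, c) is the set of states reachable from some state of S on c. ε_closure(S) is the set of states reachable from S by ε edges alone. Time: $O(n)$ for a DFA and $O(|S| \cdot n)$ for an NFA, where n is the length of the input.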
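
The NFA loop above can be made concrete with bit masks for the state sets. The following C sketch simulates a hypothetical four-state NFA for (a* | b*) c. The automaton, the string-based input (in place of getchar) and all names are assumptions made for this illustration.

```c
#include <stdbool.h>

enum { NSTATES = 4 };
typedef unsigned set_t;            /* bit i set <=> state i is in the set */

/* Assumed NFA for (a*|b*)c:
   0 -eps-> 1, 0 -eps-> 2, 1 -a-> 1, 1 -c-> 3, 2 -b-> 2, 2 -c-> 3.
   State 0 is the start state and state 3 is the only accepting state. */
static const set_t EPS[NSTATES] = { (1u << 1) | (1u << 2), 0, 0, 0 };
static const set_t F = 1u << 3;

/* delta(s, c) for a single state s. */
static set_t delta(int s, char c) {
    if (s == 1 && c == 'a') return 1u << 1;
    if (s == 2 && c == 'b') return 1u << 2;
    if ((s == 1 || s == 2) && c == 'c') return 1u << 3;
    return 0;
}

/* All states reachable from S by zero or more eps edges. */
static set_t eps_closure(set_t S) {
    set_t prev;
    do {
        prev = S;
        for (int s = 0; s < NSTATES; s++)
            if (S & (1u << s))
                S |= EPS[s];
    } while (S != prev);
    return S;
}

/* Union of delta(s, c) over all states s in S. */
static set_t moves(set_t S, char c) {
    set_t R = 0;
    for (int s = 0; s < NSTATES; s++)
        if (S & (1u << s))
            R |= delta(s, c);
    return R;
}

/* True if the NFA accepts the whole string x. */
bool nfa_accepts(const char *x) {
    set_t S = eps_closure(1u << 0);
    for (; *x != '\0' && S != 0; x++)
        S = eps_closure(moves(S, *x));
    return (S & F) != 0;
}
```

Each input character costs a pass over the states in the current set, which is where the $|S|$ factor in the NFA running time comes from.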

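  • From a regular expression to an NFA: the construction follows the structure of r, with one case for each of ε, a, r r, r | r, r* and ( r ).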

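  • The construction is syntax-directed: N(r) is assembled from the NFAs of the subexpressions of r. A parenthesized expression needs no new states: N ( (s) ) = N ( s )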

  • Properties of the resulting NFA: N(r) accepts exactly the language of r. Each step of the construction adds at most two states, so $|N(r)| \le 2|r|$.

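  • Matching a string x against r, two approaches

    NFA: build N(r) from r ( $O(|r|)$ ), then simulate it on x ( $O(|r| \cdot |x|)$ ). This suits uses such as a text editor.

    DFA: build N(r) from r ( $O(|r|)$ ), build D(r) from N(r) ( $O(2^{|r|})$ ), then simulate it on x ( $O(|x|)$ ).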

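  • A lazy variant: build the states of D(r) from N(r) on demand while scanning x, computing $\delta(s, a)$ only when state s is actually reached on input a.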

  • How Lex works

    Lex recognizes each lexeme by running a finite automaton (FA) built from the patterns.
