CSC312 Automata Theory Lecture # 2 Languages


Alphabet and Strings

Alphabet: An alphabet is a finite set of symbols, usually letters, digits, and punctuation marks.

Valid/invalid alphabets: An alphabet may contain letters that consist of a group of symbols, for example Σ = {a, ba, bab, d}.

Remarks: When defining an alphabet whose letters consist of more than one symbol, no letter should start with another letter of the same alphabet, i.e. one letter should not be a prefix of another; otherwise a string cannot be split into letters unambiguously. However, a letter may end with a letter of the same alphabet. (By this rule the set {a, ba, bab, d} above is not itself a valid alphabet, because ba is a prefix of bab.)

Valid alphabet: Σ = {a, ba, c}

Invalid alphabet: Σ = {a, ab, c}
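The prefix rule in the remarks can be checked mechanically. The following is a minimal sketch, assuming letters are represented as Python strings; the function name `is_valid_alphabet` is hypothetical and not part of the lecture.

```python
from itertools import permutations

def is_valid_alphabet(letters):
    """Return True if no letter is a prefix of a different letter."""
    # permutations(..., 2) yields every ordered pair (a, b) of distinct letters
    return not any(b.startswith(a) for a, b in permutations(set(letters), 2))

print(is_valid_alphabet({"a", "ba", "c"}))  # True: "ba" only ends in "a"
print(is_valid_alphabet({"a", "ab", "c"}))  # False: "a" is a prefix of "ab"
```

A letter that merely ends in another letter passes the check, matching the remark that suffixes are allowed.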


Alphabets and Strings

String or word: A finite sequence of letters of an alphabet.

Examples: “cat”, “dog”, “house”, “read” …

Defined over the alphabet: Σ = {a, b, c, …, z}

Language: A language is a set of strings constructed from some alphabet, e.g. Urdu, English, Java, the set of all binary strings.


Sentences are made up of certain combinations of words.

Not all combinations of words lead to a valid English sentence.

So we see that some basic units are combined to make bigger units.


Languages

How can you tell whether a given sentence belongs to a particular language?

Black is cat the

The tea is hot

I like chocolates two much

Rules give a clue to forming as well as validating sentences.


Formal vs. Informal Rules

Informal language -> abstract languages

Incoherent strings are also understandable

Slang, idiom, dialect etc.

Raise ambiguity

Interpretation varies with region

I am through (BrE/AmE)

Same words have multiple meanings, e.g. like, light, base, etc.


Summary of Languages

Three aspects/specifications:

Lexical: Defines valid words/units of a language.

Syntactic: Defines rules for combining the units to form valid sentences (computer programs in the context of machines).

Semantic: Concerned with the interpretation or meaning of a sentence (what output to produce in the context of machines). Affected by ambiguity the most.


Formal languages

Rules defined explicitly and clearly

No ambiguities

Universally uniform understanding

Lets the machine:

- Interpret an input uniformly every time, i.e. always produce the same output for a particular input
- Avoid crashes because of ambiguity
- Explicitly and categorically reject invalid input


Formal Languages

Need uniformly understandable notation

Representations

Alphabet: Represents a finite set of fundamental units of a language, e.g. for English Σ = {a, b, …, z, A, …, Z}

Σ = {0, 1}

Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}


Formal Languages

List of words: The set of all valid words of a given language, e.g. a language English_Words that contains all valid words of English would have Γ = {all entries of the dictionary + punctuation marks and blank space}.

Denoted by Γ. It may be a finite or an infinite set.

Strings: A string is a finite sequence of symbols chosen from an alphabet. For example:

0111100, 123045, abbbcdeg, etc.


String Variable: A letter used to denote a string. The author uses w, x, y and z as string variables. For example

w = 0111100 , x = 123045, z = abbbcdeg

Length of String: The number of positions for symbols in the string. For simplicity we can say that it is the number of symbols in the string. For example

|w| = 7 , |x| = ? , |z| = ?


Alphabets and Strings

We will use small letters for alphabets: Σ = {a, b}

Strings:

a

ab

abba

baba

baaabbbaaba

u = ab

v = bbbaaa

w = abba


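String Operations

Let us have the following strings:

w = a_1 a_2 … a_n, e.g. w = abba

v = b_1 b_2 … b_m, e.g. v = bbbaaa

Concatenation

wv = a_1 a_2 … a_n b_1 b_2 … b_m, e.g. wv = abbabbbaaa

Reverse

w^R = a_n … a_2 a_1, e.g. (abba)^R = abba and (bbbaaa)^R = aaabbb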
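Concatenation and reversal map directly onto built-in string operations. A minimal sketch, assuming strings over Σ are modelled as Python `str` values and reusing the slide's examples abba and bbbaaa:

```python
w = "abba"
v = "bbbaaa"

wv = w + v        # concatenation: symbols of w followed by symbols of v
w_rev = w[::-1]   # reverse: symbols of w in the opposite order
v_rev = v[::-1]

print(wv)     # abbabbbaaa
print(w_rev)  # abba (a palindrome equals its own reverse)
print(v_rev)  # aaabbb
```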


String Length

Length: for w = a_1 a_2 … a_n, |w| = n

Examples:

|abba| = 4

|aa| = 2

|a| = 1


Length of Concatenation

|uv| = |u| + |v|

Example:

u = aab, |u| = 3

v = abaab, |v| = 5

|uv| = |aababaab| = 8

|uv| = |u| + |v| = 3 + 5 = 8
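The length rule for concatenation can be confirmed on the slide's example. A small sketch, assuming `len` plays the role of |·| for Python strings:

```python
u = "aab"
v = "abaab"
uv = u + v

print(len(u), len(v), len(uv))     # 3 5 8
print(len(uv) == len(u) + len(v))  # True
```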


Empty String

A string with no letters: λ (also written Λ or ε)

Observations:

|λ| = 0

λw = wλ = w

λabba = abbaλ = abba

Note-1: A language that does not contain any word at all is denoted by ∅ or { }. This language doesn’t contain any word, not even the NULL string, i.e. { } ≠ {λ}.


Empty String

Note-2: Suppose a language L doesn’t contain the NULL string λ. Then

L = L + ∅ but L ≠ L + {λ}, where + denotes union.

Important: NULL is the identity element with respect to concatenation.
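The difference between the null string and the empty language is easy to see in code. A minimal sketch, assuming the null string is modelled as Python's `""`, languages as Python sets, and + as set union; the names `LAMBDA`, `EMPTY_LANGUAGE` and the sample language `L` are illustrative only.

```python
LAMBDA = ""              # the null string: a string of length 0
EMPTY_LANGUAGE = set()   # the language that contains no words at all

w = "abba"
print(LAMBDA + w == w and w + LAMBDA == w)  # True: identity for concatenation
print(len(LAMBDA))                          # 0: string length
print(len(EMPTY_LANGUAGE), len({LAMBDA}))   # 0 1: set sizes differ

L = {"a", "ab"}  # a sample language that does not contain the null string
print((L | EMPTY_LANGUAGE) == L)  # True
print((L | {LAMBDA}) == L)        # False: the union gained one word
```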


Substring

Substring of a string: a subsequence of consecutive characters

String      Substring
abbab       ab
abbab       abba
abbab       b
abbab       bbab


Prefix and Suffix

Let the string be abbab (λ is the null string).

Prefixes    Suffixes
λ           abbab
a           bbab
ab          bab
abb         ab
abba        b
abbab       λ

In general, if w = uv, then u is a prefix and v is a suffix of w.
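Prefixes, suffixes and substrings of a string can all be produced by slicing. A minimal sketch, assuming Python strings and the example string abbab; the helper names are hypothetical.

```python
def prefixes(w):
    """All prefixes of w, from the null string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]

def suffixes(w):
    """All suffixes of w, from w itself down to the null string."""
    return [w[i:] for i in range(len(w) + 1)]

def substrings(w):
    """All runs of consecutive characters of w (including the null string)."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

print(prefixes("abbab"))  # ['', 'a', 'ab', 'abb', 'abba', 'abbab']
print(suffixes("abbab"))  # ['abbab', 'bbab', 'bab', 'ab', 'b', '']
print("bbab" in substrings("abbab"), "aa" in substrings("abbab"))  # True False
```

Pairing the i-th prefix with the i-th suffix reproduces every split w = uv.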


Repeat Operation

w^n = ww…w (n times), that is, w repeated n times

Example: (abba)^2 = abbaabba

Definition: w^0 = λ (the null string)

Example: (abba)^0 = λ
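Python's string repetition operator behaves like the repeat operation, including the n = 0 case. A minimal sketch; the function name `repeat` is illustrative.

```python
def repeat(w, n):
    """Return w concatenated with itself n times (the null string when n == 0)."""
    return w * n

print(repeat("abba", 2))        # abbaabba
print(repeat("abba", 0) == "")  # True
```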


The * Operation

Σ*: the set of all possible strings from alphabet Σ, called the closure of the alphabet, also known as the Kleene star operator or Kleene star closure.

Σ = {a, b}

Σ* = {λ, a, b, aa, ab, ba, bb, aaa, aab, …}

i.e. infinitely many words, each of finite length.


The + Operation

Σ+: the set of all possible strings from alphabet Σ except the null string λ, also known as the Kleene plus operator.

Σ = {a, b}

Σ* = {λ, a, b, aa, ab, ba, bb, aaa, aab, …}

Σ+ = Σ* - {λ} = {a, b, aa, ab, ba, bb, aaa, aab, …}

Note: Σ* and Σ+ are infinite.
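Because Σ* and Σ+ are infinite, a program can only enumerate them lazily. The sketch below assumes single-character symbols and uses Python generators; `sigma_star` and `sigma_plus` are hypothetical helper names.

```python
from itertools import count, islice, product

def sigma_star(alphabet):
    """Yield every string over `alphabet`, shortest first (never terminates)."""
    symbols = sorted(alphabet)
    for n in count(0):                          # lengths 0, 1, 2, ...
        for letters in product(symbols, repeat=n):
            yield "".join(letters)

def sigma_plus(alphabet):
    """Yield every non-null string over `alphabet`."""
    strings = sigma_star(alphabet)
    next(strings)                               # skip the null string
    yield from strings

print(list(islice(sigma_star("ab"), 9)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab']
print(list(islice(sigma_plus("ab"), 8)))
# ['a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab']
```

`islice` takes a finite number of items, which is the only safe way to consume these generators.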


Languages

A language is a set of strings, OR

a language is any subset of Σ*, usually denoted by L. It may be finite or infinite.

Example:

Σ = {a, b}

Σ* = {λ, a, b, aa, ab, ba, bb, aaa, …}

Languages:

{a, aa, aab}

{λ, abba, baba, aa, ab, aaaaaa}

If a string w is in L, we say that w is a sentence of L.


Note that:

Sets: ∅ = { } ≠ {λ}

Set size: |{ }| = |∅| = 0

Set size: |{λ}| = 1

String length: |λ| = 0


Another Example

An infinite language: L = {a^n b^n : n ≥ 0}

λ, ab, aabb, aaaaabbbbb ∈ L

abb ∉ L
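An infinite language cannot be stored as a set, but membership can still be decided by a function. A minimal sketch for L = {a^n b^n : n ≥ 0}; the function name `in_anbn` is hypothetical.

```python
def in_anbn(w):
    """Return True if w consists of n a's followed by n b's for some n >= 0."""
    n, remainder = divmod(len(w), 2)
    return remainder == 0 and w == "a" * n + "b" * n

for w in ["", "ab", "aabb", "aaaaabbbbb", "abb"]:
    print(repr(w), in_anbn(w))
# '' True
# 'ab' True
# 'aabb' True
# 'aaaaabbbbb' True
# 'abb' False
```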


Operations on Languages

The usual set operations:

{a, ab, aaaa} ∪ {bb, ab} = {a, ab, bb, aaaa}

{a, ab, aaaa} ∩ {bb, ab} = {ab}

{a, ab, aaaa} - {bb, ab} = {a, aaaa}

Complement: the complement of L is Σ* - L

Complement of {a, ba} = {λ, b, aa, ab, bb, aaa, …}
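Finite languages behave exactly like Python sets. The complement is infinite, so the sketch below only computes the part of it up to a chosen length; `strings_up_to` is a hypothetical helper and the bound 2 is an arbitrary assumption.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length <= max_len (a finite slice of Σ*)."""
    return {"".join(letters)
            for n in range(max_len + 1)
            for letters in product(sorted(alphabet), repeat=n)}

A = {"a", "ab", "aaaa"}
B = {"bb", "ab"}
print(sorted(A | B))  # ['a', 'aaaa', 'ab', 'bb']
print(sorted(A & B))  # ['ab']
print(sorted(A - B))  # ['a', 'aaaa']

L = {"a", "ba"}
complement_slice = strings_up_to("ab", 2) - L   # complement, lengths 0..2 only
print(sorted(complement_slice, key=lambda s: (len(s), s)))
# ['', 'b', 'aa', 'ab', 'bb']
```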


Reverse

Definition: L^R = {w^R : w ∈ L}

Examples:

{ab, aab, baba}^R = {ba, baa, abab}

L = {a^n b^n : n ≥ 0}

L^R = {b^n a^n : n ≥ 0}

Concatenation

Definition: L1 L2 = {xy : x ∈ L1, y ∈ L2}

Examples:

{a, ab, ba}{b, aa} = {ab, aaa, abb, abaa, bab, baaa}
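For finite languages, both definitions translate into one-line set comprehensions. A minimal sketch, with hypothetical function names:

```python
def reverse_language(L):
    """L^R: the reverse of every string in L."""
    return {w[::-1] for w in L}

def concat_languages(L1, L2):
    """L1 L2: every x from L1 followed by every y from L2."""
    return {x + y for x in L1 for y in L2}

print(reverse_language({"ab", "aab", "baba"}) == {"ba", "baa", "abab"})  # True
print(concat_languages({"a", "ab", "ba"}, {"b", "aa"})
      == {"ab", "aaa", "abb", "abaa", "bab", "baaa"})                    # True
```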


Repeat Operation

Definition: L^n = LL…L (n times), i.e. L concatenated with itself n times.

Example:

{a, b}^3 = {a, b}{a, b}{a, b} = {aaa, aab, aba, abb, baa, bab, bba, bbb}

Special case: L^0 = {λ}

{a, bba, aaa}^0 = {λ}
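The power of a finite language can be built by repeated concatenation, starting from {λ}. A minimal sketch; `language_power` is a hypothetical name and the null string is modelled as `""`.

```python
def language_power(L, n):
    """L^n: L concatenated with itself n times; L^0 is {""}."""
    result = {""}                      # start from the language {λ}
    for _ in range(n):
        result = {x + y for x in result for y in L}
    return result

print(sorted(language_power({"a", "b"}, 3)))
# ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']
print(language_power({"a", "bba", "aaa"}, 0))  # {''}
```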


More Examples

L = {a^n b^n : n ≥ 0}

L^2 = {a^n b^n a^m b^m : n, m ≥ 0}

aabbaaabbb ∈ L^2


Star-Closure (Kleene *)

Definition: L* = L^0 ∪ L^1 ∪ L^2 ∪ …

Example:

{a, bb}* = {λ, a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, …}


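Positive Closure

Definition: L+ = L^1 ∪ L^2 ∪ …

When L does not include the null string λ, this equals L* - {λ}.

Note: L+ includes λ if and only if L includes λ.

Example:

{a, bb}+ = {a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, …}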
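Both closures are infinite for any language with a non-null word, so a program can only list their members up to some length. The sketch below assumes a finite language given as a Python set and an arbitrary length bound `max_len`; the function names are hypothetical.

```python
def star_closure(L, max_len):
    """Strings of L* with length <= max_len."""
    result = {""}        # L^0 contributes the null string
    frontier = {""}      # strings found in the previous round
    while frontier:
        # extend each newly found string by one more word of L
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

def positive_closure(L, max_len):
    """Strings of L+ with length <= max_len, using L+ = L L*."""
    return {x + y for x in L for y in star_closure(L, max_len)
            if len(x + y) <= max_len}

by_length = lambda s: (len(s), s)
print(sorted(star_closure({"a", "bb"}, 3), key=by_length))
# ['', 'a', 'aa', 'bb', 'aaa', 'abb', 'bba']
print(sorted(positive_closure({"a", "bb"}, 3), key=by_length))
# ['a', 'aa', 'bb', 'aaa', 'abb', 'bba']
```

Calling `positive_closure({"", "a"}, 2)` returns a set that contains the null string, in line with the note that L+ includes λ exactly when L does.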


Lexicographical Order

Assume that the symbols in Σ are themselves ordered.

Definition: A set of strings is in lexicographical order if

- The strings are grouped first according to their length.
- Then, within each group, the strings are ordered “alphabetically” according to the ordering of the symbols.


Ex: Let the alphabet be Σ = {a, b}.

The set of all strings in lexicographical order is λ, a, b, aa, ab, ba, bb, aaa, …, bbb, aaaa, …, bbbb, …
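The ordering defined here (length first, then symbol order) corresponds to sorting with a two-part key. A minimal sketch, assuming single-character symbols; `lexicographical_sort` and its `symbol_order` parameter are hypothetical names.

```python
def lexicographical_sort(strings, symbol_order):
    """Sort by length first, then by the given ordering of the symbols."""
    rank = {symbol: i for i, symbol in enumerate(symbol_order)}
    return sorted(strings, key=lambda w: (len(w), [rank[c] for c in w]))

words = ["ba", "a", "bbb", "", "ab", "b", "aa", "aaa", "bb"]
print(lexicographical_sort(words, "ab"))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'bbb']
```

Passing `"ba"` as `symbol_order` would place b before a within each length group, which shows that the result depends on the assumed ordering of Σ.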