

IEEE TRANSACTIONS ON COMPUTERS, VOL. C-23, NO. 12, DECEMBER 1974

Structure Automata

YAACOV A. CHOUEKA

Abstract: By modifying the acceptability conditions in finite automata, a new and equivalent variant, the "structure automaton", is obtained. The collection $SR(\Sigma)$ of sets of tapes on $\Sigma$ definable by deterministic structure automata forms, however, a proper subset of the collection of regular sets. The structure and closure properties of $SR(\Sigma)$ are analyzed in detail, using a natural topology on $\Sigma^*$, in which the closed sets are the reverse ultimately definite sets. A set of tapes $V$ is in $SR(\Sigma)$ iff it is a finite union of regular "convex" sets. $SR(\Sigma)$ is closed under Boolean operations, but not closed under product, star, or transpose operations. In fact, $SR(\Sigma)$ is exactly the Boolean closure of the regular closed sets. The "signature" of a set is also defined and it is shown that a regular $V$ is in $SR(\Sigma)$ iff it has finite signature. Decision problems are also treated.

Index Terms: Closed regular sets, convex languages, definite and ultimately definite sets, finite automata, languages with finite signatures, minimal regular sets, open regular sets, structure automata.

I. INTRODUCTION

IN this paper we define and study a certain finite device, the "structure automaton", and the collection of sets defined by it. This device is related to, and was suggested by, a similar one fruitfully used by M. Rabin to develop his complex theory of finite automata on infinite trees [11]. The interest in this model is twofold. First, it is a natural setting for synthesizing conditions on input tapes of the form "X contains a "0" but not a "1", or a "2" but not a "3"," etc.; in fact, one may sometimes save states, in such cases, by passing from finite automata to structure automata. Second, the collection of sets defined by the deterministic structure automata, which is a proper subclass of the collection of all regular sets, seems to be of interest in its own right, since it can be characterized by two further, totally independent methods: the "topological" approach and the "signature" concept developed below.

The plan of the paper is as follows. After giving the appropriate definitions in Section II we show in Section III that these devices are equivalent to finite automata (in the sense that they define all, and only, regular sets), even when the structure is "maximally" restricted, and that they are essentially nondeterministic. The rest of the paper is centered around the study of the sets, termed here "structure regular" or, in short, S-regular sets, defined by the deterministic structure automata. Interestingly enough, these sets turn out to be closely related to the open (or closed) sets of a certain natural topology (which may be of some importance in its own right) on the space of all tapes on a given alphabet $\Sigma$. After giving a systematic development of the topological notions involved, we apply these notions, in Section V, to the regular sets, and show how they can be fruitfully used to obtain a unified approach to some well-known facts in this domain. All these results are then used, in Sections VI and VII, to study the closure properties of S-regular sets, and to give set-theoretical characterizations (in the Kleene fashion) of the collection of these sets. It is shown, in particular, that the collection of S-regular sets is the Boolean closure of the regular closed sets, or (which is exactly the same) of the regular reverse ultimately definite sets, introduced by Paz and Peleg in [7]. It is also the closure of the reverse definite sets (studied in [1] and [8]) under Boolean operations and left product by "minimal" regular sets. (A set of tapes is minimal if it does not contain two different tapes $X$, $Y$ such that $X$ is a prefix of $Y$.) What is needed now is an algebraic or set-theoretical characterization of minimal regular sets which is independent of the general notion of regularity. Decision problems are also treated.

In Section VIII, we introduce the general notion of a "signature" of a set of tapes, and give some of its properties, especially with relation to regular sets. We then show that a regular set of tapes is S-regular if and only if it has a finite signature. We have thus here one of those happy, and rather rare, cases when some subcollection of the collection of regular sets is picked out and defined in nonmachine terms, and then shown to be the class of sets defined by a finite machine of some sort.

Finally, some interesting connections between these notions and the theory of finite automata on infinite tapes are given in Section IX.

We assume that the reader is familiar with the elementary notions and techniques of automata theory. To make the paper self-contained, however, we have collected at the beginning of the next section all the definitions and facts relevant to our context, with full references. The reader interested in gaining more intuitive background about these ideas may wish to refer again to the now classic paper [9].

Manuscript received December 11, 1972; revised August 3, 1974. The author is with the Department of Mathematics, University of Illinois, Urbana, Ill. 61801, on leave of absence from the Department of Mathematics, Bar-Ilan University, Ramat-Gan, Israel.


II. NOTATION, TERMINOLOGY, AND RECAPITULATION OF KNOWN RESULTS

We recall some basic definitions from automata theory. For a given alphabet $\Sigma$ we denote by $\Sigma^*$ the set of all finite tapes (sequences) on $\Sigma$; for $X \in \Sigma^*$, $|X|$ is the length of $X$; $\Lambda$ is the unique tape of length 0. The product of the tapes $X = x_1 \cdots x_n$ and $Y = y_1 \cdots y_m$ is the tape $XY = x_1 \cdots x_n y_1 \cdots y_m$; $X\Lambda = \Lambda X = X$. The transpose of $X = x_1 \cdots x_n$ is the tape $X^T = x_n \cdots x_1$; $\Lambda^T = \Lambda$.

Sets of tapes will be denoted by $U$, $V$, etc.; union, intersection, and complementation (with respect to $\Sigma^*$) will be denoted as usual by $\cup$, $\cap$, and an overbar ($\overline{V}$). If $U$ is finite, then the number of its elements will be denoted by $\|U\|$; $U$ is a singleton set if $\|U\| = 1$. $P(U)$ is the power set of $U$. The product of two sets of tapes $U, V \subseteq \Sigma^*$ is the set $UV = \{XY : X \in U, Y \in V\}$. Powers are defined by $V^0 = \{\Lambda\}$, $V^{n+1} = V^n V$. The star of a set $V$ is the set

$$V^* = \bigcup_{i \ge 0} V^i,$$

and its transpose is the set $V^T = \{X^T : X \in V\}$. The derivative of $V$ with respect to $U$ is the set $D_U(V) = \{Y : XY \in V \text{ for some } X \in U\}$.

A table on $\Sigma$ is a triple $T = (S, M, S_1)$ where $S$ is a nonempty finite set (the set of states), $M$ is a function (the transition function) from $S \times \Sigma$ to $P(S)$, and $S_1$ (the set of initial states) is a nonempty subset of $S$. For $X = (x_i)$ a tape of length $n \ge 0$ on $\Sigma$, a run of $T$ on $X$ is a sequence $\phi = (s_i)$ of length $n + 1$ of elements of $S$, such that $s_0 \in S_1$, $s_{i+1} \in M(s_i, x_i)$. A finite automaton $\mathfrak{A}$ is a pair $\mathfrak{A} = (T, F)$ where $T = (S, M, S_1)$ is a table and $F \subseteq S$ is a specified set of final states. $X \in \Sigma^*$ is accepted by $\mathfrak{A}$, in symbols $X \in T(\mathfrak{A})$, if there is a run of $T$ on $X$ with the last state in $F$. A set $V \subseteq \Sigma^*$ is defined by $\mathfrak{A}$ if $V = T(\mathfrak{A})$; $V$ is regular if it is defined by some finite automaton $\mathfrak{A}$.

A table $T = (S, M, S_1)$ is deterministic if $S_1$ and $M(s, \sigma)$ are singleton sets (for every $s \in S$ and $\sigma \in \Sigma$). In this case $M$ can be extended to all of $S \times \Sigma^*$ by the inductive definition $M(s, X\sigma) = M(M(s, X), \sigma)$. Such a table $T = (S, M, \{s_1\})$ is connected if for every $s \in S$ there is some $X \in \Sigma^*$ such that $M(s_1, X) = s$. A finite automaton $\mathfrak{A} = (T, F)$ is deterministic, or connected, if the corresponding table $T$ has these properties.

The following facts are well known (for proofs see [9]; the numbers in brackets refer to the numbered theorems in that paper). Every regular set is defined by some (connected) deterministic automaton [9, th. 11]. The collection of regular sets is closed under union, intersection, complementation [9, th. 5], product, star, and transpose operations [9, th. 13.4] and contains all singleton sets of tapes. Given a finite automaton $\mathfrak{A}$ one can effectively decide whether $T(\mathfrak{A})$ is empty, finite, or infinite [9, ths. 7.1 and 9.1]; given two automata $\mathfrak{A}$, $\mathfrak{B}$, one can effectively decide whether they are equivalent, i.e., whether $T(\mathfrak{A}) = T(\mathfrak{B})$ [9, 10.1]. A set is regular if and only if all its derivatives are regular [12, th. 2].

Finally, a set $V \subseteq \Sigma^*$ is definite (cf. [1], [8]) if $V = V_1 \cup \Sigma^* V_2$ where $V_1$ and $V_2$ are finite; it is ultimately definite (cf. [7]) if $V = \Sigma^* V_1$ for some $V_1 \subseteq \Sigma^*$. The "reverse" notions will be more useful here. $V$ is reverse definite (reverse ultimately definite) if $V^T$ is definite (ultimately definite, respectively).

III. STRUCTURE AUTOMATA-BASIC FACTS

Definition 3.1: Let $S$ be a finite set. A structure on $S$ is a binary relation on $P(S)$, i.e., a set $\Omega$ of pairs $(H, L)$ where $H \subseteq S$ and $L \subseteq S$. The structure $\Omega$ is simple if $\|\Omega\| = 1$, and atomic if $\|H\| \le 1$ and $\|L\| \le 1$ for every $(H, L) \in \Omega$. A set $S' \subseteq S$ satisfies $\Omega$, in symbols $S' \in [\Omega]$, if there is some $(H, L) \in \Omega$ such that $S' \cap H \ne \emptyset$ and $S' \cap L = \emptyset$.

If $\phi = (s_i)$ is any sequence of elements of $S$, we put $J(\phi) = \{s : s = s_i \text{ for some } i\}$.

Definition 3.2: A structure automaton $\mathfrak{A}$ is a pair $(T, \Omega)$, where $T$ is a table and $\Omega$ is a structure on the set of states of $T$. A tape $X$ is accepted by $\mathfrak{A}$ ($X \in T(\mathfrak{A})$) if there is some run $\phi$ of $T$ on $X$ such that $J(\phi) \in [\Omega]$. $V \subseteq \Sigma^*$ is defined by $\mathfrak{A}$ if $V = T(\mathfrak{A})$. A structure automaton is simple or atomic if its structure has these properties.

We immediately remark that if $\mathfrak{A} = (T, \Omega)$ is a structure automaton, and $(H, L) \in \Omega$, then we can always assume that $H \cap L = \emptyset$; for, in the other case, we just replace $(H, L)$ by $(H - L, L)$ without affecting $T(\mathfrak{A})$.
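The acceptance condition of Definitions 3.1 and 3.2 can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the paper: the transition function is a dictionary from (state, letter) pairs to sets of states, a structure is a list of (H, L) pairs of sets, and the names `satisfies`, `accepts`, and `M_EXAMPLE` are hypothetical. Nondeterminism is handled by carrying, for every partial run, its current state together with the set of states visited so far.

```python
def satisfies(visited, structure):
    """S' satisfies the structure if, for some pair (H, L), it meets H and avoids L."""
    return any(visited & H and not (visited & L) for H, L in structure)


def accepts(tape, M, initial, structure):
    """Structure-automaton acceptance: some run whose visited set J satisfies the structure."""
    # Each configuration is (current state, frozenset of states visited so far).
    configs = {(s, frozenset([s])) for s in initial}
    for letter in tape:
        configs = {(t, seen | {t})
                   for s, seen in configs
                   for t in M.get((s, letter), ())}
    return any(satisfies(seen, structure) for _, seen in configs)


# Hypothetical two-state table on the alphabet {0, 1}: state "b" is entered on the first "1".
M_EXAMPLE = {("a", "0"): {"a"}, ("a", "1"): {"b"},
             ("b", "0"): {"b"}, ("b", "1"): {"b"}}
```

With this toy table, the structure `[({"b"}, set())]` accepts exactly the tapes containing a "1", while `[({"a"}, {"b"})]` accepts exactly the tapes containing no "1".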

Notation: When dealing with singleton sets of states, we shall usually omit the braces enclosing their unique members; thus we write $(S, M, s_1, s_2)$ instead of $(S, M, \{s_1\}, \{s_2\})$, and $(s_1, s_2) \in \Omega$ instead of $(\{s_1\}, \{s_2\}) \in \Omega$, etc.

Example 3.3: Let $\Sigma = \{0, 1, 2\}$ and let $\mathfrak{A} = (\{s_0, s_1, s_2\}, M, s_2, \Omega)$ be the deterministic structure automaton on $\Sigma$ defined by $\Omega = \{(S, s_0), (S, s_1)\}$ and $M(s, i) = s_i$ for $i = 0, 1, 2$ and $s \in S$. It is easily seen that the set defined by $\mathfrak{A}$ is $V = \{0, 2\}^* \cup \{1, 2\}^*$. The minimal finite automaton defining $V$, however, has four states (as follows from [9, corollary 2.1]).

Similarly, if we take the set $V$ of all tapes on $\Sigma = \{0, 1, 2, 3\}$ which contain a "0" but not a "1", or a "2" but not a "3", then a deterministic structure automaton with only 5 states defining $V$ can be found, while the minimum number of states in any finite automaton defining $V$ is 8. Thus one may sometimes save states by passing from finite automata to structure automata; this is perhaps not so surprising, since the "checking mechanism" of the latter is more complex than that of the finite automata.
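The three-state automaton of Example 3.3 is small enough to check by brute force. The sketch below is an illustration only: states $s_0, s_1, s_2$ are encoded as the integers 0, 1, 2, tapes as Python strings over "012", and the helper names (`accepted`, `in_V`, `agree_up_to`) are hypothetical. It compares the structure-automaton acceptance condition with direct membership in $\{0,2\}^* \cup \{1,2\}^*$ on all tapes up to a given length.

```python
from itertools import product

S = frozenset({0, 1, 2})                            # state i stands for s_i
OMEGA = [(S, frozenset({0})), (S, frozenset({1}))]  # the pairs (S, s_0) and (S, s_1)


def accepted(tape):
    """Run the deterministic table (start in s_2, M(s, i) = s_i) and test the structure."""
    visited = {2}
    for letter in tape:
        visited.add(int(letter))   # the next state depends only on the letter read
    return any(visited & H and not (visited & L) for H, L in OMEGA)


def in_V(tape):
    """Direct membership test for V = {0,2}* union {1,2}*."""
    letters = set(tape)
    return letters <= set("02") or letters <= set("12")


def agree_up_to(max_len):
    """True if both tests agree on every tape of length at most max_len."""
    return all(accepted("".join(w)) == in_V("".join(w))
               for n in range(max_len + 1)
               for w in product("012", repeat=n))
```

Agreement on short tapes is of course only a sanity check, not a proof; the general argument is the one-line observation that the run on $X$ visits $s_i$ exactly when the letter $i$ occurs in $X$.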

Theorem 3.4: Let $V \subseteq \Sigma^*$ be a set of tapes. Then the following three conditions are equivalent. 1) $V$ is a regular set. 2) $V$ is defined by some simple atomic structure automaton. 3) $V$ is defined by some structure automaton.

Proof: Let $V$ be a regular set and put $U = V^T$. $U$ is certainly regular and so is defined by some deterministic automaton with one initial state and a set of final states. By the construction of [9, th. 12], $U^T = (V^T)^T = V$ is defined by some (nondeterministic) automaton $\mathfrak{A} = (S, M, S_1, \{t\})$ with a set of initial states $S_1$ and a unique final state $t$. Define a structure automaton $\mathfrak{A}' = (S', M', S_1', \Omega)$ as follows: $S' = (S \times \{0, 1\}) \cup \{f\}$, where $f$ (for "failure") is a new state; $S_1' = S_1 \times \{0\}$ (and if $t \in S_1$ we adjoin to $S_1'$ also the state $(t, 1)$); $M'((s, 0), \sigma) = M(s, \sigma) \times \{0, 1\}$, $M'((s, 1), \sigma) = M'(f, \sigma) = \{f\}$; finally, $\Omega = \{((t, 1), f)\}$.

It is easy to see that $T(\mathfrak{A}')$ is indeed $V$. In fact, all we have done is to allow $\mathfrak{A}'$ to "guess" the end of the tape $X$ (by choosing 1 as the second component of the state), and if it guesses wrong, it "sinks" into a failure state, from which it never recovers. The formal proof is simple and is left to the reader.

Condition 2) entails 3) by definition. We show that 3) entails 1), thus proving the theorem. Let $V = T(\mathfrak{A})$, where $\mathfrak{A} = (S, M, S_1, \Omega)$ is a structure automaton. Because of the closure of regular sets under union, we may assume without loss of generality that $\Omega$ is a singleton structure, i.e., $\Omega = \{(H, L)\}$. Define a (nondeterministic) finite automaton $\mathfrak{A}' = (R, N, R_1, F)$ as follows. Let $S' = S - L$, and put

$$R = S' \cup (S' \times \{1\}),$$

$$R_1 = [(S_1 \cap H) \times \{1\}] \cup [S_1 \cap (S' - H)],$$

$$N(s', \sigma) = [(M(s', \sigma) \cap H) \times \{1\}] \cup [M(s', \sigma) \cap (S' - H)],$$

$$N((s', 1), \sigma) = (M(s', \sigma) \cap S') \times \{1\}, \qquad F = S' \times \{1\}.$$

Informally speaking, we first eliminate from $S$ all the "failure" states of $L$. Then, every time we reach a state from $H$ we "mark" it with a 1, and continue marking all subsequent states (not in $L$) by 1. A tape $X$ is accepted by $\mathfrak{A}'$ if and only if there is some run on $X$ with the last state marked by 1. That $T(\mathfrak{A}') = V$ is obvious. Q.E.D.
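The marking construction in the proof that 3) entails 1) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's own formulation: states are plain hashable values that are not themselves tuples (so a marked state can be encoded as the pair `(s, 1)`), transition functions are dictionaries from (state, letter) to sets, and all function names and the example data `EX_*` are hypothetical. The first function implements the definition of acceptance directly, so that the two can be compared on short tapes.

```python
from itertools import product


def structure_accepts(tape, M, S1, H, L):
    """Direct definition for the singleton structure {(H, L)}."""
    configs = {(s, frozenset([s])) for s in S1}
    for a in tape:
        configs = {(t, seen | {t}) for s, seen in configs for t in M.get((s, a), ())}
    return any(seen & H and not (seen & L) for _, seen in configs)


def mark(states):
    """Attach the mark 1 to every state of the given set."""
    return {(s, 1) for s in states}


def to_nfa(states, sigma, M, S1, H, L):
    """Build (N, R1, F) as in the proof: drop L, mark runs once they touch H."""
    H = H - L                 # normalisation H := H - L, harmless as remarked in the text
    Sp = states - L           # S' = S - L
    R1 = mark(S1 & H) | (S1 & (Sp - H))
    N = {}
    for s, a in product(Sp, sigma):
        succ = M.get((s, a), set())
        N[(s, a)] = mark(succ & H) | (succ & (Sp - H))
        N[((s, 1), a)] = mark(succ & Sp)
    return N, R1, mark(Sp)


def nfa_accepts(tape, N, R1, F):
    """Ordinary NFA acceptance: some run ends in a final (marked) state."""
    current = set(R1)
    for a in tape:
        current = set().union(*(N.get((q, a), set()) for q in current))
    return bool(current & F)


# Hypothetical example: visit state 1 and never state 2.
EX_STATES, EX_SIGMA = {0, 1, 2}, "ab"
EX_M = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2},
        (1, "b"): {1}, (2, "a"): {2}, (2, "b"): {2}}
EX_S1, EX_H, EX_L = {0}, {1}, {2}
```

In the example, a run can move to state 1 on any "a" and must then read only "b", so the structure automaton accepts exactly the tapes containing an "a"; the NFA produced by `to_nfa` should agree with it on every tape.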

It is interesting to contrast this situation with that of structure automata on infinite tapes and trees. For infinite tapes, the structure on the nondeterministic automata may be restricted to be either simple or atomic, but generally not both, without weakening the acceptability power of the automata (as is implicit in [3]); while for structure automata on infinite trees, neither one of these conditions may be specified without essentially weakening their power (oral remark by M. Rabin). Thus a certain natural parallelism exists between the complexity of the structures required and the complexity of the objects on which the automata act. (The situation for deterministic structure automata is different; cf. Theorem 7.2 below.)

Theorem 3.5: Structure automata are essentially nondeterministic.

Proof: Take $\Sigma = \{0\}$, $V = \{00\}^*$. $V$ is certainly regular; however, it is not a reverse definite set, so that by Theorem 6.5 (to be proven in Section VI) it is not accepted by a deterministic structure automaton. For $\|\Sigma\| > 1$, let $X$ be any tape of positive length on $\Sigma$. We show that the set $V = \Sigma^*\{X\}$ (which is certainly regular) is not accepted by any deterministic structure automaton. (This example will be used again in Section VI.) Suppose, on the contrary, that $V = T(\mathfrak{A})$, where $\mathfrak{A} = (T, \Omega)$, $T$ is deterministic, and $\|\Omega\| = k$. Let $\sigma \in \Sigma$ be any letter different from the last letter in $X$, and for $1 \le j \le k + 1$, put $Y_j = (\sigma X)^j$, and let $\phi_j$ be the run of $\mathfrak{A}$ on $Y_j$. Since $Y_j \in V$ for every $1 \le j \le k + 1$ there must be $1 \le i_1 < i_2 \le k + 1$ and $(H, L) \in \Omega$ such that if we put $\Omega' = \{(H, L)\}$, then $J(\phi_{i_1})$ and $J(\phi_{i_2})$ satisfy $\Omega'$. Let now $\phi_Z$ be the run of $\mathfrak{A}$ on $Z = Y_{i_1}\sigma$. Since $J(\phi_{i_1}) \subseteq J(\phi_Z) \subseteq J(\phi_{i_2})$, we have $J(\phi_Z) \in [\Omega]$, that is $Z \in V$, a contradiction. Q.E.D.

IV. A TOPOLOGY ON $\Sigma^*$

In this section we introduce a certain natural and useful topology on $\Sigma^*$, and prove some of its basic properties. Though it is true that no "deep" results are really connected with these notions, we hope to show that the topological nomenclature is a very useful framework for dealing with some subsets of the collection of regular sets, and can give a unified and smooth development of scattered results in this area.

We note that most of these topological notions have already appeared in the literature. Thus, in [4], a definition was given of "open sets of tapes," which is equivalent to the one given below (some of the lemmata given in this section are already implicit in [4]). Also the reverse ultimately definite sets of [7] coincide with the closed sets of this topology. It seems, however, that no systematic development of these ideas has appeared in the literature.

First, we define a partial ordering of $\Sigma^*$ as follows.

Definition 4.1: For $X, Y \in \Sigma^*$ we say that $X$ is a prefix of $Y$, and write $X \le Y$, if $Y = XZ$ for some $Z \in \Sigma^*$; we write $X < Y$ if $X \le Y$ and $X \ne Y$. A set $V \subseteq \Sigma^*$ is minimal if $X < Y$ and $Y \in V$ entail $X \notin V$; it is convex if $X, Y \in V$ and $X \le Y$ entail $Z \in V$ for all $X \le Z \le Y$ (that is, if together with any two comparable tapes, it contains the whole "closed interval" defined by them).

Definition 4.2: A set $V \subseteq \Sigma^*$ is open if $X \le Y$ and $Y \in V$ entail $X \in V$.

It is readily seen that arbitrary unions and intersections of open sets are open. Since $\emptyset$ and $\Sigma^*$ are clearly open, we have indeed defined a topology, in the usual sense, on $\Sigma^*$. All elementary notions and results of point set topology can now be used (see, for example, [5] for more details).

In particular, $V$ is closed if its complement $\overline{V}$ is open, and, by De Morgan's laws, arbitrary unions and intersections of closed sets are closed. The closure of a set $V$, $\mathrm{cl}(V)$, is the intersection of all closed sets containing $V$, and its interior, $\mathrm{int}(V)$, is the union of all open sets contained in $V$. For later purposes we also introduce the following definitions.

Definition 4.3: The kernel of $V$, $k(V)$, is the set of minimal elements of $V$. The convex hull of $V$, $\mathrm{conh}(V)$, is the intersection of all convex sets containing $V$. The envelope of $V$, $\mathrm{env}(V)$, is the intersection of all open sets containing $V$. The following lemma is immediate.

Lemma 4.4: For any set $V \subseteq \Sigma^*$, the following holds. 1) $\mathrm{int}(V)$ and $\mathrm{env}(V)$ are open, $\mathrm{cl}(V)$ is closed, $k(V)$ is minimal, and $\mathrm{conh}(V)$ is convex. 2) $V$ is open if and only if $\mathrm{int}(V) = V = \mathrm{env}(V)$; closed iff $\mathrm{cl}(V) = V$; minimal iff $k(V) = V$; and convex iff $\mathrm{conh}(V) = V$.

It is also easily verified that a set $V$ is closed if and only if $X \in V$ entails $XZ \in V$ for all $Z \in \Sigma^*$, and that $\mathrm{env}(V)$ consists of all prefixes of tapes in $V$. This is used in the following lemma, which shows that topological notions in $\Sigma^*$ can be defined using formulas with set-theoretical operations.

Lemma 4.5: For any set $V \subseteq \Sigma^*$, the following formulas hold (with $\Sigma^+ = \Sigma\Sigma^*$). 1) $\mathrm{cl}(V) = V\Sigma^* = k(V)\Sigma^*$. 2) $\mathrm{int}(V) = \overline{\mathrm{cl}(\overline{V})}$. 3) $k(V) = V \cap \overline{V\Sigma^+}$. 4) $\mathrm{env}(V) = [D_{\Sigma^*}(V^T)]^T$. 5) $\mathrm{conh}(V) = \mathrm{cl}(V) \cap \mathrm{env}(V)$.

Proof: As to 1), since $V\Sigma^*$ is closed and contains $V$, $\mathrm{cl}(V) \subseteq V\Sigma^*$; if $X \in V\Sigma^*$ let $Y \le X$ be the shortest initial section of $X$ which is in $V$; then $Y \in k(V)$ and $X \in k(V)\Sigma^*$; finally $k(V)\Sigma^*$ is certainly contained in any closed set which contains $V$, so that $k(V)\Sigma^* \subseteq \mathrm{cl}(V)$. This completes the proof of 1). Formula 2) is true in any topological space, and 3) and 4) are obvious. Coming to 5), put $U = \mathrm{cl}(V) \cap \mathrm{env}(V)$. We first remark that $U$ contains $V$ and is convex (being the intersection of two convex sets), so that $\mathrm{conh}(V) \subseteq U$. Let now $V_1$ be a convex set containing $V$, and let $X \in U$. There are then $Z_1, Z_2, Z_3$ such that $X = Z_1 Z_2$, $Z_1 \in V$, and $Z_1 Z_2 Z_3 \in V$. But then $Z_1 Z_2 = X \in V_1$, too, so that $U \subseteq V_1$. Thus $U \subseteq \mathrm{conh}(V)$ and 5) is proven. Q.E.D.
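For a finite set of tapes, the formulas of Lemma 4.5 translate almost literally into code. The sketch below is an illustration only, with tapes represented as Python strings and all function names hypothetical. Since $\mathrm{cl}(V) = V\Sigma^*$ is infinite whenever $V$ is nonempty, the closure is represented by a membership test rather than by an explicit set; the convex hull of a finite set is finite and is computed via formula 5).

```python
def prefixes(x):
    """All prefixes of the tape x, including the empty tape and x itself."""
    return {x[:i] for i in range(len(x) + 1)}


def kernel(V):
    """k(V): the tapes of V that have no proper prefix in V (formula 3)."""
    return {x for x in V if not ((prefixes(x) - {x}) & V)}


def envelope(V):
    """env(V): all prefixes of tapes in V."""
    return set().union(*(prefixes(x) for x in V))


def in_closure(x, V):
    """x lies in cl(V) = V Sigma* iff some prefix of x lies in V (formula 1)."""
    return bool(prefixes(x) & V)


def convex_hull(V):
    """conh(V) = cl(V) intersected with env(V) (formula 5)."""
    return {x for x in envelope(V) if in_closure(x, V)}


def is_convex(V):
    """V is convex iff it equals its convex hull (Lemma 4.4)."""
    return convex_hull(V) == set(V)
```

For instance, for $V = \{10, 110, 1010\}$ the hull adds the missing tape 101, so `is_convex` reports that this set is not convex, while $\{01, 1101\}$ (two incomparable tapes) is convex.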

It is now clear that $V$ is closed if and only if $V = V'\Sigma^*$ for some $V' \subseteq \Sigma^*$. The closed sets are thus exactly the reverse ultimately definite sets of [7]. Incidentally we remark that $\Sigma^*$, with this topology, is a $T_0$-space, since two different points (tapes) have different closures; it is not, however, even a $T_1$-space, since singleton sets are not closed.

In some topological spaces, a special role is played by the "clopen" sets, i.e., sets which are both open and closed. In our case, however, only $\emptyset$ and $\Sigma^*$ are clopen, for if $V \ne \emptyset$ is clopen then $\Lambda \in V$, and since $V$ is closed, it is equal to $\Sigma^*$. Instead, a special role will be played in this context by sets which are the intersection of an open set with a closed set, and these are precisely the convex sets, as is now shown.

Lemma 4.6: $V$ is convex if and only if it is the intersection of a closed set and an open set.

Proof: If $V$ is convex then $V = \mathrm{conh}(V) = \mathrm{cl}(V) \cap \mathrm{env}(V)$. On the other hand, let $V = V_1 \cap V_2$ where $V_1$ is closed and $V_2$ open, and take $X \le Z \le Y$, where $X, Y \in V$. $X \in V_1$ which is closed, so $Z \in V_1$; $Y \in V_2$ which is open, so $Z \in V_2$, i.e., $Z \in V$. Q.E.D.

Another equivalent condition for $V$ to be convex is that $V \cup \mathrm{env}(k(V))$ is open. We omit the proof.

Lemma 4.7: Denote by CO the collection of all convex sets (in $\Sigma^*$). The following statements are then true. 1) All closed, open, minimal, or singleton sets are in CO. 2) CO is closed under arbitrary intersections. 3) CO is not closed under union, complementation, product, star, or transpose operations. 4) If $V_1$ is minimal and $V_2 \in$ CO, then $V_1 V_2 \in$ CO. 5) If $V$ is open or closed, then $V^* \in$ CO; if it is minimal, then $V^* \in$ CO if and only if $V = \Sigma' \subseteq \Sigma$.

Proof: As to statement 1), since $V = V \cap \Sigma^*$ and $\Sigma^*$ is both closed and open, it is clear that open or closed sets are convex sets. Minimal sets vacuously satisfy Definition 4.1, and singleton sets are certainly minimal. Statement 2) is trivial. As for 3), take $V = \{10, 110, 1010\}$; $V$ is not convex although it is the union of three singleton sets, and the product of the two convex sets $\{\Lambda, 1, 10\}$, $\{10\}$. If CO were closed under complements, then it would be closed under unions too, which is not true. Finally, take $V = \{01, 1101\}$; $V$ is convex although $V^T$ and $V^*$ are not.

As for 4), suppose $V_1$ minimal, $V_2 \in$ CO, and let $X \le Y \le Z$ be such that $X, Z \in V_1 V_2$; then $X = X_1 X_2$, $Z = Z_1 Z_2$, where $X_i, Z_i \in V_i$, for $i = 1, 2$. Since $V_1$ is minimal, $X_1 = Z_1$. Let then $Y = X_1 Y_2$; $X_2 \le Y_2 \le Z_2$, so $Y_2 \in V_2$, that is $Y \in V_1 V_2$.

As to 5), it is easily verified that if $V$ is closed or open, then so is $V^*$. Suppose now that $V$ is minimal, and there is some $X \in V$ with $|X| > 1$. Let $\sigma$ be the first letter of $X$. $X$ and $X^2$ are in $V^*$, while $X\sigma$ is not (since $X\sigma \notin V$ and no proper prefix of $X$ is in $V$). Thus $V^*$ is not convex.

Q.E.D.

Lemma 4.8: The derivative of a closed, open, convex, or minimal set $V$ with respect to any tape $X$ is again closed, open, convex, or minimal, respectively. (In particular CO is closed under such derivatives.)

Proof: For closed or open sets, the assertion is trivial. For convex sets, suppose $Y_1, Y_2 \in D_X(V)$, where $Y_1 \le Y_2$, and let $Y_1 \le Z \le Y_2$. We then have $XY_1, XY_2 \in V$ and $XY_1 \le XZ \le XY_2$, so that $XZ \in V$, that is $Z \in D_X(V)$. Finally, if for some set $V$ and some tape $X$, $D_X(V)$ is not minimal, there are $Y_1, Y_2 \in D_X(V)$ with $Y_1 < Y_2$; but then $XY_1, XY_2 \in V$ and $XY_1 < XY_2$, which shows that $V$ too was not minimal. Q.E.D.

We remark that Lemma 4.8 is not true for convex or minimal sets and general derivatives; just take $V = \{01, 1111\}$, $U = \{0, 1\}$. $D_U(V)$ is then equal to $\{1, 111\}$, and so is not convex (and not minimal) even though $V$ and $U$ are both minimal (and so convex too).

Turning now to cardinality questions, we note that it was proven in [7, ex. 4.2] that the collection of all ultimately definite sets on any alphabet of at least two letters has cardinality $\aleph$. (The proof given there was related to some notions and results of probabilistic automata theory.) This result is included in the following theorem.

Theorem 4.9: For every alphabet $\Sigma$ with at least two letters, the collections of all 1) open sets, 2) closed sets, 3) minimal sets, 4) convex sets, of tapes on $\Sigma$ have cardinality $\aleph$.

Proof: The cardinality of all these collections is at most $\aleph$; we show that it is at least $\aleph$. Clearly it suffices to show this for the alphabet $\Sigma = \{0, 1\}$. Let $P$ be the set of all infinite sequences on $\Sigma$ of the form $0^{n_1} 1 0^{n_2} 1 0^{n_3} 1 \cdots$, where $n_i > 0$. The cardinality of $P$ is $\aleph$. For any $a \in P$, let $I(a)$ be the set of all finite prefixes of $a$. $I(a)$ is certainly open, and if $a, b \in P$ and $a \ne b$, then $I(a) \ne I(b)$. The collection $C = \{I(a) : a \in P\}$ is thus a collection of $\aleph$ different open sets. The result for closed sets and transposes of closed sets (i.e., ultimately definite sets) follows immediately.

For minimal sets, take $C' = \{V\{110\} : V \in C\}$. $C'$ has cardinality $\aleph$, and all members of $C'$ are minimal sets. For, if $X, Y \in V' = V\{110\}$ and $X < Y$, then $Y = XZ$ where $Z \ne \Lambda$ and $X = X_1 110$, that is $Y = X_1 110 Z$. But this is impossible, since no tape $Y$ in $V'$ can have a proper subtape 110 which is not a final section of $Y$. Since every minimal set is also convex, the result for convex sets follows immediately. Q.E.D.

V. APPLICATIONS TO REGULAR SETS

Theorem 5.1: 1) If $V \subseteq \Sigma^*$ is regular, then so are $\mathrm{cl}(V)$, $\mathrm{int}(V)$, $k(V)$, and $\mathrm{env}(V)$. Moreover, given a finite automaton defining $V$, one can effectively construct the automata which define these sets. 2) Given a finite automaton $\mathfrak{A}$, one can effectively decide whether $V = T(\mathfrak{A})$ is closed, open, minimal, or convex.

Proof: Statement 1) of Theorem 5.1 follows immediately from the formulas in Lemma 4.5 and the effective closure of the collection of regular sets under Boolean operations as well as the transpose and derivative operations. Statement 2) follows from 1) together with Lemma 4.4, statement 2), and the fact that we can check whether two given automata are equivalent or not. Q.E.D.

More direct and economical constructions can be given using the following device.

Definition 5.2: Let $T = (S, M, s_1)$ be a deterministic table, and $S' \subseteq S$. Take a new state $f$ and define the table $T[S'] = (S \cup \{f\}, M', s_1)$ by $M'(s, \sigma) = M(s, \sigma)$ for $s \notin S'$, $M'(s, \sigma) = M'(f, \sigma) = f$ for $s \in S'$. $T[S']$ will be called the $S'$ merging of $T$.

Now let $V = T(\mathfrak{A})$, where $\mathfrak{A} = (T, F)$ and $T = (S, M, s_1)$. $\mathrm{cl}(V)$ is then defined by $\mathfrak{A}_1 = (T[F], F \cup \{f\})$, $\mathrm{int}(V)$ by $\mathfrak{A}_2 = (T[S - F], F)$, $k(V)$ by $\mathfrak{A}_3 = (T[F], F)$, and $\mathrm{env}(V)$ by $\mathfrak{A}_4 = (T, F')$ where $F'$ is the set of all states $s$ for which $T((S, M, s, F))$ is not empty ($F'$ can be effectively found).

For the decision procedures, let $V = T(\mathfrak{A})$, $\mathfrak{A} = (S, M, s_1, F)$, and suppose $\mathfrak{A}$ is connected. The following is then easily verified: $V$ is closed if and only if $M(s, \sigma) \in F$ for every $s \in F$ and $\sigma \in \Sigma$ (this criterion was given by Paz and Peleg in [7]); $V$ is open if and only if $M(s, \sigma) \notin F$ for every $s \notin F$ and $\sigma \in \Sigma$; $V$ is minimal if and only if for every $s \in F$ the automaton $(S, M, s, F)$ accepts only $\Lambda$. For the convexity property, let $F' = \{M(s, \sigma) : s \in F, \sigma \in \Sigma\} - F$, and put $\mathfrak{B} = (T[F'], F)$. Certainly $T(\mathfrak{B}) \subseteq T(\mathfrak{A})$. We claim that $V$ is convex if and only if $T(\mathfrak{B}) = T(\mathfrak{A})$. Suppose $V$ is not convex and let $X \le Y \le Z$ be such that $X, Z \in V$, while $Y \notin V$. If $\phi = (s_i)$ is the run of $\mathfrak{A}$ on $Z$, there is certainly some $i < |Z|$ such that $s_i \in F'$; but then the last state in the run of $\mathfrak{B}$ on $Z$ is $f$, so that $Z \notin T(\mathfrak{B})$. The other side is similarly proven.

Remark 5.3: The "converse," in some sense, of Theorem 5.1 is not true, even for one-letter alphabets $\Sigma$, as shown by the set $V = \{X : |X| = n^2, n = 1, 2, \ldots\}$ which is certainly not regular, although $k(V) = \Sigma$, $\mathrm{cl}(V) = \Sigma\Sigma^*$, $\mathrm{int}(V) = \emptyset$, $\mathrm{env}(V) = \Sigma^*$, and all these sets are obviously regular.
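The decision criteria stated before Remark 5.3 for a connected deterministic automaton can be sketched directly. The code below is an illustration under assumptions that are not in the paper: the transition function is a total dictionary from (state, letter) to a single state, the automaton is assumed connected, and all function names are hypothetical. For convexity, instead of building the merged automaton $(T[F'], F)$ and testing equivalence, the sketch uses a reachability reformulation (no state of $F'$ may lead back into $F$), which is this sketch's own shortcut rather than the procedure of the text.

```python
def reachable(M, sigma, start):
    """States reachable from `start` (including `start` itself) in the table M."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for a in sigma:
            t = M[(s, a)]
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def is_closed(M, sigma, states, F):
    """Closed: every transition out of a final state stays in F."""
    return all(M[(s, a)] in F for s in F for a in sigma)


def is_open(M, sigma, states, F):
    """Open: no transition leads from a non-final state into F."""
    return all(M[(s, a)] not in F for s in states - F for a in sigma)


def is_minimal(M, sigma, states, F):
    """Minimal: from a final state, no nonempty tape leads back into F."""
    return all(not (reachable(M, sigma, M[(s, a)]) & F) for s in F for a in sigma)


def is_convex(M, sigma, states, F):
    """Convex: F' (one-step successors of F outside F) can never re-enter F."""
    F_prime = {M[(s, a)] for s in F for a in sigma} - F
    return all(not (reachable(M, sigma, q) & F) for q in F_prime)
```

As a small check, a three-state automaton for $\{0\}\Sigma^*$ on $\{0, 1\}$ (start state, accepting sink, dead sink) is reported closed and convex but neither open nor minimal, whereas the two-state automaton for $\{00\}^*$ on $\{0\}$ fails all four tests.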

Corollary 5.4: A regular set $V$ is convex if and only if it is the intersection of a regular closed set with a regular open set.

Proof: If $V$ is regular and convex, then by Lemma 4.5, $V = \mathrm{cl}(V) \cap \mathrm{env}(V)$, and by Theorem 5.1 these two sets are regular. Q.E.D.

Theorem 5.5: All closed, open, convex, and minimal sets on a one-letter alphabet $\Sigma$ are regular. This is not true if $\|\Sigma\| > 1$.

Proof: If $\Sigma$ is a one-letter alphabet, then for every $X,Y \in \Sigma^*$, either $X \le Y$ or $Y \le X$. Thus a nonempty $V \subseteq \Sigma^*$ is minimal if and only if it is a singleton set, which shows that minimal sets are regular. If $V$ is closed then $V = k(V)\Sigma^*$ where $k(V)$ is regular, so $V$ is regular too. The result for open or convex sets follows immediately. We prove the second part of the theorem for $\Sigma = \{0,1\}$. Let $V' = \{0^n1^n : n \ge 1\}$, and $V = V'\Sigma^*$. $V'$ is minimal, but not regular; also $V$ is closed but not regular, otherwise $k(V) = V'$ would be regular too, which is not true; finally $\bar{V}$ is an example of an open set which is not regular. Q.E.D.

Incidentally, we remark that the preceding example shows that an infinite union of regular closed sets may fail to be regular; indeed $V = \bigcup_n V_n'$ where $V_n' = \{0^n1^n\}\Sigma^*$, and these sets are certainly closed and regular.

As an application, we show now how these notions can be used to exhibit a decision procedure for checking whether a given regular set $V$ is reverse definite, thus, also, if it is definite. For other decision procedures, see


[1] and [8]; all these decision procedures (except the third one in [8]) require the construction of the minimal automaton defining $V$.

Theorem 5.7: A regular set $V$ is reverse definite if and only if $k(V)$ and $V_0 = \mathrm{env}(\bar{V}) \cap V$ are finite.

Proof: First we remark that $V_0 = \{X : X \in V \text{ and } \{X\}\Sigma^* \not\subseteq V\}$, from which it easily follows that $V - V_0$ is closed, that is $V - V_0 = V_1\Sigma^*$, or $V = V_0 \cup V_1\Sigma^* = V_0 \cup k(V_1)\Sigma^*$ for some $V_1 \subseteq \Sigma^*$. Now if $V_0$ and $k(V)$ are finite then certainly $k(V_1)$ is finite too, and so $V$ is a reverse definite set. Conversely, if $V = V_2 \cup V_3\Sigma^*$ where $V_2$ and $V_3$ are finite, then $k(V) \subseteq k(V_2) \cup k(V_3)$ and so is finite, and $V_0 \subseteq V_2$ is finite too. Thus, given a finite connected automaton $\mathfrak{A} = (S,M,s_1,F)$ defining $V$, we can effectively construct, by Theorem 5.1, automata $\mathfrak{B}$ and $\mathfrak{C}$ which define $k(V)$ and $V_0$, respectively, and check whether $T(\mathfrak{B})$ and $T(\mathfrak{C})$ are finite or not. Alternatively one can take for $\mathfrak{C}$ the automaton $(S,M,s_1,F')$ where $F'$ is the set of all states $s$ in $F$ for which $T((S,M,s,\bar{F}))$ is not empty. Q.E.D.

VI. STRUCTURE-REGULAR SETS; CLOSURE PROPERTIES

Definition 6.1: A set $V \subseteq \Sigma^*$ is a (simple, atomic) structure-regular set if it is accepted by some (simple, atomic) deterministic structure automaton. The collection of all structure regular sets on $\Sigma$ will be denoted by $SR(\Sigma)$. "Structure regular" will be abbreviated by S-regular, and "deterministic structure automaton" by DSA.

Theorem 6.2: $V \subseteq \Sigma^*$ is a simple S-regular set if and only if it is a regular convex set.

Proof: Let $V = T(\mathfrak{A})$ where $\mathfrak{A} = (T,\{(H,L)\})$ is a simple DSA. Put $\mathfrak{A}_1 = (T,\{(H,\emptyset)\})$, $\mathfrak{A}_2 = (T,\{(S,L)\})$ (where $S$ is the set of states of $\mathfrak{A}$). It is clear that $T(\mathfrak{A}_1)$ is closed, $T(\mathfrak{A}_2)$ is open, and $V = T(\mathfrak{A}_1) \cap T(\mathfrak{A}_2)$. Suppose now that $V$ is a regular convex set defined by the connected automaton $\mathfrak{A} = (S,M,s_1,F)$. As in the proof of Theorem 5.1 let $F' = \{M(s,\sigma) : s \in F\} \cap \bar{F}$, and $\mathfrak{B} = (T[F'],F)$. We already know that $T(\mathfrak{B}) = V$. Put $\mathfrak{C} = (T[F'],\{(F,f)\})$. It is quite obvious that $T(\mathfrak{C}) = T(\mathfrak{B})$. Q.E.D.

Theorem 6.3: $V \subseteq \Sigma^*$ is S-regular if and only if it is a finite union of regular convex sets.

Proof: That an S-regular set is a finite union of simple S-regular sets, i.e., of regular convex sets, is immediate by Theorem 6.2. The other side is proven by using the "product-table" technique (see [9, def. 7]). Let $V = T(\mathfrak{A})$, $V' = T(\mathfrak{A}')$ where $\mathfrak{A} = (S,M,s_1,\{(H,L)\})$, $\mathfrak{A}' = (S',M',s_1',\{(H',L')\})$. Take $\mathfrak{A}'' = (S \times S', M \times M', (s_1,s_1'), \Omega'')$ where $M \times M'((s,s'),\sigma) = (M(s,\sigma), M'(s',\sigma))$ and $\Omega'' = \{(H \times S', L \times S'), (S \times H', S \times L')\}$. It is readily seen that $T(\mathfrak{A}'') = V \cup V'$.

Corollary 6.4: 1) All closed, open, minimal, or convex regular sets are S-regular. 2) In particular, if $V$ is regular, then $\mathrm{cl}(V)$, $\mathrm{int}(V)$, $k(V)$, $\mathrm{conh}(V)$, and $\mathrm{env}(V)$ are S-regular. 3) All reverse definite (and all finite) sets are S-regular.

Proof: 3) If $V$ is a reverse definite set, then $V = V_1 \cup V_2\Sigma^* = V_1 \cup \mathrm{cl}(V_2)$ where $V_1$ and $V_2$ are finite. Since every finite set is a finite union of regular minimal sets, the theorem is proven. Q.E.D.

For one-letter alphabets, we have the following.

Theorem 6.5: A set of tapes $V$ on a one-letter alphabet $\Sigma$ is S-regular if and only if it is a (reverse) definite set.

Proof: The "if" part is contained in Corollary 6.4. Suppose now $V \subseteq \Sigma^* = \{a\}^*$ is S-regular; then $V = \bigcup_i V_i$ (a finite union) where $V_i$ are regular convex sets. If all the $V_i$ are finite, we are finished; otherwise let $a^k$ ($k \ge 0$) be the shortest tape in $V$ which is in some infinite $V_i$. It is then clear that $V = V' \cup \{a^n : n \ge k\} = V' \cup \{a^k\}\Sigma^*$ where $V'$ is finite, i.e., $V$ is a (reverse) definite set. Q.E.D.

Theorem 6.6: $SR(\Sigma)$ is closed under Boolean operations.

Proof: If $V_i$ are S-regular sets, then $V_i = \bigcup_j V_{ij}$, where $V_{ij}$ are regular convex sets. But then $V = \bigcup_i V_i = \bigcup_{i,j} V_{ij}$, so that $V$ is S-regular. Also if $V$ is S-regular then $V = \bigcup_i V_i$ where $V_i = V_i' \cap V_i''$, $V_i'$ is regular and closed, $V_i''$ regular and open. We then have $\bar{V} = \bigcap_i \bar{V_i} = \bigcap_i (\bar{V_i'} \cup \bar{V_i''})$, which by distributivity is a finite union $\bigcup_{i,j} W_{ij}$, where each $W_{ij}$ is an intersection of regular open sets $\bar{V_i'}$ and regular closed sets $\bar{V_j''}$ and is therefore regular and convex, so that $\bar{V}$ is S-regular. The result for intersection follows immediately. Q.E.D.

Theorem 6.7: For $\|\Sigma\| > 1$, $SR(\Sigma)$ is not closed under product, star, or transpose operations.

Proof: For $\Sigma = \{0,1\}$, $\Sigma^*$ and $\{1\}$ are S-regular, while $V = \Sigma^*\{1\}$ is not S-regular (see Theorem 3.5), which shows nonclosure under products. Since $U = V^T$ is closed, it is S-regular; $U^T = V$, however, is not S-regular. Finally the singleton-set $V$ = {0 } is S-regular, while $V^*$ is not (3.5). Q.E.D.

Remark 6.8: If $\Sigma, \Sigma'$ are two given alphabets, then a function $f$ from $\Sigma$ on $\Sigma'$ is called a projection. Such a projection can be naturally extended to a function on tapes, by defining $f(\Lambda) = \Lambda$, $f(X\sigma) = f(X)f(\sigma)$; also if $V \subseteq \Sigma^*$ then its projection is $f(V) = \{f(X) : X \in V\}$. It is well known that any projection of a regular set is again regular. That this is not true for S-regular sets should be quite evident from the fact that structure automata are essentially nondeterministic. Here is an example for the skeptical reader. Let $\Sigma = \{0,1,2,3\}$ and $\mathfrak{A} = (\{s_0,s_1,s_2,s_3\}, M, s_0, \{(s_2,s_3)\})$ where $M(s_0,1) = s_1$, $M(s_1,0) = s_0$, $M(s_1,2) = s_2$, and $M(s,\sigma) = s_3$ for all other $s \in S$, $\sigma \in \Sigma$. Let $V = T(\mathfrak{A})$ and $f : \Sigma \to \{0\}$ be the trivial projection. It is easily seen that $f(V) = \{00\}^* - \{\Lambda\}$, and this set is not S-regular.

Theorem 6.9: Let $V_1$ and $V_2$ be S-regular sets. 1) If $V_1$ is minimal then $V_1V_2$ is S-regular. 2) If $\Lambda \in V_1 \cap V_2$, then $V_1V_2$ is S-regular. 3) If $V_1$ is minimal, then $V_1^*$ is S-regular if and only if $V_1 = \Sigma' \subseteq \Sigma$.

Proof: 1) We know that $V_2 = \bigcup_i W_i$, where $W_i$ are regular convex sets. Thus $V_1V_2 = V_1(\bigcup_i W_i) = \bigcup_i V_1W_i$


where $V_1W_i$ are convex [by Lemma 4.7, 4)] and certainly regular, so that $V_1V_2$ is S-regular. Parts 2) and 3) can be proven using similar techniques; we prefer however to give the proofs later (cf. Remark 9.4), to illustrate other types of arguments.

For two given sets of tapes $V_1,V_2$, define the extended product $V_1 \circ V_2$ by $V_1 \circ V_2 = V_1V_2 \cup V_1 \cup V_2$. Since $V_1 \circ V_2$ and $(V_1 \cup \{\Lambda\})(V_2 \cup \{\Lambda\})$ differ at most in the tape $\Lambda$, we get immediately (by part 2 of Theorem 6.9 and Theorem 6.6) that $SR(\Sigma)$ is closed under extended product.

Theorem 6.10: $V \subseteq \Sigma^*$ is S-regular if and only if all its derivatives are S-regular.

Proof: Since $V - \{\Lambda\} = \bigcup_{\sigma \in \Sigma} \{\sigma\}D_\sigma(V)$, it is clear that if $D_\sigma(V)$ is S-regular for every $\sigma \in \Sigma$, then so are (by Theorem 6.9) the sets $\{\sigma\}D_\sigma(V)$, their union $V - \{\Lambda\}$, and thus the set $V$ too. Suppose now $V$ is S-regular, i.e., $V = \bigcup_i V_i$ where $V_i$ are regular and convex; given $U \subseteq \Sigma^*$, let $U' = \{X_1,\ldots,X_k\} \subseteq U$ be such that $D_U(V) = D_{U'}(V)$. (That such a set always exists is a well-known fact; it is implicit, for example, in [12, ths. 1 and 2].) It is obvious then that $D_U(V) = \bigcup_{i,j} D_{X_j}(V_i)$ and since (by Lemma 4.8) $D_{X_j}(V_i)$ is convex (and regular), $D_U(V)$ is S-regular. Q.E.D.

Summary: $SR(\Sigma)$ contains all finite and reverse definite sets, and all open, closed, convex, or minimal sets which are also regular. It is closed under union, intersection, complementation, left product by regular minimal sets, and extended product. It is also closed under the operations of taking closures, interiors, kernels, envelopes, convex hulls, and derivatives. It is not closed under product, star, or transpose operations.

VII. CHARACTERIZATIONS, VARIANTS, AND DECISION PROBLEMS

Theorem 7.1: $SR(\Sigma)$ is also equal to each of the following collections of sets of tapes on $\Sigma$: $A_1$ is the Boolean closure of the closed (or open) regular sets; $A_2$ is the closure of the minimal regular sets under Boolean operations and topological closure; $A_3$ is the closure of the empty set under Boolean operations and product from the left by minimal regular sets; $A_4$ is the closure of the reverse definite sets under the same operations as in $A_3$.

Proof: We already know (Theorem 6.3) that $SR(\Sigma) \subseteq A_1$. $A_1 \subseteq A_2$, since if $V \in A_1$ is closed, then it is the closure of $k(V)$ which is minimal and regular. To show that $A_2 \subseteq A_3$ we first remark that every closed regular set $V$ is in $A_3$, since $V = k(V)\Sigma^*$ where $k(V)$ is minimal and $\Sigma^* \in A_3$. Thus every regular open set is also in $A_3$, and so is every regular convex (and hence every regular minimal) set. Also if $V \in A_3$, then $\mathrm{cl}(V) \in A_3$ too since it is closed and regular. That $A_3 \subseteq A_4$ is trivial. Finally $A_4 \subseteq SR(\Sigma)$ since all reverse definite events are S-regular, and $SR(\Sigma)$ is closed under the operations mentioned in $A_3$. Q.E.D.

We have seen that restricting the structure in (nondeterministic) structure automata to be simple or atomic (or both) does not affect the definability power of these devices. That this is not true for DSA is shown by the following theorem.

Theorem 7.2: The collection of simple S-regular sets is properly contained in the collection of atomic S-regular sets, which is properly contained in $SR(\Sigma)$.

Proof: Let $\mathfrak{A} = (S,M,s_1,\{(H,L)\})$ be a simple DSA, where we assume $H \cap L = \emptyset$. $V = T(\mathfrak{A})$ is accepted also by the following atomic DSA: $\mathfrak{B} = (S \cup \{f\}, M', s_1, \Omega')$ where $f$ is a new state, $\Omega' = \{(h,f) : h \in H\}$, and $M'(s,\sigma) = M'(f,\sigma) = f$ if $s \in L$ or $M(s,\sigma) \in L$; $M'(s,\sigma) = M(s,\sigma)$ in all other cases. On the other hand the set $V = \{\Lambda, 10, 00\}$ is not convex, so it is not simple S-regular; however it is accepted by the following atomic DSA: $\mathfrak{C} = (\{s_0,s_1,s_2,s_3\}, M, s_0, \{(s_0,s_1),(s_2,s_3)\})$ where $M(s_0,0) = M(s_0,1) = s_1$, $M(s_1,0) = s_2$, $M(s,\sigma) = s_3$ for all other $s \in S$, $\sigma \in \Sigma$. Finally the set $V = \{\Lambda, 00\}$ is S-regular but is not accepted by any atomic DSA. For suppose $V = T(\mathfrak{A})$ where $\mathfrak{A} = (S,M,s_0,\Omega)$, $\Omega$ an atomic structure, and let $\varphi = (s_0,s_1,s_2)$ be the run of $\mathfrak{A}$ on $00$. Since $\Lambda \in V$ while $0 \notin V$, we must have $(s_0,s_1) \in \Omega$. If $M(s_0,1) \ne s_1$, then $1$ would be in $V$; thus $M(s_0,1) = s_1$. But then the run of $\mathfrak{A}$ on $10$ is $(s_0,s_1,s_2) = \varphi$, i.e., $10 \in V$; a contradiction. Q.E.D.

A variant of structure automata, which is naturally suggested by the Muller-McNaughton notion of automata on infinite tapes [6], is the following.

Definition 7.3: A T-automaton $\mathfrak{A}$ on $\Sigma$ is a pair $\mathfrak{A} = (T,F)$ where $T$ is a table on $\Sigma$ and $F \subseteq P(S)$. A tape $X \in \Sigma^*$ is accepted by $\mathfrak{A}$ ($X \in T(\mathfrak{A})$) if there is some run $\varphi$ of $\mathfrak{A}$ on $X$ such that $J(\varphi) \in F$. $T(\mathfrak{A})$ is the set defined by $\mathfrak{A}$.

Theorem 7.4: $V \subseteq \Sigma^*$ is defined by a (deterministic) T-automaton if and only if it is defined by a (deterministic) structure automaton. (In particular T-automata are essentially nondeterministic devices.)

Proof: If $V$ is defined by the (deterministic) structure automaton $\mathfrak{A} = (T,\Omega)$ then $V$ is also defined by the (deterministic) T-automaton $\mathfrak{A}' = (T,F)$ where $C \in F$ if and only if $C \cap H \ne \emptyset$, $C \cap L = \emptyset$ for some $(H,L) \in \Omega$. It is also clear that every set defined by a T-automaton is regular. Suppose now that $V$ is defined by the deterministic T-automaton $\mathfrak{A} = (T,F)$, $F = \{G_1,\ldots,G_k\}$; then

$$V = \bigcup_{i=1}^{k} V_i$$

where $V_i = T(\mathfrak{A}_i)$, $\mathfrak{A}_i = (T,\{G_i\})$. If $X_1 < X_2 < X_3$ and $\varphi_1,\varphi_2,\varphi_3$ are the corresponding runs of $\mathfrak{A}_i$ on these tapes, then $J(\varphi_1) \subseteq J(\varphi_2) \subseteq J(\varphi_3)$, so that if $X_1,X_3 \in V_i$, then so is $X_2$. Thus $V_i$ is a convex regular set and $V$ is S-regular. Q.E.D.
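The acceptance condition used in this proof can be made concrete with a short Python sketch. It assumes that $J(\varphi)$ is the set of states occurring in the run $\varphi$ and that a pair $(H,L)$ is satisfied when the run meets $H$ and avoids $L$, as the condition $C \cap H \ne \emptyset$, $C \cap L = \emptyset$ suggests. The function names and the four-state example table (modelled on the atomic DSA for $\{\Lambda, 10, 00\}$ in the proof of Theorem 7.2) are illustrative choices, not the paper's notation.

```python
from itertools import product

def visited_states(delta, start, tape):
    """J(phi): the set of states occurring in the run on `tape`."""
    state, seen = start, {start}
    for letter in tape:
        state = delta[(state, letter)]
        seen.add(state)
    return seen

def dsa_accepts(delta, start, structure, tape):
    """Accept if some pair (H, L) is met: the run visits H and avoids L."""
    c = visited_states(delta, start, tape)
    return any(bool(c & h) and not (c & l) for h, l in structure)

# Example table: s0 -0,1-> s1, s1 -0-> s2, every other move goes to s3.
STATES = ["s0", "s1", "s2", "s3"]
DELTA = {(s, a): "s3" for s in STATES for a in "01"}
DELTA.update({("s0", "0"): "s1", ("s0", "1"): "s1", ("s1", "0"): "s2"})

# Atomic structure: every H and every L is a single state.
STRUCTURE = [({"s0"}, {"s1"}), ({"s2"}, {"s3"})]

# All accepted tapes of length at most 3.
ACCEPTED = {
    "".join(t)
    for n in range(4)
    for t in product("01", repeat=n)
    if dsa_accepts(DELTA, "s0", STRUCTURE, "".join(t))
}
```

Because acceptance depends only on the set of visited states, the same `visited_states` helper would also serve for a deterministic T-automaton: one would test membership of the visited set in the family $F$ instead of checking pairs.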

Turning finally to decision problems, we have the following.

Theorem 7.5: Given a finite automaton $\mathfrak{A}$, we can effectively decide whether $V = T(\mathfrak{A})$ is S-regular. If it is, an


explicit representation of $V$ as a finite union of regular convex sets can be found, and a DSA defining $V$ can be effectively constructed.

Proof: Let $\mathfrak{A} = (S,M,s_1,F)$ and assume, as usual, that $\mathfrak{A}$ is connected. For $s,s' \in S$, put $\mathfrak{A}(s,s') = (S,M,s,\{s'\})$. We claim that $V$ is S-regular if and only if for every $s \in F$ and $s' \notin F$, at least one of the sets $T(\mathfrak{A}(s,s'))$, $T(\mathfrak{A}(s',s))$ is empty.

Assume first that $V$ is S-regular, i.e.,

$$V = \bigcup_{i=1}^{k} V_i,$$

where $V_i$ are regular convex sets. If the condition is not satisfied, then there are $s \in F$, $s' \notin F$, and tapes $X,Y \in \Sigma^*$ such that $M(s,X) = s'$, $M(s',Y) = s$. Since $\mathfrak{A}$ is connected, there is some $Z \in \Sigma^*$ such that $M(s_1,Z) = s$. The tapes $Z_j' = Z(XY)^j$, for $1 \le j \le k+1$, are all in $V$, so there must be some $1 \le i_0 \le k$ and indices $j_1 < j_2$ such that $Z_{j_1}', Z_{j_2}' \in V_{i_0}$. Since $V_{i_0}$ is convex and $Z_{j_1}' < Z_{j_1}'X < Z_{j_2}'$, we have $Z_{j_1}'X \in V_{i_0}$. This is, however, impossible, since $M(s_1,Z_{j_1}'X) = M(M(s_1,Z_{j_1}'),X) = M(s,X) = s' \notin F$.

Suppose now that the condition stated above is satisfied. For every $s \in S$, let $V_s = k(T(\mathfrak{A}(s_1,s)))$, $W_s = \mathrm{env}(T(\mathfrak{A}(s,s)))$. We show that $V = \bigcup_{s \in F} V_sW_s$. Since $V_s$ is minimal and $W_s$ open, $V_sW_s$ is convex (by Theorem 4.7) and certainly regular, so it is S-regular, and $V$ will be thus certainly S-regular. In fact, if $X \in V$ and $M(s_1,X) = s \in F$, let $Y$ be the shortest initial section of $X$ for which $M(s_1,Y) = s$; put $X = YZ$. Then $Y \in V_s$ and $Z \in W_s$. On the other hand, let $X \in V_sW_s$ for some $s \in F$; i.e., $X = YZ$ where $M(s_1,Y) = s$, $M(s,Z) = s'$ and there is some $Z'$ such that $M(s',Z') = s$. $T(\mathfrak{A}(s,s'))$ and $T(\mathfrak{A}(s',s))$ are both not empty, so that $s' \in F$, and since $M(s_1,X) = M(s_1,YZ) = M(M(s_1,Y),Z) = M(s,Z) = s'$ we get that $X \in V$. If $V$ is S-regular, we have obtained a representation of $V$ as a union of $k$ convex sets $V_i$, where $k$ is the number of final states in $F$. A finite automaton accepting $V_i$ can be effectively constructed by Theorem 5.1, and (by Theorem 6.2) this allows us to construct a DSA defining $V_i$, and thus a DSA defining $V$ too. Q.E.D.

We remark that the S-regularity condition can also be stated as follows: $V$ is S-regular if and only if $V = \bigcup_{s \in F} V_sW_s$ where $V_s$ and $W_s$ are as above. This condition is however much more lengthy to check than the previous one.
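The criterion of Theorem 7.5 amounts to a reachability test: no final state and non-final state may lie on a common cycle. A minimal Python sketch of that test follows, assuming a complete deterministic automaton given as a dictionary `delta[(state, letter)]`; the function names and the two example automata are hypothetical illustrations.

```python
def reachable(delta, alphabet, src):
    """States reachable from src by some tape (src itself included)."""
    seen, stack = {src}, [src]
    while stack:
        s = stack.pop()
        for a in alphabet:
            t = delta[(s, a)]
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def is_s_regular(delta, alphabet, start, final):
    """Check that no final s and non-final t are mutually reachable."""
    states = reachable(delta, alphabet, start)  # keep the connected part only
    reach = {s: reachable(delta, alphabet, s) for s in states}
    return not any(
        t in reach[s] and s in reach[t]
        for s in states & final
        for t in states - final
    )

# Tapes ending with 1: the two states alternate forever, so the test fails.
ENDS_WITH_ONE = {("a", "0"): "a", ("a", "1"): "b",
                 ("b", "0"): "a", ("b", "1"): "b"}

# The single tape 1: once the final state is left it is never re-entered.
JUST_ONE = {(s, a): "dead" for s in ("q0", "q1", "dead") for a in "01"}
JUST_ONE[("q0", "1")] = "q1"
```

Here `is_s_regular(ENDS_WITH_ONE, "01", "a", {"b"})` is false, in line with Theorem 6.7, while `is_s_regular(JUST_ONE, "01", "q0", {"q1"})` is true, as expected for a finite set.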

VIII. SIGNATURE PROPERTIES

Definition 8.1: Let $V \subseteq \Sigma^*$ be a given set of tapes, $X \in \Sigma^*$ a given tape of length $n \ge 0$, and $X_i$ its prefixes of length $i$ ($0 \le i \le n$). The signature of $X$ with respect to $V$, to be denoted by $Sg(X/V)$, is the number of indices $i$ for which $X_i \in V$ and $X_{i+1} \notin V$, or $X_i \notin V$ and $X_{i+1} \in V$. For $V_1 \subseteq V$, the signature of $V_1$ with respect to $V$, to be denoted by $Sg(V_1/V)$, is the supremum of $\{Sg(X/V) : X \in V_1\}$. Finally we abbreviate $Sg(V/V)$ by $Sg(V)$ and call it the signature of $V$.

Example 8.2: Let $\Sigma = \{0,1\}$, $V = \Sigma^*\{1\}$, $X = 1001$, and $V_1 = \{0^n1 : n = 1,2,\ldots\}$. Then $Sg(X/V) = 3$, $Sg(V_1/V) = 1$, $Sg(V) = \infty$. In this paper we are interested only in $Sg(V)$; so we list some of its properties.
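The signature of a single tape can be computed by scanning its prefixes and counting how often membership in $V$ changes. The Python sketch below does this for Example 8.2; representing $V$ by a membership predicate `in_v` is an assumption made for illustration, and the function name is hypothetical.

```python
def signature(tape, in_v):
    """Sg(tape / V): number of membership changes along the prefixes of tape."""
    flags = [in_v(tape[:i]) for i in range(len(tape) + 1)]
    return sum(a != b for a, b in zip(flags, flags[1:]))

def ends_with_one(tape):
    """Membership predicate for V = Sigma* {1} over the alphabet {0, 1}."""
    return tape.endswith("1")

SG_X = signature("1001", ends_with_one)                              # X = 1001
SG_V1 = max(signature("0" * n + "1", ends_with_one) for n in range(1, 6))
```

For these inputs `SG_X` is 3 and `SG_V1` is 1, matching the example; tapes of the form $(10)^n1$ have signature $2n+1$, which is why $Sg(V)$ is infinite.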

Facts 8.3: 1) $Sg(V) = 0$ if and only if $V$ is open. 2) $Sg(V) \le 1$ if and only if $V$ is convex. 3) If $V$ is closed or minimal and $\Lambda \notin V$, then $Sg(V) = 1$. 4) If $Sg(V)$ is finite then it is even if and only if $\Lambda \in V$. 5) For $i \ge 0$, let $SG_i[V] = \{X : X \in V,\ Sg(X/V) = i\}$. Then $SG_i[V]$ is convex, and $V = \bigcup_i SG_i[V]$.

Theorem 8.4: For $k \ge 1$, $Sg(V) < 2k$ if and only if $V$ is the union of $k$ convex sets.

Proof: Suppose that $Sg(V) < 2k$. Then

$$V = \bigcup_{i=0}^{2k-1} SG_i[V].$$

Among these $2k$ sets, however, at least $k$ of them are empty (those with the even index if $\Lambda \notin V$, or with the odd index if $\Lambda \in V$), so that in fact $V$ is the union of $k$ convex sets. Let now

$$V = \bigcup_{i=1}^{k} V_i$$

where $V_i$ are convex and take any $X \in V$. If $Y_1 < Y_2 < \cdots < Y_{k+1}$ ($\le X$) are any $k+1$ initial sections of $X$ that are in $V$, then two of them, say $Y_{j_1}$ and $Y_{j_2}$, must be in the same component $V_{i_0}$, for some $1 \le i_0 \le k$. Thus for every $Y_{j_1} < Z < Y_{j_2}$, $Z \in V_{i_0}$, i.e., $Z \in V$. This shows that $Sg(X/V) < 2k$, and since this is true for every $X \in V$, we get $Sg(V) < 2k$. Q.E.D.

Corollary 8.5: A set $V \subseteq \Sigma^*$ has finite signature if and only if it is the union of a finite number of convex sets.

We cannot immediately infer from this that a regular set $V$ is S-regular if and only if it has finite signature, since $V$ is required to be the finite union of regular convex sets. That this is the case, however, will be shown below (Theorems 8.9 and 8.10).

We turn now to the connections between the signature of a set and its regularity or S-regularity.

Theorem 8.6: Every set of finite signature over a one-letter alphabet is regular and, in fact, S-regular.

Proof: If $Sg(V) = k < \infty$ then

$$V = \bigcup_{i=0}^{k} SG_i[V];$$

since $SG_i[V]$ is convex, it is regular (by Theorem 5.5) and so, also S-regular. Thus $V$ also is S-regular. Q.E.D.

The theorem is not true for general alphabets, as shown by the following.

Theorem 8.7: If $\|\Sigma\| > 1$, then for every $0 \le k < \infty$, there exist regular sets $V_k \subseteq \Sigma^*$, and nonregular sets $W_k \subseteq \Sigma^*$, whose signature is exactly $k$.


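Proof: Assume $\Sigma = \{0,1\}$. It is easy to see that the sets $V_k, W_k$ defined below have the desired properties:

$$V_0 = \{\Lambda\}, \qquad V_{2k-1} = \{0^{2j} : 1 \le j \le k\}, \qquad V_{2k} = V_{2k-1} \cup \{\Lambda\};$$

$$W_0 = \{0^m1^n : 0 \le n \le m\}, \qquad W_{2k-1} = \{(0^n1^n)^k : n = 1,2,\ldots\}, \qquad W_{2k} = W_{2k-1} \cup \{\Lambda\}.$$

Q.E.D.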

Theorem 8.8: If $V$ is regular, then $SG_i[V]$ is simple S-regular (and a fortiori regular) for every $i \ge 0$.

Proof: We show by induction on $i$ that $SG_i[V]$ is regular for every $i \ge 0$; since this set is also convex, this will show that it is simple S-regular. The case $i = 0$ is trivial since $SG_0[V] = \mathrm{int}(V)$. Now let

$$TSG_k[V] = \bigcup_{i=0}^{k} SG_i[V].$$

The following identity is then easily checked:

TSGk+1[V] = (int(V) U int(V)) * (TSGk[V]).

If we assume then by induction that $SG_i[V]$ is regular for every $i \le k$, we get that $TSG_k[V]$ is regular, so that by the above identity $TSG_{k+1}[V]$ is also regular, and so is

$$SG_{k+1}[V] = TSG_{k+1}[V] - TSG_k[V]. \qquad \text{Q.E.D.}$$

In fact, a finite automaton defining $SG_i[V]$ can be directly constructed from the automaton $\mathfrak{A} = (S,M,s_1,F)$ defining $V$, as follows. Let $S' = S \times \{0,1,\ldots,i+1\}$, $s_1' = (s_1,0)$, $F' = F \times \{i\}$, and define $M' : S' \times \Sigma \to S'$ by $M'((s,j),\sigma) = (M(s,\sigma),j')$, where $j' = j$ if $j = i+1$, or $s$ and $M(s,\sigma)$ are both in $F$, or both in $\bar{F}$; otherwise $j' = j+1$. The automaton $\mathfrak{A}' = (S',M',s_1',F')$ then defines $SG_i[V]$.
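The counter construction just described can be sketched in Python as follows. The sketch assumes a complete deterministic automaton stored as a dictionary `delta[(state, letter)]`; the names `sg_automaton` and `accepts`, and the two-state example for $V = \Sigma^*\{1\}$, are hypothetical and serve only to illustrate the construction.

```python
from itertools import product

def sg_automaton(delta, alphabet, start, final, i):
    """Build (delta2, start2, final2) for SG_i[V] on states (s, j), 0 <= j <= i+1."""
    states = {s for s, _ in delta}
    delta2 = {}
    for s, j, a in product(states, range(i + 2), alphabet):
        t = delta[(s, a)]
        same_side = (s in final) == (t in final)
        # The counter is frozen at i+1; otherwise it counts each crossing
        # between F and its complement.
        j2 = j if (j == i + 1 or same_side) else j + 1
        delta2[((s, j), a)] = (t, j2)
    return delta2, (start, 0), {(s, i) for s in final}

def accepts(delta, start, final, tape):
    """Run a deterministic automaton on `tape` and test the last state."""
    state = start
    for letter in tape:
        state = delta[(state, letter)]
    return state in final

# V = tapes ending with 1: state "b" means the last letter read was 1.
ENDS_WITH_ONE = {("a", "0"): "a", ("a", "1"): "b",
                 ("b", "0"): "a", ("b", "1"): "b"}
D3, S3, F3 = sg_automaton(ENDS_WITH_ONE, "01", "a", {"b"}, 3)
```

With this example, `accepts(D3, S3, F3, "1001")` holds because $1001 \in V$ has signature 3, whereas the tape $1$ (signature 1) is rejected.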

Theorem 8.9: A regular set is S-regular if and only if it has finite signature.

Proof: If $V$ is regular and $Sg(V) = k < \infty$ then $V = TSG_k[V]$, which is S-regular. On the other hand, if

$$V = \bigcup_{i=1}^{k} V_i$$

where $V_i$ are convex, then by Theorem 8.4 $Sg(V) < 2k$, i.e., $V$ has finite signature. Q.E.D.

Corollary 8.10: A regular set is a finite union of convex sets if and only if it is a finite union of regular convex sets.

Proof: If a regular set is a finite union of convex sets then it has finite signature (Corollary 8.5), so it is S-regular, and by Theorem 6.3, it is a finite union of regular convex sets. Q.E.D.

Corollary 8.11: Let $V = T(\mathfrak{A})$ where $\mathfrak{A}$ is a connected finite automaton with $k$ final states. Then $Sg(V) < \infty$ if and only if $Sg(V) < 2k$.

Proof: If $V$ has finite signature, then, by Theorem 8.9, it is S-regular. By the proof of Theorem 7.5 this set can then be written as a union of $k$ convex sets, so that, by Theorem 8.4, $Sg(V) < 2k$. Q.E.D.

Corollary 8.12: If $V$ is regular, then $Sg(V)$ can be effectively computed.

Proof: Take any connected finite automaton $\mathfrak{A}$ which defines $V$ and suppose $\mathfrak{A}$ has $k$ final states. Let $i_0 = \max\{i : i < 2k+1,\ SG_i[V] \ne \emptyset\}$. (This number can be effectively found.) If $i_0 < 2k$, then $Sg(V) = i_0$; otherwise, $Sg(V) = \infty$. Q.E.D.

A speedy way of doing this computation, i.e., of finding the number $i_0$, is as follows. Let $T' = (S',M',s_1')$ where $S' = S \times \{0,\ldots,2k+1\}$, $s_1' = (s_1,0)$, and $M'((s,j),\sigma) = (M(s,\sigma),j')$ where $j' = j$ if $j = 2k+1$, or both $s$ and $M(s,\sigma)$ are in $F$ or both in $\bar{F}$; otherwise $j' = j+1$ (cf. the proof of Theorem 8.8). Define the sets $T_i$ as follows: $T_0 = \{s_1'\}$, $T_{i+1} = T_i \cup \{M'(s',\sigma) : s' \in T_i,\ \sigma \in \Sigma\}$. Let $m$ be the first index for which $T_m = T_{m+1}$ (such an index exists and does not exceed the number of states of $T'$); it is then clear that $i_0 = \max\{j : (s,j) \in T_m,\ s \in F\}$.
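The reachability computation described above can be sketched in Python as follows. The sketch assumes a complete deterministic automaton given as `delta[(state, letter)]`, restricts attention to the states reachable from the start state, and treats the empty language as having signature 0, which is a convention adopted here rather than a case discussed in the text; the function name and example automata are hypothetical.

```python
import math

def language_signature(delta, alphabet, start, final):
    """Return Sg(V) for V = T(automaton): an integer, or math.inf."""
    # Connected part of the automaton and its number k of final states.
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for a in alphabet:
            t = delta[(s, a)]
            if t not in seen:
                seen.add(t)
                stack.append(t)
    k = len(seen & final)
    if k == 0:
        return 0  # empty language (convention assumed in this sketch)

    # Explore pairs (state, counter); the counter is capped at 2k + 1.
    cap = 2 * k + 1
    pairs, todo = {(start, 0)}, [(start, 0)]
    while todo:
        s, j = todo.pop()
        for a in alphabet:
            t = delta[(s, a)]
            crossed = (s in final) != (t in final)
            nxt = (t, min(j + 1, cap) if crossed else j)
            if nxt not in pairs:
                pairs.add(nxt)
                todo.append(nxt)

    i0 = max(j for s, j in pairs if s in final)
    return i0 if i0 < 2 * k else math.inf

ENDS_WITH_ONE = {("a", "0"): "a", ("a", "1"): "b",
                 ("b", "0"): "a", ("b", "1"): "b"}

# V = {Lambda, 00}: p0 -0-> p1 -0-> p2, every other move goes to "dead".
EMPTY_OR_00 = {(s, a): "dead" for s in ("p0", "p1", "p2", "dead") for a in "01"}
EMPTY_OR_00.update({("p0", "0"): "p1", ("p1", "0"): "p2"})
```

On these examples the function returns `math.inf` for $\Sigma^*\{1\}$ and 2 for $\{\Lambda, 00\}$, in agreement with Example 8.2 and with Fact 8.3, part 4.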

Incidentally, we get here, by this computation, a new decision procedure for S-regularity which is perhaps easier to apply than the one given in Theorem 7.5.

IX. CONNECTIONS WITH AUTOMATA ON INFINITE TAPES

We now show how the criterion and the decision procedure for S-regularity can be nicely and naturally stated in the framework of deterministic automata on infinite tapes [6].

We denote by $\Sigma^{(\omega)}$ the set of all infinite tapes (tapes of length $\omega$) on $\Sigma$.

Definition 9.1: Let $V \subseteq \Sigma^*$. The alternating support of $V$ is the set of all infinite tapes $X \in \Sigma^{(\omega)}$ which have an infinite number of prefixes in $V$ and an infinite number of prefixes not in $V$.

Clearly if the alternating support of $V$ is not empty, then the signature of $V$ is infinite. One is tempted to prove (using perhaps some "König-like" lemma) that the converse also is true, namely, if the signatures of the tapes in $V$ are unbounded, then there is some infinite tape with infinite signature (relative to $V$). That this is not true is shown by the following example. Let $\Sigma = \{0,1\}$, $V = \{(0^n1)^m : 1 \le m \le n\}$. Clearly $Sg(V) = \infty$; it is obvious however that if $X \in \Sigma^{(\omega)}$ is any infinite tape which contains a "1", and $0^n1$ is the shortest prefix of $X$ which ends with a "1", then all prefixes of length greater than $n(n+1)$ are not in $V$, i.e., the alternating support of $V$ is empty. Interestingly enough, however, this result (which is of a combinatorial nature) is indeed true for regular sets.

Theorem 9.2: If $V \subseteq \Sigma^*$ is regular, then it has finite signature if and only if its alternating support is empty.

Proof: The "only if" part is obvious. Suppose now $V = T(\mathfrak{A})$, where $\mathfrak{A} = (S,M,s_1,F)$ is connected and $V$ does not have finite signature. By Theorem 7.5 there are


states $s \in F$, $s' \notin F$ and tapes $Y_1, Y_2, Y_3$ such that $M(s_1,Y_1) = s$, $M(s,Y_2) = s'$, $M(s',Y_3) = s$. Putting then $X = Y_1(Y_2Y_3)^\omega$, we immediately get that $X$ is in the alternating support of $V$. Q.E.D.

We now briefly recall the relevant definitions from [6]. An M-automaton $\mathfrak{A}$ is a pair $(T,F)$ where $T = (S,M,s_1')$ is a deterministic table and $F \subseteq P(S)$. The run of $T$ on $X = (x_i) \in \Sigma^{(\omega)}$ is the unique infinite sequence of states $(s_i)$ which satisfy $s_0 = s_1'$, $s_{i+1} = M(s_i,x_i)$. A tape $X \in \Sigma^{(\omega)}$ is accepted by $\mathfrak{A}$ ($X \in T(\mathfrak{A})$) if the set of states which appear infinitely often in the run of $T$ on $X$ is a member of $F$. $T(\mathfrak{A})$ is the set defined by $\mathfrak{A}$.

Theorem 9.3: If $V$ is regular, then the alternating support of $V$ is defined by some M-automaton.

Proof: Let $V = T(\mathfrak{A})$ where $\mathfrak{A} = (T,F)$ ($\mathfrak{A}$ connected), and put $\mathfrak{A}' = (T,F')$ where $F' = \{C : C \cap F \ne \emptyset,\ C \cap \bar{F} \ne \emptyset\}$. Then the alternating support of $V$ is $T(\mathfrak{A}')$. Q.E.D.

Since the emptiness problem of M-automata is effectively solvable ([6]), we get, by Theorems 8.9 and 9.2, a decision procedure for S-regularity (which is in fact equivalent to the one given in Theorem 7.5).

Remark 9.4: We illustrate the usefulness of this approach by proving parts 2) and 3) of Theorem 6.9. Suppose $V$ is a minimal S-regular set, and $V^*$ is also S-regular. If $V$ contained some tape $X$ of length greater than one, then the tape $X^\omega$ would be in the alternating support of $V^*$, since for every $k \ge 1$, $X^k \in V^*$ while $X^k\sigma \notin V^*$ (where $\sigma$ is the first letter of $X$), in contradiction to Theorem 9.2. Suppose now $V_1$ and $V_2$ are S-regular sets, $\Lambda \in V_1 \cap V_2$; we show that the alternating support of $V_1V_2$ is empty, proving thus that $V_1V_2$ is S-regular. Suppose on the contrary that there is some $X$ in the alternating support of $V_1V_2$. There is then an infinite number of initial sections of $X$: $Y_1 < Z_1 < \cdots < Y_i < Z_i < \cdots$ such that $Y_i \in V_1V_2$, $Z_i \notin V_1V_2$; in particular, $Z_i \notin V_1$, $Z_i \notin V_2$. There is thus some $i_0$ such that for $i \ge i_0$, $X_i \notin V_1$, $X_i \notin V_2$, where $X_i$ is the prefix of $X$ of length $i$. Since, however, $Y_i \in V_1V_2$, we get that $Y_i = W_iW_i'$ where $W_i \in V_1$, $W_i' \in V_2$, and $W_i \ne \Lambda$, $W_i' \ne \Lambda$. We thus conclude that there is some fixed prefix of $X$, say $X'$, such that $Y_i = X'W_i'$ for $i \ge i_1$. If we let $X = X'X''$, we get, since the alternating support of $V_2$ is empty, that ultimately all initial sections of $X''$ are in $V_2$, but then ultimately all initial sections of $X$ are also in $V_1V_2$, a contradiction. Q.E.D.

REFERENCES

[1] J. A. Brzozowski, "Canonical regular expressions and minimal state graphs for definite events," in Proc. Symp. on Math. Theory of Automata. New York: Polytechnic Press, 1963, pp. 529-562.

[2] J. A. Brzozowski, "Regular expressions for linear sequential circuits," IEEE Trans. Comput., vol. EC-14, pp. 148-156, Apr. 1965.

[3] J. R. Buchi, "On a decision method in restricted second-order arithmetic," in 1960 Int. Congr. on Logic, Methodology, and Philosophy of Science. Stanford, Calif.: Stanford Univ. Press, pp. 1-11.

[4] C. C. Elgot, "Decision problems of finite-automata design and related arithmetics," Trans. Amer. Math. Soc., vol. 98, pp. 21-51, Jan. 1961.

[5] J. L. Kelley, General Topology. Toronto, Canada: Van Nostrand, 1955.

[6] R. McNaughton, "Testing and generating infinite sequences by a finite automaton," Inform. Contr., vol. 9, pp. 521-530, Oct. 1966.

[7] A. Paz and B. Peleg, "Ultimate-definite and symmetric definite events and automata," J. Ass. Comput. Mach., vol. 12, pp. 399-410, July 1965.

[8] M. Perles, M. O. Rabin, and E. Shamir, "The theory of definite automata," IEEE Trans. Electron. Comput., vol. EC-12, pp. 233-243, June 1963.

[9] M. O. Rabin and D. Scott, "Finite automata and their decision problem," IBM J. Res. Develop., vol. 3, pp. 114-125, Apr. 1959.

[10] M. O. Rabin, "Mathematical theory of automata," in Proc. Symp. Appl. Math., vol. 19. Providence, R. I.: Amer. Math. Soc., 1968, pp. 153-175.

[11] M. O. Rabin, "Decidability of second-order theories and automata on infinite trees," Trans. Amer. Math. Soc., vol. 141, pp. 1-35, July 1969.

[12] R. E. Stearns and J. Hartmanis, "Regularity preserving modifications of regular expressions," Inform. Contr., vol. 6, pp. 55-69, Mar. 1963.

Yaacov A. Choueka was born in Cairo, Egypt, on June 16, 1936. He received the M.Sc. degree (with honors) in 1962 and the Ph.D. degree in 1972, both in mathematics, both from the Mathematics Institute of the Hebrew University, Jerusalem, Israel.

After three years of military service, during the period 1962 to 1965, he joined the Department of Mathematics, Bar-Ilan University, Ramat-Gan, Israel, first as an Instructor, then as a Lecturer, and from 1972, as a Senior Lecturer. In 1974 to 1975 he will be visiting the Department of Mathematics, University of Illinois at Urbana, Urbana. Since 1968, he has also been actively engaged as a Coprincipal Investigator in the Responsa Retrieval Project, where he has been especially in charge of the analysis and solution of the computational linguistics problems associated with this full-text document retrieval system. His main interests are twofold: finite automata theory on finite or infinite structures, on one hand, and full-text document retrieval systems and automatic grammatical analysis of natural languages (in particular, Hebrew) on the other.