arXiv:1510.02036v1 [cs.FL] 7 Oct 2015

TREE AUTOMATA AND TREE GRAMMARS

by Joost Engelfriet

DAIMI FN-10 April 1975

Institute of Mathematics, University of Aarhus DEPARTMENT OF COMPUTER SCIENCE Ny Munkegade, 8000 Aarhus C, Denmark

Preface

I wrote these lecture notes during my stay in Aarhus in the academic year 1974/75. As a young researcher I had a wonderful time at DAIMI, and I have always been happy to have had that early experience. I wish to thank Heiko Vogler for his noble plan to move these notes into the digital world, and I am grateful to Florian Starke and Markus Napierkowski (and Heiko) for the excellent transformation of my hand-written manuscript into LaTeX. Apart from the repair of errors and some cosmetic changes, the text of the lecture notes has not been changed. Of course, many things have happened in tree language theory since 1975. In particular, most of the problems mentioned in these notes have been solved. The developments until 1984 are described in the book "Tree Automata" by Ferenc Gécseg and Magnus Steinby, and for recent developments I recommend the Appendix of the reissue of that book at arXiv.org/abs/1509.06233.

Joost Engelfriet, October 2015
LIACS, Leiden University, The Netherlands

Tree automata and tree grammars

To appreciate the theory of tree automata and tree grammars one should already be motivated by the goals and results of formal language theory. In particular one should be interested in "derivation trees". A derivation tree models the grammatical structure of a sentence in a (context-free) language. By considering only the bottom of the tree the sentence may be recovered from the tree.

The first idea in tree language theory is to generalize the notion of a finite automaton working on strings to that of a finite automaton operating on trees. It turns out that a large part of the theory of regular languages can rather easily be generalized to a theory of regular tree languages. Moreover, since a regular tree language is (almost) the same as the set of derivation trees of some context-free language, one obtains results about context-free languages by "taking the bottom" of results about regular tree languages.

The second idea in tree language theory is to generalize the notion of a generalized sequential machine (that is, finite automaton with output) to that of a finite state tree transducer. Tree transducers are more complicated than string transducers since they are equipped with the basic capabilities of copying, deleting and reordering (of subtrees). The part of (tree) language theory that is concerned with translation of languages is mainly motivated by compiler writing (and, to a lesser extent, by natural linguistics). When considering bottoms of trees, finite state transducers are essentially the same as syntax-directed translation schemes. Results in this part of tree language theory treat the composition and decomposition of tree transformations, and the properties of those tree languages that can be obtained by finite state transformation of regular tree languages (or, taking bottoms, those languages that can be obtained by syntax-directed translation of context-free languages).
Thirdly there are, of course, many other ideas in tree language theory. In the literature one can find, for instance, context-free tree grammars, recognition of subsets of arbitrary algebras, tree walking automata, hierarchies of tree languages (obtained by iterating old ideas), decomposition of tree automata, Lindenmayer tree grammars, etc.

These lectures will be divided into the following five parts: (1) and (2) contain preliminaries; (3), (4) and (5) are the main parts.
(1) Introduction. (p. 1)
(2) Some basic definitions. (p. 2)
(3) Recognizable (= regular) tree languages. (p. 10)
(4) Finite state tree transformations. (p. 32)
(5) Whatever there is more to consider.
Part (5) is not contained in these notes; instead, some Notes on the literature are given on p. 69.

Contents

1 Introduction  1
2 Some basic definitions  2
3 Recognizable tree languages  10
  3.1 Finite tree automata and regular tree grammars  10
  3.2 Closure properties of recognizable tree languages  18
  3.3 Decidability  29
4 Finite state tree transformations  32
  4.1 Introduction: Tree transducers and semantics  32
  4.2 Top-down and bottom-up finite tree transducers  35
  4.3 Comparison of B and T, the nondeterministic case  46
  4.4 Decomposition and composition of bottom-up tree transformations  51
  4.5 Decomposition of top-down tree transformations  56
  4.6 Comparison of B and T, the deterministic case  58
  4.7 Top-down finite tree transducers with regular look-ahead  59
  4.8 Surface and target languages  65
5 Notes on the literature  69

1 Introduction

Our basic data type is the kind of tree used to express the grammatical structure of strings in a context-free language.

Example 1.1. Consider the context-free grammar G = (N, Σ, R, S) with nonterminals N = {S, A, D}, terminals Σ = {a, b, d}, initial nonterminal S and the set of rules R, consisting of the rules S → AD, A → aAb, A → bAa, A → AA, A → λ, D → Ddd and D → d (we use λ to denote the empty string). The string baabddd ∈ Σ∗ can be generated by G and has the following derivation tree (see [Sal, II.6], [A&U, 0.5 and 2.4.1]):

S
├─ A
│  ├─ A
│  │  ├─ b
│  │  ├─ A
│  │  │  └─ e
│  │  └─ a
│  └─ A
│     ├─ a
│     ├─ A
│     │  └─ e
│     └─ b
└─ D
   ├─ D
   │  └─ d
   ├─ d
   └─ d
– Note that we use e as a symbol standing for the empty string λ. – The string baabddd is called the “yield” or “result” of the derivation tree.

Thus, in graph terminology, our trees are finite (finite number of nodes and branches), directed (the branches are "growing downwards"), rooted (there is a node, the root, with no branches entering it), ordered (the branches leaving a node are ordered from left to right) and labeled (the nodes are labeled with symbols from some alphabet). The following intuitive terminology will be used:
– the rank (or out-degree) of a node is the number of branches leaving it (note that the in-degree of a node is always 1, except for the root which has in-degree 0)
– a leaf is a node with rank 0
– the top of a tree is its root
– the bottom (or frontier) of a tree is the set (or sequence) of its leaves
– the yield (or result, or frontier) of a tree is the string obtained by writing the labels of its leaves (except the label e) from left to right
– a path through a tree is a sequence of nodes connected by branches ("leading downwards"); the length of the path is the number of its nodes minus one (that is, the number of its branches)
– the height (or depth) of a tree is the length of the longest path from the top to the bottom


– if there is a path of length ≥ 1 (of length = 1) from a node a to a node b then b is a descendant (direct descendant) of a and a is an ancestor (direct ancestor) of b
– a subtree of a tree is a tree determined by a node together with all its descendants; a direct subtree is a subtree determined by a direct descendant of the root of the tree; note that each tree is uniquely determined by the label of its root and the (possibly empty) sequence of its direct subtrees
– the phrases "bottom-up", "bottom-to-top" and "frontier-to-root" are used to indicate this ↑ direction, while the phrases "top-down", "top-to-bottom" and "root-to-frontier" are used to indicate that ↓ direction.

In derivation trees of context-free grammars each symbol may only label nodes of certain ranks. For instance, in the above example, a, b, d and e may only label leaves (nodes of rank 0), A labels nodes with ranks 1, 2 and 3, S labels nodes with rank 2, and D nodes of rank 1 and 3 (these numbers being the lengths of the right hand sides of rules). Therefore, given some alphabet, we require the specification of a finite number of ranks for each symbol in the alphabet, and we restrict attention to those trees in which nodes of rank k are labeled by symbols of rank k.

2 Some basic definitions

The mathematical definition of a tree may be given in several, equivalent, ways. We will define a tree as a special kind of string (others call this string a representation of the tree, see [A&U, 0.5.7]). Before doing so, let us define ranked alphabets.

Definition 2.1. An alphabet Σ is said to be ranked if for each nonnegative integer k a subset Σk of Σ is specified, such that Σk is nonempty for a finite number of k's only, and Σ = ⋃k≥0 Σk. †

If a ∈ Σk , then we say that a has rank k (note that a may have more than one rank). Usually we define a specific ranked alphabet Σ by specifying those Σk that are nonempty. Example 2.2. The alphabet Σ = {a, b, +, −} is made into a ranked alphabet by specifying Σ0 = {a, b}, Σ1 = {−} and Σ2 = {+, −}. (Think of negation and subtraction). Remark 2.3. Throughout our discussions we shall use the symbol e as a special symbol, intuitively representing λ. Whenever e belongs to a ranked alphabet, it is of rank 0. Operations on ranked alphabets should be defined as for instance in the following definition. † To be more precise one should define a ranked alphabet as a pair (Σ, f ), where Σ is an alphabet and f is a mapping from N into P(Σ) such that ∃n ∀k ≥ n : f (k) = ∅, and then denote f (k) by Σk and (Σ, f ) by Σ. Note that N = {0, 1, 2, . . .} is the set of natural numbers and that P(Σ) is the set of subsets of Σ.


Definition 2.4. Let Σ and ∆ be ranked alphabets. The union of Σ and ∆, denoted by Σ ∪ ∆, is defined by (Σ ∪ ∆)k = Σk ∪ ∆k , for all k ≥ 0. We say that Σ and ∆ are equal, denoted by Σ = ∆, if, for all k ≥ 0, Σk = ∆k . We now define the notion of tree. Let “ [ ” and “ ] ” be two symbols which are never elements of a ranked alphabet. Definition 2.5. Given a ranked alphabet Σ, the set of trees over Σ, denoted by TΣ , is the language over the alphabet Σ ∪ {[ , ]} defined inductively as follows. (i) If a ∈ Σ0 , then a ∈ TΣ . (ii) For k ≥ 1, if a ∈ Σk and t1 , t2 , . . . , tk ∈ TΣ , then a[t1 t2 · · · tk ] ∈ TΣ .

Intuitively, a is a tree with one node labeled "a", and a[t1 t2 · · · tk ] is the tree with root labeled a and direct subtrees t1 , t2 , . . . , tk (drawn below the root, from left to right).

Example 2.6. Consider the ranked alphabet of Example 2.2. Then +[−[a −[b]]a] is a tree over this alphabet, intuitively "representing" the tree

+
├─ −
│  ├─ a
│  └─ −
│     └─ b
└─ a

which in turn "represents" the expression (a − (−b)) + a (note that the "official" tree is the prefix notation of this expression).

Example 2.7. Consider the ranked alphabet ∆, where ∆0 = {a, b, d, e}, ∆1 = ∆3 = {A, D} and ∆2 = {A, S}. A picture of the tree S[A[A[bA[e]a]A[aA[e]b]]D[D[d]dd]] in T∆ is given in Example 1.1.
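Definition 2.5 can be checked mechanically. The following sketch (not part of the notes; it assumes one-character symbols written without spaces, with the "−" of Example 2.2 typed as ASCII '-') is a recursive-descent test of membership in TΣ, with the ranked alphabet given as a dict from each symbol to its set of ranks:

```python
# Sketch: check that a string over Sigma ∪ {'[', ']'} is a tree in T_Sigma
# in the sense of Definition 2.5. Assumes one-character symbols, no spaces.

def is_tree(s, ranks):
    """Return True iff the string s is in T_Sigma for the given rank map."""
    pos, ok = _parse(s, 0, ranks)
    return ok and pos == len(s)

def _parse(s, i, ranks):
    # A tree must start with an alphabet symbol, not a bracket.
    if i >= len(s) or s[i] in '[]':
        return i, False
    a, i = s[i], i + 1
    # Case (ii) of Definition 2.5: a[t1 t2 ... tk] with a of rank k >= 1.
    if i < len(s) and s[i] == '[':
        i += 1
        k = 0
        while i < len(s) and s[i] != ']':
            i, ok = _parse(s, i, ranks)
            if not ok:
                return i, False
            k += 1
        if i >= len(s):                 # missing closing ']'
            return i, False
        return i + 1, k in ranks.get(a, set())
    # Case (i): the symbol a alone, provided a has rank 0.
    return i, 0 in ranks.get(a, set())

# Example 2.2: Sigma0 = {a, b}, Sigma1 = {-}, Sigma2 = {+, -}.
ranks = {'a': {0}, 'b': {0}, '-': {1, 2}, '+': {2}}
# The tree of Example 2.6, +[-[a-[b]]a], is accepted; '+[a]' is not,
# since + does not have rank 1.
```

This also illustrates Exercise 2.8: the parser is essentially a one-nonterminal context-free grammar for TΣ.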

Exercise 2.8. Take some ranked alphabet Σ and show that TΣ is a context-free language over Σ ∪ {[ , ]}. Our main aim will be to study several ways of constructively representing sets of trees and relations between trees. The basic terminology is the following. Definition 2.9. Let Σ be a ranked alphabet. A tree language over Σ is any subset of TΣ .


Definition 2.10. Let Σ and ∆ be ranked alphabets. A tree transformation from TΣ into T∆ is any subset of TΣ × T∆ . Exercise 2.11. Show that the context-free grammar G = (N, Σ, R, S) with N = {S}, Σ = {a, b, [ , ]} and R = {S → b[aS], S → a} generates a tree language over ∆, where ∆0 = {a} and ∆2 = {b}. The above definition of “tree” (Definition 2.5) gives rise to the following principles of proof by induction and definition by induction for trees. (Note that each tree is, uniquely, either in Σ0 or of the form a[t1 · · · tk ]). Principle 2.12. Principle of proof by induction (or recursion) on trees. Let P be a property of trees (over Σ). If

(i) all elements of Σ0 have property P , and (ii) for each k ≥ 1 and each a ∈ Σk , if t1 , . . . , tk have property P , then a[t1 · · · tk ] has property P ,

then all trees in TΣ have property P .

Principle 2.13. Principle of definition by induction (or recursion) on trees. Suppose we want to associate a value h(t) with each tree t in TΣ . Then it suffices to define h(a) for all a ∈ Σ0 , and to show how to compute the value h(a[t1 · · · tk ]) from the values h(t1 ), . . . , h(tk ). More formally expressed, given a set O of objects, and
(i) for each a ∈ Σ0 , an object oa ∈ O, and
(ii) for each k ≥ 1 and each a ∈ Σk , a mapping f^k_a : O^k → O,
there is exactly one mapping h : TΣ → O such that
(i) h(a) = oa for all a ∈ Σ0 , and
(ii) h(a[t1 · · · tk ]) = f^k_a (h(t1 ), . . . , h(tk )) for all k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ .

Example 2.14. Let Σ0 = {e} and Σ1 = { / }. The trees in TΣ are in an obvious one-to-one correspondence with the natural numbers. The above principles are the usual induction principles for these numbers.

To illustrate the use of the induction principles we give the following useful definitions.

Definition 2.15. The mapping yield from TΣ into Σ∗0 is defined inductively as follows.
(i) For a ∈ Σ0 , yield(a) = a if a ≠ e, and yield(a) = λ if a = e.
(ii) For a ∈ Σk and t1 , . . . , tk ∈ TΣ , yield(a[t1 · · · tk ]) = yield(t1 ) · yield(t2 ) · · · yield(tk ). †

That is, the concatenation of yield(t1 ), . . . , yield(tk ).


Moreover, for a tree language L ⊆ TΣ , we define yield(L) = {yield(t) | t ∈ L}. We shall sometimes abbreviate “yield” by “y”.
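Definition 2.15 is exactly a definition by induction in the sense of Principle 2.13. A sketch (not from the notes; trees are represented here as nested tuples (label, subtree1, ..., subtreek), a leaf being a 1-tuple):

```python
# Sketch of Definition 2.15 with trees as nested tuples; 'e' plays the
# role of the special symbol for the empty string lambda.

def tree_yield(t):
    label, subtrees = t[0], t[1:]
    if not subtrees:                       # (i): a leaf
        return '' if label == 'e' else label
    # (ii): concatenate the yields of the direct subtrees, left to right
    return ''.join(tree_yield(s) for s in subtrees)

# The derivation tree of Example 1.1, S[A[A[bA[e]a]A[aA[e]b]]D[D[d]dd]]:
t = ('S',
     ('A',
      ('A', ('b',), ('A', ('e',)), ('a',)),
      ('A', ('a',), ('A', ('e',)), ('b',))),
     ('D', ('D', ('d',)), ('d',), ('d',)))
# tree_yield(t) == 'baabddd', the string generated in Example 1.1
```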

Definition 2.16. The mapping height from TΣ into N is defined recursively as follows.
(i) For a ∈ Σ0 , height(a) = 0.
(ii) For a ∈ Σk and t1 , . . . , tk ∈ TΣ , height(a[t1 · · · tk ]) = max_{1≤i≤k} (height(ti )) + 1.

Example 2.17. As an example of a proof by induction on trees we show that, if e ∉ Σ0 and Σ1 = ∅, then, for all t ∈ TΣ , height(t) < |yield(t)|.

Proof. For a ∈ Σ0 , height(a) = 0 and |yield(a)| = |a| = 1 (since a ≠ e). Now let a ∈ Σk (k ≥ 2) and assume (induction hypothesis) that height(ti ) < |yield(ti )| for 1 ≤ i ≤ k. Then

|yield(a[t1 · · · tk ])| = |yield(t1 )| + · · · + |yield(tk )|       (Def. 2.15(ii))
                        ≥ (height(t1 ) + · · · + height(tk )) + k   (ind. hypothesis)
                        ≥ max_{1≤i≤k} (height(ti )) + 2             (k ≥ 2 and height(ti ) ≥ 0)
                        > height(a[t1 · · · tk ])                   (Def. 2.16(ii)).
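Definition 2.16 and the inequality of Example 2.17 can be tried out concretely. A sketch (not from the notes; trees as nested tuples, leaves as 1-tuples):

```python
# Sketch of Definition 2.16 alongside yield, to check the inequality
# height(t) < |yield(t)| of Example 2.17 on a concrete tree.

def height(t):
    if len(t) == 1:                            # (i): a leaf has height 0
        return 0
    return max(height(s) for s in t[1:]) + 1   # (ii)

def tree_yield(t):
    if len(t) == 1:
        return '' if t[0] == 'e' else t[0]
    return ''.join(tree_yield(s) for s in t[1:])

# With Sigma0 = {x, y, c}, Sigma2 = {b}, Sigma3 = {a} (no e, Sigma1 empty),
# take the tree t = a[b[xy]xc] of Example 2.25:
t = ('a', ('b', ('x',), ('y',)), ('x',), ('c',))
# height(t) == 2 and |yield(t)| == |'xyxc'| == 4, so indeed 2 < 4.
```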

Exercise 2.18. Let Σ be a ranked alphabet such that Σ0 ∩ Σk = ∅ for all k ≥ 1. Define a (string) homomorphism h from (Σ ∪ {[ , ]})∗ into Σ∗0 such that, for all t ∈ TΣ , h(t) = yield(t).

Exercise 2.19. Give a recursive definition of the notion of "subtree", for instance as a mapping sub : TΣ → P(TΣ ) such that sub(t) is the set of all subtrees of t. Give also an alternative definition of "subtree" in a more string-like fashion.

Exercise 2.20. Let path(t) denote the set of all paths from the top of t to its bottom. Think of a formal definition for "path".

The generalization of formal language theory to a formal tree language theory will come about by viewing a string as a special kind of tree and taking the obvious generalizations. To be able to view strings as trees we "turn them 90 degrees" to a vertical position, as follows.

Definition 2.21. A ranked alphabet Σ is monadic if (i) Σ0 = {e}, and (ii) for k ≥ 2, Σk = ∅. The elements of TΣ are called monadic trees.


Thus a monadic ranked alphabet Σ is fully determined by the alphabet Σ1 . Monadic trees obviously can be made to correspond to the strings in Σ∗1 . There are two ways to do this, depending on whether we read top-down or bottom-up: ftd : TΣ → Σ∗1 is defined by
(i) ftd (e) = λ
(ii) ftd (a[t]) = a · ftd (t) for a ∈ Σ1 and t ∈ TΣ ,
and fbu : TΣ → Σ∗1 is defined by
(i) fbu (e) = λ
(ii) fbu (a[t]) = fbu (t) · a for a ∈ Σ1 and t ∈ TΣ .
(Obviously both ftd and fbu are bijections). Accordingly, when generalizing a string-concept to trees, we often have the choice between a top-down and a bottom-up generalization.

Example 2.22. The string alphabet ∆ = {a, b, c} corresponds to the monadic alphabet Σ with Σ0 = {e} and Σ1 = ∆. The tree a[b[c[b[e]]]] in TΣ , drawn vertically with a at the top and e at the bottom, corresponds either to the string abcb in ∆∗ (top-down), or to the string bcba in ∆∗ (bottom-up). Note that, due to our "prefix definition" of trees (Definition 2.5), this tree looks "top-down like" in its official form a[b[c[b[e]]]]. Obviously this is not essential.

Let us consider some basic operations on trees. A basic operation on strings is right-concatenation with one symbol (that is, for each symbol a in the alphabet there is an operation rca such that, for each string w, rca (w) = wa). Every string can uniquely be built up from the empty string by these basic operations (consider the way you write and read!). Generalizing bottom-up, the corresponding basic operations on trees, here called "top concatenation", are the following.

Definition 2.23. For each a ∈ Σk (k ≥ 1) we define the (k-ary) operation of top concatenation with a, denoted by tc^k_a , to be the mapping from T^k_Σ into TΣ such that, for all t1 , . . . , tk ∈ TΣ , tc^k_a (t1 , . . . , tk ) = a[t1 · · · tk ]. Moreover, for tree languages L1 , . . . , Lk , we define tc^k_a (L1 , . . . , Lk ) = {a[t1 · · · tk ] | ti ∈ Li for all 1 ≤ i ≤ k}.
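The two correspondences ftd and fbu can be written out directly. A sketch (not from the notes; a monadic tree a[t] is represented as the pair (a, t), with ('e',) for e):

```python
# Sketch of f_td and f_bu from Section 2: the two bijections between
# monadic trees and strings over Sigma1.

def f_td(t):
    if t == ('e',):
        return ''
    a, s = t                  # t = (a, subtree) with a in Sigma1
    return a + f_td(s)        # read the labels top-down

def f_bu(t):
    if t == ('e',):
        return ''
    a, s = t
    return f_bu(s) + a        # read the labels bottom-up

# Example 2.22: the monadic tree a[b[c[b[e]]]]
t = ('a', ('b', ('c', ('b', ('e',)))))
# f_td(t) == 'abcb' (top-down) and f_bu(t) == 'bcba' (bottom-up)
```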


Note that every tree can uniquely be built up from the elements of Σ0 by repeated top concatenation. The next basic operation on strings is concatenation. When viewed monadically, concatenation corresponds to substituting one vertical string into the e of the other vertical string. In the general case, we may take one tree and substitute a tree into each leaf of the original tree, such that different trees may be substituted into leaves with different labels. Thus we obtain the following basic operation on trees.

Definition 2.24. Let n ≥ 1, a1 , . . . , an ∈ Σ0 all different, and s1 , . . . , sn ∈ TΣ . For t ∈ TΣ , the tree concatenation of t with s1 , . . . , sn at a1 , . . . , an , denoted by t⟨a1 ← s1 , . . . , an ← sn ⟩, is defined recursively as follows.
(i) for a ∈ Σ0 , a⟨a1 ← s1 , . . . , an ← sn ⟩ = si if a = ai , and a otherwise
(ii) for a ∈ Σk and t1 , . . . , tk ∈ TΣ , a[t1 · · · tk ]⟨. . . ⟩ = a[t1 ⟨. . . ⟩ · · · tk ⟨. . . ⟩], where ⟨. . . ⟩ abbreviates ⟨a1 ← s1 , . . . , an ← sn ⟩.
If, in particular, n = 1, then, for each a ∈ Σ0 and t, s ∈ TΣ , the tree t⟨a ← s⟩ is also denoted by t ·a s.

Example 2.25. Let ∆0 = {x, y, c}, ∆2 = {b} and ∆3 = {a}. If t = a[b[xy]xc], then t⟨x ← b[cx], y ← c⟩ = a[b[b[cx]c]b[cx]c].

Exercise 2.26. Check that in the monadic case tree concatenation corresponds to string concatenation.

For tree languages tree concatenation is defined analogously.

Definition 2.27. Let n ≥ 1, a1 , . . . , an ∈ Σ0 all different, and L1 , . . . , Ln ⊆ TΣ . For L ⊆ TΣ we define the tree concatenation of L with L1 , . . . , Ln at a1 , . . . , an , denoted by L⟨a1 ← L1 , . . . , an ← Ln ⟩, as follows. †
(i) for a ∈ Σ0 , a⟨a1 ← L1 , . . . , an ← Ln ⟩ = Li if a = ai , and a otherwise
(ii) for a ∈ Σk and t1 , . . . , tk ∈ TΣ , †† a[t1 · · · tk ]⟨. . . ⟩ = a[t1 ⟨. . . ⟩ · · · tk ⟨. . . ⟩]
(iii) for L ⊆ TΣ , L⟨a1 ← L1 , . . . , an ← Ln ⟩ = ⋃t∈L t⟨a1 ← L1 , . . . , an ← Ln ⟩.

† As usual, given a string w, we use w also to denote the language {w}.
†† For tree languages M1 , . . . , Mk we also write a[M1 · · · Mk ] to denote tc^k_a (M1 , . . . , Mk ). This notation is fully justified since a[M1 · · · Mk ] is the (string) concatenation of the languages a, [ , M1 , . . . , Mk and ] !
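Tree concatenation on single trees (Definition 2.24) is a simple simultaneous substitution. A sketch (not from the notes; trees as nested tuples, the substitution given as a dict from leaf symbols to trees):

```python
# Sketch of Definition 2.24: tree concatenation t<a1 <- s1, ..., an <- sn>.

def tconcat(t, sub):
    label, subtrees = t[0], t[1:]
    if not subtrees:                   # (i): a leaf
        return sub.get(label, t)       # replace a_i by s_i, else unchanged
    # (ii): substitute in every direct subtree
    return (label,) + tuple(tconcat(s, sub) for s in subtrees)

# Example 2.25: Delta0 = {x, y, c}, Delta2 = {b}, Delta3 = {a},
# t = a[b[xy]xc], substitute x <- b[cx] and y <- c:
t = ('a', ('b', ('x',), ('y',)), ('x',), ('c',))
s = tconcat(t, {'x': ('b', ('c',), ('x',)), 'y': ('c',)})
# s represents a[b[b[cx]c]b[cx]c], as stated in Example 2.25
```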


If, in particular, n = 1, then, for each a ∈ Σ0 and each L1 , L2 ⊆ TΣ , we denote L1 ⟨a ← L2 ⟩ also by L1 ·a L2 .

Remarks 2.28. (1) Obviously, if L, L1 , . . . , Ln are singletons, then Definition 2.27 is the same as Definition 2.24.
(2) Note that tree concatenation, as defined above, is "nondeterministic" in the sense that, for instance, to obtain t⟨a1 ← L1 , . . . , an ← Ln ⟩ different elements of L1 may be substituted at different occurrences of a1 in t. "Deterministic" tree concatenation of t with L1 , . . . , Ln at a1 , . . . , an could be defined as {t⟨a1 ← s1 , . . . , an ← sn ⟩ | si ∈ Li for all 1 ≤ i ≤ n}. In this case different occurrences of a1 in t should be replaced by the same element of L1 . It is clear that, in the case that L1 , . . . , Ln are singletons, this distinction cannot be made.

Intuitively, since trees are strings, tree concatenation is nothing else but ordinary string substitution, familiar from formal language theory (see, for instance, [Sal, I.3]). For completeness we give the definition of substitution of string languages.

Definition 2.29. Let ∆ be an alphabet. Let n ≥ 1, a1 , . . . , an ∈ ∆ all different and let L1 , . . . , Ln be languages over ∆. For any L ⊆ ∆∗ , the substitution of L1 , . . . , Ln for a1 , . . . , an in L, denoted by L⟨a1 ← L1 , . . . , an ← Ln ⟩, is the language over ∆ defined as follows:
(i) λ⟨a1 ← L1 , . . . , an ← Ln ⟩ = λ
(ii) for a ∈ ∆, a⟨a1 ← L1 , . . . , an ← Ln ⟩ = Li if a = ai , and a otherwise
(iii) for w ∈ ∆∗ and a ∈ ∆, wa⟨. . . ⟩ = w⟨. . . ⟩ · a⟨. . . ⟩
(iv) for L ⊆ ∆∗ , L⟨a1 ← L1 , . . . , an ← Ln ⟩ = ⋃w∈L w⟨a1 ← L1 , . . . , an ← Ln ⟩.

If n = 1, L1 ⟨a ← L2 ⟩ will also be denoted as L1 ·a L2 . If L1 , . . . , Ln are singletons, then the substitution is called a homomorphism.

Exercise 2.30. Let n ≥ 1, a1 , . . . , an ∈ Σ0 all different, ai ≠ e for all 1 ≤ i ≤ n, and L, L1 , . . . , Ln ⊆ TΣ . Prove that yield(L⟨a1 ← L1 , . . . , an ← Ln ⟩) = yield(L)⟨a1 ← yield(L1 ), . . . , an ← yield(Ln )⟩. (Thus: "yield of tree concatenation is string substitution of yields").

Exercise 2.31. Prove that Definitions 2.27 and 2.29 give exactly the same result for L⟨a1 ← L1 , . . . , an ← Ln ⟩ where a1 , . . . , an ∈ Σ0 and L, L1 , . . . , Ln are tree languages over Σ (and thus, string languages over Σ ∪ {[ , ]}).
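The distinction of Remark 2.28(2) between "nondeterministic" and "deterministic" tree concatenation of languages can be made concrete. A sketch (not from the notes; trees as nested tuples, substitution at one leaf symbol a):

```python
# Sketch contrasting the tree concatenation of Definition 2.27 with the
# "deterministic" variant of Remark 2.28(2), for a single leaf symbol a.

def nondet_sub(t, a, L1):
    """All trees obtained by replacing each occurrence of a independently."""
    if len(t) == 1:
        return set(L1) if t[0] == a else {t}
    results = {(t[0],)}
    for s in t[1:]:
        results = {r + (s2,) for r in results for s2 in nondet_sub(s, a, L1)}
    return results

def det_sub_one(t, a, s1):
    if len(t) == 1:
        return s1 if t[0] == a else t
    return (t[0],) + tuple(det_sub_one(s, a, s1) for s in t[1:])

def det_sub(t, a, L1):
    """Every occurrence of a is replaced by the same element of L1."""
    return {det_sub_one(t, a, s1) for s1 in L1}

# t = b[aa] with Sigma0 = {a, c, d}, Sigma2 = {b}, and L1 = {c, d}:
t = ('b', ('a',), ('a',))
L1 = {('c',), ('d',)}
# nondet_sub gives 4 trees (b[cc], b[cd], b[dc], b[dd]),
# det_sub only 2 of them (b[cc], b[dd]).
```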


Exercise 2.32. Define the notion of associativity for tree concatenation, and show that tree concatenation is associative. Show that, in general, “deterministic tree concatenation” is not associative (cf. Remark 2.28(2)). We shall need the following special case of tree concatenation. Definition 2.33. Let Σ be a ranked alphabet and let S be a set of symbols or a tree language. Then the set of trees indexed by S, denoted by TΣ (S), is defined inductively as follows. (i) S ∪ Σ0 ⊆ TΣ (S) (ii) If k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ (S), then a[t1 · · · tk ] ∈ TΣ (S). Note that TΣ (∅) = TΣ .

Thus, if S is a set of symbols, then TΣ (S) = TΣ∪S , where the elements of S are assumed to have rank 0. If S is a tree language over a ranked alphabet ∆, then TΣ (S) is a tree language over the ranked alphabet Σ ∪ ∆. Exercise 2.34. Show that, for any a ∈ Σ0 , TΣ (S) = TΣ ·a (S ∪ {a}).

We close this section with two general remarks.

Remark 2.35. Definition 2.5 of a tree is of course rather arbitrary. Other, equally useful, ways of defining trees as a special kind of strings are obtained by replacing a[t1 · · · tk ] in Definition 2.5 by [t1 · · · tk ]a, or at1 · · · tk ], or [at1 · · · tk ], or at1 · · · tk (only in the case that each symbol has exactly one rank), or [a t1 · · · tk ] (where "[a " is a new symbol, one for each a), or a[t1 ,t2 , . . . ,tk ] (where " , " is a new symbol).

Remark 2.36. Remark on the general philosophy in tree language theory. The general philosophy looks like this:

(1) Take vertical string language theory (cf. Definition 2.21), (2) generalize it to tree language theory, and (3) map this into horizontal string language theory via the yield operation (Definition 2.15). The fourth part of the philosophy is (4) Tree language theory is a specific part of string language theory, illustrated as follows:


For example, the string a[b[cd]d], read as a string over Σ ∪ {[ , ]}, is at the same time (a notation for) the tree

a
├─ b
│  ├─ c
│  └─ d
└─ d

Example:
(1) (vertical) string concatenation
(2) tree concatenation
(3) (horizontal) string substitution (see Exercise 2.30)
(4) (2) is a special case of (3) (see Exercise 2.31)

3 Recognizable tree languages

3.1 Finite tree automata and regular tree grammars

Let us first consider the usual finite automaton on strings. A deterministic finite automaton is a structure M = (Q, Σ, δ, q0 , F ), where Q is the set of states, Σ is the input alphabet, q0 is the initial state, F is the set of final states and δ is a family {δa }a∈Σ , where δa : Q → Q is the transition function for the input a. There are several ways to describe the functioning of M and the language it recognizes. One of them (see for instance [Sal, I.4]) is to describe explicitly the sequence of steps taken by the automaton while processing some input string. This point of view will be considered in Part (4). Another way is to give a recursive definition of the effect of an input string on the state of M . Since a recursive definition is in particular suitable for generalization to trees, let us consider one in detail. We define a function δ̂ : Σ∗ → Q such that, for w ∈ Σ∗ , δ̂(w) is intuitively the state M reaches after processing w, starting from the initial state q0 :
(i) δ̂(λ) = q0
(ii) for w ∈ Σ∗ and a ∈ Σ, δ̂(wa) = δa (δ̂(w)).
The language recognized by M is L(M ) = {w ∈ Σ∗ | δ̂(w) ∈ F }. When considering this definition of δ̂ for "bottom-up" monadic trees (see Definition 2.21), one easily arrives at the following generalization to the tree case: There should be a start state for each element of Σ0 . The finite tree automaton starts at all leaves ("at the same time", "in parallel") and processes the tree in a bottom-up fashion. The automaton arrives at each node of rank k with a sequence of k states (one state for each direct subtree of the node), and the transition function δa of the label a of that node is a mapping δa : Q^k → Q, which, from that sequence of k states, determines


the state at that node. A tree is recognized iff the tree automaton is in a final state at the root of the tree. Formally:

Definition 3.1. A deterministic bottom-up finite tree automaton is a structure M = (Q, Σ, δ, s, F ), where Q is a finite set (of states), Σ is a ranked alphabet (of input symbols), δ is a family {δ^k_a }k≥1,a∈Σk of mappings δ^k_a : Q^k → Q (the transition function for a ∈ Σk ), s is a family {sa }a∈Σ0 of states sa ∈ Q (the initial state for a ∈ Σ0 ), and F is a subset of Q (the set of final states). The mapping δ̂ : TΣ → Q is defined recursively as follows:
(i) for a ∈ Σ0 , δ̂(a) = sa ,
(ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , δ̂(a[t1 · · · tk ]) = δ^k_a (δ̂(t1 ), . . . , δ̂(tk )).
The tree language recognized by M is defined to be L(M ) = {t ∈ TΣ | δ̂(t) ∈ F }.

Intuitively, δ̂(t) is the state reached by M after bottom-up processing of t. For convenience, when k is understood, we shall write δa rather than δ^k_a . Note therefore that each symbol a ∈ Σ may have several transition functions δa (one for each of its ranks). We shall abbreviate "finite tree automaton" by "fta", and "deterministic" by "det.".

Definition 3.2. A tree language L is called recognizable (or regular) if L = L(M ) for some det. bottom-up fta M . The class of recognizable tree languages will be denoted by RECOG.

Example 3.3. Let us consider the det. bottom-up fta M = (Q, Σ, δ, s, F ), where Q = {0, 1, 2, 3}, Σ0 = {0, 1, 2, . . . , 9}, Σ2 = {+, ∗}, sa ≡ a (mod 4), F = {1}, and δ+ and δ∗ (both mappings Q^2 → Q) are addition modulo 4 and multiplication modulo 4 respectively. Then M recognizes the set of all "expressions" whose value modulo 4 is 1. Consider for instance the expression +[+[07]∗[2∗[73]]], the prefix form of (0+7)+(2∗(7∗3)). In the following picture,

+ (1)
├─ + (3)
│  ├─ 0 (0)
│  └─ 7 (3)
└─ ∗ (2)
   ├─ 2 (2)
   └─ ∗ (1)
      ├─ 7 (3)
      └─ 3 (3)
the state of M at each node of the tree is indicated between parentheses.
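The recursion of Definition 3.1 is easy to run by hand or by machine. A sketch (not from the notes; trees as nested tuples, with "∗" typed as ASCII '*') of δ̂ for the automaton of Example 3.3:

```python
# Sketch of delta-hat (Definition 3.1) for the automaton of Example 3.3:
# states {0,1,2,3}, leaves are digits, '+' and '*' act modulo 4, F = {1}.

def delta_hat(t):
    label, subtrees = t[0], t[1:]
    if not subtrees:
        return int(label) % 4              # initial state s_a = a mod 4
    states = [delta_hat(s) for s in subtrees]
    if label == '+':
        return (states[0] + states[1]) % 4
    return (states[0] * states[1]) % 4     # label == '*'

def accepts(t):
    return delta_hat(t) in {1}             # F = {1}

# The expression tree +[+[07]*[2*[73]]] of Example 3.3:
t = ('+', ('+', ('0',), ('7',)), ('*', ('2',), ('*', ('7',), ('3',))))
# delta_hat(t) == 1, so the tree is accepted:
# (0+7)+(2*(7*3)) = 49 ≡ 1 (mod 4).
```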


Example 3.4. Let Σ0 = {a} and Σ2 = {b}. Consider the language of all trees in TΣ which have a “right comb-like” structure like for instance the tree b[ab[ab[ab[aa]]]]. This tree language is recognized by the det. bottom-up fta M = (Q, Σ, δ, s, F ), where Q = {A, C, W }, sa = A, F = {C} and δb is defined by δb (A, A) = δb (A, C) = C and δb (q1 , q2 ) = W for all other pairs of states (q1 , q2 ). Exercise 3.5. Let Σ0 = {a, b}, Σ1 = {p} and Σ2 = {p, q}. Construct det. bottom-up finite tree automata recognizing the following tree languages: (i) the language of all trees t, such that if a node of t is labeled q, then its descendants are labeled q or a; (ii) the set of all trees t such that yield(t) ∈ a+ b+ ; (iii) the set of all trees t such that the total number of p’s occurring in t is odd.

A (theoretically) convenient extension of the deterministic finite automaton is to make it nondeterministic. A nondeterministic finite automaton (on strings) is a structure M = (Q, Σ, δ, S, F ), where Q, Σ and F are the same as in the deterministic case, S is a set of initial states, and, for each a ∈ Σ, δa is a mapping Q → P(Q) (intuitively, δa (q) is the set of states which M can possibly, nondeterministically, enter when reading a in state q). Again a mapping δ̂, now from Σ∗ into P(Q), can be defined, such that for every w ∈ Σ∗ , δ̂(w) is the set of states M can possibly reach after processing w, having started from one of the initial states in S:
(i) δ̂(λ) = S,
(ii) for w ∈ Σ∗ and a ∈ Σ, δ̂(wa) = ⋃{δa (q) | q ∈ δ̂(w)}.
The language recognized by M is L(M ) = {w ∈ Σ∗ | δ̂(w) ∩ F ≠ ∅}. Generalizing to trees we obtain the following definition.

Definition 3.6. A nondeterministic bottom-up finite tree automaton is a 5-tuple M = (Q, Σ, δ, S, F ), where Q, Σ and F are as in the deterministic case, S is a family {Sa }a∈Σ0 such that Sa ⊆ Q for each a ∈ Σ0 , and δ is a family {δ^k_a }k≥1,a∈Σk of mappings δ^k_a : Q^k → P(Q). The mapping δ̂ : TΣ → P(Q) is defined recursively by
(i) for a ∈ Σ0 , δ̂(a) = Sa ,
(ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , δ̂(a[t1 · · · tk ]) = ⋃{δa (q1 , . . . , qk ) | qi ∈ δ̂(ti ) for 1 ≤ i ≤ k}.
The tree language recognized by M is L(M ) = {t ∈ TΣ | δ̂(t) ∩ F ≠ ∅}.
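The set-valued recursion of Definition 3.6 can be computed directly; note that the set δ̂(t) is exactly the single state of the determinized automaton constructed in Theorem 3.8 below. A sketch (not from the notes; trees as nested tuples, transitions as a dict), using the automaton of Example 3.7 below:

```python
# Sketch of delta-hat from Definition 3.6 for a nondeterministic
# bottom-up fta; the automaton is that of Example 3.7.
from itertools import product

def delta_hat(t, delta, S):
    label, subtrees = t[0], t[1:]
    if not subtrees:
        return S[label]
    child_sets = [delta_hat(s, delta, S) for s in subtrees]
    result = set()
    for qs in product(*child_sets):           # all choices of child states
        result |= delta.get((label,) + qs, set())
    return result

# Example 3.7: Sigma0 = {p}, Sigma2 = {a, b}; states qs, qa, qb, r.
S = {'p': {'qs'}}
F = {'r'}
delta = {('a', 'qs', 'qs'): {'qs', 'qa'},
         ('b', 'qs', 'qs'): {'qs', 'qb'},
         ('a', 'qa', 'qa'): {'r'},
         ('b', 'qb', 'qb'): {'r'}}
for q in ['qs', 'qa', 'qb', 'r']:             # r propagates to the root
    delta[('a', q, 'r')] = delta[('a', 'r', q)] = {'r'}
    delta[('b', q, 'r')] = delta[('b', 'r', q)] = {'r'}

p = ('p',)
t1 = ('a', ('a', p, p), ('a', p, p))   # contains an a whose sons are a, a
t2 = ('b', p, p)                       # contains no such configuration
# delta_hat(t1, delta, S) meets F (accepted); delta_hat(t2, delta, S) does not.
```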

Note that, for q ∈ Q^k , δ^k_a (q) may be empty.

Example 3.7. Let Σ0 = {p} and Σ2 = {a, b}. Consider the following tree language over Σ: L = {u1 a[a[s1 s2 ]a[t1 t2 ]]u2 ∈ TΣ | · · · } ∪ {u1 b[b[s1 s2 ]b[t1 t2 ]]u2 ∈ TΣ | · · · }


where "· · · " stands for u1 , u2 ∈ (Σ ∪ {[ , ]})∗ , s1 , s2 , t1 , t2 ∈ TΣ . In other words, L is the set of all trees containing a configuration a[a[· ·]a[· ·]] (a node labeled a whose two direct descendants are labeled a) or a configuration b[b[· ·]b[· ·]] (or both). L is recognized by the nondet. bottom-up fta M = (Q, Σ, δ, S, F ), where Q = {qs , qa , qb , r}, Sp = {qs }, F = {r} and
δa (qs , qs ) = {qs , qa }, δb (qs , qs ) = {qs , qb },
δa (qa , qa ) = δb (qb , qb ) = {r},
for all q ∈ Q : δa (q, r) = δa (r, q) = δb (q, r) = δb (r, q) = {r},
and δx (q1 , q2 ) = ∅ for all other possibilities.

It is rather obvious in the last example that we can find a deterministic bottom-up fta recognizing the same language (find it!). We now show that this is possible in general (as in the case of strings).

Theorem 3.8. For each nondeterministic bottom-up fta we can find a deterministic one recognizing the same language.

Proof. The proof uses the "subset-construction", well known from the string-case. Let M = (Q, Σ, δ, S, F ) be a nondeterministic bottom-up fta. Construct the deterministic bottom-up fta M1 = (P(Q), Σ, δ1 , s1 , F1 ) such that (s1 )a = Sa for all a ∈ Σ0 , F1 = {Q1 ∈ P(Q) | Q1 ∩ F ≠ ∅}, and, for a ∈ Σk and Q1 , . . . , Qk ⊆ Q,
(δ1 )a (Q1 , . . . , Qk ) = ⋃{δa (q1 , . . . , qk ) | qi ∈ Qi for all 1 ≤ i ≤ k}.
It is straightforward to show, using Definitions 3.1 and 3.6, that δ̂1 (t) = δ̂(t) for all t ∈ TΣ (proof by induction on t). From this it follows that L(M1 ) = {t | δ̂1 (t) ∈ F1 } = {t | δ̂(t) ∩ F ≠ ∅} = L(M ).

Exercise 3.9. Check the proof of Theorem 3.8. Construct the det. bottom-up fta corresponding to the fta M of Example 3.7 according to that proof, and compare this det. fta with the one you found before.

Let us now consider the top-down generalization of the finite automaton. Let M = (Q, Σ, δ, q0 , F ) be a det. finite automaton.
Another way to define L(M ) is by giving a recursive definition of a mapping δ̃ : Σ∗ → P(Q) such that intuitively, for each w ∈ Σ∗ , δ̃(w) is the set of states q such that the machine M , when started in state q, enters a final state after processing w. The definition of δ̃ is as follows:
(i) δ̃(λ) = F
(ii) for w ∈ Σ∗ and a ∈ Σ, δ̃(aw) = {q | δa (q) ∈ δ̃(w)}
(the last line may be read as: to check whether, starting in q, M recognizes aw, compute q1 = δa (q) and check whether M recognizes w starting in q1 ). The language recognized by M is L(M ) = {w ∈ Σ∗ | q0 ∈ δ̃(w)}. This definition, applied to "top-down" monadic trees, leads to the following generalization to arbitrary trees. The finite tree automaton starts at the root of the tree in the initial state, and processes the tree in a top-down


fashion. The automaton arrives at each node in one state, and the transition function δa of the label a of that node is a mapping δa : Q → Q^k (where k is the rank of the node), which, from that state, determines the state in which to continue for each direct descendant of the node (the automaton “splits up” into k independent copies, one for each direct subtree of the node). Finally the automaton arrives at all leaves of the tree. There should be a set of final states for each element of Σ0. The tree is recognized if the fta arrives at each leaf in a state which is final for the label of that leaf. Formally:

Definition 3.10. A deterministic top-down finite tree automaton is a 5-tuple M = (Q, Σ, δ, q0, F), where Q is a finite set (of states), Σ is a ranked alphabet (of input symbols), δ is a family {δa}k≥1,a∈Σk of mappings δa : Q → Q^k (the transition function for a ∈ Σk), q0 is in Q (the initial state), and F is a family {Fa}a∈Σ0 of sets Fa ⊆ Q (the set of final states for a ∈ Σ0). The mapping δ̃ : TΣ → P(Q) is defined recursively by
(i) for a ∈ Σ0, δ̃(a) = Fa;
(ii) for k ≥ 1, a ∈ Σk and t1, . . . , tk ∈ TΣ, δ̃(a[t1 · · · tk]) = {q | δa(q) ∈ δ̃(t1) × · · · × δ̃(tk)}.
The tree language recognized by M is defined to be L(M) = {t ∈ TΣ | q0 ∈ δ̃(t)}.

Intuitively, δ̃(t) is the set of states q such that M, when starting at the root of t in state q, arrives at the leaves of t in final states.

Example 3.11. Consider the tree language of Exercise 3.5(i). A det. top-down fta recognizing this language is M = (Q, Σ, δ, q0, F) where Q = {A, R, W}, q0 = A, Fa = {A, R}, Fb = {A} and

δp1(A) = A, δp1(R) = δp1(W) = W,
δp2(A) = (A, A), δp2(R) = δp2(W) = (W, W),
δq(A) = (R, R), δq(R) = (R, R), δq(W) = (W, W).

Exercise 3.12. Let Σ be a ranked alphabet, and p ∈ Σ2 . Let L be the tree language defined recursively by (i) for all t1 , t2 ∈ TΣ , p[t1 t2 ] ∈ L (ii) for all a ∈ Σk , if t1 , . . . , tk ∈ L, then a[t1 · · · tk ] ∈ L (k ≥ 1). Construct a deterministic top-down fta recognizing L. Give a nonrecursive description of L. Exercise 3.13. Construct a det. top-down fta M such that yield(L(M )) = a+ b+ .
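Returning for a moment to the bottom-up case: the behaviour of a nondeterministic bottom-up fta, and with it the subset construction of Theorem 3.8, is easy to program. The following Python sketch is mine (the automaton below is a toy example of my own, not the one of Example 3.7); it computes δ̂(t) directly, and by Theorem 3.8 the set it returns is exactly the single state that the determinized automaton reaches on t.

```python
from itertools import product

def run(delta, leaf_states, t):
    """Compute delta-hat(t): the set of states a nondeterministic bottom-up
    fta can reach at the root of t.  Trees are nested tuples: ('a',) is a
    leaf, ('p', t1, t2) a node of rank 2."""
    label, children = t[0], t[1:]
    if not children:
        return set(leaf_states[label])
    kid_sets = [run(delta, leaf_states, c) for c in children]
    reached = set()
    for qs in product(*kid_sets):       # every choice of subtree states
        reached |= delta.get((label, qs), set())
    return reached                      # = the state of the subset automaton

# Toy automaton (my own example): over Sigma_0 = {a, b}, Sigma_2 = {p},
# accept the trees that contain at least one b-leaf.
leaf_states = {'a': {'q'}, 'b': {'q', 'f'}}
delta = {('p', (l, r)): ({'f'} if 'f' in (l, r) else {'q'})
         for l in 'qf' for r in 'qf'}
final = {'f'}

def accepts(t):
    return bool(run(delta, leaf_states, t) & final)
```

Tabulating `run` over all subsets of Q would yield the transition table (δ1)a of the proof of Theorem 3.8; the point of the theorem is that this table is finite and deterministic.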

We now show that the det. top-down fta recognizes fewer languages than its bottom-up counterpart.


Theorem 3.14. There are recognizable tree languages which cannot be recognized by a deterministic top-down fta.

Proof. Let Σ0 = {a, b} and Σ2 = {S}. Consider the (finite!) tree language L = {S[ab], S[ba]}. Suppose that the det. top-down fta M = (Q, Σ, δ, q0, F) recognizes L. Let δS(q0) = (q1, q2). Since S[ab] ∈ L(M), q1 ∈ Fa and q2 ∈ Fb. But, since S[ba] ∈ L(M), q1 ∈ Fb and q2 ∈ Fa. Hence both S[aa] and S[bb] are in L(M). Contradiction.

Exercise 3.15. Show that the tree languages of Exercise 3.5(ii,iii) are not recognizable by a det. top-down fta.

It will be clear that the nondeterministic top-down fta is able to recognize all recognizable languages. We give the definition without comment.

Definition 3.16. A nondeterministic top-down finite tree automaton is a structure M = (Q, Σ, δ, S, F), where Q, Σ and F are as in the deterministic case, S is a subset of Q and δ is a family {δa}k≥1,a∈Σk of mappings δa : Q → P(Q^k). The mapping δ̃ : TΣ → P(Q) is defined recursively as follows:
(i) for a ∈ Σ0, δ̃(a) = Fa;
(ii) for k ≥ 1, a ∈ Σk and t1, . . . , tk ∈ TΣ, δ̃(a[t1 · · · tk]) = {q | ∃(q1, . . . , qk) ∈ δa(q) : qi ∈ δ̃(ti) for all 1 ≤ i ≤ k}.
The tree language recognized by M is L(M) = {t ∈ TΣ | δ̃(t) ∩ S ≠ ∅}.
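The mapping δ̃ of Definition 3.10 is easy to compute mechanically. The following Python sketch is mine (trees are encoded as nested tuples, a convention not used in the text); it is instantiated with the automaton of Example 3.11.

```python
def delta_tilde(states, delta, finals, t):
    """The set of states q from which the det. top-down fta accepts t
    (the mapping delta-tilde of Definition 3.10).  delta maps
    (symbol, state) -> tuple of successor states; finals maps each
    leaf symbol to its set of final states."""
    label, subtrees = t[0], t[1:]
    if not subtrees:
        return set(finals.get(label, ()))
    kids = [delta_tilde(states, delta, finals, s) for s in subtrees]
    return {q for q in states
            if (label, q) in delta
            and all(s in k for s, k in zip(delta[(label, q)], kids))}

# The automaton of Example 3.11, encoded as dictionaries.
states = {'A', 'R', 'W'}
finals = {'a': {'A', 'R'}, 'b': {'A'}}
delta = {('p1', 'A'): ('A',), ('p1', 'R'): ('W',), ('p1', 'W'): ('W',),
         ('p2', 'A'): ('A', 'A'), ('p2', 'R'): ('W', 'W'), ('p2', 'W'): ('W', 'W'),
         ('q', 'A'): ('R', 'R'), ('q', 'R'): ('R', 'R'), ('q', 'W'): ('W', 'W')}

def accepts(t):
    return 'A' in delta_tilde(states, delta, finals, t)   # q0 = A
```

For instance, the tree p2[q[aa]b] is accepted, while q[ba] is not (its b-leaf is reached in state R, which is not final for b).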

We now show that, nondeterministically, there is no difference between bottom-up and top-down recognition.

Theorem 3.17. A tree language is recognizable by a nondet. bottom-up fta iff it is recognizable by a nondet. top-down fta.

Proof. Let us say that a nondet. bottom-up fta M = (Q, Σ, δ, S, F) and a nondet. top-down fta N = (P, ∆, µ, R, G) are “associated” if the following requirements are satisfied:
(i) Q = P, Σ = ∆, F = R and, for all a ∈ Σ0, Sa = Ga;
(ii) for all k ≥ 1, a ∈ Σk and q1, . . . , qk, q ∈ Q, q ∈ δa(q1, . . . , qk) iff (q1, . . . , qk) ∈ µa(q).
In that case, one can easily prove by induction that δ̂ = µ̃, and so L(M) = L(N). Since obviously for each nondet. bottom-up fta there is an associated nondet. top-down fta, and vice versa, the theorem holds.

Thus the classes of tree languages recognized by the nondet. bottom-up, det. bottom-up and nondet. top-down fta are all equal (and are called RECOG), whereas the class of tree languages recognized by the det. top-down fta is a proper subclass of RECOG.

The next victim of generalization is the regular grammar (right-linear, type-3 grammar). In this case it seems appropriate to take the top-down point of view only. Consider an


ordinary regular grammar G = (N, Σ, R, S). All rules have either the form A → wB or the form A → w, where A, B ∈ N and w ∈ Σ∗. Monadically, the string wB may be considered as the result of tree concatenating the tree we with B at e, where B is of rank 0. Thus we can take the generalization of strings of the form wB or w to be trees in T∆(N), where ∆ is a ranked alphabet (for the definition of T∆(N), see Definition 2.33). Thus, let us consider a “tree grammar” with rules of the form A → t, where A ∈ N and t ∈ T∆(N). Obviously, the application of a rule A → t to a tree s ∈ T∆(N) should intuitively consist of replacing one occurrence of A in s by the tree t. Starting with the initial nonterminal, nonterminals at the frontier of the tree are then repeatedly replaced by right hand sides of rules, until the tree does not contain nonterminals any more. Now, since trees are defined as strings, it turns out that this process is precisely the way a context-free grammar works. Thus we arrive at the following formal definition.

Definition 3.18. A regular tree grammar is a tuple G = (N, Σ, R, S) where N is a finite set (of nonterminals), Σ is a ranked alphabet (of terminals), such that Σ ∩ N = ∅, S ∈ N is the initial nonterminal, and R is a finite set of rules of the form A → t with A ∈ N and t ∈ TΣ(N). The tree language generated by G, denoted by L(G), is defined to be L(H), where H is the context-free grammar (N, Σ ∪ {[ , ]}, R, S). We shall use ⇒_G and ⇒*_G (or ⇒ and ⇒* when G is understood) to denote the restrictions of ⇒_H and ⇒*_H to TΣ(N).

Example 3.19. Let Σ0 = {a, b, c, d, e}, Σ2 = {p} and Σ3 = {p, q}. Consider the regular tree grammar G = (N, Σ, R, S), where N = {S, T} and R consists of the rules S → p[aT a], T → q[cp[dT ]b] and T → e. Then G generates the tree p[aq[cp[de]b]a] as follows:

S ⇒ p[aT a] ⇒ p[aq[cp[dT ]b]a] ⇒ p[aq[cp[de]b]a]

(pictorially, the same derivation is displayed by drawing each of these trees). The tree language generated by G is {p[a(q[cp[d)^n e(]b])^n a] | n ≥ 0}.
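Since trees are defined here as strings over Σ ∪ {[ , ]}, a regular tree grammar can be run literally as a string rewriting system. Below is a Python sketch of mine for the grammar of Example 3.19; the bound `fuel` on the number of rule applications is my own device to keep the enumeration finite.

```python
rules = {'S': ['p[aTa]'], 'T': ['q[cp[dT]b]', 'e']}   # Example 3.19

def derive(s, fuel):
    """All terminal trees (strings over Sigma and brackets) derivable from s
    using at most `fuel` rule applications.  The leftmost nonterminal is
    rewritten first, which is no restriction for regular tree grammars."""
    hits = [(s.find(n), n) for n in rules if n in s]
    if not hits:
        return {s}                 # no nonterminal left: a tree of L(G)
    if fuel == 0:
        return set()               # derivation not finished in time
    i, n = min(hits)
    out = set()
    for rhs in rules[n]:
        out |= derive(s[:i] + rhs + s[i + 1:], fuel - 1)
    return out
```

With `derive('S', 4)` one obtains exactly the trees of the language for n = 0, 1, 2, matching the expression {p[a(q[cp[d)^n e(]b])^n a] | n ≥ 0}.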

Exercise 3.20. Write regular tree grammars generating the tree languages of Exercise 3.5. As in the case of strings, each regular tree grammar is equivalent to one that has the property that at each step in the derivation exactly one terminal symbol is produced. Definition 3.21. A regular tree grammar G = (N, Σ, R, S) is in normal form, if each of its rules is either of the form A → a[B1 · · · Bk ] or of the form A → b, where k ≥ 1, a ∈ Σk , A, B1 , . . . , Bk ∈ N and b ∈ Σ0 .


Theorem 3.22. Each regular tree grammar has an equivalent regular tree grammar in normal form.

Proof. Consider an arbitrary regular tree grammar G = (N, Σ, R, S). Let G1 = (N, Σ, R1, S) be the regular tree grammar such that (A → t) ∈ R1 if and only if t ∉ N and there is a B in N such that A ⇒*_G B and (B → t) ∈ R. Then L(G1) = L(G), and R1 does not contain rules of the form A → B with A, B ∈ N. (This is the well-known procedure of removing rules A → B from a context-free grammar.) Suppose that G1 is not yet in normal form. Then there is a rule of the form A → a[t1 · · · ti · · · tk] such that ti ∉ N. Construct a new regular tree grammar G2 by adding a new nonterminal B to N and replacing the rule A → a[t1 · · · ti · · · tk] by the two rules A → a[t1 · · · B · · · tk] and B → ti in R1. It should be clear that L(G2) = L(G1), and that, by repeating the latter process a finite number of times, one ends up with an equivalent grammar in normal form.

Exercise 3.23. Put the regular tree grammar of Example 3.19 into normal form.
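The second half of the proof of Theorem 3.22 is an algorithm. A Python sketch of mine follows, on tuple-encoded trees; the fresh nonterminals N1, N2, ... are my own naming, and chain rules A → B are assumed to have been removed already, as in the first half of the proof.

```python
def to_normal_form(rules, nonterminals):
    """rules: list of pairs (A, t), where t is a tuple-tree whose leaves may
    be nonterminals, e.g. ('p', ('a',), ('T',)).  Repeatedly replaces a
    direct subtree that is not a single nonterminal by a fresh nonterminal
    (the splitting step in the proof of Theorem 3.22)."""
    out, work, fresh = [], list(rules), 0
    while work:
        a, t = work.pop()
        label, kids = t[0], t[1:]
        if not kids:                      # A -> b with b of rank 0
            out.append((a, t))
            continue
        heads = []
        for k in kids:
            if len(k) == 1 and k[0] in nonterminals:
                heads.append(k)           # already a single nonterminal B
            else:
                fresh += 1
                b = 'N%d' % fresh         # fresh nonterminal (my naming)
                nonterminals.add(b)
                work.append((b, k))       # new rule b -> k, split further
                heads.append((b,))
        out.append((a, (label,) + tuple(heads)))
    return out

# The grammar of Example 3.19, encoded as tuples:
nts = {'S', 'T'}
g = [('S', ('p', ('a',), ('T',), ('a',))),
     ('T', ('q', ('c',), ('p', ('d',), ('T',)), ('b',))),
     ('T', ('e',))]
nf = to_normal_form(g, nts)
```

Every resulting rule is of the form A → a[B1 · · · Bk] with all Bi nonterminals, or A → b with b of rank 0.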

Exercise 3.24. What does Theorem 3.22 actually say in the case of strings (the monadic case)? In the next theorem we show that the regular tree grammars generate exactly the class of recognizable tree languages. Theorem 3.25. A tree language can be generated by a regular tree grammar iff it is an element of RECOG. Proof. Exercise.

Note therefore that each recognizable tree language is a special kind of context-free language. Exercise 3.26. Show that all finite tree languages are in RECOG.

Exercise 3.27. Show that each recognizable tree language can be generated by a “backwards deterministic” regular tree grammar. A regular tree grammar is called “backwards deterministic” if (1) it may have more than one initial nonterminal, (2) it is in normal form, and (3) rules with the same right hand side are equal. It is now easy to show the connection between recognizable tree languages and contextfree languages. Let CFL denote the class of context-free languages. Theorem 3.28. yield(RECOG) = CFL (in words, the yield of each recognizable tree language is context-free, and each context-free language is the yield of some recognizable tree language).


Proof. Let G = (N, Σ, R, S) be a regular tree grammar. Consider the context-free grammar Ḡ = (N, Σ0, R̄, S), where R̄ = {A → yield(t) | (A → t) ∈ R}. Then L(Ḡ) = yield(L(G)). Now let G = (N, Σ, R, S) be a context-free grammar. Let ∗ be a new symbol, and let ∆ = Σ ∪ {e, ∗} be the ranked alphabet such that ∆0 = Σ ∪ {e}, and, for k ≥ 1, ∆k = {∗} if and only if there is a rule in R with a right hand side of length k. Consider the regular tree grammar Ḡ = (N, ∆, R̄, S) such that
(i) if A → w is in R, w ≠ λ, then A → ∗[w] is in R̄,
(ii) if A → λ is in R, then A → e is in R̄.
Then yield(L(Ḡ)) = L(G).

In the next section we shall give the connection between regular tree languages and derivation trees of context-free languages.

Exercise 3.29. A context-free grammar is “invertible” if rules with the same right hand side are equal. Show that each context-free language can be generated by an invertible context-free grammar.

For regular string languages a useful stronger version of Theorem 3.28 can be proved.

Theorem 3.30. Let Σ be a ranked alphabet. If R is a regular string language over Σ0, then the tree language {t ∈ TΣ | yield(t) ∈ R} is recognizable.

Proof. Let M = (Q, Σ, δ, q0, F) be a deterministic finite automaton recognizing R. We construct a nondeterministic bottom-up fta N = (Q × Q, Σ, µ, S, G), which, for each tree t, checks whether a successful computation of M on yield(t) is possible. The states of N are pairs of states of M. Intuitively we want that (q1, q2) ∈ µ̂(t) if and only if M arrives in state q2 after processing yield(t), starting from state q1. Thus we define
(i) for all a ∈ Σ0, Sa = {(q1, q2) | δa(q1) = q2},
(ii) for all k ≥ 1, a ∈ Σk and states q1, q2, . . . , q2k ∈ Q, µa((q1, q2), (q3, q4), . . . , (q2k−1, q2k)) = {(q1, q2k)} if q2i = q2i+1 for all 1 ≤ i ≤ k − 1, and ∅ otherwise,
(iii) G = {(q0, q) | q ∈ F}.
Then L(N) = {t ∈ TΣ | yield(t) ∈ R}.
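In programming terms, the pair construction in the proof of Theorem 3.30 is relation composition: the pairs for a node are obtained by composing the pair relations of its subtrees from left to right. A Python sketch of mine, using as string language the regular language a*b over {a, b} (my own example):

```python
def pair_states(dfa_delta, states, t):
    """mu-hat(t) of the proof of Theorem 3.30: all pairs (q1, q2) such that
    the string automaton, started in q1, reaches q2 after reading yield(t).
    dfa_delta maps (state, symbol) -> state and may be partial."""
    label, kids = t[0], t[1:]
    if not kids:                      # a leaf contributes one input symbol
        return {(q, dfa_delta[(q, label)])
                for q in states if (q, label) in dfa_delta}
    rel = pair_states(dfa_delta, states, kids[0])
    for k in kids[1:]:                # compose the relations left to right
        nxt = pair_states(dfa_delta, states, k)
        rel = {(p, r) for (p, q) in rel for (q2, r) in nxt if q == q2}
    return rel

# DFA for the regular string language a*b (my example):
states = {0, 1}
dfa_delta = {(0, 'a'): 0, (0, 'b'): 1}
q0, F = 0, {1}

def yield_in_R(t):
    return any(p == q0 and r in F
               for (p, r) in pair_states(dfa_delta, states, t))
```

For example, a tree with yield ab is accepted, one with yield ba is not.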

Exercise 3.31. Show that, if Σ2 6= ∅, then Theorem 3.30 holds conversely: if L is a string language such that {t ∈ TΣ | yield(t) ∈ L} is recognizable, then L is regular. What can you say in case Σ2 = ∅?

3.2 Closure properties of recognizable tree languages

We first consider set-theoretic operations.


Theorem 3.32. RECOG is closed under union, intersection and complementation.

Proof. To show closure under complementation, consider a deterministic bottom-up fta M = (Q, Σ, δ, s, F). Let N be the det. bottom-up fta (Q, Σ, δ, s, Q − F). Then, obviously, L(N) = TΣ − L(M). To show closure under union, consider two regular tree grammars Gi = (Ni, Σi, Ri, Si), i = 1, 2 (with N1 ∩ N2 = ∅). Then G = (N1 ∪ N2 ∪ {S}, Σ1 ∪ Σ2, R1 ∪ R2 ∪ {S → S1, S → S2}, S) is a regular tree grammar such that L(G) = L(G1) ∪ L(G2). Closure under intersection then follows from the identity L1 ∩ L2 = TΣ − ((TΣ − L1) ∪ (TΣ − L2)).

As a corollary we obtain the following closure property of context-free languages.

Corollary 3.33. CFL is closed under intersection with regular languages.

Proof. Let L and R be a context-free and regular language respectively. According to Theorem 3.28, there is a recognizable tree language U such that yield(U) = L. Consequently, by Theorems 3.30 and 3.32, the tree language V = U ∩ {t | yield(t) ∈ R} is recognizable. Obviously L ∩ R = yield(V) and so, again by Theorem 3.28, L ∩ R is context-free.

We now turn to the closure of RECOG under concatenation operations (see Definitions 2.23 and 2.27).

Theorem 3.34. For every k ≥ 1 and a ∈ Σk, RECOG is closed under the top concatenation tc_a^k.
Proof. Exercise.
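Theorem 3.32 obtains intersection indirectly, via complement and union; the standard direct construction (not given in the text) runs two deterministic bottom-up automata in parallel on pairs of states. A Python sketch, with automata and names of my own choosing:

```python
def run_det(delta, leaf, t):
    """The unique state of a deterministic bottom-up fta at the root of t."""
    label, kids = t[0], t[1:]
    if not kids:
        return leaf[label]
    return delta[(label, tuple(run_det(delta, leaf, k) for k in kids))]

def intersection(m1, m2):
    """Product automaton for L(m1) and L(m2); each m is (delta, leaf, finals)."""
    (d1, l1, f1), (d2, l2, f2) = m1, m2
    leaf = {a: (l1[a], l2[a]) for a in l1}
    delta = {(sym, tuple(zip(q1, q2))): (r1, r2)
             for (sym, q1), r1 in d1.items()
             for (sym2, q2), r2 in d2.items()
             if sym == sym2 and len(q1) == len(q2)}
    finals = {(p, q) for p in f1 for q in f2}
    return delta, leaf, finals

# M1: 'does the tree contain a b-leaf?'   M2: 'is the number of leaves even?'
m1 = ({('p', (x, y)): max(x, y) for x in (0, 1) for y in (0, 1)},
      {'a': 0, 'b': 1}, {1})
m2 = ({('p', (x, y)): (x + y) % 2 for x in (0, 1) for y in (0, 1)},
      {'a': 1, 'b': 1}, {0})
d, l, f = intersection(m1, m2)
```

The product automaton accepts exactly the trees with a b-leaf and an even number of leaves.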

Theorem 3.35. RECOG is closed under tree concatenation.

Proof. The proof is obtained by generalizing that for regular string languages. Let n ≥ 1, a1, . . . , an ∈ Σ0 all different and L0, L1, . . . , Ln recognizable tree languages (we may assume that all languages are over the same ranked alphabet Σ). Let Gi = (Ni, Σ, Ri, Si) be a regular tree grammar in normal form for Li (i = 0, 1, . . . , n). A regular tree grammar generating L0⟨a1 ← L1, . . . , an ← Ln⟩ is G = (N0 ∪ · · · ∪ Nn, Σ, R, S0), where R = R̄0 ∪ R1 ∪ · · · ∪ Rn, and R̄0 is R0 with each rule of the form A → ai replaced by the rule A → Si (1 ≤ i ≤ n).

Corollary 3.36. CFL is closed under substitution.
Proof. Use Theorem 3.28 and Exercise 2.30.

Note also that Theorem 3.35 is essentially a special case of Corollary 3.36. Next we generalize the notion of (concatenation) closure of string languages to trees, and show that RECOG is closed under this closure operation. We shall, for convenience, restrict ourselves to the case that tree concatenation happens at one element of Σ0 .


Definition 3.37. Let a ∈ Σ0 and let L be a tree language over Σ. Then the tree concatenation closure of L at a, denoted by L∗a, is defined to be ∪n≥0 Xn, where X0 = {a} and, for n ≥ 0, Xn+1 = Xn ·a (L ∪ {a}) (recall the notation L1 ·a L2 from Definition 2.27).

Example 3.38. Let G = (N, Σ, R, S) be the regular tree grammar with N = {S}, Σ0 = {a}, Σ2 = {b} and R = {S → b[aS], S → a}. Then L(G) = {b[aS]}∗S ·S a.

The “corresponding” operation on strings has several names in the literature. Let us call it “substitution closure”.

Definition 3.39. Let ∆ be an alphabet and a ∈ ∆. For a language L over ∆, the substitution closure of L at a, denoted by L∗a, is defined to be ∪n≥0 Xn, where X0 = {a} and, for n ≥ 0, Xn+1 = Xn ·a (L ∪ {a}).

Exercise 3.40. Let a ∈ Σ0, a ≠ e, and let L ⊆ TΣ. Prove that yield(L∗a) = (yield(L))∗a.

Theorem 3.41. RECOG is closed under tree concatenation closure.

Proof. Again the proof is a straightforward generalization of the string case. Let G = (N, Σ, R, S) be a regular tree grammar in normal form, and let a ∈ Σ0. Construct the regular tree grammar Ḡ = (N ∪ {S0}, Σ, R̄, S0), where R̄ = R ∪ {A → S | A → a is in R} ∪ {S0 → S, S0 → a}. Then L(Ḡ) = (L(G))∗a.

Corollary 3.42. CFL is closed under substitution closure.
Proof. Use Theorem 3.28 and Exercise 3.40.

It is well known that the class of regular string languages is the smallest class containing the finite languages and closed under union, concatenation and closure. A similar result holds for recognizable tree languages.

Theorem 3.43. RECOG is the smallest class of tree languages containing the finite tree languages and closed under union, tree concatenation and tree concatenation closure.

Proof. We have shown that RECOG satisfies the above conditions in Exercise 3.26 and Theorems 3.32, 3.35 and 3.41. It remains to show that every recognizable tree language can be built up from the finite tree languages using the operations ∪, ·a and ∗a. Let G = (N, Σ, R, S) be a regular tree grammar (it is easy to think of it as being in normal form). We shall use the elements of N to do tree concatenation at. For A ∈ N and P, Q ⊆ N with P ∩ Q = ∅, let us denote by L^Q_{A,P} the set of all trees t ∈ TΣ(P) for which there is a derivation A ⇒ t1 ⇒ t2 ⇒ · · · ⇒ tn ⇒ tn+1 = t (n ≥ 0) such that, for 1 ≤ i ≤ n, ti ∈ TΣ(Q ∪ P) and a rule with left hand side in Q is applied to ti to obtain ti+1. We shall show, by induction on the cardinality of Q, that all sets L^Q_{A,P} can be built up from the finite tree languages by the operations ∪, ·B and ∗B (for all B ∈ N). For Q = ∅, L^∅_{A,P} is the set of all those right hand sides of rules with left hand side A, that are in TΣ(P). Thus L^∅_{A,P} is a finite tree language for all A and P. Assuming now that, for Q ⊆ N, all sets L^Q_{A,P} can be built up from the finite tree languages, the same holds for all sets L^{Q∪{B}}_{A,P}, where B ∈ N − Q, since

L^{Q∪{B}}_{A,P} = L^Q_{A,P∪{B}} ·B (L^Q_{B,P∪{B}})∗B ·B L^Q_{B,P}

(a formal proof of this equation is left to the reader). Thus, since L(G) = L^N_{S,∅}, the theorem is proved.

In other words, each recognizable tree language can be denoted by a “regular expression” with trees as constants and ∪, ·A and ∗A as operators.

Exercise 3.44. Try to find a regular expression for the language generated by the regular tree grammar G = (N, Σ, R, S) with N = {S, T}, Σ0 = {a}, Σ2 = {p} and R = {S → p[T S], S → a, T → p[T T ], T → a}. Use the algorithm in the proof of Theorem 3.43.

As a corollary we obtain the result that all context-free languages can be denoted by “context-free expressions”.

Corollary 3.45. CFL is the smallest class of languages containing the finite languages and closed under union, substitution and substitution closure.
Proof. Exercise.

Exercise 3.46. Define the operation of “iterated concatenation at a” (for tree languages) and “iterated substitution at a” (for string languages) by ita (L) = L∗a ·a ∅. Prove (using Theorem 3.43) that RECOG is the smallest class of tree languages containing the finite tree languages and closed under the operations of union, top concatenation and iterated concatenation. Show that this implies that CFL is the smallest class of languages containing the finite languages and closed under the operations of union, concatenation and iterated substitution (cf. [Sal, VI.11]). Let us now turn to another operation on trees: that of relabeling the nodes of a tree. Definition 3.47. Let Σ and ∆ be ranked alphabets. A relabeling r is a family {rk }k≥0 of mappings rk : Σk → P(∆k ). A relabeling determines a mapping r : TΣ → P(T∆ ) by the requirements (i) for a ∈ Σ0 , r(a) = r0 (a), (ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , r(a[t1 · · · tk ]) = {b[s1 · · · sk ] | b ∈ rk (a) and si ∈ r(ti )}.


If, for each k ≥ 0 and each a ∈ Σk, rk(a) consists of one element only, then r is called a projection. Obviously, RECOG is closed under relabelings.

Theorem 3.48. RECOG is closed under relabelings.

Proof. Let r be a relabeling, and consider some regular tree grammar G. By replacing each rule A → t of G by all rules A → s, s ∈ r(t), one obtains a regular tree grammar for r(L(G)). (In order that “r(t)” makes sense, we define r(B) = {B} for each nonterminal B of G.)

We are now in a position to study the connection between recognizable tree languages and sets of derivation trees of context-free grammars. We shall consider two kinds of derivation trees. First we define the “ordinary” kind of derivation tree (cf. Example 1.1).

Definition 3.49. Let G = (N, Σ, R, S) be a context-free grammar. Let ∆ be the ranked alphabet such that ∆0 = Σ ∪ {e} and, for k ≥ 1, ∆k is the set of nonterminals A ∈ N for which there is a rule A → w with |w| = k (in case k = 1: |w| = 1 or |w| = 0). For each α ∈ N ∪ Σ, the set of derivation trees with top α, denoted by D^α_G, is the tree language over ∆ defined recursively as follows:
(i) for each a in Σ, a ∈ D^a_G;
(ii) for each rule A → α1 · · · αn in R (n ≥ 1, A ∈ N, αi ∈ Σ ∪ N), if ti ∈ D^{αi}_G for 1 ≤ i ≤ n, then A[t1 · · · tn] ∈ D^A_G;
(iii) for each rule A → λ in R, A[e] ∈ D^A_G.

Definition 3.50. A tree language L is said to be local if, for some context-free grammar G = (N, Σ, R, S) and some set of symbols V ⊆ N ∪ Σ, L = ∪α∈V D^α_G.

Exercise 3.51. Show that each local tree language is recognizable.

Note that a local tree language is the set of all derivation trees of a context-free grammar which has a set of initial symbols (instead of one initial nonterminal). The reason for the name “local” is that such a tree language L is determined by (1) a finite set of trees of height one, (2) a finite set of “initial symbols”, (3) a finite set of “final symbols”, and the requirement that L consists of all trees t such that each node of t together with its direct descendants belongs to (1), the top label of t belongs to (2), and the leaf labels of t to (3). We now show that the class of local tree languages is properly included in RECOG. Theorem 3.52. There are recognizable tree languages which are not local. Proof. Let Σ0 = {a, b} and Σ2 = {S}. Consider the tree language L = {S[S[ba]S[ab]]}. Obviously L is recognizable. Suppose that L is local. Then there is a context-free


grammar G such that D^S_G = L. Thus S → SS, S → ba and S → ab are rules of G. But then S[S[ab]S[ba]] ∈ L. Contradiction.†

† Other examples are for instance {S[T [a]T [b]]} and {S[S[a]]}.

Note that the recognizable tree language L in the above proof can be recognized by a deterministic top-down fta. Note also that the tree language given in the proof of Theorem 3.14 is local. Hence the local tree languages and the tree languages recognized by det. top-down fta are incomparable.

Exercise 3.53. Find a recognizable tree language which is neither local nor recognizable by a det. top-down fta.

It is clear that, if Σ0 = {a, b} and Σ2 = {S1, S2, S3}, then L0 = {S1[S2[ba]S3[ab]]} is a local language. Hence the language L in Theorem 3.52 is the projection of the local language L0 (project S1, S2 and S3 on S). We will show that this is true in general: each recognizable tree language is the projection of a local tree language. In fact we shall show a slightly stronger fact. To do this we define the second type of derivation tree of a context-free grammar, called “rule tree”.

Definition 3.54. Let G = (N, Σ, R, S) be a context-free grammar. Let R̄ be any set of symbols in one-to-one correspondence with R, R̄ = {r̄ | r ∈ R}. Each element of R̄ is given a rank such that, if r in R is of the form A → w0A1w1A2w2 · · · Akwk (for some k ≥ 0, A1, . . . , Ak ∈ N and w0, w1, . . . , wk ∈ Σ∗), then r̄ ∈ R̄k. The set of rule trees of G, denoted by RT(G), is defined to be the tree language generated by the regular tree grammar Ḡ = (N, R̄, P, S), where P is defined by
(i) if r = (A → w0A1 · · · wk−1Akwk), k ≥ 1, is in R, then A → r̄[A1 · · · Ak] is in P;
(ii) if r = (A → w0) is in R, then A → r̄ is in P.

Definition 3.55. We shall say that a tree language L is a rule tree language if L = RT(G) for some context-free grammar G.

Thus, a rule tree is a derivation tree in which the nodes are labeled by the rules applied during the derivation. It should be obvious that for each context-free grammar G = (N, Σ, R, S) there is a one-to-one correspondence between the tree languages RT(G) and D^S_G.

Example 3.56. Consider Example 1.1. For each rule r in that example, let (r) stand for a new symbol. The rule tree “corresponding” to the derivation tree displayed in Example 1.1 is


(S → AD)[(A → AA)[(A → bAa)[(A → λ)] (A → aAb)[(A → λ)]] (D → Ddd)[(D → d)]]

Note that this tree is obtained from the other one by viewing the building blocks (trees of height one) of the local tree as the nodes of the rule tree. The following theorem shows the relationship of the rule tree languages to those defined before.

Theorem 3.57. The class of rule tree languages is properly included in the intersection of the class of local tree languages and the class of tree languages recognizable by a det. top-down fta.

Proof. We first show inclusion in the class of local tree languages. Let G = (N, Σ, R, S) be a context-free grammar, and R̄ = {r̄ | r ∈ R}. Consider the context-free grammar G1 = (R̄ − R̄0, R̄0, P, −) where P is defined as follows: if r = (A → w0A1w1 · · · Akwk), k ≥ 1, is in R, then r̄ → r̄1 · · · r̄k is in P for all rules r1, . . . , rk ∈ R such that the left hand side of ri is Ai (1 ≤ i ≤ k). Let V = {r̄ | r ∈ R has left hand side S}. Then RT(G) = ∪α∈V D^α_{G1}, and hence RT(G) is local.

To show that RT(G) can be recognized by a deterministic top-down fta, consider M = (Q, R̄, δ, q0, F), where Q = N ∪ {W}, q0 = S, for r̄ ∈ R̄0, F_r̄ consists of the left hand side of r only, and for r̄ ∈ R̄k, r of the form A → w0A1w1A2w2 · · · Akwk, δ_r̄(A) = (A1, . . . , Ak) and δ_r̄(B) = (W, . . . , W) for all other B ∈ Q. Then L(M) = RT(G).

To show proper inclusion, let H be the context-free grammar with rules S → SS, S → aS, S → Sb and S → ab. Then D^S_H is a local tree language. It is easy to see that D^S_H can be recognized by a det. top-down fta. Now suppose that D^S_H = RT(G) for some context-free grammar G. Since S has rank 2 and since the configuration S[SS] occurs in D^S_H, S is the name of a rule of G of the form A → w0Aw1Aw2. Now, since a and b are of rank 0 and since S[ab] is in D^S_H, a and b are names of rules A → w3 and A → w4. Hence S[ba] is a rule tree of G. Contradiction.


We now characterize the recognizable tree languages in terms of rule tree languages.

Theorem 3.58. Every recognizable tree language is the projection of a rule tree language.

Proof. Let G = (N, Σ, R, S) be a regular tree grammar in normal form. We shall define a regular tree grammar Ḡ and a projection p such that L(G) = p(L(Ḡ)) and L(Ḡ) is a rule tree language. Ḡ will simulate G, but Ḡ will put all information about the rules, applied during the derivation of a tree t, into the tree itself. This is a useful technique. Let R̄ be a set of symbols in one-to-one correspondence with R, and let Ḡ = (N, R̄, P, S). The ranking of R̄, the set P of rules and the projection p are defined simultaneously as follows:
(i) if r ∈ R is the rule A → a[B1 · · · Bk], then r̄ has rank k, A → r̄[B1 · · · Bk] is in P and pk(r̄) = a;
(ii) if r ∈ R is the rule A → a, then r̄ has rank 0, A → r̄ is in P and p0(r̄) = a.
It is obvious that p(L(Ḡ)) = L(G). Now note that G may be viewed as a context-free grammar (over Σ ∪ {[ , ]}). In fact, Ḡ is the same as the one constructed in Definition 3.54! Thus L(Ḡ) is a rule tree language.

Since RECOG is closed under projections (Theorem 3.48), we now easily obtain the following corollary.

Corollary 3.59. For each tree language L the following four statements are equivalent:
(i) L is recognizable
(ii) L is the projection of a rule tree language
(iii) L is the projection of a local tree language
(iv) L is the projection of a tree language recognizable by a det. top-down fta.

Exercise 3.60. Show that, in the case of local tree languages, the projection involved in the above corollary (iii) can be taken as the identity on symbols of rank 0 (thus the yields are preserved). As a final operation on trees we consider the notion of tree homomorphism. For strings, a homomorphism h associates a string h(a) with each symbol a of the alphabet, and transforms a string a1 a2 · · · an into the string h(a1 ) · h(a2 ) · · · h(an ). Generalizing this to trees, a tree homomorphism h associates a tree h(a) with each symbol a of the ranked alphabet (actually, one tree for each rank). The application of h to a tree t consists in replacing each symbol a of t by the tree h(a), and tree concatenating all the resulting trees. Note that, if a is of rank k, then h(a) should be tree concatenated with k other trees; therefore, since tree concatenation happens at symbols of rank 0, the tree h(a) should contain at least k different symbols of rank 0. Since, in general, the number of symbols of rank 0 in some alphabet may be less than the rank of some other symbol, we allow for the use of an arbitrary number of auxiliary symbols of rank 0, called “variables” (recall the use of nonterminals as auxiliary symbols of rank 0 in Theorem 3.43).


Definition 3.61. Let x1, x2, x3, . . . be an infinite sequence of different symbols, called variables. Let X = {x1, x2, x3, . . . }, for k ≥ 1, Xk = {x1, x2, . . . , xk}, and X0 = ∅. Elements of X will also be denoted by x, y and z.

Definition 3.62. Let Σ and ∆ be ranked alphabets. A tree homomorphism h is a family {hk}k≥0 of mappings hk : Σk → T∆(Xk). A tree homomorphism determines a mapping h : TΣ → T∆ as follows:
(i) for a ∈ Σ0, h(a) = h0(a);
(ii) for k ≥ 1, a ∈ Σk and t1, . . . , tk ∈ TΣ, h(a[t1 · · · tk]) = hk(a)⟨x1 ← h(t1), . . . , xk ← h(tk)⟩.
In the particular case that, for each a ∈ Σk, hk(a) does not contain two occurrences of the same xi (i = 1, 2, 3, . . . ), h is called a linear tree homomorphism.

A general tree homomorphism h has the abilities of deleting (hk(a) does not contain xi), copying (hk(a) contains ≥ 2 occurrences of xi) and permuting (if i < j, then xj may occur before xi in hk(a)) subtrees. Moreover, at each node, it can add pieces of tree (the frontier of hk(a) need not be an element of X∗). A linear homomorphism cannot copy. Note that, to obtain the proper generalization of the monadic case, one should also forbid deletion, and require that h0(a) = a for all a ∈ Σ0 (moreover, no pieces of tree should be added).

Exercise 3.63. Let Σ0 = {a, b} and Σ2 = {p}. Consider the tree homomorphism h such that h0(a) = a, h0(b) = b and h2(p) = p[x2x1]. Show that, for every t in TΣ, yield(h(t)) is the mirror image of yield(t).

It is easy to see that recognizable tree languages are not closed under arbitrary tree homomorphisms; they are closed under linear tree homomorphisms. This is shown in the following two theorems.

Theorem 3.64. RECOG is not closed under arbitrary tree homomorphisms.

Proof. Let Σ0 = {a} and Σ1 = {b}. Let h be the tree homomorphism defined by h0(a) = a and h1(b) = b[x1x1]. Consider the recognizable tree language TΣ. It is easy to prove that yield(h(TΣ)) = {a^(2^n) | n ≥ 0}. Since {a^(2^n) | n ≥ 0} is not a context-free language, Theorem 3.28 implies that h(TΣ) is not recognizable.

Theorem 3.65. RECOG is closed under linear tree homomorphisms.

Proof. The idea of the proof is obvious. Given some regular tree grammar generating a recognizable tree language, we change the right hand sides of all rules into their homomorphic images. The resulting grammar generates homomorphic images of “sentential forms” of the original grammar (note that this wouldn’t work in the nonlinear case). The only thing we should worry about is that the homomorphism may be deleting. In that case superfluous rules in the original grammar might be transformed into useful rules


in the new grammar. This is solved by requiring that the original grammar does not contain any superfluous rule. The formal construction is as follows.

Let G = (N, Σ, R, S) be a regular tree grammar in normal form, such that for each nonterminal A there is at least one t ∈ TΣ such that A ⇒*_G t (since G is a context-free grammar, it is well known that each regular tree grammar has an equivalent one satisfying this condition). Let h be a homomorphism from TΣ into T∆, for some ranked alphabet ∆. Extend h to trees in TΣ(N) by defining h0(A) = A for all A in N. Thus h is now a homomorphism from TΣ(N) into T∆(N). Construct the regular tree grammar H = (N, ∆, R′, S), where R′ = {A → h(t) | A → t is in R}. To show that L(H) = h(L(G)) we shall prove that

(1) if A ⇒*_G t, then A ⇒*_H h(t)  (t ∈ TΣ); and

(2) if A ⇒*_H s, then there exists t such that h(t) = s and A ⇒*_G t  (s ∈ T∆, t ∈ TΣ).

Let us give the straightforward proofs as detailed as possible.

(1) The proof is by induction on the number of steps in the derivation A ⇒*_G t. If, in one step, A ⇒_G t, then t ∈ Σ0 and A → t is in R. Hence A → h(t) is in R′, and so A ⇒*_H h(t). Now suppose that the first step in A ⇒*_G t results from the application of a rule of the form A → a[B1 · · · Bk]. Then A ⇒*_G t is of the form A ⇒_G a[B1 · · · Bk] ⇒*_G t. It follows that t is of the form a[t1 · · · tk] such that Bi ⇒*_G ti for all 1 ≤ i ≤ k. Hence, by induction, Bi ⇒*_H h(ti). Now, since the rule A → hk(a)⟨x1 ← B1, . . . , xk ← Bk⟩ is in R′ by definition, we have (prove this!)

A ⇒_H hk(a)⟨x1 ← B1, . . . , xk ← Bk⟩ ⇒*_H hk(a)⟨x1 ← h(t1), . . . , xk ← h(tk)⟩ = h(a[t1 · · · tk]) = h(t).

(2) The proof is by induction on the number of steps in A ⇒*_H s. For zero steps the statement is trivially true. Suppose that the first step in A ⇒*_H s results from the application of a rule A → h0(a) for some a in Σ0. Then h(a) = s and A ⇒*_G a. Now suppose that the first step results from the application of a rule A → hk(a)⟨x1 ← B1, . . . , xk ← Bk⟩, where A → a[B1 · · · Bk] is a rule of G. Then the derivation is A ⇒_H hk(a)⟨x1 ← B1, . . . , xk ← Bk⟩ ⇒*_H s. At this point we need both linearity of h (to be sure that each Bi in hk(a)⟨x1 ← B1, . . . , xk ← Bk⟩ produces at most one subtree of s) and the condition on G (to deal with deletion: since hk(a)⟨x1 ← B1, . . . , xk ← Bk⟩ need not contain an occurrence of Bi, we need an arbitrary tree generated, in G, by Bi to be able to construct the tree t such that h(t) = s). There exist trees s1, . . . , sk in T∆ such that s = hk(a)⟨x1 ← s1, . . . , xk ← sk⟩ and

(i) if xi occurs in hk(a), then Bi ⇒*_H si;

(ii) if xi does not occur in hk(a), then si = h(ti) for some arbitrary ti such that Bi ⇒*_G ti.

Hence, by induction and (ii), there are trees t1, . . . , tk such that h(ti) = si and Bi ⇒*_G ti for all 1 ≤ i ≤ k. Consequently, if t = a[t1 · · · tk], then A ⇒_G a[B1 · · · Bk] ⇒*_G a[t1 · · · tk] = t, and h(t) = hk(a)⟨x1 ← h(t1), . . . , xk ← h(tk)⟩ = s. This proves the theorem.
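The definitions above can be made concrete with a small executable sketch (my own encoding, not from the notes): a tree is a nested tuple (label, subtree1, . . . , subtreek), a variable xi is the plain integer i, and a homomorphism is a dictionary from symbols to right-hand-side patterns, as in Definition 3.62. The final assertion checks Exercise 3.63.

```python
def substitute(pattern, args):
    """Replace variable i (a plain int) in `pattern` by args[i-1]."""
    if isinstance(pattern, int):
        return args[pattern - 1]
    return (pattern[0],) + tuple(substitute(p, args) for p in pattern[1:])

def hom(h, t):
    """Apply the tree homomorphism h (dict: symbol -> pattern) to tree t."""
    return substitute(h[t[0]], [hom(h, s) for s in t[1:]])

def yield_(t):
    """The string of rank-0 labels at the frontier, read left to right."""
    return t[0] if len(t) == 1 else ''.join(yield_(s) for s in t[1:])

# Exercise 3.63: h0(a) = a, h0(b) = b, h2(p) = p[x2 x1] mirrors the yield.
h = {'a': ('a',), 'b': ('b',), 'p': ('p', 2, 1)}
t = ('p', ('a',), ('p', ('a',), ('b',)))       # yield(t) = "aab"
assert yield_(hom(h, t)) == yield_(t)[::-1]    # mirror image: "baa"
```

A nonlinear homomorphism is obtained by letting a variable occur twice in a pattern, e.g. ('b', 1, 1) for Theorem 3.64.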

Exercise 3.66. In the string case one can also prove that the regular languages are closed under homomorphisms by using the Kleene characterization theorem. Give an alternative proof of Theorem 3.65 by using Theorem 3.43 (use the fact, which is implicit in the proof of that theorem, that each regular tree language over the ranked alphabet Σ can be built up from finite tree languages using operations ∪, ·A and ∗A , where A ∈ / Σ). As an indication how one could use theorems like Theorem 3.65, we prove the following theorem, which is (slightly!) stronger than Theorem 3.28. Theorem 3.67. Each context-free language over ∆ is the yield of a recognizable tree language over Σ, where Σ0 = ∆ ∪ {e} and Σ2 = {∗}. Proof. Let L be a context-free language over ∆. By Theorem 3.28, there is a recognizable tree language U over some ranked alphabet Ω with Ω0 = ∆ ∪ {e}, such that yield(U ) = L. Let h be the linear tree homomorphism from TΩ into TΣ such that h0 (a) = a for all a in ∆ ∪ {e}, h1 (a) = x1 for all a in Ω1 , and hk (a) = ∗[x1 ∗ [x2 ∗ [· · · ∗ [xk−1 xk ] · · · ]]] for all a in Ωk , k ≥ 2. By Theorem 3.65, h(U ) is a recognizable tree language over Σ. It is easy to show that, for each t in TΩ , yield(h(t)) = yield(t). Hence yield(h(U )) = yield(U ) = L. Note that Theorem 3.67 is “equivalent” to the fact that each context-free language can be generated by a context-free grammar in Chomsky normal form. Exercise 3.68. Try to show that RECOG is closed under inverse (not necessarily linear) homomorphisms; that is, if L ∈ RECOG and h is a tree homomorphism, then h−1 (L) = {t | h(t) ∈ L} is recognizable. (Represent L by a det. bottom-up fta). We have now discussed all AFL operations (see [Sal, IV]) generalized to trees: union, tree concatenation, tree concatenation closure, (linear) tree homomorphism, inverse tree homomorphism and intersection with a recognizable tree language. Thus, according to previous results, RECOG is a “tree AFL”. Exercise 3.69. 
Generalize the operation of string substitution (see Definition 2.29) to trees, and show that RECOG is closed under “linear tree substitution”.
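The linear homomorphism used in the proof of Theorem 3.67 can also be sketched directly (my own code; the tuple encoding of trees and the function names are assumptions, not from the notes): symbols of rank ≥ 2 become right-nested combs of the binary symbol ∗, unary symbols are deleted (h1(a) = x1), and the yield is preserved.

```python
def binarize(t):
    """Map a tree over any ranked alphabet to a tree over '*' plus the
    rank-0 symbols, preserving the yield (cf. Theorem 3.67)."""
    k = len(t) - 1                      # rank of the root symbol
    if k == 0:
        return t
    if k == 1:
        return binarize(t[1])           # h1(a) = x1: erase the unary node
    subtrees = [binarize(s) for s in t[1:]]
    comb = subtrees[-1]
    for s in reversed(subtrees[:-1]):   # *[x1 *[x2 ... *[x_{k-1} x_k] ... ]]
        comb = ('*', s, comb)
    return comb

def yield_(t):
    return t[0] if len(t) == 1 else ''.join(yield_(s) for s in t[1:])

t = ('f', ('a',), ('g', ('b',)), ('h', ('c',), ('d',), ('e',)))
assert yield_(binarize(t)) == yield_(t) == 'abcde'
```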


Exercise 3.70. Suppose you don't know about context-free grammars. Consider the notion of regular tree grammar. Give a recursive definition of the relation ⇒ for such a grammar. Show that, if a[t1 · · · tk] ⇒* s, then there are s1, . . . , sk such that s = a[s1 · · · sk] and ti ⇒* si for all 1 ≤ i ≤ k. Which of the two definitions of regular tree grammar do you prefer?

3.3 Decidability

Obviously the membership problem for recognizable tree languages is solvable: given a tree t and an fta M, just feed t into M and see whether t is recognized or not. We now want to prove that the emptiness and finiteness problems for recognizable tree languages are solvable. To do this we generalize the pumping lemma for regular string languages to recognizable tree languages: for each regular string language L there is an integer p such that for all strings z in L, if |z| ≥ p, then there are strings u, v and w such that z = uvw, |vw| ≤ p, |v| ≥ 1 and, for all n ≥ 0, u v^n w ∈ L.

Theorem 3.71. Let Σ be a ranked alphabet, and x a symbol not in Σ. For each recognizable tree language L over Σ we can find an integer p such that for all trees t in L, if height(t) ≥ p, then there are trees u, v, w ∈ TΣ({x}) such that

(i) u and v contain exactly one occurrence of x, and w ∈ TΣ;
(ii) t = u ·x v ·x w;
(iii) height(v ·x w) ≤ p;
(iv) height(v) ≥ 1, and
(v) for all n ≥ 0, u ·x v^n_x ·x w ∈ L, where v^n_x = v ·x v ·x · · · ·x v (n times).

Proof. Let M = (Q, Σ, δ, s, F) be a deterministic bottom-up fta recognizing L. Let p be the number of states of M. Consider a tree t ∈ L(M) such that height(t) ≥ p. Considering some path of maximal length through t, it is clear that there are trees t1, t2, . . . , tn ∈ TΣ({x}) such that n ≥ p + 1, t = tn ·x tn−1 ·x · · · ·x t2 ·x t1, the trees t2, . . . , tn contain exactly one occurrence of x and have height ≥ 1, and t1 ∈ TΣ (this is a "linearization" of t according to some path). Now consider the states qi = δ̂(ti ·x · · · ·x t1) for 1 ≤ i ≤ n. Then, among q1, . . . , qp+1 there are two equal states: there are i, j such that qi = qj and 1 ≤ i < j ≤ p + 1. Let u = tn ·x · · · ·x tj+1, v = tj ·x · · · ·x ti+1 and w = ti ·x · · · ·x t1. Then requirements (i)-(iv) in the statement of the theorem are obviously satisfied. Furthermore, in general, if δ̂(s1) = δ̂(s2), then δ̂(s ·x s1) = δ̂(s ·x s2). Hence, since δ̂(v ·x w) = δ̂(w), requirement (v) is also satisfied.

As a corollary to Theorem 3.71 we obtain the pumping lemma for context-free languages.

Corollary 3.72. For each context-free language L over ∆ we can find an integer q such that for all strings z ∈ L, if |z| ≥ q, then there are strings u1, v1, w0, v2 and u2 in ∆∗, such that z = u1 v1 w0 v2 u2, |v1 w0 v2| ≤ q, |v1 v2| > 0, and, for all n ≥ 0, u1 v1^n w0 v2^n u2 ∈ L.


Proof. By Theorem 3.67, and the fact that each context-free language can be generated by a λ-free context-free grammar, there is a recognizable tree language U over Σ such that yield(U) = L, where Σ0 = ∆ and Σ2 = {∗}. Let p be the integer corresponding to U according to Theorem 3.71, and put q = 2^p. Obviously, if z ∈ L and |z| ≥ q, then there is a t in U such that yield(t) = z and height(t) ≥ p. Then, by Theorem 3.71 there are trees u, v and w such that (i)-(v) in that theorem hold. Thus t = u ·x v ·x w. Let yield(u) = u1 x u2, yield(v) = v1 x v2 and yield(w) = w0 (see (i)). Then z = yield(t) = yield(u ·x v ·x w) = yield(u) ·x yield(v) ·x yield(w) = u1 v1 w0 v2 u2. It is easy to see that all other requirements stated in the corollary are also satisfied.

Exercise 3.73. Let Σ be a ranked alphabet such that Σ0 = {a, b}. Show that the tree language {t ∈ TΣ | yield(t) has an equal number of a's and b's} is not recognizable.

From the pumping lemma the decidability of both the emptiness and the finiteness problem for RECOG follows.

Theorem 3.74. The emptiness problem for recognizable tree languages is decidable.

Proof. Let L be a recognizable tree language, and let p be the integer of Theorem 3.71. Obviously (using n = 0 in point (v)), L is nonempty if and only if L contains a tree of height < p.

Theorem 3.75. The finiteness problem for recognizable tree languages is decidable.

Proof. Let L be a recognizable tree language, and let p be the integer of Theorem 3.71. Obviously, L is finite if and only if all trees in L are of height < p. Thus, L is finite iff L ∩ {t | height(t) ≥ p} = ∅. Since L ∩ {t | height(t) ≥ p} is recognizable (Exercise 3.26 and Theorem 3.32), this is decidable by the previous theorem.

Note that the decidability of the emptiness and finiteness problems for context-free languages follows from these two theorems together with the "yield-theorem" (with e ∉ Σ0, Σ1 = ∅).
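In practice one does not search all trees of height < p; an equivalent effective procedure (my own sketch, not the argument used above) computes the set of states of a deterministic bottom-up fta that are reachable by some input tree, as a least fixpoint, and checks whether a final state is among them.

```python
# Decide nonemptiness of L(M) for a deterministic bottom-up fta M.
# `delta` maps (symbol, tuple-of-child-states) -> state; the encoding and
# names are my own, not from the notes.

def nonempty(delta, final):
    reachable = set()
    changed = True
    while changed:
        changed = False
        for (sym, qs), q in delta.items():
            # a transition fires once all its child states are reachable;
            # rank-0 symbols have qs == () and fire immediately
            if q not in reachable and all(p in reachable for p in qs):
                reachable.add(q)
                changed = True
    return bool(reachable & set(final))

# Hypothetical automaton over Sigma0 = {'a'}, Sigma2 = {'f'} whose state is
# the parity of the number of a's at the frontier.
delta = {('a', ()): 1}
for i in (0, 1):
    for j in (0, 1):
        delta[('f', (i, j))] = (i + j) % 2
assert nonempty(delta, final={0}) is True    # e.g. f[a a] has two a's
assert nonempty(delta, final=set()) is False
```

The loop runs at most |Q| times, so this also respects the height < p bound of Theorem 3.74.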
As in the string case we now obtain the decidability of inclusion of recognizable tree languages (and hence of equality).

Theorem 3.76. It is decidable, for arbitrary recognizable tree languages U and V, whether U ⊆ V (and also, whether U = V).

Proof. Since U is included in V iff the intersection of U with the complement of V is empty, the theorem follows from Theorems 3.32 and 3.74.

Note again that each regular tree language is a special kind of context-free language. Note also that inclusion of context-free languages is not decidable (nor is equality). Therefore it is nice that we have found a subclass of CFL for which inclusion and equality are decidable. Note also that CFL is not closed under intersection, but RECOG is. We shall now relate these facts to some results in the literature concerning "parenthesis languages" and "structural equivalence" of context-free grammars (see [Sal, VIII.3]).


Definition 3.77. A parenthesis grammar is a context-free grammar G = (N, Σ ∪ {[ , ]}, R, S) such that each rule in R is of the form A → [w] with A ∈ N and w ∈ (Σ ∪ N)∗. The language generated by G is called a parenthesis language.

To relate parenthesis languages to recognizable tree languages, let us restrict attention to ranked alphabets ∆ such that, for k ≥ 1, if ∆k ≠ ∅, then ∆k = {∗}, where ∗ is a fixed symbol. Suppose that in our recursive definition of "tree" we change a[t1 · · · tk] into [a t1 · · · tk ] (see Definition 2.5 and Remark 2.35). Then, obviously, all our results about the class RECOG are still valid. Furthermore, since ∗ is the only symbol of rank ≥ 1, we may as well replace [∗ by [. In this way, each parenthesis language is in RECOG (in fact, each parenthesis grammar is a regular tree grammar). It is also easy to see that, if L is a recognizable tree language (over a restricted ranked alphabet ∆), then L − ∆0 is a parenthesis language. From these considerations we obtain the following theorem.

Theorem 3.78. The class of parenthesis languages is closed under union, intersection and subtraction. The inclusion problem for parenthesis languages is decidable.

Proof. The first statement follows directly from Theorem 3.32 and the last remark above. The second statement follows directly from Theorem 3.76.

A paraphrase of this theorem is obtained as follows.

Definition 3.79. For any ranked alphabet, let p be the projection such that p(a) = a for all symbols of rank 0, and p(a) = ∗ for all symbols of rank ≥ 1. Let G be a context-free grammar. The bare tree language of G, denoted by BT(G), is p(D_G^S), where S is the initial nonterminal of G. We say that two context-free grammars G1 and G2 are structurally equivalent iff they generate the same bare tree language (i.e., BT(G1) = BT(G2)).

Thus, G1 and G2 are structurally equivalent if their sets of derivation trees are the same after "erasing" all nonterminals.

Theorem 3.80. It is decidable for arbitrary context-free grammars whether they are structurally equivalent.

Proof. For any context-free grammar G = (N, Σ, R, S), let [G] be the parenthesis grammar (N, Σ ∪ {[ , ]}, R′, S), where R′ = {A → [w] | A → w is in R}. Obviously, L([G]) = BT(G). Hence, by Theorem 3.78, the theorem holds.

Exercise 3.81. Show that, for any two context-free grammars G1 and G2 there exists a context-free grammar G3 such that BT(G3) = BT(G1) ∩ BT(G2).

Exercise 3.82. Show that each context-free grammar has a structurally equivalent context-free grammar that is invertible (cf. Exercise 3.29).

Exercise 3.83.
Consider the “bracketed context-free languages” of Ginsburg and Harrison, and show that some of their results follow easily from results about RECOG (show first that each recognizable tree language is a deterministic context-free language).


Exercise 3.84. Investigate whether it is decidable for an arbitrary recognizable tree language R (i) whether R is local; (ii) whether R is a rule tree language; (iii) whether R is recognizable by a det. top-down fta.

4 Finite state tree transformations 4.1 Introduction: Tree transducers and semantics In this part we will be concerned with the notion of a tree transducer: a machine that takes a tree as input and produces another tree as output. In all generality we may view a tree transducer as a device that gives meaning to structured objects (i.e., a semantics defining device). Let us try to indicate this aspect of tree transducers. Consider a ranked alphabet Σ. The elements of Σ may be viewed as “operators”, i.e., symbols denoting operations (functions of several arguments). The rank of an operator stands for the number of arguments of the operation (note therefore that one operator may denote several operations). The operators of rank 0 have no arguments: they are (denote) constants. As an example, the ranked alphabet Σ with Σ0 = {e, a, b} and Σ2 = {f } may be viewed as consisting of three constants e, a and b and one binary operator f . From operators we may form “terms” or “expressions”, like for instance f (a, f (e, b)), or perhaps, denoting f by ∗, (a ∗ (e ∗ b)). Obviously the terms are in one-to-one correspondence with the set TΣ of trees over Σ . Thus the notions tree and term may be identified. Intuitively, terms denote structured objects, obtained by applying the operations to the constants. Formally, meaning is given to operators and terms by way of an “interpretation”. An interpretation of Σ consists of a “domain” B, for each element a ∈ Σ0 an element h0 (a) of B, and for each k ≥ 1 and operator a ∈ Σk an operation hk (a) : B k → B. An interpretation of Σ is also called a “Σ-algebra” or “algebra of type Σ”. An interpretation (B, {h0 (a)}a∈Σ0 , {hk (a)}a∈Σk ) determines a mapping h : TΣ → B (giving an interpretation to each term as an element of B) as follows: (i) for a ∈ Σ0 , h(a) = h0 (a); (ii) for k ≥ 1 and a ∈ Σk , h(a[t1 · · · tk ]) = hk (a)(h(t1 ), . . . , h(tk )). (Such a mapping is also called a “homomorphism” from TΣ into B). 
Thus the meaning of a tree is uniquely determined by the meaning of its subtrees and the interpretation of the operator applied to these subtrees. In general we can say that the meaning of a structured object is a function of the meanings of its substructures, the function being determined by the way the object is constructed from its substructures. As an example, an interpretation of the above-mentioned ranked alphabet Σ = {e, a, b, f } might for instance consist of a group B with unity h0 (e), multiplication h2 (f ) and two specific elements h0 (a) and h0 (b). Or it might consist of B = {a, b}∗ ,


h0(e) = λ, h0(a) = a, h0(b) = b and h2(f) is concatenation. Note that in this case the mapping h : TΣ → B is the yield!

It is now easy to see that a deterministic bottom-up fta with input alphabet Σ is nothing else but a Σ-algebra with a finite domain (its set of states). Such an automaton may therefore be used as a semantic defining device in case there are only a finite number of possible semantical values. Obviously, in general, one needs an infinite number of semantical values. However, it is not attractive to consider arbitrary infinite domains B since this provides us with no knowledge about the structure of the elements of B. We therefore assume that the elements of B are structured objects: trees (or interpretations of them). Thus we consider Σ-algebras with domain T∆ for some ranked alphabet ∆. Our complete semantics of TΣ may then consist of two parts: an interpretation of TΣ into T∆ and an interpretation of T∆ in some ∆-algebra. The interpretation of TΣ into T∆ may be realized by a tree transducer.

An example of an interpretation of TΣ into T∆ is the tree homomorphism of Definition 3.62. In fact each tree s ∈ T∆(Xk) may be viewed as an operation s̃ : T∆^k → T∆, defined by s̃(s1, . . . , sk) = s⟨x1 ← s1, . . . , xk ← sk⟩. A tree homomorphism is then the same thing as an interpretation of Σ with domain T∆, where the allowable interpretations of the elements of Σ are the mappings s̃ above. Note that these interpretations are very natural, since the interpretation of a tree is obtained by "applying a finite number of ∆-operators to the interpretations of its subtrees".

To show the relevance of tree homomorphisms (and therefore tree transducers in general) to the semantics of context-free languages we consider the following very simple example.

Example 4.1. Consider a context-free grammar generating expressions by the rules E → E + T, T → T ∗ F, E → a, T → a, F → a and F → (E).
Suppose we want to translate each expression into the equivalent post-fix expression. To do this we consider the rule tree language corresponding to this grammar and apply to the rule trees in this language the tree homomorphism h defined by h2(E → E + T) = E[x1 x2 +], h2(T → T ∗ F) = T[x1 x2 ∗], h0(E → a) = E[a], h0(T → a) = T[a], h0(F → a) = F[a] and h1(F → (E)) = F[x1]. Then the rule tree corresponding to an expression is translated into a tree whose yield is the corresponding post-fix expression. For instance, the rule tree

(E → E + T)[ (E → a) (T → T ∗ F)[ (T → a) (F → a) ] ]

goes into

E[ E[a] T[ T[a] F[a] ∗ ] + ],

so that a + a ∗ a is translated into aaa ∗ +. Note that, moreover, the transformed tree is the derivation tree of the post-fix expression in the context-free grammar with the rules E → ET +, T → T F ∗, E → a, T → a, F → a and F → E. Instead of interpreting this


derivation tree as its yield (the post-fix expression), one might also interpret it as, for instance, a sequence of machine code instructions, like "load a; load a; load a; multiply; add". It is not difficult to see that the syntax-directed translation schemes of [A&U, I.3] correspond in some way to linear, nondeleting homomorphisms working on rule tree languages.

To arrive at our general notion of tree transducer, we combine the finite tree automaton and the tree homomorphism into a "tree homomorphism with states" or a "finite tree automaton with output". This tree transducer will no longer be an interpretation of TΣ into T∆, but involves a generalization of this concept (although, by replacing T∆ by some other set, it can again be formulated as an interpretation of TΣ). Two ideas occur in this generalization.

Idea 4.2. The translation (meaning) of a tree may depend not only on the translation of its subtrees but also on certain properties of these subtrees. Assuming that these properties are recognizable (that is, the set of all trees having the property is in RECOG), they may be represented as states of a (deterministic) bottom-up fta. Thus we can combine the deterministic bottom-up fta and the tree homomorphism by associating to each symbol a ∈ Σk a mapping fa : Q^k → Q × T∆(Xk). This fa may be split up into two mappings δa : Q^k → Q and ha : Q^k → T∆(Xk). The δ-functions determine a mapping δ̂ : TΣ → Q, as for the bottom-up fta, and the h-functions determine an output-mapping ĥ : TΣ → T∆ by the formula (cf. the corresponding tree homomorphism formula):

ĥ(a[t1 · · · tk]) = ha(δ̂(t1), . . . , δ̂(tk))⟨x1 ← ĥ(t1), . . . , xk ← ĥ(tk)⟩.

Thus our tree transducer works through the tree in a bottom-up fashion just like the bottom-up fta, but at each step it produces output by combining the output trees, already obtained from the subtrees, into one new output tree.
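The formula of Idea 4.2 can be sketched as a short program (my own encoding, not from the notes): trees are nested tuples, variables are integers, and a total deterministic transducer is the pair of tables δ and h indexed by (symbol, tuple of child states).

```python
def substitute(pattern, args):
    """Replace variable i (a plain int) in `pattern` by args[i-1]."""
    if isinstance(pattern, int):
        return args[pattern - 1]
    return (pattern[0],) + tuple(substitute(p, args) for p in pattern[1:])

def run(delta, h, t):
    """Return (delta-hat(t), h-hat(t)), computed bottom-up as in Idea 4.2."""
    results = [run(delta, h, s) for s in t[1:]]
    qs = tuple(q for q, _ in results)
    outs = [o for _, o in results]
    return delta[(t[0], qs)], substitute(h[(t[0], qs)], outs)

# Hypothetical transducer: the state is the parity of the frontier length;
# each binary node 'f' is relabeled 'm' if its left subtree had an odd
# frontier, and 'n' otherwise.
delta = {('a', ()): 1}
h = {('a', ()): ('a',)}
for i in (0, 1):
    for j in (0, 1):
        delta[('f', (i, j))] = (i + j) % 2
        h[('f', (i, j))] = ('m' if i else 'n', 1, 2)
q, out = run(delta, h, ('f', ('a',), ('f', ('a',), ('a',))))
assert (q, out) == (1, ('m', ('a',), ('m', ('a',), ('a',))))
```

Note how the output at a node depends on the states of the subtrees, which a plain tree homomorphism cannot express.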
Note that, if we allow our bottom-up tree transducer to be nondeterministic, then the above formula for ĥ is intuitively wrong (we need "deterministic substitution").

Idea 4.3. To obtain the translation of the input tree one may need several different translations of each subtree. Suppose that one needs m different kinds of translation of each tree (where one of them is the "main meaning" and the others are "auxiliary meanings"), then these may be realized by m states of the transducer, say q1, . . . , qm. The ith translation may then be specified by associating to each a ∈ Σk a tree hqi(a) ∈ T∆(Ym,k), where Ym,k = {yi,j | 1 ≤ i ≤ m, 1 ≤ j ≤ k}. The ith translation of a tree a[t1 · · · tk] may then be defined by the formula

ĥqi(a[t1 · · · tk]) = hqi(a)⟨yr,s ← ĥqr(ts)⟩ (1 ≤ r ≤ m, 1 ≤ s ≤ k).

Thus the ith translation of a tree is expressed in terms of all possible translations of its subtrees. Realizing such a translation in a bottom-up fashion would mean that we should compute all m possible translations of each tree in parallel, whereas working in a top-down way we know exactly from hqi(a) which translations of which subtrees are


needed (note that, in general, not all elements of Ym,k appear in hqi(a)). Therefore, such a translation seems to be realized best by a top-down tree transducer. We note that the generalized syntax-directed translation scheme of [A&U, II.9.3] corresponds to such a top-down tree transducer working on a rule tree language.

As already indicated in Example 4.1, tree transducers are of interest for the translation of context-free languages (in particular the context-free part of a programming language). For this reason we often restrict the tree transducer to a rule tree language, a local tree language or a recognizable tree language (the difference being slight: a projection). This restriction is also of interest from a linguistic point of view: a natural language may be described by a context-free set of kernel sentences to which transformations may be applied, working on the derivation trees (as for instance the transformation active → passive). The language then consists of all transformations of kernel sentences. We note that if derivation tree d1 of sentence s1 is transformed into tree d2 with yield s2, then the sentence s2 is said to have "deep structure" d1 and "surface structure" d2.

4.2 Top-down and bottom-up finite tree transducers Since tree transducers define tree transformations (recall Definition 2.10), we start by recalling some terminology concerning relations. We note first that, for ranked alphabets Σ and ∆, we shall identify any mapping f : TΣ → T∆ with the tree transformation {(s, t) | f (s) = t}, and we shall identify any mapping f : TΣ → P(T∆ ) with the tree transformation {(s, t) | t ∈ f (s)}. Definition 4.4. Let Σ, ∆ and Ω be ranked alphabets. If M1 ⊆ TΣ ×T∆ and M2 ⊆ T∆ ×TΩ , then the composition of M1 and M2 , denoted by M1 ◦ M2 , is the tree transformation {(s, t) ∈ TΣ × TΩ | (s, u) ∈ M1 and (u, t) ∈ M2 for some u ∈ T∆ }. If F and G are classes of tree transformations, then F ◦ G denotes the class {M1 ◦ M2 | M1 ∈ F and M2 ∈ G}. Definition 4.5. Let M be a tree transformation from TΣ into T∆ . The inverse of M , denoted by M −1 , is the tree transformation {(t, s) ∈ T∆ × TΣ | (s, t) ∈ M }. Definition 4.6. Let M be a tree transformation and L a tree language. The image of L under M , denoted by M (L), is the tree language M (L) = {t | (s, t) ∈ M for some s in L}. If M is a tree transformation from TΣ into T∆ , then the domain of M , denoted by dom(M ), is M −1 (T∆ ), and the range of M , denoted by range(M ), is M (TΣ ). In Part (3) we already considered certain simple tree transformations: relabelings and tree homomorphisms. Notation 4.7. We shall use REL to denote the class of all relabelings, HOM to denote the class of all tree homomorphisms, and LHOM to denote the class of linear tree homomorphisms.


Moreover we want to view each finite tree automaton as a simple "checking" tree transducer, which, given some input tree, produces the same tree as output if it belongs to the tree language recognized by the fta, and produces no output if not.

Definition 4.8. Let Σ be a ranked alphabet. A tree transformation R ⊆ TΣ × TΣ is called a finite tree automaton restriction if there is a recognizable tree language L such that R = {(t, t) | t ∈ L}. If M is an fta, then we shall denote the finite tree automaton restriction {(t, t) | t ∈ L(M)} by T(M). We shall use FTA to denote the class of all finite tree automaton restrictions.

Exercise 4.9. Prove that the classes of tree transformations REL, HOM and FTA are each closed under composition. Show that REL and FTA are also closed under inverse.

Before defining tree transducers we first discuss a very general notion of tree rewriting system that can be used to define both tree transducers and tree grammars. The reason to introduce these tree rewriting systems is that recursive definitions like those for the finite tree automata and tree homomorphisms tend to become very cumbersome when used for tree transducers, whereas rewriting systems are more "machine-like" and therefore easier to visualize. To arrive at the notion of tree rewriting system we first generalize the notion of string rewriting system to allow for the use of rule "schemes". Recall the set X of variables from Definition 3.61.

Definition 4.10. A rewriting system with variables is a pair G = (∆, R), where ∆ is an alphabet and R a finite set of "rule schemes". A rule scheme is a triple (v, w, D) such that, for some k ≥ 0, v and w are strings over ∆ ∪ Xk and D is a mapping from Xk into P(∆∗). Whenever D is understood, (v, w, D) is denoted by v → w. For 1 ≤ i ≤ k, the language D(xi) is called the range (or domain) of the variable xi.

A relation ⇒_G on ∆∗ is defined as follows. For strings s, t ∈ ∆∗, s ⇒_G t if and only if there exists a rule scheme (v, w, D) in R, strings φ1, . . . , φk in D(x1), . . . , D(xk) respectively (where Xk is the domain of D), and strings α and β in ∆∗ such that

s = α · v⟨x1 ← φ1, . . . , xk ← φk⟩ · β  and  t = α · w⟨x1 ← φ1, . . . , xk ← φk⟩ · β.

As usual ⇒*_G denotes the transitive-reflexive closure of ⇒_G.

For convenience we shall, in what follows, use the word “rule” rather than “rule scheme”. Of course, in a rewriting system with variables, the ranges of the variables should be specified in some effective way (note that we would like the relation ⇒ to be decidable). In what follows we shall only use the case that the variables range over recognizable tree languages.
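One step of such a rewriting system can be sketched for strings (my own code, under the assumption that variable ranges are regular so a regex can enumerate them): the rule scheme ax1c → aax1bcc with D(x1) = b∗, applied at every non-overlapping occurrence.

```python
import re

def step(s):
    """All strings reachable from s in one application of the rule scheme
    a x1 c -> a a x1 b c c, where x1 ranges over b* (phi1 is the b-block
    between the matched a and c)."""
    out = set()
    for m in re.finditer(r'a(b*)c', s):     # non-overlapping occurrences
        phi = m.group(1)                     # the chosen phi1 in D(x1)
        out.add(s[:m.start()] + 'aa' + phi + 'bcc' + s[m.end():])
    return out

assert step('abc') == {'aabbcc'}
assert 'aaabbbccc' in step('aabbcc')
```

Iterating `step` from "abc" generates exactly {a^n b^n c^n | n ≥ 1}, a non-context-free language, which shows how much rule schemes add over ordinary rewriting rules.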


Examples 4.11.

(1) Consider the rewriting system with variables G = (∆, R), where ∆ = {a, b, c} and R consists of the one rule ax1c → aax1bcc, where D(x1) = b∗. Then, for instance, aabbcc ⇒ aaabbbccc (by application of the ordinary rewriting rule abbc → aabbbcc obtained by substituting bb for x1 in the rule above). It is easy to see that {w ∈ ∆∗ | abc ⇒*_G w} = {a^n b^n c^n | n ≥ 1}.

(2) Consider the rewriting system with variables G = (∆, R), where ∆ = {[ , ], ∗, 1} and R consists of the rules [x1 ∗ x2 1] → [x1 ∗ x2]x1 and [x1 ∗ 1] → x1, where in both rules D(x1) = D(x2) = 1∗. It is easy to see that, for arbitrary u, v, w ∈ 1∗, [u ∗ v] ⇒*_G w iff w is the product of u and v (in unary notation).

(3) The two-level grammar used to describe Algol 68 may be viewed as a rewriting system with variables. The variables (= meta notions) range over context-free languages, specified by the meta grammar.

By specializing to trees we obtain the notion of tree rewriting system.

Definition 4.12. A rewriting system with variables G = (∆, R) is called a tree rewriting system if

(i) ∆ = Σ ∪ {[ , ]} for some ranked alphabet Σ;
(ii) for each rule (v, w, D) in R, v and w are trees in TΣ(Xk) and, for 1 ≤ i ≤ k, D(xi) ⊆ TΣ (where Xk is the domain of D).

It should be clear that, for a tree rewriting system G = (Σ ∪ {[ , ]}, R), if s ∈ TΣ and s ⇒_G t, then t ∈ TΣ. In fact, the application of a rule to a tree consists of replacing some piece in the middle of the tree by some other piece, where the variables indicate how the subtrees of the old piece should be connected to the new one. As an example, if we have a rule a[b[x1 x2] b[x3 d]] → b[x2 a[x1 d x2]], then the application of this rule to a tree t (if possible) consists of replacing a subtree of t of the form a[b[t1 t2] b[t3 d]] by b[t2 a[t1 d t2]],
where t1 , t2 and t3 are in the ranges of x1 , x2 and x3 . Thus t is of the form αa[b[t1 t2 ]b[t3 d]]β and is transformed into αb[t2 a[t1 dt2 ]]β. Example 4.13. Let Σ0 = {a}, Σ1 = {b}, ∆0 = {a}, ∆2 = {b}, Ω0 = {a}, Ω1 = {∗, b} and Ω2 = {b}. (i) Consider the tree rewriting system G = (Ω ∪ {[ , ]}, R), where R consists of the rules a → ∗[a], b[∗[x1 ]] → ∗[b[x1 x1 ]], and D(x1 ) = T∆ . Then, for instance,


b[b[a]] ⇒ b[b[∗[a]]] ⇒ b[∗[b[a a]]] ⇒ ∗[b[ b[a a] b[a a] ]].

It is easy to see that, if h is the tree homomorphism from TΣ into T∆ defined by h0(a) = a and h1(b) = b[x1 x1], then, for s ∈ TΣ and t ∈ T∆, h(s) = t iff s ⇒*_G ∗[t].

(ii) Consider the tree rewriting system G′ = (Ω ∪ {[ , ]}, R′), where R′ consists of the rules ∗[b[x1]] → b[∗[x1] ∗[x1]] and ∗[a] → a, and D(x1) = TΣ. Then, for instance,

∗[b[b[a]]] ⇒ b[∗[b[a]] ∗[b[a]]] ⇒ b[ b[∗[a] ∗[a]] ∗[b[a]] ] ⇒ b[ b[∗[a] ∗[a]] b[∗[a] ∗[a]] ] ⇒ · · · ⇒ b[ b[a a] b[a a] ].
It is easy to see that, if h is the homomorphism defined above, then, for s ∈ TΣ and t ∈ T∆, h(s) = t iff ∗[s] ⇒*_{G′} t.

The tree transducers to be defined will be a generalization of the generalized sequential machine working on strings, which is essentially a finite automaton with output. A (nondeterministic) generalized sequential machine is a 6-tuple M = (Q, Σ, ∆, δ, S, F), where Q is the set of states, Σ is the input alphabet, ∆ the output alphabet, δ is a mapping Q × Σ → P(Q × ∆∗), S is a set of initial states and F a set of final states. Intuitively, if δ(q, a) contains (q′, w) then, in state q and scanning input symbol a, the machine M may go into state q′ and add w to the output. Formally we may define the functioning of M in several ways. As already said, the recursive definition (as for the fta) is too cumbersome, although it is the most exact one (and should be used in very formal proofs). The other way is to describe the sequence of configurations the machine goes through during the translation of the input string. A configuration is usually a triple (v, q, s), where v is the output generated so far, q is the state and s is the rest of the input. If s = a s1, then the next configuration might be (vw, q′, s1). A useful variation of this is to replace (v, q, s) by the string vqs ∈ ∆∗QΣ∗. The next configuration can now be obtained by applying the string rewriting rule qa → wq′, thus vqas1 ⇒ vwq′s1. Replacing δ by a corresponding set of rewriting rules, the string translation realized by M can be defined as {(v1, v2) | q0 v1 ⇒* v2 qf for some q0 ∈ S and qf ∈ F}.

Let us first consider the bottom-up generalization of this machine to trees, which is conceptually easier than the top-down version, although perhaps less interesting. The bottom-up finite tree transducer goes through the input tree in the same way as the bottom-up fta, at each step producing a piece of output to which the already generated


output is concatenated. The transducer arrives at a node of rank k with a sequence of k states and a sequence of k output trees (one state and one output tree for each direct subtree of the node). The sequence of states and the label at the node determine (nondeterministically) a new state and a piece of output containing the variables x1, . . . , xk. The transducer processes the node by going into the new state and computing a new output tree by substituting the k output trees for x1, . . . , xk in the piece of output. There should be start states and output for each node of rank 0. If the transducer arrives at the top of the tree in a final state, then the computed output tree is the transformation of the input tree. (Cf. the story in Idea 4.2).
To be able to put the states of the transducer as labels on trees, we make them into symbols of rank 1. The configurations of the bottom-up tree transducer will be elements of TΣ(Q[T∆]),† and the steps of the tree transducer (including the start steps) are modelled by the application of tree rewriting rules to these configurations. We now give the formal definition.
Definition 4.14. A bottom-up (finite) tree transducer is a structure M = (Q, Σ, ∆, R, Qd), where Q is a ranked alphabet (of states), such that all elements of Q have rank 1 and no other ranks; Σ is a ranked alphabet (of input symbols); ∆ is a ranked alphabet (of output symbols), Q ∩ (Σ ∪ ∆) = ∅; Qd is a subset of Q (the set of final states); and R is a finite set of rules of one of the forms (i) or (ii):
(i) a → q[t], where a ∈ Σ0, q ∈ Q and t ∈ T∆;
(ii) a[q1[x1] · · · qk[xk]] → q[t], where k ≥ 1, a ∈ Σk, q1, . . . , qk, q ∈ Q and t ∈ T∆(Xk).
M is viewed as a tree rewriting system over the ranked alphabet Q ∪ Σ ∪ ∆ with R as the set of rules, such that the range of each variable occurring in R is T∆. Therefore the relations =⇒_M and =⇒*_M are well defined according to Definition 4.10.
The tree transformation realized by M, denoted by T(M) or simply M, is

{(s, t) ∈ TΣ × T∆ | s =⇒*_M q[t] for some q in Qd}.

We shall abbreviate “(finite) tree transducer” by “ftt”.
Remark 4.15. Note that T(M) is also denoted by M. In general, we shall often make no distinction between a tree transducer and the tree transformation it realizes. Hopefully this will not lead to confusion.
Definition 4.16. The class of tree transformations realized by bottom-up ftt will be denoted by B. An element of B will be called a bottom-up tree transformation.

† Note that Q[T∆] = {q[t] | q ∈ Q and t ∈ T∆}.


Example 4.17. An example of a bottom-up ftt realizing a homomorphism was given in Example 4.13(i) (it had one state ∗). Example 4.18. Consider the bottom-up ftt M = (Q, Σ, ∆, R, Qd ), where Q = {q0 , q1 }, Σ0 = {a, b}, Σ2 = {f, g}, ∆0 = {a, b}, ∆2 = {m, n}, Qd = {q0 } and the rules are a → q0 [a],

b → q0 [b],

f [qi [x1 ]qj [x2 ]] → q1−i [m[x1 x1 ]],

g[qi [x1 ]qj [x2 ]] → q1−i [n[x1 x1 ]],

f [qi [x1 ]qj [x2 ]] → q1−j [m[x2 x2 ]],

g[qi [x1 ]qj [x2 ]] → q1−j [n[x2 x2 ]],

for all i, j ∈ {0, 1}. The transformation realized by M may be described by saying that, given some input tree t, M selects some path of even length through t, relabels f by m and g by n and then doubles every subtree. For example f[a g[ab]] may be transformed into the tree m[n[bb]n[bb]] corresponding to the path f g b. The tree g[ab] is not in the domain of M.
Exercise 4.19. Construct bottom-up tree transducers M1 and M2 such that
(i) {(yield(s), yield(t)) | (s, t) ∈ T(M1)} = {(a(cd)^n fe^n b, ac^n fd^{2n} b) | n ≥ 0};
(ii) M2 deletes, given an input tree t, all subtrees t′ of t such that yield(t′) ∈ a^+ b^+.
Exercise 4.20. Give a recursive definition of the transformation realized by a bottom-up tree transducer (without using the notion of tree rewriting system).
Exercise 4.21. Given a bottom-up ftt M with input alphabet Σ, find a suitable Σ-algebra such that M may be viewed as an interpretation of Σ into this Σ-algebra (cf. Section 4.1).
We now define some subclasses of the class of bottom-up tree transformations.
Definition 4.22. Let Σ be a ranked alphabet and k ≥ 0. A tree t in TΣ(Xk) is linear if each element of Xk occurs at most once in t. The tree t is called nondeleting with respect to Xk if each element of Xk occurs at least once in t.
Definition 4.23. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt. M is called linear if the right hand side of each rule in R is linear. M is called nondeleting if the right hand side of each rule in R is nondeleting with respect to Xk, where k is the rank of the input symbol in the left hand side. M is called one-state (or pure) if Q is a singleton. M is called (partial) deterministic if (i) for each a ∈ Σ0 there is at most one rule in R with left hand side a, and (ii) for each k ≥ 1, a ∈ Σk and q1, . . . , qk ∈ Q there is at most one rule in R with left hand side a[q1[x1] · · · qk[xk]]. M is called total deterministic if (i) and (ii) hold with “at most one” replaced by “exactly one” and Qd = Q.
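The rewriting semantics of a bottom-up ftt can also be phrased recursively, in the spirit of Exercise 4.20. As an illustration only, here is a minimal Python sketch: trees are nested tuples, and the rules are given as a function from a node label and a tuple of states to a set of (state, output-builder) pairs. The encoding and all names are our own, not part of the notes.

```python
from itertools import product

# A tree is a nested tuple: ('f', t1, t2) has label 'f' and subtrees t1, t2.
def run_bottom_up(tree, rules, final_states):
    """All output trees t with s =>* q[t] for some final state q."""
    def states_at(t):
        # Process the direct subtrees first, in every combination
        # (the recursive analogue of the bottom-up rewriting).
        combos = product(*(states_at(c) for c in t[1:]))
        result = set()
        for combo in combos:
            qs = tuple(q for q, _ in combo)
            outs = tuple(u for _, u in combo)
            for q, build in rules(t[0], qs):
                result.add((q, build(outs)))
        return result
    return {t for q, t in states_at(tree) if q in final_states}

# The rules of Example 4.18 (states 'q0', 'q1'; m/n double one subtree).
def rules_418(label, qs):
    if label in 'ab' and qs == ():
        return {('q0', lambda outs, a=label: (a,))}
    if label in 'fg' and len(qs) == 2:
        out = 'm' if label == 'f' else 'n'
        flip = lambda q: 'q' + str(1 - int(q[1]))
        return {(flip(qs[0]), lambda outs, o=out: (o, outs[0], outs[0])),
                (flip(qs[1]), lambda outs, o=out: (o, outs[1], outs[1]))}
    return set()
```

On the input tree f[a g[ab]] of Example 4.18 this produces exactly m[n[aa]n[aa]] and m[n[bb]n[bb]], while g[ab] alone yields nothing, since only q1 is reachable there.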


Notation 4.24. The same terminology will be applied to the transformations realized by such transducers. Thus, for instance, a linear deterministic bottom-up tree transformation is one that can be realized by a linear deterministic bottom-up ftt. The classes of tree transformations obtained by putting one or more of the above restrictions on the bottom-up tree transducers will be denoted by adding the symbols L, N, P, D and Dt (standing for linear, nondeleting, pure, deterministic, and total deterministic respectively) to the letter B. Thus the class of linear deterministic bottom-up tree transformations is denoted by LDB. Example 4.25. Let Σ0 = {e}, Σ1 = {a, f }, ∆0 = {e}, ∆1 = {a, b} and ∆2 = {f }. Consider the bottom-up ftt M = (Q, Σ, ∆, R, Qd ), where Q = Qd = {∗} and R consists of the rules e → ∗[e], a[∗[x1 ]] → ∗[a[x1 ]],

a[∗[x1 ]] → ∗[b[x1 ]],

f [∗[x1 ]] → ∗[f [x1 x1 ]]. Then M ∈ PNB.

Let us make the following remarks about the concepts defined in Definition 4.23. Remarks 4.26. (1) Deletion is different from erasing. A rule may be called erasing if its right hand side belongs to Q[X]. Thus, symbols of rank 0 cannot be erased. Symbols of rank 1 can be erased without any deletion, but symbols of rank k ≥ 2 can only be erased by deleting also k − 1 subtrees. Thus a nondeleting tree transducer is still able to erase symbols of rank 1. (2) The one-state bottom-up tree transformations correspond intuitively to the finite substitutions in the string case. (3) The total deterministic bottom-up ftt realize tree transformations which are total functions. Exercise 4.27. Show that, in the definition of “(partial) deterministic”, we may replace the phrase “at most one” by “exactly one” without changing the corresponding class DB of deterministic bottom-up tree transformations. In the next theorem we show that all relabelings, finite tree automaton restrictions and tree homomorphisms are realizable by bottom-up ftt. Theorem 4.28. (1) REL ⊆ PNLB, (2) FTA ⊆ NLDB, (3) HOM = PDt B and LHOM = PLDt B.


Proof. (1) Let r be a relabeling from TΣ into T∆. Thus r is determined by a family of mappings rk : Σk → P(∆k). Obviously the following bottom-up ftt realizes r: M = ({∗}, Σ, ∆, R, {∗}), where R is constructed as follows:
(i) for a ∈ Σ0, if b ∈ r0(a), then a → ∗[b] is in R;
(ii) for k ≥ 1 and a ∈ Σk, if b ∈ rk(a), then a[∗[x1] · · · ∗[xk]] → ∗[b[x1 · · · xk]] is in R.
Clearly M ∈ PNLB.
(2) From the definition of FTA and from Part (3) it follows that we need only consider a deterministic bottom-up fta M = (Q, Σ, δ, s, F) and show that T(M) = {(t, t) | t ∈ L(M)} is realized by a bottom-up ftt. Consider the bottom-up ftt M̃ = (Q, Σ, Σ, R, F), where R is constructed as follows:
(i) for a ∈ Σ0, a → q[a] is in R, where q = s_a;
(ii) for k ≥ 1 and a ∈ Σk, if δ^k_a(q1, . . . , qk) = q, then a[q1[x1] · · · qk[xk]] → q[a[x1 · · · xk]] is in R.
Clearly M̃ realizes T(M) and M̃ ∈ NLDB (the determinism of M̃ follows from that of M).
(3) We first show that HOM ⊆ PDt B (and LHOM ⊆ PLDt B). An example of this was already given in Example 4.13(i). Let h be a tree homomorphism from TΣ into T∆ determined by the mappings hk : Σk → T∆(Xk). Consider the bottom-up ftt M = ({∗}, Σ, ∆, R, {∗}), where R contains the following rules:
(i) for a ∈ Σ0, a → ∗[h0(a)] is in R;
(ii) for k ≥ 1 and a ∈ Σk, the rule a[∗[x1] · · · ∗[xk]] → ∗[hk(a)] is in R.
Obviously M is in PDt B (and linear, if h is linear). Let us prove that M realizes h. Thus we have to show that, for s ∈ TΣ and t ∈ T∆, h(s) = t iff s =⇒* ∗[t]. The proof is by induction on s. The case s ∈ Σ0 is clear. Now let s = a[s1 · · · sk].
Suppose that h(s) = t. Then, by definition of h, t = hk(a)⟨x1 ← h(s1), . . . , xk ← h(sk)⟩. By induction, si =⇒* ∗[h(si)] for all i, 1 ≤ i ≤ k. Hence (but note that formally this needs a proof) a[s1 · · · sk] =⇒* a[∗[h(s1)] · · · ∗[h(sk)]]. But, by rule (ii) above, a[∗[h(s1)] · · · ∗[h(sk)]] ⇒ ∗[hk(a)⟨x1 ← h(s1), . . . , xk ← h(sk)⟩]. Consequently s =⇒* ∗[t].
Now suppose that s = a[s1 · · · sk] =⇒* ∗[t]. Then (and again this needs a formal proof) there are trees t1, . . . , tk such that si =⇒* ∗[ti] for 1 ≤ i ≤ k and a[s1 · · · sk] =⇒* a[∗[t1] · · · ∗[tk]] ⇒ ∗[hk(a)⟨x1 ← t1, . . . , xk ← tk⟩] = ∗[t]. By induction, ti = h(si) for all i, 1 ≤ i ≤ k. Hence t = hk(a)⟨x1 ← h(s1), . . . , xk ← h(sk)⟩ = h(s). This proves that (L)HOM ⊆ P(L)Dt B.
To show the converse, consider a one-state total deterministic bottom-up tree transducer M = ({∗}, Σ, ∆, R, {∗}). Define the tree homomorphism h from TΣ into T∆ as follows:
(i) for a ∈ Σ0, h0(a) is the tree t occurring in the (unique) rule a → ∗[t] in R;
(ii) for k ≥ 1 and a ∈ Σk, hk(a) is the tree t occurring in the (unique) rule a[∗[x1] · · · ∗[xk]] → ∗[t] in R.


Then, obviously, by the same proof as above, h = T(M). □
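To make the substitution mechanism in this proof concrete, here is a small Python sketch of applying a tree homomorphism h. The tuple encoding, with an integer i standing for the variable x_i, is our own illustration and not from the notes.

```python
def apply_hom(h, t):
    """Apply tree homomorphism h; h[a] is a tuple-tree in which the
    integer i stands for the variable x_i (1-based)."""
    images = [apply_hom(h, c) for c in t[1:]]   # h of the direct subtrees
    def subst(pattern):                         # h_k(a)<x_i <- images[i-1]>
        if isinstance(pattern, int):
            return images[pattern - 1]
        return (pattern[0],) + tuple(subst(p) for p in pattern[1:])
    return subst(h[t[0]])
```

For instance, with the doubling homomorphism h given by h(e) = e, h(a) = a[x1] and h(f) = f[x1 x1], the input f[a[e]] maps to f[a[e] a[e]].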

Exercise 4.29. Prove that the domain of a bottom-up tree transformation is a recognizable tree language, and vice-versa.
Let us now consider the top-down generalization of the generalized sequential machine. The top-down finite tree transducer goes through the input tree in the same way as the top-down fta, at each step producing a piece of output to which the (unprocessed) rest of the input is concatenated. Note therefore that the transducer does not really “go through” the input tree in the same way as the bottom-up ftt does, since in the top-down case the rest of the input may be modified (deleted, permuted, copied) during translation, whereas in the bottom-up case the rest of the input is unmodified during translation. The top-down transducer arrives at a node of rank k in a certain state; at that moment the configuration is an element of T∆(Q[TΣ]), where Σ and ∆ are the input and output alphabet, and Q the set of states. The state and the label at the node determine (nondeterministically) a piece of output containing the variables x1, . . . , xk, and states with which to continue the translation of the subtrees. These states are also specified in the piece of output, which is in fact a tree in T∆(Q[Xk]), where an occurrence of q[xi] means that the processing of the ith subtree should, at this point, be continued in state q. The transducer processes the node by replacing it and its direct subtrees by the piece of output, in which the k subtrees are substituted for the variables x1, . . . , xk. The processing of (all copies of) the subtrees is continued as indicated above. The transducer starts at the root of the input tree in some initial state. There should be final states and output for each node of rank 0. If the transducer arrives in a final state at each leaf, then it replaces each leaf by the final output, and the resulting tree is the transformation of the input tree. (Cf. the story in Idea 4.3).
The steps of the transducer, including the final steps, are modelled by the application of rewriting rules to the elements of T∆(Q[TΣ]). We now give the formal definition.
Definition 4.30. A top-down (finite) tree transducer is a structure M = (Q, Σ, ∆, R, Qd), where Q, Σ and ∆ are as for the bottom-up ftt, Qd is a subset of Q (the set of initial states), and R is a finite set of rules of one of the forms (i) or (ii):
(i) q[a[x1 · · · xk]] → t, where k ≥ 1, a ∈ Σk, q ∈ Q and t ∈ T∆(Q[Xk]);
(ii) q[a] → t, where q ∈ Q, a ∈ Σ0 and t ∈ T∆.
M is viewed as a tree rewriting system over the ranked alphabet Q ∪ Σ ∪ ∆ with R as the set of rules, such that the range of each variable in X is TΣ. The tree transformation realized by M, denoted by T(M) or simply M, is

{(s, t) ∈ TΣ × T∆ | q[s] =⇒*_M t for some q in Qd}.

Definition 4.31. The class of tree transformations realized by top-down ftt will be denoted by T. An element of T will be called a top-down tree transformation.


Example 4.32. An example of a top-down ftt realizing a homomorphism was given in Example 4.13(ii) (it had one state ∗). The next example is a top-down ftt computing the formal derivative of an arithmetic expression. Example 4.33. Consider the top-down ftt M = (Q, Σ, ∆, R, Qd ), where Σ0 = {a, b}, ∆0 = {a, b, 0, 1}, Σ1 = ∆1 = {−, sin, cos}, Σ2 = ∆2 = {+, ∗}, Q = {q, i}, Qd = {q}, the rules for q are q[+[x1 x2 ]] → +[q[x1 ]q[x2 ]], q[∗[x1 x2 ]] → +[∗[q[x1 ]i[x2 ]] ∗ [i[x1 ]q[x2 ]]], q[−[x1 ]] → −[q[x1 ]], q[sin[x1 ]] → ∗[cos[i[x1 ]]q[x1 ]], q[cos[x1 ]] → ∗[−[sin[i[x1 ]]]q[x1 ]], q[a] → 1, q[b] → 0, and the rules for i are i[+[x1 x2 ]] → +[i[x1 ]i[x2 ]], i[∗[x1 x2 ]] → ∗[i[x1 ]i[x2 ]], i[−[x1 ]] → −[i[x1 ]], i[sin[x1 ]] → sin[i[x1 ]], i[cos[x1 ]] → cos[i[x1 ]], i[a] → a and i[b] → b.

Then (t, s) ∈ T(M) iff s is the formal derivative of t with respect to a. For instance, q[∗[+[ab] −[a]]] =⇒* +[∗[+[10] −[a]] ∗[+[ab] −[1]]]. Note that i[t1] =⇒* t2 iff t1 = t2 (t1, t2 ∈ TΣ).
Exercise 4.34. Let Σ0 = {a, b}, Σ1 = {∼} and Σ2 = {∧, ∨}. TΣ may be viewed as the set of all boolean expressions over the boolean variables a and b, using negation, conjunction and disjunction. Write a top-down tree transducer which transforms every boolean expression into an equivalent one in which a and b are the only subexpressions which may be negated.
Exercise 4.35. (i) Give a recursive definition of the transformation realized by a top-down ftt. (ii) Find a suitable Σ-algebra such that the top-down ftt may be viewed as an interpretation of Σ into this Σ-algebra.
As in the bottom-up case we define some subclasses of T.
Definition 4.36. Let M = (Q, Σ, ∆, R, Qd) be a top-down tree transducer. The definitions of linear, nondeleting and one-state are identical to the bottom-up ones
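Because the transducer of Example 4.33 is total deterministic, its two states can be read directly as two mutually recursive functions, one per state. A Python sketch (trees as nested tuples; the encoding is ours, not part of the notes):

```python
def q(t):
    """State q of Example 4.33: the formal derivative w.r.t. a."""
    op = t[0]
    if op == '+':   return ('+', q(t[1]), q(t[2]))
    if op == '*':   return ('+', ('*', q(t[1]), i(t[2])),
                                 ('*', i(t[1]), q(t[2])))
    if op == '-':   return ('-', q(t[1]))
    if op == 'sin': return ('*', ('cos', i(t[1])), q(t[1]))
    if op == 'cos': return ('*', ('-', ('sin', i(t[1]))), q(t[1]))
    return ('1',) if op == 'a' else ('0',)     # the leaves a and b

def i(t):
    """State i copies its input unchanged (i[t1] =>* t2 iff t1 = t2)."""
    return (t[0],) + tuple(i(c) for c in t[1:])
```

On the input ∗[+[ab] −[a]] this reproduces the derivation shown above: the result is +[∗[+[10] −[a]] ∗[+[ab] −[1]]].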


in Definition 4.23. M is called (partial) deterministic if (i) Qd is a singleton; (ii) for each q ∈ Q, k ≥ 1, and a ∈ Σk , there is at most one rule in R with left hand side q[a[x1 · · · xk ]]; (iii) for each q ∈ Q and a ∈ Σ0 there is at most one rule in R with left hand side q[a]. M is called total deterministic if (i), (ii) and (iii) hold with “at most one” replaced by “exactly one”. Notation 4.24 also applies to the top-down case. Thus, PLT is the class of one-state linear top-down tree transformations. Example 4.37. Let Σ0 = {e}, Σ1 = {a, f }, ∆0 = {e}, ∆1 = {a, b} and ∆2 = {f }. Consider the top-down tree transducer M = (Q, Σ, ∆, R, Qd ) with Q = Qd = {∗} and R consists of the rules ∗ [f [x1 ]] → f [∗[x1 ] ∗ [x1 ]], ∗ [a[x1 ]] → a[∗[x1 ]],

∗[a[x1 ]] → b[∗[x1 ]],

∗ [e] → e. Then M ∈ PNT.

Remarks 4.26 also apply to the top-down case. Exercise 4.38. Show that, in the definition of “(partial) deterministic” top-down ftt, we may replace in (ii) the phrase “at most one” by “exactly one” without changing DT. The next theorem shows that all relabelings, finite tree automaton restrictions and tree homomorphisms are realizable by top-down tree transducers (cf. Theorem 4.28). Note therefore that these tree transformations are not specifically bottom-up or top-down. Theorem 4.39. (1) REL ⊆ PNLT, (2) FTA ⊆ NLT, (3) HOM = PDt T and LHOM = PLDt T. Proof. Exercise.

In what follows we shall need one other type of tree transformation which corresponds to the ordinary sequential machine in the string case (which translates each input symbol into one output symbol). It is a combination of an fta and a relabeling. Definition 4.40. A top-down (resp. bottom-up) finite state relabeling is a (tree transformation realized by a) top-down (resp. bottom-up) tree transducer M = (Q, Σ, ∆, R, Qd )


in which all rules are of the form q[a[x1 · · · xk ]] → b[q1 [x1 ] · · · qk [xk ]] with q, q1 , . . . , qk ∈ Q, a ∈ Σk and b ∈ ∆k , or of the form q[a] → b with q ∈ Q, a ∈ Σ0 and b ∈ ∆0 (resp. of the form a[q1 [x1 ] · · · qk [xk ]] → q[b[x1 · · · xk ]] with q, q1 , . . . , qk ∈ Q, a ∈ Σk and b ∈ ∆k , or of the form a → q[b] with q ∈ Q, a ∈ Σ0 and b ∈ ∆0 ). It is clear that the classes of top-down and bottom-up finite state relabelings coincide. This class will be denoted by QREL. The classes of deterministic top-down and deterministic bottom-up finite state relabelings obviously do not coincide. They will be denoted by DTQREL and DBQREL respectively. Note that FTA ∪ REL ⊆ QREL ⊆ NLB ∩ NLT. Apart from the tree transformation realized by a tree transducer we will also be interested in the image of a recognizable tree language under a tree transformation and the yield of that image. Definition 4.41. Let F be a class of tree transformations. An F -surface tree language is a language M (L) with M ∈ F and L ∈ RECOG. An F -target language is the yield of an F -surface language. An F -translation is a string relation {(yield(s), yield(t)) | (s, t) ∈ M and s ∈ L} for some M ∈ F and L ∈ RECOG. The classes of F -surface and F -target languages will be denoted by F -Surface and F -Target respectively. It is clear that, for all classes F discussed so far, since the identity transformation is in F, RECOG ⊆ F -Surface, and so CFL ⊆ F -Target. Moreover it is clear from the proof of Theorem 3.64 that if HOM ⊆ F, then the above inclusions are proper.
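The yield operation used throughout Definition 4.41 can be spelled out recursively; a one-function Python sketch (tuple-encoded trees, our own encoding):

```python
def tree_yield(t):
    """yield(t): the left-to-right string of labels at the leaves of t."""
    if len(t) == 1:                 # a symbol of rank 0
        return t[0]
    return ''.join(tree_yield(c) for c in t[1:])
```

For example, the yield of f[a g[ab]] is the string aab, i.e. the bottom of the tree read from left to right.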

4.3 Comparison of B and T, the nondeterministic case
The main differences between the bottom-up and the top-down tree transducer are the following.
Property (B). Nondeterminism followed by copying. A bottom-up ftt has the ability of first processing an input subtree nondeterministically and then copying the resulting output tree.
Property (T). Copying followed by different processing (by nondeterminism or by different states). A top-down ftt has the ability of first copying an input subtree and then treating the resulting copies differently.
Property (B′). Checking followed by deletion. A bottom-up ftt has the ability of first processing an input subtree and then deleting the resulting output subtree. In other words, depending on a (recognizable) property of the input subtree, it can decide whether to delete the output subtree or do something else with it.
It should be intuitively clear that top-down ftt do not possess properties (B) and (B′), whereas bottom-up ftt do not have property (T). We now show that these differences


also result in differences in the corresponding classes of tree transformations.
Notation 4.42. For any alphabet Σ, not containing the brackets [ and ], we define a function m : Σ^+ → (Σ ∪ {[, ]})^+ as follows: for a ∈ Σ and w ∈ Σ^+, m(a) = a and m(aw) = a[m(w)]. For instance, m(aab) = a[a[b]]. Note that m is a kind of converse to the mapping ftd discussed after Definition 2.21. A tree of the form m(w) will also be called a monadic tree.
Theorem 4.43. The classes of bottom-up and top-down tree transformations are incomparable. In particular, there are tree transformations in PNB − T and PNT − B.
Proof. (1) Consider the bottom-up ftt M of Example 4.25. M is in PNB and is a typical example of an ftt having property (B). It is intuitively clear that M is not realizable by a top-down ftt. In fact, consider for each n ≥ 1 the specific input tree f[m(a^n e)]. This tree is nondeterministically transformed by M into all trees of the form f[m(we)m(we)], where w is a string over {a, b} of length n. Suppose that the top-down ftt N = (Q′, Σ′, ∆′, R′, Q′d) could do the same transformation of these input trees. Then, roughly speaking, N would first have to make a copy of m(a^n e) and would then have to relabel the two copies in an arbitrary but identical way, which is clearly impossible. A formal proof goes as follows.
If N realizes the same transformation, then, for each n ≥ 1 and each w ∈ {a, b}^∗ of length n, there is a derivation q0[f[m(a^n e)]] =⇒*_N f[m(we)m(we)] for some q0 in Q′d.
Let us consider a fixed n. Consider, in each of these 2^n derivations, the first string of the form f[t1 t2]; that is, consider the moment that f is produced as output. Note that this is not necessarily the second string of the derivation, since the transducer may first erase the input symbol f and some of the a’s before producing any output (thus the derivation may look like q0[f[m(a^n e)]] =⇒* q[a[m(a^k e)]] ⇒ f[t1 t2] =⇒* f[m(we)m(we)] for some q ∈ Q′ and some k, 0 ≤ k < n, or even like q0[f[m(a^n e)]] =⇒* q[e] ⇒ f[t1 t2] = f[m(we)m(we)] for some q ∈ Q′). Obviously, for different derivations these strings have to be different: if, for w ≠ w′, both t1 =⇒* m(we), t2 =⇒* m(we) and t1 =⇒* m(w′e), t2 =⇒* m(w′e), then also f[t1 t2] =⇒* f[m(we)m(w′e)], which is an invalid output. Therefore there are 2^n of such strings f[t1 t2]. However it is clear that f[t1 t2] is of the form f[t̄1 t̄2]⟨x1 ← m(a^k e)⟩, where 0 ≤ k ≤ n and f[t̄1 t̄2] is the right hand side of a rule in R′. Therefore the number of possible f[t1 t2]’s is less than (n + 1)r, where r = #(R′). For n sufficiently large this is a contradiction.
(2) Consider now the top-down ftt M of Example 4.37. M is in PNT and is a typical example of an ftt having property (T). Suppose that M can be realized by a bottom-up ftt N = (Q′, Σ′, ∆′, R′, Q′d). Consider again for each n ≥ 1 the specific input tree f[m(a^n e)]. This tree should be transformed by N into all trees of the form f[m(w1 e)m(w2 e)] for w1, w2 ∈ {a, b}^∗ of length n. Let us consider, in each of the derivations realizing this transformation, the first string which contains the output tree. Note that this is not necessarily the last string since N may end its computation by erasing a number of a’s and the input f. Obviously, this string is obtained from the previous one by application of a rule with right hand side of the form q[f[t̄1 t̄2]], where q ∈ Q′, t̄1, t̄2 ∈ T∆({x1})


and there are s1 and s2 such that t̄1⟨x1 ← s1⟩ = m(w1 e) and t̄2⟨x1 ← s2⟩ = m(w2 e). Obviously, if f[t̄1 t̄2] contains no x1 or only one x1, then the rule can only be used for exactly one input tree f[m(a^n e)]. Thus we may choose n such that in all derivations starting with f[m(a^n e)] the right hand side q[f[t̄1 t̄2]] contains two x1’s (it cannot contain more). Thus q[f[t̄1 t̄2]] is of the form q[f[m(v1 x1)m(v2 x1)]] for certain v1, v2 ∈ {a, b}^∗. By choosing n larger than the length of all such v1’s and v2’s occurring in right hand sides of rules in R′, we see that the output tree always has two equal subtrees ≠ e: it has to be of the form f[m(v1 we)m(v2 we)] for some w ∈ {a, b}^+. Thus, for such an n, not all possible outputs are produced. This is a contradiction.
An important property of a class F of tree transformations is whether it is closed under composition. If so, then we know that each sequence of transformations from F can be realized by one tree transducer (corresponding to the class F). We then also know that the class of F-surface tree languages is closed under the transformations of F. The next theorem shows that unfortunately the classes of top-down and bottom-up tree transformations are not closed under composition. This nonclosure is caused by the failure of property (B) for top-down transformations (property (T) for bottom-up transformations).
Theorem 4.44. T and B are not closed under composition. In particular, there are tree transformations in (REL ◦ HOM) − T and in (HOM ◦ REL) − B.
Proof. (1) The bottom-up ftt M of Example 4.25 can be realized by the composition of a relabeling and a homomorphism. Let Ω0 = {e} and Ω1 = {a, b, f}. Let r be the relabeling from Σ into Ω defined by r0(e) = {e}, r1(a) = {a, b} and r1(f) = {f}. Let h be the tree homomorphism from Ω into ∆ defined by h0(e) = e, h1(a) = a[x1], h1(b) = b[x1] and h1(f) = f[x1 x1].
Then, for all s ∈ TΣ and t ∈ T∆ , (s, t) ∈ M iff there exists u in TΩ such that u ∈ r(s) and h(u) = t. Thus, by the first part of the proof of Theorem 4.43, M is in (REL ◦ HOM) − T. (2) The top-down ftt M of Example 4.37 can be realized by the composition of a homomorphism and a relabeling. Let Π be the ranked alphabet with Π0 = {e}, Π1 = {a} and Π2 = {f }. Let h be the tree homomorphism from Σ into Π defined by h0 (e) = e, h1 (a) = a[x1 ] and h1 (f ) = f [x1 x1 ]. Let r be the relabeling from Π into ∆ defined by r0 (e) = {e}, r1 (a) = {a, b} and r2 (f ) = {f }. Then, for all s ∈ TΣ and t ∈ T∆ , (s, t) ∈ M iff there exists u in TΠ such that h(s) = u and t ∈ r(u). Thus, by the second part of the proof of Theorem 4.43, M is in (HOM ◦ REL) − B. Exercise 4.45. Prove the statements in the above proof.
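The decomposition in part (1) of the proof is easy to run by hand; as an illustration (our own tuple encoding of the monadic input trees, not part of the notes), a Python sketch of r followed by h for Example 4.25:

```python
def relabel(t):
    """The relabeling r: every input symbol a may become a or b."""
    if t[0] == 'e':
        return [('e',)]
    tails = relabel(t[1])
    labels = ('a', 'b') if t[0] == 'a' else (t[0],)
    return [(c, s) for c in labels for s in tails]

def hom(t):
    """The homomorphism h: f[x1] -> f[x1 x1]; a, b, e are unchanged."""
    if t[0] == 'e':
        return ('e',)
    s = hom(t[1])
    return ('f', s, s) if t[0] == 'f' else (t[0], s)

def compose(t):
    """r followed by h: the transformation M of Example 4.25."""
    return {hom(u) for u in relabel(t)}
```

On the input f[a[e]] this yields exactly f[a[e] a[e]] and f[b[e] b[e]]: the doubling happens after the nondeterministic relabeling, so both copies are always identical, which is the essence of property (B).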

One might get the impression that each bottom-up (resp. top-down) tree transformation can be realized by two top-down (resp. bottom-up) tree transducers (i.e., B ⊆ T ◦ T, resp. T ⊆ B ◦ B). We shall show later that this is true.
Let us now consider the linear case. Since properties (B) and (T) are now eliminated, the only remaining difference between linear top-down and bottom-up tree transducers is caused by property (B′).


Lemma 4.46. There is a tree transformation M that belongs to LDB, but not to T. M can be realized by the composition of a deterministic top-down fta with a linear homomorphism.
Proof. Let Σ0 = {c}, Σ1 = {b}, Σ2 = {a}, ∆0 = {c} and ∆1 = {a, b}. Consider the tree transformation M = {(a[tc], a[t]) | t = m(b^n c) for some n ≥ 0}. We shall show that M ∉ T. The rest of the proof is left as an exercise.
Suppose that there is a top-down ftt N = (Q, Σ, ∆, R, Qd) such that T(N) = M. Each successful derivation of N has to start with the application of a rule q0[a[x1 x2]] → s, where q0 ∈ Qd and s ∈ T∆(Q[X2]). Now, if s contains no x1, then we could change the input a[tc] into a[t′c] without changing the output. If s contains no x2, then we could change a[tc] into a[tb[c]] and still obtain (the same) output. But if s contains both x1 and x2 then it has to contain a symbol of rank 2 and so a[t] cannot be derived.
Since both deterministic top-down fta and linear homomorphisms belong to LDT we can state the following corollary.
Corollary 4.47. Composition of linear deterministic top-down tree transformations leads out of the class of top-down tree transformations; in a formula: (LDT ◦ LDT) − T ≠ ∅.
We now show that, in some sense, property (B′) is the only cause of difference between linear bottom-up and linear top-down tree transformations. Firstly, all linear top-down tree transformations can be realized linear bottom-up. Secondly, in the nondeleting linear case, all differences between top-down and bottom-up are gone (this can be considered as a generalization of Theorem 3.17).
Theorem 4.48. (1) LT ⊊ LB, (2) NLT = NLB.
Proof. We first show part (2).
Let us say that a nondeleting linear bottom-up ftt M = (Q, Σ, ∆, R, Qd) and a nondeleting linear top-down ftt N = (Q′, Σ′, ∆′, R′, Q′d) are “associated” if Q = Q′, Σ = Σ′, ∆ = ∆′, Qd = Q′d and
(i) for each a ∈ Σ0, q ∈ Q and t ∈ T∆, a → q[t] is in R iff q[a] → t is in R′;
(ii) for each k ≥ 1, a ∈ Σk, q1, . . . , qk, q ∈ Q and t ∈ T∆(Xk) linear and nondeleting w.r.t. Xk, a[q1[x1] · · · qk[xk]] → q[t] is in R iff q[a[x1 · · · xk]] → t⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩ is in R′.


Note that each tree r ∈ T∆(Q[Xk]), which is linear and nondeleting w.r.t. Xk, is of the form t⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩, where t ∈ T∆(Xk) is linear and nondeleting w.r.t. Xk (in fact, t is the result of replacing qi[xi] by xi in r). Therefore it is clear that for each M ∈ NLB there exists an associated N ∈ NLT and vice versa. Hence it suffices to prove that associated ftt realize the same tree transformation.
Let M and N be associated as above. We shall prove, by induction on s, that for every q ∈ Q, s ∈ TΣ and u ∈ T∆,

s =⇒*_M q[u]   iff   q[s] =⇒*_N u.   (∗)

For s ∈ Σ0, (∗) is obvious. Suppose now that s = a[s1 · · · sk] for some k ≥ 1, a ∈ Σk and s1, . . . , sk ∈ TΣ. The only-if part of (∗) is left to the reader (it is similar to the proof of Theorem 4.28(3)). The if-part of (∗) is proved as follows (it is similar to the proof of Theorem 3.65). Let the first rule applied in the derivation q[a[s1 · · · sk]] =⇒*_N u be q[a[x1 · · · xk]] → r, and let r = t⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩ for certain t ∈ T∆(Xk) and q1, . . . , qk ∈ Q. Thus q[a[s1 · · · sk]] =⇒_N t⟨x1 ← q1[s1], . . . , xk ← qk[sk]⟩ =⇒*_N u. Since t is linear and nondeleting, there exist u1, . . . , uk ∈ T∆ such that u = t⟨x1 ← u1, . . . , xk ← uk⟩ and qi[si] =⇒*_N ui for all i, 1 ≤ i ≤ k. Hence, by induction, si =⇒*_M qi[ui] for all i, 1 ≤ i ≤ k. Also, by associatedness, the rule a[q1[x1] · · · qk[xk]] → q[t] is in R. Consequently, a[s1 · · · sk] =⇒*_M a[q1[u1] · · · qk[uk]] =⇒_M q[t⟨x1 ← u1, . . . , xk ← uk⟩] = q[u].
We now show part (1). By Lemma 4.46, it suffices to show that LT ⊆ LB. In principle we can use the construction used above to show NLT ⊆ NLB. The only problem is that the top-down transducer N may delete subtrees, whereas a bottom-up transducer is forced to process a subtree before deleting it. The solution is to add an “identity state” d to the set of states of M which allows M to process any subtree which has to be deleted (d is such that for all t ∈ TΣ, t =⇒*_M d[t]). The formal construction is as follows.
Let N = (Q, Σ, ∆, R, Qd) be a linear top-down ftt. Construct the linear bottom-up ftt M = (Q ∪ {d}, Σ, ∆ ∪ Σ, RM, Qd), where RM is obtained as follows.
(i) For each a ∈ Σ0 the rule a → d[a] is in RM, and for each k ≥ 1 and a ∈ Σk the rule a[d[x1] · · · d[xk]] → d[a[x1 · · · xk]] is in RM.
(ii) For q ∈ Q, a ∈ Σ0 and t ∈ T∆, if q[a] → t is in R, then a → q[t] is in RM.
(iii) Let q[a[x1 · · · xk]] → t be in R, where q ∈ Q, k ≥ 1, a ∈ Σk and t is a linear tree in T∆(Q[Xk]). Determine the (unique) states q1, . . . , qk ∈ Q ∪ {d} such that, for 1 ≤ i ≤ k, either qi[xi] occurs in t or (xi does not occur in t and) qi = d. Determine t′ ∈ T∆(Xk) such that t′⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩ = t. Then the rule a[q1[x1] · · · qk[xk]] → q[t′] is in RM.
Again (∗) can be proved, and since the proof only slightly differs from the previous one, it is left to the reader.
Exercise 4.49. Find an example of a tree transformation in LDt B − T.
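Step (iii) of this construction is a purely syntactic scan of the right hand side. A Python sketch (our own encoding, not from the notes: ('q', i) stands for q[x_i], a bare integer for x_i, and any other tuple is an output symbol with its subtrees):

```python
def split_rhs(t, k):
    """From a linear top-down RHS t over Q[Xk], extract the states
    q_1..q_k (with 'd' for deleted variables) and the stripped tree t'."""
    states = {i: 'd' for i in range(1, k + 1)}     # deleted x_i get state d
    def strip(u):
        if len(u) == 2 and isinstance(u[1], int):  # u = (q, i), i.e. q[x_i]
            states[u[1]] = u[0]
            return u[1]                            # keep the bare variable
        return (u[0],) + tuple(strip(c) for c in u[1:])
    t_prime = strip(t)
    return [states[i] for i in range(1, k + 1)], t_prime
```

For the rule right hand side m[q1[x1] e] with k = 2, the variable x2 is deleted, so the sketch returns the states (q1, d) and the tree m[x1 e], exactly as prescribed by (iii).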

Exercise 4.50. Compare the classes PLT and PLB.


Exercise 4.51. Let a deterministic top-down ftt be called “simple” if it is not allowed to make different translations of the same input subtree (if q[a[x1 · · · xk ]] → t is a rule and q1 [xi ], q2 [xi ] occur in t, then q1 = q2 ). Prove that the class of simple deterministic top-down tree transformations is included in B. (This result should be expected from the fact that property (T) is eliminated. Similarly, one can prove that NDB ⊆ T, because properties (B) and (B0 ) are eliminated.)

4.4 Decomposition and composition of bottom-up tree transformations Since bottom-up tree transformations are theoretically easier to handle than top-down tree transformations, we start investigating the former. We have seen that a bottom-up ftt can copy after nondeterministic processing (property (B)). The next theorem shows that these two things can in fact be taken apart into different phases of the transformation: each bottom-up ftt can be decomposed into two transducers, the first doing the nondeterminism (linearly) and the second doing the copying (deterministically). Theorem 4.52. Each bottom-up tree transformation can be realized by a finite state relabeling followed by a homomorphism. In formula: B ⊆ QREL ◦ HOM. Moreover

LB ⊆ QREL ◦ LHOM and DB ⊆ DBQREL ◦ HOM.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt. To simulate M in two phases we apply a technique similar to the one used in the proof of Theorem 3.58: a finite state relabeling is used to put information on each node indicating by which piece of tree the node should be replaced; then a homomorphism is used to actually replace each node by that piece of tree. The formal construction is as follows.
We simultaneously construct a ranked alphabet Ω, the set of rules RN of a bottom-up ftt N = (Q, Σ, Ω, RN, Qd) and a homomorphism h : TΩ → T∆ as follows.
(i) If a → q[t] is a rule in R, then d_t is a (new) symbol in Ω0, a → q[d_t] is in RN and h0(d_t) = t.
(ii) If a[q1[x1] · · · qk[xk]] → q[t] is a rule in R, then d_t is a (new) symbol in Ωk, a[q1[x1] · · · qk[xk]] → q[d_t[x1 · · · xk]] is in RN and hk(d_t) = t.
The only requirement on the symbols of Ω is that if t1 ≠ t2 then d_t1 ≠ d_t2. Obviously N is a (bottom-up) finite state relabeling. Also, if M is linear then h is linear, and if M is deterministic then so is N. It can easily be shown (by induction on s) that, for s ∈ TΣ, q ∈ Q and t ∈ T∆,

s =⇒*_M q[t]   iff   ∃u ∈ TΩ : s =⇒*_N q[u] and h(u) = t.

From this it follows that M = N ◦ h, which proves the theorem.


Example 4.53. Consider the bottom-up ftt M of Example 4.18. It can be decomposed as follows. Firstly, Ω0 = {a, b} and Ω2 = {m1, m2, n1, n2}, where d_a = a, d_b = b, d_{m[x1 x1]} = m1, d_{m[x2 x2]} = m2, d_{n[x1 x1]} = n1 and d_{n[x2 x2]} = n2. Secondly, N = (Q, Σ, Ω, RN, Qd), where RN consists of the rules a → q0[a],

b → q0 [b]

f [qi [x1 ]qj [x2 ]] → q1−i [m1 [x1 x2 ]],

g[qi [x1 ]qj [x2 ]] → q1−i [n1 [x1 x2 ]],

f [qi [x1 ]qj [x2 ]] → q1−j [m2 [x1 x2 ]],

g[qi [x1 ]qj [x2 ]] → q1−j [n2 [x1 x2 ]]

for all i, j ∈ {0, 1}. Finally, h is defined by h0 (a) = a, h0 (b) = b, h2 (m1 ) = m[x1 x1 ], h2 (m2 ) = m[x2 x2 ], h2 (n1 ) = n[x1 x1 ] and h2 (n2 ) = n[x2 x2 ]. For example, ∗ f [a[g[ab]]] =⇒ q0 [m2 [a[n2 [ab]]]] and h(m2 [a[n2 [ab]]]) = m[n[bb]n[bb]]. N

Note that Theorem 4.52 means (among other things) that each bottom-up tree transformation can be realized by the composition of two top-down tree transformations (cf. Theorem 4.43). We now show that, in the nondeterministic case, the finite state relabeling can still be decomposed further into a relabeling followed by a finite tree automaton restriction.

Theorem 4.54. B ⊆ REL ◦ FTA ◦ HOM and LB ⊆ REL ◦ FTA ◦ LHOM.

Proof. By the previous theorem and the fact that HOM and LHOM are closed under composition (see Exercise 4.9) it clearly suffices to show that QREL ⊆ REL ◦ FTA ◦ LHOM. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up finite state relabeling. We shall actually show that M can be simulated by a relabeling, followed by an fta, followed by a projection (which is in LHOM). The relabeling guesses which rule is applied by M at each node (and puts that rule as a label on the node), the bottom-up fta checks whether this guess is in accordance with the possible state transitions of M, and finally the projection labels the node with the right label. Formally we construct a ranked alphabet Ω, a relabeling r from TΣ into TΩ, a (nondeterministic) bottom-up fta N = (Q, Ω, δ, S, Qd) and a projection p from TΩ into T∆ as follows.

(i) If rule m in R is of the form a → q[b], then dm is a (new) symbol in Ω0, dm ∈ r0(a), q ∈ Sdm and p0(dm) = b.

(ii) If rule m in R is of the form a[q1[x1] · · · qk[xk]] → q[b[x1 · · · xk]], then dm is a (new) symbol in Ωk, dm ∈ rk(a), q ∈ δ^k_{dm}(q1, . . . , qk) and pk(dm) = b.

We require that if m and n are different rules, then dm ≠ dn. It is left to the reader to show that, for s ∈ TΣ, q ∈ Q and t ∈ T∆,

s =⇒*_M q[t]   iff   ∃u ∈ TΩ : u ∈ r(s), u ∈ L(N) and t = p(u).

This proves the theorem.
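The guess-check-project pipeline of this proof can be sketched as follows; the fragment below is an illustration only (the encoding of rules and the tiny rule set are invented), with the three phases folded into one bottom-up enumeration:

```python
# Illustration of the proof of Theorem 4.54: a relabeling guesses which rule
# of the relabeling transducer applies at each node, the fta discards
# inconsistent guesses, and the projection emits the output label.
import itertools

def transduce(t, rules):
    """Return all (state, output tree) pairs for input tree t; a rule is
    (input symbol, child states, result state, output symbol)."""
    child_runs = [transduce(c, rules) for c in t[1:]]
    results = []
    for combo in itertools.product(*child_runs):
        for (a, states, q, b) in rules:
            if a == t[0] and states == tuple(s for s, _ in combo):
                results.append((q, (b,) + tuple(o for _, o in combo)))
    return results

rules = [("a", (), "p", "a1"),      # two guesses for the leaf a ...
         ("a", (), "q", "a2"),
         ("f", ("p",), "p", "f")]   # ... but only state p survives under f
print(transduce(("f", ("a",)), rules))
```

The second guess for the leaf is pruned because no rule accepts state q under f, mirroring the fta check of the proof.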


These decomposition results are often very helpful when proving something about bottom-up tree transformations: the proof can often be split up into proofs about REL, FTA and HOM only. As an example, we immediately have the following result from Theorem 4.54 and Theorems 3.32, 3.48 and 3.65 (note that a class of tree languages is closed under fta restrictions if and only if it is closed under intersection with recognizable tree languages!).

Corollary 4.55. RECOG is closed under linear bottom-up tree transformations.

(This expresses that the image of a recognizable tree language under a linear bottom-up tree transformation is again recognizable. In other words, LB-Surface = RECOG.)

Exercise 4.56. Prove, using Theorem 4.54, that B-Surface = HOM-Surface. Prove that, in fact, each B-Surface tree language is the homomorphic image of a rule tree language.

We now prove that under certain circumstances the composition of two elements of B is again in B. Recall from Section 4.3 that the non-closure of B under composition was caused by the failure of property (T) for B: in general, in B, we cannot compose a copying transducer with a nondeterministic one. We now show that if either the first transducer is noncopying or the second one is deterministic, then their composition is again in B. Thus, when eliminating (the failure of) property (T), closure results are obtained.

Theorem 4.57.

(1) LB ◦ B ⊆ B and LB ◦ LB ⊆ LB.

(2) B ◦ DB ⊆ B and DB ◦ DB ⊆ DB.

Proof. Because of our decomposition of B we only need to look at special cases. These are treated in three lemmas, concerning composition with homomorphisms, fta restrictions and relabelings respectively. Detailed induction proofs of these lemmas are left to the reader.

Lemma. B ◦ HOM ⊆ B, LB ◦ LHOM ⊆ LB and DB ◦ HOM ⊆ DB.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt and h a tree homomorphism from T∆ into TΩ. We have to show that M ◦ h can be realized by a bottom-up ftt N. The idea is the same as that of Theorem 3.65: N simulates M but outputs at each step the homomorphic image of the output of M. Note that, contrary to the proof of Theorem 3.65 (which was concerned with regular tree grammars, a top-down device), we need not require linearity of h. The construction is as follows. Extend h to trees in T∆(X) by defining h0(xi) = xi for all xi in X. Thus h is now a homomorphism from T∆(X) into TΩ(X). Define N = (Q, Σ, Ω, RN, Qd) such that

(i) if a → q[t] is in R, then a → q[h(t)] is in RN;

(ii) if a[q1[x1] · · · qk[xk]] → q[t] is in R, then a[q1[x1] · · · qk[xk]] → q[h(t)] is in RN.


Obviously, if M and h are linear then so is N (the linear homomorphism transforms a linear tree in T∆(X) into a linear tree in TΩ(X)), and if M is deterministic then so is N.

Lemma. B ◦ FTA ⊆ B, LB ◦ FTA ⊆ LB and DB ◦ FTA ⊆ DB.

Proof. The idea of the proof is similar to the one used to solve Exercise 3.68. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt and let Ñ = (QN, ∆, ∆, RN, QdN) be a deterministic bottom-up ftt corresponding to a deterministic bottom-up fta as in the proof of Theorem 4.28(2). We have to show that M ◦ Ñ can be realized by a bottom-up ftt K. K will have Q × QN as its set of states and it will simultaneously simulate M and keep track of the state of Ñ at the computed output of M.

Extend Ñ by expanding its alphabet to ∆ ∪ X (or, better, to ∆ ∪ Xm where m is the highest subscript of a variable occurring in the rules of M) and by allowing the variables in its rules to range over T∆(X). Thus the computation of the finite tree automaton Ñ may now be started with an element of T∆(QN[X]), which means that, at certain places in the tree, Ñ has to start in prescribed start states.

Construct K = (Q × QN, Σ, ∆, RK, Qd × QdN) such that

(i) if a → q[t] is in R and if t =⇒*_Ñ q′[t], then a → (q, q′)[t] is in RK;

(ii) if a[q1[x1] · · · qk[xk]] → q[t] is in R, and if t⟨x1 ← q1′[x1], . . . , xk ← qk′[xk]⟩ =⇒*_Ñ q′[t], then the rule a[(q1, q1′)[x1] · · · (qk, qk′)[xk]] → (q, q′)[t] is in RK.

Note that if M is linear, then so is K. Moreover, since Ñ is deterministic, if M is deterministic then so is K.

Lemma. B ◦ DBQREL ⊆ B, DB ◦ DBQREL ⊆ DB and LB ◦ REL ⊆ LB.

Proof. The proofs of the first two statements are easy generalizations of the proof of the previous lemma. The proof of the third statement is left as an easy exercise.

We now finish the proof of Theorem 4.57. Firstly

LB ◦ B ⊆ LB ◦ REL ◦ FTA ◦ HOM   (Thm. 4.54)
⊆ LB ◦ FTA ◦ HOM   (3rd lemma)
⊆ LB ◦ HOM   (2nd lemma)
⊆ B   (1st lemma)


and similarly for LB ◦ LB ⊆ LB. Secondly

B ◦ DB ⊆ B ◦ DBQREL ◦ HOM   (Thm. 4.52)
⊆ B ◦ HOM   (3rd lemma)
⊆ B   (1st lemma)

and similarly for DB ◦ DB ⊆ DB.

Note that the “right hand side” of Theorem 4.57 states that LB and DB are closed under composition.

Exercise 4.58. Is LDB closed under composition?

Exercise 4.59. Prove that B-Surface is closed under intersection with recognizable tree languages.

A consequence of Theorem 4.57 (or in fact its second lemma) is the following useful fact.

Corollary 4.60. RECOG is closed under inverse bottom-up tree transformations (in particular under inverse tree homomorphisms, cf. Exercise 3.68).

Proof. Let L ∈ RECOG and M ∈ B. We have to show that M^{-1}(L) is in the class RECOG. Obviously, if R is the finite tree automaton restriction {(t, t) | t ∈ L}, then M^{-1}(L) = dom(M ◦ R). By Theorem 4.57, M ◦ R is in B and so, by Exercise 4.29, its domain is in RECOG.

Remark 4.61. Note that, since, for arbitrary tree transformations M1 and M2, (M1 ◦ M2)^{-1} = M2^{-1} ◦ M1^{-1}, Corollary 4.60 implies that if M is the composition of any finite number of bottom-up tree transformations (in particular, elements of REL, FTA and HOM), then RECOG is closed under M^{-1}; moreover, since dom(M) = M^{-1}(T∆), the domain of any such tree transformation M is recognizable.

We finally note that, because of the composition results of Theorem 4.57, the inclusion signs in Theorems 4.52 and 4.54 may be replaced by equality signs. It is also easy to prove equations like B = LB ◦ HOM, B = LB ◦ DB or B = REL ◦ DB (cf. property (B)). The first equation has some importance of its own, since it characterizes B without referring to the notion “bottom-up”; this is so because LB has such a characterization, as shown next (we first need a definition).

Definition 4.62. Let F be a class of tree transformations containing all identity transformations. For each n ≥ 1 we define F^n inductively by F^1 = F and F^{n+1} = F^n ◦ F. Moreover, the closure of F under composition, denoted by F*, is defined to be ⋃_{n≥1} F^n.

Thus F* consists of all tree transformations of the form M1 ◦ M2 ◦ · · · ◦ Mn, where n ≥ 1 and Mi ∈ F for all i, 1 ≤ i ≤ n.
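For a single tree transformation regarded abstractly as a finite binary relation, the definitions of F^n and F* can be mirrored directly; the sketch below (not from the notes, and only a loose analogue, since F is in general a whole class of transformations) computes the composition closure of one finite relation:

```python
# Composition closure of a single finite relation F, mirroring Definition
# 4.62 with F a one-element class: F* = F^1 ∪ F^2 ∪ ...  For finite
# relations the union stabilizes, so a fixed-point loop suffices.

def compose(r, s):
    """r ∘ s: first apply r, then s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def closure(f):
    result, layer = set(f), set(f)
    while True:
        layer = compose(layer, f)       # F^(n+1) = F^n ∘ F
        if layer <= result:
            return result
        result |= layer

print(sorted(closure({(1, 2), (2, 3)})))
```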


Corollary 4.63.

(1) LB = (REL ∪ FTA ∪ LHOM)*.

(2) B = LB ◦ HOM.

Proof. The inclusions ⊆ follow from Theorems 4.52 and 4.54; the inclusions ⊇ follow from Theorem 4.57.

4.5 Decomposition of top-down tree transformations

We now show that, analogously to the bottom-up case, we can decompose each top-down ftt into two transducers, the first doing the copying and the second doing the rest of the work (cf. property (T)).

Theorem 4.64. T ⊆ HOM ◦ LT and DT ⊆ HOM ◦ LDT.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a top-down ftt. While processing an input tree, M generally makes a lot of copies of input subtrees in order to get different translations of these subtrees. To simulate M we can first use a homomorphism which simply makes as many copies of subtrees as are needed by M, and then we can simulate M linearly (since all copies are already there).

The formal construction is as follows. We first determine the “degree of copying” of M, that is, the maximal number of copies needed by M in any step of its computation. Thus, for x ∈ X and r ∈ R, let rx be the number of occurrences of x in the right hand side of r. Let n = max{rx | x ∈ X, r ∈ R}. We now let Ω be the ranked alphabet obtained from Σ by multiplying the ranks of all symbols by n (so that a node may be connected to n copies of each of its subtrees). Thus Ωkn = Σk for all k ≥ 0. The “copying” homomorphism h from TΣ into TΩ is now defined by

(i) for a ∈ Σ0, h0(a) = a;

(ii) for k ≥ 1 and a ∈ Σk, hk(a) = a[x1^n x2^n · · · xk^n], where xi^n stands for n copies of xi.

(For example, if k = 2 and n = 3, then h2(a) = a[x1x1x1x2x2x2].) Finally the top-down ftt N = (Q, Ω, ∆, RN, Qd) is defined as follows.

(i) If q[a] → t is a rule in R, then it is also in RN.

(ii) Suppose that q[a[x1 · · · xk]] → t is a rule in R. Let us denote the variables x1, x2, . . . , xkn by x1,1, x1,2, . . . , x1,n, x2,1, . . . , x2,n, . . . , xk,1, . . . , xk,n respectively. Then the rule q[a[x1,1 · · · x1,n · · · xk,1 · · · xk,n]] → t′ is in RN, where t′ is taken such that it is linear and such that t′⟨xi,j ← xi⟩ (1 ≤ i ≤ k, 1 ≤ j ≤ n) = t.

(t′ can be obtained by putting different second subscripts on different occurrences of the same variable in t. For instance, if q[a[x1 x2]] → b[q1[x2]cd[q2[x1]q3[x2]]] is in R and n = 3, then we can put the rule q[a[x1,1 x1,2 x1,3 x2,1 x2,2 x2,3]] → b[q1[x2,1]cd[q2[x1,1]q3[x2,2]]] in RN.)

Obviously, if M is deterministic, then so is N. A formal proof of the fact that M = h ◦ N is left to the reader.
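The copying homomorphism h and the linearization of right-hand sides can be sketched as follows (an illustration with an invented tree encoding, not part of the notes):

```python
# Sketch of Theorem 4.64: copy_hom is the homomorphism h that gives every
# node n copies of each subtree; linearize_rhs renames the j-th occurrence
# of variable xi in a right-hand side to xi,j, making it linear.

def copy_hom(t, n):
    """h(a[t1..tk]) = a[h(t1)^n ... h(tk)^n]."""
    return (t[0],) + tuple(copy_hom(c, n) for c in t[1:] for _ in range(n))

def linearize_rhs(t, counters=None):
    if counters is None:
        counters = {}
    if isinstance(t, str):                    # a variable "xi"
        i = int(t[1:])
        counters[i] = counters.get(i, 0) + 1
        return f"x{i},{counters[i]}"
    return (t[0],) + tuple(linearize_rhs(c, counters) for c in t[1:])

print(copy_hom(("f", ("a",), ("b",)), 2))
print(linearize_rhs(("b", "x2", "x1", "x2")))
```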


Example 4.65. Consider the top-down ftt M of Example 4.33 and let us consider its decomposition according to the above proof. Clearly n = 2 and therefore the definition of h for, for example, +, ∗, −, a and b is h2(+) = +[x1x1x2x2], h2(∗) = ∗[x1x1x2x2], h1(−) = −[x1x1], h0(a) = a and h0(b) = b. Thus, for example, h(∗[+[ab] −[a]]) = ∗[+[aabb] +[aabb] −[aa] −[aa]]. For instance the first three rules of M turn into the following three rules for N:

q[+[x1,1 x1,2 x2,1 x2,2]] → +[q[x1,1]q[x2,1]],
q[∗[x1,1 x1,2 x2,1 x2,2]] → +[∗[q[x1,1]i[x2,1]] ∗[i[x1,2]q[x2,2]]],
q[−[x1,1 x1,2]] → −[q[x1,1]].

It is left to the reader to see how N processes h(∗[+[ab] −[a]]).

Since LT ⊆ LB (Theorem 4.48), we know already how to decompose LT. This gives us the following result.

Corollary 4.66. T ⊆ HOM ◦ LB = HOM ◦ REL ◦ FTA ◦ LHOM.

Notice that the inclusion is proper by Lemma 4.46. From this corollary we see that each top-down tree transformation can be realized by the composition of two bottom-up tree transformations (cf. Theorem 4.43). Another way of expressing our decomposition results concerning B and T is by the equation B* = T* = (REL ∪ FTA ∪ HOM)*. By the above corollary and Remark 4.61 we obtain that RECOG is closed under inverse top-down tree transformations, and in particular

Corollary 4.67. The domain of a top-down tree transformation is recognizable.

The next theorem says, analogously to Theorem 4.52, that each deterministic element of LT can be decomposed into two simpler (deterministic) ones.

Theorem 4.68. LDT ⊆ DTQREL ◦ LHOM.

Proof. See the proof of Theorem 4.52.

We now note that we cannot obtain very nice results about closure under composition analogous to those of Theorem 4.57, since for instance LT and DT are not closed under composition (see Corollary 4.47). The reason is essentially the failure of property (B′) for top-down ftt. One could get closure results by eliminating both properties (B) and (B′). For example, one can show that Dt T ◦ T ⊆ T, T ◦ NLT ⊆ T, etc. However, after having compared DB with DT in the next section, we prefer to extend the top-down ftt in such a way that it has the capability of checking before deletion. It will turn out that the so extended top-down transducer has all the nice properties “dual” to those of B.

The following is an easy exercise in composition.

Exercise 4.69. Prove that every T-surface tree language is in fact the image of a rule tree language under a top-down tree transformation.


4.6 Comparison of B and T, the deterministic case

Although, by determinism, some differences between B and T are eliminated, DB and DT are still incomparable, for a number of reasons. We first discuss the question why DB contains elements not in DT, and then the reverse question.

Firstly, we have seen in Lemma 4.46 that DB contains elements not in T. This was caused by property (B′). Secondly, we have seen in Theorem 3.14 (together with Theorem 3.8) that there are deterministic bottom-up recognizable tree languages which cannot be recognized deterministically top-down. It is easy to see that the corresponding fta restrictions cannot be realized by a deterministic top-down transducer. In fact the following can be shown.

Exercise 4.70. Prove that the domain of a deterministic top-down ftt can be recognized by a deterministic top-down fta.

Thirdly, there is a trivial reason that DB is stronger than DT: a bottom-up ftt can, for example, recognize the “lowest” occurrence of some symbol in a tree (since it is the first occurrence it meets), whereas a deterministic top-down ftt cannot (since for it, it is the last occurrence).

Lemma 4.71. There is a tree transformation in DBQREL which is not in DT.

Proof. Let Σ0 = {b}, Σ1 = {a, f}, ∆0 = {b} and ∆1 = {a, ā, f}. Consider the bottom-up ftt M = (Q, Σ, ∆, R, Qd) where Q = Qd = {q1, q2} and R consists of the rules

b → q1[b],
a[q1[x]] → q1[ā[x]],
f[q1[x]] → q2[f[x]],
a[q2[x]] → q2[a[x]],
f[q2[x]] → q2[f[x]],

where x denotes x1. Thus M is a deterministic bottom-up finite state relabeling which bars all a's below the lowest f. Obviously a deterministic top-down ftt cannot give the right translation for both m(a^n b) and m(a^n fb).

Let us now consider DT. Firstly, we note that property (T) has not been eliminated: a deterministic top-down ftt still has the ability to copy an input subtree and continue the translation of these copies in different states. Consider for example the tree transformation M = {(m(ab^n c), a[m(p^n c)m(q^n c)]) | n ≥ 0}. Obviously M is in DT. It can be shown, similarly to the proof of Theorem 4.43(2), that M is not in B (see also Exercise 4.51). Secondly, deterministic bottom-up ftt cannot distinguish between left and right (because they start at the bottom!), whereas deterministic top-down ftt can. Consider for example the tree transformation M = {(a[m(b^n c)m(b^k c)], a[m(b^n d)m(b^k e)]) | n, k ≥ 0}. This is obviously an element of DT (even DTQREL) and can be shown not to be in DB (of course it is in B: a nondeterministic bottom-up ftt can guess whether it is left or right and check its guess when arriving at the top). Thus we have the following lemma.


Lemma 4.72. There is a tree transformation in DTQREL which is not in DB .

Thirdly, there is a proof of this lemma which is analogous to the one of Lemma 4.71: there is a deterministic top-down finite state relabeling that bars all a’s above the highest f , and this cannot be done by an element of DB .
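The relabeling of Lemma 4.71 is easy to simulate on monadic trees; in the sketch below (not from the notes) a tree m(w) is represented by the string w read from the root to the leaf, and 'A' stands for the barred symbol ā:

```python
# Deterministic bottom-up relabeling of Lemma 4.71 on monadic trees m(w):
# process w from the leaf upwards; state q1 means "no f seen yet".  Every
# a read in q1, i.e. below the lowest f, is barred.

def bar_below_lowest_f(w):
    out, state = [], "q1"
    for c in reversed(w):                  # bottom-up = right to left
        out.append("A" if c == "a" and state == "q1" else c)
        if c == "f":
            state = "q2"
    return "".join(reversed(out))

print(bar_below_lowest_f("aafab"))   # only the a below the lowest f is barred
```

A deterministic top-down device reading the same string from the left cannot know, at an a, whether an f is still to come; bottom-up the decision is trivial.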

4.7 Top-down finite tree transducers with regular look-ahead

One way to take away the advantages of DB over DT is to allow the top-down tree transducer to have a look-ahead: that is, the ability to inspect an input subtree and, depending on the result of that inspection, decide which rule to apply next. Moreover it seems to be sufficient (and natural) that this look-ahead ability should consist of inspecting whether the input subtree belongs to a certain recognizable tree language or not (in other words, checking whether it has a certain “recognizable property”).

As a result of this capability the top-down tree transducer would first of all have property (B′): it can check a recognizable property of a subtree and decide whether to delete it or not. Secondly, the domain of a deterministic top-down tree transducer could be an arbitrary recognizable tree language (it just starts by checking whether the whole input tree belongs to the recognizable tree language). And, thirdly, a deterministic top-down tree transducer would for instance be able to see the “lowest” occurrence of some symbol in a tree (it just checks whether the subtree beneath the symbol contains another occurrence of the same symbol, and that is a recognizable property).

We now formally define the top-down transducer with regular (= recognizable) look-ahead. It turns out that the look-ahead feature can be expressed easily in a tree rewriting system (see Definition 4.12): for each rule we specify the ranges of the variables in the rule to be certain recognizable tree languages (such a rule is then applicable only if the corresponding input subtrees belong to these recognizable tree languages).

Definition 4.73.
A top-down (finite) tree transducer with regular look-ahead is a structure M = (Q, Σ, ∆, R, Qd), where Q, Σ, ∆ and Qd are as for the ordinary top-down ftt and R is a finite set of rules of the form (t1 → t2, D), where t1 → t2 is an ordinary top-down ftt rule and D is a mapping from Xk into P(TΣ) (where k is the number of variables in t1) such that, for 1 ≤ i ≤ k, D(xi) ∈ RECOG. (Whenever D is understood or will be specified later we write t1 → t2 rather than (t1 → t2, D). We call t1 and t2 the left hand side and right hand side of the rule respectively.) M is viewed as a tree rewriting system in the obvious way, (t1 → t2, D) being a “rule scheme” (t1, t2, D). The tree transformation realized by M, denoted by T(M) or M, is

{(s, t) ∈ TΣ × T∆ | q[s] =⇒*_M t for some q ∈ Qd}.

Thus a top-down ftt with regular look-ahead works in exactly the same way as an ordinary one, except that the application of each of its rules is restricted: the (input sub-)trees substituted in the rule should belong to prespecified recognizable tree languages. Note that for rules of the form q[a] → t the mapping D need not be specified.

59

Notation 4.74. The phrase “with regular look-ahead” will be indicated by a prime. Thus the class of top-down tree transformations with regular look-ahead will be denoted by T′. An element of T′ is also called a top-down′ tree transformation.

Example 4.75. Consider the tree transformation M in the proof of Lemma 4.46. It can be realized by the top-down′ ftt N = (Q, Σ, ∆, R, Qd) where Q = {q0, q}, Qd = {q0} and R consists of the following rules:

q0[a[x1 x2]] → a[q[x1]]   with ranges D(x1) = {m(b^n c) | n ≥ 0} and D(x2) = {c},
q[b[x1]] → b[q[x1]]   with D(x1) = TΣ, and
q[c] → c.

Note that, in the first rule, D(x1) could as well be TΣ since it is checked later by N that the left subtree contains no a's. The essential use of regular look-ahead in this example is the restriction of the right subtree to {c}.

We now define some subclasses of T′.

Definition 4.76. Let M = (Q, Σ, ∆, R, Qd) be a top-down′ ftt. The definitions of linear, nondeleting and one-state are identical to the bottom-up and top-down ones (see Definition 4.23). M is called (partial) deterministic if the following holds.

(i) Qd is a singleton.

(ii) If (s → t1, D1) and (s → t2, D2) are different rules in R (with the same left hand side), then D1(xi) ∩ D2(xi) = ∅ for some i, 1 ≤ i ≤ k (where k is the number of variables in s).

Since the ranges of the variables are recognizable, it can effectively be determined whether a top-down′ ftt is deterministic (if, of course, these ranges are effectively specified, which we always assume). Notation 4.24 also applies to T′. Thus LDT′ is the class of linear deterministic top-down tree transformations with regular look-ahead.

Observe that, obviously, T ⊆ T′ since each top-down ftt can be transformed trivially into a top-down′ ftt by specifying all ranges of all variables in all rules to be the (recognizable) tree language TΣ (where Σ is the input alphabet). Moreover, if Z is a modifier, then ZT ⊆ ZT′.

It can easily be seen that Theorems 4.43 and 4.44 still hold with T replaced by T′. In fact, the proofs of the theorems are true without further change. In the next theorem we show that the regular look-ahead can be “taken out” of a top-down′ ftt.

Theorem 4.77. T′ ⊆ DBQREL ◦ T and ZT′ ⊆ DBQREL ◦ ZT for Z ∈ {L, D, LD}.


Proof. Let M = (Q, Σ, ∆, R, Qd) be in T′. Consider all recognizable properties which M needs for its look-ahead (that is, all recognizable tree languages D(xi) occurring in the rules of M). We can use a total deterministic bottom-up finite state relabeling to check, for a given input tree t, whether the subtrees of t have these properties or not, and to put at each node a (finite) amount of information telling us whether the direct subtrees of that node have the properties or not. After this relabeling we can use an ordinary top-down ftt to simulate M, because the look-ahead information is now contained in the label of each node.

The formal construction might look as follows. Let L1, . . . , Ln be all the recognizable tree languages occurring as ranges of variables in the rules of M. Let U denote the set {0, 1}^n, that is, the set of all sequences of 0's and 1's of length n. The j-th element of u ∈ U will be denoted by u^j. An element u of U will be used to indicate whether a tree belongs to L1, . . . , Ln or not (u^j = 1 iff the tree is in Lj). We now introduce a new ranked alphabet Ω such that Ω0 = Σ0 and, for k ≥ 1, Ωk = Σk × U^k. Thus an element of Ωk is of the form (a, (u1, . . . , uk)) with a ∈ Σk and u1, . . . , uk ∈ U. If a node is labeled by such a symbol, it will mean that ui contains all the information about the i-th subtree of the node. Next we define the mapping f : TΣ → TΩ as follows:

(i) for a ∈ Σ0, f(a) = a;

(ii) for k ≥ 1, a ∈ Σk and t1, . . . , tk ∈ TΣ, f(a[t1 · · · tk]) = b[f(t1) · · · f(tk)], where b = (a, (u1, . . . , uk)) and, for 1 ≤ i ≤ k and 1 ≤ j ≤ n, ui^j = 1 iff ti ∈ Lj.

It is left as an exercise to show that f can be realized by a (total) deterministic bottom-up finite state relabeling (given the deterministic bottom-up fta's recognizing L1, . . . , Ln).

We now define a top-down ftt N = (Q, Ω, ∆, RN, Qd) such that

(i) if q[a] → t is in R, then it is in RN;

(ii) if (q[a[x1 · · · xk]] → t, D) is in R, then each rule of the form q[(a, u)[x1 · · · xk]] → t is in RN, where u = (u1, . . . , uk) ∈ U^k and u satisfies the condition: if D(xi) = Lj then ui^j = 1 (for all i and j, 1 ≤ i ≤ k, 1 ≤ j ≤ n).

This completes the construction. It is obvious from this construction that M = f ◦ N. Moreover, if M is linear, then so is N. It is also clear that, in the construction above, the set U may be replaced by the set {u ∈ U | for all j1 and j2, if Lj1 ∩ Lj2 = ∅, then u^{j1} · u^{j2} ≠ 1} (note that this influences RN: rules containing elements not in this set are removed). One can now easily see that if M is deterministic, then so is N.

An immediate consequence of this theorem and previous decomposition results is that each element of T′ is decomposable into elements of REL, FTA and HOM.

Corollary 4.78. The domain of a top-down ftt with regular look-ahead is recognizable.

Proof. For instance by Remark 4.61.
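The relabeling f of this proof amounts to running the deterministic bottom-up automata for L1, . . . , Ln in parallel while annotating each node; a sketch follows (the automaton encoding and the example language are invented, not from the notes):

```python
# Sketch of the relabeling f of Theorem 4.77: run deterministic bottom-up
# tree automata for L1..Ln in parallel and relabel each node as (a, u),
# where u lists, for every direct subtree, its membership vector in L1..Ln.
# An automaton is (delta, finals) with delta[(symbol, child states...)].

def annotate(t, automata):
    """Return (relabelled tree, states reached by the automata on t)."""
    kids = [annotate(c, automata) for c in t[1:]]
    states = tuple(delta[(t[0],) + tuple(k[1][i] for k in kids)]
                   for i, (delta, finals) in enumerate(automata))
    u = tuple(tuple(int(k[1][i] in automata[i][1])
                    for i in range(len(automata)))
              for k in kids)
    return ((t[0], u),) + tuple(k[0] for k in kids), states

# One look-ahead language L1 = "the tree contains an f" (monadic example):
delta = {("b",): 0, ("a", 0): 0, ("a", 1): 1, ("f", 0): 1, ("f", 1): 1}
ann, states = annotate(("a", ("f", ("b",))), [(delta, {1})])
print(ann[0])   # root label: the direct subtree f[b] is in L1
```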


Another consequence of Theorem 4.77 is that the addition of regular look-ahead has no influence on the surface tree languages.

Corollary 4.79. T′-Surface = T-Surface and DT′-Surface = DT-Surface.

Proof. Let L be a (D)T′-surface tree language, so L = M(L1) for some M ∈ (D)T′ and L1 ∈ RECOG. Now, by Theorem 4.77, M = R ◦ N for some R ∈ DBQREL and N ∈ (D)T. Hence L = N(R(L1)). Since RECOG is closed under linear bottom-up tree transformations (Corollary 4.55), N(R(L1)) is a (D)T-surface tree language.

We now show that, in the linear case, there is no difference between bottom-up and top-down′ tree transformations (all properties (B), (T) and (B′) are “eliminated”), cf. Theorem 4.48.

Theorem 4.80. LT′ = LB.

Proof. First of all we have

LT′ ⊆ DBQREL ◦ LT   (Theorem 4.77)
⊆ DBQREL ◦ LB   (Theorem 4.48)
⊆ LB   (Theorem 4.57).
Let us now prove that LB ⊆ LT 0 . The construction is the same as in the proof of Theorem 4.48(2), but now we use look-ahead in case the bottom-up ftt is deleting. Let M = (Q, Σ, ∆, R, Qd ) be a linear bottom-up ftt. Define, for each q in Q, Mq to be the bottom-up ftt (Q, Σ, ∆, R, {q}). Construct the linear top-down0 ftt N = (Q, Σ, ∆, RN , Qd ) such that (i) if a → q[t] is in R, then q[a] → t is in RN ; (ii) if a[q1 [x1 ] · · · qk [xk ]] → q[t] is in R, then the rule q[a[x1 · · · xk ]] → thx1 ← q1 [x1 ], . . . , xk ← qk [xk ]i is in RN , where, for 1 ≤ i ≤ k, if xi does not occur in t, then D(xi ) = dom(Mqi ), and D(xi ) = TΣ otherwise. Note that dom(Mqi ) is recognizable by Exercise 4.29. It should be clear that T (N ) = T (M ).

From the proof of Theorem 4.80 it follows that each element of LT′ can be realized by a linear top-down′ ftt with the property that look-ahead is only used on subtrees which are deleted. This property corresponds precisely to property (B′). Let us now consider composition of top-down′ ftt. Analogously to the bottom-up case, we can now expect the results in the next theorem from property (B).

Theorem 4.81.

(1) T′ ◦ LT′ ⊆ T′.

(2) DT′ ◦ T′ ⊆ T′ and DT′ ◦ DT′ ⊆ DT′.


Proof. As in the bottom-up case, we only consider a number of special cases.

Lemma. T′ ◦ LHOM ⊆ T′ and DT′ ◦ HOM ⊆ DT′.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a top-down′ ftt and h a tree homomorphism from T∆ into TΩ. We construct a new top-down′ ftt using the old idea of applying the homomorphism to the right hand sides of the rules of M. Therefore we extend h to trees in T∆(Q[X]) by defining h0(x) = x for x in X and h1(q) = q[x1] for q ∈ Q. Let, for q ∈ Q, Mq = (Q, Σ, ∆, R, {q}). Note that, by Corollary 4.78, dom(Mq) is recognizable. Construct now N = (Q, Σ, Ω, RN, Qd) such that

(i) if q0[a] → t is in R, then q0[a] → h(t) is in RN;

(ii) if (q0[a[x1 · · · xk]] → t, D) is in R, then (q0[a[x1 · · · xk]] → h(t), D) is in RN, where, for 1 ≤ i ≤ k, D(xi) is the intersection of D(xi) and all tree languages dom(Mq) such that q[xi] occurs in t but not in h(t).

Thus N simultaneously simulates M and applies h to the output of M. But, whenever M starts making a translation of a subtree t starting in a state q, and this translation is deleted by h, N checks that t is translatable by Mq. If h is linear or M is deterministic, then N = M ◦ h. Obviously, if M is deterministic, then so is N.

Lemma. T′ ◦ QREL ⊆ T′, DT′ ◦ DTQREL ⊆ DT′ and DT′ ◦ DBQREL ⊆ DT′.

Proof. It is not difficult to see that, given M ∈ T′ and a top-down finite state relabeling N, the composition of these two can be realized by one top-down′ ftt K, which at each step simultaneously simulates M and N by transforming the output of M according to N. The construction is basically the same as that in the second lemma of Theorem 4.57. It is also clear that if M and N are both deterministic, then so is K. This gives us the first two statements of the lemma.

The third one is more difficult since the finite state relabeling is now bottom-up. However the same kind of construction can be applied again, using the look-ahead facility to make the resulting top-down′ ftt deterministic. Let M = (Q, Σ, ∆, R, Qd) with Qd = {qd} be in DT′ and let N = (QN, ∆, Ω, RN, QdN) be in DBQREL. Without loss of generality we assume that qd does not occur in the right hand sides of the rules in R. Let, as usual, for q ∈ Q, Mq = (Q, Σ, ∆, R, {q}), and, for p ∈ QN, Np = (QN, ∆, Ω, RN, {p}). We shall realize M ◦ N by a deterministic top-down′ ftt K = (Q, Σ, Ω, RK, Qd), where the set RK of rules is determined as follows. Let q ∈ Q and p ∈ QN, such that if q = qd then p ∈ QdN.

(i) If q[a] → t is in R and t =⇒*_N p[t′], then q[a] → t′ is in RK.

(ii) Let (q[a[x1 · · · xk]] → t, D) be a rule in R. Then t can be written as t = s⟨x1 ← q1[xi1], . . . , xm ← qm[xim]⟩ for certain m ≥ 0, s ∈ T∆(Xm), q1, . . . , qm ∈ Q and xi1, . . . , xim ∈ Xk, such that s is nondeleting w.r.t. Xm. Let p1, . . . , pm be a sequence of m states of N such that s⟨x1 ← p1[x1], . . . , xm ← pm[xm]⟩ =⇒*_N p[s′], where s′ ∈ TΩ(Xm). (Of course N was first extended in the usual way.) Then the rule q[a[x1 · · · xk]] → s′⟨x1 ← q1[xi1], . . . , xm ← qm[xim]⟩ is in RK, where the ranges of the variables are specified by D as follows. For 1 ≤ u ≤ k, D(xu) is the intersection of D(xu) and all tree languages dom(Mqj ◦ Npj) such that xij = xu.

This ends the construction. Intuitively, when K arrives at the root of an input subtree a[t1 · · · tk] (in the same state as M), it first uses its regular look-ahead to determine the rule applied by M and to determine, for every 1 ≤ j ≤ m, the (unique) state pj in which N will arrive after translation of the Mqj-translation of tij. It then runs N on the piece of output of M, starting in the states p1, . . . , pm, and produces the output of N as its piece of output. It is straightforward to check formally that K is a deterministic top-down′ ftt (using the determinism of both M and N).

We now complete the proof of Theorem 4.81. Firstly

T′ ◦ LT′ = T′ ◦ LB   (Theorem 4.80)
⊆ T′ ◦ QREL ◦ LHOM   (Theorem 4.52)
⊆ T′ ◦ LHOM   (second lemma)
⊆ T′   (first lemma).

Secondly

DT′ ◦ (D)T′ ⊆ DT′ ◦ DBQREL ◦ (D)T   (Theorem 4.77)
⊆ DT′ ◦ (D)T   (second lemma)
⊆ DT′ ◦ HOM ◦ L(D)T   (Theorem 4.64)
⊆ DT′ ◦ L(D)T   (first lemma).

Now DT′ ◦ LT ⊆ T′ (by (1) of this theorem) and

DT′ ◦ LDT ⊆ DT′ ◦ DTQREL ◦ LHOM   (Theorem 4.68)
⊆ DT′ ◦ LHOM   (second lemma)
⊆ DT′   (first lemma).

This proves Theorem 4.81.

Note that the “right hand side” of Theorem 4.81(2) states that DT′ is closed under composition. Clearly we know already that LT′ is closed under composition (since LT′ = LB). It can also easily be checked from the proof of Theorem 4.81 that LDT′ is closed under composition.

We can now show that indeed regular look-ahead has made DT stronger than DB.

Corollary 4.82. DB ⊊ DT′.

Proof. By Theorem 4.52, DB ⊆ DBQREL ◦ HOM. Hence, since the identity tree transformation is in DT′, we trivially have DB ⊆ DT′ ◦ DBQREL ◦ HOM. But, by the second and the first lemma in the proof of Theorem 4.81, DT′ ◦ DBQREL ◦ HOM ⊆ DT′. Hence DB ⊆ DT′. Proper inclusion follows from Lemma 4.72.

Exercise 4.83. Show that each T′-surface tree language is the range of some element of T′.

Exercise 4.84. Prove that T-Surface is closed under linear tree transformations (recall Corollary 4.79). Prove that DT-Surface is closed under deterministic top-down and bottom-up tree transformations.

It follows from Theorem 4.81 that the inclusion signs in Theorem 4.77 may be replaced by equality signs. Hence, for example, DT′ = DBQREL ◦ DT = DB ◦ DT. We finally show a result “dual” to Corollary 4.63(2) (recall that LT′ = LB).

Theorem 4.85. T′ = HOM ◦ LT′.

Proof. The inclusion HOM ◦ LT′ ⊆ T′ is immediate from Theorem 4.81. The inclusion T′ ⊆ HOM ◦ LT′ can be shown in exactly the same way as in the proof of T ⊆ HOM ◦ LT (Theorem 4.64). The only problem is the regular look-ahead: the image of a recognizable tree language under the homomorphism h need not be recognizable (we use the notation of the proof of Theorem 4.64). The solution is to consider a homomorphism g from TΩ into TΣ such that, for all t in TΣ, g(h(t)) = t; g is easy to find. Now, whenever, in a rule of M, we have a recognizable tree language L as look-ahead (for some variable), we can use g^{-1}(L) as the look-ahead in the corresponding rule of N. The details are left to the reader.

4.8 Surface and target languages

In this section we shall consider a few properties of the tree (and string) languages which are obtained from the recognizable tree languages by application of a finite number of tree transducers. In other words we shall consider the classes (REL ∪ FTA ∪ HOM)∗-Surface, briefly denoted by Surface, and (REL ∪ FTA ∪ HOM)∗-Target, briefly denoted by Target. Note that Target = yield(Surface). Note also that, by various composition results, (REL ∪ FTA ∪ HOM)∗ = T∗ = (T′)∗ = B∗ = (T′ ∪ B)∗ = . . . etc.

Let us first consider some classes of tree languages obtained by restricting the number of transducers applied. In particular, let us consider, for each k ≥ 1, the


classes T^k-Surface, (T′)^k-Surface and B^k-Surface. Obviously, by the above remark,

Surface = ⋃_{k≥1} T^k-Surface = ⋃_{k≥1} (T′)^k-Surface = ⋃_{k≥1} B^k-Surface.

As a corollary to previous results we can show that regular look-ahead has no influence on the class of surface languages (cf. Corollary 4.79).

Corollary 4.86. For all k ≥ 1,
(i) (T′)^k = DBQREL ◦ T^k,
(ii) (T′)^k-Surface = T^k-Surface, and
(iii) T^k-Surface is closed under linear tree transformations.

Proof. (i) By Theorem 4.77, T′ ⊆ DBQREL ◦ T. Also, by Corollary 4.82 and Theorem 4.81(2), DBQREL ◦ T ⊆ T′. Hence T′ = DBQREL ◦ T. We now show that T′ ◦ T′ = T′ ◦ T. Trivially, T′ ◦ T ⊆ T′ ◦ T′. Also T′ ◦ T′ = T′ ◦ DBQREL ◦ T and, by Theorem 4.81(1), T′ ◦ DBQREL ⊆ T′. Hence T′ ◦ T′ ⊆ T′ ◦ T. From this it is straightforward to see that (T′)^(k+1) = T′ ◦ T^k = DBQREL ◦ T ◦ T^k = DBQREL ◦ T^(k+1).
(ii) This is an immediate consequence of (i) and the fact that RECOG is closed under linear tree transformations.
(iii) This follows easily from (ii) and the fact that (T′)^k is closed under composition with linear tree transformations (Theorem 4.81(1); recall also Theorem 4.80).

We mention here that it can also be shown that T^k-Surface is closed under union, tree concatenation and tree concatenation closure. From that a lot of closure properties of T^k-Target and Target follow.

The relation between the top-down and the bottom-up surface tree languages is easy.

Corollary 4.87. For all k ≥ 1,
(i) T^k-Surface = (B^k ◦ LB)-Surface and B^(k+1)-Surface = (T^k ◦ HOM)-Surface;
(ii) B^k-Surface ⊆ T^k-Surface ⊆ B^(k+1)-Surface.

Proof. (i) follows from the fact that B = LB ◦ HOM (Corollary 4.63(2)) and that T′ = HOM ◦ LB (Theorem 4.85). (ii) is an obvious consequence of (i). Note that B-Surface = HOM-Surface (Exercise 4.56).

It is not known, but conjectured, that for all k the inclusions in Corollary 4.87(ii) are proper. Note that, by taking yields, Corollary 4.87 also holds for the corresponding target languages. Again it is not known whether the inclusions are proper.


In the rest of this section we show that the emptiness-, the membership- and the finiteness-problem are solvable for Surface and Target.

Theorem 4.88. The emptiness- and membership-problem are solvable for Surface.

Proof. Let M ∈ (REL ∪ FTA ∪ HOM)∗ and L ∈ RECOG. Consider the tree language M(L) ∈ Surface. Obviously, M(L) = ∅ iff L ∩ dom(M) = ∅. But, by Remark 4.61, dom(M) is recognizable. Hence, by Theorem 3.32 and Theorem 3.74, it is decidable whether L ∩ dom(M) = ∅. To show solvability of the membership-problem note first that Surface is closed under intersection with a recognizable tree language (if R ∈ RECOG and R̂ is the fta restriction such that dom(R̂) = R, then M(L) ∩ R = (M ◦ R̂)(L)). Now, for any tree t, t ∈ M(L) iff M(L) ∩ {t} ≠ ∅. Since {t} is recognizable, M(L) ∩ {t} ∈ Surface, and we just showed that it is decidable whether a surface tree language is empty or not.

To show decidability of the finiteness-problem we shall use the following result.

Lemma 4.89. Each monadic tree language in Surface is recognizable (and hence regular as a set of strings).

Proof. Let L be a monadic tree language. Obviously it suffices to show that, for each k ≥ 1, if L ∈ (T′)^k-Surface, then L ∈ (T′)^(k−1)-Surface (where, by definition, (T′)^0-Surface = RECOG). Suppose therefore that L ∈ (T′)^k-Surface, so L = (M1 ◦ · · · ◦ Mk−1 ◦ Mk)(R) for certain Mi ∈ T′ and R ∈ RECOG. Consider all right hand sides of rules in Mk. Obviously, since L is monadic, these right hand sides do not contain elements of rank ≥ 2, that is, they are monadic in the sense of Notation 4.42 (rules which have nonmonadic right hand sides may be removed). But from this it follows that Mk is linear. It now follows from Theorem 4.81(1) (and Corollary 4.55 and Theorem 4.80 in the case k = 1) that L ∈ (T′)^(k−1)-Surface.

Theorem 4.90. The finiteness-problem is solvable for Surface.

Proof.
Intuitively, a tree language is finite if and only if the set of paths through this tree language is finite. For L ⊆ TΣ we define path(L) ⊆ Σ∗ recursively as follows:
(i) for a ∈ Σ0, path(a) = {a};
(ii) for k ≥ 1, a ∈ Σk and t1, . . . , tk ∈ TΣ, path(a[t1 · · · tk]) = {a} · (path(t1) ∪ · · · ∪ path(tk));
(iii) for L ⊆ TΣ, path(L) = ⋃_{t∈L} path(t).

Thus, path(t) consists of all strings which can be obtained by following a path through t (for instance, if t = a[bb[cc]], then path(t) = {ab, abc}). We remark here that any other similar definition of “path” would also satisfy our purposes. It is left to the reader to show that, for any tree language L, L is finite iff path(L) is finite. We now show that, given L ⊆ TΣ, the set path(L) can be obtained by feeding L into a top-down tree transducer Mp: Mp(L) will be equal to path(L), modulo the correspondence


between strings and monadic trees (see Definition 2.21). In fact, Mp = (Q, Σ, ∆, R, Qd), where Q = Qd = {p}, ∆0 = {e}, ∆1 = Σ and R consists of the following rules:
(i) for each a ∈ Σ0, p[a] → a[e] is in R;
(ii) for every k ≥ 1, a ∈ Σk and 1 ≤ i ≤ k, p[a[x1 · · · xk]] → a[p[xi]] is in R.

Consequently, L is finite iff Mp(L) is finite. Now, let L ∈ Surface. Then, obviously, Mp(L) ∈ Surface. Moreover Mp(L) is monadic. Hence, by Lemma 4.89, Mp(L) is recognizable. Thus, by Theorem 3.75, it is decidable whether Mp(L) is finite.

To show that the above mentioned problems are solvable for Target, we need the following lemma.

Lemma 4.91. Each language in Target is (effectively) of the form yield(L) or yield(L) ∪ {λ}, where L ∈ Surface and L ⊆ TΣ for some Σ such that Σ1 = ∅ and e ∉ Σ.

Proof. It is left as an exercise to show that, for any L′ ⊆ TΣ′, there exists a bottom-up tree transducer M such that M(L′) ⊆ TΣ for some Σ satisfying the requirements, and such that yield(M(L′)) = yield(L′) − {λ}. It is also left as an exercise to show that it is decidable whether λ ∈ yield(L′). From these two facts the lemma follows.

Theorem 4.92. The emptiness-, membership- and finiteness-problem are solvable for Target.

Proof. It is obvious from the previous lemma that we may restrict ourselves to target languages yield(L), where L ∈ Surface and L ⊆ TΣ for some Σ such that Σ1 = ∅ and e ∉ Σ0. Obviously, yield(L) = ∅ iff L = ∅. Hence the emptiness-problem is solvable by Theorem 4.88. Note that, by Example 2.17, for a given w ∈ Σ0∗, there are only a finite number of trees t such that yield(t) = w. From this and Theorem 4.88 the decidability of the membership-problem follows. Moreover it follows that yield(L) is finite iff L is finite. Hence, by Theorem 4.90, the finiteness-problem is solvable.

We note that it can be shown that Target is (properly, by Theorem 4.92) contained in the class of context-sensitive languages.
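The observation behind the membership-problem for Target, that for Σ1 = ∅ only finitely many trees share a given yield, can be made concrete by enumerating them. The two-symbol alphabet and the string display of trees below are assumptions of this sketch, not notation from the text.

```python
# With Σ1 = ∅ every internal node has at least two sons, so a tree with a
# yield of length n has at most 2n - 1 nodes: only finitely many candidates.
SIGMA0 = {"a", "b"}   # leaf symbols
SIGMA2 = {"f"}        # one binary symbol; no unary symbols

def trees_with_yield(w):
    """All trees over the alphabet above whose yield is the string w."""
    if len(w) == 1:
        return [w] if w in SIGMA0 else []
    out = []
    for i in range(1, len(w)):          # split the yield between two subtrees
        for a in SIGMA2:
            for left in trees_with_yield(w[:i]):
                for right in trees_with_yield(w[i:]):
                    out.append(f"{a}[{left}{right}]")
    return out

# the two ways to bracket "aab" with one binary symbol
assert trees_with_yield("aab") == ["f[af[ab]]", "f[f[aa]b]"]
```

Deciding w ∈ yield(L) then amounts to finitely many membership tests t ∈ L, each decidable by Theorem 4.88. With only binary symbols the number of candidate trees grows as the Catalan numbers, so this is an illustration, not an efficient procedure.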
Thus, Target lies properly between the context-free and the context-sensitive languages.

We finally note that, in a certain sense, Surface ⊆ Target (cf. the similar situation for RECOG and CFL). In fact, let L ⊆ TΣ be in Surface. Let J and K be two new symbols (“standing for [ and ]”). Let ∆ be the ranked alphabet such that ∆0 = Σ ∪ {J, K}, ∆1 = ∆2 = ∆3 = ∅ and ∆k+3 = Σk for k ≥ 1. Let M = (Q, Σ, ∆, R, Qd) be the (deterministic) top-down tree transducer such that Q = Qd = {f} and the rules are the following:
(i) for k ≥ 1 and a ∈ Σk, f[a[x1 · · · xk]] → a[a J f[x1] · · · f[xk] K] is in R,


(ii) for a ∈ Σ0, f[a] → a is in R.
(Note that M is in fact a homomorphism.) It is left to the reader to show that yield(M(L)) = L⟨[ ← J, ] ← K⟩ (as string languages).
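The homomorphism M and its yield can be sketched concretely; the nested-list encoding of trees used here is an assumption of this illustration, not notation from the text.

```python
# Trees are (label, [subtrees]); "J" and "K" stand for the brackets [ and ].
def M(t):
    a, subs = t
    if not subs:                      # rule (ii): f[a] -> a for a of rank 0
        return (a, [])
    # rule (i): f[a[x1 ... xk]] -> a[ a J f[x1] ... f[xk] K ]
    return (a, [(a, []), ("J", [])] + [M(s) for s in subs] + [("K", [])])

def yield_(t):
    # concatenate the leaf labels from left to right
    a, subs = t
    return a if not subs else "".join(yield_(s) for s in subs)

# yield(M(t)) is the bracket notation of t, with J, K in place of [, ]
t = ("a", [("b", []), ("b", [("c", []), ("c", [])])])
assert yield_(M(t)) == "aJbbJccKK"   # i.e. a[bb[cc]] with J, K for [, ]
```

So the string language yield(M(L)) is just L itself, read as a set of strings with the brackets renamed, which is the sense in which Surface ⊆ Target.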

5 Notes on the literature

In the text there are some references to
[A&U] A.V. Aho and J.D. Ullman, The theory of parsing, translation and compiling, I and II, Prentice-Hall, 1972.
[Sal] A. Salomaa, Formal languages, Academic Press, 1973.
An informal survey of the theory of tree automata and tree transducers (up till 1970) is given by [Thatcher, 1973].

On Section 3

Bottom-up finite tree automata were invented around 1965 independently by [Doner, 1965, 1970] and [Thatcher & Wright, 1968] (and somewhat later by [Pair & Quere, 1968]). The original aim of the theory of tree automata was to apply it to the decision problems of second-order logical theories concerning strings. The connection with context-free languages was established in [Mezei & Wright, 1967] and [Thatcher, 1967], and the idea to give “tree-oriented” proofs for results on strings is expressed in [Thatcher, 1973] and [Rounds, 1970a]. Independently, results concerning parenthesis languages and structural equivalence were obtained by [McNaughton, 1967], [Knuth, 1967], [Ginsburg & Harrison, 1967] and [Paull & Unger, 1968]. Top-down finite tree automata were introduced by [Rabin, 1969] and [Magidor & Moran, 1969], and regular tree grammars by [Brainerd, 1969]. The notion of rule tree languages occurs in one form or another in several places in the literature. Most of the results of Section 3 can be found in the above mentioned papers. Other work on finite tree automata and recognizable tree languages is written down in the following papers:
[Arbib & Give’on, 1968], automata on acyclic graphs, category theory;
[Brainerd, 1968], state minimalization of finite tree automata;
[Costich, 1972], closure properties of RECOG;
[Eilenberg & Wright, 1967], category theoretic formulation of fta;
[Ito & Ando, 1974], axioms for regular tree expressions;
[Maibaum, 1972, 1974], tree automata on “many-sorted” trees;


[Ricci, 1973], decomposition of fta;
[Takahashi, 1972, 1973], several results;
[Yeh, 1971], generalization of “semigroup of fa” to fta.

Remark: The notion of substitution (tree concatenation) can be formalized algebraically as in [Eilenberg & Wright, 1967], [Goguen & Thatcher, 1974], [Thatcher, 1970] and [Yeh, 1971].

Generalizations of finite automata are the following:
– finite automata on derivation graphs of type 0 grammars, [Benson, 1970], [Buttelmann, 1971], [Hart, 1974];
– probabilistic tree automata, [Ellis, 1970], [Magidor & Moran, 1970];
– automata and grammars for infinite trees, [Rabin, 1969], [Goguen & Thatcher, 1974], [Engelfriet, 1972];
– recognition of subsets of an arbitrary algebra (rather than the algebra of trees), [Mezei & Wright, 1967], [Shephard, 1969].

There are no real open problems in finite tree automata theory. It is only a question of (i) how far one wants to go in generalizing finite automata theory to trees (for instance, decomposition theory, theory of incomplete sequential machines, noncounting regular languages, etc.), and (ii) which results on context-free languages one can prove via trees (for instance, Greibach normal form, Parikh’s theorem, etc.).

On Section 4

For the literature on syntax-directed translation, see [A&U]. The notion of “generalized syntax-directed translation” is defined in [Aho & Ullman, 1971]. The top-down tree transducer was introduced in [Rounds, 1968], [Thatcher, 1970] and [Rounds, 1970b] as a model for syntax-directed translation and transformational grammars. The bottom-up tree transducer was introduced in [Thatcher, 1973]. The notion of a tree rewriting system occurs in one form or another in [Brainerd, 1969], [Rosen, 1971, 1973], [Engelfriet, 1971] and [Maibaum, 1974]. Most results of Section 4 can be found in [Rounds, 1970b], [Thatcher, 1970], [Rosen, 1971], [Engelfriet, 1971], [Baker, 1973] and [Ogden & Rounds, 1972]. Other papers on tree transformations are [Alagić, 1973], [Benson, 1971], [Bertsch, 1973], [Kosaraju, 1973], [Levy & Joshi, 1973], [Martin & Vere, 1970] and [Rounds, 1973]. We mention the following problems concerning tree transformations.


– “Statements such as ‘similar models have been studied by [x, y, z]’ are symptomatic of the disarray in the development of the theory of translation and semantics” (free after [Thatcher, 1973]). Establish the precise relationships between various models of syntax-directed translation, semantics of context-free languages and tree transducers (see [Goguen & Thatcher, 1974]):
(i) Compare existing models of syntax-directed translation with top-down tree transducers, in particular with respect to the classes of translations they define (see Definition 4.41). See [A&U], [Aho & Ullman, 1971], [Thatcher, 1973] and [Martin & Vere, 1970].
(ii) Define a tree transducer corresponding to the semantics definition method of [Knuth, 1968].
– Develop a general theory of operations on tree languages (tree AFL theory). Relate these operations to the yield languages, as illustrated in Section 3. This work was started by [Baker, 1973]. It would perhaps be convenient to consider tree transducers (Q, Σ, ∆, R, Qd) such that R consists of rules t1 → t2 with t1 ∈ Q[TΣ(X)], t2 ∈ T∆[Q(X)] in the top-down case, or t1 ∈ TΣ(Q[X]), t2 ∈ Q[T∆(X)] in the bottom-up case.
– Consider surface tree languages and target languages more carefully. Prove that the class T-Target is incomparable with the class of indexed languages. Prove that the classes T^k-Surface form a proper hierarchy (see [Ogden & Rounds, 1972] and [Baker, 1973]). Prove that the classes T^k-Target form a proper hierarchy (then you have also proved the previous one). Prove that DT-Target ⊊ T-Target. Is it possible to obtain each target language by a sequence of nondeleting (and nonerasing) tree transducers? Etc.
– Consider the complexity of target languages and translations (see [Aho & Ullman, 1971], [Baker, 1973] and [Rounds, 1973]).
– What is the practical use of tree transducers? (see [A&U], [de Remer, 1974]).

Other subjects

We mention finally the following subjects in tree theory.
– Context-free tree grammars, [Fischer, 1968], [Rounds, 1969, 1970a, 1970b], [Downey, 1974], [Maibaum, 1974]. Let us explain briefly the notion of a context-free tree grammar. Consider a system G = (N, Σ, R, S), where N is a ranked alphabet of nonterminals, Σ is a ranked alphabet of terminals, V := N ∪ Σ, S ∈ N0 is the initial nonterminal, and R is a finite set of rules of one of the forms A → t with A ∈ N0 and t ∈ TV, or A[x1 · · · xk] → t with A ∈ Nk and t ∈ TV(Xk). G is considered as a tree rewriting system on TV. If we let all variables in the rules range over TV, then G is called a context-free tree grammar. If we let all variables range over TΣ, then G is called a bottom-up (or inside-out) context-free tree grammar. The language generated by G is L(G) = {t ∈ TΣ | S ⇒∗ t}. In general the languages generated by G under the above two interpretations differ (consider for instance the grammar S → F[A], F[x1] → f[x1 x1], A → a, A → b). Thus restriction to bottom-up (≈ right-most in the string case) generation gives another language. On the other hand, restriction to top-down (≈ left-most) generation can be done without changing the language. The yield of the class of context-free tree languages is equal to the class of indexed languages. The yield of the class of bottom-up context-free tree languages is called IO. These two classes are incomparable ([Fischer, 1968]). How do these classes compare with the target languages? Is it possible to iterate the CFL → RECOG procedure and obtain results about (bottom-up) context-free tree languages from regular tree languages (of a “higher order”)? (See [Maibaum, 1974].) Is there any sense in considering pushdown tree automata?
– General computability on trees: [Rus, 1967], [Mahn, 1969], the Vienna method.
– Tree walking automata (at each moment of time the finite automaton is at one node of the tree; depending on its state and the label of the node it goes to the father node or to one of the son nodes): [Aho & Ullman, 1971], [Martin & Vere, 1970].
– Tree adjunct grammars (another, linguistically motivated, way of generating tree languages): [Joshi & Levy & Takahashi, 1973].
– Lindenmayer tree systems (parallel rewriting): [Čulik, 1974], [Čulik & Maibaum, 1974], [Engelfriet, 1974], [Szilard, 1974].
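The two interpretations of the grammar S → F[A], F[x1] → f[x1 x1], A → a, A → b mentioned above can be worked out concretely. The string display of trees below is only for illustration; the point is that copying before rewriting A (the unrestricted, outside-in case) and copying after (the bottom-up, inside-out case) give different languages.

```python
# The grammar: S -> F[A], F[x1] -> f[x1 x1], A -> a, A -> b.
A_CHOICES = ["a", "b"]

def oi_language():
    # unrestricted (context-free) generation: F may copy the nonterminal A,
    # and each copy is then rewritten independently
    return {f"f[{x}{y}]" for x in A_CHOICES for y in A_CHOICES}

def io_language():
    # bottom-up (inside-out) generation: A is rewritten to a terminal first,
    # so both copies produced by F agree
    return {f"f[{x}{x}]" for x in A_CHOICES}

assert oi_language() == {"f[aa]", "f[ab]", "f[ba]", "f[bb]"}
assert io_language() == {"f[aa]", "f[bb]"}
```

So the inside-out language is a proper subset here; in general the two language classes (OI, whose yields are the indexed languages, and IO) are incomparable.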

References

A.V. Aho and J.D. Ullman, 1971. Translations on a context-free grammar, Inf. & Control 19, 439-475.
S. Alagić, 1973. Natural state transformations, Techn. Report 73B-2, Univ. of Massachusetts at Amherst.
M.A. Arbib and Y. Give’on, 1968. Algebra automata, I & II, Inf. & Control 12, 331-370.
B.S. Baker, 1973. Tree transductions and families of tree languages, Report TR-9-73, Harvard University (Abstract in: 5th Theory of Computing, 200-206).
D.B. Benson, 1970. Syntax and semantics: a categorical view, Inf. & Control 17, 145-160.


D.B. Benson, 1971. Semantic preserving translations, Working paper, Washington State University, Washington.
E. Bertsch, 1973. Some considerations about classes of mappings between context-free derivation systems, Lecture Notes in Computer Science 2, 278-283.
D. Bjørner, 1972. Finite state tree computations, IBM Report RJ 1053.
W.S. Brainerd, 1968. The minimalization of tree automata, Inf. & Control 13, 484-491.
W.S. Brainerd, 1969. Tree generating regular systems, Inf. & Control 14, 217-231.
W.S. Brainerd, 1969a. Semi-Thue systems and representation of trees, 10th SWAT, 240-244.
H.W. Buttelmann, 1971. On generalized finite automata and unrestricted generative grammars, 3d Theory of Computing, 63-77.
O.L. Costich, 1972. A Medvedev characterization of sets recognized by generalized finite automata, Math. Syst. Th. 6, 263-267.
S.C. Crespi Reghizzi and P. Della Vigna, 1973. Approximation of phrase markers by regular sets, Automata, Languages and Programming (ed. M. Nivat), North-Holland Publ. Co., 367-376.
K. Čulik II, 1974. Structured OL-Systems, L-Systems (eds. Rozenberg & Salomaa), Lecture Notes in Computer Science 15, 216-229.
K. Čulik II and T.S.E. Maibaum, 1974. Parallel rewriting systems on terms, Automata, Languages and Programming (ed. Loeckx), Lecture Notes in Computer Science 14, 495-511.
J. Doner, 1970. Tree acceptors and some of their applications, J. Comp. Syst. Sci. 4, 406-451 (announced in Notices Am. Math. Soc. 12 (1965), 819, as Decidability of the weak second-order theory of two successors).
P.J. Downey, 1974. Formal languages and recursion schemes, Report TR 16-74, Harvard University.
S. Eilenberg and J.B. Wright, 1967. Automata in general algebras, Inf. & Control 11, 452-470.
C.A. Ellis, 1970. Probabilistic tree automata, 2nd Theory of Computing, 198-205.
J. Engelfriet, 1971. Bottom-up and top-down tree transformations – a comparison, Memorandum 19, T.H. Twente, Holland (to be publ. in Math. Syst. Th.).
J. Engelfriet, 1972. A note on infinite trees, Inf. Proc. Letters 1, 229-232.


J. Engelfriet, 1974. Surface tree languages and parallel derivation trees, Daimi Report PB-44, Aarhus University, Denmark.
M.J. Fischer, 1968. Grammars with macro-like productions, 9th SWAT, 131-142 (Doctoral dissertation, Harvard University).
S. Ginsburg and M.A. Harrison, 1967. Bracketed context-free languages, J. Comp. Syst. Sci. 1, 1-23.
J.A. Goguen and J.W. Thatcher, 1974. Initial algebra semantics, 15th SWAT.
J.M. Hart, 1974. Acceptors for the derivation languages of phrase-structure grammars, Inf. & Control 25, 75-92.
T. Ito and S. Ando, 1974. A complete axiom system of super-regular expressions, Proc. IFIP Congress 74, 661-665.
A.K. Joshi, L.S. Levy and M. Takahashi, 1973. A tree generating system, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 453-465.
D.E. Knuth, 1967. A characterization of parenthesis languages, Inf. & Control 11, 269-289.
D.E. Knuth, 1968. Semantics of context-free languages, Math. Syst. Th. 2, 127-145 (see also: correction in Math. Syst. Th. 5 (1971), 95-96, and “Examples of formal semantics” in Lecture Notes in Mathematics 188 (ed. Engeler)).
S. Kosaraju, 1973. Context-sensitiveness of translational languages, 7th Princeton Conf. on Inf. Sci. and Syst.
L.S. Levy and A.K. Joshi, 1973. Some results in tree automata, Math. Syst. Th. 6, 334-342.
M. Magidor and G. Moran, 1969. Finite automata over finite trees, Techn. Report No. 30, Hebrew University, Jerusalem.
M. Magidor and G. Moran, 1970. Probabilistic tree automata and context-free languages, Israel J. Math. 8, 340-348.
F.K. Mahn, 1969. Primitiv-rekursive Funktionen auf Termmengen, Archiv f. Math. Logik und Grundlagenforschung 12, 54-65.
T.S.E. Maibaum, 1972. The characterization of the derivation trees of context-free sets of terms as regular sets, 13th SWAT, 224-230.
T.S.E. Maibaum, 1974. A generalized approach to formal languages, J. Comp. Syst. Sci. 8, 409-439.
D.F. Martin and S.A. Vere, 1970. On syntax-directed transduction and tree-transducers, 2nd Theory of Computing, 129-135.


R. McNaughton, 1967. Parenthesis grammars, Journal of the ACM 14, 490-500.
J. Mezei and J.B. Wright, 1967. Algebraic automata and context-free sets, Inf. & Control 11, 3-29.
D.E. Muller, 1968. Use of multiple index matrices in generalized automata theory, 9th SWAT, 395-404.
W.F. Ogden and W.C. Rounds, 1972. Composition of n tree transducers, 4th Theory of Computing.
C. Pair and A. Quere, 1968. Définition et étude des bilangages réguliers, Inf. & Control 13, 565-593.
M.C. Paull and S.H. Unger, 1968. Structural equivalence of context-free grammars, J. Comp. Syst. Sci. 2, 427-463.
M.O. Rabin, 1969. Decidability of second-order theories and automata on infinite trees, Transactions of the Am. Math. Soc. 141, 1-35.
F.L. de Remer, 1974. Transformational grammars for languages and compilers, Lecture Notes in Computer Science 21.
G. Ricci, 1973. Cascades of tree-automata and computations in universal algebras, Math. Syst. Th. 7, 201-218.
B.K. Rosen, 1971. Subtree replacement systems, Ph.D. Thesis, Harvard University.
B.K. Rosen, 1973. Tree-manipulating systems and Church-Rosser theorems, Journal of the ACM 20, 160-188.
W.C. Rounds, 1968. Trees, transducers and transformations, Ph.D. Dissertation, Stanford University.
W.C. Rounds, 1969. Context-free grammars on trees, 1st Theory of Computing, 143-148.
W.C. Rounds, 1970a. Tree-oriented proofs of some theorems on context-free and indexed languages, 2nd Theory of Computing, 109-116.
W.C. Rounds, 1970b. Mappings and grammars on trees, Math. Syst. Th. 4, 257-287.
W.C. Rounds, 1973. Complexity of recognition in intermediate-level languages, 14th SWAT, 145-158.
T. Rus, 1967. Some observations concerning the application of the electronic computers in order to solve nonarithmetical problems, Mathematica 9, 343-360.
C.D. Shephard, 1969. Languages in general algebras, 1st Theory of Computing, 155-163.
A.L. Szilard, 1974. Ω-OL Systems, L-Systems (eds. Rozenberg and Salomaa), Lecture Notes in Computer Science 15, 258-291.


M. Takahashi, 1972. Regular sets of strings, trees and W-structures, Dissertation, University of Pennsylvania.
M. Takahashi, 1973. Primitive transformations of regular sets and recognizable sets, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 475-480.
J.W. Thatcher, 1967. Characterizing derivation trees of context-free grammars through a generalization of finite automata theory, J. Comp. Syst. Sci. 1, 317-322.
J.W. Thatcher, 1970. Generalized² sequential machine maps, J. Comp. Syst. Sci. 4, 339-367 (also IBM Report RC 2466; also published in 1st Theory of Computing as “Transformations and translations from the point of view of generalized finite automata theory”).
J.W. Thatcher, 1973. Tree automata: an informal survey, Currents in the Theory of Computing (ed. Aho), Prentice-Hall, 143-172 (also published in 4th Princeton Conf. on Inf. Sci. and Systems, 263-276, as “There’s a lot more to finite automata theory than you would have thought”).
J.W. Thatcher and J.B. Wright, 1968. Generalized finite automata theory with an application to a decision problem of second-order logic, Math. Syst. Th. 2, 57-81.
R. Turner, 1973. An infinite hierarchy of term languages – an approach to mathematical complexity, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 593-608.
R.T. Yeh, 1971. Some structural properties of generalized automata and algebras, Math. Syst. Th. 5, 306-318.


Thirdly there are, of course, many other ideas in tree language theory. In the literature one can find, for instance, context-free tree grammars, recognition of subsets of arbitrary algebras, tree walking automata, hierarchies of tree languages (obtained by iterating old ideas), decomposition of tree automata, Lindenmayer tree grammars, etc. These lectures will be divided in the following five parts: (1) and (2) contain preliminaries, (3), (4) and (5) are the main parts. (1) Introduction. (p. 1) (2) Some basic definitions. (p. 2) (3) Recognizable (= regular) tree languages. (p. 10) (4) Finite state tree transformations. (p. 32) (5) Whatever there is more to consider. Part (5) is not contained in these notes; instead, some Notes on the literature are given on p. 69.

Contents 1 Introduction

1

2 Some basic definitions

2

3 Recognizable tree languages 10 3.1 Finite tree automata and regular tree grammars . . . . . . . . . . . . . . 10 3.2 Closure properties of recognizable tree languages . . . . . . . . . . . . . . 18 3.3 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 4 Finite state tree transformations 4.1 Introduction: Tree transducers and semantics . . . . . . . . . . . . 4.2 Top-down and bottom-up finite tree transducers . . . . . . . . . . 4.3 Comparison of B and T, the nondeterministic case . . . . . . . . . 4.4 Decomposition and composition of bottom-up tree transformations 4.5 Decomposition of top-down tree transformations . . . . . . . . . . 4.6 Comparison of B and T, the deterministic case . . . . . . . . . . . 4.7 Top-down finite tree transducers with regular look-ahead . . . . . . 4.8 Surface and target languages . . . . . . . . . . . . . . . . . . . . . 5 Notes on the literature

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

32 32 35 46 51 56 58 59 65 69

1 Introduction Our basic data type is the kind of tree used to express the grammatical structure of strings in a context-free language. Example 1.1. Consider the context-free grammar G = (N, Σ, R, S) with nonterminals N = {S, A, D}, terminals Σ = {a, b, d}, initial nonterminal S and the set of rules R, consisting of the rules S → AD, A → aAb, A → bAa, A → AA, A → λ, D → Ddd and D → d (we use λ to denote the empty string). The string baabddd ∈ Σ∗ can be generated by G and has the following derivation tree (see [Sal, II.6], [A&U, 0.5 and 2.4.1]): S A

D

A b

A

A a

e

a

A

D b

d

d

d

e

– Note that we use e as a symbol standing for the empty string λ. – The string baabddd is called the “yield” or “result” of the derivation tree.

Thus, in graph terminology, our trees are finite (finite number of nodes and branches), directed (the branches are “growing downwards”), rooted (there is a node, the root, with no branches entering it), ordered (the branches leaving a node are ordered from left to right) and labeled (the nodes are labeled with symbols from some alphabet). The following intuitive terminology will be used: – the rank (or out-degree) of a node is the number of branches leaving it (note that the in-degree of a node is always 1, except for the root which has in-degree 0) – a leaf is a node with rank 0 – the top of a tree is its root – the bottom (or frontier) of a tree is the set (or sequence) of its leaves – the yield (or result, or frontier) of a tree is the string obtained by writing the labels of its leaves (except the label e) from left to right – a path through a tree is a sequence of nodes connected by branches (“leading downwards”); the length of the path is the number of its nodes minus one (that is, the number of its branches) – the height (or depth) of a tree is the length of the longest path from the top to the bottom

1

– if there is a path of length ≥ 1 (of length = 1) from a node a to a node b then b is a descendant (direct descendant) of a and a is an ancestor (direct ancestor) of b – a subtree of a tree is a tree determined by a node together with all its descendants; a direct subtree is a subtree determined by a direct descendant of the root of the tree; note that each tree is uniquely determined by the label of its root and the (possibly empty) sequence of its direct subtrees – the phrases “bottom-up”, “bottom-to-top” and “frontier-to-root” are used to indicate this ↑ direction, while the phrases “top-down”, “top-to-bottom” and “root-to-frontier” are used to indicate that ↓ direction. In derivation trees of context-free grammars each symbol may only label nodes of certain ranks. For instance, in the above example, a, b, d and e may only label leaves (nodes of rank 0), A labels nodes with ranks 1, 2 and 3, S labels nodes with rank 2, and D nodes of rank 1 and 3 (these numbers being the lengths of the right hand sides of rules). Therefore, given some alphabet, we require the specification of a finite number of ranks for each symbol in the alphabet, and we restrict attention to those trees in which nodes of rank k are labeled by symbols of rank k.

2 Some basic definitions

The mathematical definition of a tree may be given in several, equivalent, ways. We will define a tree as a special kind of string (others call this string a representation of the tree, see [A&U, 0.5.7]). Before doing so, let us define ranked alphabets.

Definition 2.1. An alphabet Σ is said to be ranked if for each nonnegative integer k a subset Σk of Σ is specified, such that Σk is nonempty for a finite number of k's only, and Σ = ⋃k≥0 Σk . †

If a ∈ Σk , then we say that a has rank k (note that a may have more than one rank). Usually we define a specific ranked alphabet Σ by specifying those Σk that are nonempty. Example 2.2. The alphabet Σ = {a, b, +, −} is made into a ranked alphabet by specifying Σ0 = {a, b}, Σ1 = {−} and Σ2 = {+, −}. (Think of negation and subtraction). Remark 2.3. Throughout our discussions we shall use the symbol e as a special symbol, intuitively representing λ. Whenever e belongs to a ranked alphabet, it is of rank 0. Operations on ranked alphabets should be defined as for instance in the following definition. † To be more precise one should define a ranked alphabet as a pair (Σ, f ), where Σ is an alphabet and f is a mapping from N into P(Σ) such that ∃n ∀k ≥ n : f (k) = ∅, and then denote f (k) by Σk and (Σ, f ) by Σ. Note that N = {0, 1, 2, . . .} is the set of natural numbers and that P(Σ) is the set of subsets of Σ.


Definition 2.4. Let Σ and ∆ be ranked alphabets. The union of Σ and ∆, denoted by Σ ∪ ∆, is defined by (Σ ∪ ∆)k = Σk ∪ ∆k , for all k ≥ 0. We say that Σ and ∆ are equal, denoted by Σ = ∆, if, for all k ≥ 0, Σk = ∆k . We now define the notion of tree. Let “ [ ” and “ ] ” be two symbols which are never elements of a ranked alphabet. Definition 2.5. Given a ranked alphabet Σ, the set of trees over Σ, denoted by TΣ , is the language over the alphabet Σ ∪ {[ , ]} defined inductively as follows. (i) If a ∈ Σ0 , then a ∈ TΣ . (ii) For k ≥ 1, if a ∈ Σk and t1 , t2 , . . . , tk ∈ TΣ , then a[t1 t2 · · · tk ] ∈ TΣ .
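Since Definition 2.5 defines TΣ as a set of strings, membership can be checked by a small recursive parser. The following sketch (my code, not from the notes; it assumes single-character symbols and uses the ranked alphabet of Example 2.2) follows cases (i) and (ii) literally:

```python
# The ranked alphabet of Example 2.2: Sigma0 = {a, b}, Sigma1 = {-}, Sigma2 = {+, -}.
# Each symbol is mapped to its set of ranks.
SIGMA = {"a": {0}, "b": {0}, "-": {1, 2}, "+": {2}}

def is_tree(s):
    """Check whether the string s is in T_Sigma (Definition 2.5)."""
    ok, rest = _parse(s)
    return ok and rest == ""

def _parse(s):
    # Try to parse one tree from the front of s; return (ok, remaining input).
    if s == "":
        return False, s
    a, rest = s[0], s[1:]
    if a not in SIGMA:
        return False, s
    if rest.startswith("["):            # case (ii): a[t1 ... tk]
        rest = rest[1:]
        k = 0
        while not rest.startswith("]"):
            ok, rest = _parse(rest)
            if not ok:
                return False, s
            k += 1
        return (k >= 1 and k in SIGMA[a]), rest[1:]
    return (0 in SIGMA[a]), rest        # case (i): a alone, of rank 0

print(is_tree("+[-[a-[b]]a]"))   # the tree of Example 2.6 -> True
print(is_tree("+[a]"))           # + does not have rank 1 -> False
```

Note how the rank check (k in SIGMA[a]) is exactly the restriction "a ∈ Σk" of Definition 2.5(ii).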

Intuitively, a is a tree with one node labeled "a", and a[t1 t2 · · · tk ] is the tree whose root is labeled "a" and whose direct subtrees are t1 , t2 , . . . , tk , in that order from left to right.

Example 2.6. Consider the ranked alphabet of Example 2.2. Then +[−[a−[b]]a] is a tree over this alphabet, intuitively "representing" the tree with root + whose direct subtrees are the tree −[a−[b]] and the leaf a; this tree in turn "represents" the expression (a − (−b)) + a (note that the "official" tree is the prefix notation of this expression).

Example 2.7. Consider the ranked alphabet ∆, where ∆0 = {a, b, d, e}, ∆1 = ∆3 = {A, D} and ∆2 = {A, S}. A picture of the tree S[A[A[bA[e]a]A[aA[e]b]]D[D[d]dd]] in T∆ is given in Example 1.1.

Exercise 2.8. Take some ranked alphabet Σ and show that TΣ is a context-free language over Σ ∪ {[ , ]}. Our main aim will be to study several ways of constructively representing sets of trees and relations between trees. The basic terminology is the following. Definition 2.9. Let Σ be a ranked alphabet. A tree language over Σ is any subset of TΣ .


Definition 2.10. Let Σ and ∆ be ranked alphabets. A tree transformation from TΣ into T∆ is any subset of TΣ × T∆ . Exercise 2.11. Show that the context-free grammar G = (N, Σ, R, S) with N = {S}, Σ = {a, b, [ , ]} and R = {S → b[aS], S → a} generates a tree language over ∆, where ∆0 = {a} and ∆2 = {b}. The above definition of “tree” (Definition 2.5) gives rise to the following principles of proof by induction and definition by induction for trees. (Note that each tree is, uniquely, either in Σ0 or of the form a[t1 · · · tk ]). Principle 2.12. Principle of proof by induction (or recursion) on trees. Let P be a property of trees (over Σ). If

(i) all elements of Σ0 have property P , and (ii) for each k ≥ 1 and each a ∈ Σk , if t1 , . . . , tk have property P , then a[t1 · · · tk ] has property P ,

then all trees in TΣ have property P .

Principle 2.13. Principle of definition by induction (or recursion) on trees. Suppose we want to associate a value h(t) with each tree t in TΣ . Then it suffices to define h(a) for all a ∈ Σ0 , and to show how to compute the value h(a[t1 · · · tk ]) from the values h(t1 ), . . . , h(tk ). More formally expressed, given a set O of objects, and (i) for each a ∈ Σ0 , an object oa ∈ O, and (ii) for each k ≥ 1 and each a ∈ Σk , a mapping fak : Ok → O, there is exactly one mapping h : TΣ → O such that (i) h(a) = oa for all a ∈ Σ0 , and (ii) h(a[t1 · · · tk ]) = fak (h(t1 ), . . . , h(tk )) for all k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ .

Example 2.14. Let Σ0 = {e} and Σ1 = { / }. The trees in TΣ are in an obvious one-to-one correspondence with the natural numbers. The above principles are the usual induction principles for these numbers.

To illustrate the use of the induction principles we give the following useful definitions.

Definition 2.15. The mapping yield from TΣ into Σ∗0 is defined inductively as follows.
(i) For a ∈ Σ0 , yield(a) = a if a ≠ e, and yield(a) = λ if a = e.
(ii) For a ∈ Σk and t1 , . . . , tk ∈ TΣ , yield(a[t1 · · · tk ]) = yield(t1 ) · yield(t2 ) · · · yield(tk ). †

That is, the concatenation of yield(t1 ), . . . , yield(tk ).


Moreover, for a tree language L ⊆ TΣ , we define yield(L) = {yield(t) | t ∈ L}. We shall sometimes abbreviate “yield” by “y”.

Definition 2.16. The mapping height from TΣ into N is defined recursively as follows.
(i) For a ∈ Σ0 , height(a) = 0.
(ii) For a ∈ Σk and t1 , . . . , tk ∈ TΣ , height(a[t1 · · · tk ]) = max1≤i≤k (height(ti )) + 1.
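Definitions 2.15 and 2.16 are direct instances of Principle 2.13. A sketch (my encoding: trees as nested pairs (label, subtrees), with the symbol "e" playing the role of e):

```python
def parse(s):
    """Parse a tree string such as 'a[bc]' into (label, [subtrees]).
    Returns (tree, remaining input); assumes single-character symbols."""
    a, rest = s[0], s[1:]
    children = []
    if rest.startswith("["):
        rest = rest[1:]
        while not rest.startswith("]"):
            t, rest = parse(rest)
            children.append(t)
        rest = rest[1:]
    return (a, children), rest

def yield_(t):
    # Definition 2.15: leaves give their label (except e), inner nodes concatenate.
    a, children = t
    if not children:
        return "" if a == "e" else a
    return "".join(yield_(c) for c in children)

def height(t):
    # Definition 2.16: 0 at leaves, 1 + maximum over the direct subtrees.
    a, children = t
    if not children:
        return 0
    return max(height(c) for c in children) + 1

t, _ = parse("+[-[a-[b]]a]")   # the tree of Example 2.6
print(yield_(t))               # -> "aba" (internal labels do not appear)
print(height(t))               # -> 3
```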

Example 2.17. As an example of a proof by induction on trees we show that, if e ∉ Σ0 and Σ1 = ∅, then, for all t ∈ TΣ , height(t) < |yield(t)|.
Proof. For a ∈ Σ0 , height(a) = 0 and |yield(a)| = |a| = 1 (since a ≠ e). Now let a ∈ Σk (k ≥ 2) and assume (induction hypothesis) that height(ti ) < |yield(ti )| for 1 ≤ i ≤ k. Then
|yield(a[t1 · · · tk ])| = |yield(t1 )| + · · · + |yield(tk )|      (Def. 2.15(ii))
≥ (height(t1 ) + · · · + height(tk )) + k      (ind. hypothesis)
≥ (max1≤i≤k height(ti )) + 2      (k ≥ 2 and height(ti ) ≥ 0)
> height(a[t1 · · · tk ])      (Def. 2.16(ii)).

Exercise 2.18. Let Σ be a ranked alphabet such that Σ0 ∩ Σk = ∅ for all k ≥ 1. Define a (string) homomorphism h from (Σ ∪ {[ , ]})∗ into Σ∗0 such that, for all t ∈ TΣ , h(t) = yield(t).

Exercise 2.19. Give a recursive definition of the notion of "subtree", for instance as a mapping sub : TΣ → P(TΣ ) such that sub(t) is the set of all subtrees of t. Give also an alternative definition of "subtree" in a more string-like fashion.

Exercise 2.20. Let path(t) denote the set of all paths from the top of t to its bottom. Think of a formal definition for "path".

The generalization of formal language theory to a formal tree language theory will come about by viewing a string as a special kind of tree and taking the obvious generalizations. To be able to view strings as trees we "turn them 90 degrees" to a vertical position, as follows.

Definition 2.21. A ranked alphabet Σ is monadic if (i) Σ0 = {e}, and (ii) for k ≥ 2, Σk = ∅. The elements of TΣ are called monadic trees.


Thus a monadic ranked alphabet Σ is fully determined by the alphabet Σ1 . Monadic trees obviously can be made to correspond to the strings in Σ∗1 . There are two ways to do this, depending on whether we read top-down or bottom-up: ftd : TΣ → Σ∗1 is defined by
(i) ftd (e) = λ
(ii) ftd (a[t]) = a · ftd (t) for a ∈ Σ1 and t ∈ TΣ
and fbu : TΣ → Σ∗1 is defined by
(i) fbu (e) = λ
(ii) fbu (a[t]) = fbu (t) · a for a ∈ Σ1 and t ∈ TΣ .
(Obviously both ftd and fbu are bijections). Accordingly, when generalizing a string-concept to trees, we often have the choice between a top-down and a bottom-up generalization.

Example 2.22. The string alphabet ∆ = {a, b, c} corresponds to the monadic alphabet Σ with Σ0 = {e} and Σ1 = ∆. The tree a[b[c[b[e]]]] in TΣ (a vertical chain of nodes labeled a, b, c, b, e) corresponds either to the string abcb in ∆∗ (top-down), or to the string bcba in ∆∗ (bottom-up). Note that, due to our "prefix definition" of trees (Definition 2.5), this tree looks "top-down like" in its official form a[b[c[b[e]]]]. Obviously this is not essential.

Let us consider some basic operations on trees. A basic operation on strings is right-concatenation with one symbol (that is, for each symbol a in the alphabet there is an operation rca such that, for each string w, rca (w) = wa). Every string can uniquely be built up from the empty string by these basic operations (consider the way you write and read!). Generalizing bottom-up, the corresponding basic operations on trees, here called "top concatenation", are the following.

Definition 2.23. For each a ∈ Σk (k ≥ 1) we define the (k-ary) operation of top concatenation with a, denoted by tcka , to be the mapping from TΣk into TΣ such that, for all t1 , . . . , tk ∈ TΣ , tcka (t1 , . . . , tk ) = a[t1 · · · tk ]. Moreover, for tree languages L1 , . . . , Lk , we define tcka (L1 , . . . , Lk ) = {a[t1 · · · tk ] | ti ∈ Li for all 1 ≤ i ≤ k}.
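The two bijections ftd and fbu above can be sketched directly on the string representation (my code; monadic trees with single-character symbols only):

```python
def f_td(t):
    """Top-down reading: a[t'] |-> a . f_td(t')."""
    if t == "e":
        return ""
    a, body = t[0], t[2:-1]     # strip the label, 'a[' and the matching ']'
    return a + f_td(body)

def f_bu(t):
    """Bottom-up reading: a[t'] |-> f_bu(t') . a."""
    if t == "e":
        return ""
    a, body = t[0], t[2:-1]
    return f_bu(body) + a

print(f_td("a[b[c[b[e]]]]"))   # -> "abcb" (Example 2.22, top-down)
print(f_bu("a[b[c[b[e]]]]"))   # -> "bcba" (Example 2.22, bottom-up)
```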


Note that every tree can uniquely be built up from the elements of Σ0 by repeated top concatenation. The next basic operation on strings is concatenation. When viewed monadically, concatenation corresponds to substituting one vertical string into the e of the other vertical string. In the general case, we may take one tree and substitute a tree into each leaf of the original tree, such that different trees may be substituted into leaves with different labels. Thus we obtain the following basic operation on trees.

Definition 2.24. Let n ≥ 1, a1 , . . . , an ∈ Σ0 all different, and s1 , . . . , sn ∈ TΣ . For t ∈ TΣ , the tree concatenation of t with s1 , . . . , sn at a1 , . . . , an , denoted by t⟨a1 ← s1 , . . . , an ← sn ⟩, is defined recursively as follows.
(i) for a ∈ Σ0 , a⟨a1 ← s1 , . . . , an ← sn ⟩ = si if a = ai , and a otherwise
(ii) for a ∈ Σk and t1 , . . . , tk ∈ TΣ , a[t1 · · · tk ]⟨. . .⟩ = a[t1 ⟨. . .⟩ · · · tk ⟨. . .⟩], where ⟨. . .⟩ abbreviates ⟨a1 ← s1 , . . . , an ← sn ⟩.
If, in particular, n = 1, then, for each a ∈ Σ0 and t, s ∈ TΣ , the tree t⟨a ← s⟩ is also denoted by t ·a s.

Example 2.25. Let ∆0 = {x, y, c}, ∆2 = {b} and ∆3 = {a}. If t = a[b[xy]xc], then t⟨x ← b[cx], y ← c⟩ = a[b[b[cx]c]b[cx]c].

Exercise 2.26. Check that in the monadic case tree concatenation corresponds to string concatenation.

For tree languages tree concatenation is defined analogously.

Definition 2.27. Let n ≥ 1, a1 , . . . , an ∈ Σ0 all different, and L1 , . . . , Ln ⊆ TΣ . For L ⊆ TΣ we define the tree concatenation of L with L1 , . . . , Ln at a1 , . . . , an , denoted by L⟨a1 ← L1 , . . . , an ← Ln ⟩, as follows. †
(i) for a ∈ Σ0 , a⟨a1 ← L1 , . . . , an ← Ln ⟩ = Li if a = ai , and a otherwise
(ii) for a ∈ Σk and t1 , . . . , tk ∈ TΣ , †† a[t1 · · · tk ]⟨. . .⟩ = a[t1 ⟨. . .⟩ · · · tk ⟨. . .⟩]
(iii) for L ⊆ TΣ , L⟨a1 ← L1 , . . . , an ← Ln ⟩ = ⋃t∈L t⟨a1 ← L1 , . . . , an ← Ln ⟩.

† As usual, given a string w, we use w also to denote the language {w}.
†† For tree languages M1 , . . . , Mk we also write a[M1 · · · Mk ] to denote tcka (M1 , . . . , Mk ). This notation is fully justified since a[M1 · · · Mk ] is the (string) concatenation of the languages a, [ , M1 , . . . , Mk and ] !
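Definition 2.24 can be sketched on the string representation of trees. The helper below is mine, not from the notes; it assumes single-character symbols, so that the leaves of t are exactly the symbols not followed by "[". It reproduces Example 2.25:

```python
def tree_concat(t, subst):
    """Tree concatenation (Definition 2.24): replace each leaf a_i of the
    tree string t by the tree string subst[a_i]; other leaves stay put."""
    out, i = [], 0
    while i < len(t):
        a = t[i]
        if a in "[]":
            out.append(a)                    # structure symbols
        elif i + 1 < len(t) and t[i + 1] == "[":
            out.append(a)                    # inner node: keep its label
        else:
            out.append(subst.get(a, a))      # leaf: substitute if required
        i += 1
    return "".join(out)

print(tree_concat("a[b[xy]xc]", {"x": "b[cx]", "y": "c"}))
# -> "a[b[b[cx]c]b[cx]c]" (Example 2.25)
```

Note that every occurrence of the same leaf label receives the same tree, which is exactly the singleton case where the deterministic and nondeterministic readings of Remark 2.28(2) coincide.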


If, in particular, n = 1, then, for each a ∈ Σ0 and each L1 , L2 ⊆ TΣ , we denote L1 ⟨a ← L2 ⟩ also by L1 ·a L2 .

Remarks 2.28. (1) Obviously, if L, L1 , . . . , Ln are singletons, then Definition 2.27 is the same as Definition 2.24.
(2) Note that tree concatenation, as defined above, is "nondeterministic" in the sense that, for instance, to obtain t⟨a1 ← L1 , . . . , an ← Ln ⟩ different elements of L1 may be substituted at different occurrences of a1 in t. "Deterministic" tree concatenation of t with L1 , . . . , Ln at a1 , . . . , an could be defined as {t⟨a1 ← s1 , . . . , an ← sn ⟩ | si ∈ Li for all 1 ≤ i ≤ n}. In this case different occurrences of a1 in t should be replaced by the same element of L1 . It is clear that, in the case that L1 , . . . , Ln are singletons, this distinction cannot be made.

Intuitively, since trees are strings, tree concatenation is nothing but ordinary string substitution, familiar from formal language theory (see, for instance, [Sal, I.3]). For completeness we give the definition of substitution of string languages.

Definition 2.29. Let ∆ be an alphabet. Let n ≥ 1, a1 , . . . , an ∈ ∆ all different and let L1 , . . . , Ln be languages over ∆. For any L ⊆ ∆∗ , the substitution of L1 , . . . , Ln for a1 , . . . , an in L, denoted by L⟨a1 ← L1 , . . . , an ← Ln ⟩, is the language over ∆ defined as follows:
(i) λ⟨a1 ← L1 , . . . , an ← Ln ⟩ = λ
(ii) for a ∈ ∆, a⟨a1 ← L1 , . . . , an ← Ln ⟩ = Li if a = ai , and a otherwise
(iii) for w ∈ ∆∗ and a ∈ ∆, wa⟨. . .⟩ = w⟨. . .⟩ · a⟨. . .⟩
(iv) for L ⊆ ∆∗ , L⟨a1 ← L1 , . . . , an ← Ln ⟩ = ⋃w∈L w⟨a1 ← L1 , . . . , an ← Ln ⟩.

If n = 1, L1 ⟨a ← L2 ⟩ will also be denoted as L1 ·a L2 . If L1 , . . . , Ln are singletons, then the substitution is called a homomorphism.

Exercise 2.30. Let n ≥ 1, a1 , . . . , an ∈ Σ0 all different, ai ≠ e for all 1 ≤ i ≤ n, and L, L1 , . . . , Ln ⊆ TΣ . Prove that yield(L⟨a1 ← L1 , . . . , an ← Ln ⟩) = yield(L)⟨a1 ← yield(L1 ), . . . , an ← yield(Ln )⟩. (Thus: "yield of tree concatenation is string substitution of yields").
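For finite languages, Definition 2.29 restricted to a single string L = {w} can be sketched as follows (my code); each occurrence of ai independently picks an element of Li, which is precisely the "nondeterministic" behaviour of Remark 2.28(2):

```python
from itertools import product

def substitute(w, subst):
    """Substitution of string languages (Definition 2.29) applied to {w}.
    subst maps a symbol a_i to a finite language L_i (a set of strings);
    symbols not in subst are kept (a <a1 <- L1, ...> = a otherwise)."""
    options = [subst.get(a, {a}) for a in w]
    return {"".join(choice) for choice in product(*options)}

print(sorted(substitute("aba", {"a": {"x", "yy"}})))
# -> ['xbx', 'xbyy', 'yybx', 'yybyy']
```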

Exercise 2.31. Prove that Definitions 2.27 and 2.29 give exactly the same result for Lha1 ← L1 , . . . , an ← Ln i where a1 , . . . , an ∈ Σ0 and L, L1 , . . . Ln are tree languages over Σ (and thus, string languages over Σ ∪ {[ , ]}).


Exercise 2.32. Define the notion of associativity for tree concatenation, and show that tree concatenation is associative. Show that, in general, “deterministic tree concatenation” is not associative (cf. Remark 2.28(2)). We shall need the following special case of tree concatenation. Definition 2.33. Let Σ be a ranked alphabet and let S be a set of symbols or a tree language. Then the set of trees indexed by S, denoted by TΣ (S), is defined inductively as follows. (i) S ∪ Σ0 ⊆ TΣ (S) (ii) If k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ (S), then a[t1 · · · tk ] ∈ TΣ (S). Note that TΣ (∅) = TΣ .

Thus, if S is a set of symbols, then TΣ (S) = TΣ∪S , where the elements of S are assumed to have rank 0. If S is a tree language over a ranked alphabet ∆, then TΣ (S) is a tree language over the ranked alphabet Σ ∪ ∆. Exercise 2.34. Show that, for any a ∈ Σ0 , TΣ (S) = TΣ ·a (S ∪ {a}).

We close this section with two general remarks.

Remark 2.35. Definition 2.5 of a tree is of course rather arbitrary. Other, equally useful, ways of defining trees as a special kind of strings are obtained by replacing a[t1 · · · tk ] in Definition 2.5 by [t1 · · · tk ]a, or at1 · · · tk ], or [at1 · · · tk ], or at1 · · · tk (only in the case that each symbol has exactly one rank), or [a t1 · · · tk ] (where [a is a new symbol for each a), or a[t1 ,t2 , . . . ,tk ] (where " , " is a new symbol).

Remark 2.36. Remark on the general philosophy in tree language theory. The general philosophy looks like this:
(1) Take vertical string language theory (cf. Definition 2.21),
(2) generalize it to tree language theory, and
(3) map this into horizontal string language theory via the yield operation (Definition 2.15).
The fourth part of the philosophy is
(4) Tree language theory is a specific part of string language theory, illustrated as follows:


(For instance, the string a[b[cd]d] is at the same time a tree: its root is labeled a, with direct subtrees b[cd] and d.)

Example:
(1) (vertical) string concatenation,
(2) tree concatenation,
(3) (horizontal) string substitution (see Exercise 2.30),
(4) (2) is a special case of (3) (see Exercise 2.31).

3 Recognizable tree languages

3.1 Finite tree automata and regular tree grammars

Let us first consider the usual finite automaton on strings. A deterministic finite automaton is a structure M = (Q, Σ, δ, q0 , F ), where Q is the set of states, Σ is the input alphabet, q0 is the initial state, F is the set of final states and δ is a family {δa }a∈Σ , where δa : Q → Q is the transition function for the input a. There are several ways to describe the functioning of M and the language it recognizes. One of them (see for instance [Sal, I.4]), is to describe explicitly the sequence of steps taken by the automaton while processing some input string. This point of view will be considered in Part (4). Another way is to give a recursive definition of the effect of an input string on the state of M . Since a recursive definition is in particular suitable for generalization to trees, let us consider one in detail. We define a function δ̂ : Σ∗ → Q such that, for w ∈ Σ∗ , δ̂(w) is intuitively the state M reaches after processing w, starting from the initial state q0 :
(i) δ̂(λ) = q0
(ii) for w ∈ Σ∗ and a ∈ Σ, δ̂(wa) = δa (δ̂(w)).
The language recognized by M is L(M ) = {w ∈ Σ∗ | δ̂(w) ∈ F }. When considering this definition of δ̂ for "bottom-up" monadic trees (see Definition 2.21), one easily arrives at the following generalization to the tree case: There should be a start state for each element of Σ0 . The finite tree automaton starts at all leaves ("at the same time", "in parallel") and processes the tree in a bottom-up fashion. The automaton arrives at each node of rank k with a sequence of k states (one state for each direct subtree of the node), and the transition function δa of the label a of that node is a mapping δa : Qk → Q, which, from that sequence of k states, determines


the state at that node. A tree is recognized iff the tree automaton is in a final state at the root of the tree. Formally:

Definition 3.1. A deterministic bottom-up finite tree automaton is a structure M = (Q, Σ, δ, s, F ), where Q is a finite set (of states), Σ is a ranked alphabet (of input symbols), δ is a family {δak }k≥1,a∈Σk of mappings δak : Qk → Q (the transition function for a ∈ Σk ), s is a family {sa }a∈Σ0 of states sa ∈ Q (the initial state for a ∈ Σ0 ), and F is a subset of Q (the set of final states). The mapping δ̂ : TΣ → Q is defined recursively as follows:
(i) for a ∈ Σ0 , δ̂(a) = sa ,
(ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , δ̂(a[t1 · · · tk ]) = δak (δ̂(t1 ), . . . , δ̂(tk )).

The tree language recognized by M is defined to be L(M ) = {t ∈ TΣ | δ̂(t) ∈ F }.

Intuitively, δ̂(t) is the state reached by M after bottom-up processing of t. For convenience, when k is understood, we shall write δa rather than δak . Note therefore that each symbol a ∈ Σ may have several transition functions δa (one for each of its ranks). We shall abbreviate "finite tree automaton" by "fta", and "deterministic" by "det.".

Definition 3.2. A tree language L is called recognizable (or regular) if L = L(M ) for some det. bottom-up fta M . The class of recognizable tree languages will be denoted by RECOG.

Example 3.3. Let us consider the det. bottom-up fta M = (Q, Σ, δ, s, F ), where Q = {0, 1, 2, 3}, Σ0 = {0, 1, 2, . . . , 9}, Σ2 = {+, ∗}, sa ≡ a (mod 4), F = {1}, and δ+ and δ∗ (both mappings Q2 → Q) are addition modulo 4 and multiplication modulo 4 respectively. Then M recognizes the set of all "expressions" whose value modulo 4 is 1. Consider for instance the expression +[+[07]∗[2∗[73]]], the prefix form of (0+7)+(2∗(7∗3)). Processing bottom-up, M reaches state 0 at the leaf 0, state 3 at each leaf 7, state 2 at the leaf 2, and state 3 at the leaf 3; hence state 3 at +[07], state 1 at ∗[73], state 2 at ∗[2∗[73]], and finally state 1 at the root, so the tree is recognized.
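The bottom-up run of Example 3.3 can be sketched as a direct transcription of Definition 3.1 (my encoding: trees as nested tuples (label, subtrees)):

```python
def delta_hat(t):
    """The mapping of Definition 3.1 for the fta of Example 3.3:
    the state at a node is the value mod 4 of the subexpression below it."""
    a, children = t
    if not children:
        return int(a) % 4                 # initial state s_a = a mod 4
    q1, q2 = (delta_hat(c) for c in children)
    return (q1 + q2) % 4 if a == "+" else (q1 * q2) % 4

F = {1}
# The expression (0+7)+(2*(7*3)) of Example 3.3, i.e. +[+[07]*[2*[73]]]:
t = ("+", [("+", [("0", []), ("7", [])]),
           ("*", [("2", []), ("*", [("7", []), ("3", [])])])])
print(delta_hat(t))        # -> 1, so t is in L(M) since F = {1}
```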


Example 3.4. Let Σ0 = {a} and Σ2 = {b}. Consider the language of all trees in TΣ which have a “right comb-like” structure like for instance the tree b[ab[ab[ab[aa]]]]. This tree language is recognized by the det. bottom-up fta M = (Q, Σ, δ, s, F ), where Q = {A, C, W }, sa = A, F = {C} and δb is defined by δb (A, A) = δb (A, C) = C and δb (q1 , q2 ) = W for all other pairs of states (q1 , q2 ). Exercise 3.5. Let Σ0 = {a, b}, Σ1 = {p} and Σ2 = {p, q}. Construct det. bottom-up finite tree automata recognizing the following tree languages: (i) the language of all trees t, such that if a node of t is labeled q, then its descendants are labeled q or a; (ii) the set of all trees t such that yield(t) ∈ a+ b+ ; (iii) the set of all trees t such that the total number of p’s occurring in t is odd.
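Example 3.4 can likewise be transcribed (a sketch, with trees as nested tuples; the state names A, C, W are those of the example):

```python
def run(t):
    """States of Example 3.4: 'A' (a single leaf a), 'C' (a right comb),
    'W' (wrong shape). Accepting means run(t) is in F = {'C'}."""
    label, children = t
    if not children:
        return "A"                          # s_a = A
    q1, q2 = run(children[0]), run(children[1])
    if q1 == "A" and q2 in ("A", "C"):      # delta_b(A,A) = delta_b(A,C) = C
        return "C"
    return "W"                              # all other pairs of states

comb = ("b", [("a", []), ("b", [("a", []), ("a", [])])])   # b[ab[aa]]
print(run(comb))   # -> "C", i.e. accepted
```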

A (theoretically) convenient extension of the deterministic finite automaton is to make it nondeterministic. A nondeterministic finite automaton (on strings) is a structure M = (Q, Σ, δ, S, F ), where Q, Σ and F are the same as in the deterministic case, S is a set of initial states, and, for each a ∈ Σ, δa is a mapping Q → P(Q) (intuitively, δa (q) is the set of states which M can possibly, nondeterministically, enter when reading a in state q). Again a mapping δ̂, now from Σ∗ into P(Q), can be defined, such that for every w ∈ Σ∗ , δ̂(w) is the set of states M can possibly reach after processing w, having started from one of the initial states in S:
(i) δ̂(λ) = S,
(ii) for w ∈ Σ∗ and a ∈ Σ, δ̂(wa) = ⋃{δa (q) | q ∈ δ̂(w)}.
The language recognized by M is L(M ) = {w ∈ Σ∗ | δ̂(w) ∩ F ≠ ∅}. Generalizing to trees we obtain the following definition.

Definition 3.6. A nondeterministic bottom-up finite tree automaton is a 5-tuple M = (Q, Σ, δ, S, F ), where Q, Σ and F are as in the deterministic case, S is a family {Sa }a∈Σ0 such that Sa ⊆ Q for each a ∈ Σ0 , and δ is a family {δak }k≥1,a∈Σk of mappings δak : Qk → P(Q). The mapping δ̂ : TΣ → P(Q) is defined recursively by
(i) for a ∈ Σ0 , δ̂(a) = Sa ,
(ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , δ̂(a[t1 · · · tk ]) = ⋃{δa (q1 , . . . , qk ) | qi ∈ δ̂(ti ) for 1 ≤ i ≤ k}.
The tree language recognized by M is L(M ) = {t ∈ TΣ | δ̂(t) ∩ F ≠ ∅}.

Note that, for q ∈ Qk , δak (q) may be empty.

Example 3.7. Let Σ0 = {p} and Σ2 = {a, b}. Consider the following tree language over Σ:
L = {u1 a[a[s1 s2 ]a[t1 t2 ]]u2 ∈ TΣ | · · · } ∪ {u1 b[b[s1 s2 ]b[t1 t2 ]]u2 ∈ TΣ | · · · }


where "· · · " stands for u1 , u2 ∈ (Σ ∪ {[ , ]})∗ , s1 , s2 , t1 , t2 ∈ TΣ . In other words, L is the set of all trees containing a configuration a[a[· · · ]a[· · · ]] (a node labeled a both of whose direct descendants are labeled a), or the analogous configuration with b, or both. L is recognized by the nondet. bottom-up fta M = (Q, Σ, δ, S, F ), where Q = {qs , qa , qb , r}, Sp = {qs }, F = {r} and
δa (qs , qs ) = {qs , qa }, δb (qs , qs ) = {qs , qb },
δa (qa , qa ) = δb (qb , qb ) = {r},
for all q ∈ Q : δa (q, r) = δa (r, q) = δb (q, r) = δb (r, q) = {r},
and δx (q1 , q2 ) = ∅ for all other possibilities.

It is rather obvious in the last example that we can find a deterministic bottom-up fta recognizing the same language (find it!). We now show that this is possible in general (as in the case of strings).

Theorem 3.8. For each nondeterministic bottom-up fta we can find a deterministic one recognizing the same language.

Proof. The proof uses the "subset-construction", well known from the string-case. Let M = (Q, Σ, δ, S, F ) be a nondeterministic bottom-up fta. Construct the deterministic bottom-up fta M1 = (P(Q), Σ, δ1 , s1 , F1 ) such that (s1 )a = Sa for all a ∈ Σ0 , F1 = {Q1 ∈ P(Q) | Q1 ∩ F ≠ ∅}, and, for a ∈ Σk and Q1 , . . . , Qk ⊆ Q,
(δ1 )a (Q1 , . . . , Qk ) = ⋃{δa (q1 , . . . , qk ) | qi ∈ Qi for all 1 ≤ i ≤ k}.
It is straightforward to show, using Definitions 3.1 and 3.6, that δ̂1 (t) = δ̂(t) for all t ∈ TΣ (proof by induction on t). From this it follows that L(M1 ) = {t | δ̂1 (t) ∈ F1 } = {t | δ̂(t) ∩ F ≠ ∅} = L(M ).

Exercise 3.9. Check the proof of Theorem 3.8. Construct the det. bottom-up fta corresponding to the fta M of Example 3.7 according to that proof, and compare this det. fta with the one you found before.

Let us now consider the top-down generalization of the finite automaton. Let M = (Q, Σ, δ, q0 , F ) be a det. finite automaton.
Another way to define L(M ) is by giving a recursive definition of a mapping δ̃ : Σ∗ → P(Q) such that intuitively, for each w ∈ Σ∗ , δ̃(w) is the set of states q such that the machine M , when started in state q, enters a final state after processing w. The definition of δ̃ is as follows:
(i) δ̃(λ) = F
(ii) for w ∈ Σ∗ and a ∈ Σ, δ̃(aw) = {q | δa (q) ∈ δ̃(w)}
(the last line may be read as: to check whether, starting in q, M recognizes aw, compute q1 = δa (q) and check whether M recognizes w starting in q1 ). The language recognized by M is L(M ) = {w ∈ Σ∗ | q0 ∈ δ̃(w)}. This definition, applied to "top-down" monadic trees, leads to the following generalization to arbitrary trees. The finite tree automaton starts at the root of the tree in the initial state, and processes the tree in a top-down


fashion. The automaton arrives at each node in one state, and the transition function δa of the label a of that node is a mapping δa : Q → Qk (where k is the rank of the node), which, from that state, determines the state in which to continue for each direct descendant of the node (the automaton "splits up" into k independent copies, one for each direct subtree of the node). Finally the automaton arrives at all leaves of the tree. There should be a set of final states for each element of Σ0 . The tree is recognized if the fta arrives at each leaf in a state which is final for the label of that leaf. Formally:

Definition 3.10. A deterministic top-down finite tree automaton is a 5-tuple M = (Q, Σ, δ, q0 , F ), where Q is a finite set (of states), Σ is a ranked alphabet (of input symbols), δ is a family {δak }k≥1,a∈Σk of mappings δak : Q → Qk (the transition function for a ∈ Σk ), q0 is in Q (the initial state), and F is a family {Fa }a∈Σ0 of sets Fa ⊆ Q (the set of final states for a ∈ Σ0 ). The mapping δ̃ : TΣ → P(Q) is defined recursively by
(i) for a ∈ Σ0 , δ̃(a) = Fa
(ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , δ̃(a[t1 · · · tk ]) = {q | δa (q) ∈ δ̃(t1 ) × · · · × δ̃(tk )}.
The tree language recognized by M is defined to be L(M ) = {t ∈ TΣ | q0 ∈ δ̃(t)}.

Intuitively, δ̃(t) is the set of states q such that M , when starting at the root of t in state q, arrives at the leaves of t in final states.

Example 3.11. Consider the tree language of Exercise 3.5(i). A det. top-down fta recognizing this language is M = (Q, Σ, δ, q0 , F ) where Q = {A, R, W }, q0 = A, Fa = {A, R}, Fb = {A} and
δp1 (A) = A, δp1 (R) = δp1 (W ) = W,
δp2 (A) = (A, A), δp2 (R) = δp2 (W ) = (W, W ),
δq (A) = (R, R), δq (R) = (R, R), δq (W ) = (W, W ).
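Example 3.11 can be transcribed as a sketch of how a det. top-down fta runs (my encoding; the automaton splits into one recursive call per direct subtree, and the whole tree is accepted only if every leaf is reached in a final state):

```python
FINAL = {"a": {"A", "R"}, "b": {"A"}}    # F_a and F_b of Example 3.11

def accepts(t, state="A"):
    """Run the det. top-down fta of Example 3.11 on t (nested tuples),
    starting in q0 = 'A'. States: A normal, R restricted (below a q), W wrong."""
    label, children = t
    if not children:
        return state in FINAL[label]
    if label == "q":                       # delta_q(A) = delta_q(R) = (R, R)
        next_states = ("W", "W") if state == "W" else ("R", "R")
    else:                                  # label p: stays A, otherwise W
        next_states = tuple(state if state == "A" else "W" for _ in children)
    return all(accepts(c, q) for c, q in zip(children, next_states))

print(accepts(("q", [("a", []), ("a", [])])))            # q[aa] -> True
print(accepts(("q", [("p", [("a", [])]), ("a", [])])))   # p below q -> False
```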

Exercise 3.12. Let Σ be a ranked alphabet, and p ∈ Σ2 . Let L be the tree language defined recursively by (i) for all t1 , t2 ∈ TΣ , p[t1 t2 ] ∈ L (ii) for all a ∈ Σk , if t1 , . . . , tk ∈ L, then a[t1 · · · tk ] ∈ L (k ≥ 1). Construct a deterministic top-down fta recognizing L. Give a nonrecursive description of L. Exercise 3.13. Construct a det. top-down fta M such that yield(L(M )) = a+ b+ .

We now show that the det. top-down fta recognizes fewer languages than its bottom-up counterpart.


Theorem 3.14. There are recognizable tree languages which cannot be recognized by a deterministic top-down fta.

Proof. Let Σ0 = {a, b} and Σ2 = {S}. Consider the (finite!) tree language L = {S[ab], S[ba]}. Suppose that the det. top-down fta M = (Q, Σ, δ, q0 , F ) recognizes L. Let δS (q0 ) = (q1 , q2 ). Since S[ab] ∈ L(M ), q1 ∈ Fa and q2 ∈ Fb . But, since S[ba] ∈ L(M ), q1 ∈ Fb and q2 ∈ Fa . Hence both S[aa] and S[bb] are in L(M ). Contradiction.

Exercise 3.15. Show that the tree languages of Exercise 3.5(ii,iii) are not recognizable by a det. top-down fta.

It will be clear that the nondeterministic top-down fta is able to recognize all recognizable languages. We give the definition without comment.

Definition 3.16. A nondeterministic top-down finite tree automaton is a structure M = (Q, Σ, δ, S, F ), where Q, Σ and F are as in the deterministic case, S is a subset of Q and δ is a family {δak }k≥1,a∈Σk of mappings δak : Q → P(Qk ). The mapping δ̃ : TΣ → P(Q) is defined recursively as follows
(i) for a ∈ Σ0 , δ̃(a) = Fa ,
(ii) for k ≥ 1, a ∈ Σk and t1 , . . . , tk ∈ TΣ , δ̃(a[t1 · · · tk ]) = {q | ∃(q1 , . . . , qk ) ∈ δa (q) : qi ∈ δ̃(ti ) for all 1 ≤ i ≤ k}.
The tree language recognized by M is L(M ) = {t ∈ TΣ | δ̃(t) ∩ S ≠ ∅}.

We now show that, nondeterministically, there is no difference between bottom-up and top-down recognition.

Theorem 3.17. A tree language is recognizable by a nondet. bottom-up fta iff it is recognizable by a nondet. top-down fta.

Proof. Let us say that a nondet. bottom-up fta M = (Q, Σ, δ, S, F ) and a nondet. top-down fta N = (P, ∆, µ, R, G) are "associated" if the following requirements are satisfied:
(i) Q = P , Σ = ∆, F = R and, for all a ∈ Σ0 , Sa = Ga ;
(ii) for all k ≥ 1, a ∈ Σk and q1 , . . . , qk , q ∈ Q, q ∈ δa (q1 , . . . , qk ) iff (q1 , . . . , qk ) ∈ µa (q).
In that case, one can easily prove by induction that δ̂ = µ̃, and so L(M ) = L(N ). Since obviously for each nondet. bottom-up fta there is an associated nondet. top-down fta, and vice versa, the theorem holds.

Thus the classes of tree languages recognized by the nondet. bottom-up, det. bottom-up and nondet. top-down fta are all equal (and are called RECOG), whereas the class of tree languages recognized by the det. top-down fta is a proper subclass of RECOG. The next victim of generalization is the regular grammar (right-linear, type-3 grammar). In this case it seems appropriate to take the top-down point of view only. Consider an


ordinary regular grammar G = (N, Σ, R, S). All rules have either the form A → wB or the form A → w, where A, B ∈ N and w ∈ Σ∗ . Monadically, the string wB may be considered as the result of tree-concatenating the tree we with B at e, where B is of rank 0. Thus we can take the generalization of strings of the form wB or w to be trees in T∆ (N ), where ∆ is a ranked alphabet (for the definition of T∆ (N ), see Definition 2.33). Thus, let us consider a "tree grammar" with rules of the form A → t, where A ∈ N and t ∈ T∆ (N ). Obviously, the application of a rule A → t to a tree s ∈ T∆ (N ) should intuitively consist of replacing one occurrence of A in s by the tree t. Starting with the initial nonterminal, nonterminals at the frontier of the tree are then repeatedly replaced by right hand sides of rules, until the tree does not contain nonterminals any more. Now, since trees are defined as strings, it turns out that this process is precisely the way a context-free grammar works. Thus we arrive at the following formal definition.

Definition 3.18. A regular tree grammar is a tuple G = (N, Σ, R, S) where N is a finite set (of nonterminals), Σ is a ranked alphabet (of terminals), such that Σ ∩ N = ∅, S ∈ N is the initial nonterminal, and R is a finite set of rules of the form A → t with A ∈ N and t ∈ TΣ (N ). The tree language generated by G, denoted by L(G), is defined to be L(H), where H is the context-free grammar (N, Σ ∪ {[ , ]}, R, S). We shall use =⇒G and =⇒∗G (or =⇒ and =⇒∗ when G is understood) to denote the restrictions of =⇒H and =⇒∗H to TΣ (N ).

Example 3.19. Let Σ0 = {a, b, c, d, e}, Σ2 = {p} and Σ3 = {p, q}. Consider the regular tree grammar G = (N, Σ, R, S), where N = {S, T } and R consists of the rules S → p[aT a], T → q[cp[dT ]b] and T → e. Then G generates the tree p[aq[cp[de]b]a] as follows:
S =⇒ p[aT a] =⇒ p[aq[cp[dT ]b]a] =⇒ p[aq[cp[de]b]a].

The tree language generated by G is {p[a(q[cp[d)^n e(]b])^n a] | n ≥ 0}.
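Since a regular tree grammar is just a context-free grammar on the string representation (Definition 3.18), the derivation of Example 3.19 can be replayed with ordinary string rewriting. A sketch (my code; rule alternatives are chosen by index, and in each step one occurrence of a nonterminal is rewritten):

```python
# The rules of Example 3.19 as strings over Sigma together with {[, ]}:
RULES = {"S": ["p[aTa]"], "T": ["q[cp[dT]b]", "e"]}

def derive(sentential, choices):
    """Apply one rule per entry of choices: rewrite an occurrence of a
    nonterminal by the alternative with that index."""
    for i in choices:
        for nt in RULES:
            pos = sentential.find(nt)
            if pos >= 0:
                sentential = sentential[:pos] + RULES[nt][i] + sentential[pos + 1:]
                break
    return sentential

print(derive("S", [0, 0, 1]))   # -> "p[aq[cp[de]b]a]", as in Example 3.19
```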

Exercise 3.20. Write regular tree grammars generating the tree languages of Exercise 3.5. As in the case of strings, each regular tree grammar is equivalent to one that has the property that at each step in the derivation exactly one terminal symbol is produced. Definition 3.21. A regular tree grammar G = (N, Σ, R, S) is in normal form, if each of its rules is either of the form A → a[B1 · · · Bk ] or of the form A → b, where k ≥ 1, a ∈ Σk , A, B1 , . . . , Bk ∈ N and b ∈ Σ0 .


Theorem 3.22. Each regular tree grammar has an equivalent regular tree grammar in normal form.

Proof. Consider an arbitrary regular tree grammar G = (N, Σ, R, S). Let G1 = (N, Σ, R1 , S) be the regular tree grammar such that (A → t) ∈ R1 if and only if t ∉ N and there is a B in N such that A =⇒∗G B and (B → t) ∈ R. Then L(G1 ) = L(G), and R1 does not contain rules of the form A → B with A, B ∈ N . (This is the well-known procedure of removing rules A → B from a context-free grammar). Suppose that G1 is not yet in normal form. Then there is a rule of the form A → a[t1 · · · ti · · · tk ] such that ti ∉ N . Construct a new regular tree grammar G2 by adding a new nonterminal B to N and replacing the rule A → a[t1 · · · ti · · · tk ] by the two rules A → a[t1 · · · B · · · tk ] and B → ti in R1 . It should be clear that L(G2 ) = L(G1 ), and that, by repeating the latter process a finite number of times, one ends up with an equivalent grammar in normal form.

Exercise 3.23. Put the regular tree grammar of Example 3.19 into normal form.

Exercise 3.24. What does Theorem 3.22 actually say in the case of strings (the monadic case)?

In the next theorem we show that the regular tree grammars generate exactly the class of recognizable tree languages.

Theorem 3.25. A tree language can be generated by a regular tree grammar iff it is an element of RECOG.

Proof. Exercise.

Note therefore that each recognizable tree language is a special kind of context-free language. Exercise 3.26. Show that all finite tree languages are in RECOG.

Exercise 3.27. Show that each recognizable tree language can be generated by a "backwards deterministic" regular tree grammar. A regular tree grammar is called "backwards deterministic" if (1) it may have more than one initial nonterminal, (2) it is in normal form, and (3) rules with the same right hand side are equal.

It is now easy to show the connection between recognizable tree languages and context-free languages. Let CFL denote the class of context-free languages.

Theorem 3.28. yield(RECOG) = CFL (in words, the yield of each recognizable tree language is context-free, and each context-free language is the yield of some recognizable tree language).


Proof. Let G = (N, Σ, R, S) be a regular tree grammar. Consider the context-free grammar G′ = (N, Σ_0, R′, S), where R′ = {A → yield(t) | (A → t) ∈ R}. Then L(G′) = yield(L(G)).

Now let G = (N, Σ, R, S) be a context-free grammar. Let ∗ be a new symbol, and let ∆ = Σ ∪ {e, ∗} be the ranked alphabet such that ∆_0 = Σ ∪ {e} and, for k ≥ 1, ∆_k = {∗} if and only if there is a rule in R with a right hand side of length k. Consider the regular tree grammar G′ = (N, ∆, R′, S) such that

(i) if A → w is in R, w ≠ λ, then A → ∗[w] is in R′;

(ii) if A → λ is in R, then A → e is in R′.

Then yield(L(G′)) = L(G).
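The yield operation itself, on the tuple encoding of trees used earlier (my own encoding, with 'e' playing the role of the symbol e that reads as the empty string λ), can be sketched as follows; the function name is hypothetical.

```python
def tree_yield(t, e='e'):
    """yield(t): concatenate the rank-0 labels of t from left to right,
    reading the special symbol e as the empty string."""
    if isinstance(t, str):
        return '' if t == e else t
    return ''.join(tree_yield(c, e) for c in t[1:])
```

For the tree p[aq[cp[de]b]a] of Example 3.19 this returns the string acdba.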

In the next section we shall give the connection between regular tree languages and derivation trees of context-free languages.

Exercise 3.29. A context-free grammar is "invertible" if rules with the same right hand side are equal. Show that each context-free language can be generated by an invertible context-free grammar.

For regular string languages a useful stronger version of Theorem 3.28 can be proved.

Theorem 3.30. Let Σ be a ranked alphabet. If R is a regular string language over Σ_0, then the tree language {t ∈ T_Σ | yield(t) ∈ R} is recognizable.

Proof. Let M = (Q, Σ_0, δ, q_0, F) be a deterministic finite automaton recognizing R. We construct a nondeterministic bottom-up fta N = (Q × Q, Σ, µ, S, G), which, for each tree t, checks whether a successful computation of M on yield(t) is possible. The states of N are pairs of states of M. Intuitively we want that (q_1, q_2) ∈ µ̂(t) if and only if M arrives in state q_2 after processing yield(t), starting from state q_1. Thus we define

(i) for all a ∈ Σ_0, S_a = {(q_1, q_2) | δ_a(q_1) = q_2};

(ii) for all k ≥ 1, a ∈ Σ_k and states q_1, q_2, ..., q_{2k} ∈ Q,

µ_a((q_1, q_2), (q_3, q_4), ..., (q_{2k−1}, q_{2k})) = {(q_1, q_{2k})} if q_{2i} = q_{2i+1} for all 1 ≤ i ≤ k − 1, and ∅ otherwise.

Taking G = {(q_0, q) | q ∈ F} as the set of final states, L(N) = {t ∈ T_Σ | yield(t) ∈ R}.
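The pair-state idea in the proof of Theorem 3.30 can be computed directly: µ̂(t) is the relation "M goes from q_1 to q_2 while reading yield(t)", built bottom-up by relational composition over the children. The sketch below uses my own encodings (a DFA table delta[state][symbol], tuple-encoded trees); the function name is hypothetical.

```python
def yield_pairs(t, delta, states):
    """mu-hat(t) from Theorem 3.30: all pairs (q1, q2) such that the DFA,
    started in q1, reaches q2 after processing yield(t)."""
    if isinstance(t, str):                         # leaf a in Sigma_0
        return {(q, delta[q][t]) for q in states}
    child_rels = [yield_pairs(c, delta, states) for c in t[1:]]
    acc = child_rels[0]
    for nxt in child_rels[1:]:                     # compose left to right
        acc = {(q1, q4) for (q1, q2) in acc for (q3, q4) in nxt if q2 == q3}
    return acc
```

With a two-state DFA tracking the parity of a's (a flips the state, b keeps it), the tree ∗[a ∗[ab]] with yield aab gives exactly the identity pairs, since aab contains an even number of a's.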

Exercise 3.31. Show that, if Σ_2 ≠ ∅, then Theorem 3.30 holds conversely: if L is a string language such that {t ∈ T_Σ | yield(t) ∈ L} is recognizable, then L is regular. What can you say in case Σ_2 = ∅?

3.2 Closure properties of recognizable tree languages

We first consider set-theoretic operations.


Theorem 3.32. RECOG is closed under union, intersection and complementation.

Proof. To show closure under complementation, consider a deterministic bottom-up fta M = (Q, Σ, δ, s, F). Let N be the det. bottom-up fta (Q, Σ, δ, s, Q − F). Then, obviously, L(N) = T_Σ − L(M). To show closure under union, consider two regular tree grammars G_i = (N_i, Σ_i, R_i, S_i), i = 1, 2 (with N_1 ∩ N_2 = ∅). Then G = (N_1 ∪ N_2 ∪ {S}, Σ_1 ∪ Σ_2, R_1 ∪ R_2 ∪ {S → S_1, S → S_2}, S) is a regular tree grammar such that L(G) = L(G_1) ∪ L(G_2). Closure under intersection now follows from the identities of De Morgan.

As a corollary we obtain the following closure property of context-free languages.

Corollary 3.33. CFL is closed under intersection with regular languages.

Proof. Let L and R be a context-free and a regular language respectively. According to Theorem 3.28, there is a recognizable tree language U such that yield(U) = L. Consequently, by Theorems 3.30 and 3.32, the tree language V = U ∩ {t | yield(t) ∈ R} is recognizable. Obviously L ∩ R = yield(V) and so, again by Theorem 3.28, L ∩ R is context-free.

We now turn to the closure of RECOG under concatenation operations (see Definitions 2.23 and 2.27).

Theorem 3.34. For every k ≥ 1 and a ∈ Σ_k, RECOG is closed under tc_k^a.

Proof. Exercise.
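The run of a deterministic bottom-up fta used in the complementation argument of Theorem 3.32 can be sketched as follows; the transition table is my own encoding (rank-0 symbols map directly to a state, higher-rank symbols map tuples of child states to a state), and the function names are hypothetical.

```python
def run(delta, t):
    """delta-hat(t): the unique state a deterministic bottom-up fta
    reaches at the root of t."""
    if isinstance(t, str):
        return delta[t]                              # delta_a for a of rank 0
    return delta[t[0]][tuple(run(delta, c) for c in t[1:])]

def accepts(delta, final, t):
    return run(delta, t) in final

# Theorem 3.32: the complement automaton keeps delta and uses Q - F as final states.
```

With states {0, 1} tracking the parity of the number of leaves, the tree p[a p[aa]] has three leaves, so it is accepted with final states {1} and rejected with the complemented final states {0}.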

Theorem 3.35. RECOG is closed under tree concatenation.

Proof. The proof is obtained by generalizing that for regular string languages. Let n ≥ 1, a_1, ..., a_n ∈ Σ_0 all different and L_0, L_1, ..., L_n recognizable tree languages (we may assume that all languages are over the same ranked alphabet Σ). Let G_i = (N_i, Σ, R_i, S_i) be a regular tree grammar in normal form for L_i (i = 0, 1, ..., n), with the N_i mutually disjoint. A regular tree grammar generating L_0⟨a_1 ← L_1, ..., a_n ← L_n⟩ is G = (⋃_{i=0}^{n} N_i, Σ, R, S_0), where R = R_0′ ∪ ⋃_{i=1}^{n} R_i, and R_0′ is R_0 with each rule of the form A → a_i replaced by the rule A → S_i (1 ≤ i ≤ n).

Corollary 3.36. CFL is closed under substitution.

Proof. Use Theorem 3.28 and Exercise 2.30.

Note also that Theorem 3.35 is essentially a special case of Corollary 3.36. Next we generalize the notion of (concatenation) closure of string languages to trees, and show that RECOG is closed under this closure operation. We shall, for convenience, restrict ourselves to the case that tree concatenation happens at one element of Σ0 .


Definition 3.37. Let a ∈ Σ_0 and let L be a tree language over Σ. Then the tree concatenation closure of L at a, denoted by L^{∗a}, is defined to be ⋃_{n=0}^{∞} X_n, where X_0 = {a} and, for n ≥ 0, X_{n+1} = X_n ·_a (L ∪ {a}).†

Example 3.38. Let G = (N, Σ, R, S) be the regular tree grammar with N = {S}, Σ_0 = {a}, Σ_2 = {b} and R = {S → b[aS], S → a}. Then L(G) = {b[aS]}^{∗S} ·_S a.

The "corresponding" operation on strings has several names in the literature. Let us call it "substitution closure".

Definition 3.39. Let ∆ be an alphabet and a ∈ ∆. For a language L over ∆, the substitution closure of L at a, denoted by L^{∗a}, is defined to be ⋃_{n=0}^{∞} X_n, where X_0 = {a} and, for n ≥ 0, X_{n+1} = X_n ·_a (L ∪ {a}).
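The stages X_n of Definition 3.37 can be computed directly. The sketch below assumes (as Definition 2.27 suggests but this chunk does not show) that different occurrences of a may independently receive different trees; the encoding and function names are my own.

```python
import itertools

def conc_tree(t, a, L2):
    """All trees obtained from t by replacing each occurrence of the
    rank-0 symbol a by some tree of L2 (occurrences chosen independently)."""
    if t == a:
        return set(L2)
    if isinstance(t, str):
        return {t}
    choices = [conc_tree(c, a, L2) for c in t[1:]]
    return {(t[0],) + combo for combo in itertools.product(*choices)}

def closure_stage(L, a, n):
    """X_n from Definition 3.37: X_0 = {a}, X_{m+1} = X_m .a (L ∪ {a})."""
    X = {a}
    for _ in range(n):
        X = set().union(*(conc_tree(t, a, set(L) | {a}) for t in X))
    return X
```

For the grammar of Example 3.38, closure_stage({b[aS]}, S, 2) gives {S, b[aS], b[ab[aS]]}, and substituting a for S afterwards yields the first three trees of L(G).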

Exercise 3.40. Let a ∈ Σ_0, a ≠ e, and let L ⊆ T_Σ. Prove that yield(L^{∗a}) = (yield(L))^{∗a}.

Theorem 3.41. RECOG is closed under tree concatenation closure.

Proof. Again the proof is a straightforward generalization of the string case. Let G = (N, Σ, R, S) be a regular tree grammar in normal form, and let a ∈ Σ_0. Construct the regular tree grammar G′ = (N ∪ {S_0}, Σ, R′, S_0), where R′ = R ∪ {A → S | A → a is in R} ∪ {S_0 → S, S_0 → a}. Then L(G′) = (L(G))^{∗a}.

Corollary 3.42. CFL is closed under substitution closure.

Proof. Use Theorem 3.28 and Exercise 3.40.

It is well known that the class of regular string languages is the smallest class containing the finite languages and closed under union, concatenation and closure. A similar result holds for recognizable tree languages.

Theorem 3.43. RECOG is the smallest class of tree languages containing the finite tree languages and closed under union, tree concatenation and tree concatenation closure.

Proof. We have shown that RECOG satisfies the above conditions in Exercise 3.26 and Theorems 3.32, 3.35 and 3.41. It remains to show that every recognizable tree language can be built up from the finite tree languages using the operations ∪, ·_a and ∗_a. Let G = (N, Σ, R, S) be a regular tree grammar (it is easy to think of it as being in normal form). We shall use the elements of N to do tree concatenation at. For A ∈ N and P, Q ⊆ N with P ∩ Q = ∅, let us denote by L^Q_{A,P} the set of all trees t ∈ T_Σ(P) for which there is a derivation A ⇒ t_1 ⇒ t_2 ⇒ ··· ⇒ t_n ⇒ t_{n+1} = t (n ≥ 0) such that, for 1 ≤ i ≤ n, t_i ∈ T_Σ(Q ∪ P) and a rule with left hand side in Q is applied to t_i to obtain t_{i+1}. We shall show, by induction on the cardinality of Q, that all sets L^Q_{A,P} can be built up from the finite tree languages by the operations ∪, ·_B and ∗_B (for all B ∈ N). For Q = ∅, L^∅_{A,P} is the set of all those right hand sides of rules with left hand side A that are in T_Σ(P). Thus L^∅_{A,P} is a finite tree language for all A and P. Assuming now that, for Q ⊆ N, all sets L^Q_{A,P} can be built up from the finite tree languages, the same holds for all sets L^{Q∪{B}}_{A,P}, where B ∈ N − Q, since

L^{Q∪{B}}_{A,P} = L^Q_{A,P∪{B}} ·_B (L^Q_{B,P∪{B}})^{∗B} ·_B L^Q_{B,P}

(a formal proof of this equation is left to the reader). Thus, since L(G) = L^N_{S,∅}, the theorem is proved.

In other words, each recognizable tree language can be denoted by a "regular expression" with trees as constants and ∪, ·_A and ∗_A as operators.

Exercise 3.44. Try to find a regular expression for the language generated by the regular tree grammar G = (N, Σ, R, S) with N = {S, T}, Σ_0 = {a}, Σ_2 = {p} and R = {S → p[T S], S → a, T → p[T T], T → a}. Use the algorithm in the proof of Theorem 3.43.

As a corollary we obtain the result that all context-free languages can be denoted by "context-free expressions".

Corollary 3.45. CFL is the smallest class of languages containing the finite languages and closed under union, substitution and substitution closure.

Proof. Exercise.

† Recall the notation L_1 ·_a L_2 from Definition 2.27.

Exercise 3.46. Define the operation of "iterated concatenation at a" (for tree languages) and "iterated substitution at a" (for string languages) by it_a(L) = L^{∗a} ·_a ∅. Prove (using Theorem 3.43) that RECOG is the smallest class of tree languages containing the finite tree languages and closed under the operations of union, top concatenation and iterated concatenation. Show that this implies that CFL is the smallest class of languages containing the finite languages and closed under the operations of union, concatenation and iterated substitution (cf. [Sal, VI.11]).

Let us now turn to another operation on trees: that of relabeling the nodes of a tree.

Definition 3.47. Let Σ and ∆ be ranked alphabets. A relabeling r is a family {r_k}_{k≥0} of mappings r_k: Σ_k → P(∆_k). A relabeling determines a mapping r: T_Σ → P(T_∆) by the requirements

(i) for a ∈ Σ_0, r(a) = r_0(a);

(ii) for k ≥ 1, a ∈ Σ_k and t_1, ..., t_k ∈ T_Σ, r(a[t_1 ··· t_k]) = {b[s_1 ··· s_k] | b ∈ r_k(a) and s_i ∈ r(t_i)}.


If, for each k ≥ 0 and each a ∈ Σ_k, r_k(a) consists of one element only, then r is called a projection. Obviously, RECOG is closed under relabelings.

Theorem 3.48. RECOG is closed under relabelings.

Proof. Let r be a relabeling, and consider some regular tree grammar G. By replacing each rule A → t of G by all rules A → s, s ∈ r(t), one obtains a regular tree grammar for r(L(G)). (In order that "r(t)" makes sense, we define r(B) = {B} for each nonterminal B of G.)

We are now in a position to study the connection between recognizable tree languages and sets of derivation trees of context-free grammars. We shall consider two kinds of derivation trees. First we define the "ordinary" kind of derivation tree (cf. Example 1.1).

Definition 3.49. Let G = (N, Σ, R, S) be a context-free grammar. Let ∆ be the ranked alphabet such that ∆_0 = Σ ∪ {e} and, for k ≥ 1, ∆_k is the set of nonterminals A ∈ N for which there is a rule A → w with |w| = k (in case k = 1: |w| = 1 or |w| = 0). For each α ∈ N ∪ Σ, the set of derivation trees with top α, denoted by D^α_G, is the tree language over ∆ defined recursively as follows:

(i) for each a in Σ, a ∈ D^a_G;

(ii) for each rule A → α_1 ··· α_n in R (n ≥ 1, A ∈ N, α_i ∈ Σ ∪ N), if t_i ∈ D^{α_i}_G for 1 ≤ i ≤ n, then A[t_1 ··· t_n] ∈ D^A_G;

(iii) for each rule A → λ in R, A[e] ∈ D^A_G.

Definition 3.50. A tree language L is said to be local if, for some context-free grammar G = (N, Σ, R, S) and some set of symbols V ⊆ N ∪ Σ, L = ⋃_{α∈V} D^α_G.

Exercise 3.51. Show that each local tree language is recognizable.

Note that a local tree language is the set of all derivation trees of a context-free grammar which has a set of initial symbols (instead of one initial nonterminal). The reason for the name "local" is that such a tree language L is determined by (1) a finite set of trees of height one, (2) a finite set of "initial symbols", (3) a finite set of "final symbols", and the requirement that L consists of all trees t such that each node of t together with its direct descendants belongs to (1), the top label of t belongs to (2), and the leaf labels of t belong to (3). We now show that the class of local tree languages is properly included in RECOG.

Theorem 3.52. There are recognizable tree languages which are not local.

Proof. Let Σ_0 = {a, b} and Σ_2 = {S}. Consider the tree language L = {S[S[ba]S[ab]]}. Obviously L is recognizable. Suppose that L is local. Then there is a context-free grammar G such that D^S_G = L. Thus S → SS, S → ba and S → ab are rules of G. But then S[S[ab]S[ba]] ∈ L. Contradiction.†
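The three-part characterization of local tree languages described above can be checked mechanically, and doing so reproduces the argument of Theorem 3.52: the tree S[S[ab]S[ba]] is built from exactly the same height-one blocks, top label and leaf labels as S[S[ba]S[ab]], so any local language containing the latter contains the former. The encoding and names below are mine, not from the notes.

```python
def forks(t):
    """The height-one building blocks of t: pairs (label, child labels)."""
    if isinstance(t, str):
        return set()
    f = {(t[0], tuple(c if isinstance(c, str) else c[0] for c in t[1:]))}
    for c in t[1:]:
        f |= forks(c)
    return f

def leaves(t):
    return {t} if isinstance(t, str) else set().union(*(leaves(c) for c in t[1:]))

def is_local_member(t, blocks, tops, finals):
    top = t if isinstance(t, str) else t[0]
    return forks(t) <= blocks and top in tops and leaves(t) <= finals
```

Membership of the "swapped" tree follows from the blocks of the original tree alone, illustrating why L = {S[S[ba]S[ab]]} cannot be local.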

Note that the recognizable tree language L in the above proof can be recognized by a deterministic top-down fta. Note also that the tree language given in the proof of Theorem 3.14 is local. Hence the local tree languages and the tree languages recognized by det. top-down fta are incomparable.

Exercise 3.53. Find a recognizable tree language which is neither local nor recognizable by a det. top-down fta.

It is clear that, if Σ_0 = {a, b} and Σ_2 = {S_1, S_2, S_3}, then L_0 = {S_1[S_2[ba]S_3[ab]]} is a local language. Hence the language L in Theorem 3.52 is the projection of the local language L_0 (project S_1, S_2 and S_3 on S). We will show that this is true in general: each recognizable tree language is the projection of a local tree language. In fact we shall show a slightly stronger fact. To do this we define the second type of derivation tree of a context-free grammar, called "rule tree".

Definition 3.54. Let G = (N, Σ, R, S) be a context-free grammar. Let R̄ be any set of symbols in one-to-one correspondence with R, R̄ = {r̄ | r ∈ R}. Each element of R̄ is given a rank such that, if r in R is of the form A → w_0 A_1 w_1 A_2 w_2 ··· A_k w_k (for some k ≥ 0, A_1, ..., A_k ∈ N and w_0, w_1, ..., w_k ∈ Σ*), then r̄ ∈ R̄_k. The set of rule trees of G, denoted by RT(G), is defined to be the tree language generated by the regular tree grammar Ḡ = (N, R̄, P, S), where P is defined by

(i) if r = (A → w_0 A_1 ··· w_{k−1} A_k w_k), k ≥ 1, is in R, then A → r̄[A_1 ··· A_k] is in P;

(ii) if r = (A → w_0) is in R, then A → r̄ is in P.

Definition 3.55. We shall say that a tree language L is a rule tree language if L = RT(G) for some context-free grammar G.

Thus, a rule tree is a derivation tree in which the nodes are labeled by the rules applied during the derivation. It should be obvious that for each context-free grammar G = (N, Σ, R, S) there is a one-to-one correspondence between the tree languages RT(G) and D^S_G.

Example 3.56. Consider Example 1.1. For each rule r in that example, let (r) stand for a new symbol. The rule tree "corresponding" to the derivation tree displayed in Example 1.1 is

† Other examples are for instance {S[T[a]T[b]]} and {S[S[a]]}.


(S → AD)[ (A → AA)[ (A → bAa)[(A → λ)] (A → aAb)[(A → λ)] ] (D → Ddd)[(D → d)] ].

Note that this tree is obtained from the other one by viewing the building blocks (trees of height one) of the local tree as the nodes of the rule tree. The following theorem shows the relationship of the rule tree languages to those defined before.

Theorem 3.57. The class of rule tree languages is properly included in the intersection of the class of local tree languages and the class of tree languages recognizable by a det. top-down fta.

Proof. We first show inclusion in the class of local tree languages. Let G = (N, Σ, R, S) be a context-free grammar, and R̄ = {r̄ | r ∈ R}. Consider the context-free grammar G_1 = (R̄ − R̄_0, R̄_0, P, −), where P is defined as follows: if r = (A → w_0 A_1 w_1 ··· A_k w_k), k ≥ 1, is in R, then r̄ → r̄_1 ··· r̄_k is in P for all rules r_1, ..., r_k ∈ R such that the left hand side of r_i is A_i (1 ≤ i ≤ k). Let V = {r̄ | r ∈ R has left hand side S}. Then RT(G) = ⋃_{α∈V} D^α_{G_1}, and hence RT(G) is local.

To show that RT(G) can be recognized by a deterministic top-down fta, consider M = (Q, R̄, δ, q_0, F), where Q = N ∪ {W}, q_0 = S, for r̄ ∈ R̄_0, F_{r̄} consists of the left hand side of r only, and for r̄ ∈ R̄_k, r of the form A → w_0 A_1 w_1 A_2 w_2 ··· A_k w_k, δ_{r̄}(A) = (A_1, ..., A_k) and δ_{r̄}(B) = (W, ..., W) for all other B ∈ Q. Then L(M) = RT(G).

To show proper inclusion, let H be the context-free grammar with rules S → SS, S → aS, S → Sb and S → ab. Then D^S_H is a local tree language. It is easy to see that D^S_H can be recognized by a det. top-down fta. Now suppose that D^S_H = RT(G) for some context-free grammar G. Since S has rank 2 and since the configuration S[SS] occurs in D^S_H, S is the name of a rule of G of the form A → w_0 A w_1 A w_2. Now, since a and b are of rank 0 and since S[ab] is in D^S_H, a and b are names of rules A → w_3 and A → w_4. Hence S[ba] is a rule tree of G. Contradiction.


We now characterize the recognizable tree languages in terms of rule tree languages.

Theorem 3.58. Every recognizable tree language is the projection of a rule tree language.

Proof. Let G = (N, Σ, R, S) be a regular tree grammar in normal form. We shall define a regular tree grammar Ḡ and a projection p such that L(G) = p(L(Ḡ)) and L(Ḡ) is a rule tree language. Ḡ will simulate G, but Ḡ will put all information about the rules, applied during the derivation of a tree t, into the tree itself. This is a useful technique. Let R̄ be a set of symbols in one-to-one correspondence with R, and let Ḡ = (N, R̄, P, S). The ranking of R̄, the set P of rules and the projection p are defined simultaneously as follows:

(i) if r ∈ R is the rule A → a[B_1 ··· B_k], then r̄ has rank k, A → r̄[B_1 ··· B_k] is in P and p_k(r̄) = a;

(ii) if r ∈ R is the rule A → a, then r̄ has rank 0, A → r̄ is in P and p_0(r̄) = a.

It is obvious that p(L(Ḡ)) = L(G). Now note that G may be viewed as a context-free grammar (over Σ ∪ {[ , ]}). In fact, Ḡ is the same as the one constructed in Definition 3.54! Thus L(Ḡ) is a rule tree language.

Since RECOG is closed under projections (Theorem 3.48), we now easily obtain the following corollary.

Corollary 3.59. For each tree language L the following four statements are equivalent: (i) L is recognizable; (ii) L is the projection of a rule tree language; (iii) L is the projection of a local tree language; (iv) L is the projection of a tree language recognizable by a det. top-down fta.

Exercise 3.60. Show that, in the case of local tree languages, the projection involved in the above corollary (iii) can be taken as the identity on symbols of rank 0 (thus the yields are preserved). As a final operation on trees we consider the notion of tree homomorphism. For strings, a homomorphism h associates a string h(a) with each symbol a of the alphabet, and transforms a string a1 a2 · · · an into the string h(a1 ) · h(a2 ) · · · h(an ). Generalizing this to trees, a tree homomorphism h associates a tree h(a) with each symbol a of the ranked alphabet (actually, one tree for each rank). The application of h to a tree t consists in replacing each symbol a of t by the tree h(a), and tree concatenating all the resulting trees. Note that, if a is of rank k, then h(a) should be tree concatenated with k other trees; therefore, since tree concatenation happens at symbols of rank 0, the tree h(a) should contain at least k different symbols of rank 0. Since, in general, the number of symbols of rank 0 in some alphabet may be less than the rank of some other symbol, we allow for the use of an arbitrary number of auxiliary symbols of rank 0, called “variables” (recall the use of nonterminals as auxiliary symbols of rank 0 in Theorem 3.43).


Definition 3.61. Let x_1, x_2, x_3, ... be an infinite sequence of different symbols, called variables. Let X = {x_1, x_2, x_3, ...}, for k ≥ 1, X_k = {x_1, x_2, ..., x_k}, and X_0 = ∅. Elements of X will also be denoted by x, y and z.

Definition 3.62. Let Σ and ∆ be ranked alphabets. A tree homomorphism h is a family {h_k}_{k≥0} of mappings h_k: Σ_k → T_∆(X_k). A tree homomorphism determines a mapping h: T_Σ → T_∆ as follows:

(i) for a ∈ Σ_0, h(a) = h_0(a);

(ii) for k ≥ 1, a ∈ Σ_k and t_1, ..., t_k ∈ T_Σ, h(a[t_1 ··· t_k]) = h_k(a)⟨x_1 ← h(t_1), ..., x_k ← h(t_k)⟩.

In the particular case that, for each a ∈ Σ_k, h_k(a) does not contain two occurrences of the same x_i (i = 1, 2, 3, ...), h is called a linear tree homomorphism. A general tree homomorphism h has the abilities of deleting (h_k(a) does not contain x_i), copying (h_k(a) contains ≥ 2 occurrences of x_i) and permuting (if i < j, then x_j may occur before x_i in h_k(a)) subtrees. Moreover, at each node, it can add pieces of tree (the frontier of h_k(a) need not be an element of X*). A linear homomorphism cannot copy. Note that, to obtain the proper generalization of the monadic case, one should also forbid deletion, and require that h_0(a) = a for all a ∈ Σ_0 (moreover, no pieces of tree should be added).

Exercise 3.63. Let Σ_0 = {a, b} and Σ_2 = {p}. Consider the tree homomorphism h such that h_0(a) = a, h_0(b) = b and h_2(p) = p[x_2 x_1]. Show that, for every t in T_Σ, yield(h(t)) is the mirror image of yield(t).

It is easy to see that recognizable tree languages are not closed under arbitrary tree homomorphisms; they are closed under linear tree homomorphisms. This is shown in the following two theorems.

Theorem 3.64. RECOG is not closed under arbitrary tree homomorphisms.

Proof. Let Σ_0 = {a} and Σ_1 = {b}. Let h be the tree homomorphism defined by h_0(a) = a and h_1(b) = b[x_1 x_1]. Consider the recognizable tree language T_Σ. It is easy to prove that yield(h(T_Σ)) = {a^{2^n} | n ≥ 0}. Since {a^{2^n} | n ≥ 0} is not a context-free language, Theorem 3.28 implies that h(T_Σ) is not recognizable.

Theorem 3.65. RECOG is closed under linear tree homomorphisms.

Proof. The idea of the proof is obvious. Given some regular tree grammar generating a recognizable tree language, we change the right hand sides of all rules into their homomorphic images. The resulting grammar generates homomorphic images of "sentential forms" of the original grammar (note that this wouldn't work in the nonlinear case). The only thing we should worry about is that the homomorphism may be deleting. In that case superfluous rules in the original grammar might be transformed into useful rules


in the new grammar. This is solved by requiring that the original grammar does not contain any superfluous rule. The formal construction is as follows.

Let G = (N, Σ, R, S) be a regular tree grammar in normal form, such that for each nonterminal A there is at least one t ∈ T_Σ such that A ⇒*_G t (since G is a context-free grammar, it is well known that each regular tree grammar has an equivalent one satisfying this condition). Let h be a homomorphism from T_Σ into T_∆, for some ranked alphabet ∆. Extend h to trees in T_Σ(N) by defining h_0(A) = A for all A in N. Thus h is now a homomorphism from T_Σ(N) into T_∆(N). Construct the regular tree grammar H = (N, ∆, R̄, S), where R̄ = {A → h(t) | A → t is in R}. To show that L(H) = h(L(G)) we shall prove that

(1) if A ⇒*_G t, then A ⇒*_H h(t) (t ∈ T_Σ); and

(2) if A ⇒*_H s, then there exists t such that h(t) = s and A ⇒*_G t (s ∈ T_∆, t ∈ T_Σ).

Let us give the straightforward proofs as detailed as possible.

(1) The proof is by induction on the number of steps in the derivation A ⇒*_G t. If, in one step, A ⇒_G t, then t ∈ Σ_0 and A → t is in R. Hence A → h(t) is in R̄, and so A ⇒*_H h(t). Now suppose that the first step in A ⇒*_G t results from the application of a rule of the form A → a[B_1 ··· B_k]. Then A ⇒*_G t is of the form A ⇒_G a[B_1 ··· B_k] ⇒*_G t. It follows that t is of the form a[t_1 ··· t_k] such that B_i ⇒*_G t_i for all 1 ≤ i ≤ k. Hence, by induction, B_i ⇒*_H h(t_i). Now, since the rule A → h_k(a)⟨x_1 ← B_1, ..., x_k ← B_k⟩ is in R̄ by definition, we have (prove this!)

A ⇒_H h_k(a)⟨x_1 ← B_1, ..., x_k ← B_k⟩ ⇒*_H h_k(a)⟨x_1 ← h(t_1), ..., x_k ← h(t_k)⟩ = h(a[t_1 ··· t_k]) = h(t).

(2) The proof is by induction on the number of steps in A ⇒*_H s. For zero steps the statement is trivially true. Suppose that the first step in A ⇒*_H s results from the application of a rule A → h_0(a) for some a in Σ_0. Then h(a) = s and A ⇒*_G a. Now suppose that the first step results from the application of a rule A → h_k(a)⟨x_1 ← B_1, ..., x_k ← B_k⟩, where A → a[B_1 ··· B_k] is a rule of G. Then the derivation is A ⇒_H h_k(a)⟨x_1 ← B_1, ..., x_k ← B_k⟩ ⇒*_H s. At this point we need both the linearity of h (to be sure that each B_i in h_k(a)⟨x_1 ← B_1, ..., x_k ← B_k⟩ produces at most one subtree of s) and the condition on G (to deal with deletion: since h_k(a)⟨x_1 ← B_1, ..., x_k ← B_k⟩ need not contain an occurrence of B_i, we need an arbitrary tree generated, in G, by B_i to be able to construct the tree t such that h(t) = s). There exist trees s_1, ..., s_k in T_∆ such that s = h_k(a)⟨x_1 ← s_1, ..., x_k ← s_k⟩ and

(i) if x_i occurs in h_k(a), then B_i ⇒*_H s_i;

(ii) if x_i does not occur in h_k(a), then s_i = h(t_i) for some arbitrary t_i such that B_i ⇒*_G t_i.

Hence, by induction and (ii), there are trees t_1, ..., t_k such that h(t_i) = s_i and B_i ⇒*_G t_i for all 1 ≤ i ≤ k. Consequently, if t = a[t_1 ··· t_k], then A ⇒_G a[B_1 ··· B_k] ⇒*_G a[t_1 ··· t_k] = t, and h(t) = h_k(a)⟨x_1 ← h(t_1), ..., x_k ← h(t_k)⟩ = s. This proves the theorem.
Exercise 3.66. In the string case one can also prove that the regular languages are closed under homomorphisms by using the Kleene characterization theorem. Give an alternative proof of Theorem 3.65 by using Theorem 3.43 (use the fact, which is implicit in the proof of that theorem, that each regular tree language over the ranked alphabet Σ can be built up from finite tree languages using operations ∪, ·A and ∗A , where A ∈ / Σ). As an indication how one could use theorems like Theorem 3.65, we prove the following theorem, which is (slightly!) stronger than Theorem 3.28. Theorem 3.67. Each context-free language over ∆ is the yield of a recognizable tree language over Σ, where Σ0 = ∆ ∪ {e} and Σ2 = {∗}. Proof. Let L be a context-free language over ∆. By Theorem 3.28, there is a recognizable tree language U over some ranked alphabet Ω with Ω0 = ∆ ∪ {e}, such that yield(U ) = L. Let h be the linear tree homomorphism from TΩ into TΣ such that h0 (a) = a for all a in ∆ ∪ {e}, h1 (a) = x1 for all a in Ω1 , and hk (a) = ∗[x1 ∗ [x2 ∗ [· · · ∗ [xk−1 xk ] · · · ]]] for all a in Ωk , k ≥ 2. By Theorem 3.65, h(U ) is a recognizable tree language over Σ. It is easy to show that, for each t in TΩ , yield(h(t)) = yield(t). Hence yield(h(U )) = yield(U ) = L. Note that Theorem 3.67 is “equivalent” to the fact that each context-free language can be generated by a context-free grammar in Chomsky normal form. Exercise 3.68. Try to show that RECOG is closed under inverse (not necessarily linear) homomorphisms; that is, if L ∈ RECOG and h is a tree homomorphism, then h−1 (L) = {t | h(t) ∈ L} is recognizable. (Represent L by a det. bottom-up fta). We have now discussed all AFL operations (see [Sal, IV]) generalized to trees: union, tree concatenation, tree concatenation closure, (linear) tree homomorphism, inverse tree homomorphism and intersection with a recognizable tree language. Thus, according to previous results, RECOG is a “tree AFL”. Exercise 3.69. 
Generalize the operation of string substitution (see Definition 2.29) to trees, and show that RECOG is closed under “linear tree substitution”.


Exercise 3.70. Suppose you don't know about context-free grammars. Consider the notion of regular tree grammar. Give a recursive definition of the relation ⇒ for such a grammar. Show that, if a[t_1 ··· t_k] ⇒* s, then there are s_1, ..., s_k such that s = a[s_1 ··· s_k] and t_i ⇒* s_i for all 1 ≤ i ≤ k. Which of the two definitions of regular tree grammar do you prefer?

3.3 Decidability

Obviously the membership problem for recognizable tree languages is solvable: given a tree t and an fta M, just feed t into M and see whether t is recognized or not. We now want to prove that the emptiness and finiteness problems for recognizable tree languages are solvable. To do this we generalize the pumping lemma for regular string languages to recognizable tree languages: for each regular string language L there is an integer p such that for all strings z in L, if |z| ≥ p, then there are strings u, v and w such that z = uvw, |vw| ≤ p, |v| ≥ 1 and, for all n ≥ 0, uv^n w ∈ L.

Theorem 3.71. Let Σ be a ranked alphabet, and x a symbol not in Σ. For each recognizable tree language L over Σ we can find an integer p such that for all trees t in L, if height(t) ≥ p, then there are trees u, v, w ∈ T_Σ({x}) such that

(i) u and v contain exactly one occurrence of x, and w ∈ T_Σ;

(ii) t = u ·_x v ·_x w;

(iii) height(v ·_x w) ≤ p;

(iv) height(v) ≥ 1, and

(v) for all n ≥ 0, u ·_x v^n_x ·_x w ∈ L, where v^n_x = v ·_x v ·_x ··· ·_x v (n times).

Proof. Let M = (Q, Σ, δ, s, F) be a deterministic bottom-up fta recognizing L. Let p be the number of states of M. Consider a tree t ∈ L(M) such that height(t) ≥ p. Considering some path of maximal length through t, it is clear that there are trees t_1, t_2, ..., t_n ∈ T_Σ({x}) such that n ≥ p + 1, t = t_n ·_x t_{n−1} ·_x ··· ·_x t_2 ·_x t_1, the trees t_2, ..., t_n contain exactly one occurrence of x and have height ≥ 1, and t_1 ∈ T_Σ (this is a "linearization" of t according to some path). Now consider the states q_i = δ̂(t_i ·_x ··· ·_x t_1) for 1 ≤ i ≤ n. Then, among q_1, ..., q_{p+1} there are two equal states: there are i, j such that q_i = q_j and 1 ≤ i < j ≤ p + 1. Let u = t_n ·_x ··· ·_x t_{j+1}, v = t_j ·_x ··· ·_x t_{i+1} and w = t_i ·_x ··· ·_x t_1. Then requirements (i)-(iv) in the statement of the theorem are obviously satisfied. Furthermore, in general, if δ̂(s_1) = δ̂(s_2), then δ̂(s ·_x s_1) = δ̂(s ·_x s_2). Hence, since δ̂(v ·_x w) = δ̂(w), requirement (v) is also satisfied.

As a corollary to Theorem 3.71 we obtain the pumping lemma for context-free languages.

Corollary 3.72. For each context-free language L over ∆ we can find an integer q such that for all strings z ∈ L, if |z| ≥ q, then there are strings u_1, v_1, w_0, v_2 and u_2 in ∆*, such that z = u_1 v_1 w_0 v_2 u_2, |v_1 w_0 v_2| ≤ q, |v_1 v_2| > 0, and, for all n ≥ 0, u_1 v_1^n w_0 v_2^n u_2 ∈ L.


Proof. By Theorem 3.67, and the fact that each context-free language can be generated by a λ-free context-free grammar, there is a recognizable tree language U over Σ such that yield(U ) = L, where Σ0 = ∆ and Σ2 = {∗}. Let p be the integer corresponding to U according to Theorem 3.71, and put q = 2p . Obviously, if z ∈ L and |z| ≥ q, then there is a t in U such that yield(t) = z and height(t) ≥ p. Then, by Theorem 3.71 there are trees u, v and w such that (i)-(v) in that theorem hold. Thus t = u ·x v ·x w. Let yield(u) = u1 xu2 , yield(v) = v1 xv2 and yield(w) = w0 (see (i)). Then z = yield(t) = yield(u ·x v ·x w) = yield(u) ·x yield(v) ·x yield(w) = u1 v1 w0 v2 u2 . It is easy to see that all other requirements stated in the corollary are also satisfied. Exercise 3.73. Let Σ be a ranked alphabet such that Σ0 = {a, b}. Show that the tree language {t ∈ TΣ | yield(t) has an equal number of a’s and b’s} is not recognizable. From the pumping lemma the decidability of both emptiness and finiteness problem for RECOG follows. Theorem 3.74. The emptiness problem for recognizable tree languages is decidable. Proof. Let L be a recognizable tree language, and let p be the integer of Theorem 3.71. Obviously (using n = 0 in point (v)), L is nonempty if and only if L contains a tree of height < p. Theorem 3.75. The finiteness problem for recognizable tree languages is decidable. Proof. Let L be a recognizable tree language, and let p be the integer of Theorem 3.71. Obviously, L is finite if and only if all trees in L are of height < p. Thus, L is finite iff L ∩ {t | height(t) ≥ p} = ∅. Since L ∩ {t | height(t) ≥ p} is recognizable (Exercise 3.26 and Theorem 3.32), this is decidable by the previous theorem. Note that the decidability of emptiness and finiteness problem for context-free languages follows from these two theorems together with the “yield-theorem” (with e ∈ / Σ0 , Σ1 = ∅). 
As in the string case we now obtain the decidability of inclusion of recognizable tree languages (and hence of equality).

Theorem 3.76. It is decidable, for arbitrary recognizable tree languages U and V, whether U ⊆ V (and also, whether U = V).

Proof. Since U is included in V iff the intersection of U with the complement of V is empty, the theorem follows from Theorems 3.32 and 3.74.

Note again that each regular tree language is a special kind of context-free language. Note also that inclusion of context-free languages is not decidable (nor is equality). Therefore it is nice that we have found a subclass of CFL for which inclusion and equality are decidable. Note also that CFL is not closed under intersection, but RECOG is. We shall now relate these facts to some results in the literature concerning "parenthesis languages" and "structural equivalence" of context-free grammars (see [Sal, VIII.3]).


Definition 3.77. A parenthesis grammar is a context-free grammar G = (N, Σ ∪ {[ , ]}, R, S) such that each rule in R is of the form A → [w] with A ∈ N and w ∈ (Σ ∪ N)∗. The language generated by G is called a parenthesis language.

To relate parenthesis languages to recognizable tree languages, let us restrict attention to ranked alphabets ∆ such that, for k ≥ 1, if ∆k ≠ ∅, then ∆k = {∗}, where ∗ is a fixed symbol. Suppose that in our recursive definition of "tree" we change a[t1 · · · tk] into [a t1 · · · tk] (see Definition 2.5 and Remark 2.35). Then, obviously, all our results about the class RECOG are still valid. Furthermore, since ∗ is the only symbol of rank ≥ 1, we may as well replace [∗ by [. In this way, each parenthesis language is in RECOG (in fact, each parenthesis grammar is a regular tree grammar). It is also easy to see that, if L is a recognizable tree language (over a restricted ranked alphabet ∆), then L − ∆0 is a parenthesis language. From these considerations we obtain the following theorem.

Theorem 3.78. The class of parenthesis languages is closed under union, intersection and subtraction. The inclusion problem for parenthesis languages is decidable.

Proof. The first statement follows directly from Theorem 3.32 and the last remark above. The second statement follows directly from Theorem 3.76.

A paraphrase of this theorem is obtained as follows.

Definition 3.79. For any ranked alphabet, let p be the projection such that p(a) = a for all symbols of rank 0, and p(a) = ∗ for all symbols of rank ≥ 1. Let G be a context-free grammar. The bare tree language of G, denoted by BT(G), is p(D_G^S), where S is the initial nonterminal of G. We say that two context-free grammars G1 and G2 are structurally equivalent iff they generate the same bare tree language (i.e., BT(G1) = BT(G2)).

Thus, G1 and G2 are structurally equivalent if their sets of derivation trees are the same after "erasing" all nonterminals.

Theorem 3.80. It is decidable for arbitrary context-free grammars whether they are structurally equivalent.

Proof. For any context-free grammar G = (N, Σ, R, S), let [G] be the parenthesis grammar (N, Σ ∪ {[ , ]}, R′, S), where R′ = {A → [w] | A → w is in R}. Obviously, L([G]) = BT(G). Hence, by Theorem 3.78, the theorem holds.

Exercise 3.81. Show that, for any two context-free grammars G1 and G2 there exists a context-free grammar G3 such that BT(G3) = BT(G1) ∩ BT(G2).

Exercise 3.82. Show that each context-free grammar has a structurally equivalent context-free grammar that is invertible (cf. Exercise 3.29).

Exercise 3.83.
Consider the “bracketed context-free languages” of Ginsburg and Harrison, and show that some of their results follow easily from results about RECOG (show first that each recognizable tree language is a deterministic context-free language).
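The bracketing construction behind these results (each rule A → w becomes A → [w], as in the proof of Theorem 3.80) is immediate to write down; a small sketch, with our own encoding of grammars as dictionaries:

```python
# Build the parenthesis grammar [G] of Theorem 3.80: every rule
# A -> w of G becomes A -> [w], so that L([G]) = BT(G) up to the
# projection p that erases nonterminals to '*'.

def parenthesize(rules):
    # rules: dict nonterminal -> list of right-hand sides (tuples of symbols)
    return {A: [("[",) + w + ("]",) for w in ws] for A, ws in rules.items()}

G = {"E": [("E", "+", "T"), ("a",)], "T": [("a",)]}
print(parenthesize(G)["E"])
# [('[', 'E', '+', 'T', ']'), ('[', 'a', ']')]
```

Deciding structural equivalence of G1 and G2 then reduces to equality of the parenthesis languages L([G1]) and L([G2]), which Theorem 3.76 makes effective.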


Exercise 3.84. Investigate whether it is decidable for an arbitrary recognizable tree language R
(i) whether R is local;
(ii) whether R is a rule tree language;
(iii) whether R is recognizable by a deterministic top-down fta.

4 Finite state tree transformations

4.1 Introduction: Tree transducers and semantics

In this part we will be concerned with the notion of a tree transducer: a machine that takes a tree as input and produces another tree as output. In all generality we may view a tree transducer as a device that gives meaning to structured objects (i.e., a semantics defining device). Let us try to indicate this aspect of tree transducers.

Consider a ranked alphabet Σ. The elements of Σ may be viewed as "operators", i.e., symbols denoting operations (functions of several arguments). The rank of an operator stands for the number of arguments of the operation (note therefore that one operator may denote several operations). The operators of rank 0 have no arguments: they are (denote) constants. As an example, the ranked alphabet Σ with Σ0 = {e, a, b} and Σ2 = {f} may be viewed as consisting of three constants e, a and b and one binary operator f. From operators we may form "terms" or "expressions", like for instance f(a, f(e, b)), or perhaps, denoting f by ∗, (a ∗ (e ∗ b)). Obviously the terms are in one-to-one correspondence with the set TΣ of trees over Σ. Thus the notions tree and term may be identified. Intuitively, terms denote structured objects, obtained by applying the operations to the constants.

Formally, meaning is given to operators and terms by way of an "interpretation". An interpretation of Σ consists of a "domain" B, for each element a ∈ Σ0 an element h0(a) of B, and for each k ≥ 1 and operator a ∈ Σk an operation hk(a) : B^k → B. An interpretation of Σ is also called a "Σ-algebra" or "algebra of type Σ". An interpretation (B, {h0(a)}a∈Σ0, {hk(a)}a∈Σk) determines a mapping h : TΣ → B (giving an interpretation to each term as an element of B) as follows:
(i) for a ∈ Σ0, h(a) = h0(a);
(ii) for k ≥ 1 and a ∈ Σk, h(a[t1 · · · tk]) = hk(a)(h(t1), . . . , h(tk)).
(Such a mapping is also called a "homomorphism" from TΣ into B).
Thus the meaning of a tree is uniquely determined by the meaning of its subtrees and the interpretation of the operator applied to these subtrees. In general we can say that the meaning of a structured object is a function of the meanings of its substructures, the function being determined by the way the object is constructed from its substructures. As an example, an interpretation of the above-mentioned ranked alphabet Σ = {e, a, b, f } might for instance consist of a group B with unity h0 (e), multiplication h2 (f ) and two specific elements h0 (a) and h0 (b). Or it might consist of B = {a, b}∗ ,
h0(e) = λ, h0(a) = a, h0(b) = b and h2(f) is concatenation. Note that in this case the mapping h : TΣ → B is the yield!

It is now easy to see that a deterministic bottom-up fta with input alphabet Σ is nothing else but a Σ-algebra with a finite domain (its set of states). Such an automaton may therefore be used as a semantic defining device in case there are only a finite number of possible semantical values. Obviously, in general, one needs an infinite number of semantical values. However, it is not attractive to consider arbitrary infinite domains B since this provides us with no knowledge about the structure of the elements of B. We therefore assume that the elements of B are structured objects: trees (or interpretations of them). Thus we consider Σ-algebras with domain T∆ for some ranked alphabet ∆. Our complete semantics of TΣ may then consist of two parts: an interpretation of TΣ into T∆ and an interpretation of T∆ in some ∆-algebra. The interpretation of TΣ into T∆ may be realized by a tree transducer.

An example of an interpretation of TΣ into T∆ is the tree homomorphism of Definition 3.62. In fact each tree s ∈ T∆(Xk) may be viewed as an operation s̃ : T∆^k → T∆, defined by s̃(s1, . . . , sk) = s⟨x1 ← s1, . . . , xk ← sk⟩. A tree homomorphism is then the same thing as an interpretation of Σ with domain T∆, where the allowable interpretations of the elements of Σ are the mappings s̃ above. Note that these interpretations are very natural, since the interpretation of a tree is obtained by "applying a finite number of ∆-operators to the interpretations of its subtrees". To show the relevance of tree homomorphisms (and therefore tree transducers in general) to the semantics of context-free languages we consider the following very simple example.

Example 4.1. Consider a context-free grammar generating expressions by the rules E → E + T, T → T ∗ F, E → a, T → a, F → a and F → (E).
Suppose we want to translate each expression into the equivalent post-fix expression. To do this we consider the rule tree language corresponding to this grammar and apply to the rule trees in this language the tree homomorphism h defined by h2(E → E + T) = E[x1 x2 +], h2(T → T ∗ F) = T[x1 x2 ∗], h0(E → a) = E[a], h0(T → a) = T[a], h0(F → a) = F[a] and h1(F → (E)) = F[x1]. Then the rule tree corresponding to an expression is translated into a tree whose yield is the corresponding post-fix expression. For instance, the rule tree

(E → E + T)[ (E → a) (T → T ∗ F)[ (T → a) (F → a) ] ]

goes into

E[ E[a] T[ T[a] F[a] ∗ ] + ],
so that a + a ∗ a is translated into aaa ∗ +. Note that, moreover, the transformed tree is the derivation tree of the post-fix expression in the context-free grammar with the rules E → ET +, T → T F ∗, E → a, T → a, F → a and F → E. Instead of interpreting this
derivation tree as its yield (the post-fix expression), one might also interpret it as, for instance, a sequence of machine code instructions, like "load a; load a; load a; multiply; add". It is not difficult to see that the syntax-directed translation schemes of [A&U, I.3] correspond in some way to linear, nondeleting homomorphisms working on rule tree languages.

To arrive at our general notion of tree transducer, we combine the finite tree automaton and the tree homomorphism into a "tree homomorphism with states" or a "finite tree automaton with output". This tree transducer will no longer be an interpretation of TΣ into T∆, but involves a generalization of this concept (although, by replacing T∆ by some other set, it can again be formulated as an interpretation of TΣ). Two ideas occur in this generalization.

Idea 4.2. The translation (meaning) of a tree may depend not only on the translation of its subtrees but also on certain properties of these subtrees. Assuming that these properties are recognizable (that is, the set of all trees having the property is in RECOG), they may be represented as states of a (deterministic) bottom-up fta. Thus we can combine the deterministic bottom-up fta and the tree homomorphism by associating to each symbol a ∈ Σk a mapping fa : Q^k → Q × T∆(Xk). This fa may be split up into two mappings δa : Q^k → Q and ha : Q^k → T∆(Xk). The δ-functions determine a mapping δ̂ : TΣ → Q, as for the bottom-up fta, and the h-functions determine an output-mapping ĥ : TΣ → T∆ by the formula (cf. the corresponding tree homomorphism formula):

ĥ(a[t1 · · · tk]) = ha(δ̂(t1), . . . , δ̂(tk))⟨x1 ← ĥ(t1), . . . , xk ← ĥ(tk)⟩.

Thus our tree transducer works through the tree in a bottom-up fashion just like the bottom-up fta, but at each step, it produces output by combining the output trees, already obtained from the subtrees, into one new output tree.
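Idea 4.2 can be phrased as a recursive evaluator: each pair (symbol, sequence of subtree states) determines a next state and an output pattern, and ĥ substitutes the subtree outputs into that pattern. A sketch under our own encoding (trees as nested tuples, the variables x1, . . . , xk as the integers 1..k):

```python
# Deterministic bottom-up transducer of Idea 4.2.
# Trees: ("a", child, ...).  Output patterns: trees over Delta
# whose leaves may be variable numbers 1..k.

def substitute(pattern, outputs):
    if isinstance(pattern, int):          # a variable x_i
        return outputs[pattern - 1]
    return (pattern[0],) + tuple(substitute(c, outputs) for c in pattern[1:])

def run(f, t):
    """f maps (symbol, tuple_of_states) -> (state, pattern)."""
    results = [run(f, c) for c in t[1:]]
    states = tuple(q for q, _ in results)
    outs = [o for _, o in results]
    q, pattern = f[(t[0], states)]
    return q, substitute(pattern, outs)

# The (one-state) homomorphism of Example 4.13(i): a -> a, b[x1] -> b[x1 x1].
f = {("a", ()): ("*", ("a",)),
     ("b", ("*",)): ("*", ("b", 1, 1))}
print(run(f, ("b", ("b", ("a",)))))
# ('*', ('b', ('b', ('a',), ('a',)), ('b', ('a',), ('a',))))
```

With more than one state, the dictionary f realizes exactly the split fa = (δa, ha) described above.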
Note that, if we allow our bottom-up tree transducer to be nondeterministic, then the above formula for ĥ is intuitively wrong (we need "deterministic substitution").

Idea 4.3. To obtain the translation of the input tree one may need several different translations of each subtree. Suppose that one needs m different kinds of translation of each tree (where one of them is the "main meaning" and the others are "auxiliary meanings"), then these may be realized by m states of the transducer, say q1, . . . , qm. The ith translation may then be specified by associating to each a ∈ Σk a tree hqi(a) ∈ T∆(Ym,k), where Ym,k = {yi,j | 1 ≤ i ≤ m, 1 ≤ j ≤ k}. The ith translation of a tree a[t1 · · · tk] may then be defined by the formula

ĥqi(a[t1 · · · tk]) = hqi(a)⟨yr,s ← ĥqr(ts)⟩ for 1 ≤ r ≤ m and 1 ≤ s ≤ k.

Thus the ith translation of a tree is expressed in terms of all possible translations of its subtrees. Realizing such a translation in a bottom-up fashion would mean that we should compute all m possible translations of each tree in parallel, whereas working in a top-down way we know exactly from hqi(a) which translations of which subtrees are
needed (note that, in general, not all elements of Ym,k appear in hqi(a)). Therefore, such a translation seems to be realized best by a top-down tree transducer. We note that the generalized syntax-directed translation scheme of [A&U, II.9.3] corresponds to such a top-down tree transducer working on a rule tree language.

As already indicated in Example 4.1, tree transducers are of interest for the translation of context-free languages (in particular the context-free part of a programming language). For this reason we often restrict the tree transducer to a rule tree language, a local tree language or a recognizable tree language (the difference being slight: a projection). This restriction is also of interest from a linguistic point of view: a natural language may be described by a context-free set of kernel sentences to which transformations may be applied, working on the derivation trees (as for instance the transformation active → passive). The language then consists of all transformations of kernel sentences. We note that if derivation tree d1 of sentence s1 is transformed into tree d2 with yield s2, then the sentence s2 is said to have "deep structure" d1 and "surface structure" d2.
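As a concrete illustration of this section, the infix-to-postfix translation of Example 4.1 can be carried out by evaluating the tree homomorphism h on a rule tree and then taking the yield; a small sketch (the encodings are ours):

```python
# Example 4.1: translate a rule tree of the infix grammar into the
# derivation tree of the equivalent post-fix expression, then take yields.

H = {  # h_k(rule) as a pattern; leaves may be variable numbers 1..k
    "E->E+T": ("E", 1, 2, "+"),
    "T->T*F": ("T", 1, 2, "*"),
    "E->a": ("E", "a"),
    "T->a": ("T", "a"),
    "F->a": ("F", "a"),
    "F->(E)": ("F", 1),
}

def h(t):
    subs = [h(c) for c in t[1:]]
    def fill(p):
        if isinstance(p, int):
            return subs[p - 1]
        if isinstance(p, str):
            return p
        return (p[0],) + tuple(fill(c) for c in p[1:])
    return fill(H[t[0]])

def leaves(t):  # the yield of a tree (frontier, left to right)
    if isinstance(t, str):
        return t
    return "".join(leaves(c) for c in t[1:])

# rule tree of the expression a + a * a
t = ("E->E+T", ("E->a",), ("T->T*F", ("T->a",), ("F->a",)))
print(leaves(h(t)))   # aaa*+
```

Replacing the patterns in H by, say, instruction sequences would give the machine-code interpretation mentioned after Example 4.1.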

4.2 Top-down and bottom-up finite tree transducers Since tree transducers define tree transformations (recall Definition 2.10), we start by recalling some terminology concerning relations. We note first that, for ranked alphabets Σ and ∆, we shall identify any mapping f : TΣ → T∆ with the tree transformation {(s, t) | f (s) = t}, and we shall identify any mapping f : TΣ → P(T∆ ) with the tree transformation {(s, t) | t ∈ f (s)}. Definition 4.4. Let Σ, ∆ and Ω be ranked alphabets. If M1 ⊆ TΣ ×T∆ and M2 ⊆ T∆ ×TΩ , then the composition of M1 and M2 , denoted by M1 ◦ M2 , is the tree transformation {(s, t) ∈ TΣ × TΩ | (s, u) ∈ M1 and (u, t) ∈ M2 for some u ∈ T∆ }. If F and G are classes of tree transformations, then F ◦ G denotes the class {M1 ◦ M2 | M1 ∈ F and M2 ∈ G}. Definition 4.5. Let M be a tree transformation from TΣ into T∆ . The inverse of M , denoted by M −1 , is the tree transformation {(t, s) ∈ T∆ × TΣ | (s, t) ∈ M }. Definition 4.6. Let M be a tree transformation and L a tree language. The image of L under M , denoted by M (L), is the tree language M (L) = {t | (s, t) ∈ M for some s in L}. If M is a tree transformation from TΣ into T∆ , then the domain of M , denoted by dom(M ), is M −1 (T∆ ), and the range of M , denoted by range(M ), is M (TΣ ). In Part (3) we already considered certain simple tree transformations: relabelings and tree homomorphisms. Notation 4.7. We shall use REL to denote the class of all relabelings, HOM to denote the class of all tree homomorphisms, and LHOM to denote the class of linear tree homomorphisms.


Moreover we want to view each finite tree automaton as a simple "checking" tree transducer, which, given some input tree, produces the same tree as output if it belongs to the tree language recognized by the fta, and produces no output if not.

Definition 4.8. Let Σ be a ranked alphabet. A tree transformation R ⊆ TΣ × TΣ is called a finite tree automaton restriction if there is a recognizable tree language L such that R = {(t, t) | t ∈ L}. If M is an fta, then we shall denote the finite tree automaton restriction {(t, t) | t ∈ L(M)} by T(M). We shall use FTA to denote the class of all finite tree automaton restrictions.

Exercise 4.9. Prove that the classes of tree transformations REL, HOM and FTA are each closed under composition. Show that REL and FTA are also closed under inverse.

Before defining tree transducers we first discuss a very general notion of tree rewriting system that can be used to define both tree transducers and tree grammars. The reason to introduce these tree rewriting systems is that recursive definitions like those for the finite tree automata and tree homomorphisms tend to become very cumbersome when used for tree transducers, whereas rewriting systems are more "machine-like" and therefore easier to visualize. To arrive at the notion of tree rewriting system we first generalize the notion of string rewriting system to allow for the use of rule "schemes". Recall the set X of variables from Definition 3.61.

Definition 4.10. A rewriting system with variables is a pair G = (∆, R), where ∆ is an alphabet and R a finite set of "rule schemes". A rule scheme is a triple (v, w, D) such that, for some k ≥ 0, v and w are strings over ∆ ∪ Xk and D is a mapping from Xk into P(∆∗). Whenever D is understood, (v, w, D) is denoted by v → w. For 1 ≤ i ≤ k, the language D(xi) is called the range (or domain) of the variable xi.

A relation =⇒G on ∆∗ is defined as follows. For strings s, t ∈ ∆∗, s =⇒G t if and only if there exists a rule scheme (v, w, D) in R, strings φ1, . . . , φk in D(x1), . . . , D(xk) respectively (where Xk is the domain of D), and strings α and β in ∆∗ such that

s = α · v⟨x1 ← φ1, . . . , xk ← φk⟩ · β   and   t = α · w⟨x1 ← φ1, . . . , xk ← φk⟩ · β.

As usual =⇒∗G denotes the transitive-reflexive closure of =⇒G.

For convenience we shall, in what follows, use the word “rule” rather than “rule scheme”. Of course, in a rewriting system with variables, the ranges of the variables should be specified in some effective way (note that we would like the relation ⇒ to be decidable). In what follows we shall only use the case that the variables range over recognizable tree languages.
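For strings, one step of Definition 4.10 is easy to simulate when the range of the variable is a regular language; the sketch below hard-codes the scheme ax1 c → aax1 bcc with D(x1) = b∗ (our own encoding, using a regular-expression lookahead to find every occurrence):

```python
import re

# One rule scheme: a x1 c -> a a x1 b c c with D(x1) = b*.
# The scheme stands for the infinite set of ordinary rules obtained
# by substituting a string phi from D(x1) for x1.

def step(s):
    """All strings reachable from s in one application of the scheme."""
    out = set()
    # A zero-width lookahead finds every (possibly overlapping) occurrence.
    for m in re.finditer(r"(?=a(b*)c)", s):
        i, phi = m.start(), m.group(1)
        out.add(s[:i] + "aa" + phi + "bcc" + s[i + len(phi) + 2:])
    return out

print(step("abc"))      # {'aabbcc'}
print(step("aabbcc"))   # {'aaabbbccc'}
```

Iterating `step` from "abc" produces exactly the strings a^n b^n c^n with n ≥ 1, which is the point of allowing variables with infinite ranges.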


Examples 4.11.
(1) Consider the rewriting system with variables G = (∆, R), where ∆ = {a, b, c} and R consists of the one rule ax1 c → aax1 bcc, where D(x1) = b∗. Then, for instance, aabbcc ⇒ aaabbbccc (by application of the ordinary rewriting rule abbc → aabbbcc obtained by substituting bb for x1 in the rule above). It is easy to see that {w ∈ ∆∗ | abc =⇒∗ w} = {a^n b^n c^n | n ≥ 1}.
(2) Consider the rewriting system with variables G = (∆, R), where ∆ = {[ , ], ∗, 1} and R consists of the rules [x1 ∗ x2 1] → [x1 ∗ x2 ]x1 and [x1 ∗ 1] → x1, where in both rules D(x1) = D(x2) = 1∗. It is easy to see that, for arbitrary u, v, w ∈ 1∗, [u ∗ v] =⇒∗ w iff w is the product of u and v (in unary notation).
(3) The two-level grammar used to describe Algol 68 may be viewed as a rewriting system with variables. The variables (= meta notions) range over context-free languages, specified by the meta grammar.

By specializing to trees we obtain the notion of tree rewriting system.

Definition 4.12. A rewriting system with variables G = (∆, R) is called a tree rewriting system if (i) ∆ = Σ ∪ {[ , ]} for some ranked alphabet Σ; (ii) for each rule (v, w, D) in R, v and w are trees in TΣ(Xk) and, for 1 ≤ i ≤ k, D(xi) ⊆ TΣ (where Xk is the domain of D).

It should be clear that, for a tree rewriting system G = (Σ ∪ {[ , ]}, R), if s ∈ TΣ and s =⇒G t, then t ∈ TΣ. In fact, the application of a rule to a tree consists of replacing some piece in the middle of the tree by some other piece, where the variables indicate how the subtrees of the old piece should be connected to the new one. As an example, if we have a rule a[b[x1 x2 ]b[x3 d]] → b[x2 a[x1 dx2 ]], then the application of this rule to a tree t (if possible) consists of replacing a subtree of t of the form a[b[t1 t2 ]b[t3 d]] by b[t2 a[t1 dt2 ]], where t1, t2 and t3 are in the ranges of x1, x2 and x3. Thus t is of the form αa[b[t1 t2 ]b[t3 d]]β and is transformed into αb[t2 a[t1 dt2 ]]β.

Example 4.13. Let Σ0 = {a}, Σ1 = {b}, ∆0 = {a}, ∆2 = {b}, Ω0 = {a}, Ω1 = {∗, b} and Ω2 = {b}.
(i) Consider the tree rewriting system G = (Ω ∪ {[ , ]}, R), where R consists of the rules a → ∗[a] and b[∗[x1 ]] → ∗[b[x1 x1 ]], with D(x1) = T∆. Then, for instance,
b[b[a]] ⇒ b[b[∗[a]]] ⇒ b[∗[b[aa]]] ⇒ ∗[b[b[aa]b[aa]]].

It is easy to see that, if h is the tree homomorphism from TΣ into T∆ defined by h0(a) = a and h1(b) = b[x1 x1 ], then, for s ∈ TΣ and t ∈ T∆, h(s) = t iff s =⇒∗ ∗[t].

(ii) Consider the tree rewriting system G′ = (Ω ∪ {[ , ]}, R′), where R′ consists of the rules ∗[b[x1 ]] → b[∗[x1 ] ∗ [x1 ]] and ∗[a] → a, with D(x1) = TΣ. Then, for instance,

∗[b[b[a]]] ⇒ b[∗[b[a]] ∗ [b[a]]] ⇒ b[b[∗[a] ∗ [a]] ∗ [b[a]]] ⇒ b[b[∗[a] ∗ [a]] b[∗[a] ∗ [a]]] =⇒∗ b[b[aa]b[aa]].

It is easy to see that, if h is the homomorphism defined above, then, for s ∈ TΣ and t ∈ T∆, h(s) = t iff ∗[s] =⇒∗ t.

The tree transducers to be defined will be a generalization of the generalized sequential machine working on strings, which is essentially a finite automaton with output. A (nondeterministic) generalized sequential machine is a 6-tuple M = (Q, Σ, ∆, δ, S, F), where Q is the set of states, Σ is the input alphabet, ∆ the output alphabet, δ is a mapping Q × Σ → P(Q × ∆∗), S is a set of initial states and F a set of final states. Intuitively, if δ(q, a) contains (q′, w) then, in state q and scanning input symbol a, the machine M may go into state q′ and add w to the output.

Formally we may define the functioning of M in several ways. As already said, the recursive definition (as for the fta) is too cumbersome, although it is the most exact one (and should be used in very formal proofs). The other way is to describe the sequence of configurations the machine goes through during the translation of the input string. A configuration is usually a triple (v, q, s), where v is the output generated so far, q is the state and s is the rest of the input. If s = as1, then the next configuration might be (vw, q′, s1). A useful variation of this is to replace (v, q, s) by the string vqs ∈ ∆∗QΣ∗. The next configuration can now be obtained by applying the string rewriting rule qa → wq′, thus vqas1 ⇒ vwq′s1. Replacing δ by a corresponding set of rewriting rules, the string translation realized by M can be defined as {(v1, v2) | q0 v1 =⇒∗ v2 qf for some q0 ∈ S and qf ∈ F}.

Let us first consider the bottom-up generalization of this machine to trees, which is conceptually easier than the top-down version, although perhaps less interesting. The bottom-up finite tree transducer goes through the input tree in the same way as the bottom-up fta, at each step producing a piece of output to which the already generated
output is concatenated. The transducer arrives at a node of rank k with a sequence of k states and a sequence of k output trees (one state and one output tree for each direct subtree of the node). The sequence of states and the label at the node determine (nondeterministically) a new state and a piece of output containing the variables x1, . . . , xk. The transducer processes the node by going into the new state and computing a new output tree by substituting the k output trees for x1, . . . , xk in the piece of output. There should be start states and output for each node of rank 0. If the transducer arrives at the top of the tree in a final state, then the computed output tree is the transformation of the input tree. (Cf. the story in Idea 4.2).

To be able to put the states of the transducer as labels on trees, we make them into symbols of rank 1. The configurations of the bottom-up tree transducer will be elements of TΣ(Q[T∆]), where Q[T∆] = {q[t] | q ∈ Q and t ∈ T∆}, and the steps of the tree transducer (including the start steps) are modelled by the application of tree rewriting rules to these configurations. We now give the formal definition.

Definition 4.14. A bottom-up (finite) tree transducer is a structure M = (Q, Σ, ∆, R, Qd), where Q is a ranked alphabet (of states), such that all elements of Q have rank 1 and no other ranks; Σ is a ranked alphabet (of input symbols); ∆ is a ranked alphabet (of output symbols), Q ∩ (Σ ∪ ∆) = ∅; Qd is a subset of Q (the set of final states); and R is a finite set of rules of one of the forms (i) or (ii):
(i) a → q[t], where a ∈ Σ0, q ∈ Q and t ∈ T∆;
(ii) a[q1 [x1 ] · · · qk [xk ]] → q[t], where k ≥ 1, a ∈ Σk, q1, . . . , qk, q ∈ Q and t ∈ T∆(Xk).
M is viewed as a tree rewriting system over the ranked alphabet Q ∪ Σ ∪ ∆ with R as the set of rules, such that the range of each variable occurring in R is T∆. Therefore the relations =⇒M and =⇒∗M are well defined according to Definition 4.10.

The tree transformation realized by M, denoted by T(M) or simply M, is

{(s, t) ∈ TΣ × T∆ | s =⇒∗M q[t] for some q in Qd}.

We shall abbreviate "(finite) tree transducer" by "ftt".

Remark 4.15. Note that T(M) is also denoted by M. In general, we shall often make no distinction between a tree transducer and the tree transformation it realizes. Hopefully this will not lead to confusion.

Definition 4.16. The class of tree transformations realized by bottom-up ftt will be denoted by B. An element of B will be called a bottom-up tree transformation.
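The rewriting semantics of Definition 4.14 can equivalently be computed by structural recursion, collecting for each subtree all reachable pairs (state, output tree); a sketch with our own rule encoding (patterns use the integers 1..k for x1, . . . , xk; the nondeterministic rules below are in the spirit of Example 4.25):

```python
from itertools import product

# Rules of a bottom-up ftt, encoded as a dict
#   (a, (q1, ..., qk)) -> list of (q, pattern)
# where pattern is a tree over Delta with variable leaves 1..k.

def subst(p, outs):
    if isinstance(p, int):
        return outs[p - 1]
    return (p[0],) + tuple(subst(c, outs) for c in p[1:])

def run(rules, t):
    """The set of pairs (q, s) with t =>* q[s]."""
    child_sets = [run(rules, c) for c in t[1:]]
    result = set()
    for combo in product(*child_sets):
        states = tuple(q for q, _ in combo)
        outs = [s for _, s in combo]
        for q, pattern in rules.get((t[0], states), []):
            result.add((q, subst(pattern, outs)))
    return result

rules = {("e", ()): [("*", ("e",))],
         ("a", ("*",)): [("*", ("a", 1)), ("*", ("b", 1))]}
print(sorted(run(rules, ("a", ("e",)))))
# [('*', ('a', ('e',))), ('*', ('b', ('e',)))]
```

An input tree is in the domain of the transducer iff some computed pair carries a final state.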


Example 4.17. An example of a bottom-up ftt realizing a homomorphism was given in Example 4.13(i) (it had one state ∗). Example 4.18. Consider the bottom-up ftt M = (Q, Σ, ∆, R, Qd ), where Q = {q0 , q1 }, Σ0 = {a, b}, Σ2 = {f, g}, ∆0 = {a, b}, ∆2 = {m, n}, Qd = {q0 } and the rules are a → q0 [a],

b → q0 [b],

f [qi [x1 ]qj [x2 ]] → q1−i [m[x1 x1 ]],

g[qi [x1 ]qj [x2 ]] → q1−i [n[x1 x1 ]],

f [qi [x1 ]qj [x2 ]] → q1−j [m[x2 x2 ]],

g[qi [x1 ]qj [x2 ]] → q1−j [n[x2 x2 ]],

for all i, j ∈ {0, 1}. The transformation realized by M may be described by saying that, given some input tree t, M selects some path of even length through t, relabels f by m and g by n and then doubles every subtree. For example f [a g[ab]] may be transformed into the tree m[n[bb]n[bb]] corresponding to the path f gb. The tree g[ab] is not in the domain of M.

Exercise 4.19. Construct bottom-up tree transducers M1 and M2 such that
(i) {(yield(s), yield(t)) | (s, t) ∈ T(M1)} = {(a(cd)^n f e^n b, a c^n f d^{2n} b) | n ≥ 0};
(ii) M2 deletes, given an input tree t, all subtrees t′ of t such that yield(t′) ∈ a+b+.

Exercise 4.20. Give a recursive definition of the transformation realized by a bottom-up tree transducer (without using the notion of tree rewriting system).

Exercise 4.21. Given a bottom-up ftt M with input alphabet Σ, find a suitable Σ-algebra such that M may be viewed as an interpretation of Σ into this Σ-algebra (cf. Section 4.1).

We now define some subclasses of the class of bottom-up tree transformations.

Definition 4.22. Let Σ be a ranked alphabet and k ≥ 0. A tree t in TΣ(Xk) is linear if each element of Xk occurs at most once in t. The tree t is called nondeleting with respect to Xk if each element of Xk occurs at least once in t.

Definition 4.23. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt. M is called linear if the right hand side of each rule in R is linear. M is called nondeleting if the right hand side of each rule in R is nondeleting with respect to Xk, where k is the rank of the input symbol in the left hand side. M is called one-state (or pure) if Q is a singleton. M is called (partial) deterministic if (i) for each a ∈ Σ0 there is at most one rule in R with left hand side a, and (ii) for each k ≥ 1, a ∈ Σk and q1, . . . , qk ∈ Q there is at most one rule in R with left hand side a[q1 [x1 ] · · · qk [xk ]]. M is called total deterministic if (i) and (ii) hold with "at most one" replaced by "exactly one" and Qd = Q.
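The linearity and nondeleting conditions of Definitions 4.22 and 4.23 are simple syntactic checks on right-hand sides; a sketch (variables encoded as the integers 1..k):

```python
# Check the Definition 4.22 properties of a right-hand-side pattern:
# linear = each variable occurs at most once; nondeleting w.r.t. X_k
# = each of x_1..x_k occurs at least once.

def var_counts(t, counts):
    if isinstance(t, int):                 # a variable x_i
        counts[t] = counts.get(t, 0) + 1
    elif isinstance(t, tuple):
        for c in t[1:]:
            var_counts(c, counts)
    return counts

def is_linear(rhs):
    return all(n <= 1 for n in var_counts(rhs, {}).values())

def is_nondeleting(rhs, k):
    counts = var_counts(rhs, {})
    return all(counts.get(i, 0) >= 1 for i in range(1, k + 1))

rhs = ("m", 1, 1)              # the pattern m[x1 x1] of Example 4.18
print(is_linear(rhs))           # False: x1 occurs twice
print(is_nondeleting(rhs, 2))   # False: x2 is deleted
```

So the transducer of Example 4.18 is neither linear nor nondeleting, which is exactly what lets it copy one subtree and discard the other.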


Notation 4.24. The same terminology will be applied to the transformations realized by such transducers. Thus, for instance, a linear deterministic bottom-up tree transformation is one that can be realized by a linear deterministic bottom-up ftt. The classes of tree transformations obtained by putting one or more of the above restrictions on the bottom-up tree transducers will be denoted by adding the symbols L, N, P, D and Dt (standing for linear, nondeleting, pure, deterministic, and total deterministic respectively) to the letter B. Thus the class of linear deterministic bottom-up tree transformations is denoted by LDB. Example 4.25. Let Σ0 = {e}, Σ1 = {a, f }, ∆0 = {e}, ∆1 = {a, b} and ∆2 = {f }. Consider the bottom-up ftt M = (Q, Σ, ∆, R, Qd ), where Q = Qd = {∗} and R consists of the rules e → ∗[e], a[∗[x1 ]] → ∗[a[x1 ]],

a[∗[x1 ]] → ∗[b[x1 ]],

f [∗[x1 ]] → ∗[f [x1 x1 ]]. Then M ∈ PNB.

Let us make the following remarks about the concepts defined in Definition 4.23. Remarks 4.26. (1) Deletion is different from erasing. A rule may be called erasing if its right hand side belongs to Q[X]. Thus, symbols of rank 0 cannot be erased. Symbols of rank 1 can be erased without any deletion, but symbols of rank k ≥ 2 can only be erased by deleting also k − 1 subtrees. Thus a nondeleting tree transducer is still able to erase symbols of rank 1. (2) The one-state bottom-up tree transformations correspond intuitively to the finite substitutions in the string case. (3) The total deterministic bottom-up ftt realize tree transformations which are total functions. Exercise 4.27. Show that, in the definition of “(partial) deterministic”, we may replace the phrase “at most one” by “exactly one” without changing the corresponding class DB of deterministic bottom-up tree transformations. In the next theorem we show that all relabelings, finite tree automaton restrictions and tree homomorphisms are realizable by bottom-up ftt. Theorem 4.28. (1) REL ⊆ PNLB, (2) FTA ⊆ NLDB, (3) HOM = PDt B and LHOM = PLDt B.


Proof. (1) Let r be a relabeling from TΣ into T∆. Thus r is determined by a family of mappings rk : Σk → P(∆k). Obviously the following bottom-up ftt realizes r: M = ({∗}, Σ, ∆, R, {∗}), where R is constructed as follows:
(i) for a ∈ Σ0, if b ∈ r0(a), then a → ∗[b] is in R;
(ii) for k ≥ 1 and a ∈ Σk, if b ∈ rk(a), then a[∗[x1 ] · · · ∗ [xk ]] → ∗[b[x1 · · · xk ]] is in R.
Clearly M ∈ PNLB.

(2) From the definition of FTA and from Part (3) it follows that we need only consider a deterministic bottom-up fta M = (Q, Σ, δ, s, F) and show that T(M) = {(t, t) | t ∈ L(M)} is realized by a bottom-up ftt. Consider the bottom-up ftt M̃ = (Q, Σ, Σ, R, F), where R is constructed as follows:
(i) for a ∈ Σ0, a → q[a] is in R, where q = sa ;
(ii) for k ≥ 1 and a ∈ Σk, if δak (q1, . . . , qk) = q, then a[q1 [x1 ] · · · qk [xk ]] → q[a[x1 · · · xk ]] is in R.
Clearly M̃ realizes T(M) and M̃ ∈ NLDB (the determinism of M̃ follows from that of M).

(3) We first show that HOM ⊆ PDt B (and LHOM ⊆ PLDt B). An example of this was already given in Example 4.13(i). Let h be a tree homomorphism from TΣ into T∆ determined by the mappings hk : Σk → T∆(Xk). Consider the bottom-up ftt M = ({∗}, Σ, ∆, R, {∗}), where R contains the following rules:
(i) for a ∈ Σ0, a → ∗[h0(a)] is in R;
(ii) for k ≥ 1 and a ∈ Σk, the rule a[∗[x1 ] · · · ∗ [xk ]] → ∗[hk(a)] is in R.
Obviously M is in PDt B (and linear, if h is linear). Let us prove that M realizes h. Thus we have to show that, for s ∈ TΣ and t ∈ T∆, h(s) = t iff s =⇒∗ ∗[t]. The proof is by induction on s. The case s ∈ Σ0 is clear. Now let s = a[s1 · · · sk ].

Suppose that h(s) = t. Then, by definition of h, t = hk(a)⟨x1 ← h(s1), . . . , xk ← h(sk)⟩. By induction, si =⇒∗ ∗[h(si)] for all i, 1 ≤ i ≤ k. Hence (but note that formally this needs a proof) a[s1 · · · sk ] =⇒∗ a[∗[h(s1)] · · · ∗ [h(sk)]]. But, by rule (ii) above, a[∗[h(s1)] · · · ∗ [h(sk)]] ⇒ ∗[hk(a)⟨x1 ← h(s1), . . . , xk ← h(sk)⟩]. Consequently s =⇒∗ ∗[t].

Now suppose that s = a[s1 · · · sk ] =⇒∗ ∗[t]. Then (and again this needs a formal proof) there are trees t1, . . . , tk such that si =⇒∗ ∗[ti ] for 1 ≤ i ≤ k and a[s1 · · · sk ] =⇒∗ a[∗[t1 ] · · · ∗ [tk ]] ⇒ ∗[hk(a)⟨x1 ← t1, . . . , xk ← tk⟩] = ∗[t]. By induction, ti = h(si) for all i, 1 ≤ i ≤ k. Hence t = hk(a)⟨x1 ← h(s1), . . . , xk ← h(sk)⟩ = h(s). This proves that (L)HOM ⊆ P(L)Dt B.

To show the converse, consider a one-state total deterministic bottom-up tree transducer M = ({∗}, Σ, ∆, R, {∗}). Define the tree homomorphism h from TΣ into T∆ as follows:
(i) for a ∈ Σ0, h0(a) is the tree t occurring in the (unique) rule a → ∗[t] in R;
(ii) for k ≥ 1 and a ∈ Σk, hk(a) is the tree t occurring in the (unique) rule a[∗[x1 ] · · · ∗ [xk ]] → ∗[t] in R.


Then, obviously, by the same proof as above, h = T (M ).
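To make the one-state construction of part (3) concrete, here is a small Python sketch (not part of the original notes; the tuple encoding of trees and the names subst and apply_hom are ours): a tree a[t1 · · · tk] is encoded as the tuple ('a', t1, ..., tk), and hk(a) as a tree in which a variable xi is represented by the integer i.

```python
# Sketch (hypothetical encoding): a tree a[t1 ... tk] is ('a', t1, ..., tk);
# a variable xi in a right-hand side is the integer i.
def subst(t, args):
    """Substitute args[i-1] for variable xi in tree t (the <x_i <- t_i> operation)."""
    if isinstance(t, int):
        return args[t - 1]
    return (t[0],) + tuple(subst(c, args) for c in t[1:])

def apply_hom(h, s):
    """Run the one-state bottom-up ftt for the homomorphism h:
    process the subtrees first, then replace the root label a of rank k
    by the tree h[(a, k)], as in rules (i) and (ii) of the proof."""
    children = [apply_hom(h, c) for c in s[1:]]
    return subst(h[(s[0], len(s) - 1)], children)

# An Example 4.13(i)-style homomorphism: h0(e) = e, h1(a) = a[x1], h1(f) = f[x1 x1]
h = {('e', 0): ('e',), ('a', 1): ('a', 1), ('f', 1): ('f', 1, 1)}
```

Note that apply_hom visits the subtrees before the root, which is exactly the bottom-up order of the transducer M above.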

Exercise 4.29. Prove that the domain of a bottom-up tree transformation is a recognizable tree language, and vice-versa.

Let us now consider the top-down generalization of the generalized sequential machine. The top-down finite tree transducer goes through the input tree in the same way as the top-down fta, at each step producing a piece of output to which the (unprocessed) rest of the input is concatenated. Note, however, that the transducer does not really "go through" the input tree in the same way as the bottom-up ftt does, since in the top-down case the rest of the input may be modified (deleted, permuted, copied) during translation, whereas in the bottom-up case the rest of the input is unmodified during translation. The top-down transducer arrives at a node of rank k in a certain state; at that moment the configuration is an element of T∆(Q[TΣ]), where Σ and ∆ are the input and output alphabets, and Q the set of states. The state and the label at the node determine (nondeterministically) a piece of output containing the variables x1, . . . , xk, and states with which to continue the translation of the subtrees. These states are also specified in the piece of output, which is in fact a tree in T∆(Q[Xk]), where an occurrence of q[xi] means that the processing of the ith subtree should, at this point, be continued in state q. The transducer processes the node by replacing it and its direct subtrees by the piece of output, in which the k subtrees are substituted for the variables x1, . . . , xk. The processing of (all copies of) the subtrees is continued as indicated above. The transducer starts at the root of the input tree in some initial state. There should be final states and output for each node of rank 0. If the transducer arrives in a final state at each leaf, then it replaces each leaf by the final output, and the resulting tree is the transformation of the input tree. (Cf. the story in Idea 4.3).
The steps of the transducer, including the final steps, are modelled by the application of rewriting rules to the elements of T∆(Q[TΣ]). We now give the formal definition.

Definition 4.30. A top-down (finite) tree transducer is a structure M = (Q, Σ, ∆, R, Qd), where Q, Σ and ∆ are as for the bottom-up ftt, Qd is a subset of Q (the set of initial states), and R is a finite set of rules of one of the forms (i) or (ii):
(i) q[a[x1 · · · xk]] → t, where k ≥ 1, a ∈ Σk, q ∈ Q and t ∈ T∆(Q[Xk]);
(ii) q[a] → t, where q ∈ Q, a ∈ Σ0 and t ∈ T∆.
M is viewed as a tree rewriting system over the ranked alphabet Q ∪ Σ ∪ ∆ with R as the set of rules, such that the range of each variable in X is TΣ. The tree transformation realized by M, denoted by T(M) or simply M, is {(s, t) ∈ TΣ × T∆ | q[s] =⇒*_M t for some q in Qd}.

Definition 4.31. The class of tree transformations realized by top-down ftt will be denoted by T. An element of T will be called a top-down tree transformation.


Example 4.32. An example of a top-down ftt realizing a homomorphism was given in Example 4.13(ii) (it had one state ∗). The next example is a top-down ftt computing the formal derivative of an arithmetic expression.

Example 4.33. Consider the top-down ftt M = (Q, Σ, ∆, R, Qd), where Σ0 = {a, b}, ∆0 = {a, b, 0, 1}, Σ1 = ∆1 = {−, sin, cos}, Σ2 = ∆2 = {+, ∗}, Q = {q, i} and Qd = {q}. The rules for q are

q[+[x1 x2]] → +[q[x1] q[x2]],
q[∗[x1 x2]] → +[∗[q[x1] i[x2]] ∗[i[x1] q[x2]]],
q[−[x1]] → −[q[x1]],
q[sin[x1]] → ∗[cos[i[x1]] q[x1]],
q[cos[x1]] → ∗[−[sin[i[x1]]] q[x1]],
q[a] → 1, q[b] → 0,

and the rules for i are

i[+[x1 x2]] → +[i[x1] i[x2]],
i[∗[x1 x2]] → ∗[i[x1] i[x2]],
i[−[x1]] → −[i[x1]],
i[sin[x1]] → sin[i[x1]],
i[cos[x1]] → cos[i[x1]],
i[a] → a and i[b] → b.
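The rules of this transducer can be executed directly as two mutually recursive functions, one per state; here is a minimal Python sketch (the tuple encoding of trees and the function names are ours, not from the notes):

```python
# Sketch (encoding ours): an expression tree is a nested tuple,
# e.g. *[+[a b] -[a]] is ('*', ('+', ('a',), ('b',)), ('-', ('a',))).
def i(t):
    """State i: the identity translation (copies the expression unchanged)."""
    return (t[0],) + tuple(i(c) for c in t[1:])

def q(t):
    """State q: formal derivative with respect to a, one case per rule of M."""
    op = t[0]
    if op == '+':
        return ('+', q(t[1]), q(t[2]))
    if op == '*':
        return ('+', ('*', q(t[1]), i(t[2])), ('*', i(t[1]), q(t[2])))
    if op == '-':
        return ('-', q(t[1]))
    if op == 'sin':
        return ('*', ('cos', i(t[1])), q(t[1]))
    if op == 'cos':
        return ('*', ('-', ('sin', i(t[1]))), q(t[1]))
    return ('1',) if op == 'a' else ('0',)   # q[a] -> 1, q[b] -> 0
```

The recursion mirrors the way the transducer continues in state q or i on (copies of) the subtrees.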

Then (t, s) ∈ T(M) iff s is the formal derivative of t with respect to a. For instance, q[∗[+[ab] −[a]]] =⇒* +[∗[+[10] −[a]] ∗[+[ab] −[1]]]. Note that i[t1] =⇒* t2 iff t1 = t2 (t1, t2 ∈ TΣ).

Exercise 4.34. Let Σ0 = {a, b}, Σ1 = {∼} and Σ2 = {∧, ∨}. TΣ may be viewed as the set of all boolean expressions over the boolean variables a and b, using negation, conjunction and disjunction. Write a top-down tree transducer which transforms every boolean expression into an equivalent one in which a and b are the only subexpressions which may be negated.

Exercise 4.35. (i) Give a recursive definition of the transformation realized by a top-down ftt. (ii) Find a suitable Σ-algebra such that the top-down ftt may be viewed as an interpretation of Σ into this Σ-algebra.

As in the bottom-up case we define some subclasses of T.

Definition 4.36. Let M = (Q, Σ, ∆, R, Qd) be a top-down tree transducer. The definitions of linear, nondeleting and one-state are identical to the bottom-up ones


in Definition 4.23. M is called (partial) deterministic if (i) Qd is a singleton; (ii) for each q ∈ Q, k ≥ 1, and a ∈ Σk, there is at most one rule in R with left hand side q[a[x1 · · · xk]]; (iii) for each q ∈ Q and a ∈ Σ0 there is at most one rule in R with left hand side q[a]. M is called total deterministic if (i), (ii) and (iii) hold with "at most one" replaced by "exactly one". Notation 4.24 also applies to the top-down case. Thus, PLT is the class of one-state linear top-down tree transformations.

Example 4.37. Let Σ0 = {e}, Σ1 = {a, f}, ∆0 = {e}, ∆1 = {a, b} and ∆2 = {f}. Consider the top-down tree transducer M = (Q, Σ, ∆, R, Qd) with Q = Qd = {∗}, where R consists of the rules

∗[f[x1]] → f[∗[x1] ∗[x1]],
∗[a[x1]] → a[∗[x1]],
∗[a[x1]] → b[∗[x1]],
∗[e] → e.

Then M ∈ PNT.
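As an illustration of the nondeterminism of this M, the following Python sketch (tree encoding and names ours, not from the notes) enumerates all output trees for a given input tree; note how the two copies produced for f are then translated independently:

```python
# Sketch (encoding ours): trees as nested tuples; star(t) returns the list of
# all trees derivable from *[t] by the rules of Example 4.37.
def star(t):
    if t[0] == 'e':                      # *[e] -> e
        return [('e',)]
    if t[0] == 'f':                      # *[f[x1]] -> f[*[x1] *[x1]]
        subs = star(t[1])                # copy first, then translate each
        return [('f', u, v) for u in subs for v in subs]   # copy independently
    # t[0] == 'a': *[a[x1]] -> a[*[x1]] or b[*[x1]]
    return [(c, u) for c in ('a', 'b') for u in star(t[1])]
```

On input f[a[e]] this yields the four trees f[u v] with u, v ∈ {a[e], b[e]}: the two copies of a subtree may be relabeled differently.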

Remarks 4.26 also apply to the top-down case. Exercise 4.38. Show that, in the definition of “(partial) deterministic” top-down ftt, we may replace in (ii) the phrase “at most one” by “exactly one” without changing DT. The next theorem shows that all relabelings, finite tree automaton restrictions and tree homomorphisms are realizable by top-down tree transducers (cf. Theorem 4.28). Note therefore that these tree transformations are not specifically bottom-up or top-down. Theorem 4.39. (1) REL ⊆ PNLT, (2) FTA ⊆ NLT, (3) HOM = PDt T and LHOM = PLDt T. Proof. Exercise.

In what follows we shall need one other type of tree transformation which corresponds to the ordinary sequential machine in the string case (which translates each input symbol into one output symbol). It is a combination of an fta and a relabeling. Definition 4.40. A top-down (resp. bottom-up) finite state relabeling is a (tree transformation realized by a) top-down (resp. bottom-up) tree transducer M = (Q, Σ, ∆, R, Qd )


in which all rules are of the form q[a[x1 · · · xk ]] → b[q1 [x1 ] · · · qk [xk ]] with q, q1 , . . . , qk ∈ Q, a ∈ Σk and b ∈ ∆k , or of the form q[a] → b with q ∈ Q, a ∈ Σ0 and b ∈ ∆0 (resp. of the form a[q1 [x1 ] · · · qk [xk ]] → q[b[x1 · · · xk ]] with q, q1 , . . . , qk ∈ Q, a ∈ Σk and b ∈ ∆k , or of the form a → q[b] with q ∈ Q, a ∈ Σ0 and b ∈ ∆0 ). It is clear that the classes of top-down and bottom-up finite state relabelings coincide. This class will be denoted by QREL. The classes of deterministic top-down and deterministic bottom-up finite state relabelings obviously do not coincide. They will be denoted by DTQREL and DBQREL respectively. Note that FTA ∪ REL ⊆ QREL ⊆ NLB ∩ NLT. Apart from the tree transformation realized by a tree transducer we will also be interested in the image of a recognizable tree language under a tree transformation and the yield of that image. Definition 4.41. Let F be a class of tree transformations. An F -surface tree language is a language M (L) with M ∈ F and L ∈ RECOG. An F -target language is the yield of an F -surface language. An F -translation is a string relation {(yield(s), yield(t)) | (s, t) ∈ M and s ∈ L} for some M ∈ F and L ∈ RECOG. The classes of F -surface and F -target languages will be denoted by F -Surface and F -Target respectively. It is clear that, for all classes F discussed so far, since the identity transformation is in F, RECOG ⊆ F -Surface, and so CFL ⊆ F -Target. Moreover it is clear from the proof of Theorem 3.64 that if HOM ⊆ F, then the above inclusions are proper.
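The yield operation used in Definition 4.41 (the string formed by the labels at the bottom of the tree, read from left to right) can be sketched in Python as follows (the tuple encoding of trees and the name tree_yield are ours):

```python
# Sketch (encoding ours): a tree a[t1 ... tk] is ('a', t1, ..., tk);
# yield concatenates the leaf labels from left to right.
def tree_yield(t):
    if len(t) == 1:          # a leaf of rank 0
        return t[0]
    return ''.join(tree_yield(c) for c in t[1:])
```

For a derivation tree of a context-free grammar, tree_yield recovers the derived sentence, as described in the introduction.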

4.3 Comparison of B and T, the nondeterministic case

The main differences between the bottom-up and the top-down tree transducer are the following.

Property (B). Nondeterminism followed by copying. A bottom-up ftt has the ability of first processing an input subtree nondeterministically and then copying the resulting output tree.

Property (T). Copying followed by different processing (by nondeterminism or by different states). A top-down ftt has the ability of first copying an input subtree and then treating the resulting copies differently.

Property (B′). Checking followed by deletion. A bottom-up ftt has the ability of first processing an input subtree and then deleting the resulting output subtree. In other words, depending on a (recognizable) property of the input subtree, it can decide whether to delete the output subtree or do something else with it.

It should be intuitively clear that top-down ftt do not possess properties (B) and (B′), whereas bottom-up ftt do not have property (T). We now show that these differences


also result in differences in the corresponding classes of tree transformations.

Notation 4.42. For any alphabet Σ, not containing the brackets [ and ], we define a function m : Σ+ → (Σ ∪ {[, ]})+ as follows: for a ∈ Σ and w ∈ Σ+, m(a) = a and m(aw) = a[m(w)]. For instance, m(aab) = a[a[b]]. Note that m is a kind of converse to the mapping ftd discussed after Definition 2.21. A tree of the form m(w) will also be called a monadic tree.

Theorem 4.43. The classes of bottom-up and top-down tree transformations are incomparable. In particular, there are tree transformations in PNB − T and PNT − B.

Proof. (1) Consider the bottom-up ftt M of Example 4.25. M is in PNB and is a typical example of an ftt having property (B). It is intuitively clear that M is not realizable by a top-down ftt. In fact, consider for each n ≥ 1 the specific input tree f[m(a^n e)]. This tree is nondeterministically transformed by M into all trees of the form f[m(we)m(we)], where w is a string over {a, b} of length n. Suppose that the top-down ftt N = (Q′, Σ′, ∆′, R′, Q′d) could do the same transformation of these input trees. Then, roughly speaking, N would first have to make a copy of m(a^n e) and would then have to relabel the two copies in an arbitrary but identical way, which is clearly impossible. A formal proof goes as follows. If N realizes the same transformation, then, for each n ≥ 1 and each w ∈ {a, b}* of length n, there is a derivation q0[f[m(a^n e)]] =⇒*_N f[m(we)m(we)] for some q0 in Q′d.

Let us consider a fixed n. Consider, in each of these 2^n derivations, the first string of the form f[t1 t2]; that is, consider the moment that f is produced as output. Note that this is not necessarily the second string of the derivation, since the transducer may first erase the input symbol f and some of the a's before producing any output (thus the derivation may look like q0[f[m(a^n e)]] =⇒* q[a[m(a^k e)]] ⇒ f[t1 t2] =⇒* f[m(we)m(we)] for some q ∈ Q′ and some k, 0 ≤ k < n, or even like q0[f[m(a^n e)]] =⇒* q[e] ⇒ f[t1 t2] = f[m(we)m(we)] for some q ∈ Q′). Obviously, for different derivations these strings have to be different: if, for w ≠ w′, both t1 =⇒* m(we), t2 =⇒* m(we) and t1 =⇒* m(w′e), t2 =⇒* m(w′e), then also f[t1 t2] =⇒* f[m(we)m(w′e)], which is an invalid output. Therefore there are 2^n of such strings f[t1 t2]. However it is clear that f[t1 t2] is of the form f[t̄1 t̄2]⟨x1 ← m(a^k e)⟩, where 0 ≤ k ≤ n and f[t̄1 t̄2] is the right hand side of a rule in R′. Therefore the number of possible f[t1 t2]'s is less than (n + 1)r, where r = #(R′). For n sufficiently large this is a contradiction.

(2) Consider now the top-down ftt M of Example 4.37. M is in PNT and is a typical example of an ftt having property (T). Suppose that M can be realized by a bottom-up ftt N = (Q′, Σ′, ∆′, R′, Q′d). Consider again for each n ≥ 1 the specific input tree f[m(a^n e)]. This tree should be transformed by N into all trees of the form f[m(w1 e)m(w2 e)] for w1, w2 ∈ {a, b}* of length n. Let us consider, in each of the derivations realizing this transformation, the first string which contains the output tree. Note that this is not necessarily the last string since N may end its computation by erasing a number of a's and the input f. Obviously, this string is obtained from the previous one by application of a rule with right hand side of the form q[f[t̄1 t̄2]], where q ∈ Q′, t̄1, t̄2 ∈ T∆({x1})


and there are s1 and s2 such that t̄1⟨x1 ← s1⟩ = m(w1 e) and t̄2⟨x1 ← s2⟩ = m(w2 e). Obviously, if f[t̄1 t̄2] contains no x1 or only one x1, then the rule can only be used for exactly one input tree f[m(a^n e)]. Thus we may choose n such that in all derivations starting with f[m(a^n e)] the right hand side q[f[t̄1 t̄2]] contains two x1's (it cannot contain more). Thus q[f[t̄1 t̄2]] is of the form q[f[m(v1 x1)m(v2 x1)]] for certain v1, v2 ∈ {a, b}*. By choosing n larger than the length of all such v1's and v2's occurring in right hand sides of rules in R′, we see that the output tree always has two equal subtrees ≠ e: it has to be of the form f[m(v1 we)m(v2 we)] for some w ∈ {a, b}+. Thus, for such an n, not all possible outputs are produced. This is a contradiction.

An important property of a class F of tree transformations is whether it is closed under composition. If so, then we know that each sequence of transformations from F can be realized by one tree transducer (corresponding to the class F). We then also know that the class of F-surface tree languages is closed under the transformations of F. The next theorem shows that unfortunately the classes of top-down and bottom-up tree transformations are not closed under composition. This nonclosure is caused by the failure of property (B) for top-down transformations (property (T) for bottom-up transformations).

Theorem 4.44. T and B are not closed under composition. In particular, there are tree transformations in (REL ◦ HOM) − T and in (HOM ◦ REL) − B.

Proof. (1) The bottom-up ftt M of Example 4.25 can be realized by the composition of a relabeling and a homomorphism. Let Ω0 = {e} and Ω1 = {a, b, f}. Let r be the relabeling from Σ into Ω defined by r0(e) = {e}, r1(a) = {a, b} and r1(f) = {f}. Let h be the tree homomorphism from Ω into ∆ defined by h0(e) = e, h1(a) = a[x1], h1(b) = b[x1] and h1(f) = f[x1 x1].
Then, for all s ∈ TΣ and t ∈ T∆ , (s, t) ∈ M iff there exists u in TΩ such that u ∈ r(s) and h(u) = t. Thus, by the first part of the proof of Theorem 4.43, M is in (REL ◦ HOM) − T. (2) The top-down ftt M of Example 4.37 can be realized by the composition of a homomorphism and a relabeling. Let Π be the ranked alphabet with Π0 = {e}, Π1 = {a} and Π2 = {f }. Let h be the tree homomorphism from Σ into Π defined by h0 (e) = e, h1 (a) = a[x1 ] and h1 (f ) = f [x1 x1 ]. Let r be the relabeling from Π into ∆ defined by r0 (e) = {e}, r1 (a) = {a, b} and r2 (f ) = {f }. Then, for all s ∈ TΣ and t ∈ T∆ , (s, t) ∈ M iff there exists u in TΠ such that h(s) = u and t ∈ r(u). Thus, by the second part of the proof of Theorem 4.43, M is in (HOM ◦ REL) − B. Exercise 4.45. Prove the statements in the above proof.
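The monadic-tree mapping m of Notation 4.42, used heavily in the proofs above, is easy to express in Python (tuple tree encoding and names ours, not from the notes):

```python
# Sketch: m maps a nonempty string to a monadic tree,
# m(a) = a and m(aw) = a[m(w)]; trees as nested tuples.
def m(w):
    if len(w) == 1:
        return (w,)
    return (w[0], m(w[1:]))
```

For instance, m('aab') is the tree a[a[b]] of the example in Notation 4.42.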

One might get the impression that each bottom-up (resp. top-down) tree transformation can be realized by two top-down (resp. bottom-up) tree transducers (i.e., B ⊆ T ◦ T, resp. T ⊆ B ◦ B). We shall show later that this is true. Let us now consider the linear case. Since properties (B) and (T) are now eliminated, the only remaining difference between linear top-down and bottom-up tree transducers is caused by property (B′).


Lemma 4.46. There is a tree transformation M that belongs to LDB, but not to T. M can be realized by the composition of a deterministic top-down fta with a linear homomorphism.

Proof. Let Σ0 = {c}, Σ1 = {b}, Σ2 = {a}, ∆0 = {c} and ∆1 = {a, b}. Consider the tree transformation M = {(a[tc], a[t]) | t = m(b^n c) for some n ≥ 0}. We shall show that M ∉ T. The rest of the proof is left as an exercise. Suppose that there is a top-down ftt N = (Q, Σ, ∆, R, Qd) such that T(N) = M. Each successful derivation of N has to start with the application of a rule q0[a[x1 x2]] → s, where q0 ∈ Qd and s ∈ T∆(X2). Now, if s contains no x1, then we could change the input a[tc] into a[t′c] without changing the output. If s contains no x2, then we could change a[tc] into a[t b[c]] and still obtain (the same) output. But if s contains both x1 and x2 then it has to contain a symbol of rank 2 and so a[t] cannot be derived.

Since both deterministic top-down fta and linear homomorphisms belong to LDT we can state the following corollary.

Corollary 4.47. Composition of linear deterministic top-down tree transformations leads out of the class of top-down tree transformations; in a formula: (LDT ◦ LDT) − T ≠ ∅.

We now show that, in some sense, property (B′) is the only cause of difference between linear bottom-up and linear top-down tree transformations. Firstly, all linear top-down tree transformations can be realized linear bottom-up. Secondly, in the nondeleting linear case, all differences between top-down and bottom-up are gone (this can be considered as a generalization of Theorem 3.17).

Theorem 4.48. (1) LT ⊊ LB, (2) NLT = NLB.

Proof. We first show part (2).
Let us say that a nondeleting linear bottom-up ftt M = (Q, Σ, ∆, R, Qd) and a nondeleting linear top-down ftt N = (Q′, Σ′, ∆′, R′, Q′d) are "associated" if Q = Q′, Σ = Σ′, ∆ = ∆′, Qd = Q′d and
(i) for each a ∈ Σ0, q ∈ Q and t ∈ T∆, a → q[t] is in R iff q[a] → t is in R′;
(ii) for each k ≥ 1, a ∈ Σk, q1, . . . , qk, q ∈ Q and t ∈ T∆(Xk) linear and nondeleting w.r.t. Xk, a[q1[x1] · · · qk[xk]] → q[t] is in R iff q[a[x1 · · · xk]] → t⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩ is in R′.


Note that each tree r ∈ T∆(Q[Xk]), which is linear and nondeleting w.r.t. Xk, is of the form t⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩, where t ∈ T∆(Xk) is linear and nondeleting w.r.t. Xk (in fact, t is the result of replacing qi[xi] by xi in r). Therefore it is clear that for each M ∈ NLB there exists an associated N ∈ NLT and vice versa. Hence it suffices to prove that associated ftt realize the same tree transformation. Let M and N be associated as above. We shall prove, by induction on s, that for every q ∈ Q, s ∈ TΣ and u ∈ T∆,

    s =⇒*_M q[u]   iff   q[s] =⇒*_N u.   (∗)

For s ∈ Σ0, (∗) is obvious. Suppose now that s = a[s1 · · · sk] for some k ≥ 1, a ∈ Σk and s1, . . . , sk ∈ TΣ. The only-if part of (∗) is left to the reader (it is similar to the proof of Theorem 4.28(3)). The if-part of (∗) is proved as follows (it is similar to the proof of Theorem 3.65). Let the first rule applied in the derivation q[a[s1 · · · sk]] =⇒*_N u be q[a[x1 · · · xk]] → r, and let r = t⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩ for certain t ∈ T∆(Xk) and q1, . . . , qk ∈ Q. Thus q[a[s1 · · · sk]] ⇒_N t⟨x1 ← q1[s1], . . . , xk ← qk[sk]⟩ =⇒*_N u. Since t is linear and nondeleting, there exist u1, . . . , uk ∈ T∆ such that u = t⟨x1 ← u1, . . . , xk ← uk⟩ and qi[si] =⇒*_N ui for all i, 1 ≤ i ≤ k. Hence, by induction, si =⇒*_M qi[ui] for all i, 1 ≤ i ≤ k. Also, by associatedness, the rule a[q1[x1] · · · qk[xk]] → q[t] is in R. Consequently, a[s1 · · · sk] =⇒*_M a[q1[u1] · · · qk[uk]] ⇒_M q[t⟨x1 ← u1, . . . , xk ← uk⟩] = q[u].

We now show part (1). By Lemma 4.46, it suffices to show that LT ⊆ LB. In principle we can use the construction used above to show NLT ⊆ NLB. The only problem is that the top-down transducer N may delete subtrees, whereas a bottom-up transducer is forced to process a subtree before deleting it. The solution is to add an "identity state" d to the set of states of M which allows M to process any subtree which has to be deleted (d is such that for all t ∈ TΣ, t =⇒*_M d[t]). The formal construction is as follows.

Let N = (Q, Σ, ∆, R, Qd) be a linear top-down ftt. Construct the linear bottom-up ftt M = (Q ∪ {d}, Σ, ∆ ∪ Σ, RM, Qd), where RM is obtained as follows.
(i) For each a ∈ Σ0 the rule a → d[a] is in RM, and for each k ≥ 1 and a ∈ Σk the rule a[d[x1] · · · d[xk]] → d[a[x1 · · · xk]] is in RM.
(ii) For q ∈ Q, a ∈ Σ0 and t ∈ T∆, if q[a] → t is in R, then a → q[t] is in RM.
(iii) Let q[a[x1 · · · xk]] → t be in R, where q ∈ Q, k ≥ 1, a ∈ Σk and t is a linear tree in T∆(Q[Xk]). Determine the (unique) states q1, . . . , qk ∈ Q ∪ {d} such that, for 1 ≤ i ≤ k, either qi[xi] occurs in t or (xi does not occur in t and) qi = d. Determine t′ ∈ T∆(Xk) such that t′⟨x1 ← q1[x1], . . . , xk ← qk[xk]⟩ = t. Then the rule a[q1[x1] · · · qk[xk]] → q[t′] is in RM.
Again (∗) can be proved, and since the proof only slightly differs from the previous one, it is left to the reader.

Exercise 4.49. Find an example of a tree transformation in LDtB − T.

Exercise 4.50. Compare the classes PLT and PLB.


Exercise 4.51. Let a deterministic top-down ftt be called "simple" if it is not allowed to make different translations of the same input subtree (if q[a[x1 · · · xk]] → t is a rule and q1[xi], q2[xi] occur in t, then q1 = q2). Prove that the class of simple deterministic top-down tree transformations is included in B. (This result should be expected from the fact that property (T) is eliminated. Similarly, one can prove that NDB ⊆ T, because properties (B) and (B′) are eliminated.)

4.4 Decomposition and composition of bottom-up tree transformations Since bottom-up tree transformations are theoretically easier to handle than top-down tree transformations, we start investigating the former. We have seen that a bottom-up ftt can copy after nondeterministic processing (property (B)). The next theorem shows that these two things can in fact be taken apart into different phases of the transformation: each bottom-up ftt can be decomposed into two transducers, the first doing the nondeterminism (linearly) and the second doing the copying (deterministically). Theorem 4.52. Each bottom-up tree transformation can be realized by a finite state relabeling followed by a homomorphism. In formula: B ⊆ QREL ◦ HOM. Moreover

LB ⊆ QREL ◦ LHOM and DB ⊆ DBQREL ◦ HOM.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt. To simulate M in two phases we apply a technique similar to the one used in the proof of Theorem 3.58: a finite state relabeling is used to put information on each node indicating by which piece of the tree the node should be replaced; then a homomorphism is used to actually replace each node by that piece of tree. The formal construction is as follows. We simultaneously construct a ranked alphabet Ω, the set of rules RN of a bottom-up ftt N = (Q, Σ, Ω, RN, Qd) and a homomorphism h : TΩ → T∆ as follows.
(i) If a → q[t] is a rule in R, then dt is a (new) symbol in Ω0, a → q[dt] is in RN and h0(dt) = t.
(ii) If a[q1[x1] · · · qk[xk]] → q[t] is a rule in R, then dt is a (new) symbol in Ωk, a[q1[x1] · · · qk[xk]] → q[dt[x1 · · · xk]] is in RN and hk(dt) = t.
The only requirement on the symbols of Ω is that if t1 ≠ t2 then dt1 ≠ dt2. Obviously N is a (bottom-up) finite state relabeling. Also, if M is linear then h is linear, and if M is deterministic then so is N. It can easily be shown (by induction on s) that, for s ∈ TΣ, q ∈ Q and t ∈ T∆,

    s =⇒*_M q[t]   iff   ∃u ∈ TΩ : s =⇒*_N q[u] and h(u) = t.

From this it follows that M = N ◦ h, which proves the theorem.


Example 4.53. Consider the bottom-up ftt M of Example 4.18. It can be decomposed as follows. Firstly, Ω0 = {a, b} and Ω2 = {m1, m2, n1, n2}, where da = a, db = b, d_{m[x1x1]} = m1, d_{m[x2x2]} = m2, d_{n[x1x1]} = n1 and d_{n[x2x2]} = n2. Secondly, N = (Q, Σ, Ω, RN, Qd), where RN consists of the rules

a → q0[a], b → q0[b],
f[qi[x1] qj[x2]] → q1−i[m1[x1 x2]], f[qi[x1] qj[x2]] → q1−j[m2[x1 x2]],
g[qi[x1] qj[x2]] → q1−i[n1[x1 x2]], g[qi[x1] qj[x2]] → q1−j[n2[x1 x2]]

for all i, j ∈ {0, 1}. Finally, h is defined by h0(a) = a, h0(b) = b, h2(m1) = m[x1 x1], h2(m2) = m[x2 x2], h2(n1) = n[x1 x1] and h2(n2) = n[x2 x2]. For example, f[a g[ab]] =⇒*_N q0[m2[a n2[ab]]] and h(m2[a n2[ab]]) = m[n[bb]n[bb]].

Note that Theorem 4.52 means (among other things) that each bottom-up tree transformation can be realized by the composition of two top-down tree transformations (cf. Theorem 4.43). We now show that, in the nondeterministic case, the finite state relabeling can still be decomposed further into a relabeling followed by a finite tree automaton restriction.

Theorem 4.54. B ⊆ REL ◦ FTA ◦ HOM and LB ⊆ REL ◦ FTA ◦ LHOM.

Proof. By the previous theorem and the fact that HOM and LHOM are closed under composition (see Exercise 4.9) it clearly suffices to show that QREL ⊆ REL ◦ FTA ◦ LHOM. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up finite state relabeling. We shall actually show that M can be simulated by a relabeling, followed by an fta, followed by a projection (which is in LHOM). The relabeling guesses which rule is applied by M at each node (and puts that rule as a label on the node), the bottom-up fta checks whether this guess is in accordance with the possible state transitions of M, and finally the projection labels the node with the right label. Formally we construct a ranked alphabet Ω, a relabeling r from TΣ into TΩ, a (nondeterministic) bottom-up fta N = (Q, Ω, δ, S, Qd) and a projection p from TΩ into T∆ as follows.
(i) If rule m in R is of the form a → q[b], then dm is a (new) symbol in Ω0, dm ∈ r0(a), q ∈ S_{dm} and p0(dm) = b.
(ii) If rule m in R is of the form a[q1[x1] · · · qk[xk]] → q[b[x1 · · · xk]], then dm is a (new) symbol in Ωk, dm ∈ rk(a), q ∈ δ^k_{dm}(q1, . . . , qk) and pk(dm) = b.
We require that if m and n are different rules, then dm ≠ dn. It is left to the reader to show that, for s ∈ TΣ, q ∈ Q and t ∈ T∆,

    s =⇒*_M q[t]   iff   ∃u ∈ TΩ : u ∈ r(s), u ∈ L(N) and t = p(u).

This proves the theorem.


These decomposition results are often very helpful when proving something about bottom-up tree transformations: the proof can often be split up into proofs about REL, FTA and HOM only. As an example, we immediately have the following result from Theorem 4.54 and Theorems 3.32, 3.48 and 3.65 (note that a class of tree languages is closed under fta restrictions if and only if it is closed under intersection with recognizable tree languages!). Corollary 4.55. RECOG is closed under linear bottom-up tree transformations.

(This expresses that the image of a recognizable tree language under a linear bottom-up tree transformation is again recognizable. In other words, LB-Surface = RECOG.)

Exercise 4.56. Prove, using Theorem 4.54, that B-Surface = HOM-Surface. Prove that, in fact, each B-Surface tree language is the homomorphic image of a rule tree language.

We now prove that under certain circumstances the composition of two elements in B is again in B. Recall from Section 4.3 that the non-closure of B under composition was caused by the failure of property (T) for B: in general, in B, we can't compose a copying transducer with a nondeterministic one. We now show that if either the first transducer is noncopying or the second one is deterministic, then their composition is again in B. Thus, when eliminating (the failure of) property (T), closure results are obtained.

Theorem 4.57.
(1) LB ◦ B ⊆ B and LB ◦ LB ⊆ LB.
(2) B ◦ DB ⊆ B and DB ◦ DB ⊆ DB.

Proof. Because of our decomposition of B we only need to look at special cases. These are treated in three lemmas, concerning composition with homomorphisms, fta restrictions and relabelings respectively. Detailed induction proofs of these lemmas are left to the reader.

Lemma. B ◦ HOM ⊆ B, LB ◦ LHOM ⊆ LB and DB ◦ HOM ⊆ DB.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt and h a tree homomorphism from T∆ into TΩ. We have to show that M ◦ h can be realized by a bottom-up ftt N. The idea is the same as that of Theorem 3.65: N simulates M but outputs at each step the homomorphic image of the output of M. Note that, contrary to the proof of Theorem 3.65 (which was concerned with regular tree grammars, a top-down device), we need not require linearity of h. The construction is as follows. Extend h to trees in T∆(X) by defining h0(xi) = xi for all xi in X. Thus h is now a homomorphism from T∆(X) into TΩ(X). Define N = (Q, Σ, Ω, RN, Qd) such that
(i) if a → q[t] is in R, then a → q[h(t)] is in RN;
(ii) if a[q1[x1] · · · qk[xk]] → q[t] is in R, then a[q1[x1] · · · qk[xk]] → q[h(t)] is in RN.


Obviously, if M and h are linear then so is N (the linear homomorphism transforms a linear tree in T∆(X) into a linear tree in TΩ(X)), and if M is deterministic then so is N.

Lemma. B ◦ FTA ⊆ B, LB ◦ FTA ⊆ LB and DB ◦ FTA ⊆ DB.

Proof. The idea of the proof is similar to the one used to solve Exercise 3.68. Let M = (Q, Σ, ∆, R, Qd) be a bottom-up ftt and let Ñ = (QN, ∆, ∆, RN, QdN) be a deterministic bottom-up ftt corresponding to a deterministic bottom-up fta as in the proof of Theorem 4.28(2). We have to show that M ◦ Ñ can be realized by a bottom-up ftt K. K will have Q × QN as its set of states and it will simultaneously simulate M and keep track of the state of Ñ at the computed output of M.

Extend Ñ by expanding its alphabet to ∆ ∪ X (or, better, to ∆ ∪ Xm where m is the highest subscript of a variable occurring in the rules of M) and by allowing the variables in its rules to range over T∆(X). Thus the computation of the finite tree automaton Ñ may now be started with an element of T∆(QN[X]), which means that, at certain places in the tree, Ñ has to start in prescribed start states. Construct K = (Q × QN, Σ, ∆, RK, Qd × QdN) such that
(i) if a → q[t] is in R and if t =⇒*_Ñ q′[t], then a → (q, q′)[t] is in RK;
(ii) if a[q1[x1] · · · qk[xk]] → q[t] is in R, and if t⟨x1 ← q′1[x1], . . . , xk ← q′k[xk]⟩ =⇒*_Ñ q′[t], then the rule a[(q1, q′1)[x1] · · · (qk, q′k)[xk]] → (q, q′)[t] is in RK.

Note that if M is linear, then so is K. Moreover, since Ñ is deterministic, if M is deterministic then so is K.

Lemma. B ◦ DBQREL ⊆ B, DB ◦ DBQREL ⊆ DB and LB ◦ REL ⊆ LB.

Proof. The proofs of the first two statements are easy generalizations of the proof of the previous lemma. The proof of the third statement is left as an easy exercise.

We now finish the proof of Theorem 4.57. Firstly

LB ◦ B ⊆ LB ◦ REL ◦ FTA ◦ HOM   (Thm. 4.54)
       ⊆ LB ◦ FTA ◦ HOM         (3rd lemma)
       ⊆ LB ◦ HOM               (2nd lemma)
       ⊆ B                      (1st lemma)


and similarly for LB ◦ LB ⊆ LB. Secondly

B ◦ DB ⊆ B ◦ DBQREL ◦ HOM   (Thm. 4.52)
       ⊆ B ◦ HOM            (3rd lemma)
       ⊆ B                  (1st lemma)

and similarly for DB ◦ DB ⊆ DB.

Note that the “right hand side” of Theorem 4.57 states that LB and DB are closed under composition. Exercise 4.58. Is LDB closed under composition?

Exercise 4.59. Prove that B-Surface is closed under intersection with recognizable tree languages.

A consequence of Theorem 4.57 (or in fact its second lemma) is the following useful fact.

Corollary 4.60. RECOG is closed under inverse bottom-up tree transformations (in particular under inverse tree homomorphisms, cf. Exercise 3.68).

Proof. Let L ∈ RECOG and M ∈ B. We have to show that M^{-1}(L) is in the class RECOG. Obviously, if R is the finite tree automaton restriction {(t, t) | t ∈ L}, then M^{-1}(L) = dom(M ◦ R). By Theorem 4.57, M ◦ R is in B and so, by Exercise 4.29, its domain is in RECOG.

Remark 4.61. Note that, since, for arbitrary tree transformations M1 and M2, (M1 ◦ M2)^{-1} = M2^{-1} ◦ M1^{-1}, Corollary 4.60 implies that if M is the composition of any finite number of bottom-up tree transformations (in particular, elements of REL, FTA and HOM), then RECOG is closed under M^{-1}; moreover, since dom(M) = M^{-1}(T∆), the domain of any such tree transformation M is recognizable.

We finally note that, because of the composition results of Theorem 4.57, the inclusion signs in Theorems 4.52 and 4.54 may be replaced by equality signs. It is also easy to prove equations like B = LB ◦ HOM, B = LB ◦ DB or B = REL ◦ DB (cf. property (B)). The first equation has some importance of its own, since it characterizes B without referring to the notion "bottom-up"; this is so because LB has such a characterization as shown next (we first need a definition).

Definition 4.62. Let F be a class of tree transformations containing all identity transformations. For each n ≥ 1 we define F^n inductively by F^1 = F and F^(n+1) = F^n ◦ F. Moreover, the closure of F under composition, denoted by F*, is defined to be ∪_{n≥1} F^n.

Thus F^* consists of all tree transformations of the form M1 ◦ M2 ◦ ⋯ ◦ Mn, where n ≥ 1 and Mi ∈ F for all i, 1 ≤ i ≤ n.
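As a small illustration (the set-valued representation of a nondeterministic tree transformation as a Python function is my own, not from the notes), composition in the sense of Definition 4.62 can be sketched as follows:

```python
# Sketch: a (nondeterministic) tree transformation is modelled as a function
# mapping a tree to the set of its translations; trees are kept abstract here
# (any hashable value works).

def compose(M1, M2):
    """The composition M1 ∘ M2: feed every M1-output into M2."""
    return lambda t: {u for s in M1(t) for u in M2(s)}

def compose_all(Ms):
    """M1 ∘ M2 ∘ ... ∘ Mn, i.e. an element of F* when every Mi is in F."""
    result = Ms[0]
    for M in Ms[1:]:
        result = compose(result, M)
    return result
```

For instance, composing a transformation with itself k times realizes an element of F^k.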


Corollary 4.63.
(1) LB = (REL ∪ FTA ∪ LHOM)^*.
(2) B = LB ◦ HOM.

Proof. The inclusions ⊆ follow from Theorems 4.52 and 4.54; the inclusions ⊇ follow from Theorem 4.57.

4.5 Decomposition of top-down tree transformations

We now show that, analogously to the bottom-up case, we can decompose each top-down ftt into two transducers, the first doing the copying and the second doing the rest of the work (cf. property (T)).

Theorem 4.64. T ⊆ HOM ◦ LT and DT ⊆ HOM ◦ LDT.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a top-down ftt. While processing an input tree, M generally makes a lot of copies of input subtrees in order to get different translations of these subtrees. To simulate M we can first use a homomorphism which simply makes as many copies of subtrees as are needed by M, and then we can simulate M linearly (since all copies are already there). The formal construction is as follows.

We first determine the “degree of copying” of M, that is the maximal number of copies needed by M in any step of its computation. Thus, for x ∈ X and r ∈ R, let r_x be the number of occurrences of x in the right hand side of r. Let n = max{r_x | x ∈ X, r ∈ R}. We now let Ω be the ranked alphabet obtained from Σ by multiplying the ranks of all symbols by n (so that a node may be connected to n copies of each of its subtrees). Thus Ω_{kn} = Σ_k for all k ≥ 0. The “copying” homomorphism h from TΣ into TΩ is now defined by

(i) for a ∈ Σ0, h0(a) = a;
(ii) for k ≥ 1 and a ∈ Σk, h_k(a) = a[x1^n x2^n ⋯ xk^n], i.e. each variable occurs n times in succession.

(For example, if k = 2 and n = 3, then h2(a) = a[x1 x1 x1 x2 x2 x2].) Finally the top-down ftt N = (Q, Ω, ∆, RN, Qd) is defined as follows.

(i) If q[a] → t is a rule in R, then it is also in RN.
(ii) Suppose that q[a[x1 ⋯ xk]] → t is a rule in R. Let us denote the variables x1, x2, …, x_{kn} by x1,1, x1,2, …, x1,n, x2,1, …, x2,n, …, xk,1, …, xk,n respectively. Then the rule q[a[x1,1 ⋯ x1,n ⋯ xk,1 ⋯ xk,n]] → t′ is in RN, where t′ is taken such that it is linear and such that t′⟨xi,j ← xi | 1 ≤ i ≤ k, 1 ≤ j ≤ n⟩ = t (t′ can be obtained by putting different second subscripts on different occurrences of the same variable in t). For instance, if q[a[x1 x2]] → b[q1[x2] c d[q2[x1] q3[x2]]] is in R and n = 3, then we can put the rule q[a[x1,1 x1,2 x1,3 x2,1 x2,2 x2,3]] → b[q1[x2,1] c d[q2[x1,1] q3[x2,2]]] in RN.

Obviously, if M is deterministic, then so is N. A formal proof of the fact that M = h ◦ N is left to the reader.
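The copying homomorphism h can be sketched in a few lines of Python (the (label, children) tree representation is my own assumption, not part of the notes):

```python
# Trees are (label, children) pairs, e.g. a[bc] is ('a', [('b', []), ('c', [])]).
# copy_hom mirrors h_k(a) = a[x1^n x2^n ... xk^n]: every direct subtree is
# replaced by n copies of its (recursively copied) self.

def copy_hom(tree, n):
    label, children = tree
    copied = []
    for child in children:
        copied.extend([copy_hom(child, n)] * n)
    return (label, copied)
```

For n = 3 the tree a[bc] becomes a[bbbccc], matching the worked example in the proof.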


Example 4.65. Consider the top-down ftt M of Example 4.33 and let us consider its decomposition according to the above proof. Clearly n = 2, and therefore the definition of h for the symbols +, ∗, −, a and b is h2(+) = +[x1 x1 x2 x2], h2(∗) = ∗[x1 x1 x2 x2], h1(−) = −[x1 x1], h0(a) = a and h0(b) = b. Thus, for example, h(∗[+[ab] −[a]]) = ∗[+[aabb] +[aabb] −[aa] −[aa]]. For instance the first three rules of M turn into the following three rules for N:

q[+[x1,1 x1,2 x2,1 x2,2]] → +[q[x1,1] q[x2,1]],
q[∗[x1,1 x1,2 x2,1 x2,2]] → +[∗[q[x1,1] i[x2,1]] ∗[i[x1,2] q[x2,2]]],
q[−[x1,1 x1,2]] → −[q[x1,1]].

It is left to the reader to see how N processes h(∗[+[ab] −[a]]).

Since LT ⊆ LB (Theorem 4.48), we know already how to decompose LT. This gives us the following result.

Corollary 4.66. T ⊆ HOM ◦ LB = HOM ◦ REL ◦ FTA ◦ LHOM.

Notice that the inclusion is proper by Lemma 4.46. From this corollary we see that each top-down tree transformation can be realized by the composition of two bottom-up tree transformations (cf. Theorem 4.43). Another way of expressing our decomposition results concerning B and T is by the equation B^* = T^* = (REL ∪ FTA ∪ HOM)^*.

By the above corollary and Remark 4.61 we obtain that RECOG is closed under inverse top-down tree transformations, and in particular

Corollary 4.67. The domain of a top-down tree transformation is recognizable.

The next theorem says, analogously to Theorem 4.52, that each deterministic element of LT can be decomposed into two simpler (deterministic) ones.

Theorem 4.68. LDT ⊆ DTQREL ◦ LHOM.

Proof. See the proof of Theorem 4.52.

We now note that we cannot obtain very nice results about closure under composition analogous to those of Theorem 4.57, since for instance LT and DT are not closed under composition (see Corollary 4.47). The reason is essentially the failure of property (B′) for top-down ftt. One could get closure results by eliminating both properties (B) and (B′). For example, one can show that DtT ◦ T ⊆ T, T ◦ NLT ⊆ T, etc. However, after having compared DB with DT in the next section, we prefer to extend the top-down ftt in such a way that it has the capability of checking before deletion. It will turn out that the top-down transducer so extended has all the nice properties “dual” to those of B. The following is an easy exercise in composition.

Exercise 4.69. Prove that every T-surface tree language is in fact the image of a rule tree language under a top-down tree transformation.


4.6 Comparison of B and T, the deterministic case

Although, by determinism, some differences between B and T are eliminated, DB and DT are still incomparable for a number of reasons. We first discuss the question why DB contains elements not in DT, and then the reverse question.

Firstly, we have seen in Lemma 4.46 that DB contains elements not in T. This was caused by property (B′). Secondly, we have seen in Theorem 3.14 (together with Theorem 3.8) that there are deterministic bottom-up recognizable tree languages which cannot be recognized deterministically top-down. It is easy to see that the corresponding fta restrictions cannot be realized by a deterministic top-down transducer. In fact the following can be shown.

Exercise 4.70. Prove that the domain of a deterministic top-down ftt can be recognized by a deterministic top-down fta.

Thirdly, there is a trivial reason that DB is stronger than DT: a bottom-up ftt can, for example, recognize the “lowest” occurrence of some symbol in a tree (since it is the first occurrence), whereas a deterministic top-down ftt cannot (since for it, it is the last occurrence).

Lemma 4.71. There is a tree transformation in DBQREL which is not in DT.

Proof. Let Σ0 = {b}, Σ1 = {a, f}, ∆0 = {b} and ∆1 = {a, ā, f}. Consider the bottom-up ftt M = (Q, Σ, ∆, R, Qd) where Q = Qd = {q1, q2} and R consists of the rules

b → q1[b],
a[q1[x]] → q1[ā[x]],
f[q1[x]] → q2[f[x]],
a[q2[x]] → q2[a[x]],
f[q2[x]] → q2[f[x]],

where x denotes x1. Thus M is a deterministic bottom-up finite state relabeling which bars all a’s below the lowest f. Obviously a deterministic top-down ftt cannot give the right translation for both m(a^n b) and m(a^n f b).

Let us now consider DT. Firstly, we note that property (T) has not been eliminated: a deterministic top-down ftt still has the ability to copy an input subtree and continue the translation of these copies in different states. Consider for example the tree transformation M = {(m(ab^n c), a[m(p^n c) m(q^n c)]) | n ≥ 0}. Obviously M is in DT. It can be shown, similarly to the proof of Theorem 4.43(2), that M is not in B (see also Exercise 4.51). Secondly, deterministic bottom-up ftt cannot distinguish between left and right (because they start at the bottom!), whereas deterministic top-down ftt can. Consider for example the tree transformation M = {(a[m(b^n c) m(b^k c)], a[m(b^n d) m(b^k e)]) | n, k ≥ 0}. This is obviously an element of DT (even DTQREL) and can be shown not to be in DB (of course it is in B: a nondeterministic bottom-up ftt can guess whether it is left or right and check its guess when arriving at the top). Thus we have the following lemma.
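The relabeling of Lemma 4.71 is easy to simulate on monadic trees, written here as strings from the root down to the leaf (this string representation, and the character 'ā' for the barred a, are my own conventions; the notes write m(w) for the monadic tree of the string w):

```python
# Sketch of the deterministic bottom-up relabeling of Lemma 4.71 on a monadic
# input tree m(word): state q1 = "no f seen below", q2 = "an f seen below".
# An 'a' read in state q1 has no f below it and is barred; an 'a' read in q2
# stays unchanged.

def relabel(word):
    out = []
    state = 'q1'
    for sym in reversed(word):          # bottom-up: start at the leaf end
        if sym == 'f':
            state = 'q2'
        if sym == 'a' and state == 'q1':
            out.append('ā')
        else:
            out.append(sym)
    return ''.join(reversed(out))
```

Thus relabel('aab') bars both a's while relabel('aafb') bars none, which is exactly why a deterministic top-down device, reading the a's first, cannot realize this transformation.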


Lemma 4.72. There is a tree transformation in DTQREL which is not in DB .

Thirdly, there is a proof of this lemma which is analogous to the one of Lemma 4.71: there is a deterministic top-down finite state relabeling that bars all a’s above the highest f , and this cannot be done by an element of DB .

4.7 Top-down finite tree transducers with regular look-ahead

One way to take away the advantages of DB over DT is to allow the top-down tree transducer to have a look-ahead: that is, the ability to inspect an input subtree and, depending on the result of that inspection, decide which rule to apply next. Moreover it seems to be sufficient (and natural) that this look-ahead ability should consist of inspecting whether the input subtree belongs to a certain recognizable tree language or not (in other words, checking whether it has a certain “recognizable property”).

As a result of this capability the top-down tree transducer would first of all have property (B′): it can check a recognizable property of a subtree and decide whether to delete it or not. Secondly, the domain of a deterministic top-down tree transducer would be an arbitrary recognizable tree language (it just starts by checking whether the whole input tree belongs to the recognizable tree language). And, thirdly, a deterministic top-down tree transducer would for instance be able to see the “lowest” occurrence of some symbol in a tree (it just checks whether the subtree beneath the symbol contains another occurrence of the same symbol, and that is a recognizable property).

We now formally define the top-down transducer with regular (= recognizable) look-ahead. It turns out that the look-ahead feature can be expressed easily in a tree rewriting system (see Definition 4.12): for each rule we specify the ranges of the variables in the rule to be certain recognizable tree languages (such a rule is then applicable only if the corresponding input subtrees belong to these recognizable tree languages).

Definition 4.73.
A top-down (finite) tree transducer with regular look-ahead is a structure M = (Q, Σ, ∆, R, Qd), where Q, Σ, ∆ and Qd are as for the ordinary top-down ftt and R is a finite set of rules of the form (t1 → t2, D), where t1 → t2 is an ordinary top-down ftt rule and D is a mapping from Xk into P(TΣ) (where k is the number of variables in t1) such that, for 1 ≤ i ≤ k, D(xi) ∈ RECOG. (Whenever D is understood or will be specified later we write t1 → t2 rather than (t1 → t2, D). We call t1 and t2 the left hand side and right hand side of the rule respectively.) M is viewed as a tree rewriting system in the obvious way, (t1 → t2, D) being a “rule scheme” (t1, t2, D). The tree transformation realized by M, denoted by T(M) or M, is {(s, t) ∈ TΣ × T∆ | q[s] =⇒*_M t for some q ∈ Qd}.

Thus a top-down ftt with regular look-ahead works in exactly the same way as an ordinary one, except that the application of each of its rules is restricted: the (input sub-)trees substituted in the rule should belong to prespecified recognizable tree languages. Note that for rules of the form q[a] → t the mapping D need not be specified.


Notation 4.74. The phrase “with regular look-ahead” will be indicated by a prime. Thus the class of top-down tree transformations with regular look-ahead will be denoted by T′. An element of T′ is also called a top-down′ tree transformation.

Example 4.75. Consider the tree transformation M in the proof of Lemma 4.46. It can be realized by the top-down′ ftt N = (Q, Σ, ∆, R, Qd) where Q = {q0, q}, Qd = {q0} and R consists of the following rules:

q0[a[x1 x2]] → a[q[x1]] with ranges D(x1) = {m(b^n c) | n ≥ 0} and D(x2) = {c},
q[b[x1]] → b[q[x1]] with D(x1) = TΣ, and
q[c] → c.

Note that, in the first rule, D(x1) could as well be TΣ since it is checked later by N that the left subtree contains no a’s. The essential use of regular look-ahead in this example is the restriction of the right subtree to {c}.

We now define some subclasses of T′.

Definition 4.76. Let M = (Q, Σ, ∆, R, Qd) be a top-down′ ftt. The definitions of linear, nondeleting and one-state are identical to the bottom-up and top-down ones (see Definition 4.23). M is called (partial) deterministic if the following holds.

(i) Qd is a singleton.
(ii) If (s → t1, D1) and (s → t2, D2) are different rules in R (with the same left hand side), then D1(xi) ∩ D2(xi) = ∅ for some i, 1 ≤ i ≤ k (where k is the number of variables in s).

Since the ranges of the variables are recognizable, it can effectively be determined whether a top-down′ ftt is deterministic (if, of course, these ranges are effectively specified, which we always assume). Notation 4.24 also applies to T′. Thus LDT′ is the class of linear deterministic top-down tree transformations with regular look-ahead.

Observe that, obviously, T ⊆ T′ since each top-down ftt can be transformed trivially into a top-down′ ftt by specifying all ranges of all variables in all rules to be the (recognizable) tree language TΣ (where Σ is the input alphabet). Moreover, if Z is a modifier, then ZT ⊆ ZT′.
It can easily be seen that Theorems 4.43 and 4.44 still hold with T replaced by T′. In fact, the proofs of those theorems carry over without any change. In the next theorem we show that the regular look-ahead can be “taken out” of a top-down′ ftt.

Theorem 4.77. T′ ⊆ DBQREL ◦ T and ZT′ ⊆ DBQREL ◦ ZT for Z ∈ {L, D, LD}.


Proof. Let M = (Q, Σ, ∆, R, Qd) be in T′. Consider all recognizable properties which M needs for its look-ahead (that is, all recognizable tree languages D(xi) occurring in the rules of M). We can use a total deterministic bottom-up finite state relabeling to check, for a given input tree t, whether the subtrees of t have these properties or not, and to put at each node a (finite) amount of information telling us whether the direct subtrees of that node have the properties or not. After this relabeling we can use an ordinary top-down ftt to simulate M, because the look-ahead information is now contained in the label of each node. The formal construction might look as follows.

Let L1, …, Ln be all the recognizable tree languages occurring as ranges of variables in the rules of M. Let U denote the set {0, 1}^n, that is, the set of all sequences of 0’s and 1’s of length n. The j-th element of u ∈ U will be denoted by u^j. An element u of U will be used to indicate whether a tree belongs to L1, …, Ln or not (u^j = 1 iff the tree is in Lj). We now introduce a new ranked alphabet Ω such that Ω0 = Σ0 and, for k ≥ 1, Ωk = Σk × U^k. Thus an element of Ωk is of the form (a, (u1, …, uk)) with a ∈ Σk and u1, …, uk ∈ U. If a node is labeled by such a symbol, it will mean that ui contains all the information about the i-th subtree of the node.

Next we define the mapping f : TΣ → TΩ as follows:

(i) for a ∈ Σ0, f(a) = a;
(ii) for k ≥ 1, a ∈ Σk and t1, …, tk ∈ TΣ, f(a[t1 ⋯ tk]) = b[f(t1) ⋯ f(tk)], where b = (a, (u1, …, uk)) and, for 1 ≤ i ≤ k and 1 ≤ j ≤ n, ui^j = 1 iff ti ∈ Lj.

It is left as an exercise to show that f can be realized by a (total) deterministic bottom-up finite state relabeling (given the deterministic bottom-up fta’s recognizing L1, …, Ln).
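A sketch of this relabeling in Python (my own illustration: trees are (label, children) pairs, and the languages L1, …, Ln are given as arbitrary membership predicates, whereas in the proof they are run by deterministic bottom-up fta's):

```python
# annotate relabels every non-leaf node a to (a, info), where info records,
# for each direct subtree, the 0/1-vector saying which of the look-ahead
# languages it belongs to; leaves keep their label, as in clause (i).

def annotate(tree, tests):              # tests: one predicate per language Lj
    label, children = tree
    if not children:
        return (label, [])
    info = tuple(tuple(1 if test(c) else 0 for test in tests)
                 for c in children)
    return ((label, info), [annotate(c, tests) for c in children])
```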
We now define a top-down ftt N = (Q, Ω, ∆, RN, Qd) such that

(i) if q[a] → t is in R, then it is in RN;
(ii) if (q[a[x1 ⋯ xk]] → t, D) is in R, then each rule of the form q[(a, u)[x1 ⋯ xk]] → t is in RN, where u = (u1, …, uk) ∈ U^k and u satisfies the condition: if D(xi) = Lj then ui^j = 1 (for all i and j, 1 ≤ i ≤ k, 1 ≤ j ≤ n).

This completes the construction. It is obvious from this construction that M = f ◦ N. Moreover, if M is linear, then so is N. It is also clear that, in the construction above, the set U may be replaced by the set {u ∈ U | for all j1 and j2, if Lj1 ∩ Lj2 = ∅, then u^{j1} · u^{j2} ≠ 1} (note that this influences RN: rules containing elements not in this set are removed). One can now easily see that if M is deterministic, then so is N.

An immediate consequence of this theorem and previous decomposition results is that each element of T′ is decomposable into elements of REL, FTA and HOM.

Corollary 4.78. The domain of a top-down ftt with regular look-ahead is recognizable.

Proof. For instance by Remark 4.61.


Another consequence of Theorem 4.77 is that the addition of regular look-ahead has no influence on the surface tree languages.

Corollary 4.79. T′-Surface = T-Surface and DT′-Surface = DT-Surface.

Proof. Let L be a (D)T′-surface tree language, so L = M(L1) for some M ∈ (D)T′ and L1 ∈ RECOG. Now, by Theorem 4.77, M = R ◦ N for some R ∈ DBQREL and N ∈ (D)T. Hence L = N(R(L1)). Since RECOG is closed under linear bottom-up tree transformations (Corollary 4.55), N(R(L1)) is a (D)T-surface tree language.

We now show that, in the linear case, there is no difference between bottom-up and top-down′ tree transformations (all properties (B), (T) and (B′) are “eliminated”), cf. Theorem 4.48.

Theorem 4.80. LT′ = LB.

Proof. First of all we have

    LT′ ⊆ DBQREL ◦ LT   (Theorem 4.77)
        ⊆ DBQREL ◦ LB   (Theorem 4.48)
        ⊆ LB            (Theorem 4.57).

Let us now prove that LB ⊆ LT′. The construction is the same as in the proof of Theorem 4.48(2), but now we use look-ahead in case the bottom-up ftt is deleting. Let M = (Q, Σ, ∆, R, Qd) be a linear bottom-up ftt. Define, for each q in Q, Mq to be the bottom-up ftt (Q, Σ, ∆, R, {q}). Construct the linear top-down′ ftt N = (Q, Σ, ∆, RN, Qd) such that

(i) if a → q[t] is in R, then q[a] → t is in RN;
(ii) if a[q1[x1] ⋯ qk[xk]] → q[t] is in R, then the rule q[a[x1 ⋯ xk]] → t⟨x1 ← q1[x1], …, xk ← qk[xk]⟩ is in RN, where, for 1 ≤ i ≤ k, if xi does not occur in t, then D(xi) = dom(Mqi), and D(xi) = TΣ otherwise.

Note that dom(Mqi) is recognizable by Exercise 4.29. It should be clear that T(N) = T(M).

From the proof of Theorem 4.80 it follows that each element of LT′ can be realized by a linear top-down′ ftt with the property that look-ahead is only used on subtrees which are deleted. This property corresponds precisely to property (B′).

Let us now consider composition of top-down′ ftt. Analogously to the bottom-up case, we can now expect the results in the next theorem from property (B).

Theorem 4.81.
(1) T′ ◦ LT′ ⊆ T′.
(2) DT′ ◦ T′ ⊆ T′ and DT′ ◦ DT′ ⊆ DT′.


Proof. As in the bottom-up case, we only consider a number of special cases.

Lemma. T′ ◦ LHOM ⊆ T′ and DT′ ◦ HOM ⊆ DT′.

Proof. Let M = (Q, Σ, ∆, R, Qd) be a top-down′ ftt and h a tree homomorphism from T∆ into TΩ. We construct a new top-down′ ftt using the old idea of applying the homomorphism to the right hand sides of the rules of M. Therefore we extend h to trees in T∆(Q[X]) by defining h0(x) = x for x in X and h1(q) = q[x1] for q ∈ Q. Let, for q ∈ Q, Mq = (Q, Σ, ∆, R, {q}). Note that, by Corollary 4.78, dom(Mq) is recognizable. Construct now N = (Q, Σ, Ω, RN, Qd) such that

(i) if q0[a] → t is in R, then q0[a] → h(t) is in RN;
(ii) if (q0[a[x1 ⋯ xk]] → t, D) is in R, then (q0[a[x1 ⋯ xk]] → h(t), D′) is in RN, where, for 1 ≤ i ≤ k, D′(xi) is the intersection of D(xi) and all tree languages dom(Mq) such that q[xi] occurs in t but not in h(t).

Thus N simultaneously simulates M and applies h to the output of M. But, whenever M starts making a translation of a subtree t starting in a state q, and this translation is deleted by h, N checks that t is translatable by Mq. If h is linear or M is deterministic, then N = M ◦ h. Obviously, if M is deterministic, then so is N.

Lemma. T′ ◦ QREL ⊆ T′, DT′ ◦ DTQREL ⊆ DT′ and DT′ ◦ DBQREL ⊆ DT′.

Proof. It is not difficult to see that, given M ∈ T′ and a top-down finite state relabeling N, the composition of these two can be realized by one top-down′ ftt K, which at each step simultaneously simulates M and N by transforming the output of M according to N. The construction is basically the same as that in the second lemma of Theorem 4.57. It is also clear that if M and N are both deterministic, then so is K. This gives us the first two statements of the lemma. The third one is more difficult since the finite state relabeling is now bottom-up. However the same kind of construction can be applied again, using the look-ahead facility to make the resulting top-down′ ftt deterministic.
Let M = (Q, Σ, ∆, R, Qd) with Qd = {qd} be in DT′ and let N = (QN, ∆, Ω, RN, QdN) be in DBQREL. Without loss of generality we assume that qd does not occur in the right hand sides of the rules in R. Let, as usual, for q ∈ Q, Mq = (Q, Σ, ∆, R, {q}), and, for p ∈ QN, Np = (QN, ∆, Ω, RN, {p}). We shall realize M ◦ N by a deterministic top-down′ ftt K = (Q, Σ, Ω, RK, Qd), where the set RK of rules is determined as follows. Let q ∈ Q and p ∈ QN, such that if q = qd then p ∈ QdN.

(i) If q[a] → t is in R and t =⇒*_N p[t′], then q[a] → t′ is in RK.


(ii) Let (q[a[x1 ⋯ xk]] → t, D) be a rule in R. Then t can be written as t = s⟨x1 ← q1[x_{i1}], …, xm ← qm[x_{im}]⟩ for certain m ≥ 0, s ∈ T∆(Xm), q1, …, qm ∈ Q and x_{i1}, …, x_{im} ∈ Xk, such that s is nondeleting w.r.t. Xm. Let p1, …, pm be a sequence of m states of N such that s⟨x1 ← p1[x1], …, xm ← pm[xm]⟩ =⇒*_N p[s′], where s′ ∈ TΩ(Xm). (Of course N was first extended in the usual way.) Then the rule q[a[x1 ⋯ xk]] → s′⟨x1 ← q1[x_{i1}], …, xm ← qm[x_{im}]⟩ is in RK, where the ranges of the variables are specified by D as follows. For 1 ≤ u ≤ k, D(xu) is the intersection of D(xu) and all tree languages dom(Mqj ◦ Npj) such that x_{ij} = xu.

This ends the construction. Intuitively, when K arrives at the root of an input subtree a[t1 ⋯ tk] (in the same state as M), it first uses its regular look-ahead to determine the rule applied by M and to determine, for every 1 ≤ j ≤ m, the (unique) state pj in which N will arrive after translation of the Mqj-translation of t_{ij}. It then runs N on the piece of output of M, starting in the states p1, …, pm, and produces the output of N as its piece of output. It is straightforward to check formally that K is a deterministic top-down′ ftt (using the determinism of both M and N).

We now complete the proof of Theorem 4.81. Firstly

    T′ ◦ LT′ = T′ ◦ LB              (Theorem 4.80)
             ⊆ T′ ◦ QREL ◦ LHOM    (Theorem 4.52)
             ⊆ T′ ◦ LHOM           (second lemma)
             ⊆ T′                  (first lemma).

Secondly

    DT′ ◦ (D)T′ ⊆ DT′ ◦ DBQREL ◦ (D)T   (Theorem 4.77)
                ⊆ DT′ ◦ (D)T            (second lemma)
                ⊆ DT′ ◦ HOM ◦ L(D)T     (Theorem 4.64)
                ⊆ DT′ ◦ L(D)T           (first lemma).

Now DT′ ◦ LT ⊆ T′ (by (1) of this theorem) and

    DT′ ◦ LDT ⊆ DT′ ◦ DTQREL ◦ LHOM   (Theorem 4.68)
              ⊆ DT′ ◦ LHOM            (second lemma)
              ⊆ DT′                   (first lemma).

This proves Theorem 4.81.

Note that the “right hand side” of Theorem 4.81(2) states that DT′ is closed under composition. Clearly we know already that LT′ is closed under composition (since LT′ = LB). It can also easily be checked from the proof of Theorem 4.81 that LDT′ is closed under composition.

We can now show that indeed regular look-ahead has made DT stronger than DB.

Corollary 4.82. DB ⊊ DT′.

Proof. By Theorem 4.52, DB ⊆ DBQREL ◦ HOM. Hence, since the identity tree transformation is in DT′, we trivially have DB ⊆ DT′ ◦ DBQREL ◦ HOM. But, by the second and the first lemma in the proof of Theorem 4.81, DT′ ◦ DBQREL ◦ HOM ⊆ DT′. Hence DB ⊆ DT′. Proper inclusion follows from Lemma 4.72.

Exercise 4.83. Show that each T′-surface tree language is the range of some element of T′.

Exercise 4.84. Prove that T-Surface is closed under linear tree transformations (recall Corollary 4.79). Prove that DT-Surface is closed under deterministic top-down and bottom-up tree transformations.

It follows from Theorem 4.81 that the inclusion signs in Theorem 4.77 may be replaced by equality signs. Hence, for example, DT′ = DBQREL ◦ DT = DB ◦ DT. We finally show a result “dual” to Corollary 4.63(2) (recall that LT′ = LB).

Theorem 4.85. T′ = HOM ◦ LT′.

Proof. The inclusion HOM ◦ LT′ ⊆ T′ is immediate from Theorem 4.81. The inclusion T′ ⊆ HOM ◦ LT′ can be shown in exactly the same way as in the proof of T ⊆ HOM ◦ LT (Theorem 4.64). The only problem is the regular look-ahead: the image of a recognizable tree language under the homomorphism h need not be recognizable (we use the notation of the proof of Theorem 4.64). The solution is to consider a homomorphism g from TΩ into TΣ such that, for all t in TΣ, g(h(t)) = t; g is easy to find. Now, whenever, in a rule of M, we have a recognizable tree language L as look-ahead (for some variable), we can use g^{-1}(L) as the look-ahead in the corresponding rule of N. The details are left to the reader.

4.8 Surface and target languages

In this section we shall consider a few properties of the tree (and string) languages which are obtained from the recognizable tree languages by application of a finite number of tree transducers. In other words we shall consider the classes (REL ∪ FTA ∪ HOM)^*-Surface, briefly denoted by Surface, and (REL ∪ FTA ∪ HOM)^*-Target, briefly denoted by Target. Note that Target = yield(Surface). Note also that, by various composition results, (REL ∪ FTA ∪ HOM)^* = T^* = (T′)^* = B^* = (T′ ∪ B)^* = … etc.

Let us first consider some classes of tree languages obtained by restricting the number of transducers applied. In particular, let us consider, for each k ≥ 1, the


classes T^k-Surface, (T′)^k-Surface and B^k-Surface. Obviously, by the above remark, Surface = ⋃_{k≥1} T^k-Surface = ⋃_{k≥1} (T′)^k-Surface = ⋃_{k≥1} B^k-Surface.

As a corollary to previous results we can show that regular look-ahead has no influence on the class of surface languages (cf. Corollary 4.79).

Corollary 4.86. For all k ≥ 1,
(i) (T′)^k = DBQREL ◦ T^k,
(ii) (T′)^k-Surface = T^k-Surface, and
(iii) T^k-Surface is closed under linear tree transformations.

Proof. (i) By Theorem 4.77, T′ ⊆ DBQREL ◦ T. Also, by Corollary 4.82 and Theorem 4.81(2), DBQREL ◦ T ⊆ T′. Hence T′ = DBQREL ◦ T. We now show that T′ ◦ T′ = T′ ◦ T. Trivially, T′ ◦ T ⊆ T′ ◦ T′. Also T′ ◦ T′ = T′ ◦ DBQREL ◦ T and, by Theorem 4.81(1), T′ ◦ DBQREL ⊆ T′. Hence T′ ◦ T′ ⊆ T′ ◦ T. From this it is straightforward to see that (T′)^{k+1} = T′ ◦ T^k = DBQREL ◦ T ◦ T^k = DBQREL ◦ T^{k+1}.
(ii) This is an immediate consequence of (i) and the fact that RECOG is closed under linear tree transformations.
(iii) This follows easily from (ii) and the fact that (T′)^k is closed under composition with linear tree transformations (Theorem 4.81(1), recall also Theorem 4.80).

We mention here that it can also be shown that T^k-Surface is closed under union, tree concatenation and tree concatenation closure. From that a lot of closure properties of T^k-Target and Target follow. The relation between the top-down and the bottom-up surface tree languages is easy.

Corollary 4.87. For all k ≥ 1,
(i) T^k-Surface = (B^k ◦ LB)-Surface and B^{k+1}-Surface = (T^k ◦ HOM)-Surface;
(ii) B^k-Surface ⊆ T^k-Surface ⊆ B^{k+1}-Surface.

Proof. (i) follows from the fact that B = LB ◦ HOM (Corollary 4.63(2)) and that T′ = HOM ◦ LB (Theorem 4.85). (ii) is an obvious consequence of (i). Note that B-Surface = HOM-Surface (Exercise 4.56).

It is not known, but conjectured, that for all k the inclusions in Corollary 4.87(ii) are proper. Note that, by taking yields, Corollary 4.87 also holds for the corresponding target languages. Again it is not known whether the inclusions are proper.


In the rest of this section we show that the emptiness-, the membership- and the finiteness-problem are solvable for Surface.

Theorem 4.88. The emptiness- and membership-problem are solvable for Surface.

Proof. Let M ∈ (REL ∪ FTA ∪ HOM)^* and L ∈ RECOG. Consider the tree language M(L) ∈ Surface. Obviously, M(L) = ∅ iff L ∩ dom(M) = ∅. But, by Remark 4.61, dom(M) is recognizable. Hence, by Theorem 3.32 and Theorem 3.74, it is decidable whether L ∩ dom(M) = ∅. To show solvability of the membership-problem note first that Surface is closed under intersection with a recognizable tree language (if R ∈ RECOG and R̂ is the fta restriction such that dom(R̂) = R, then M(L) ∩ R = (M ◦ R̂)(L)). Now, for any tree t, t ∈ M(L) iff M(L) ∩ {t} ≠ ∅. Since {t} is recognizable, M(L) ∩ {t} ∈ Surface, and we just showed that it is decidable whether a surface tree language is empty or not.

To show decidability of the finiteness-problem we shall use the following result.

Lemma 4.89. Each monadic tree language in Surface is recognizable (and hence regular as a set of strings).

Proof. Let L be a monadic tree language. Obviously it suffices to show that, for each k ≥ 1, if L ∈ (T′)^k-Surface, then L ∈ (T′)^{k-1}-Surface (where, by definition, (T′)^0-Surface = RECOG). Suppose therefore that L ∈ (T′)^k-Surface, so L = (M1 ◦ ⋯ ◦ M_{k-1} ◦ Mk)(R) for certain Mi ∈ T′ and R ∈ RECOG. Consider all right hand sides of rules in Mk. Obviously, since L is monadic, these right hand sides do not contain elements of rank ≥ 2, that is, they are monadic in the sense of Notation 4.42 (rules which have nonmonadic right hand sides may be removed). But from this it follows that Mk is linear. It now follows from Theorem 4.81(1) (and Corollary 4.55 and Theorem 4.80 in the case k = 1), that L ∈ (T′)^{k-1}-Surface.

Theorem 4.90. The finiteness-problem is solvable for Surface.

Proof. Intuitively, a tree language is finite if and only if the set of paths through this tree language is finite. For L ⊆ TΣ we define path(L) ⊆ Σ^* recursively as follows:

(i) for a ∈ Σ0, path(a) = {a};
(ii) for k ≥ 1, a ∈ Σk and t1, …, tk ∈ TΣ, path(a[t1 ⋯ tk]) = {a} · (path(t1) ∪ ⋯ ∪ path(tk));
(iii) for L ⊆ TΣ, path(L) = ⋃_{t∈L} path(t).
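The recursive definition of path can be transcribed directly (a sketch; the (label, children) pair representation of trees is my own):

```python
# path(t): the set of strings of labels along root-to-leaf paths of t,
# following clauses (i) and (ii); path of a set of trees is the union (iii).

def path(tree):
    label, children = tree
    if not children:
        return {label}
    return {label + p for c in children for p in path(c)}

def path_lang(trees):
    return set().union(*(path(t) for t in trees)) if trees else set()
```

For the tree a[bb[cc]] this yields {ab, abc}.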

Thus, path(t) consists of all strings which can be obtained by following a path through t (for instance, if t = a[bb[cc]], then path(t) = {ab, abc}). We remark here that any other similar definition of “path” would also satisfy our purposes. It is left to the reader to show that, for any tree language L, L is finite iff path(L) is finite. We now show that, given L ⊆ TΣ , the set path(L) can be obtained by feeding L into a top-down tree transducer Mp : Mp (L) will be equal to path(L), modulo the correspondence


between strings and monadic trees (see Definition 2.21). In fact, Mp = (Q, Σ, ∆, R, Qd), where Q = Qd = {p}, ∆0 = {e}, ∆1 = Σ and R consists of the following rules:

(i) for each a ∈ Σ0, p[a] → a[e] is in R;
(ii) for every k ≥ 1, a ∈ Σk and 1 ≤ i ≤ k, p[a[x1 ⋯ xk]] → a[p[xi]] is in R.

Consequently, L is finite iff Mp(L) is finite. Now, let L ∈ Surface. Then, obviously, Mp(L) ∈ Surface. Moreover Mp(L) is monadic. Hence, by Lemma 4.89, Mp(L) is recognizable. Thus, by Theorem 3.75, it is decidable whether Mp(L) is finite.

To show that the above mentioned problems are solvable for Target, we need the following lemma.

Lemma 4.91. Each language in Target is (effectively) of the form yield(L) or yield(L) ∪ {λ}, where L ∈ Surface and L ⊆ TΣ for some Σ such that Σ1 = ∅ and e ∉ Σ.

Proof. It is left as an exercise to show that, for any L′ ⊆ TΣ′, there exists a bottom-up tree transducer M such that M(L′) ⊆ TΣ for some Σ satisfying the requirements, and such that yield(M(L′)) = yield(L′) − {λ}. It is also left as an exercise to show that it is decidable whether λ ∈ yield(L′). From these two facts the lemma follows.

Theorem 4.92. The emptiness-, membership- and finiteness-problem are solvable for Target.

Proof. It is obvious from the previous lemma that we may restrict ourselves to target languages yield(L), where L ∈ Surface and L ⊆ TΣ for some Σ such that Σ1 = ∅ and e ∉ Σ0. Obviously, yield(L) = ∅ iff L = ∅. Hence the emptiness-problem is solvable by Theorem 4.88. Note that, by Example 2.17, for a given w ∈ Σ0^*, there are only a finite number of trees t such that yield(t) = w. From this and Theorem 4.88 the decidability of the membership-problem follows. Moreover it follows that yield(L) is finite iff L is finite. Hence, by Theorem 4.90, the finiteness-problem is solvable.

We note that it can be shown that Target is (properly, by Theorem 4.92) contained in the class of context-sensitive languages.
Thus, Target lies properly between the context-free and the context-sensitive languages. We finally note that, in a certain sense, Surface ⊆ Target (cf. the similar situation for RECOG and CFL). In fact, let L ⊆ TΣ be in Surface. Let J and K be two new symbols (“standing for [ and ]”). Let ∆ be the ranked alphabet such that ∆0 = Σ ∪ {J, K}, ∆1 = ∆2 = ∆3 = ∅ and ∆k+3 = Σk for k ≥ 1. Let M = (Q, Σ, ∆, R, Qd) be the (deterministic) top-down tree transducer such that Q = Qd = {f} and the rules are the following:

(i) for k ≥ 1 and a ∈ Σk, f[a[x1 · · · xk]] → a[aJf[x1] · · · f[xk]K] is in R;

(ii) for a ∈ Σ0, f[a] → a is in R.

(Note that M is in fact a homomorphism.) It is left to the reader to show that yield(M(L)) = L⟨[ ← J, ] ← K⟩ (as string languages).
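The effect of M followed by yield amounts to printing a tree in its usual bracket notation, with J and K in place of [ and ]. A small Python sketch (again with the assumed (label, children) tuple encoding, which is not from the text) computes yield(M(t)) for a single tree t:

```python
# Sketch: yield(M(t)), i.e. the string notation of the tree t with the
# symbols J, K standing for [ and ]. The (label, children) tree encoding
# is assumed for illustration only.

def encode(t):
    label, children = t
    if not children:
        # a in Sigma_0: rule (ii), f[a] -> a
        return label
    # a in Sigma_k, k >= 1: rule (i) produces a[a J f[x1] ... f[xk] K],
    # whose yield is the string a J ... K
    return label + "J" + "".join(encode(c) for c in children) + "K"

t = ("f", [("a", []), ("g", [("b", [])])])
print(encode(t))  # fJagJbKK
```

Here f[ag[b]] becomes fJagJbKK, i.e. the bracket notation with [ and ] renamed to J and K.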

5 Notes on the literature

In the text there are some references to

[A&U] A.V. Aho and J.D. Ullman, The theory of parsing, translation and compiling, I and II, Prentice-Hall, 1972.

[Sal] A. Salomaa, Formal languages, Academic Press, 1973.

An informal survey of the theory of tree automata and tree transducers (up till 1970) is given by [Thatcher, 1973].

On Section 3

Bottom-up finite tree automata were invented around 1965 independently by [Doner, 1965, 1970] and [Thatcher & Wright, 1968] (and somewhat later by [Pair & Quere, 1968]). The original aim of the theory of tree automata was to apply it to the decision problems of second-order logical theories concerning strings. The connection with context-free languages was established in [Mezei & Wright, 1967] and [Thatcher, 1967], and the idea to give “tree-oriented” proofs for results on strings is expressed in [Thatcher, 1973] and [Rounds, 1970a]. Independently, results concerning parenthesis languages and structural equivalence were obtained by [McNaughton, 1967], [Knuth, 1967], [Ginsburg & Harrison, 1967] and [Paull & Unger, 1968]. Top-down finite tree automata were introduced by [Rabin, 1969] and [Magidor & Moran, 1969], and regular tree grammars by [Brainerd, 1969]. The notion of rule tree languages occurs in one form or another in several places in the literature. Most of the results of Section 3 can be found in the above mentioned papers. Other work on finite tree automata and recognizable tree languages is written down in the following papers:

[Arbib & Give’on, 1968], automata on acyclic graphs, category theory;
[Brainerd, 1968], state minimalization of finite tree automata;
[Costich, 1972], closure properties of RECOG;
[Eilenberg & Wright, 1967], category theoretic formulation of fta;
[Ito & Ando, 1974], axioms for regular tree expressions;
[Maibaum, 1972, 1974], tree automata on “many-sorted” trees;


[Ricci, 1973], decomposition of fta;
[Takahashi, 1972, 1973], several results;
[Yeh, 1971], generalization of “semigroup of fa” to fta.

Remark: The notion of substitution (tree concatenation) can be formalized algebraically as in [Eilenberg & Wright, 1967], [Goguen & Thatcher, 1974], [Thatcher, 1970] and [Yeh, 1971].

Generalizations of finite automata are the following:

– finite automata on derivation graphs of type 0 grammars, [Benson, 1970], [Buttelmann, 1971], [Hart, 1974];
– probabilistic tree automata, [Ellis, 1970], [Magidor & Moran, 1970];
– automata and grammars for infinite trees, [Rabin, 1969], [Goguen & Thatcher, 1974], [Engelfriet, 1972];
– recognition of subsets of an arbitrary algebra (rather than the algebra of trees), [Mezei & Wright, 1967], [Shephard, 1969].

There are no real open problems in finite tree automata theory. It is only a question of (i) how far one wants to go in generalizing finite automata theory to trees (for instance, decomposition theory, theory of incomplete sequential machines, noncounting regular languages, etc.), and (ii) which results on context-free languages one can prove via trees (for instance, Greibach normal form, Parikh’s theorem, etc.).

On Section 4

For the literature on syntax-directed translation, see [A&U]. The notion of “generalized syntax-directed translation” is defined in [Aho & Ullman, 1971]. The top-down tree transducer was introduced in [Rounds, 1968], [Thatcher, 1970] and [Rounds, 1970b] as a model for syntax-directed translation and transformational grammars. The bottom-up tree transducer was introduced in [Thatcher, 1973]. The notion of a tree rewriting system occurs in one form or another in [Brainerd, 1969], [Rosen, 1971, 1973], [Engelfriet, 1971] and [Maibaum, 1974]. Most results of Section 4 can be found in [Rounds, 1970b], [Thatcher, 1970], [Rosen, 1971], [Engelfriet, 1971], [Baker, 1973] and [Ogden & Rounds, 1972]. Other papers on tree transformations are [Alagić, 1973], [Benson, 1971], [Bertsch, 1973], [Kosaraju, 1973], [Levy & Joshi, 1973], [Martin & Vere, 1970] and [Rounds, 1973]. We mention the following problems concerning tree transformations.


– “Statements such as “similar models have been studied by [x,y,z]” are symptomatic of the disarray in the development of the theory of translation and semantics” (free after [Thatcher, 1973]). Establish the precise relationships between various models of syntax-directed translation, semantics of context-free languages and tree transducers (see [Goguen & Thatcher, 1974]):
(i) Compare existing models of syntax-directed translation with top-down tree transducers, in particular with respect to the classes of translations they define (see Definition 4.41). See [A&U], [Aho & Ullman, 1971], [Thatcher, 1973] and [Martin & Vere, 1970].
(ii) Define a tree transducer corresponding to the semantics definition method of [Knuth, 1968].

– Develop a general theory of operations on tree languages (tree AFL theory). Relate these operations to the yield languages, as illustrated in Section 3. This work was started by [Baker, 1973]. It would perhaps be convenient to consider tree transducers (Q, Σ, ∆, R, Qd) such that R consists of rules t1 → t2 with t1 ∈ Q[TΣ(X)], t2 ∈ T∆[Q(X)] in the top-down case, or t1 ∈ TΣ(Q[X]), t2 ∈ Q[T∆(X)] in the bottom-up case.

– Consider surface tree languages and target languages more carefully. Prove that the class T-Target is incomparable with the class of indexed languages. Prove that the classes T^k-Surface form a proper hierarchy (see [Ogden & Rounds, 1972] and [Baker, 1973]). Prove that the classes T^k-Target form a proper hierarchy (then you have also proved the previous one). Prove that DT-Target ⊊ T-Target. Is it possible to obtain each target language by a sequence of nondeleting (and nonerasing) tree transducers? Etc.

– Consider the complexity of target languages and translations (see [Aho & Ullman, 1971], [Baker, 1973] and [Rounds, 1973]).

– What is the practical use of tree transducers? (see [A&U], [de Remer, 1974]).

Other subjects

We mention finally the following subjects in tree theory.

– Context-free tree grammars, [Fischer, 1968], [Rounds, 1969, 1970a, 1970b], [Downey, 1974], [Maibaum, 1974]. Let us explain briefly the notion of a context-free tree grammar. Consider a system G = (N, Σ, R, S), where N is a ranked alphabet of nonterminals, Σ is a ranked alphabet of terminals, V := N ∪ Σ, S ∈ N0 is the initial nonterminal, and R is a finite set of rules of one of the forms A → t with A ∈ N0 and t ∈ TV, or A[x1 · · · xk] → t with A ∈ Nk and t ∈ TV(Xk). G is considered as a tree rewriting system on TV. If we let all variables in the rules range over TV then G is called a context-free tree grammar. If we let all variables range over TΣ, then G is called a bottom-up (or inside-out) context-free tree grammar. The language generated by G is L(G) = {t ∈ TΣ | S ⇒∗ t}. In general the languages generated by G under the above two interpretations differ (consider for instance the grammar S → F[A], F[x1] → f[x1 x1], A → a, A → b). Thus restriction to bottom-up (≈ right-most in the string case) generation gives another language. On the other hand, restriction to top-down (≈ left-most) generation can be done without changing the language. The yield of the class of context-free tree languages is equal to the class of indexed languages. The yield of the class of bottom-up context-free tree languages is called IO. These two classes are incomparable ([Fischer, 1968]). How do these classes compare with the target languages? Is it possible to iterate the CFL → RECOG procedure and obtain results about (bottom-up) context-free tree languages from regular tree languages (of a “higher order”) (see [Maibaum, 1974])? Is there any sense in considering pushdown tree automata?

– General computability on trees: [Rus, 1967], [Mahn, 1969], the Vienna method.

– Tree walking automata (at each moment of time the finite automaton is at one node of the tree; depending on its state and the label of the node it goes to the father node or to one of the son nodes): [Aho & Ullman, 1971], [Martin & Vere, 1970].

– Tree adjunct grammars (another, linguistically motivated, way of generating tree languages): [Joshi & Levy & Takahashi, 1973].

– Lindenmayer tree systems (parallel rewriting): [Čulik, 1974], [Čulik & Maibaum, 1974], [Engelfriet, 1974], [Szilard, 1974].
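The example grammar mentioned above, S → F[A], F[x1] → f[x1 x1], A → a, A → b, makes the difference between the two interpretations concrete. The following Python sketch (with an illustrative nested-tuple encoding of trees, not part of the text) enumerates both generated languages:

```python
# Sketch: the two languages of the example grammar
#   S -> F[A],  F[x1] -> f[x1 x1],  A -> a,  A -> b.
# A tree f[s t] is encoded as ('f', s, t); the leaves as 'a' and 'b'.

A_TERMINALS = ["a", "b"]

def unrestricted_language():
    # Variables range over T_V: S => F[A] => f[A A], after which the two
    # occurrences of A are rewritten independently.
    return {("f", x, y) for x in A_TERMINALS for y in A_TERMINALS}

def bottom_up_language():
    # Variables range over T_Sigma (inside-out): A must be rewritten to a
    # terminal before F copies it, so both copies agree.
    return {("f", x, x) for x in A_TERMINALS}

print(len(unrestricted_language()), len(bottom_up_language()))  # 4 2
```

The bottom-up language {f[aa], f[bb]} is strictly contained in the unrestricted one, which also contains f[ab] and f[ba].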

References

A.V. Aho and J.D. Ullman, 1971. Translations on a context-free grammar, Inf. & Control 19, 439-475.

S. Alagić, 1973. Natural state transformations, Techn. Report 73B-2, Univ. of Massachusetts at Amherst.

M.A. Arbib and Y. Give’on, 1968. Algebra automata, I & II, Inf. & Control 12, 331-370.

B.S. Baker, 1973. Tree transductions and families of tree languages, Report TR-9-73, Harvard University (Abstract in: 5th Theory of Computing, 200-206).

D.B. Benson, 1970. Syntax and semantics: a categorical view, Inf. & Control 17, 145-160.


D.B. Benson, 1971. Semantic preserving translations, Working paper, Washington State University, Washington.

E. Bertsch, 1973. Some considerations about classes of mappings between context-free derivation systems, Lecture Notes in Computer Science 2, 278-283.

D. Bjørner, 1972. Finite state tree computations, IBM Report RJ 1053.

W.S. Brainerd, 1968. The minimalization of tree automata, Inf. & Control 13, 484-491.

W.S. Brainerd, 1969. Tree generating regular systems, Inf. & Control 14, 217-231.

W.S. Brainerd, 1969a. Semi-Thue systems and representation of trees, 10th SWAT, 240-244.

H.W. Buttelmann, 1971. On generalized finite automata and unrestricted generative grammars, 3rd Theory of Computing, 63-77.

O.L. Costich, 1972. A Medvedev characterization of sets recognized by generalized finite automata, Math. Syst. Th. 6, 263-267.

S.C. Crespi Reghizzi and P. Della Vigna, 1973. Approximation of phrase markers by regular sets, Automata, Languages and Programming (ed. M. Nivat), North-Holland Publ. Co., 367-376.

K. Čulik II, 1974. Structured OL-Systems, L-Systems (eds. Rozenberg & Salomaa), Lecture Notes in Computer Science 15, 216-229.

K. Čulik II and T.S.E. Maibaum, 1974. Parallel rewriting systems on terms, Automata, Languages and Programming (ed. Loeckx), Lecture Notes in Computer Science 14, 495-511.

J. Doner, 1970. Tree acceptors and some of their applications, J. Comp. Syst. Sci. 4, 406-451 (announced in Notices Am. Math. Soc. 12(1965), 819 as Decidability of the weak second-order theory of two successors).

P.J. Downey, 1974. Formal languages and recursion schemes, Report TR 16-74, Harvard University.

S. Eilenberg and J.B. Wright, 1967. Automata in general algebras, Inf. & Control 11, 452-470.

C.A. Ellis, 1970. Probabilistic tree automata, 2nd Theory of Computing, 198-205.

J. Engelfriet, 1971. Bottom-up and top-down tree transformations – a comparison, Memorandum 19, T.H. Twente, Holland (to be publ. in Math. Syst. Th.).

J. Engelfriet, 1972. A note on infinite trees, Inf. Proc. Letters 1, 229-232.


J. Engelfriet, 1974. Surface tree languages and parallel derivation trees, Daimi Report PB-44, Aarhus University, Denmark.

M.J. Fischer, 1968. Grammars with macro-like productions, 9th SWAT, 131-142 (Doctoral dissertation, Harvard University).

S. Ginsburg and M.A. Harrison, 1967. Bracketed context-free languages, J. Comp. Syst. Sci. 1, 1-23.

J.A. Goguen and J.W. Thatcher, 1974. Initial algebra semantics, 15th SWAT.

J.M. Hart, 1974. Acceptors for the derivation languages of phrase-structure grammars, Inf. & Control 25, 75-92.

T. Ito and S. Ando, 1974. A complete axiom system of super-regular expressions, Proc. IFIP Congress 74, 661-665.

A.K. Joshi, L.S. Levy and M. Takahashi, 1973. A tree generating system, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 453-465.

D.E. Knuth, 1967. A characterization of parenthesis languages, Inf. & Control 11, 269-289.

D.E. Knuth, 1968. Semantics of context-free languages, Math. Syst. Th. 2, 127-145 (see also: correction in Math. Syst. Th. 5(1971), 95-96, and “Examples of formal semantics” in Lecture Notes in Mathematics 188 (ed. Engeler)).

S. Kosaraju, 1973. Context-sensitiveness of translational languages, 7th Princeton Conf. on Inf. Sci. and Syst.

L.S. Levy and A.K. Joshi, 1973. Some results in tree automata, Math. Syst. Th. 6, 334-342.

M. Magidor and G. Moran, 1969. Finite automata over finite trees, Techn. Report No. 30, Hebrew University, Jerusalem.

M. Magidor and G. Moran, 1970. Probabilistic tree automata and context-free languages, Israel J. Math. 8, 340-348.

F.K. Mahn, 1969. Primitiv-rekursive Funktionen auf Termmengen, Archiv f. Math. Logik und Grundlagenforschung 12, 54-65.

T.S.E. Maibaum, 1972. The characterization of the derivation trees of context-free sets of terms as regular sets, 13th SWAT, 224-230.

T.S.E. Maibaum, 1974. A generalized approach to formal languages, J. Comp. Syst. Sci. 8, 409-439.

D.F. Martin and S.A. Vere, 1970. On syntax-directed transduction and tree-transducers, 2nd Theory of Computing, 129-135.


R. McNaughton, 1967. Parenthesis grammars, Journal of the ACM 14, 490-500.

J. Mezei and J.B. Wright, 1967. Algebraic automata and context-free sets, Inf. & Control 11, 3-29.

D.E. Muller, 1968. Use of multiple index matrices in generalized automata theory, 9th SWAT, 395-404.

W.F. Ogden and W.C. Rounds, 1972. Composition of n tree transducers, 4th Theory of Computing.

C. Pair and A. Quere, 1968. Définition et étude des bilangages réguliers, Inf. & Control 13, 565-593.

M.C. Paull and S.H. Unger, 1968. Structural equivalence of context-free grammars, J. Comp. Syst. Sci. 2, 427-463.

M.O. Rabin, 1969. Decidability of second-order theories and automata on infinite trees, Transactions of the Am. Math. Soc. 141, 1-35.

F.L. de Remer, 1974. Transformational grammars for languages and compilers, Lecture Notes in Computer Science 21.

G. Ricci, 1973. Cascades of tree-automata and computations in universal algebras, Math. Syst. Th. 7, 201-218.

B.K. Rosen, 1971. Subtree replacement systems, Ph. D. Thesis, Harvard University.

B.K. Rosen, 1973. Tree-manipulating systems and Church-Rosser theorems, Journal of the ACM 20, 160-188.

W.C. Rounds, 1968. Trees, transducers and transformations, Ph. D. Dissertation, Stanford University.

W.C. Rounds, 1969. Context-free grammars on trees, 1st Theory of Computing, 143-148.

W.C. Rounds, 1970a. Tree-oriented proofs of some theorems on context-free and indexed languages, 2nd Theory of Computing, 109-116.

W.C. Rounds, 1970b. Mappings and grammars on trees, Math. Syst. Th. 4, 257-287.

W.C. Rounds, 1973. Complexity of recognition in intermediate-level languages, 14th SWAT, 145-158.

T. Rus, 1967. Some observations concerning the application of the electronic computers in order to solve nonarithmetical problems, Mathematica 9, 343-360.

C.D. Shephard, 1969. Languages in general algebras, 1st Theory of Computing, 155-163.

A.L. Szilard, 1974. Ω-OL Systems, L-Systems (eds. Rozenberg and Salomaa), Lecture Notes in Computer Science 15, 258-291.


M. Takahashi, 1972. Regular sets of strings, trees and W-structures, Dissertation, University of Pennsylvania.

M. Takahashi, 1973. Primitive transformations of regular sets and recognizable sets, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 475-480.

J.W. Thatcher, 1967. Characterizing derivation trees of context-free grammars through a generalization of finite automata theory, J. Comp. Syst. Sci. 1, 317-322.

J.W. Thatcher, 1970. Generalized² sequential machine maps, J. Comp. Syst. Sci. 4, 339-367 (also IBM Report RC 2466, also published in 1st Theory of Computing as “Transformations and translations from the point of view of generalized finite automata theory”).

J.W. Thatcher, 1973. Tree automata: an informal survey, Currents in the Theory of Computing (ed. Aho), Prentice-Hall, 143-172 (also published in 4th Princeton Conf. on Inf. Sci. and Systems, 263-276, as “There’s a lot more to finite automata theory than you would have thought”).

J.W. Thatcher and J.B. Wright, 1968. Generalized finite automata theory with an application to a decision problem of second-order logic, Math. Syst. Th. 2, 57-81.

R. Turner, 1973. An infinite hierarchy of term languages – an approach to mathematical complexity, Automata, Languages and Programming (ed. Nivat), North-Holland Publ. Co., 593-608.

R.T. Yeh, 1971. Some structural properties of generalized automata and algebras, Math. Syst. Th. 5, 306-318.
