Tree-adjoining grammar (TAG) and combinatory categorial grammar (CCG) are two well-established mildly context-sensitive grammar formalisms that are known to have the same expressive power on strings (i.e., generate the same class of string languages). It is demonstrated that their expressive power on trees also essentially coincides. In fact, CCG without lexicon entries for the empty string and with only first-order rules of degree at most 2 suffices for the full expressive power.

## 1 Introduction

Combinatory categorial grammar (CCG) (Steedman, 2000; Steedman and Baldridge, 2011) is one of several grammar formalisms that were introduced as an extension of context-free grammars. In particular, CCG extends the classical categorial grammar (Bar-Hillel et al., 1960), which has the same expressivity as context-free grammar, by rules that are inspired by combinatory logic (Curry et al., 1958). CCG is a mildly context-sensitive grammar formalism (Joshi, 1985). Mildly context-sensitive grammar formalisms are efficiently parsable (i.e., in polynomial time) and have expressivity beyond the context-free languages. They are able to express a limited amount of cross-serial dependencies and have the constant growth property. Because of these features and its notion of syntactic categories, which is quite intuitive for natural languages, CCG has become widely applied in computational linguistics (Steedman, 2000). Further, it can be enhanced with semantics through the lambda calculus.

CCG is based on a lexicon and a rule system. The lexicon assigns syntactic categories to the symbols of an input string and the rule system describes how neighboring categories can be combined to new categories. Each category has a target, which is similar to the return type of a function, and optionally, a number of arguments. Different from functions, each argument has a directionality that indicates if it is expected on the left or the right side. If repeated combination of categories leads to a (binary) derivation tree that comprises all input symbols and is rooted in an initial category, then the input string is accepted.

When defining CCG, there are many degrees of freedom yielding a number of different variants (Steedman, 2000; Baldridge, 2002; Steedman and Baldridge, 2011; Kuhlmann et al., 2015). This is a consequence of the linguistically motivated need to easily express specific structures that have been identified in a particular theory of syntax for a given natural language. However, we and others (Kuhlmann et al., 2015) are interested in the expressive power of CCGs as generators of formal languages, since this allows us to disentangle the confusion of subtly different formalisms and identify the principal structures expressible by a common core of the formalisms. As linguistic structure calls for a representation that goes beyond strings, we aim for a characterization of expressive power in terms of the generated trees.

The most famous result on the expressive power of CCG is by Vijay-Shanker and Weir (1994), showing that tree-adjoining grammar (TAG), linear-indexed grammar (LIG), head grammar (HG), and CCG generate the same string languages. An equivalent automaton model is the embedded push-down automaton (Vijay-Shanker, 1988). In the definition of CCG used by Vijay-Shanker and Weir (1994), the lexicon allows *ε*-entries, which assign syntactic categories to the empty string *ε*. Their rule system restricts rules to specific categories and limits the rule degree. CCG with unbounded rule degree are Turing-complete (Kuhlmann et al., 2018). Prefix-closed CCG without target restrictions, in which the rules obey special closure properties, are less powerful. This even holds for multimodal CCGs (Kuhlmann et al., 2010, 2015), which allow many types of directionality indicators (i.e., slashes).

When going beyond the level of string languages, there exist different notions of strong generative power. We consider two formalisms as strongly equivalent if their generated derivation tree languages coincide modulo relabelings. For example, the well-known local and regular tree grammars (Gécseg and Steinby, 1997) are strongly equivalent. On the other hand, Hockenmaier and Young (2008) regard two formalisms as strongly equivalent if they capture the same sets of dependencies. Then there exist specific scrambling cases whose dependencies can be expressed by their CCG, but not by Lexicalized TAG (LTAG). Their CCG are syntactically more expressive than ours and allow type-raising, whereas the strong generative capacity (in our sense) of LTAG is strictly smaller than that of TAG (Kuhlmann and Satta, 2012). The dependencies expressed by CCG without rule restrictions and TAG are shown to be incomparable by Koller and Kuhlmann (2009).

Returning to our notion of strong generative capacity, Kuhlmann et al. (2019) investigated the tree-generative capacity of CCG without *ε*-entries. The generated trees are always binary. CCG with application and first-degree composition rules generate exactly the regular tree languages (Gécseg and Steinby, 1997). Without the composition rules, only a proper subset can be generated. The languages of CCG rule trees (i.e., trees labeled by applied rules instead of categories) with bounded rule degree can also be generated by simple monadic context-free tree grammar (sCFTG).

For the converse direction, we show that the tree languages generated by sCFTG can also be generated by CCG, which shows strong equivalence. This answers several open questions. Since sCFTG and TAG are strongly equivalent (Kepser and Rogers, 2011), our result also shows strong equivalence of CCG and TAG. In contrast to the construction of Vijay-Shanker and Weir (1994), which relies heavily on *ε*-entries, our construction avoids them and shows that they do not increase the expressive power of CCG. Additionally, we only use rules up to degree 2 and first-order categories (i.e., arguments are atomic), which shows that larger rule degree or higher-order categories do not increase the expressive power.

Our construction proceeds roughly as follows. We begin with a spine grammar, which is a variant of sCFTG that is also strongly equivalent to TAG. We encode its spines using a context-free grammar, which in turn can be represented by a special variant of push-down automata. Finally, the runs of the push-down automaton are simulated by a CCG such that the stack operations of the automaton are realized by adding and removing arguments of the categories.

## 2 Preliminaries

The nonnegative integers are ℕ. For every *k* ∈ℕ, we let [*k*] = {*i* ∈ℕ∣1 ≤ *i* ≤ *k*}. The set Σ^{*} contains all strings over the finite set Σ including the empty string *ε*. We let Σ^{ +} = Σ^{*}∖{*ε*}. The length of *w* ∈ Σ^{*} is $|w|$, and concatenation is written as juxtaposition. The *prefixes* Pref(*w*) of a string *w* ∈ Σ^{*} are {*u* ∈ Σ^{*}∣∃*v* ∈ Σ^{*}: *w* = *uv*}. A *string language* is a subset *L* ⊆ Σ^{*}. Given a relation ⇒⊆ *S*^{2}, we let ⇒^{*} be the reflexive, transitive closure of ⇒.

### 2.1 Tree Languages

We work with *ranked sets* Σ = Σ_{0} ∪ Σ_{1} ∪ Σ_{2}. If Σ is an alphabet, then it is a ranked alphabet. For every *k* ∈ {0, 1, 2}, we say that symbol *a* ∈ Σ_{k} has *rank k*. We write $T_{\Sigma_2,\Sigma_1}(\Sigma_0)$ for the set of all trees over Σ, which is the smallest set *T* such that *c*(*t*_{1}, …, *t*_{k}) ∈ *T* for all *k* ∈ {0, 1, 2}, *c* ∈ Σ_{k}, and *t*_{1}, …, *t*_{k} ∈ *T*. As usual, we write just *a* for leaves *a*() with *a* ∈ Σ_{0}. A *tree language* is a subset $T \subseteq T_{\Sigma_2,\emptyset}(\Sigma_0)$. Let $T = T_{\Sigma_2,\Sigma_1}(\Sigma_0)$. The map $\mathrm{pos}\colon T \to \mathcal{P}_+([2]^*)$ assigns Gorn tree addresses (Gorn, 1965) to a tree, where $\mathcal{P}_+(S)$ is the set of all nonempty subsets of *S*. For all *k* ∈ {0, 1, 2}, *c* ∈ Σ_{k}, and *t*_{1}, …, *t*_{k} ∈ *T*, it is given by $\mathrm{pos}(c(t_1, \ldots, t_k)) = \{\varepsilon\} \cup \{iw \mid i \in [k],\, w \in \mathrm{pos}(t_i)\}$. The set of all leaf positions of *t* is defined as $\mathrm{leaves}(t) = \{w \in \mathrm{pos}(t) \mid w1 \notin \mathrm{pos}(t)\}$. Given a tree *t* ∈ *T* and a position *w* ∈ pos(*t*), we write *t*|_{w} and *t*(*w*) to denote the subtree rooted in *w* and the symbol at *w*, respectively. Additionally, we let *t*[*t′*]_{w} be the tree obtained when replacing the subtree appearing in *t* at position *w* by *t′* ∈ *T*. Finally, let $\mathrm{yield}\colon T \to \Sigma_0^+$ be defined by yield(*a*) = *a* for all *a* ∈ Σ_{0} and $\mathrm{yield}(c(t_1, \ldots, t_k)) = \mathrm{yield}(t_1) \cdots \mathrm{yield}(t_k)$ for all *k* ∈ [2], *c* ∈ Σ_{k}, and *t*_{1}, …, *t*_{k} ∈ *T*.

The special leaf symbol $\Box$ is reserved and represents a hole in a tree. The set $C_{\Sigma_2,\Sigma_1}(\Sigma_0)$ of contexts contains all trees of $T_{\Sigma_2,\Sigma_1}(\Sigma_0 \cup \{\Box\})$ in which $\Box$ occurs exactly once. We write $\mathrm{pos}_\Box(C)$ to denote the unique position of $\Box$ in the context $C \in C_{\Sigma_2,\Sigma_1}(\Sigma_0)$. Moreover, given *t* ∈ *T* we simply write *C*[*t*] instead of $C[t]_{\mathrm{pos}_\Box(C)}$.

A tuple (*ρ*_{0}, *ρ*_{1}, *ρ*_{2}) is called a *relabeling* if *ρ*_{k}: Σ_{k} → *Δ*_{k} for all *k* ∈ {0, 1, 2} and a ranked set *Δ*. It induces the map $\rho\colon T \to T_{\Delta_2,\Delta_1}(\Delta_0)$ given by $\rho(c(t_1, \ldots, t_k)) = \rho_k(c)\bigl(\rho(t_1), \ldots, \rho(t_k)\bigr)$ for all *k* ∈ {0, 1, 2}, *c* ∈ Σ_{k}, and *t*_{1}, …, *t*_{k} ∈ *T*.
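To make these notions concrete, here is a small sketch in Python; the representation and all names are ours and not part of the formal development. Trees are nested (label, children) pairs, and pos, leaves, and yield follow the definitions above.

```python
# Minimal sketch (our own representation): a tree is a (label, children)
# pair; positions are Gorn addresses written as strings over "1" and "2".

def pos(t):
    """All Gorn addresses of t; the root has address ''."""
    _, children = t
    addresses = {""}
    for i, child in enumerate(children, start=1):
        addresses |= {str(i) + w for w in pos(child)}
    return addresses

def leaves(t):
    """Leaf positions: addresses w such that w1 is not a position."""
    p = pos(t)
    return {w for w in p if w + "1" not in p}

def yield_(t):
    """Left-to-right sequence of the nullary symbols of t."""
    label, children = t
    if not children:
        return (label,)
    return tuple(s for c in children for s in yield_(c))

# t = c(a, c(a, b))
t = ("c", [("a", []), ("c", [("a", []), ("b", [])])])
```

For this tree, pos(t) contains the addresses "", "1", "2", "21", and "22", of which all but "" and "2" are leaves.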

### 2.2 Combinatory Categorial Grammar

In the following, we give a short introduction to CCG. Given an alphabet *A* of *atoms* or *atomic categories* and the set *D* = {/, ∖} of *slashes* indicating directionality, the set of *categories* is defined as $C(A) = T_{D,\emptyset}(A)$. We usually write the categories in infix notation and the slashes are left-associative by convention, so each category takes the form *c* = *a* |_{1}*c*_{1} ⋯ |_{k}*c*_{k}, where *a* ∈ *A*, |_{i} ∈ *D*, and $c_i \in C(A)$ for all *i* ∈ {1, …, *k*}. The atom *a* is called the *target* of *c* and written as tar(*c*). The slash-category pairs |_{i}*c*_{i} are called *arguments*, and their number *k* is called the *arity* of *c* and denoted by ar(*c*). In addition, we write arg(*c*, *i*) to get the *i*-th argument |_{i}*c*_{i} of *c*. In the sense of trees, the sequence of arguments is a context $\Box\,|_1 c_1 \cdots |_k c_k$. The set of *argument contexts* is denoted by $A(A) \subseteq C_{D,\emptyset}(A)$. We distinguish between two types of categories. In *first-order categories*, all arguments are atomic, whereas in *higher-order categories*, the arguments can have arguments themselves.

A *rule of degree k* with *k* ∈ ℕ has one of the following two forms:

$$ax/c \quad c|_1c_1 \cdots |_kc_k \;\Rightarrow\; ax|_1c_1 \cdots |_kc_k \qquad\qquad c|_1c_1 \cdots |_kc_k \quad ax∖c \;\Rightarrow\; ax|_1c_1 \cdots |_kc_k$$

where *a* ∈ *A*, $c \in C(A) \cup \{y\}$, |_{i} ∈ *D*, and $c_i \in C(A) \cup \{y_i\}$ for all *i* ∈ [*k*]. Here, *y*, *y*_{1}, …, *y*_{k} are category variables that can match any category in $C(A)$, and *x* is an argument context variable that can match any argument context in $A(A)$. The category taking the argument (*ax* | *c* with | ∈ *D*) is called the *primary category*, the one providing it (*c* |_{1}*c*_{1} ⋯ |_{k}*c*_{k}) is called the *secondary category*, and they are combined to an *output category* (*ax* |_{1}*c*_{1} ⋯ |_{k}*c*_{k}). Given a rule *r*, we write sec(*r*) to refer to its secondary category. Rules of degree 0 will be referred to as *application rules*, while rules of higher degree are *composition rules*. We write $R(A)$ for the set of all rules over *A*. A *rule system* is a pair *Π* = (*A*, *R*), where *A* is an alphabet and $R \subseteq R(A)$ is a finite set of rules over *A*. Given a rule *r* ∈ *R*, we obtain a *ground instance* of it by replacing the variables *y*, *y*_{1}, … by concrete categories and the variable *x* by a concrete argument context. The ground instances of *Π* induce a relation $\to_\Pi \subseteq C(A)^2 \times C(A)$, and we write $\frac{c \quad c'}{c''}\,\Pi$ instead of (*c*, *c′*) →_{Π} *c″*. The relation →_{Π} extends to a relation $\Rightarrow_\Pi \subseteq (C(A)^*)^2$ on sequences of categories, given by $\alpha c c' \beta \Rightarrow_\Pi \alpha c'' \beta$ for all $\alpha, \beta \in C(A)^*$ and ground instances (*c*, *c′*) →_{Π} *c″*.

A *combinatory categorial grammar* (CCG) is a tuple $G = (\Sigma, A, R, I, L)$ that consists of an alphabet Σ of *input symbols*, a rule system (*A*, *R*), a set *I* ⊆ *A* of *initial categories*, and a finite relation $L \subseteq \Sigma \times C(A)$ called the *lexicon*. It is called a *k*-*CCG* if each rule *r* ∈ *R* has degree at most *k*, where *k* ∈ ℕ.

The CCG $G$ *generates* the category sequences $C_G \subseteq C(A)^*$ that can be rewritten by $\Rightarrow_\Pi^*$ to a single initial category, and it generates the string language $L(G) \subseteq \Sigma^*$ of all strings $\sigma_1 \cdots \sigma_n$ that have lexicon entries $(\sigma_i, c_i) \in L$ with $c_1 \cdots c_n \in C_G$.

A binary tree *t* labeled by categories is a *derivation tree of* $G$ if $t(w{\cdot}1)\;\, t(w{\cdot}2) \to_{(A,R)} t(w)$ for every *w* ∈ pos(*t*) ∖ leaves(*t*). We denote the set of all derivation trees of $G$ by $D(G)$.

A *category relabeling* $\rho\colon C(A) \to \Delta$ is a relabeling such that *ρ*(*c*) = *ρ*(*c′*) for all categories $c, c' \in C(A)$ with tar(*c*) = tar(*c′*) and $\mathrm{arg}(c, \mathrm{ar}(c)) = \mathrm{arg}(c', \mathrm{ar}(c'))$. The relabeled derivation trees $T_\rho(G) \subseteq T_{\Delta_2,\emptyset}(\Delta_0)$ are given by $T_\rho(G) = \{\rho(t) \mid t \in D(G)\}$. A tree language *T* is *generatable* by $G$ if there is a category relabeling $\rho'\colon C(A) \to \Delta$ such that $T = T_{\rho'}(G)$.

**Example 1.**

Let $G$ be a CCG with input alphabet Σ = {*α*, *β*, *γ*, *δ*}, atoms *A* = {⊥, *a*, *b*, *c*, *d*, *e*}, rule set $R(A, 2)$, and the lexicon ℒ given below, where $R(A, 2)$ is the set of all rules over *A* up to degree 2. Thus, it is a 2-CCG.

In these rules, *x* is an argument context and can thus be replaced by an arbitrary sequence of arguments. Utilizing $x = \Box∖a$ yields the ground instance $\frac{\bot∖a/c \quad c∖a/c}{\bot∖a∖a/c}$, which has primary category *c*_{1} = ⊥∖*a*/*c* and secondary category *c*_{2} = *c*∖*a*/*c*. The latter has target tar(*c*_{2}) = *c* and the two arguments ∖*a* and /*c*, so its arity is ar(*c*_{2}) = 2.

A derivation tree of $G$ is depicted in Figure 1. We start at the bottom with categories taken from the lexicon in accordance with the input symbols. Then neighboring categories are combined until we arrive at the root with initial category ⊥, so the input word is accepted.
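The category bookkeeping of such a combination step can be sketched as follows, using an ad-hoc encoding of categories as (target, [(slash, argument), …]) pairs; all names are ours, and directionality checks are omitted, so this is only the argument manipulation, not a full rule system.

```python
# Sketch of combining a primary with a secondary category (our ad-hoc
# encoding). The primary's last argument |c must match the secondary's
# target; the output keeps the primary's remaining arguments (the
# context x) and appends all arguments of the secondary category.

def combine(primary, secondary):
    target, args = primary
    slash, wanted = args[-1]            # the argument |c being consumed
    sec_target, sec_args = secondary
    assert wanted == (sec_target, [])   # first-order: atomic argument
    return (target, args[:-1] + sec_args)

# the ground instance discussed in Example 1: (⊥∖a/c) (c∖a/c)
primary = ("⊥", [("\\", ("a", [])), ("/", ("c", []))])
secondary = ("c", [("\\", ("a", [])), ("/", ("c", []))])
out = combine(primary, secondary)      # the output category ⊥∖a∖a/c
```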

## 3 Push-down Automata

We start by introducing a Moore variant of push-down automata (Autebert et al., 1997) that is geared towards our needs and still accepts the context-free languages (of strings of length ≥ 2). It will be similar to the push-down Moore machines of Decker et al. (2013). Instead of processing input symbols as part of transitions (as in Mealy machines), Moore machines output a unique input symbol in each state (Fleischner, 1977). For every set Γ, we let Γ^{≤1} = {*ε*}∪ Γ and Γ^{≥2} = {*w* ∈ Γ^{*}∣2 ≤|*w*|} be the sets of strings over Γ of length at most 1 and at least 2, respectively.

**Definition 2.**

A *Moore push-down automaton* (MPDA) $A=(Q,\Sigma ,\Gamma ,\delta ,\tau ,I,F)$ consists of (i) finite sets *Q*, Σ, and Γ of *states*, *input symbols*, and *stack symbols*, respectively, (ii) a set *δ* ⊆ (*Q* × Γ^{≤1} × Γ^{≤1} × *Q*) ∖ (*Q* × Γ × Γ × *Q*) of *transitions*, (iii) an output function *τ* : *Q* → Σ, and (iv) sets *I*,*F* ⊆ *Q* of *initial* and *final states*, respectively.

By the definition of *δ*, in a single step we can either push or pop a single stack symbol or leave the stack unchanged. Note that we explicitly exclude the case where a symbol is popped and another symbol is pushed at the same time. In the following, let $A = (Q, \Sigma, \Gamma, \delta, \tau, I, F)$ be an MPDA. On the set $\mathrm{Conf}_A = Q \times \Gamma^*$ of configurations of $A$, the *move relation* $\vdash_A \subseteq \mathrm{Conf}_A^2$ is given by $\langle q, \beta u \rangle \vdash_A \langle q', \beta v \rangle$ for all transitions $(q, u, v, q') \in \delta$ and all $\beta \in \Gamma^*$.

A configuration ⟨*q*, *α*⟩ is *initial* (respectively, *final*) if *q* ∈ *I* and *α* ∈ Γ (respectively, *q* ∈ *F* and *α* = *ε*). An *accepting run* is a sequence $\xi_0, \ldots, \xi_n \in \mathrm{Conf}_A$ of configurations that are successively related by moves (i.e., $\xi_{i-1} \vdash_A \xi_i$ for all *i* ∈ [*n*]), starts with an initial configuration *ξ*_{0}, and finishes in a final configuration *ξ*_{n}. In other words, we can start in an initial state with an arbitrary single symbol on the stack and finish in a final state with the empty stack, and for each intermediate step there has to exist a transition. The language *accepted* by $A$ contains exactly those strings *w* ∈ Σ^{*} for which there exists an accepting run ⟨*q*_{0}, *α*_{0}⟩, …, ⟨*q*_{n}, *α*_{n}⟩ such that *w* = *τ*(*q*_{0}) ⋯ *τ*(*q*_{n}). Thus, we accept the strings that are output symbol-by-symbol by the states attained during an accepting run. As usual, two MPDA are *equivalent* if they accept the same language. Since no initial configuration is final, each accepting run visits at least two configurations, so we can only accept strings of length at least 2. While we could adjust the model to remove this restriction, the presented version serves our later purposes best.
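A brute-force acceptance test for this automaton model can be sketched as follows; all names are ours. The search terminates because every move emits exactly one output symbol, so runs longer than the input can be pruned.

```python
# Sketch of MPDA acceptance via breadth-first search over configurations.
# Transitions are (q, pop, push, q') with pop/push strings of length <= 1,
# never both nonempty. The top of the stack is the end of the string.

from collections import deque

def accepts(delta, tau, initial, final, stack_syms, word):
    # runs start in <q0, gamma> with a single arbitrary stack symbol
    frontier = deque((q, g, tau[q]) for q in initial for g in stack_syms)
    seen = set()
    while frontier:
        q, stack, out = frontier.popleft()
        if not word.startswith(out) or (q, stack, out) in seen:
            continue
        seen.add((q, stack, out))
        if q in final and stack == "" and out == word:
            return True               # final state, empty stack
        for (p, u, v, p2) in delta:
            if p == q and stack.endswith(u):
                new_stack = stack[:len(stack) - len(u)] + v
                frontier.append((p2, new_stack, out + tau[p2]))
    return False

# toy MPDA for {a^n b^n | n >= 1}: push while outputting a, pop for b
delta = {("qa", "", "X", "qa"), ("qa", "X", "", "qb"), ("qb", "X", "", "qb")}
tau = {"qa": "a", "qb": "b"}
```

Note that single-symbol strings are rejected, in line with the length restriction above.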

**Theorem 3.**

*MPDA accept the context-free languages of strings of length at least 2.*

The MPDA $A$ is *pop-normalized* if there exists a map pop: Γ → *Q* such that *q′* = pop(*γ*) for every transition (*q*,*γ*,*ε*,*q′*) ∈ *δ*. In other words, for each stack symbol *γ* ∈ Γ there is a unique state pop(*γ*) that the MPDA enters whenever *γ* is popped from the stack.

Later on, we will simulate the runs of an MPDA in a CCG such that subsequent configurations are represented by subsequent primary categories. Popping transitions are modeled by removing the last argument of a category. Thus, the target state has to be stored in the previous argument. This argument is added when the according pushing transition is simulated, so at that point we already have to be aware in which state the MPDA will end up after popping the symbol again. This will be explained in more detail in Section 7.

We can easily establish this property by storing a state in each stack symbol. Each pushing transition is replaced by one variant for each state (i.e., we guess a state when pushing), but when a symbol is popped, this is only allowed if the state stored in it coincides with the target state.
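The construction described in this paragraph can be written down directly; the names are ours, and the stack symbols of the new MPDA become (symbol, guessed state) pairs, over which initial configurations then range as well.

```python
# Sketch of pop-normalization: each pushed symbol records a guessed
# return state, and a pop is only kept if its target state matches the
# guess. Afterwards, pop((g, q)) = q holds for every new stack symbol.

def pop_normalize(delta, states):
    new_delta = set()
    for (q, u, v, q2) in delta:
        if v:        # pushing transition: guess the state reached on pop
            new_delta |= {(q, u, (v, g), q2) for g in states}
        elif u:      # popping transition: only into the guessed state
            new_delta.add((q, (u, q2), "", q2))
        else:        # stack untouched
            new_delta.add((q, u, v, q2))
    return new_delta

# toy transitions (same shape as in Definition 2): (q, pop, push, q')
delta = {("qa", "", "X", "qa"), ("qa", "X", "", "qb"), ("qb", "X", "", "qb")}
nd = pop_normalize(delta, {"qa", "qb"})
```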

**Lemma 4.**

*For every MPDA we can construct an equivalent pop-normalized MPDA.*

The next statement shows that we can provide a form of look-ahead on the output. In each new symbol we store the current as well as the next output symbol. Standard techniques can be used to prove the statement. We will briefly sketch why this look-ahead is necessary. Before constructing the CCG, the MPDA will be used to model a spine grammar. The next output symbol of the MPDA corresponds to the label of the parent node along a so-called spine of a tree generated by the spine grammar. From this parent node we can determine the possible labels of its other child. This information will be used in the CCG to control which secondary categories are allowed as neighboring combination partners.

**Lemma 5.**

*For every context-free language* *L* ⊆ Σ^{*} *and* ⊲ ∉ Σ, *the language* Next(*L*) *is context-free, where*

$$\mathrm{Next}(L) = \bigl\{(s_2, s_1)(s_3, s_2) \cdots (s_n, s_{n-1})(\lhd, s_n) \mid s_1 s_2 \cdots s_n \in L\bigr\};$$

*i.e., each symbol stores the next symbol in its first component and the current symbol in its second component, with* ⊲ *marking the end.*

**Corollary 6.**

*For every context-free language* *L* ⊆ Σ^{≥2} *there exists a pop-normalized* MPDA $A$ *such that* $L(A) = \mathrm{Next}(L)$.
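Concretely, the annotation can be sketched as follows; the pair order (next symbol first, current symbol second) is our reading of the Next construction as its pairs reappear in Section 7, and all names are ours.

```python
# Sketch of the look-ahead annotation Next on a single string: each
# symbol is paired with the symbol that follows it, and the last symbol
# is paired with the end marker.

END = "⊲"

def annotate_next(w):
    """String w -> list of (next, current) pairs."""
    return [(w[i + 1] if i + 1 < len(w) else END, w[i])
            for i in range(len(w))]
```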

## 4 Spine Grammars

Now we move on to representations of tree languages. We first recall context-free tree grammars (Rounds, 1969), but only the monadic simple variant (Kepser and Rogers, 2011).

**Definition 7.**

A *simple monadic context-free tree grammar* (sCFTG) is a tuple $G = (N, \Sigma, S, P)$ consisting of (i) disjoint ranked alphabets *N* and Σ of *nonterminal* and *terminal symbols* with *N* = *N*_{1} ∪ *N*_{0} and Σ_{1} = *∅*, (ii) a nullary *start nonterminal* *S* ∈ *N*_{0}, and (iii) a finite set *P* ⊆ *P*_{0} ∪ *P*_{1} of *productions*, where $P_0 = N_0 \times T_{\Sigma_2,N_1}(N_0 \cup \Sigma_0)$ and $P_1 = N_1 \times C_{\Sigma_2,N_1}(N_0 \cup \Sigma_0)$.

We write productions (*n*, *r*) ∈ *P* simply as *n* → *r*. Given $t, u \in T_{\Sigma_2,N_1}(\Sigma_0 \cup N_0)$, we let $t \Rightarrow_G u$ if there exist a production (*n* → *r*) ∈ *P* and a position *w* ∈ pos(*t*) such that (i) *t*|_{w} = *n* and *u* = *t*[*r*]_{w} with *n* ∈ *N*_{0}, or (ii) *t*|_{w} = *n*(*t′*) and *u* = *t*[*r*[*t′*]]_{w} with *n* ∈ *N*_{1} and $t' \in T_{\Sigma_2,N_1}(\Sigma_0 \cup N_0)$. The *tree language* $T(G)$ *generated by* $G$ is $T(G) = \{t \in T_{\Sigma_2,\emptyset}(\Sigma_0) \mid S \Rightarrow_G^* t\}$. An sCFTG $G'$ is *strongly equivalent* to $G$ if $T(G) = T(G')$, and it is *weakly equivalent* to $G$ if $\mathrm{yield}(T(G)) = \mathrm{yield}(T(G'))$.

Spine grammars (Fujiyoshi and Kasai, 2000) are a restriction on simple monadic context-free tree grammars that remain equally expressive by Lemma 5.4 of Fujiyoshi and Kasai (2000) modulo relabelings. Let us clarify this result. Clearly, each spine grammar is itself an sCFTG and for each sCFTG $G$ there exists a spine grammar $G\u2032$ and a relabeling *ρ* such that $T(G)={\rho (t)\u2223t\u2208T(G\u2032)}$. Although sCFTGs are more established, we elect to utilize spine grammars because of their essential notion of spines.

**Definition 8.**

The sCFTG $G$ is a *spine grammar* if there exists a map *d*: Σ_{2} → {1, 2} such that $wi \in \mathrm{Pref}(\mathrm{pos}_\Box(C))$ with *i* = *d*(*C*(*w*)) for every production (*n* → *C*) ∈ *P* with *n* ∈ *N*_{1} and every $w \in \mathrm{Pref}(\mathrm{pos}_\Box(C))$ with *C*(*w*) ∈ Σ_{2}.

Henceforth let $G$ be a spine grammar with map *d*: Σ_{2} → {1, 2}. Consider a production (*n* → *C*) ∈ *P* with *n* ∈ *N*_{1}. The *spine* of *C* is simply the path from the root of *C* to the unique occurrence $\mathrm{pos}_\Box(C)$ of $\Box$. The special feature of a spine grammar is that the symbols along the spine indicate exactly in which direction the spine continues. Since only the binary terminal symbols offer branching, the special feature of spine grammars is the existence of a map *d* that tells us for each binary terminal symbol *σ* ∈ Σ_{2} whether the spine continues to the left, in which case *d*(*σ*) = 1, or to the right, in which case *d*(*σ*) = 2. This map *d*, called *spine direction*, applies to all instances of *σ* in all productions with spines. In the original definition of spine grammars (Fujiyoshi and Kasai, 2000, Definition 3.2), only nonterminal symbols have a spine direction. By creating copies of binary terminal symbols we can show that both variants are equivalent modulo relabelings.

**Definition 9.**

Spine grammar $G$ is in *normal form* if each (*n* → *r*) ∈ *P* is of the form (i) *start:* *r* = *b*(*α*) or *r* = *α* for some *b* ∈ *N*_{1} and *α* ∈ Σ_{0}, (ii) *chain:* $r = b_1(b_2(\Box))$ for some *b*_{1}, *b*_{2} ∈ *N*_{1}, or (iii) *terminal:* $r = \sigma(\Box, a)$ or $r = \sigma(a, \Box)$ for some *σ* ∈ Σ_{2} and *a* ∈ *N*_{0} ∖ {*S*}.

The spinal trees of a nullary nonterminal *n* describe how *n* can be rewritten to a tree *t* that consists of a spine of terminals, where each non-spinal child is a nullary nonterminal. Formally, let *P′* = {(*n* → *r*) ∈ *P* ∣ *n* ∈ *N*_{1}} and let $G'$ be the grammar $G$ with production set *P′*. For every nullary nonterminal *n* ∈ *N*_{0}, the set $I_G(n)$ contains all trees $t \in T_{\Sigma_2,\emptyset}(\Sigma_0 \cup N_0)$ with $n \Rightarrow_G u \Rightarrow_{G'}^* t$ for some *u*. So we perform a single derivation step using the productions of $G$ followed by any number of derivation steps using only productions of $G'$. The elements of $I_G(n)$ are called *spinal trees* for *n* and their *spine generator* is *n*. By a suitable renaming of nonterminals we can always achieve that the spine generator does not occur in any of its spinal trees. The spine grammar $G$ is *normalized* if it is in normal form and $I_G(n) \subseteq T_{\Sigma_2,\emptyset}(\Sigma_0 \cup (N_0 \setminus \{n\}))$ for every nullary nonterminal *n* ∈ *N*_{0}.

The following result is a variant of Theorem 1 of Fujiyoshi and Kasai (2000).

**Theorem 10.**

*For every spine grammar there is a strongly equivalent normalized spine grammar.*

**Example 11.**

Consider the spine grammar $G$ with *N*_{1} = {*t*, *a*, *b*, *c*, *b′*, *e*}, $N_0 = \{s, \bar{a}, \bar{b}, \bar{c}, \bar{e}\}$, Σ_{2} = {*α*_{2}, *β*_{2}, *γ*_{2}, *η*_{2}}, Σ_{0} = {*α*, *β*, *γ*, *δ*}, and *P* as shown below. The yield of its generated tree language is {*α*^{n} *δ* *γ*^{n} *β*^{m} ∣ *n*, *m* ≥ 1}.

## 5 Tree-adjoining Grammars

Before we proceed we will briefly introduce TAG and sketch how a spine grammar is obtained from it. TAG is a mildly context-sensitive grammar formalism that operates on a set of *elementary trees* of which a subset is *initial*. To generate a tree, we start with an initial tree and successively splice elementary trees into nodes using *adjunction* operations. In an adjunction, we select a node, insert a new tree there, and reinsert the original subtree below the selected node at the distinguished and specially marked *foot node* of the inserted tree. We use the *non-strict* variant of TAG, in which the root and foot labels of the inserted tree need not coincide with the label of the replaced node to perform an adjunction. To control at which nodes adjunction is allowed, each node is equipped with two types of constraints. The *selective adjunction* constraint specifies a set of trees that can be adjoined and the Boolean *obligatory adjunction* constraint specifies whether adjunction is mandatory. Only trees without obligatory adjunction constraints are part of the generated tree language.

Figure 4 shows the elementary trees of an example TAG. Only tree 1 is initial and foot nodes are marked by a superscript asterisk ⋅^{*} on the label. Whenever adjunction is forbidden (i.e., empty set as selective adjunction constraint and non-obligatory adjunction), we omit the constraints altogether. Otherwise, the constraints are put next to the label. For example, {2,3}^{ +} indicates that tree 2 or 3 must ( + = obligatory) be adjoined.

We briefly sketch the transformation from TAG to sCFTG by Kepser and Rogers (2011). TAG is a notational variant of footed simple CFTG, in which all variables in right-hand sides of productions appear in order directly below a designated *foot node*. To obtain an sCFTG, the footed simple CFTG is first converted into a spine grammar, where the spine is the path from the root to the foot node, and then brought into normal form using the construction of Fujiyoshi and Kasai (2000). The spine grammar of Example 11 is strongly equivalent to the TAG shown in Figure 4.

## 6 Decomposition into Spines

We proceed with the construction starting from the normalized spine grammar $G$. First, we will construct a context-free grammar (CFG) that captures all information of $G$. It represents the spinal trees (from bottom to top) as strings and enriches the symbols with the spine generator (initialized by start productions and preserved by chain productions) and a non-spinal child (given by terminal productions). The order of these annotations depends on the spine direction of the symbol. The leftmost symbol of the generated strings has only a spine generator annotated since the bottom of the spine has no children. To simplify the notation, we write *n*_{g} for (*n*, *g*) ∈ *N*^{2}, *α*_{n} for (*α*, *n*) ∈ Σ_{0} × *N*, and $\sigma_{n_1n_2}$ for (*σ*, *n*_{1}, *n*_{2}) ∈ Σ_{2} × *N*^{2}.
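Reading off such a spine string from a spinal tree can be sketched as follows; we use our own toy representation of trees as (label, children) pairs and leave the nonterminal annotations out.

```python
# Sketch of reading a spine string bottom-to-top: each spinal node
# contributes one symbol, the lowest node comes first, matching the
# shape (Sigma_0 x N)(Sigma_2 x N^2)^* of the generated strings.

def spine_string(t, d):
    """t: tree as (label, children); d: spine direction per binary label."""
    label, children = t
    if not children:
        return [label]          # the leaf at the bottom of the spine
    i = d[label] - 1            # the spine continues in child d(label)
    return spine_string(children[i], d) + [label]

# labels loosely follow Example 11: d(β2) = 2, d(γ2) = 1
t = ("β2", [("b", []), ("γ2", [("δ", []), ("c", [])])])
d = {"β2": 2, "γ2": 1}
```

For this tree the spine string is δ, γ2, β2, read from the bottom of the spine to its top.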

**Definition 12.**

Let *N* be the nonterminals of the normalized spine grammar $G$. The spines $S(G) = L(G')$ of $G$ are the strings generated by the CFG $G' = (\{\top\} \cup N^2, \Sigma', \top, P')$ with *Σ′* = (Σ_{0} × *N*) ∪ (Σ_{2} × *N*^{2}) and a set *P′* = *P*_{0} ∪ *P*_{1} ∪ *P*_{2} of productions derived from the productions of $G$.

**Example 13.**

Note that each string generated by the CFG belongs to (Σ_{0} × *N*)(Σ_{2}×*N*^{2})^{*}. Next we define how to reassemble those spines to form trees again, which then relabel to the original trees generated by $G$. The operation given in the following definition describes how a string generated by the CFG can be transformed into a tree by attaching subtrees in the non-spinal direction of each symbol, whereby the non-spinal child annotation of the symbol and the spinal annotation of the root of the attached tree have to match.

**Definition 14.**

Let $T \subseteq T_{\Sigma_2 \times N,\emptyset}(\Sigma_0 \times N)$ and *w* ∈ *A* with *A* = (Σ_{0} × *N*)(Σ_{2} × *N*^{2})^{*}. The generator $\mathrm{gen}\colon (\Sigma_0 \times N) \cup (\Sigma_2 \times N^2) \to N$ is the nonterminal in spine direction and is given by $\mathrm{gen}(\alpha_n) = n$ and $\mathrm{gen}(\sigma_{n_1n_2}) = n_{d(\sigma)}$. For every *n* ∈ *N*, let *T*_{n} = {*t* ∈ *T* ∣ gen(*t*(*ε*)) = *n*} be those trees of *T* whose root label has *n* annotated in spinal direction. We define the tree language $\mathrm{att}_T(w) \subseteq T_{\Sigma_2 \times N,\emptyset}(\Sigma_0 \times N)$ recursively by $\mathrm{att}_T(\alpha_n) = \{\alpha_n\}$ for all *α*_{n} ∈ Σ_{0} × *N*, and

$$\mathrm{att}_T(w\,\sigma_{n_1n_2}) = \begin{cases} \{\sigma_{n_1n_2}(t, u) \mid t \in \mathrm{att}_T(w),\; u \in T_{n_2}\} & \text{if } d(\sigma) = 1, \\ \{\sigma_{n_1n_2}(u, t) \mid t \in \mathrm{att}_T(w),\; u \in T_{n_1}\} & \text{if } d(\sigma) = 2, \end{cases}$$

for all *w* ∈ *A* and $\sigma_{n_1n_2} \in \Sigma_2 \times N^2$.

To obtain the tree language defined by $G$, it is necessary to apply this operation recursively on the set of spines.

**Definition 15.**

Let *L* ⊆ (Σ_{0} × *N*)(Σ_{2} × *N*^{2})^{*}. We inductively define the tree language $F(L)$ generated by *L* to be the smallest tree language $F$ such that $\mathrm{att}_F(w) \subseteq F$ for every *w* ∈ *L*.
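Assembling a single tree from one spine can be sketched as follows; the names are ours, the attachment side per binary symbol follows our reading of the prose of Definition 14, and attach stands in for choosing an already-assembled tree with the required spine generator.

```python
# Sketch of building one tree from a spine, read bottom to top. If the
# spine continues left (d = 1), the attached tree becomes the right
# child and carries the non-spinal annotation n2; symmetrically for
# d = 2 with annotation n1.

def assemble(spine, d, attach):
    (alpha, n), *rest = spine           # first symbol: the lowest leaf
    t = ((alpha, n), [])
    for (sigma, n1, n2) in rest:
        if d[sigma] == 1:
            t = ((sigma, n1, n2), [t, attach(n2)])
        else:
            t = ((sigma, n1, n2), [attach(n1), t])
    return t

# stand-in for picking a tree whose root generator is n
attach = lambda n: (("leaf", n), [])
tree = assemble([("α", "a"), ("σ", "a", "b")], {"σ": 1}, attach)
```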

**Example 16.**

The CFG $G'$ of Example 13 generates the set of spines $S(G)$, and $F(S(G))_S$ contains the correctly assembled trees formed from these spines. Figure 3c shows a tree of $F(S(G))_S$ since the generator of the main spine is *S* = *s*, which is stored in spinal direction in the root label $(\alpha_2)_{\bar{a}s}$. We can observe the correspondence of annotations in non-spinal direction and the spine generator of the respective child in the same direction.

Next we prove that $F(S(G))_S$ and $T(G)$ coincide modulo relabeling. This shows that the context-free language $S(G)$ of spines completely describes the tree language $T(G)$ generated by $G$.

**Theorem 17.**

*Let* $G$ *be normalized. Then* $\pi(F(S(G))_S) = T(G)$, *where the relabeling* $\pi\colon (\Sigma_0 \times N) \cup (\Sigma_2 \times N^2) \to \Sigma_0 \cup \Sigma_2$ *is given by* *π*(*α*_{n}) = *α* *and* $\pi(\sigma_{n_1n_2}) = \sigma$ *for all* *α* ∈ Σ_{0}, *σ* ∈ Σ_{2}, *and* *n*, *n*_{1}, *n*_{2} ∈ *N*.

**Corollary 18.**

*There exists a pop-normalized MPDA* $A$ *such that* $L(A) \cup L_1 = \mathrm{Next}(S(G))$, *where* $L_1 = \{w \in \mathrm{Next}(S(G)) \mid |w| = 1\}$. *Moreover*, $F(L(A) \cup L_1)_S$ *and* $T(G)$ *coincide modulo relabeling*.

**Example 19.**

The MPDA constructed in Corollary 18 for the spine grammar $G$ of Example 11 is depicted in Figure 5. Initial states are indicated using a start marker and final states are marked by a double circle. Pushing and popping stack operations are written with downwards and upwards arrows, respectively. The MPDA consists of two components. The bigger one describes the main spine, and the smaller one describes the side spine. The distinction between the three stack symbols is necessary due to pop-normalization. The distinction between *q*_{1} and $q_1'$ (and similar states) is necessary because their previous action distinguishes their produced input symbol since we recognize $\mathrm{Next}(S(G))$. For example, $\tau(q_1) = \bigl((\gamma_2)_{s\bar{c}}, (\gamma_2)_{s\bar{c}}\bigr)$ and $\tau(q_1') = \bigl((\beta_2)_{s\bar{b}}, (\gamma_2)_{s\bar{c}}\bigr)$. Similarly, $\tau(p_1) = (z, z)$ and $\tau(p_1') = (\lhd, z)$, where $z = (\eta_2)_{\bar{e}\bar{b}}$. To completely capture the behavior of $G$, we additionally require the set $L_1 = \{(\lhd, \alpha_{\bar{a}}), (\lhd, \beta_{\bar{b}}), (\lhd, \beta_{\bar{e}}), (\lhd, \gamma_{\bar{c}})\}$, which contains the spines of length 1.

## 7 Constructing the CCG

In this section, let $G = (N, \Sigma, S, P)$ be a normalized spine grammar with spine direction *d*: Σ_{2} → {1, 2} and let $A = (Q, \Delta, \Gamma, \delta, \tau, I, F)$ be the pop-normalized MPDA constructed in Corollary 18 with pop: Γ → *Q*. We note that *Δ* = *Σ′* × *Σ″* with *Σ′* = {⊲} ∪ (Σ_{2} × *N*^{2}) as well as *Σ″* = (Σ_{0} × *N*) ∪ (Σ_{2} × *N*^{2}). Moreover, let ⊥ ∉ *Q* be a special symbol. To provide better access to the components of the MPDA $A$, we define some additional maps.

The spine generator gen: *Q* → *N* is given for every state *q* ∈ *Q* by gen(*q*) = gen(*s*_{2}), where *τ*(*q*) = (*s*_{1}, *s*_{2}) ∈ *Δ*. Since $A$ cannot accept strings of length 1, we have to treat them separately. Let $L_1 = \{w \in \mathrm{Next}(S(G)) \mid |w| = 1\}$ and let gen: *L*_{1} → *N* be given by gen(*w*) = *n* for all *w* = (⊲, *α*_{n}) ∈ *L*_{1}. We extend *τ*: *Q* → *Δ* to *τ′*: (*Q* ∪ *L*_{1}) → *Δ* by *τ′*(*q*) = *τ*(*q*) for all *q* ∈ *Q* and *τ′*(*a*) = *a* for all short strings *a* ∈ *L*_{1}.

Recall that *D* = {/, ∖}. The slash type slash: (*Q* ∖ *F*) → *D* and combining nonterminal comb: (*Q* ∖ *F*) ∪ {⊥} → *N* of a state *q* ∈ *Q* ∖ *F* tell whether the symbol *τ*(*q*) generated by state *q* occurs as the first or second child of its parent symbol and with which spine generator it is combined. Let $\tau(q) = (\sigma_{n_1n_2}, s_2)$ with $\sigma_{n_1n_2} \in \Sigma_2 \times N^2$ and *s*_{2} ∈ *Σ″*. The slash type and the combining nonterminal can be determined from the next symbol $\sigma_{n_1n_2}$. Formally, slash(*q*) = / if *d*(*σ*) = 1 and slash(*q*) = ∖ otherwise. In addition, comb(*q*) = *n*_{3−d(σ)} and comb(⊥) = *S*.

We simulate the accepting runs of $A$ in the spines consisting of primary categories of the CCG. The main idea is that the primary categories on the spine store the current configuration of $A$. This is achieved by adding an additional argument for transitions that push a symbol, whereas for each popping transition, an argument is removed. The rightmost argument stores the current state in the first component and the top of the stack in the second component. The previous arguments store the preceding stack symbols in their second components and, in their first components, the state the automaton returns to when the stack symbol stored in the following argument is popped. To implement the required transformations of consecutive primary categories, the secondary categories need to have a specific structure. This mandates that the categories at the top of a spine (which act as secondary categories unless they belong to the main spine) cannot store their corresponding automaton state in the first component of the last argument as usual, but instead utilize the third component of their target. Thus each argument stores the final state corresponding to its secondary combination partner in the third component. This third component also allows us to decide whether a category is primary: a category is a primary category if and only if the spine generator of the state stored in the first component of the last argument and the spine generator of the state stored in the last component of the target coincide. This is possible since $G$ is normalized, which yields that attaching spines have a spine generator that is different from the spine generator of the spine that they attach to.
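The bookkeeping described here can be sketched on the level of argument lists; all names are ours, pop_of plays the role of the map pop of the pop-normalized MPDA, and an argument is simplified to a (state, stack symbol) pair.

```python
# Sketch of simulating one MPDA transition on the argument list of a
# primary category: pushing appends an argument, popping removes the
# last argument, and the argument below a pushed symbol records the
# unique return state pop_of[symbol] together with the old stack top.

def step(args, transition, pop_of):
    q, u, v, q2 = transition
    state, top = args[-1]                   # current state and stack top
    assert state == q                       # category matches transition
    if v:                                   # push v
        return args[:-1] + [(pop_of[v], top), (q2, v)]
    if u:                                   # pop the top symbol u
        assert top == u and pop_of[u] == q2
        return args[:-1]
    return args[:-1] + [(q2, top)]          # stack untouched: new state

# toy run mirroring a push followed by two pops
pop_of = {"X": "qb"}
a1 = step([("qa", "X")], ("qa", "", "X", "qa"), pop_of)
a2 = step(a1, ("qa", "X", "", "qb"), pop_of)
a3 = step(a2, ("qb", "X", "", "qb"), pop_of)
```

After the pop, the now-last argument already carries the correct return state in its first component, which is exactly why pop-normalization is needed.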

**Definition 20.**

We define the CCG $G_{A,L_1} = (\Delta_0, A, R, I', \mathcal{L})$ as follows:

Let *A* = {(*q*,*γ*,*f*) ∈ *A′*∣gen(*f*) = comb(*q*)} with *A′* = (*Q* ∪{⊥}) × Γ × (*F* ∪ *L*_{1}). We use *a*_{i} to refer to the *i*-th component of an atom *a* ∈ *A*. Additionally, let *I′* = {(⊥,*ε*,*f*) ∈ *A*∣gen(*f*) = *S*}.

The rule system *R* implements the transitions of $A$; each of its rules combines a primary category *ax* | *b* with a secondary category, and the primary category always needs to fulfill gen(*a*_{3}) = gen(*b*_{1}). A category *c* ∈ *C*(*A*) is *well-formed* if | = slash(*b*_{1}) and *b*_{1} ∈ *Q* for every *i* ∈ [ar(*c*)] with |*b* = arg(*c*, *i*). Let $C_{\mathrm{wf}} = \{c \in C(A) \mid c \text{ well-formed}\}$ be the set of well-formed categories. Clearly *I′* ⊆ *C*_{wf}. In addition, we introduce sets $\top_{L_1}$ and $\top_A$ of top-of-spine categories derived from the short strings of *L*_{1} and the strings accepted by $A$, respectively. Finally, the lexicon $\mathcal{L}$ assigns categories to every symbol *α* ∈ *Δ*_{0} = *Σ′* × (Σ_{0} × *N*).
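The well-formedness condition can be checked mechanically. The following minimal sketch assumes a category is encoded as a target atom plus a list of (slash, atom) pairs, an atom is a triple, and slash is given as a finite map; all names are illustrative.

```python
# Sketch of the well-formedness check: every argument |b of the category
# must satisfy | = slash(b_1) and b_1 in Q (b_1 is the atom's first component).

def is_well_formed(category, slash_of, states):
    _target, args = category
    for sl, atom in args:
        b1 = atom[0]
        if b1 not in states or sl != slash_of[b1]:
            return False
    return True
```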

Each atom of *A* consists of three components. The first component stores the current state of $A$ (or the special symbol ⊥), the second component stores the current symbol at the top of the stack, and the third component stores the final state corresponding to the combining category of the attaching side spine. With this intuition, the rule system directly implements the transitions of $A$.

The lexicon assigns categories to symbols that can label leaves, so these symbols are taken from the nullary terminal symbols. The assigned categories consist of a category that appears at the top of a spine and an additional argument for the initial state of an accepting run. The spines of length 1 are translated directly to secondary categories or initial categories.

Let us make a few general observations that hold for all the categories that appear in derivation trees of $G_{A,L_1}$: (i) All categories are well-formed. This follows from the fact that only well-formed categories occur in the lexicon and that all categories in the derivation trees consist of atoms and arguments that were already present in the lexicon. (ii) All primary categories *ax* | *b* obey gen(*a*_{3}) = gen(*b*_{1}). This is directly required by the rule system.

Finally, we describe how to relabel the derivation trees $D(G_{A,L_1})$ of the CCG $G_{A,L_1}$, which uses categories built from the input symbols of the MPDA $A$. Note that only well-formed categories occur in derivation trees. Primary and non-primary categories are relabeled differently. The relabeling *ρ*: *C*_{wf} → *Δ* is defined for every *c* ∈ *C*_{wf} by *ρ*(*ax* | *b*) = *τ′*(*b*_{1}) for all primary categories *ax* | *b* ∈ *C*_{wf}, i.e., those with gen(*a*_{3}) = gen(*b*_{1}), and by *ρ*(*ax*) = *τ′*(*a*_{3}) for all initial and secondary categories *ax* ∈ *C*_{wf}.
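The case distinction in *ρ* can be sketched as follows, assuming a category is encoded as a pair of a target atom and a list of (slash, atom) arguments, and that gen and *τ′* are given as finite maps; the encoding and all names are illustrative.

```python
# Sketch of the relabeling rho: a category is primary iff the spine
# generators of the target's third component and of the last argument's
# first component coincide.

def rho(category, gen, tau_prime):
    target, args = category
    if args:
        last = args[-1][1]                  # atom b of the last argument
        if gen[target[2]] == gen[last[0]]:  # primary: generators coincide
            return tau_prime[last[0]]       # relabel via the last argument
    return tau_prime[target[2]]             # initial/secondary: via the target
```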

The following property requires that the spine grammar $G$ is normalized, so a spine never has the same spine generator as its attached spines.

**Lemma 21.**

*For all secondary categories* *ax* | *b* *we have* gen(*a*_{3}) ≠ gen(*b*_{1}).

We are now ready to describe the general form of primary spines of $G_{A,L_1}$. Given a primary spine *c*_{0}…*c*_{n} read from lexicon entry towards the root with *n* ≥ 1, we know that it starts with a lexicon entry *c*_{0} = *ax* | *b* ∈ ℒ(*Δ*_{0}) and ends with the non-primary category *ax*, which as such cannot be further modified. Hence each of the categories *c* ∈ {*c*_{0},…,*c*_{n−1}} has the form *ax* |_{1}*b*_{1}… |_{m}*b*_{m} with *m* ≥ 1. Let *b*_{i} = (*q*_{i},*γ*_{i},*f*_{i}) for every *i* ∈ [*m*]. The category *c*_{n} is relabeled to *τ′*(*a*_{3}) and *c* is relabeled to *τ′*(*q*_{m}). Additionally, unless *a*_{1} = ⊥, the first components of all atoms in *ax* have the same spine generator gen(*a*_{1}) and gen(*q*_{1}) = ⋯ = gen(*q*_{m}), but gen(*a*_{1}) ≠ gen(*q*_{1}). Finally, neighboring arguments |_{i−1}*b*_{i−1} |_{i}*b*_{i} in the suffix are coupled such that pop(*γ*_{i}) = *q*_{i−1} for all *i* ∈ [*m*] ∖ {1}. This coupling is introduced in the rules of second degree and preserved by the other rules.
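The coupling invariant on neighboring arguments can be stated compactly. The following sketch assumes the argument atoms of a spine category are given as a list of triples and pop as a finite map; names are illustrative.

```python
# Sketch of the coupling invariant: consecutive arguments b_{i-1}, b_i
# (atoms (q, gamma, f)) must satisfy pop(gamma_i) = q_{i-1}.

def coupled(args, pop):
    return all(pop[args[i][1]] == args[i - 1][0] for i in range(1, len(args)))
```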

Using these observations, it can be proved that the primary spines of $G_{A,L_1}$ are relabeled to strings of $Next(S(G))$ and vice versa. Additionally, spines attach in essentially the same manner in the CCG and using $F$. This yields the following main theorem.

**Theorem 22.**

*Given a spine grammar* $G$, *we can construct a CCG* $G'$ *that can generate* $T(G)$.

**Example 23.**

Figure 6 shows part of the derivation tree of the CCG $G_{A,L_1}$ that corresponds to the tree of Figure 3a, which is generated by the spine grammar $G$ of Example 11. We use the following abbreviations: $\alpha = (\lhd, \alpha\bar{a})$, $\beta = (\lhd, \beta\bar{b})$, and $\gamma = (\lhd, \gamma\bar{c})$. The labeling of the depicted section is *δ γ*_{2}*γ*_{2}*β*_{2} for the main spine and *β η*_{2} for the side spine (see Figure 3a). The corresponding runs of $A$ are $(\langle q_0, \omega\rangle, \langle q_1, \omega\rangle, \langle q_1', \upsilon\omega\rangle, \langle q_2, \upsilon\omega\rangle)$ and $(\langle p_0, \chi\rangle, \langle p_1', \varepsilon\rangle)$.

Let us observe how the transitions of $A$ are simulated by $G_{A,L_1}$. The first transition (*q*_{0}, *ε*, *ε*, *q*_{1}) on the main spine does not modify the stack. It is implemented by replacing the last argument /(*q*_{0}, *ω*, *γ*) by /(*q*_{1}, *ω*, *γ*). The next transition $(q_1, \varepsilon, \upsilon, q_1')$ pushes the symbol *υ* to the stack. The argument /(*q*_{1}, *ω*, *γ*) is thus replaced by $\setminus(q_3, \omega, \alpha)/(q_1', \upsilon, p_1')$. As the stack grows, an additional argument with the new state and stack symbol is added. The previous argument stores pop(*υ*) = *q*_{3} to ensure that we enter the correct state after popping *υ*. It also contains the previous, unchanged stack symbol *ω*. The popping transition $(p_0, \chi, \varepsilon, p_1')$ on the side spine run is realized by removing /(*p*_{0}, *χ*, *β*).
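These three transitions can be replayed concretely on the argument list of the spine category. This is a sketch of the bookkeeping only, with plain-string states and stack symbols and the third components omitted.

```python
# Main spine, initial configuration <q0, omega>
args = [("q0", "w")]

# (q0, eps, eps, q1): the stack is unchanged, only the state is rewritten
args = args[:-1] + [("q1", args[-1][1])]

# (q1, eps, v, q1'): push v; the old argument keeps omega and now records
# pop(v) = q3, the state entered once v is popped again
args = args[:-1] + [("q3", args[-1][1]), ("q1p", "v")]

# Side spine, initial configuration <p0, chi>;
# the popping transition (p0, chi, eps, p1') removes the last argument
side = [("p0", "x")]
side = side[:-1]
```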

The third components are required to relabel the non-primary categories. At the bottom of the main spine, $c_1 = (\bot, \varepsilon, q_3')/(q_0, \omega, \gamma)$ is a primary category because *q*_{0} and $q_3'$ are associated with the same spine generator *s*. Thus, *c*_{1} gets relabeled to *τ′*(*q*_{0}). However, for *c*_{2} = (*q*_{0}, *ω*, *γ*)/(*q*_{1}, *ω*, *γ*) the spine generators of *γ* and of the output of *q*_{1} are different ($\bar{c}$ and *s*). Hence it is a non-primary category and gets relabeled to *γ*.

Concerning the lexicon, *c*_{1} is a lexical category because $(\bot, \varepsilon, q_3') \in \top_A$ can appear at the top of a spine as an initial category with $q_3' \in F$ in its third component, while the appended (*q*_{0}, *ω*, *γ*) represents an initial configuration of $A$. Similarly, *c*_{2} is a well-formed secondary category of a rule and the third component of its target is in *L*_{1}. Therefore, it is an element of $\top_{L_1}$, which is a subset of the lexicon.

Let us illustrate how the attachment of the side spine to the main spine is realized. The lexicon contains $(q_1', \upsilon, p_1')\setminus(q_2, \upsilon, \alpha)\setminus(p_0, \chi, \beta)$, of which the first two atoms are responsible for performing a transition on the main spine. This part cannot be modified because the rule system disallows it. The target stores the final state $p_1'$ of the side spine run in its third component. The appended argument models the initial configuration of the side spine run starting in state *p*_{0} with *χ* on the stack.

For the converse inclusion we utilize Theorem 20 of Kuhlmann et al. (2019). It states that for every CCG $G'$ there exists an sCFTG that generates the rule trees of $G'$. Whereas derivation trees are labeled by categories, *rule trees* are labeled by lexicon entries at leaves and by applied rules (instead of the output category) at inner nodes. Rule trees are a natural encoding of derivation trees using only a finite set of labels. As each rule indicates the target and last argument of its output category, rule trees can be relabeled in the same manner as derivation trees. For completeness’ sake we restate Definition 16 of Kuhlmann et al. (2019).

**Definition 24.**

Let $G = (\Sigma, A, R, I, \mathcal{L})$ be a CCG and $\mathcal{T} = T_{R,\emptyset}(\mathcal{L}(\Sigma))$. A tree $t \in \mathcal{T}$ is a *rule tree* if cat(*t*) ∈ *I*, where the partial map cat: $\mathcal{T} \to C(A)$ is inductively defined by (i) cat(*a*) = *a* for all lexicon entries *a* ∈ ℒ(Σ), (ii) $\mathrm{cat}\bigl(\tfrac{ax/b \;\; by}{axy}(t_1, t_2)\bigr) = azy$ for all trees $t_1, t_2 \in \mathcal{T}$ with cat(*t*_{1}) = *az*/*b* and cat(*t*_{2}) = *by*, and (iii) $\mathrm{cat}\bigl(\tfrac{by \;\; ax\setminus b}{axy}(t_1, t_2)\bigr) = azy$ for all $t_1, t_2 \in \mathcal{T}$ with cat(*t*_{1}) = *by* and cat(*t*_{2}) = *az*∖*b*. The set of all rule trees of $G$ is denoted by $R(G)$.
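The inductive definition of cat can be evaluated bottom-up on a rule tree. The following sketch uses an illustrative encoding of our own: categories are strings over single-character atoms, leaves carry their lexicon entry, and an inner node carries only the rule's slash direction and argument atom, which suffices to recombine the children's categories.

```python
# Sketch of the partial map cat on rule trees: a leaf is a category string;
# an inner node is a tuple (slash, b, t1, t2) for the rule applied there.

def cat(tree):
    if isinstance(tree, str):
        return tree                      # (i) cat(a) = a for lexicon entries
    sl, b, t1, t2 = tree
    if sl == '/':                        # (ii) cat(t1) = az/b, cat(t2) = by
        azb = cat(t1)
        assert azb.endswith('/' + b)
        y = cat(t2)[len(b):]             # strip the atom b, keep the suffix y
        return azb[:-(len(b) + 1)] + y   # result azy
    else:                                # (iii) cat(t1) = by, cat(t2) = az\b
        azb = cat(t2)
        assert azb.endswith('\\' + b)
        y = cat(t1)[len(b):]
        return azb[:-(len(b) + 1)] + y
```

For instance, a forward rule node over the leaves `a/b` and `b\c` evaluates to the category `a\c`.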

We observe that any category relabeling can equivalently be applied to rule trees instead of derivation trees (because a category relabeling only depends on the target *a* and the last argument | *b* of a category *ax* | *b*). This yields the second main theorem.

**Theorem 25.**

*CCGs and sCFTGs are strongly equivalent up to relabeling*.

Kepser and Rogers (2011) proved that TAGs and sCFTGs are strongly equivalent, so TAGs are also strongly equivalent (up to relabeling) to CCGs.

**Corollary 26.**

*CCGs and TAGs are strongly equivalent up to relabeling*.

Clearly, strong equivalence implies weak equivalence (without the relabeling, since the lexicon provides the relabeling). Weak equivalence was famously proven by Vijay-Shanker and Weir (1994), but Theorem 3 of Kuhlmann et al. (2015) shows that the original construction is incorrect. However, Weir (1988) provides an alternative construction and proof. Our contribution provides a stronger form (and proof) of this old equivalence result. It avoids the *ε*-entries on which the original construction heavily relies. An *ε*-entry is a category assigned to the empty string; these interspersed categories form the main building block of the original constructions. Whether these *ε*-entries are necessary (Vijay-Shanker and Weir, 1994) is a natural and important question, which was raised by Kuhlmann et al. (2015). We settle this question and demonstrate that they can be avoided.

**Corollary 27.**

*CCGs and TAGs are weakly equivalent, and CCGs with and without* *ε*-*entries generate the same* (*ε*-*free*) *languages*.

The tree expressive power of CCGs with restricted rule degrees has already been investigated by Kuhlmann et al. (2019). It has been shown that 0-CCGs accept a proper subset of the regular tree languages (Gécseg and Steinby, 1997), whereas 1-CCGs accept exactly the regular tree languages. It remained open whether there is a *k* such that *k*-CCGs and (*k* + 1)-CCGs have the same expressive power. Our construction establishes that 2-CCGs are as expressive as *k*-CCGs for arbitrary *k* ≥ 2. Another consequence of our construction is that first-order categories are sufficient.

**Corollary 28.**

*2-CCGs with first-order categories have the same expressive power as* *k*-*CCGs with* *k* > 2.

## 8 Conclusion

We presented a translation from spine grammar to CCG. Due to the strong equivalence of spine grammar and TAG (Kepser and Rogers, 2011), we can also construct a strongly equivalent CCG for each TAG. Together with the translation from CCG to sCFTG (Kuhlmann et al., 2019), this proves the strong equivalence of TAG and CCG, which means that both formalisms generate the same derivation trees modulo relabelings. Our construction uses CCG rules of degree at most 2, only first-order categories, lexicon entries of arity at most 3, and no *ε*-entries in the lexicon. Such CCGs thus have full expressive power. Avoiding *ε*-entries is particularly interesting because they violate the Principle of Adjacency (Steedman, 2000, p. 54), which is a fundamental linguistic principle underlying CCG and requires that all combining categories correspond to phonologically realized counterparts in the input and are string-adjacent. Their elimination is performed by trimming them from the sCFTG obtained from a CCG with *ε*-entries and translating the trimmed sCFTG back to a CCG using our construction.

Translating CCG to sCFTG (Kuhlmann et al., 2019) yields sCFTGs whose size is exponential in a CCG-specific constant, which depends on the maximal rule degree and the maximal arity of lexicon entries. The increase can be attributed to variables in CCG rules, which need to be properly instantiated. Our construction increases the grammar size only polynomially, which can be verified for each step. Overall, a *k*-CCG can be converted to an equivalent 2-CCG without *ε*-entries in time and space exponential in *k* (and the maximal length of lexicon entries) and polynomial in the size of the grammar.

## Acknowledgments

We would like to thank Mark Steedman and the three anonymous reviewers for their valuable and detailed comments, which greatly helped in improving the comprehensibility of this paper. The work of Lena Katharina Schiffer was funded by the German Research Foundation (DFG) Research Training Group GRK 1763 ‘Quantitative Logics and Automata’.