## Abstract

The universal generation problem for unification grammars is the problem of determining whether a given grammar derives any terminal string with a given feature structure. It is known that the problem is decidable for LFG and PATR grammars if only acyclic feature structures are taken into consideration. In this brief note, we show that the problem is undecidable for cyclic structures. This holds even for grammars that are off-line parsable.

The universal generation problem for unification grammars is the problem of determining for an arbitrary grammar *G* and an arbitrary feature structure *F* whether there exists at least one sentence that *G* derives with *F*. If *F* is acyclic, Wedekind and Kaplan (2012) have shown that the problem is decidable for LFG (Kaplan and Bresnan 1982) and PATR (Shieber et al. 1983) grammars. They prove that the set of strings that a grammar relates to an acyclic feature structure can be described by a context-free grammar. Decidability of the problem then follows because the emptiness problem is decidable for context-free languages. For cyclic feature structures they demonstrated by example that the set of strings that a grammar relates to an input might not be context-free, but they did not further investigate the formal properties of the languages that are in general related to cyclic structures.

In this brief note, we show the undecidability of the universal generation problem by reduction from the undecidable emptiness problem for the intersection of two context-free languages. We provide a proof for LFG- or PATR-style grammars that associate feature structures with trees derived in accordance with a context-free grammar. Our result also applies to other systems such as HPSG (Pollard and Sag 1994) whose formal devices are powerful enough to simulate, albeit indirectly, the effect of context-free derivation.

To state the universal generation problem more formally, recall that a unification grammar *G* defines a binary derivation relation *Δ*_{G} between terminal strings and feature structures, as given in (1).

(1) *Δ*_{G}(*s*, *F*) iff *G* derives terminal string *s* with feature structure *F*

The universal generation problem is then the problem of determining for an arbitrary grammar *G* and an arbitrary feature structure *F* whether {*s* | *Δ*_{G}(*s*, *F*)} is empty or not.

For the reduction of the emptiness problem for the intersection of two context-free languages, we can, without loss of generality, assume that the context-free languages are *ε*-free. These languages can be described by grammars in Chomsky normal form, that is, by context-free grammars *G* = (*N*, *T*, S, *P*) with nonterminal vocabulary *N*, terminal vocabulary *T*, and start symbol S where every rule in *P* is of the form *A* → *B* *C* with *B*, *C* ∈ *N*, or *A* → *a* with *a* ∈ *T*.
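The rule shapes admitted by Chomsky normal form can be made concrete with a small check. This is only an illustrative sketch: the representation of rules as `(lhs, rhs)` pairs, the function name `is_chomsky_normal_form`, and the toy grammar are assumptions introduced here, not part of the construction in this note.

```python
def is_chomsky_normal_form(nonterminals, terminals, start, rules):
    """Check that every rule is A -> B C (B, C nonterminals) or A -> a (a terminal)."""
    if start not in nonterminals:
        return False
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        # A binary rule is encoded as a pair of nonterminals.
        binary = (isinstance(rhs, tuple) and len(rhs) == 2
                  and all(sym in nonterminals for sym in rhs))
        # A lexical rule is encoded as a single terminal symbol.
        lexical = isinstance(rhs, str) and rhs in terminals
        if not (binary or lexical):
            return False
    return True


# Hypothetical toy grammar for {a^n b^n | n >= 1}.
N = {"S", "A", "B", "X"}
T = {"a", "b"}
P = [("S", ("A", "B")), ("S", ("A", "X")), ("X", ("S", "B")),
     ("A", "a"), ("B", "b")]
print(is_chomsky_normal_form(N, T, "S", P))
```

Because neither *ε*-rules nor unit rules pass this check, every derivation of a string of length *n* uses exactly *n* lexical rules and *n* − 1 binary rules.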

For the proof we first define for each context-free grammar *G* in Chomsky normal form two LFG grammars that both derive *L*(*G*) and that associate with each derivable terminal string feature structures (f-structures) that provide slightly different encodings of the derivable string.

Let *G* = (*N*, *T*, S, *P*) be a context-free grammar in Chomsky normal form. A **type 1 string grammar** *String*_{1}(*G*) for *G* is an LFG grammar (*N*, *T*, S, *P*′) whose rule set *P*′ includes for each rule *A* → *B* *C* in *P* a rule of the form (2a) and for each rule *A* → *a* in *P* a rule of the form (2b).

A **type 2 string grammar** *String*_{2}(*G*) for *G* is an LFG grammar (*N*, *T*, S, *P*′) whose rule set *P*′ includes a rule of the form (3a) for each *A* → *B* *C* in *P* and a rule of the form (3b) for each *A* → *a* in *P*.

Figure 1 illustrates a c-structure and the f-structures associated with it by type 1 and type 2 string grammar derivations.^{1} The attributes l, r, b, and e are mnemonic for ‘left’, ‘right’, ‘begin’, and ‘end’, respectively. For later reference, we also depict the constant *root* that we uniformly use to instantiate the ↑ of a derivation that refers to the c-structure root; *root* then labels the f-structure element to which it refers in the minimal model of the f-description. (In the terminology of Kaplan and Bresnan (1982), *root* corresponds to the f-structure variable associated with the c-structure root, usually notated by *f*_{1}.)

Both types of string grammars have in common that they have *G* as their context-free skeleton and that for every string in *L*(*G*), the f-structure for each string grammar encodes both the string itself and also the branching structure of a derivation in *G* that leads to that terminal string. The f-structures derived by the two types of grammars vary only slightly in the labels that they use to encode those properties. An f-structure of a type 2 grammar derivation for a given string shares the ‘begin’ attribute (b) with the f-structure of a corresponding type 1 grammar derivation, but it has distinct ‘left’, ‘right’, and ‘end’ attributes (l', r', e').

Because the derived f-descriptions can never be unsatisfiable (the string grammars do not contain atomic values), the f-structure constraints of the string grammars do not actually filter the language of the context-free grammar. Thus *G* and its string grammars must have the same language *L*(*G*) = *L*(*String*_{1}(*G*)) = *L*(*String*_{2}(*G*)). By induction on the depth of the derivation trees it is also easy to see that the minimal solution of the f-description of a derivation of a terminal string *s* is acyclic and single-rooted, and satisfies (*root* b *s*′) = (*root* e) and (*root* b *s*′) = (*root* e'), respectively, if and only if *s*′ = *s*. That is, these grammars both encode their terminal strings in their respective (*root* b) to (*root* e)/(*root* e') paths.

Before going into the details of the undecidability proof, we first give an outline of the proof idea. For the reduction, we have to construct for two arbitrary *ε*-free context-free languages *L*_{1} and *L*_{2} an LFG grammar *G* and an input structure *F* such that the set of terminal strings that *G* derives with *F* is empty if and only if the intersection of *L*_{1} and *L*_{2} is empty. Because every *ε*-free context-free language is derivable by a context-free grammar in Chomsky normal form, we can make the LFG grammar *G* by combining the productions of *String*_{1}(*G*_{1}) and *String*_{2}(*G*_{2}), for two arbitrary context-free grammars *G*_{1} = (*N*_{1},*T*_{1}, S_{1},*P*_{1}) and *G*_{2} = (*N*_{2},*T*_{2}, S_{2},*P*_{2}) in Chomsky normal form. To avoid undesired interactions between the rules of the two string grammars, we assume that the sets of nonterminals of *G*_{1} and *G*_{2} are disjoint (this is without loss of generality because nonterminals can always be renamed).

We observed already that the string grammars *String*_{1}(*G*_{1}) and *String*_{2}(*G*_{2}) associate with any c-structure derivation of a terminal string *s*_{1} in *G*_{1} and any c-structure derivation of a terminal string *s*_{2} in *G*_{2} f-structures that encode *s*_{1} and *s*_{2} as their respective (*root* b) values. By construction of the string grammars, the only paths that the two f-structures share are the paths (*root* b *s*′) where *s*′ is a common prefix of *s*_{1} and *s*_{2}. Thus, if we define *G* to consist of the rules of *String*_{1}(*G*_{1}) and *String*_{2}(*G*_{2}), and a start rule that expands S to S_{1} S_{2} and forces the f-structures for *s*_{1} and *s*_{2} to unify, their (*root* e) and (*root* e') paths become reentrant ((*root* e) = (*root* e')) if and only if *s*_{1} and *s*_{2} are identical. *G* then assigns to a terminal string an f-structure with reentrant (*root* e) and (*root* e') paths if and only if it has the form *s*′*s*′ and *s*′ is in *L*(*G*_{1}) ∩ *L*(*G*_{2}).
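The prefix-sharing argument can be illustrated with a minimal sketch. It is a deliberate simplification rather than the construction itself: only the b path and the e and e' attributes are modeled, the branching information under l, r, l', and r' is omitted, and the names `unify_encodings`, `step`, and `end_of` are hypothetical.

```python
from itertools import count


def unify_encodings(s1, s2):
    """Unify simplified type 1 and type 2 encodings of s1 and s2 at a common root.

    Returns True iff the paths (root e) and (root e') end in the same node.
    """
    fresh = count()
    root = next(fresh)
    edges = {}  # (node, attribute) -> node; functional, so shared paths merge

    def step(node, attr):
        # Follow an existing attribute or create a fresh value for it.
        if (node, attr) not in edges:
            edges[(node, attr)] = next(fresh)
        return edges[(node, attr)]

    def end_of(s):
        # Node reached by the path (root b s).
        node = step(root, "b")
        for symbol in s:
            node = step(node, symbol)
        return node

    edges[(root, "e")] = end_of(s1)    # type 1: (root b s1) = (root e)
    edges[(root, "e'")] = end_of(s2)   # type 2: (root b s2) = (root e')
    return edges[(root, "e")] == edges[(root, "e'")]


print(unify_encodings("xy", "xy"), unify_encodings("xy", "xyz"))
```

In this simplified model the two encodings walk the same nodes exactly along their common prefix, so the e and e' values coincide only when the two strings are identical; a proper prefix is not enough.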

The unified f-structures for strings *s*_{1}*s*_{2} would still record information on the structure of their derivation. Thus distinct strings in {*s*′*s*′ | *s*′ ∈ *L*(*G*_{1}) ∩ *L*(*G*_{2})} would get assigned distinct f-structures. However, the proof requires that there be a single f-structure that is assigned to all strings *s*′*s*′ with *s*′ in *L*(*G*_{1}) ∩ *L*(*G*_{2}). We achieve that by annotating the start rule so that the unified f-structures derived by the string grammars are folded up into one and the same cyclic f-structure *F*. This f-structure consists of a single element (node) and |*T*_{1} ∩ *T*_{2}| + 7 cycles of length 1, each one labeled with one of the attributes in {b, l, r, l', r', e, e'} ∪ (*T*_{1} ∩ *T*_{2}). *F* thus has the following form.^{2}

(4) *F*: a single node carrying one cycle of length 1 for each attribute in {b, l, r, l', r', e, e'} ∪ (*T*_{1} ∩ *T*_{2})

*F* must contain cycles for all terminals in *T*_{1} ∩ *T*_{2}, so that it imposes no constraints on the strings that may appear in *L*(*G*_{1}) ∩ *L*(*G*_{2}).
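The shape of *F* described above can be written down directly as a one-node graph. The following is a sketch under stated assumptions: the dictionary representation, the node name `"root"`, and the helpers `folded_structure` and `resolve` are hypothetical, and the terminals are assumed to be distinct from the seven attribute names.

```python
BASE_ATTRS = {"b", "l", "r", "l'", "r'", "e", "e'"}


def folded_structure(t1, t2):
    """Build the single-node structure: one length-1 cycle per attribute."""
    shared = set(t1) & set(t2)
    # Assumption: terminal symbols do not clash with the attribute names.
    assert not (shared & BASE_ATTRS)
    node = "root"
    return {(node, attr): node for attr in BASE_ATTRS | shared}


def resolve(edges, node, path):
    """Follow a sequence of attributes starting at node."""
    for attr in path:
        node = edges[(node, attr)]
    return node


F = folded_structure({"a", "c", "d"}, {"a", "c"})
print(len(F))  # number of cycles, |T1 ∩ T2| + 7
```

Every path over the shared terminals leads back to the single node, which is why the structure places no constraint on the string being encoded; a terminal outside *T*_{1} ∩ *T*_{2} (here the hypothetical `"d"`) has no cycle at all.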

The folding into *F* is accomplished by annotations of *G*’s start rule whose contribution to a derivation in *G* is depicted in Figure 2. As earlier, we include *root* for later use. Obviously, if (*root* e) = (*root* e') holds in the unified f-structures of the string grammars, then the unification of the string grammar f-structures and the structure in Figure 2 yields *F*. Otherwise, their unification results in a structure that only properly subsumes *F*. This is because neither (*root* e e) nor (*root* e' e) exists in the unified f-structures of the two string grammars, and therefore their values in the structure in Figure 2 are not merged when the structures are unified. Thus *G* derives with *F* exactly the set of strings *s*′*s*′ with *s*′ in *L*(*G*_{1}) ∩ *L*(*G*_{2}). Hence, this set is empty if and only if *L*(*G*_{1}) ∩ *L*(*G*_{2}) is empty.

We now give a rigorous statement and proof of our undecidability theorem.

**Theorem**

*For an arbitrary LFG grammar* G *and an arbitrary f-structure* F *it is undecidable whether* {*s* | *Δ*_{G}(*s*, *F*)} = ∅.

**Proof**

Let *G*_{1} = (*N*_{1}, *T*_{1}, S_{1}, *P*_{1}) and *G*_{2} = (*N*_{2}, *T*_{2}, S_{2}, *P*_{2}) be two arbitrary context-free grammars in Chomsky normal form. Without loss of generality, we can assume that *N*_{1} ∩ *N*_{2} = ∅. On the basis of *String*_{1}(*G*_{1}) and *String*_{2}(*G*_{2}) we construct an LFG grammar *G* = (*N*, *T*, S, *P*) with *N* = *N*_{1} ∪ *N*_{2} ∪ {S}, S ∉ *N*_{1} ∪ *N*_{2}, and *T* = *T*_{1} ∪ *T*_{2}. The rule set *P* consists of the rules of *String*_{1}(*G*_{1}) and *String*_{2}(*G*_{2}) and a start rule that expands S to S_{1} S_{2}.

The functional contribution of this start rule to a derivation in *G* is depicted in Figure 2. The (↑ e e) = ↑ annotation at S_{1} introduces the left cycle and the annotations at S_{2} account for the rest. Now let *F* be the f-structure in (4) and consider an arbitrary derivation of a terminal string *s* with f-description *FD* in *G*. By construction of *G*, *s* must have the form *s*_{1}*s*_{2}, with *s*_{1} derived from S_{1} and *s*_{2} derived from S_{2}. We claim *s*_{1} = *s*_{2} iff *F* is the f-structure for *FD*. Note first that *G*, like the string grammars, does not contain atomic values. Thus, *FD* cannot be unsatisfiable and must have an f-structure.

If *s*_{1} = *s*_{2}, then *FD* ⊢ (*root* e) = (*root* e'), since (*root* b *s*_{1}) = (*root* e) and (*root* b *s*_{2}) = (*root* e') follow from *FD*. From (*root* e) = (*root* e') and the instantiated annotations of the S rule, we get (*root* *x*) = *root*, for all *x* ∈ {b, l, r, l', r'} ∪ (*T*_{1} ∩ *T*_{2}). With these equations we can then derive from (*root* b *s*_{1}) = (*root* e) and (*root* b *s*_{2}) = (*root* e') the equations *root* = (*root* *x*), for *x* ∈ {e, e'}. Thus *F* must be the f-structure that we obtain from a minimal model of *FD*.

Now suppose *s*_{1} ≠ *s*_{2}. Let *FD*_{1} and *FD*_{2} be the f-descriptions of the string grammars, *FD*′_{1} = *FD*_{1} ∪ {*root* = *n*_{1}}, *FD*′_{2} = *FD*_{2} ∪ {*root* = *n*_{2}}, with *n*_{1} and *n*_{2} instantiating ↓ in the annotations at S_{1} and S_{2}, and *FD*′ = *FD*′_{1} ∪ *FD*′_{2}. By construction of *G*, the only terms shared by the deductive closures of *FD*′_{1} and *FD*′_{2} are the common subterms of (*root* b *s*_{1}) and (*root* b *s*_{2}). Thus *FD*′ ⊬ (*root* e) = (*root* e'), because otherwise *FD*′_{1} ⊢ (*root* b *s*′) = (*root* e) and *FD*′_{2} ⊢ (*root* b *s*′) = (*root* e') would imply *s*′ = *s*_{1} and *s*′ = *s*_{2}, as we saw earlier. Because obviously (*root* e e) and (*root* e' e) do not occur in any equation derivable from *FD*′, (*root* e) = (*root* e') cannot follow from *FD* either, and *F* cannot be the f-structure for *FD*.

Thus {*s*|*Δ*_{G}(*s*,*F*)} = {*s*′*s*′|*s*′ ∈ *L*(*G*_{1}) ∩ *L*(*G*_{2})} and hence {*s*| *Δ*_{G}(*s*,*F*)} = ∅ if and only if *L*(*G*_{1}) ∩ *L*(*G*_{2}) = ∅. Since the emptiness problem for the intersection of context-free languages is in general undecidable, the generation problem must be undecidable too.
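The equivalence above means that any procedure for the generation problem would also settle emptiness of *L*(*G*_{1}) ∩ *L*(*G*_{2}). The following sketch shows the converse direction that remains available: a bounded search that can find a witness *s*′ but can never certify emptiness. It uses the standard CYK algorithm for membership; the function names, the rule representation, and the three toy grammars are assumptions made for illustration only.

```python
from itertools import product


def cyk(rules, start, s):
    """Return True iff the CNF grammar `rules` derives the string s from `start`."""
    n = len(s)
    if n == 0:
        return False  # the grammars considered here are epsilon-free
    # table[i][k] holds the nonterminals deriving s[i : i + k + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(s):
        table[i][0] = {lhs for lhs, rhs in rules if rhs == ch}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for lhs, rhs in rules:
                    if isinstance(rhs, tuple) and rhs[0] in left and rhs[1] in right:
                        table[i][span - 1].add(lhs)
    return start in table[0][n - 1]


def find_witness(g1, start1, g2, start2, alphabet, max_len):
    """Search strings up to max_len for a member of both languages."""
    for length in range(1, max_len + 1):
        for letters in product(sorted(alphabet), repeat=length):
            s = "".join(letters)
            if cyk(g1, start1, s) and cyk(g2, start2, s):
                return s
    return None  # inconclusive: a longer witness might still exist


# Hypothetical toy grammars: a^n b^n, (ab)+, and b+.
ANBN = [("S1", ("A1", "B1")), ("S1", ("A1", "X1")), ("X1", ("S1", "B1")),
        ("A1", "a"), ("B1", "b")]
ABPLUS = [("S2", ("A2", "B2")), ("S2", ("A2", "Y2")), ("Y2", ("B2", "S2")),
          ("A2", "a"), ("B2", "b")]
BPLUS = [("S3", ("B3", "S3")), ("S3", "b"), ("B3", "b")]

print(find_witness(ANBN, "S1", ABPLUS, "S2", {"a", "b"}, 4))
print(find_witness(ANBN, "S1", BPLUS, "S3", {"a", "b"}, 6))
```

A witness *s*′ found this way corresponds to a string *s*′*s*′ that *G* derives with *F*. A `None` result only says that no witness exists up to the chosen bound, which is consistent with the undecidability of the general problem.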

As a consequence of this theorem we know that there does not exist a general generation algorithm, at least if cyclic input structures are considered as legitimate inputs.

We note that the grammars constructed in this proof are off-line parsable (cf., e.g., Kaplan and Bresnan 1982; Johnson 1988; Jaeger et al. 2005). Off-line parsability is sufficient to guarantee the decidability of the recognition/parsing problem even for cyclic f-structures. But Wedekind and Kaplan (2012) have shown that off-line parsability is not necessary to guarantee that generation from acyclic structures is decidable, and the grammars in this proof demonstrate that it is not sufficient for cyclic structures.

Off-line parsability typically bounds the size of the c-structures of a string by a function of the length of that string. This works for parsing because the size of the f-structure is bounded by the size of the c-structure, but it is insufficient for generation because it does not constrain the structural correspondence between the c- and f-structure (see also Dymetman 1991). A single constraint that guarantees decidability for both parsing and generation must not only bound the size of the f-structures for a terminal string by the length of the string, but it must also ensure, as we have learned from the proof herein, that the determination of the terminal strings for an f-structure can be achieved with finite control.

## Acknowledgments

The author wishes to thank Ron Kaplan for his insightful comments and helpful suggestions during the preparation of this paper, and the four anonymous reviewers for their valuable feedback on an earlier draft.

## Notes

^{1} Note that the terminal symbols also occur as attributes in the annotations of the terminal rules. This “abuse” of the terminal symbols is not essential to our argument (a set of new attributes that is in one-to-one correspondence with the set of terminals would also suffice), but it makes the encoding of the terminal strings in the f-structures more perspicuous.

^{2} This f-structure may look peculiar in that it does not contain atomic feature values. However, this is not relevant to the proof. To make the f-structure look more “natural,” we can, for example, expand *G* by an annotation (↑ d) = v at the start rule and *F* by a feature d with value v.

## Author notes

Center for Language Technology, University of Copenhagen, Njalsgade 140, 2300 Copenhagen S, Denmark. E-mail: jwedekind@hum.ku.dk.