Abstract
Tree transducers are defined as relations between trees, but in syntax-based machine translation, we are ultimately concerned with the relations between the strings at the yields of the input and output trees. We examine the formal power of Multi Bottom-Up Tree Transducers from this point of view.
1. Introduction
Many current approaches to syntax-based statistical machine translation fall under the theoretical framework of synchronous tree substitution grammars (STSGs). Tree substitution grammars (TSGs) generalize context-free grammars (CFGs) in that each rule expands a nonterminal to produce an arbitrarily large tree fragment, rather than a fragment of depth one as in a CFG. Synchronous TSGs generate tree fragments in the source and target languages in parallel, with each rule producing a tree fragment in either language. Systems such as that of Galley et al. (2006) extract STSG rules from parallel bilingual text that has been automatically parsed in one language, and the STSG nonterminals correspond to nonterminals in these parse trees. Chiang’s 2007 Hiero system produces simpler STSGs with a single nonterminal.
STSGs have the advantage that they can naturally express many re-ordering and restructuring operations necessary for machine translation (MT). They have the disadvantage, however, that they are not closed under composition (Maletti et al. 2009). Therefore, if one wishes to construct an MT system as a pipeline of STSG operations, the result may not be expressible as an STSG. Recently, Maletti (2010) has argued that multi bottom–up tree transducers (MBOTs) (Lilin 1981; Arnold and Dauchet 1982; Engelfriet, Lilin, and Maletti 2009) provide a useful representation for natural language processing applications because they generalize STSGs, but have the added advantage of being closed under composition. MBOTs generalize traditional bottom–up tree transducers in that they allow transducer states to pass more than one output subtree up to subsequent transducer operations. The number of subtrees taken by a state is called its rank. MBOTs are linear and non-deleting; that is, operations cannot copy or delete arbitrarily large tree fragments.
Although STSGs and MBOTs both perform operations on trees, it is important to note that, in MT, we are primarily interested in translational relations between strings. Tree operations such as those provided by STSGs are ultimately tools to translate a string in one natural language into a string in another. Whereas MBOTs originate in the tree transducer literature and are defined to take a tree as input, MT systems such as those of Galley et al. (2006) and Chiang (2007) find a parse of the source language sentence as part of the translation process, and the decoding algorithm, introduced by Yamada and Knight (2002), has more in common with CYK parsing than with simulating a tree transducer.
In this article, we investigate the power of MBOTs, and of compositions of STSGs in particular, in terms of the set of string translations that they generate. We relate MBOTs and compositions of STSGs to existing grammatical formalisms defined on strings through five main results, which we outline subsequently. The first four results serve to situate general MBOTs among string formalisms, and the fifth result addresses MBOTs resulting from compositions of STSGs in particular.
Our first result is that the translations produced by MBOTs are a subset of those produced by linear context-free rewriting systems (LCFRSs) (Vijay-Shankar, Weir, and Joshi 1987). LCFRS provides a very general framework that subsumes CFG, tree adjoining grammar (TAG; Joshi, Levy, and Takahashi 1975; Joshi and Schabes 1997), and more complex systems, as well as synchronous context-free grammar (SCFG) (Aho and Ullman 1972) and synchronous tree adjoining grammar (STAG) (Shieber and Schabes 1990; Schabes and Shieber 1994) in the context of translation. LCFRS allows grammar nonterminals to generate more than one span in the final string; the number of spans produced by an LCFRS nonterminal corresponds to the rank of an MBOT state. Our second result states that the translations produced by MBOTs are equivalent to a specific restricted form of LCFRS, which we call 1-m-LCFRS. From the construction relating MBOTs and 1-m-LCFRSs follow results about the source and target sides of the translations produced by MBOTs. In particular, our third result is that the translations produced by MBOTs are context-free within the source language, and hence are strictly less powerful than LCFRSs. This implies that MBOTs are not as general as STAGs, for example. Similarly, MBOTs are not as general as the generalized multitext grammars proposed for machine translation by Melamed (2003), which retain the full power of LCFRSs in each language (Melamed, Satta, and Wellington 2004). Our fourth result is that the output of an MBOT, when viewed as a string language, does retain the full power of LCFRSs. This fact is mentioned by Engelfriet, Lilin, and Maletti (2009, page 586), although no explicit construction is given.
Our final result specifically addresses the string translations that result from compositions of STSGs, with the goal of better understanding the complexity of using such compositions in machine translation systems. We show that the translations produced by compositions of STSGs are more powerful than those produced by single STSGs or, equivalently, by SCFGs. Although it is known that STSGs are not closed under composition, the proofs used previously in the literature rely on differences in tree structure and do not exhibit string translations that cannot be generated by an STSG. Our result implies that current approaches to machine translation decoding will need to be extended to handle arbitrary compositions of STSGs.
2. Preliminaries
Formally, an MBOT is a tuple (S, Σ, Δ, F, R) where:
S, Σ, and Δ are ranked alphabets of states, input symbols, and output symbols, respectively;
F ⊂ S is a set of accepting states; and
R is a finite set of rules l → r where, using a set of variables X, l ∈ TΣ(S(X)) and r ∈ S(TΔ(X)), such that:
every x ∈ X that occurs in l occurs exactly once in r and vice versa, and
l ∉ S(X) or r ∉ S(X).
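To make the shape of these rules concrete, the following sketch encodes one toy MBOT rule in Python and checks the linearity and non-deletion condition. The Tree class, the rule A(q1(x11), q2(x21, x22)) → q0(B(x11, x21), x22), and all names in it are our own illustrative assumptions, not part of the formalism.

```python
from dataclasses import dataclass
from typing import Tuple

# Minimal tree encoding: a label plus a tuple of child trees.
@dataclass(frozen=True)
class Tree:
    label: str
    children: Tuple["Tree", ...] = ()

def var(name: str) -> Tree:
    """Variables are leaves whose label starts with 'x'."""
    return Tree(name)

# Left-hand side: an input-alphabet tree whose frontier carries
# states applied to variables, here A( q1(x11), q2(x21, x22) ).
lhs = Tree("A", (
    Tree("q1", (var("x11"),)),
    Tree("q2", (var("x21"), var("x22"))),
))

# Right-hand side: a single state applied to a *tuple* of output
# subtrees; the state q0 here has rank 2, so it passes two output
# fragments up to the next rule application.
rhs_state = "q0"
rhs_outputs = (
    Tree("B", (var("x11"), var("x21"))),  # first output subtree
    var("x22"),                            # second output subtree
)

# Linearity/non-deletion: the variables of the lhs are exactly the
# variables of the rhs output subtrees (checking the sets suffices
# for this toy rule, which has no repeated variables).
def variables(t: Tree) -> set:
    if t.label.startswith("x"):
        return {t.label}
    return set().union(*(variables(c) for c in t.children)) if t.children else set()

assert variables(lhs) == set().union(*(variables(t) for t in rhs_outputs))
```

Here q2 has rank 2, so it contributes two output subtrees to the rule that consumes it, while q0 in turn passes two output subtrees upward.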
We define a translation to be a set of string pairs, and we define the yield of an MBOT M to be the set of string pairs (s, t) such that there exist: a tree s′ ∈ TΣ having s as its yield, a tree t′ ∈ TΔ having t as its yield, and a transduction from s′ to t′ that is accepted by M. We refer to s as the source side and t as the target side of the translation. We use the notation source(T) to denote the set of source strings of a translation T, source(T) = { s |(s,t) ∈ T }, and we use the notation target(T) to denote the set of target strings. We use the notation yield(MBOT) to denote the set of translations produced by the set of all MBOTs.
In an LCFRS corresponding to a CFG, all nonterminals have fan-out one, reflected in the fact that all tuples defining the productions’ functions contain just one string. Just as CFG is equivalent to LCFRS with fan-out 1, SCFG and TAG can be represented as LCFRS with fan-out 2. Higher values of fan-out allow strictly more powerful grammars (Rambow and Satta 1999). Polynomial-time parsing is possible for any fixed LCFRS grammar, but the degree of the polynomial depends on the grammar. Parsing general LCFRS grammars, where the grammar is considered part of the input, is NP-complete (Satta 1992).
Following Melamed, Satta, and Wellington (2004), we represent translation in LCFRS by using a special symbol # to separate the strings of the two languages. Our LCFRS grammars will only generate strings of the form s#t, where s and t are strings not containing the symbol #, and we will identify s as the source string and t as the target string. We use the notation trans(LCFRS) to denote the set of translations that can be produced by taking the string language of some LCFRS and splitting each string into a pair at the location of the # symbol.
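As a small, self-contained illustration of both conventions (string tuples for nonterminals, and the # separator between source and target), the sketch below encodes a tiny LCFRS in Python whose single non-start nonterminal has fan-out 2, with one span on each side of the separator, and enumerates a few of the s#t strings it generates. The rule encoding and the particular grammar are our own choices, not taken from the article.

```python
from itertools import product

# Each rule: (lhs, rhs_nonterminals, combine), where combine maps a list of
# string tuples (one per rhs nonterminal) to the lhs string tuple.
RULES = [
    ("S", ("A",), lambda args: (args[0][0] + "#" + args[0][1],)),       # S -> <x1 # x2>
    ("A", (),     lambda args: ("", "")),                               # A -> <eps, eps>
    ("A", ("A",), lambda args: ("a" + args[0][0], "b" + args[0][1])),   # A -> <a x1, b x2>
]

def derive(nt, depth):
    """Enumerate the string tuples derivable from nt in at most `depth` rule applications."""
    if depth == 0:
        return set()
    results = set()
    for lhs, rhs, combine in RULES:
        if lhs != nt:
            continue
        child_sets = [derive(r, depth - 1) for r in rhs]
        for args in product(*child_sets):
            results.add(combine(list(args)))
    return results

# Split each generated s#t string into the (source, target) pair it encodes.
pairs = sorted((s.split("#")[0], s.split("#")[1]) for (s,) in derive("S", 6))
print(pairs)   # [('', ''), ('a', 'b'), ('aa', 'bb'), ...]
```

Splitting each generated string at # recovers the translation {(a^n, b^n) : n ≥ 0}; the nonterminal A realizes one span to the left of the separator and one span to the right.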
3. Translations Produced by General MBOTs
In this section, we relate the yield of general MBOTs to string rewriting systems.
To begin, we show that the translation produced by any MBOT is also produced by an LCFRS by giving a straightforward construction for converting MBOT rules to LCFRS rules.
We first consider MBOT rules having only variables, as opposed to alphabet symbols of rank zero, at their leaves. For an MBOT rule l → r with l ∈ TΣ(S(X)), let S1, S2, …, Sk be the sequence of states appearing from left to right immediately above the leaves of l. Without loss of generality, we will name the variables such that x_{i,j} is the jth child of the ith state, Si, and the sequence of variables at the leaves of l, read from left to right, is: x_{1,1}, …, x_{1,d(S1)}, …, x_{k,1}, …, x_{k,d(Sk)}, where d(Si) is the rank of state Si. Let S0 be the state symbol at the root of the right-hand-side (r.h.s.) tree r ∈ S(TΔ(X)). Let π and μ be functions such that x_{π(1),μ(1)}, x_{π(2),μ(2)}, …, x_{π(n),μ(n)} is the sequence of variables at the leaves of r read from left to right. We will call this sequence the yield of r. Finally, let p(i) for 1 ≤ i ≤ d(S0) be the position in the yield of r of the rightmost leaf of S0’s ith child. Thus, for all i, 1 ≤ p(i) ≤ n.
Finally, we add a start rule S → g(Si), g(〈e,f〉) = 〈e#f〉 for each Si ∈ F to generate all final states Si of the MBOT from the start symbol S of the LCFRS.
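As a worked instance of this mapping, under our reading of the notation above, consider again the illustrative rule A(q1(x11), q2(x21, x22)) → q0(B(x11, x21), x22) from the earlier sketch. A state qi of rank d(qi) becomes an LCFRS nonterminal Qi of fan-out d(qi) + 1 (one source span followed by d(qi) target spans), and the rule becomes Q0 → g(Q1, Q2) with the combination function sketched below; the Python encoding and the names Q0, Q1, Q2 are our own.

```python
# For the example rule  A(q1(x11), q2(x21, x22)) -> q0( B(x11, x21), x22 ):
#   Q0 -> g(Q1, Q2)
# where Q1 has fan-out 2 (rank of q1 is 1) and Q2 has fan-out 3 (rank of q2 is 2).

def g(q1_spans, q2_spans):
    s1, t11 = q1_spans          # Q1: source span + one target span
    s2, t21, t22 = q2_spans     # Q2: source span + two target spans
    source = s1 + s2            # left-to-right order of the states in the lhs tree
    target_1 = t11 + t21        # yield of the first output subtree, B(x11, x21)
    target_2 = t22              # yield of the second output subtree, x22
    return (source, target_1, target_2)

# Sanity check with concrete spans:
print(g(("uv", "x"), ("w", "y", "z")))   # -> ('uvw', 'xy', 'z')
```

The source span concatenates the r.h.s. source spans in the left-to-right order of the states in l, and each target span of Q0 concatenates, following the yield of r up to the boundary p(i), the target spans bound to its variables.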
We now show that the language of the LCFRS constructed from a given MBOT is identical to the yield of the MBOT. We represent MBOT transductions as derivation trees, where each node is labeled with an MBOT rule, and each node’s children are the rules used to produce the subtrees matched by any variables in the rule. We can construct an LCFRS derivation tree by simply relabeling each node with the LCFRS rule constructed from the node’s MBOT rule. Because, in the MBOT derivation tree, each node has children which produce the states required by the MBOT rule’s left-hand side (l.h.s.), it also holds that, in the LCFRS derivation tree, each node has as its children rules which expand the set of nonterminals appearing in the parent’s r.h.s. Therefore the LCFRS tree constitutes a valid derivation.
Given the mapping from MBOT derivations to LCFRS derivations, the following lemma relates the strings produced by the derivations:
Lemma 1
Let TMBOT be an MBOT derivation tree with I as its input tree and O as its output tree, and construct TLCFRS by mapping each node nMBOT in TMBOT to a node nLCFRS labeled with the LCFRS production constructed from the rule at nMBOT. Let 〈t0, t1, …, tk〉 be the string tuple returned by the LCFRS combination function at any node nLCFRS in TLCFRS. The string t0 contains the yield of the node of I at which the MBOT rule at the node of TMBOT corresponding to nLCFRS was applied. Furthermore, the strings t1, …, tk contain the k yields of the k MBOT output subtrees (subtrees of O) that are found as children of the root (state symbol) of the MBOT rule’s right-hand side.
Proof
The correspondence between LCFRS string tuples and MBOT tree yields gives us our first result:
Theorem 1
yield(MBOT) ⊂ trans(LCFRS).
Proof
From a given MBOT, construct an LCFRS as described previously. For any transduction of the MBOT, from Lemma 1, there exists an LCFRS derivation which produces a string consisting of the yield of the MBOT’s input and output trees joined by the # symbol. In the other direction, we note that any valid derivation of the LCFRS corresponds to an MBOT transduction on some input tree; this input tree can be constructed by assembling the left-hand sides of the MBOT rules from which the LCFRS rules of the LCFRS derivation were originally constructed. Because there is a one-to-one correspondence between LCFRS and MBOT derivations, the translation produced by the LCFRS and the yield of the MBOT are identical.
Because we can construct an LCFRS generating the same translation as the yield of any given MBOT, we see that yield(MBOT) ⊂ trans(LCFRS). ▪
The translations produced by MBOTs are equivalent to the translations produced by a certain restricted class of LCFRS grammars, which we now specify precisely.
Theorem 2
The class of translations yield(MBOT) is equivalent to trans(1-m-LCFRS), where 1-m-LCFRS is defined to be the class of LCFRS grammars in which each rule either is a start rule of the form S → g(Si), g(〈e,f〉) = 〈e#f〉, or meets both of the following conditions:
The combination function keeps the two sides of the translation separate: it decomposes into a function g1 that builds the first component of the l.h.s. tuple from only the first components of the r.h.s. tuples, and a function g2 that builds the remaining components from only the remaining r.h.s. components.
The function g1 returns a tuple of length 1.
Proof
Our construction for transforming an MBOT to an LCFRS produces LCFRS grammars satisfying the given constraints, so yield(MBOT) ⊂ trans(1-m-LCFRS).
Because we have containment in both directions, yield(MBOT) = trans(1-m-LCFRS). ▪
We now move on to consider the languages formed by the source and target projections of MBOT translations.
Grammars of the class 1-m-LCFRS have the property that, for any nonterminal A (other than the start symbol S) having fan-out ϕ(A), one span is always realized in the source string (to the left of the # separator), and ϕ(A) − 1 spans are always realized in the target language (to the right of the separator). This property is introduced by the start rules S → g(Si), g(〈e,f〉) = 〈e#f〉 and is maintained by all further productions because of the condition on 1-m-LCFRS that the combination function must keep the two sides of translation separate. For a 1-m-LCFRS rule constructed from an MBOT, we define the rule’s source language projection to be the rule obtained by discarding all the target language spans, as well as the separator symbol # in the case of the start productions. The definition of 1-m-LCFRS guarantees that the combination function returning a rule’s l.h.s. source span needs to have only the r.h.s. source spans available as arguments.
For an LCFRS G, we define L(G) to be the language produced by G. We define source(G) to be the LCFRS obtained by projecting each rule in G. Because more than one rule may have the same projection, we label the rules of source(G) with their origin rule, preserving a one-to-one correspondence between rules in the two grammars. Similarly, we obtain a rule’s target language projection by discarding the source language spans, and define target(G) to be the resulting grammar.
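Continuing the earlier sketch, and assuming as in the 1-m-LCFRS form that the first component of every tuple is the source span, the two projections of the rule Q0 → g(Q1, Q2) can be written as follows; the encoding is again our own illustration.

```python
# Projections of the rule Q0 -> g(Q1, Q2) from the earlier sketch,
# keeping only the source spans or only the target spans.

def g_source(q1_spans, q2_spans):
    # keep only the source spans; every nonterminal now has fan-out 1
    (s1,), (s2,) = q1_spans, q2_spans
    return (s1 + s2,)

def g_target(q1_spans, q2_spans):
    # keep only the target spans (one for Q1, two for Q2)
    (t11,), (t21, t22) = q1_spans, q2_spans
    return (t11 + t21, t22)

print(g_source(("uv",), ("w",)))      # -> ('uvw',)
print(g_target(("x",), ("y", "z")))   # -> ('xy', 'z')
```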
Lemma 2
For an LCFRS G constructed from an MBOT M by the given construction, L(source(G)) = source(yield(M)), and L(target(G)) = target(yield(M)).
Proof
There is a valid derivation tree in the source language projection for each valid derivation tree in the full LCFRS, because for any expansion rewriting a nonterminal of fan-out ϕ(A) in the full grammar, we can apply the projected rule to the corresponding nonterminal of fan-out 1 in the projected derivation. In the other direction, for any expansion in a derivation of the source projection, a nonterminal of fan-out ϕ(A) will be available for expansion in the corresponding derivation of the full LCFRS. Because there is a one-to-one correspondence between derivations in the full LCFRS and its source projection, the language generated by the source projection is the source of the translation generated by the original LCFRS. By the same reasoning, there is a one-to-one correspondence between derivations in the target projection and the full LCFRS, and the language produced by the target projection is the target side of the translation of the full LCFRS. ▪
Lemma 2 implies that it is safe to evaluate the power of the source and target projections of the LCFRS independently. This fact leads to our next result.
Theorem 3
yield(MBOT) ⊊ trans(LCFRS).
Proof
By Lemma 2, the source side of any translation in yield(MBOT) is the language of the source projection of the 1-m-LCFRS constructed from the MBOT. Every nonterminal of this projection has fan-out 1, so the projection is a CFG, and the source side of the translation is therefore a context-free language. Because trans(LCFRS) contains translations whose source side is not context-free (for example, translations whose source language is {a^n b^n c^n}), the containment of Theorem 1 is proper. ▪
Although the source side of the translation produced by an MBOT must be a context-free language, we now show that the target side can be any language produced by an LCFRS.
Theorem 4
target(yield(MBOT)) = LCFRS.
Proof
Given our earlier construction for generating the target projection of the LCFRS derived from an MBOT, we know that target(yield(MBOT)) ⊂ LCFRS. Combining these two facts yields the theorem. ▪
4. Composition of STSGs
Maletti et al. (2009) discuss the composition of extended top–down tree transducers, which are equivalent to STSGs, as shown by Maletti (2010). They show that this formalism is not closed under composition in terms of the tree transformations that are possible. In this article, we focus on the string yields of the formalisms under discussion, and from this point of view we now examine the question of whether the yield of the composition of two STSGs is itself the yield of an STSG in general. It is important to note that, although we focus on the yield of the composition, in our notion of STSG composition, the tree structure output by the first STSG still serves as input to the second STSG.
In terms of string translations, STSGs and SCFGs are equivalent, because any SCFG is also an STSG with rules of depth 1, and any STSG can be converted to an SCFG with the same string translation by simply removing the internal tree nodes in each rule. We will adopt SCFG terminology for our proof because the internal structure of STSG rules is not relevant to our result.
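The flattening step can be pictured with a small sketch: given an STSG rule pair encoded as nested tuples, keeping only the frontiers of the two elementary trees yields the corresponding SCFG rule. The tree encoding, the @-index notation for linked substitution sites, and the example rule are all our own illustrative assumptions.

```python
def frontier(tree):
    """Yield (left-to-right) of a tree encoded as nested tuples; leaves and
    linked substitution sites are plain strings like 'saw' or 'NP@1'."""
    if isinstance(tree, str):
        return [tree]
    _, *children = tree
    return [sym for child in children for sym in frontier(child)]

# A made-up STSG rule pair: the internal structure differs, but only the
# frontiers matter for the SCFG rule.
src_tree = ("S", "NP@1", ("VP", ("V", "saw"), "NP@2"))
tgt_tree = ("S", "NP@1", ("VP", "NP@2", ("V", "vit")))

print(frontier(src_tree), "<->", frontier(tgt_tree))
# ['NP@1', 'saw', 'NP@2'] <-> ['NP@1', 'NP@2', 'vit']
```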
In the lemmas that follow, we reason about the permutation of preterminals produced by an SCFG derivation; to this end, we first convert the grammar into a form in which terminals are introduced only by preterminal rules. Any SCFG can be put into this form as follows (the transformation is illustrated by the sketch after the list):
1. Associate each sequence of terminals with the preceding nonterminal, or with the following nonterminal in the case of initial terminals.
2. Replace each group consisting of a nonterminal and its associated terminals with a fresh nonterminal A, and add a rule rewriting A as the group in source and target. (Nonterminals with no associated terminals may be left intact.)
3. In each rule created in the previous step, replace each sequence of terminals with another fresh nonterminal B, and add a rule rewriting B as the terminal sequence in source and target.
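The sketch below applies the three steps to one hypothetical rule and checks that expanding the fresh nonterminals recovers the original rule; the rule, the @-index link notation, and the particular grouping shown are our own illustration of one possible outcome of the transformation.

```python
# original rule:  X -> < Y@1 a b Z@2 , Z@2 Y@1 a b >
original = ("X", ["Y@1", "a", "b", "Z@2"], ["Z@2", "Y@1", "a", "b"])

transformed = [
    ("X", ["A@1", "Z@2"], ["Z@2", "A@1"]),   # step 2: the group [Y a b] becomes fresh A
    ("A", ["Y@1", "B@2"], ["Y@1", "B@2"]),   # step 3: the terminal sequence becomes fresh B
    ("B", ["a", "b"],     ["a", "b"]),       # B is now a preterminal
]

def expand(symbols, rules, side):
    """Substitute the fresh nonterminals back in (side=1 source, side=2 target),
    ignoring the link indices for this simple check."""
    out = []
    for sym in symbols:
        name = sym.split("@")[0]
        match = [r for r in rules if r[0] == name]
        out.extend(expand(match[0][side], rules, side) if match else [sym])
    return out

# Expanding A and B inside the new X rule recovers the original rule on both sides.
assert expand(transformed[0][1], transformed[1:], 1) == original[1]
assert expand(transformed[0][2], transformed[1:], 2) == original[2]
print("original rule recovered on both sides")
```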
Lemma 3
Let π be a preterminal permutation produced by an SCFG derivation containing rules of maximum rank r, and let π′ be a permutation obtained from π by removing some elements and renumbering the remaining elements with a strictly increasing function. Then π′ falls within the class of compositions of permutations of length at most r; equivalently, π′ can also be produced by an SCFG derivation containing rules of rank at most r.
Proof
From each rule in the derivation producing preterminal permutation π, construct a new rule by removing any nonterminals whose indices were removed from π. The resulting sequence of rules produces preterminal permutation π′ and contains rules of rank no greater than r. ▪
As an example of Lemma 3, removing any element from the permutation (3,2,1) results in the permutation (2,1), which can still (trivially) be produced by an SCFG of rank 2.
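The notion of decomposition used here and in the proof of Lemma 7 rests on whether a permutation contains a proper block, that is, a contiguous interval of positions (shorter than the whole permutation but longer than a single element) whose values are also contiguous. A small illustrative check follows; the function and the example permutations are our own, and the second example is not the permutation of Figure 7, only a short permutation with the same block-free property.

```python
def has_proper_block(perm):
    """True iff some contiguous interval of length 2 .. len(perm)-1 maps to a
    contiguous range of values; such a block lets an SCFG derivation factor
    the permutation into smaller rules."""
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n):
            if j - i + 1 == n:
                continue
            window = perm[i:j + 1]
            if max(window) - min(window) == j - i:
                return True
    return False

print(has_proper_block((3, 2, 1)))      # True: (3, 2) maps to the contiguous set {2, 3}
print(has_proper_block((2, 4, 1, 3)))   # False: no proper block, so a rule of rank 4 is needed
```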
We will make use of another general fact about SCFGs, which we derive by applying Ogden’s Lemma (Ogden 1968), a generalized pumping lemma for context-free languages, to the source language of an SCFG.
Lemma 4 (Ogden’s Lemma)
For each context-free grammar G = (V, Σ, P, S) there is an integer k such that for any word ξ in L(G), if any k or more distinct positions in ξ are designated as distinguished, then there is some A in V and there are words α, β, γ, δ, and μ in Σ* such that:
S ⇒* αAμ ⇒* αβAδμ ⇒* αβγδμ = ξ, and hence αβ^mγδ^mμ ∈ L(G) for all m ≥ 0.
γ contains at least one of the distinguished positions.
Either α and β both contain distinguished positions, or δ and μ both contain distinguished positions.
βγδ contains at most k distinguished positions.
Lemma 5
For each SCFG G = (V, Σ, Δ, P, S) having source alphabet Σ and target alphabet Δ, there is an integer k such that for any string pair (ξ, ξ′) in L(G), if any k or more distinct positions in ξ are designated as distinguished, then there is some A in V and there are words α, β, γ, δ, and μ in Σ* and α′, β′, γ′, δ′, and μ′ in Δ* such that:
S ⇒* (αAμ, α′Aμ′) ⇒* (αβAδμ, α′β′Aδ′μ′) ⇒* (αβγδμ, α′β′γ′δ′μ′) = (ξ, ξ′), and hence (αβ^mγδ^mμ, α′(β′)^mγ′(δ′)^mμ′) ∈ L(G) for all m ≥ 0.
γ contains at least one of the distinguished positions.
Either α and β both contain distinguished positions, or δ and μ both contain distinguished positions.
βγδ contains at most k distinguished positions.
Proof
We refer to a substring arising from a term c^{n_i} or d^{n_i} in the definition of the translation (Equation (2)) as a run. In order to distinguish runs, we refer to the run arising from c^{n_i} or d^{n_i} as the ith run. We refer to the pair (c^{n_i}, c^{n_i}) or (d^{n_i}, d^{n_i}) consisting of the ith run in the source and target strings as the ith aligned run. We now use Lemma 5 to show that aligned runs must be generated from aligned preterminal pairs.
Lemma 6
Assume that some SCFG G′ generates the translation of Equation (2) for some fixed ℓ. There exists a constant k such that, in any derivation of grammar G′ in which each n_i > k, for any i, 1 ≤ i ≤ 2ℓ, there exists at least one aligned preterminal pair among the subsequences of source and target preterminals generating the ith aligned run.
Proof
We consider a source string such that the length n_i of each run is greater than the constant k of Lemma 5. For a fixed i, 1 ≤ i ≤ 2ℓ, we consider the distinguished positions to be all and only the terminals in the ith run. This implies that the run can be pumped to be arbitrarily long; indeed, this follows from the definition of the language itself.
Because our distinguished positions are within the ith run, and because Lemma 5 guarantees that either α, β, and γ all contain distinguished positions or γ, δ, and μ all contain distinguished positions, we are guaranteed that either β or δ lies entirely within the ith run. Consider the case where β lies within the run. We must consider three possibilities for the location of δ in the string:
Case 1. The string δ also lies entirely within the ith run.
Case 2. The string δ contains substrings of more than one run. This cannot occur, because pumped strings of the form αβ^mγδ^mμ would contain more than 2ℓ runs, which is not allowed under the definition of the language.
Because Cases 2 and 3 are impossible, δ must lie entirely within the ith run. Similarly, in the case where δ contains distinguished positions, β must lie within the ith run. Thus both β and δ always lie entirely within the ith aligned run.
Because β and δ lie within the ith aligned run, the strings αβ^mγδ^mμ have the same form as αβγδμ, with the exception that the ith run is extended from length n_i to some greater length n_i′. For the pairs of Equation (3) to be members of the translation, β′ and δ′ must be substrings of the ith aligned run in the target. Because β^mγδ^m and (β′)^mγ′(δ′)^m were derived from the same nonterminal, the two sequences of preterminals generating these two strings consist of aligned preterminal pairs. Because both β^mγδ^m and (β′)^mγ′(δ′)^m are substrings of the ith aligned run, we have at least one aligned preterminal pair among the source and target preterminal sequences generating the ith aligned run. ▪
Lemma 7
Assume that some SCFG G′ generates the translation of Equation (2) for some fixed ℓ. There exists a constant k such that, if (ξ, ξ′) is a string pair generated by G′ having each n_i > k, any derivation of (ξ, ξ′) with grammar G′ must contain a rule of rank at least 2ℓ.
Proof
Because the choice of i in Lemma 6 was arbitrary, each aligned run must contain at least one aligned preterminal pair. If we select one such preterminal pair from each run, the associated permutation is that of Figure 7. This permutation cannot be decomposed, so, by Lemma 3, it cannot be generated by an SCFG derivation containing only rules of rank less than 2ℓ. ▪
We will use one more general fact about SCFGs to prove our main result.
Lemma 8
The translations generated by SCFGs are closed under intersection of the source side with a regular language: for any SCFG G and any finite-state automaton F over the source alphabet of G, there is an SCFG G′, of the same rank as G, generating exactly those string pairs (s, t) generated by G such that s is accepted by F.
Proof
Let V be the nonterminal set of G, and let S be the state set of F. Construct the SCFG G′ with nonterminal set V × S × S by applying the construction of Bar-Hillel, Perles, and Shamir (1961) for the intersection of a CFG and a finite-state machine to the source side of each rule in G. ▪
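For readers unfamiliar with the Bar-Hillel, Perles, and Shamir construction, the following simplified sketch shows the triple construction for a plain CFG in a binary/terminal rule form and a deterministic automaton; in the proof above it is applied to the source side of each SCFG rule, with the target side carrying the same state-annotated nonterminals. The rule format, function name, and toy instance are our own.

```python
from itertools import product

def bar_hillel(rules, terminals, states, trans):
    """Bar-Hillel-style intersection of a CFG with a deterministic automaton.
    rules: list of (A, (B, C)) or (A, (a,)) with a in terminals;
    trans: dict mapping (state, terminal) -> state.
    Returns rules whose nonterminals are triples (q, A, r)."""
    out = []
    for A, rhs in rules:
        if len(rhs) == 1 and rhs[0] in terminals:
            a = rhs[0]
            for q in states:
                if (q, a) in trans:
                    out.append(((q, A, trans[(q, a)]), (a,)))
        else:
            B, C = rhs
            for q, p, r in product(states, repeat=3):
                out.append(((q, A, r), ((q, B, p), (p, C, r))))
    return out

# Toy instance: rules A -> A A | c, and a one-state automaton accepting c*.
rules = [("A", ("A", "A")), ("A", ("c",))]
print(bar_hillel(rules, {"c"}, [0], {(0, "c"): 0}))
# two rules over the triple nonterminal (0, 'A', 0)
```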
Now we are ready for our main result.
Theorem 5
SCFG = yield(STSG) ⊊ yield(STSG;STSG), where the semicolon denotes composition.
Proof
Assume that some SCFG G generates Tcrisscross. Note that the restriction of Tcrisscross to a fixed ℓ is the result of intersecting the source of Tcrisscross with the regular language a[c+d+]^ℓa. By Lemma 8, we can construct an SCFG Gℓ generating this restricted translation. By Lemma 7, for each ℓ, Gℓ has rank at least 2ℓ. The intersection construction does not increase the rank of the grammar, so G has rank at least 2ℓ. Because ℓ is unbounded in the definition of Tcrisscross, and because any SCFG has a finite maximum rank, Tcrisscross cannot be produced by any SCFG. ▪
4.1. Implications for Machine Translation
The ability of MBOTs to represent the composition of STSGs is given as a motivation for the MBOT formalism by Maletti (2010), but this raises the issue of whether synchronous parsing and machine translation decoding can be undertaken efficiently for MBOTs resulting from the composition of STSGs.
In discussing the complexity of synchronous parsing problems, we distinguish the case where the grammar is considered part of the input, and the case where the grammar is fixed, and only the source and target strings are considered part of the input. For SCFGs, synchronous parsing is NP-complete when the grammar is considered part of the input and can have arbitrary rank. For any fixed grammar, however, synchronous parsing is possible in time polynomial in the lengths of the source and target strings, with the degree of the polynomial depending on the rank of the fixed SCFG (Satta and Peserico 2005). Because MBOTs subsume SCFGs, the problem of recognizing whether a string pair belongs to the translation produced by an arbitrary MBOT, when the MBOT is considered part of the input, is also NP-complete.
Given our construction for converting an MBOT to an LCFRS, we can use standard LCFRS tabular parsing techniques to determine whether a given string pair belongs to the translation defined by the yield of a fixed MBOT. As with arbitrary-rank SCFG, LCFRS parsing is polynomial in the length of the input string pair, but the degree of the polynomial depends on the complexity of the MBOT. To be precise, the degree of the polynomial for LCFRS parsing is determined by the fan-out and rank of the grammar (Seki et al. 1991), and hence, for an MBOT, by the rank of its rules and the ranks of its states via the construction of the previous section.
If we restrict ourselves to MBOTs that are derived from the composition of STSGs, synchronous parsing is NP-complete if the STSGs to compose are part of the input, because a single STSG suffices. For a composition of fixed STSGs, we obtain a fixed MBOT, and polynomial-time parsing is possible. Theorem 5 indicates that we cannot apply SCFG parsing techniques off the shelf, but rather that we must implement some type of more general parsing system. Either of the STSGs used in our proof of Theorem 5 can be binarized and synchronously parsed in time O(n^6), but tabular parsing for the LCFRS resulting from composition has higher complexity. Thus, composing STSGs generally increases the complexity of synchronous parsing.
The problem of language-model–integrated decoding with synchronous grammars is closely related to that of synchronous parsing; both problems can be seen as intersecting the grammar with a fixed source-language string and a finite-state machine constraining the target-language string. The widely used decoding algorithms for SCFG (Yamada and Knight 2002; Zollmann and Venugopal 2006; Huang et al. 2009) search for the highest-scoring translation when combining scores from a weighted SCFG and a weighted finite-state language model. As with SCFG, language-model–integrated decoding for weighted MBOTs can be performed by adding n-gram language model state to each candidate target language span. This, as with synchronous parsing, gives an algorithm which is polynomial in the length of the input sentence for a fixed MBOT, but with an exponent that depends on the complexity of the MBOT. Furthermore, Theorem 5 indicates that SCFG-based decoding techniques cannot be applied off the shelf to compositions of STSGs, and that composition of STSGs in general increases decoding complexity.
Finally, we note that finding the highest-scoring translation without incorporating a language model is equivalent to parsing with the source or target projection of the MBOT used to model translation. For the source language of the MBOT, this implies time O(n^3) because the problem reduces to CFG parsing. For the target language of the MBOT, this implies polynomial-time parsing, where the degree of the polynomial depends on the MBOT, as a result of Theorem 4.
5. Conclusion
MBOTs are desirable for natural language processing applications because they are closed under composition and can be used to represent sequences of transformations of the type performed by STSGs. However, the string translations produced by MBOTs representing compositions of STSGs are strictly more powerful than the string translations produced by STSGs, which are equivalent to the translations produced by SCFGs. From the point of view of machine translation, because parsing with general LCFRS is NP-complete, restrictions on the power of MBOTs will be necessary in order to achieve polynomial–time algorithms for synchronous parsing and language-model–integrated decoding. Our result on the string translations produced by compositions of STSGs implies that algorithms for SCFG-based synchronous parsing or language-model-integrated decoding cannot be applied directly to these problems, and that composing STSGs generally increases the complexity of these problems. Developing parsing algorithms specific to compositions of STSGs, as well as possible restrictions on the STSGs to be composed, presents an interesting area for future work.
Acknowledgements
We are grateful for extensive feedback on earlier versions of this work from Giorgio Satta, Andreas Maletti, Adam Purtee, and three anonymous reviewers. This work was partially funded by NSF grant IIS-0910611.
References
Author notes
Computer Science Department, University of Rochester, Rochester NY 14627. E-mail: [email protected].