This article investigates the Final-over-Final Constraint (FOFC): a head-initial category cannot be the immediate structural complement of a head-final category within the same extended projection. This universal cannot be formulated without reference to the kind of hierarchical structure generated by standard models of phrase structure. First, we document the empirical evidence: logically possible but crosslinguistically unattested combinations of head-final and head-initial orders. Second, we propose a theory, based on a version of Kayne’s (1994) Linear Correspondence Axiom, where FOFC is an effect of the distribution of a movement-triggering feature in extended projections, subject to Relativized Minimality.
This article investigates a putative language universal. Like much fruitful recent work, it builds on the two principal currents of research on universals that have emerged in the past fifty years or so: the Chomskyan tradition, in which the existence of language universals is deduced from the existence of an innate predisposition to language acquisition, and the Greenbergian tradition, in which universals, or at least strong tendencies to common patterning, are observed in wide-ranging surveys of crosslinguistic data.
More specifically, while the Greenbergian program has yielded many empirical results of real interest and relevance to the study of universals, many of the insights into natural language syntax that emerge from Chomskyan syntactic theory play a minor role, if any, in this work. In particular, many of the Greenbergian word order generalizations (e.g., Greenberg’s (1963) Universals 2–5) relate only to weakly generated strings, paying no attention at all to hierarchical relations (see Chomsky 1965:60–62 on the distinction between weak and strong generation). Nor do Greenberg’s generalizations indicate the need for an approach to grammatical categories that goes much beyond traditional parts of speech; the possibility that categories may be broken down into classes by being decomposed into features plays little or no role in most typological work. The main goal of this article is to argue for the existence of a hierarchical universal, one that cannot be stated in purely linear terms, but only in terms of strongly generated structure—and that, we believe, is best understood as applying to extended projections (Grimshaw 1991, 2001, 2005), a notion defined in terms of categorial features. We believe that the existence of this universal not only provides strong evidence for the hierarchical nature of natural language syntax and the existence of extended projections, but also shows that the greatest insight into language universals may be gained by combining both the Greenbergian and the Chomskyan traditions (in line with Cinque 2013).
Whitman (2008:234, 251) discusses the nature of hierarchical universals (hierarchical generalizations in his terms). He defines such universals as describing the relative position of two or more categories in a single structure, where this position follows from the underlying hierarchical arrangement of constituents. For example, according to Whitman, the fact that specifiers (Specs) appear universally to the left of the head they specify is an instance of a hierarchical universal (see, e.g., Pearson 2001 and Aldridge 2004 for evidence that apparently Spec-final VOS languages are also best analyzed as Spec-initial). Whitman (2008:234) suggests that hierarchical universals are absolute, while implicational universals of the kind familiar since Greenberg 1963 are better conceived of as cross-categorial generalizations, which, being the product of processes of language change, are typically statistical.
The purpose of this article is to introduce, motivate, and explain the Final-over-Final Constraint (FOFC, pronounced [fofk]), a generalization that we, pace Whitman (2008), take to be both a hierarchical and a cross-categorial universal. FOFC is a universal constraint on phrase structure configurations, not statable in purely linear terms. Initially, we formulate FOFC as follows (see Holmberg 2000:124):
(1) The Final-over-Final Constraint (FOFC) (informal statement)
A head-final phrase αP cannot dominate a head-initial phrase βP, where α and β are heads in the same extended projection.
The converse does not hold: a head-initial phrase αP may dominate a phrase βP that is either head-initial or head-final, where α and β are heads in the same extended projection.
Consider the logically possible complementation combinations among head-initial and head-final categories.
As (2) shows, FOFC determines that three of the four logically possible combinations are allowed and one is nonexistent (within a single extended projection). The harmonic configurations in (2a–b) are very common, while (2c) is somewhat less common but still occurs. In other words, harmony is preferred (as has often been observed: Greenberg 1963, Hawkins 1983, Dryer 1992, Baker 2008), but disharmony is allowed. Crucially, though, only one kind of disharmony is allowed.
In other words, the configuration (3) is ruled out, where αP is dominated by a projection of β, γP is a sister of α, and α and β are heads in the same extended projection.
(3) *[βP . . . [αP . . . α γP] β . . . ]
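The schema in (3) can be stated as a check over a simple tree encoding. The following Python sketch is purely illustrative and is not part of the proposal: the `Phrase` class, the `ep` tag marking extended-projection membership, and the helper `aux_over_vp` are hypothetical names introduced here, and the assignment of each head to an extended projection is assumed to be given in advance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phrase:
    head: str                      # category label, e.g. "V", "Aux", "D"
    head_final: bool               # True if the head follows its complement
    ep: str                        # extended projection tag (assumed to be given)
    complement: Optional["Phrase"] = None


def violates_fofc(p: Phrase) -> bool:
    """Return True if the tree contains the configuration in (3): a head-final
    phrase whose complement is a head-initial phrase (with a complement of its
    own) in the same extended projection."""
    c = p.complement
    if c is None:
        return False
    local = (
        p.head_final
        and not c.head_final
        and c.complement is not None   # headedness needs a complement to order
        and p.ep == c.ep
    )
    return local or violates_fofc(c)


def aux_over_vp(aux_final: bool, v_final: bool) -> Phrase:
    """Hypothetical helper: AuxP over VP over a nominal object."""
    obj = Phrase("D", head_final=False, ep="nominal")
    vp = Phrase("V", head_final=v_final, ep="verbal", complement=obj)
    return Phrase("Aux", head_final=aux_final, ep="verbal", complement=vp)


# Check all four combinations of head-initial and head-final AuxP and VP.
results = {
    (aux_final, v_final): violates_fofc(aux_over_vp(aux_final, v_final))
    for aux_final in (False, True)
    for v_final in (False, True)
}
for (aux_final, v_final), bad in results.items():
    print(f"Aux final={aux_final}, V final={v_final}: violation={bad}")
```

Under these assumptions, only the combination of a head-final AuxP with a head-initial VP is flagged; the two harmonic combinations and the initial-over-final combination pass.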
As we will show, this generalization holds across categories and across typologically widely divergent languages. It holds in the verbal extended projection (VP, TP, CP) as well as the nominal extended projection (NP, DP, PP). It also constrains diachronic change. It has a number of interesting and far-reaching consequences. For one thing, it entails that, in mixed systems at least, head-final order is more constrained than head-initial order, and, in that sense, it is also more marked than head-initial order. It is therefore pertinent to the controversial question whether one order is more “basic” than the other (as debated by, among others, Kayne (1994, 2000, 2013) and Haider (1992, 1995, 1997a,b,c, 2000, 2012)). We will argue that FOFC thus provides support for (a version of) Kayne’s (1994) Linear Correspondence Axiom (LCA), one notorious consequence of which is that head-final order is derivationally more complex than head-initial order, and, in that sense, more marked.
As we will show, FOFC explains, or is part of the explanation of, a range of crosslinguistic generalizations, including the following:
The higher a head is in an extended projection, the less likely it is to be head-final. Thus, for example, OV order is more common crosslinguistically than clause-final complementizers (this follows from the fact, documented in section 2.3, that final Cs are not found in VO languages, but initial Cs are found in both VO and OV languages).
In diachronic word order change, change from head-final to head-initial starts at the top of the extended projection; for example, change from OV to VO is preceded by change from VP-T to T-VP, which is preceded by change from TP-C to C-TP. Conversely, change from head-initial to head-final starts at the bottom of the extended projection.
In OV languages that have embedded finite clauses with an initial complementizer, the embedded clause is always extraposed.
We will propose a formal account of FOFC in terms of Kayne’s (1994) Linear Correspondence Axiom (LCA) combined with the hypothesis advanced by Chomsky (2000, 2001) that all movement is triggered by a special feature on heads.
The LCA (adapted from Kayne 1994)
α precedes β if and only if α asymmetrically c-commands β, or if α is contained in γ, where γ asymmetrically c-commands β.
According to the LCA, head-final order must be derived by movement of the complement to a position asymmetrically c-commanding the head. If movement is always triggered by a feature (an “EPP-feature” or an “edge feature”; Chomsky 2001, 2007, 2008), the head in a head-final phrase must have a feature triggering movement of its complement, which a head-initial counterpart does not have. We can then understand FOFC as being an effect of “spreading” of this movement-triggering feature from head to head along the spine of an extended projection to a designated head, which may be the highest head in the extended projection (yielding a harmonically head-final tree), but need not be (yielding a partially harmonic tree). We will argue that the domain of FOFC is, indeed, the extended projection, roughly in Grimshaw’s (1991, 2001, 2005) sense. This conclusion motivates the relevance of Grimshaw’s notion for the investigation of universals, and arguably the concomitant notion that syntactic categories should be decomposed into features.
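The effect of feature spreading can be illustrated with a small enumeration. The sketch below assumes, for illustration only, that the movement-triggering feature starts on the lowest head of the extended projection and spreads upward without skipping any head, stopping at some designated head (or being absent altogether). The function names and the three-head spine V, T, C are hypothetical simplifications, not the formal analysis developed in section 4.

```python
from itertools import product


def orders_from_spreading(spine):
    """spine: heads listed bottom-up. Each generated pattern marks the k lowest
    heads as bearing the movement-triggering feature (hence head-final) and
    all higher heads as lacking it (hence head-initial)."""
    patterns = []
    for k in range(len(spine) + 1):          # k = 0: no head bears the feature
        patterns.append(
            {h: ("final" if i < k else "initial") for i, h in enumerate(spine)}
        )
    return patterns


def has_final_over_initial(pattern, spine):
    """True if some head-final head sits immediately above a head-initial one."""
    return any(
        pattern[spine[i]] == "initial" and pattern[spine[i + 1]] == "final"
        for i in range(len(spine) - 1)
    )


spine = ["V", "T", "C"]                      # bottom-up, assumed for illustration
generated = orders_from_spreading(spine)

# All logically possible headedness assignments, for comparison.
logically_possible = [
    dict(zip(spine, choice)) for choice in product(("initial", "final"), repeat=len(spine))
]
excluded = [p for p in logically_possible if p not in generated]

for p in generated:
    print("generated:", p)
for p in excluded:
    print("excluded: ", p, "final-over-initial:", has_final_over_initial(p, spine))
```

On these assumptions, every pattern that spreading cannot generate contains a head-final head immediately above a head-initial one, and no generated pattern does.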
The article is organized as follows: in section 2, we present FOFC and provide the principal empirical motivation for it; in section 3, we present and account for certain apparent counterexamples; and in section 4, we present our theory of linear order and show how FOFC can be derived from it, applying the analysis to the data introduced in sections 2 and 3. Section 5 concludes the article.
2 The Final-over-Final Constraint (FOFC)
As mentioned in section 1, our key proposal is that FOFC is a universal. The import of the formulation of FOFC in (1) is that it rules out structures like (3), repeated here, where αP is the complement of β and γP is the complement of α, and α and β are part of the same extended projection. Note that, if head-final orders are derived in the manner described above, αP may have moved to its position in (3), leaving a copy to the right of β.
(3) *[βP . . . [αP . . . α γP] β . . . ]
Our principal empirical claim, then, is that configurations instantiating the schema in (3) are not found in the world’s languages. We now present the evidence for this.
2.1 *[V O] Aux
2.1.1 *[V O] Aux in Germanic
Our initial observation comes from comparative Germanic. Looking across Germanic varieties, both synchronically and diachronically, we observe a very wide range of word orders, particularly at the clausal level and in VP. If we consider the three elements Aux, V, and O, we find all possible permutations of these, with one very striking exception: the order V-O-Aux is not found. This fact has often been noted (see, e.g., Travis 1984:157–158, Den Besten 1986, Pintzuk 1991, 1999, Kiparsky 1996:168–171, Hróarsdóttir 1999, 2000, Fuss and Trips 2002). Given the analysis [AuxP [VP V O] Aux], this construction violates FOFC, with α = V and β = Aux (whether or not VP is in a derived position here). Let us look at the various permutations one by one.
First, we readily find the order O-V-Aux ((John) the book read has). (5) illustrates this from German, setting main clauses aside, as is standard practice, in order to avoid the confound introduced by the verb-second (V2) phenomenon.
(5) . . . dass Johann das Buch gelesen hat.
that Johann the book read.PART has
‘ . . . that Johann has read the book.’
This order is found, primarily in subordinate clauses of various kinds, in German, Dutch, Afrikaans, Yiddish, all German, Dutch/Flemish, and Afrikaans dialects, Old English, and Old Norse. It is usually thought to derive from head-final order in both AuxP and VP, and thus respects FOFC.
Second, we find Aux-V-O ((John) has read the book). This is the head-initial order, different variants of which are found in Modern English and throughout Modern North Germanic. Given the standard analysis [AuxP Aux [VP V O]], it respects FOFC trivially. It is also found in Yiddish (Santorini 1992), colloquial Afrikaans, and older Germanic varieties (see, e.g., van Kemenade 1987 and Pintzuk 1991, 1999 on Old English, Schallert 2010 and Sapp 2011 on earlier German, and Hoeksema 1993 on Middle Dutch).
(6a) . . . oyb dos yingl vet oyfn veg zen a kats.
whether the boy will on.the way see a cat
‘ . . . whether the boy will see a cat on the way.’
(6b) . . . þæt he mot ehtan godra manna.
that he might persecute.INF good men
‘ . . . that he might persecute good men.’
We may assume, for now, that the different word orders are base-generated. We will return in section 4 to the derivation of the different structures/orders and the role of FOFC in these derivations.
Third, we find the order Aux-O-V ((John) has the book read).
(7a) . . . da Jan wilt een huis kopen.
that Jan wants a house buy.INF
‘ . . . that Jan wants to buy a house.’
(7b) . . . das de Hans wil es huus chaufe.
that the Hans wants a house buy.INF
‘ . . . that Hans wants to buy a house.’
(7c) . . . þæt hie mihton swa bealdlice Godes geleafan bodian.
that they could so boldly God’s faith preach.INF
‘ . . . that they could preach God’s faith so boldly.’
(The Homilies of the Anglo-Saxon Church I 232; van Kemenade 1987:179)
This order is also found in Middle Dutch (Hoeksema 1993), Old High German (Behaghel 1932), Old Norse (Hróarsdóttir 1999:203ff.), and numerous nonstandard varieties of Dutch, Swiss and Austrian German, and Afrikaans (see Schmid 2005 and Wurmbrand 2006 for discussion and overview). Note that we appear to have the inverse of the configuration excluded by FOFC here, in that we plausibly have a head-initial AuxP (with the order Aux > VP) and, as complement to the Aux, a head-final VP. Initial-over-final structures are readily attested, then, while final-over-initial structures, schematized in (3), are not. This is the central asymmetry that we observe.
Fourth, we find the order O-Aux-V ((John) the book has read).
(8a) . . . dat Jan het boek wil lezen.
that Jan the book wants read.INF
‘. . . that Jan wants to read the book.’
(8b) . . . þe æfre on gefeohte his handa wolde afylan.
who ever in battle his hands would defile.INF
‘ . . . whoever would defile his hands in battle.’
(Ælfric’s Lives of Saints 25.858; Pintzuk 1991:102)
This order is also found in all variants of Afrikaans and many nonstandard West Germanic varieties, but not in Standard German. Whatever the precise analysis of these sentences, there is no reason to think that they violate FOFC here; to our knowledge, no derivation featuring an intermediate or initial V-O-Aux order has ever been proposed for examples like these.
A rarer but still attested order is V-Aux-O ((John) read has the book). This has often been described as “object extraposition” (see, e.g., Reuland 1981, Den Besten and Rutten 1989). Here we illustrate with “PP extraposition” in colloquial Afrikaans and Old English:
(9a) . . . dat hy die boek gegee het vir sy suster.
that he the book give.PART has for his sister
‘ . . . that he gave the book to his sister.’
(9b) . . . þæt ænig mon atellan mæge ealne þone demm.
that any man relate.INF can all the misery
‘ . . . that any man can relate all the misery.’
(Orosius 52.6–7; Pintzuk 2002:283, (16b))
Where the “extraposed” element is a CP, this order is obligatory in German, Dutch, and Afrikaans (see also section 2.3). It is also found in Old Norse (Hróarsdóttir 1999:201–202) and in earlier German and Dutch (see, e.g., Hoeksema 1993, Bies 1996, Sapp 2011). Whether we derive this in the traditional fashion, by extraposition of the complement of V (see, e.g., Evers 1975, Rutten 1991), or by a succession of leftward movements, including possibly remnant VP-movement, from a head-initial underlying structure (see, e.g., Zwart 1997, Hinterhölzl 2005), no analysis has been proposed that would not respect FOFC.
At first sight, then, it seems that all possible word orders are found, and that, across the range of varieties, synchronically and diachronically, anything goes. But this is not the case. As noted at the start of this section, the crucial observation is that V-O-Aux is not attested. The missing order is the one that instantiates the FOFC schema in (3) for α = V, β = Aux. In other words, the missing configuration is that in (10).
Whether derived or base-generated, the structure in (10) is not found in Germanic.
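The survey of permutations in this section can be summarized mechanically. The following Python sketch simply enumerates the six linear orders of Aux, V, and O and subtracts those illustrated above; the set name `ATTESTED` is introduced here for illustration. Since FOFC is a structural constraint, a comparison of surface strings like this one is only a rough bookkeeping device and does not replace the structural analysis.

```python
from itertools import permutations

# Orders illustrated in this section for Germanic varieties.
ATTESTED = {
    ("O", "V", "Aux"),    # first order discussed (German)
    ("Aux", "V", "O"),    # second order (Modern English, North Germanic)
    ("Aux", "O", "V"),    # third order (verb projection raising type)
    ("O", "Aux", "V"),    # fourth order (Dutch, Old English)
    ("V", "Aux", "O"),    # rarer order, described as extraposition
}

# All logically possible linear orders of the three elements.
all_orders = set(permutations(("Aux", "V", "O")))

# The orders that are logically possible but not illustrated as attested.
missing = all_orders - ATTESTED
print(missing)
```

The only order left over is V-O-Aux, the string that corresponds to the configuration in (3) with α = V and β = Aux.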
In the cases we have illustrated in this section, FOFC is “surface-true” in the sense that strings made up of a head followed by its complement followed by a higher head are ruled out. We will show in section 2.2 that this is not always the case, which is not surprising given that FOFC is a structural constraint, not a constraint on surface word order.
2.1.2 *[V O] Aux in Finnish
The absence of V-O-Aux order is not restricted to Germanic. The same gap exists in Finnish (Holmberg 2000:128), Northern Saami (Marit Julien, pers. comm.), Basque (Haddican 2004:116), and Late Latin, all languages that exhibit VO as well as OV order. We include a discussion of Kaaps here as well, as a Germanic variety that also exhibits VO and OV order in basically the same contexts (independently of V2).
Holmberg (2000:128) shows that Finnish is basically Aux-V-O. But, under specific conditions, where the matrix C is [+focus] or [+wh], OV order is permitted. (See Holmberg 2000 and section 4.6 for more details.) Furthermore, under those conditions, O-V-Aux is permitted as an alternative to Aux-O-V. That is to say, the auxiliary may precede or follow the VP. However, V-O-Aux is never allowed. The paradigm is illustrated in (11) (from Holmberg 2000:125).
That is to say, the structure that violates FOFC is ruled out.
2.1.3 *[V O] Aux in Basque
Haddican (2004:116) observes the absence of FOFC-violating V-O-Aux structures in Basque.
2.1.4 *[V O] Aux in Kaaps
Biberauer, Sheehan, and Newton (2010) have shown that V-O-Aux orders are also unattested in language contact situations. For example, in South Africa there is extensive contact between Afrikaans, an OV language with head-final order in IP and VP, and English, which of course has head-initial IP and VP. In the variety most heavily influenced by English, Kaaps, spoken by the so-called Coloured population in the Cape, we find a range of possible orders in subordinate clauses (where V2 is generally inoperative). However, the one order that we do not find is V-O-Aux.
2.1.5 *[V O] Aux in Latin
Latin is generally analyzed as an OV language with rather free word order, especially in literary classical texts (Harris 1978, Vincent 1988, Pinkster 1990, Salvi 2004, Devine and Stephens 2006, Clackson and Horrocks 2007, Ledgeway 2012). Given the fairly synthetic nature of its verbal morphology, it is uncertain that Latin had auxiliaries. However, one candidate construction is the perfect form of passives and deponents, formed from the perfect participle of the verb and the auxiliary esse ‘to be’. Ledgeway (2012:255) specifically notes the dearth of auxiliaries in Latin and the concomitant difficulty of testing FOFC in this domain; he also notes, following Adams (1994a,b), that esse acts like a clitic in some respects, tending to appear in second position in the clause or enclitic to the negator non. This clearly adds a further difficulty, but let us nevertheless make the assumption that Latin esse was indeed an auxiliary from at least the Classical period (1st century BCE – end of the 1st century CE) onward.
Citation forms of the perfect tenses of passives and deponents generally appear in head-final order (e.g., amatus sum, loved-NOM.SG.M be.1SG, ‘I have been/was loved’), as would be expected for an OV language. However, it is clear from the manuals (Ernout and Thomas 1953, Kühner and Stegmann 1955, Gildersleeve and Lodge 1997, Salvi 2004, Devine and Stephens 2006, Ledgeway 2012) and also from Danckaert 2012a,b that the reverse order was also possible. Given that the class of deponents included transitives (e.g., sequor ‘follow’, hortor ‘encourage, urge’, and minor ‘threaten’), that Latin allowed impersonal passives in which the logical direct object could appear in the accusative (see Keenan 1985, Keenan and Dryer 2007), and that A-movement of the object was not obligatory in passives, as in Modern Italian (Burzio 1986), V-O-Aux, where O is a logical object, bearing either accusative or nominative case, was clearly a possibility. Danckaert (2012b:3), however, highlights a surprising fact about the attestation of these structures: although Late Latin (200 CE onward) appears to conform to FOFC, exactly as the Germanic languages, Finnish, and Basque do, Classical Latin does appear to exhibit a low, yet nonnegligible, level of V-O-Aux. This is surprising since literary Classical Latin is generally believed to have been fundamentally OV (see Ledgeway 2012:225–235 for detailed discussion and references, including the suggestion that OV orders may have been a conservative feature of this variety of Latin consciously used by certain writers, notably Caesar), with VO ordering only becoming systematically available as a neutral order during later eras. We would therefore expect V-O-Aux ordering to be common in later rather than earlier Latin. The fact that the reverse is true strongly suggests that the V-O-Aux structures found in the Classical period may be of a “special” type not instantiating the schema in (3); we return to these structures in section 2.2.
Here we note the diachronically very significant fact that Late Latin, a variety used at a stage during which VO order was very common, but during which head-final esse survived (see Danckaert 2012b:37–38), does not appear to permit V-O-Aux structures.
Our first piece of evidence for FOFC, then, stems from the crosslinguistic absence of V-O-Aux order, notably in the mixed systems of the West Germanic languages (including Kaaps), Old Norse, Western Finno-Ugric languages, and Basque; this order also appears to be largely absent in Latin. Since this word order instantiates the FOFC-violating structure (3), if FOFC holds as a universal, we understand why this order is absent.
FOFC also accounts for the fact that languages that in principle have the means to violate this constraint nevertheless do not do so (see the discussion in following sections for further illustrations of this fact). In the context of V-O-Aux structures, Holmberg (2000:134–135), drawing on the typological research presented by Dryer (1992), discusses the crosslinguistic distribution of the form expressing volition (‘want’) in relation to V and O in VP. He shows that only four languages at first sight appear to permit the FOFC-violating [VO]-WANT order. Upon closer inspection, however, it emerges that these languages also permit OV orders in certain contexts and that VO plus final WANT strings appear not to occur. Even in languages that exhibit the means to potentially violate FOFC, we do not observe FOFC violations, then.
Languages like Finnish or Kaaps, which allow both head-initial and head-final orders within vP/VP, are also particularly telling since they show clearly that FOFC is not just a typological fact (i.e., a fact about the crosslinguistic distribution of a particular order or structure), but a constraint that is active in speakers’ I-languages: FOFC violations are systematically avoided in a derivation that might otherwise be expected to allow them. Thus, in Finnish and Kaaps VO and OV order both occur, as do Aux-VP and VP-Aux, but the FOFC-violating combination [VO] Aux is systematically avoided.
Crosslinguistically, then, a mix of patterns is found, and, notably, disharmonic orders of the “verb projection raising” (Aux-O-V) type are attested; but the mirror image of verb projection raising seems to be missing. This kind of typological gap is striking, especially when attested in unrelated families, and calls for an explanation. FOFC is not an explanation, but at least it subsumes this gap under a broader generalization.
2.2 Apparent Cases of [V O] X
As mentioned in section 2.1.1, in the cases we have discussed, FOFC is “surface-true” in the sense that strings made up of a head followed by its complement followed by a higher head are ruled out. This is not always the case, though. One apparent counterexample to FOFC comes from structures in Germanic involving “low” final negation, like (15a–b).
(15a) Du verstehst mich (einfach) nicht.
you understand me simply not
‘You (simply) don’t understand me.’
(15b) Jag såg den inte.
I saw it not
‘I didn’t see it.’
These examples look as though they instantiate the order V-O-Neg. If Neg is a head (as often thought since Pollock 1989), then this might seem to be a FOFC violation in that the verb and object, in that order, precede the Neg head. However, it is generally agreed that these structures feature a combination of verb movement (most likely to C, in order to meet part of the V2 requirement) and object shift out of VP to the left of negation. Thus, the verb and the object move separately and to separate target positions, with V moving to C (see Den Besten 1983) and the object to Spec,vP (Chomsky 2001). Other analyses of these operations are of course possible, but the point for our purposes is that there are strong arguments that V and the object do not form a single phrasal constituent in their derived positions, and that the negative is located in a lower hierarchical position than that occupied by these elements, and so FOFC does not apply here. V-O-Neg structures are not uniformly of the Germanic type, though, a point to which we return in section 3.2.
We now return to the apparent literary Classical Latin counterexamples to the generalization that *V-O-Aux is universal that were briefly mentioned in section 2.1.5. Kühner and Stegmann (1955:603) state that the regular head-final order V-Aux is typically found in the perfect tenses of passives and deponents in early Latin up to Plautus (2nd century BCE). In Classical Latin, particularly the writings of Cicero, however, the surface subject (i.e., the underlying object) can appear between V and Aux.
(16a) . . . adducta quaestio est.
adduced.NOM.F.SG question.NOM.F.SG is
‘ . . . the question has been adduced.’
(Kühner and Stegmann 1955:603)
(16b) . . . secuti eum sunt admodum quingenti Cretenses.
follow.NOM.M.PL 3SG.ACC be.3PL.PRES fully/just about 500 Cretans
‘ . . . about five hundred Cretans followed him.’
(Livy, ab urbe condita 10:24.10; Lieven Danckaert, pers. comm.)
(16c) . . . damnetur is qui
condemn.PRES.SUBJ.PASS.3SG that.3SG.NOM who.3SG.NOM
fabricatus gladium est
manufactured.NOM.M.SG sword.ACC be.3SG.PRES
‘ . . . he should be condemned, who manufactured the sword.’
(Cicero, pro Rabirio 7; Danckaert 2012b:28, (42))
In some of these cases, such as (16b), it is plausible that the participle has been fronted into the left periphery (a productive option, as Ledgeway (2012) shows; given the highly articulated left periphery and the productive fronting options available to Classical Latin, documented by Ledgeway, this fronting does not necessarily entail that the participle is in a derived surface-string-initial position). In this case, the configuration in (3) plausibly fails to arise because a head-initial VP has fronted into the CP domain, where Latin CP is unambiguously head-initial (see Ledgeway 2012:150–158 for discussion and references); head-initial VP, then, is dominated by head-initial CP. The same is true for Sardinian focus-fronting structures like (17a) and, in the nominal domain, for possessive structures like English (17b), regardless of whether the possessor DP, the girl from next door, was first-merged in Spec,DP or whether it moved there.
(17a) Tunkatu su barkone asa!
shut the window have.2SG
‘It’s shut the window you have!’
(17b) the girl from next door’s smile
That Ā-fronting cases of the type illustrated here, where a head-initial phrase fronts into the specifier of a head-initial phrase, do not violate (3) will become clearer when we present our formal analysis of the constraint resulting in (3) (see section 4).
Returning to the Latin examples in (16): (16a) and (16c) would both be FOFC violations if they involved a head-initial VP dominated by a head-final AuxP, but, again, there is evidence that this is not the structure underlying these examples. This emerges most clearly if we consider the transitive deponent case (16c), where V and O do not bear the same case; more specifically, V is nominative-marked, whereas O is accusative-marked. Taking into account standard Minimalist assumptions about locality and Case assignment, this pattern is not possible if V and O both remain in situ: in this case, the expectation would be that v would undergo Agree with both V and O, leading us to expect accusative marking on both elements in (16c)-type structures. If V raises out of VP, however, T may probe V, resulting in nominative case marking. On the assumption that participle placement is a parametrically defined and thus language-internally constant property (see, e.g., Caponigro and Schütze 2003, and, more generally, work following from Emonds 1976, Pollock 1989), we would expect Latin participles always to undergo raising out of VP. It is therefore plausible to analyze the structures in (16) as involving a head-final VP dominated by a head-initial participle-hosting verbal projection, that is, an instance of the inverse FOFC disharmonic structure in (2c). What still needs to be understood, though, is why this head-initial verbal structure may be dominated by a head-final Aux, instantiated by forms of esse in (16).
Remberger (2012) argues convincingly on both diachronic and synchronic grounds that the -t- component of Latin participles is properly nominal, and that Latin participles therefore were also nominal. (18), adapted from Remberger 2012:288, (36), illustrates the proposed analysis.
In terms of this analysis, then, Latin participles have a verbal “core,” dominated by a nominal functional domain; where they are dominated by a head-final tense-marking element as in (16a) and (16b), therefore, (3) is not violated since V (= Part) is nominal, whereas the functional structure dominating it is verbal and therefore the two elements are not part of the same extended projection in the sense of Grimshaw 1991, 2001, 2005. What this predicts is that languages in which participles combining with auxiliaries can independently be shown to constitute nominal entities will permit superficially FOFC-violating V-O-Aux structures. In this connection, we can note that there is no reason to take West Germanic participles to be nominal, and in fact if the participial prefix ge- is verbal, then we have direct evidence that the perfect/passive participles in these languages are verbal (see section 3.3 for further discussion of this point). Hence, the absence of Part-O-Aux order with perfect/passive participles in West Germanic comes under the general absence of V-O-Aux in these languages, as documented in section 2.1.1.
2.3 The Crosslinguistic Distribution of Complementizers
Our second piece of evidence for FOFC also concerns clause-level syntax. This is the observation, originally due to Hawkins (1990:256–257), that sentence-final complementizers are not found in VO languages (see also Dryer 1992:102, 2009a:199–205, Kayne 2000:320–321, Hawkins 2004). Crosslinguistically, we find OV languages with both initial and final complementizers. Latin is generally taken to be an OV language (see the references given above) and has initial complementizers, as (19a–b) show (taking ut and quod to be complementizers).
(19a) Ubii Caesarem orant [CP ut sibi parcat].
Ubii.NOM Caesar.ACC beg.3PL.PRES C selves.DAT spare.3SG.SUBJ.PRES
‘The Ubii beg Caesar to spare them.’
(19b) Accidit perincommode [quod eum nusquam vidisti].
happened.3SG.PERF unfortunately C him nowhere saw.2SG.PERF
‘It is unfortunate that you didn’t see him anywhere.’
(see Roberts 2007:162–163 for sources and discussion)
On the other hand, Japanese is an OV language with final complementizers.
(20) Bill-ga [CP[TP Mary-ga John-ni sono hon-o watasita] to] itta (koto).
Bill-NOM Mary-NOM John-DAT that book-ACC handed that said (fact)
‘Bill said that Mary handed that book to John.’
(Fukui and Saito 1998:443)
And of course we can readily find VO languages with initial complementizers, English being an example.
But the fourth logical possibility, VO languages with final complementizers, appears not to be attested. The World Atlas of Language Structures (WALS; Dryer and Haspelmath 2011) does not have a specific feature for the order of general clausal subordinators in relation to the clause they introduce, and so we cannot directly look for evidence there. However, the order of “adverbial subordinators” such as although, when, while, and if in relation to the clauses they introduce is covered (Map 94A; Dryer 2011a). It is possible that some, if not all, of these elements are complementizers; however, Dryer (2011a) is explicit on the point that “care was taken not to include general markers of subordination.” Nonetheless, in the languages investigated, the skewing is evident. Combining Map 94A with Map 83A (Dryer 2011b), 305 languages have VO and initial subordinators and 91 have OV and final subordinators. Placing the subordinators in C, then, we observe cross-categorial harmony in the majority of cases. More importantly for our purposes, there is a very clear asymmetry in the disharmonic orders: 61 languages have OV and initial subordinators, but only 2 are said to show the combination of final subordinators with VO: Buduma (Afro-Asiatic) and Guajajara (Tupi-Guaraní). Dryer (2011b) also notes that subordinating suffixes are found, particularly in OV languages. In fact, there is only 1 VO language with subordinating suffixes (the Australian language Yindjibarndi), as against 56 OV languages with subordinating suffixes. In this respect, then, the data are significantly skewed. The few counterexamples clearly require closer investigation, but the overall asymmetry in the distribution of logically possible combinations of orders is clear.
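The size of the skewing can be made explicit with some simple arithmetic. The Python sketch below uses only the four counts reported above for the combination of Maps 94A and 83A; the variable names are introduced here for illustration, and the calculation is a descriptive summary of proportions, not a statistical test.

```python
# Counts as reported in the text (verb-object order, subordinator position).
counts = {
    ("VO", "initial"): 305,   # harmonic, head-initial
    ("OV", "final"): 91,      # harmonic, head-final
    ("OV", "initial"): 61,    # disharmonic, initial-over-final
    ("VO", "final"): 2,       # disharmonic, final-over-initial
}

total = sum(counts.values())
harmonic = counts[("VO", "initial")] + counts[("OV", "final")]
disharmonic = total - harmonic

# Share of languages in the sample showing cross-categorial harmony.
print(f"harmonic share: {harmonic / total:.1%}")

# Share of the disharmonic languages showing the final-over-initial pattern.
print(f"final-over-initial among disharmonic: {counts[('VO', 'final')] / disharmonic:.1%}")
```

The second figure is the relevant one for FOFC: it expresses how rarely the final-over-initial combination is reported among the disharmonic languages in this sample.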
This instantiates the schema in (3) for α = V and β = T, and so constitutes a FOFC violation of the same type as the V-O-Aux orders considered in the previous section.
Alternatively, we could have a head-initial TP inside a head-final CP, as in (22).
This structure instantiates (3), and hence violates FOFC, for α = T and β = C. This is the second piece of evidence in favor of FOFC.
In this connection, we can in fact make a further observation, which constitutes the basis of another piece of evidence for FOFC: OV languages with initial complementizers systematically extrapose their CP complements (the significance of this in relation to FOFC was first pointed out in Sheehan 2008). This is apparent in the Latin examples in (19), where the subordinate CP is in postverbal position, apparently the typical order in Latin (see Devine and Stephens 2006:124–125, Ledgeway 2012:242ff.). It is also true of German, where finite CPs must be postverbal, while raising complements, which we analyze as TPs (following Chomsky 1981), are not (see Biberauer and Roberts 2008).14
Er weiß, dass sie kommen.
he knows that they come.PL
‘He knows that they’re coming.’
. . . dass Hans sich zu rasieren schien.
that Hans self to shave seemed
‘ . . . that Hans seemed to shave himself.’
The same is true in many other OV languages, such as Afrikaans, Bengali, Dutch, Hindi, Iraqw, Mangarrayi, Neo-Aramaic, Persian, Sorbian, and Turkish (see Biberauer, Sheehan, and Newton 2010 for further details and exemplification; Dryer 2009a). This oddity of the word order appears to be a FOFC-compliance strategy. If the head-initial CP were to appear in the complement position of the head-final V, we would have a structure like (24).
This structure violates FOFC for α = V and β = C. In fact, as noticed by Koptjevskaja-Tamm (1988, 1993) (see also Givón 2001), OV languages tend to have either postverbal finite CP clausal complements, or preverbal nominalized clauses. As mentioned in section 2.2, preverbal nominalizations are exempt from FOFC since the nominalization forms a distinct extended projection from that headed by V; we will return to this point in more detail in sections 3 and 4. A third possibility, not noticed by Koptjevskaja-Tamm, is a preverbal finite clausal complement with a final complementizer: this of course is a harmonic order and therefore FOFC-compliant. It seems, therefore, that there is a crosslinguistic conspiracy to avoid the structure in (24): even languages that, in principle, have the means to violate FOFC do not do so (also see Holmberg 2000).
A related point, which may be of some importance, is that failure to overtly realize the complementizer does not appear to be a strategy facilitating (nonscrambled; see footnote 14) preverbal head-initial CPs (Josef Bayer, pers. comm.). This can be seen in Hindi, a language that, like English, allows complementizers to delete (see Bayer 2001:15), as shown in (25).
He knows (that) they are coming.
usee (yah) maluum hai [ki vee aa rahee haiN].
3SG.DAT this known is that 3PL come PROG are
‘He/She knows that they are coming.’
*usee [(ki) vee aa rahee haiN] maluum hai.
3SG.DAT that 3PL come PROG are known is
‘He/She knows that they are coming.’
Question particles initially also appear to be good candidates for C-elements. As we will demonstrate in section 3.2, though, VO languages quite readily allow a range of final particles that appear to instantiate C-related categories, including question particles and force particles of various kinds, apparently violating FOFC.
Coordinating conjunctions are another kind of clause introducer. Zwart (2005) investigates 214 languages and finds no ‘‘true’’ final coordinating conjunctions in head-initial languages (see his table 3; also see Zwart 2009). This is consistent with our general claim here.
The absence of V-O- . . . C orders crosslinguistically is our second piece of evidence for FOFC.
2.4 The Nominal Domain
Turning from the clausal to the nominal domain, we find three further pieces of evidence for FOFC, two direct and one indirect. The direct evidence comes from Finnish nominals and Latin gerunds (see also Holmberg 2000 on the former, and Ledgeway 2012 on the latter). We look at Finnish first.
As a predominantly head-initial language, Finnish has postnominal complements and adjuncts, including relative clauses. Finnish has postpositions, though. An NP consisting of a noun with a PP complement or adjunct will typically look like (26) or (27), respectively.
käynti nurkan takana
visit corner.GEN behind
‘the/a visit around the corner’
[NP käynti [PP nurkan takana]]
raja maitten välillä
border countries between
‘the/a border between the countries’
[NP raja [PP maitten välillä]]
Some Finnish adpositions can be either prepositions or postpositions. This is the case with yli ‘across’.15
Both: ‘across the border’
yli [rajan maitten välillä]
across border countries between
‘across the border between the countries’
*[rajan maitten välillä] yli
border countries between across
Taking adpositions to be in the same extended projection as their nominal complements, at least in languages like Finnish, this is an effect of FOFC: the head-final PP must immediately dominate a head-final phrase, but the NP complement of yli in (29b) has a postnominal complement.17
maitten välinen raja
countries between.ADJ border
‘the/a border between the countries’
This NP may be combined with a preposition or a postposition, such as yli ‘across’.
yli [maitten välisen rajan]
[maitten välisen rajan] yli
Both: ‘across the border between the countries’
Finnish has both prepositions and postpositions.
N can precede a complement or adjunct PP.
When it does, the NP cannot itself be the complement of a postposition, because of FOFC.18
The violation can be rescued by fronting the complement or adjunct.
This is, then, another case showing that FOFC is a constraint that is active in speakers’ I-languages: the FOFC-violating combination [[N PP] P] is systematically avoided, even though the building blocks, a head-initial NP and a head-final PP, are in frequent use. Furthermore, the violation can be avoided in a given derivation, by shifting the complement of the noun to the left.
Our second piece of evidence for FOFC in nominals comes from Latin. Following Elerick (1994), Ledgeway (2012:252–255) shows that gerundive complements to nouns or prepositions strongly tend to follow either harmonic order, that is, either [N/P [VGer O]] or [[O VGer] N/P], although the non-FOFC-violating disharmonic order [N/P [O VGer]] is also attested. On the other hand, the FOFC-violating disharmonic order [[VGer O] N/P] is hardly attested at all (see Ledgeway’s table 5.7, p. 252). Assuming that the gerundive V is really a nominal element, it will be located in the same extended projection as N/P, and therefore this configuration falls under FOFC as we have formulated it.
The third piece of evidence for FOFC in nominals is more indirect, stemming from Cinque’s (2005) account of Greenberg’s Universal 20. Greenberg (1963:87) states his Universal 20 as follows: ‘‘When any or all of the items (demonstrative, numeral and descriptive adjective) precede the noun, they are always found in that order. If they follow, the order is either the same or its exact opposite.’’ Thus, Greenberg states that these adnominal elements appear in the following orders in relation to each other and to N:
(32) a. Dem > Num > A > N
b. N > Dem > Num > A
c. N > A > Num > Dem
For simplicity, let us disregard adnominal APs, and hence the relative order of A and N (AP is probably not a unified category; see Cinque 2005:315–316n2, 2010). (32a) will then have the following harmonically head-initial structure, assuming that Dem and Num are heads (see also Shlonsky 2004:1482):
(33) [DemP Dem [NumP Num NP]]
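The mirror-image order (32c) will correspondingly have the harmonically head-final structure in (34).
(34) [DemP[NumP NP Num] Dem]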
The order (32b), according to Cinque (2005:319), occurs in ‘‘few languages.’’ Another order, not acknowledged by Greenberg, but that also, according to Cinque, occurs in ‘‘few/very few languages’’ is (35).19
(35) Dem > N > Num
These orders all observe FOFC. Consider, however, the following word order, attested in ‘‘few languages’’ according to Cinque (2005:320) (see also Biberauer et al., to appear, for further discussion of these languages):
(36) *Num > N > Dem
This word order could be derived by NumP-movement to Spec,DemP, taking the order in (33) to be first-merged.
(37) [DemP [Num NP] [Dem t]]
But this is apparently not a licit derivation. We can now answer Cinque’s (2005:325) question: ‘‘Why is movement of phrases other than NP unavailable?’’ This movement leads to a FOFC violation, in that the derived structure in (37) instantiates (3) for α = Num and β = Dem, and taking Num and Dem to belong to the same extended projection.20
Cases such as (38a–b) may, however, appear to pose counterexamples to Universal 20 and FOFC.
y tair plaid arall hyn
the three parties other these
‘these three other parties’
na trì leabhraichean mòra seo
the.PL three books big.PL this
‘these three big books’
(David Adger, pers. comm.)
This order is also found in Semitic and other languages, and Num > A > N > Dem, with the possibility of an initial determiner element, is found in various creoles (Bislama, Berbice Dutch, Sranan; see Haddican 2002, cited in Cinque 2005:320nn15, 16; see also Biberauer et al., to appear). If these orders are derived by NumP-movement to Spec,DemP, they will violate FOFC. Note, however, that they feature an initial definite determiner. In many, perhaps most, languages with both determiners and demonstratives, Dem is in complementary distribution with D (consider English, for example): Dem occupies either D or Spec,DP. However, consideration of a wider range of languages shows that Universal Grammar makes available at least two Dem positions, one high (D or Spec,DP), and one a low postnominal position, which is what we see in (38a–b). Universal 20, we assume, concerns the high Dem/D position (see Biberauer et al., to appear, and the references given there). The Celtic languages and the others mentioned after (38) have ‘‘low’’ demonstratives, and so the structure of (38) is as in (39), rather than (33), with N or NP raising at least to Spec,nP, or perhaps higher depending on the position of the AP.
(39) [DP D [NumP Num [nP Dem n NP]]]
2.5 Diachronic Evidence
FOFC is a constraint on synchronic grammars. Since we take it to represent a universal constraint on synchronically possible word orders, we predict that no system can change into a FOFC-violating system. FOFC-violating systems fall outside the range of possible outcomes of syntactic change. This is a consequence of the general fact that, as Kiparsky (2008:23) puts it, ‘‘If language change is constrained by grammatical structure, then synchronic assumptions have diachronic consequences.’’
More specifically, if FOFC is an absolute universal, then word order change must proceed along certain pathways. Change from head-final to head-initial order in the clause must go ‘‘top-down,’’ in that CP must be affected first, followed by TP, followed by VP, as in (40).
(40) [[[O V] T] C] → [C [[O V] T]] → [C [T [O V]]] → [C [T [V O]]]
Conversely, head-initial to head-final change must go ‘‘bottom-up,’’ starting at VP before affecting TP, and then affecting TP before affecting CP.
(41) [C [T [V O]]] → [C [T [O V]]] → [C [[O V] T]] → [[[O V] T] C]
Any other sequence of changes in either case will lead to an intermediate synchronic system that violates FOFC. For example, consider what would happen if, starting from a uniformly head-final system like the first one shown in the series in (40), VP changed headedness first. This would give rise to a [[[V O] T] C] system; as we showed in sections 2.1.1 and 2.1.2, such systems are not found. If they were possible outcomes of natural processes of change, presumably such systems would be found; FOFC explains their absence synchronically and, therefore, diachronically. Furthermore, as we showed in section 2.1.4, even contact-induced change does not bring about FOFC-violating structures.
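The reasoning behind (40) and (41) can be checked mechanically. The following is a minimal sketch under simplifying assumptions of our own: the clause is reduced to three heads (C, T, V) in a single extended projection, each head is either uniformly initial or uniformly final, and change proceeds one head at a time. The function names are purely illustrative.

```python
from itertools import permutations

# Heads ordered from structurally highest to lowest (a simplification:
# only three heads, all in the same extended projection).
HEADS = ("C", "T", "V")


def violates_fofc(state):
    """state maps each head to 'initial' or 'final'.

    A violation arises when a head-final head immediately dominates
    the phrase of a head-initial head (adjacent pairs in HEADS).
    """
    for higher, lower in zip(HEADS, HEADS[1:]):
        if state[higher] == "final" and state[lower] == "initial":
            return True
    return False


def licit_pathways(start, target):
    """Return every order of single-head changes that leads from a
    uniformly `start` system to a uniformly `target` system without
    passing through a FOFC-violating intermediate system."""
    result = []
    for order in permutations(HEADS):
        state = {head: start for head in HEADS}
        compliant = True
        for head in order:
            state[head] = target
            if violates_fofc(state):
                compliant = False
                break
        if compliant:
            result.append(order)
    return result


print(licit_pathways("final", "initial"))  # [('C', 'T', 'V')]: top-down, as in (40)
print(licit_pathways("initial", "final"))  # [('V', 'T', 'C')]: bottom-up, as in (41)
```

Of the six conceivable orders of change in each direction, only one survives under these assumptions, which is the sense in which FOFC forces the pathways in (40) and (41).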
Direct diachronic evidence concerning these trajectories of change is not easy to come by, given the paucity of long-term attestation of most of the world’s languages. Such evidence as exists concerning the earliest stages of Germanic supports our position, though. The earliest attested stages of Germanic (Gothic, Old English, and Old Norse) show C-IP order.
. . . ef han hefði þat viljað fága.
if he has it wanted clean.INF
‘ . . . if he had wanted to clean it.’
(Finn; Hróarsdóttir 1999:203)
. . . þæt hie mihton swa bealdlice Godes geleafan bodian.
that they could so boldly God’s faith preach.INF
‘ . . . that they could preach God’s faith so boldly.’
(The Homilies of the Anglo-Saxon Church I 232; van Kemenade 1987:179)
. . . domjandas thata thatei ains faur allans geswalt.
thinking this that one for all dies
‘ . . . thinking this, that one may die for all.’
See also the Latin examples in (19). These languages all have apparently mixed order in IP and VP (see above for Old English and Old Norse; Ferraresi 1997 on Gothic; Harris 1978:18ff., Vincent 1988:59ff., Salvi 2004, Devine and Stephens 2006, Ledgeway 2012 on Latin). Later, IP and VP became head-initial in English, Mainland Scandinavian, and Romance. There is in fact evidence from the history of English that the order in IP changed from head-final to head-initial before that in VP (examples from Biberauer, Newton, and Sheehan 2009:717, Biberauer, Sheehan, and Newton 2010).
Head-initial TP, head-final VP
. . . þat ne haue [VP noht here sinnes forleten].
who NEG have not their sins forsaken
‘ . . . who have not forsaken their sins.’
(11th century: Trinity Homilies 67.934)
Head-initial TP, head-initial VP
. . . oðet he habbe [VP iʒetted ou al þet ʒe wulleð].
until he has granted you all that you desire
‘ . . . until he has granted you all that you desire.’
(c1215: Ancrene Riwle)
Work by Santorini (1992) and Wallenberg (2009) suggests that exactly the same thing happened in the history of Yiddish. In this case, there is also synchronic evidence of the fact that the order of headedness within VP must have changed last since head-final auxiliaries are impossible in the modern language, while head-final VPs are able to alternate with head-initial VPs (see Santorini 1992 and Wallenberg 2009 for discussion).
Biberauer, Sheehan, and Newton (2010) observe that the OV-to-VO change from Latin to French appears to have followed the same pattern. Moreover, Ledgeway (2012:chap. 5) shows in great detail that the same seems to hold true for word order change from Latin to Romance, again a change from head-final to head-initial order. In this connection, Ledgeway states that ‘‘both complementizers and adpositions are the only categories to show a fixed head-initial order since our earliest texts’’ (2012:205) and that, once head-initiality is established in the topmost CP and PP layers, it is ‘‘free to percolate down harmonically to the phrases that these in turn embed’’ (2012:242).
The same holds for Finnish and Saami. Finno-Ugric languages further east are strictly head-final, with O > V > T > C order, reflecting the original Finno-Ugric, and indeed Uralic, pattern (Abondolo 1998). But Finnish and Northern Saami, two of the westernmost languages in the family, have C > TP strictly, with T/Aux > VP and V > O as unmarked orders, but allowing VP > T/Aux and O > V as marked options (Marit Julien, pers. comm., on Northern Saami; see section 4.4 on Finnish).
Further evidence comes from Niger-Congo languages that have undergone a VO-to-OV change that is limited to VP (see Nikitina 2008 for discussion and references), and the Ethiopian Semitic languages, which have undergone a change from the typical Semitic head-initial pattern to a largely head-final pattern under the influence of Cushitic. See Biberauer, Newton, and Sheehan 2009 and Biberauer, Sheehan, and Newton 2010 for detailed discussion of these and other case studies corroborating the above pathways.
FOFC affects change in another way, too, in that it also restricts borrowing options—that is, change triggered by ‘‘external’’ factors. Biberauer, Newton, and Sheehan (2009) and Biberauer, Sheehan, and Newton (2010) discuss a case study focusing on the borrowing/innovation of a clause-final complementizer. They report that, among Indo-Aryan languages that had borrowed a final complementizer, only those languages not featuring an initial question-marking polarity marker (Pol) developed a final complementizer. The relevant data are summarized in table 1 (based partly on information in Davison 2007). In many Indo-Aryan languages, C (a complementizer) can cooccur with Pol (a question particle), in which case we find the orders C > Pol > TP (e.g., Hindi-Urdu) and TP > Pol > C (e.g., Marathi), indicating that this interrogative-oriented Pol is located hierarchically below C (see Laka 1994).21 The gap in D of table 1 then follows from FOFC, since the structure would be as in (45).
| Type | Position of Pol | Position of C | Languages |
|---|---|---|---|
| A | Initial | Initial only | Hindi-Urdu, Panjabi, Kashmiri, Sindhi, Maithili, Kurmali |
| B | Final/Medial | Initial and final | Marathi, Gujarati, Assamese, Bangla, Dakhini Hindi, Oriya, Nepali (and some North Dravidian languages, e.g., Brahui) |
| C | Final/Medial | Final only | Sinhala (and most Dravidian languages) |
| D | Initial | Final | Unattested in the area |
This structure instantiates the schema in (3) for α = Pol and β = C, and therefore is a further example of FOFC.
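The gap in table 1 can be derived from the schema in a few lines. This is a hedged sketch, not part of the original analysis: it assumes, as in the text, that Pol is immediately dominated by C within one extended projection, and it collapses ‘‘Final/Medial’’ Pol into a single non-initial value, which is our simplification.

```python
def violates_fofc(lower_position, higher_position):
    """Schema (3): a head-initial phrase immediately dominated by a
    head-final head in the same extended projection is ruled out."""
    return lower_position == "initial" and higher_position == "final"


# Table 1, with Pol as the lower head and C as the higher head.
# Each type lists the C positions it would have to allow.
table_1 = {
    "A": ("initial", ["initial"]),
    "B": ("final", ["initial", "final"]),
    "C": ("final", ["final"]),
    "D": ("initial", ["final"]),
}

# A type is excluded if any of its C positions yields a violation.
excluded = [
    type_name
    for type_name, (pol, c_positions) in table_1.items()
    if any(violates_fofc(pol, c) for c in c_positions)
]

print(excluded)  # ['D']
```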
We are not aware of any detailed studies in a generative framework of changes in head-complement order in the DP. Nonetheless, the prediction FOFC makes is clear: if change from head-final to head-initial order in the nominal must go ‘‘top-down,’’ then ON > NO should follow all other changes, in that DP must be affected first, followed by other DP-internal functional projections, followed by NP, and conversely for change from NO to ON. Some indication regarding the latter sequence of changes can be gleaned from Greenberg’s (1980) study of word order change in the Ethiopian Semitic languages. Here we see GenN + Preposition order in 14th-century Amharic changing to GenN + Postposition order in Harari. If Gen here corresponds to the complement of N, this is the FOFC-compliant order of changes (see also Croft 2003:250ff., Roberts 2007:343–345).
Interestingly, Finnish appears to exhibit the opposite order of changes: both an innovating NO order and postpositions are found, in the sense that they cooccur in the language, but they do not combine to form a FOFC-violating structure, as pointed out in section 2.4. This shows that languages can change ‘‘in the wrong order,’’ as it were, as long as they have structural alternatives to the potentially problematic structure (most likely through either partial retention of a conservative structure or concomitant innovation of a novel structure).
2.6 A More General Prediction
A more general prediction that FOFC makes is that there will be more instances of head-final orders in structurally lower parts of the clause and more head-initial orders in the higher parts. A special case of this prediction is that we should find many languages that combine initial subordinating complementizers with verb-final order, while we should find no languages with the inverse order. As shown in section 2.3, this is true (on clause-final particles in VO languages, see section 3.2). Unfortunately, the data in the nominal domain are not at all clear.
Arguably another case is the predominance of postpositions in the world’s languages compared with the above-mentioned predominance of initial complementizers: Dryer (1992) shows that postpositions are found in 119 genera out of 196 (i.e., in roughly 61% of genera).22 His (1989) research also showed that OV order dominates in 111 genera (i.e., it surfaces in 58% of the genera considered at the time).
A further prediction is that the most deeply embedded structures will strongly tend to be head-final. If suffixes are heads of complex words, then word-internal structure appears to be just like this; compare the ‘‘suffixing preference’’ noted by Hawkins and Gilligan (1988), which is discussed in relation to FOFC in Biberauer et al., to appear.23
2.7 Conclusion and Summary
We have shown evidence of FOFC in the absence of certain logically possible word order patterns in the clausal domain (sections 2.1.1–2.1.5, 2.3), in the nominal domain (section 2.4), and also in diachrony (section 2.5). We maintain that this suffices to take the constraint seriously, as a possibly universal constraint on disharmonic structures. We summarize the FOFC violations we have observed so far in (46).
We will look more closely at these violations in what follows, and in certain cases revise our assumptions about the precise structures involved. (Some of this appears in the online appendix; see section 5.) For now, though, (46) can be taken as a convenient summary of the observations made so far, and the common pattern underlying them.
We have also noted that FOFC is not just a typological generalization, but plays a role in the I-language of speakers, in that FOFC violations are systematically avoided in a single derivation. This is shown most clearly in languages where head-final and head-initial orders both occur within some extended projection, but the FOFC-violating combination of head-initial and head-final orders is systematically avoided.
Having presented a range of empirical evidence for FOFC and dealt with some apparent counterevidence (notably in section 2.2), we now consider some potential evidence that the pattern we claim to underlie (46) is not fully general. We will maintain that this pattern, instantiating the general ban on the configuration in (3), is in fact exceptionless once the data are fully considered.
3 The Role of the Categorial Feature [±V] and the Extended Projection
3.1 Nominal Complements of Verbs
A head-initial DP or PP may be immediately dominated by a head-final VP in many OV languages, as in (47).
Johann hat [VP[DP einen Mann] gesehen].
Johann has a man seen
‘Johann has seen a man.’
Johann ist [VP[PP nach Berlin] gefahren].
Johann is to Berlin gone
‘Johann has gone to Berlin.’
In purely configurational terms, (47a–b) instantiate the schema in (3) for α = D/P and β = V. However, they are grammatical. Clearly, the difference between these cases and those considered in section 2 has to do with the fact that α and β are categorially distinct and hence in different extended projections. It is for this reason that our formulation of FOFC consistently makes reference to extended projections: FOFC holds of pairs of categories that belong to the same extended projection. Here and in section 4 we will develop this idea more systematically.24
Instead of referring to extended projections, an alternative hypothesis about what makes (47) crucially different from the cases discussed above might be that the preverbal constituents are arguments/referential expressions, assigned Case and a θ-role by the verb, which would make them opaque to FOFC. That this cannot be right is shown by the contrast between CP and DP complements. As discussed in section 2.3, CP complements are sensitive to FOFC: CPs with an initial complementizer are not acceptable in preverbal position in OV languages. In this, they minimally contrast with DPs.
* . . . dass Johann niemals [CP dass er eigentlich ein angenommenes Kind sei]
that Johann never that he actually an adopted child be.SUBJ
. . . dass Johann niemals besprochen hat [CP dass er eigentlich
that Johann never discussed has that he actually
ein angenommenes Kind sei].
an adopted child be.SUBJ
‘ . . . that Johann has never discussed the fact that he is actually an adopted child.’
The well-formed counterpart of (48a) has the bracketed CP extraposed, as in (48b). The CP complement is an argument of the verb, so argumenthood is clearly not the crucial property. Significantly, a clause embedded under a noun is possible in preverbal position in German—hence the minimal contrast between (48) and (49).
. . . dass Johann niemals [DP den Verdacht [CP dass er eigentlich
that Johann never the suspicion that he actually
ein angenommenes Kind sei]] besprochen hat.
an adopted child be.SUBJ discussed has
‘ . . . that Johann has never discussed the suspicion that he is actually an adopted child.’
. . . dass Johann niemals den Verdacht besprochen hat [CP dass er eigentlich
that Johann never the suspicion discussed has that he actually
ein angenommenes Kind sei].
an adopted child be.SUBJ
Here, too, the embedded CP may be extraposed, as in (49b), but in this case it is optional (although preferred in spoken German). Moreover, predicative nominals behave just like argument nominals, as shown by (50).
. . . dat Johan [NP minister van buitelandse sake] geword het.
that Johan minister of foreign affairs become has
‘ . . . that Johan has become minister of foreign affairs.’
Here the bracketed constituent is not an argument, but a predicate noun phrase—a bare NP that in this case contains a postnominal complement, apparently insensitive to FOFC.
These facts indicate that a crucial property is categorial identity, rather than argumenthood or referentiality: FOFC does not apply to a verbal head taking a nominal complement. Furthermore, the fact that VP, AuxP, TP, and CP pattern together against DP, NP, and PP supports our assertion that the crucial notion is extended projection, in roughly the sense of Grimshaw 1991, 2001, 2005. Informally, the extended projection of V is VP, vP, TP, CP, and any other projections along the ‘‘spine’’ of the tree between VP and CP (AspP, AuxP, etc.). The extended projection of N is NP, DP, and any other projections between them along the spine of the tree, such as NumP, QuantifierP, and ClassifierP, as well as PP (subject perhaps to crosslinguistic variation, and/or to variation in the class of ‘‘Ps’’; see Cinque and Rizzi 2010 for recent discussion). Assume that the defining characteristic of the extended projection of V is the categorial feature [+V], while the defining characteristic of the extended projection of N is [−V]. That is to say, each head along the spine of the tree from V to C (i.e., v, Asp, Aux, T, . . . ) includes [+V] among its features, and each head along the spine of the tree from N to P (Num, Quantifier, Classifier, D, . . . ) includes [−V] among its features.
We can now modify the formulation of FOFC as follows:
(51) *[βP . . . [αP . . . α γP] β . . . ], where
a. αP is immediately dominated by a projection of β, and
b. α and β have the same value for [±V].
(52) [VP[CP dass . . . ] V]
(53) [VP[DP den Verdacht [CP dass . . . ]] V]
We now have an explanation for Koptjevskaja-Tamm’s (1988, 1993) observation that OV languages either have embedded finite clauses that are extraposed, or have no embedded finite clauses but nominalizations instead: these are two ways to comply with FOFC. Extraposition avoids placing a head-initial complement under a head-final VP, and the complement in (53) has the status of a nominalized clausal complement of V, complying with FOFC by virtue of clause (51b).
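The contrast between (52) and (53) can be stated as a small check over complement structures. The sketch below is an illustration under assumptions of our own: each phrase is reduced to a label, a value for [±V], a headedness setting, and at most one complement; ‘‘immediately dominated by a projection of β’’ is modeled simply as the head-complement relation; and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phrase:
    label: str                    # e.g. "VP", "CP", "DP"
    verbal: Optional[bool]        # True = [+V], False = [-V], None = unspecified
    head_initial: bool            # True if the head precedes its complement
    complement: Optional["Phrase"] = None


def fofc_violations(phrase):
    """Collect (dominating, dominated) label pairs matching the revised
    formulation: a head-final phrase whose complement is head-initial
    and has the same value for [+/-V]."""
    found = []
    comp = phrase.complement
    if comp is not None:
        same_value = phrase.verbal is not None and phrase.verbal == comp.verbal
        if (
            not phrase.head_initial
            and comp.head_initial
            and comp.complement is not None  # the schema requires alpha to take a complement
            and same_value
        ):
            found.append((phrase.label, comp.label))
        found.extend(fofc_violations(comp))
    return found


# A head-initial CP headed by dass, containing a TP.
tp = Phrase("TP", verbal=True, head_initial=True)
cp = Phrase("CP", verbal=True, head_initial=True, complement=tp)

# (52): the CP is the complement of a head-final V.
ex52 = Phrase("VP", verbal=True, head_initial=False, complement=cp)

# (53): the CP is embedded under N (den Verdacht), inside a DP.
np_ = Phrase("NP", verbal=False, head_initial=True, complement=cp)
dp = Phrase("DP", verbal=False, head_initial=True, complement=np_)
ex53 = Phrase("VP", verbal=True, head_initial=False, complement=dp)

print(fofc_violations(ex52))  # [('VP', 'CP')]
print(fofc_violations(ex53))  # []
```

In this encoding a head with `verbal=None` never counts as sharing a value for [±V] with its complement, so it can never take part in a violation.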
Clearly, this analysis entails that we view regular that-complementizers, and therefore the CPs they head, as verbal rather than nominal, pace Grimshaw (and hence, again pace Grimshaw, as part of the same extended projection as both the embedded and the selecting verb, making our notion of extended projection similar to Kayne’s (1983) notion of g-projection). An alternative possibility, which would bring our view closer to Grimshaw’s, is explored in detail by Biberauer and Sheehan (2012), who also discuss in detail how CP-extraposition is formally derived in head-final languages, a matter we leave aside here.
One prominent class of potential counterexamples to FOFC involves sentence-final particles in otherwise head-initial languages. The following are representative examples from the CP domain:
Hongjian xihuan zhe ben shu ma?
Hongjian like this C book Q
‘Does Hongjian like this book?’
Kɔ̀kú yrɔ́ Kòfí à?
Koku call.PERF Kofi Q
‘Did Koku call Kofi?’
San Lucas Quiaviní Zapotec
B-da’uh Gye’eihlly gueht èee?
PERF-eat Mike tortilla Q
‘Did Mike eat tortillas?’
It is tempting to analyze these particles as Cs (see Paul, to appear, for an argument that this is the right approach). If so, we would have instances of final Cs in VO languages, and hence counterexamples to the generalization put forward in section 2.3 (FOFC would be violated for α = V and β = C here). Although many cases involve putative C-elements like those in (54), final particles also occur in phrases of a variety of types: Aspect, Mood, Negation, Polarity, Specificity, Force, among others. Often, the languages with these types of particles are ‘‘repeat offenders,’’ with multiple FOFC-violating elements (see Dryer 2009b for discussion).
Focusing on the clause-final, seemingly C-related particles, we take it to be significant that we do not find this kind of order with true subordinating Cs of the type discussed in section 2.3 (see also the Indo-Aryan data involving complementizers and polarity items in section 2.5). Subordinating Cs seem to be invariably clause-initial in VO languages, including languages that have some clause-final particles. Consider (55).25
Tân mua gi the?
Tan buy what Q
‘What did Tan buy?’
Anh đã nói (rằng) cô ta không tin.
PRN ANT say that PRN NEG PRT believe
‘He said that she didn’t believe (him).’
Presumably this is possible because the question force is signaled by some other means, such as intonation. Conceivably, then, the languages in question have an abstract head in the left periphery encoding question force, triggering question intonation in the languages that have it, which is optionally doubled or, in the case of languages with numerous question particles (see, e.g., Lee 2005, 2006 on San Lucas Quiaviní Zapotec), made more precise by a final overt particle.26
As also discussed by Bailey (2010, 2012), at least some of the apparently FOFC-violating final question particles may actually be initial negative disjunctions of an elided disjunct clause. The structure of these yes/no questions would be [Q [TP [OR-NOT TP]]], where ellipsis of the second TP, identical with the first TP, leaves the negative disjunction as an apparently clause-final particle (see also Aldridge 2011, Yaisomanang 2012). The question force would be supplied by an abstract higher question morpheme (see Ladusaw 1992, Zeijlstra 2004 for a similar proposal with respect to negation). The most obvious evidence in favor of this analysis is that in many languages question particles are homophonous with, or clearly derived from, either a negation or a combination of negation and disjunction. This is considerably more common in the case of final question particles than initial question particles (see Bencini 2003 and Bailey 2010, 2012 for discussion).27
If these are partially disguised coordinate structures, then there is no FOFC violation for the same reason that there is no FOFC violation in any coordination of head-initial phrases, as in (56), for example (assuming the structure of coordination in Kayne 1994, Johannesen 1998, and Zhang 2009).
(56) He has [ConjP[VP finished his tea] [and [VP eaten his biscuit]]].
Consider again the definition of FOFC in (51). Given that and can coordinate XPs of all kinds, without interfering with the selection relationships coordinated elements can enter into (a verb can select a coordinated nominal as readily as it can select any other nominal), the conjunction and, and conjunctions more generally that similarly coordinate XPs of various kinds, will not have the same value for [±V] as (either of ) the two conjuncts. Therefore, the head-initial first conjunct in (56) is dominated by an acategorial head, which is spelled out by and. This is illustrated in (57).
Where Conj is a disjunction marker functioning as a question particle (as proposed by Jayaseelan (2008), Aldridge (2011), and Bailey (2012), among others) and the second conjunct is deleted, no FOFC violation results. See Zwart 2005, 2009, discussed in section 2.3, for evidence that conjunctions are different from most other heads in not showing any crosslinguistic head-complement order variation. As will be discussed in section 4.3.1, this is a direct consequence of these elements being unmarked for [±V].
Acategoriality also seems to be the key consideration determining the availability of many final negation/concord marker structures in languages with at least partially head-initial clausal syntax (see Cinque 1999 for discussion of negation as a ‘‘syncategorematic’’ category). Like coordination markers in many languages, negation markers do not appear to c-select specific complements and therefore cannot be associated with an independent categorial specification. To the extent that the central African and Austronesian V-O-Neg languages discussed by Dryer (2009b) and Reesink (2002) can be shown to have ‘‘promiscuous’’ negation markers of this kind, we can understand their apparent ability to violate FOFC: if Neg lacks a categorial specification, V-O-Neg structures instantiate a further case of the structure schematized in (57), with Neg replacing Conj. Further, Biberauer (2009, 2012) shows how this analysis may also be extended to head-final negative concord markers in languages with head-initial XPs. In the case of Afrikaans, the availability of clause-final and ‘‘high Pol’’-instantiating28 nie2, even though Afrikaans, like other West Germanic languages, has a head-initial CP, can be understood as a consequence of this element’s acategoriality: nie2 not only doubles sentential negation, in which case it realizes a CP-peripheral Pol head, but also doubles constituent negations targeting DPs, PPs, APs, and so on. Since this doubling does not affect the selectability of the relevant constituents, it is best analyzed as an acategorial XP-peripheral Pol head (we return to the peripherality of this element and similar ones, which mirrors that of the coordinators discussed above, in section 4.3.1). As such, it too does not violate FOFC. Similar analyses can be extended to ‘‘promiscuous’’ focus and topic markers, as discussed in Biberauer et al., to appear.
Also relevant to the case of optional (and thus, typically, emphatic) peripheral concord markers29 and optional discourse-related markers more generally is that many of these can be shown not to be fully integrated into the structures they are associated with. So, for example, the optional clause-final negative reinforcer (‘‘concord marker’’) não in Brazilian Portuguese, which may also surface independently of the clause-internal negator that it typically doubles, is clearly not integrated into the CP domain, as it cannot license NPIs (see Bailey 2012 for discussion of question particles, which, similarly, cannot license NPIs; see Biberauer and Cyrino 2008, 2009, for discussion of the Brazilian Portuguese facts).
So far, then, we have shown that there is a range of apparently FOFC-violating clause- and XP-final particles that, upon closer inspection, do not violate FOFC because they are acategorial elements. This generalization across a very diverse range of elements once again underlines the validity of appealing to the notion of extended projection in characterizing the nature of FOFC. We leave to further research the wider empirical question of the extent to which this analysis can be shown to hold for apparently FOFC-violating particles in general.
3.3 Verb Clusters and Infinitivus pro Participio in West Germanic
In section 2.1.1, we observed that two-member verbal clusters in West Germanic always obey FOFC. However, there is one class of verb cluster that is potentially problematic. These are the clusters involving the ‘‘231 order.’’ In this terminology, n + 1 is the complement of n, and the left-to-right order of integers indicates the surface order of the verbal elements making up the verb cluster. Hence, ‘‘231’’ indicates a structure of the type [[v V] Aux], a clear FOFC violation if V is the complement of v (these labels are purely illustrative here). Walkden (2009) points out that this order is attested in West Flemish.
(58) . . . da Valère willen dienen boek lezen eet.
that Valère want.INF that book read.INF has
‘ . . . that Valère has wanted to read that book.’
In this example, willen dienen boek lezen is arguably a head-initial complement (of unclear category; see below) of head-final eet. Taking into account data cited by Barbiers (2005), Schmid (2005), and Brandner and Salzmann (2011, 2012), and newly collated Afrikaans data, Biberauer and Walkden (2010) discuss the considerable extent to which these structures are found in various Swiss German, Dutch, and Afrikaans varieties (see also Abels 2012).
The 231 order seen in examples like (58) is a problem only if the three verbs in the construction all belong to the same extended projection. There are indications, however, that this is not the case (see Biberauer et al., to appear, for full discussion).
First, we consider it to be both striking and significant that the linearly initial verb (2 in 231) bears surprising morphology in nearly every example of 231 order known to us: infinitival morphology instead of the participial morphology otherwise required by the perfective auxiliary (see below on the apparent exceptions to this). This is the much-studied phenomenon known as infinitivus pro participio (IPP). One proposal regarding the origin of this phenomenon is that it initially involved structures featuring a participle lacking the characteristic Germanic perfective ge- (see, e.g., Zwart 2007 for discussion). These ge-less participles were then reanalyzed as infinitives. Building on the well-established idea that infinitival morphology is nominal (see in particular Kayne 2000:283ff.) and the proposal by Remberger (2012) discussed in section 2.2, this change can be understood as entailing the removal of what is in Germanic a verbal projection, ge- typically being thought of as a perfective verbal prefix (see Streitberg 1891 for the original proposal).30 Consider how Remberger’s postulated participial structure, given in (18), might apply to Germanic.
Importantly, n in Germanic generally is not associated with Asp, as it is in Latin (note the -t ending in (18)), the [perfective] component being encoded on the higher verbal affix instead.31 Where acquirers encounter IPP forms, which historically featured ge-less participles or were created by analogy to these forms, they postulate a structure in which the highest verb (e.g., the perfect/past auxiliary in (58)) selects a defective [+N] nP complement. FOFC is therefore respected, as the structurally lowest verbs (2 and 3) are separated from the auxiliary in 1 by an intervening [+N] projection. The lowest verbal pair, in turn, exhibits head-initial rather than head-final order because 2 is necessarily a verb raiser (or restructuring verb; see, e.g., Wurmbrand 2001, 2004, Cinque 2006), which therefore forces raising of the infinitival verb in its defective complement clause to the highest verbal projection in that XP (cf. Kayne 1991 and also Roberts 1997, who discusses raising to nonfinite T in this context; pace Cinque 2006).
Second, as noted by Biberauer and Walkden (2010), the most prolific 231-permitting systems, the various varieties of Afrikaans, exhibit 231 structures featuring a range of more recently innovated 2-verbs. These include a future-related auxiliary gaan ‘go’ and various ‘‘linking verbs,’’ two of which are illustrated in (60) (see also de Vos 2006 for discussion).
(60) a. . . . dat hy die boek loop2 koop3 het1.
        that he the book walk.INF buy.INF have.FIN
        ‘ . . . that he went to buy the book.’
     b. Hy loop (*gou) koop gou die boek (*koop).
        he walk.FIN fast buy.INF fast the book buy.INF
        ‘He goes and buys the book quickly.’ (pseudocoordination reading)
     c. Hy gaan (*gou) loop (*gou) koop gou die boek.
        he go.FIN fast walk.INF fast buy.INF fast the book
        ‘He goes and walks and buys the book quickly.’ (pseudocoordination reading)
     d. . . . dat hy gou die boek gaan2 loop3 koop4 het1.
        that he fast the book go.INF walk.INF buy.INF have.FIN
        ‘ . . . that he goes and walks and buys the book quickly.’
As (60b) clearly shows, verbs 2 and 3 in structures of this type necessarily undergo V2 together: it is impossible to separate them with an adverb and it is also impossible to strand the nonfinite verb (here, lexical koop ‘buy’) in postobject position. As comparison of (60c) and (60d) shows, the addition of a further linking verb, gaan ‘go’, which is distinct from future gaan, increases not only the size of the head-initial cluster, but also the size of the cluster that undergoes V2. The relevant verbs also front together in predicate-doubling and VP-fronting structures (see Biberauer 2009), further supporting the idea that they function as a unit more generally. In these cases, then, it would appear that verbs 2 and 3 are not syntactically distinct in the way that heads forming part of an extended projection typically are. We leave aside here the question of precisely how this should be formally captured (but see Biberauer et al., to appear), noting only that these structures do not appear, in violation of FOFC, to involve a head-final XP (here, AuxP) dominating a head-initial XP (here, the XP associated with verb 2).
Once again, then, we have shown that structures that superficially appear to violate FOFC do not, upon closer inspection, actually seem to do so.
4 Linear Order and Movement
Having illustrated FOFC as an empirical generalization, we now consider the formal mechanisms underlying the general ban on structures of the form in (3), repeated here for convenience.
(3) *[βP . . . [αP . . . α γP] β . . . ]
The observation that (3) is not allowed poses a challenge for any account of linearization: it should predict (i) the preference for harmony and (ii) the fact that only one disharmonic order is allowed. In other words, it should predict FOFC.
Within the Principles-and-Parameters framework, the head parameter, regulating the linear order of head and complement, is standardly taken to explain the preference for cross-categorial harmony. In fully harmonic languages, all heads have the head parameter set the same way: either the head precedes the complement or the head follows the complement (see Koopman 1984, Travis 1984, Fukui and Saito 1998, Richards 2004). The fact that not all languages are consistently head-initial or head-final means that the parameter must be relativized to categories: some heads may deviate from the general setting of the head parameter, allowing disharmony in the phrase structure.32 However, on its own this does not explain why there should be a difference between the two kinds of disharmonic structures instantiated by (3) and its inverse where the head-initial category is structurally higher than the head-final one. We therefore have to look elsewhere for an explanation of FOFC.
4.2 An LCA-Based Account of Linearization
Consider FOFC again, this time the informal statement (1).
(1) The Final-over-Final Constraint (FOFC) (informal statement)
    A head-final phrase αP cannot dominate a head-initial phrase βP, where α and β are heads in the same extended projection.
On the other hand, a head-initial phrase may dominate either a head-initial or a head-final phrase. As noted earlier, this entails that, at least in mixed systems, head-final order is more constrained than head-initial order, and in that sense more marked than head-initial order. Therefore, what we should look for is a theory of the relation between structure and linear order in which head-final order is, in the relevant fashion, more marked than head-initial order. One such theory is Kayne’s (1994) antisymmetry theory, including the LCA. In the following, we will argue that FOFC is indeed indirectly an effect of the LCA, in conjunction with certain other postulates widely assumed within Minimalist syntactic theory.
The presentation proceeds as follows. First, we present the LCA and the corollary that head-final order is derived by complement movement. This leads to the hypothesis that phrase-final heads involve a movement-triggering feature. FOFC is then seen as an effect of ‘‘spreading’’ or inheritance of this feature from the lexical head up, from head to head within the extended projection, observing standard locality conditions on head-to-head relations. We then compare this theory with an alternative theory that has all the same components as the LCA-based theory except the LCA itself. We argue that this theory does not, in fact, represent a simpler or more elegant alternative to the LCA-based theory.
We state the LCA as follows:
(61) α precedes β if and only if α asymmetrically c-commands β or if α is contained in γ, where γ asymmetrically c-commands β.
In (61), we depart from Kayne’s original formulation of the LCA but follow the basic ideas of bare phrase structure in taking α and β to be potentially both terminal nodes and lexical items; in particular, we do not regard lexical items as constituents of categories. We define c-command as follows:33
(62) a. α c-commands β if and only if α is a category and β is contained in the sister of α.
     b. α asymmetrically c-commands β if and only if α c-commands β and β does not c-command α.
Consider the standard X-bar structure, as in (63).
(63) [XP α [X′ X β]]
Given (61) and (62), the specifier, α, precedes the head X, since that head is contained in α’s sister. As long as β has internal structure, X will precede it, since X asymmetrically c-commands anything contained in β (and containment dependencies cannot ‘‘cross’’). If β has no internal structure, X and β cannot be ordered, as either they c-command each other or there is no c-command relation (this depends on whether ‘‘contain’’ is reflexive; see footnote 33). Further specifiers and adjuncts will be ordered among themselves, because they will all be in asymmetric c-command relations with one another and will always be to the left of the ‘‘core’’ X′ containing X and β, given the definitions in (61) and (62).
Since movement is always ‘‘upward,’’ a moved element will always asymmetrically c-command its trace or copy.34 The LCA then guarantees that movement is always leftward. It is worth noting that surface linear order is, quite independently of LCA-related assumptions, very often in part the result of (leftward) movement of one kind or another (A-movement, Ā-movement, or head movement). The proposal here, as in Kaynean work more generally, is that surface head-final order is also always the result of movement. In order to precede a given head, a complement must move from its position as sister of that head to a position where it asymmetrically c-commands the head (Kayne 1994:47–48). Head-initial order, on the other hand, can (but need not) be derived without any movement. That is to say, head-final order is derivationally more complex than head-initial order, in the sense that it must involve a step of movement that head-initial order does not absolutely require. This, we contend, is essentially why it is more constrained than head-initial order.
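The way (61) and (62) derive precedence can be sketched computationally. The following is only an illustrative sketch, not part of the formal proposal: the `Node` class, the `category` flag (which encodes the assumption that intermediate projections such as X′ do not count as c-commanding categories), the `silent` flag for unpronounced copies, and the decision to treat containment as reflexive are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    label: str
    children: list = field(default_factory=list)
    category: bool = True   # assumption: intermediate projections (X') are not categories
    silent: bool = False    # assumption: unpronounced copy left by movement
    parent: object = field(default=None, repr=False)

    def __post_init__(self):
        for child in self.children:
            child.parent = self

def contained(node):
    """Yield the node and everything it dominates (containment taken as reflexive)."""
    yield node
    for child in node.children:
        yield from contained(child)

def sister(node):
    if node.parent is None:
        return None
    others = [c for c in node.parent.children if c is not node]
    return others[0] if others else None

def c_commands(a, b):
    """a is a category and b is contained in the sister of a."""
    s = sister(a)
    return a.category and s is not None and any(n is b for n in contained(s))

def asym_c_commands(a, b):
    return c_commands(a, b) and not c_commands(b, a)

def lca_pairs(root):
    """Precedence pairs (a, b), 'a precedes b', among pronounced terminals."""
    nodes = list(contained(root))

    def pronounced(n):
        return not n.children and not n.silent

    pairs = set()
    for g in nodes:
        for b in filter(pronounced, nodes):
            if asym_c_commands(g, b):
                # g itself, or any terminal contained in g, precedes b
                pairs.update((a.label, b.label) for a in contained(g) if pronounced(a))
    return pairs

# The X-bar configuration [XP Spec [X' X [YP Y Z]]]
yp = Node("YP", [Node("Y"), Node("Z")])
xp = Node("XP", [Node("Spec"), Node("X'", [Node("X"), yp], category=False)])
base = lca_pairs(xp)

# Complement movement: [vP [VP O [V' V <O>]] [v' v <VP>]]
vp = Node("VP", [Node("O"), Node("V'", [Node("V"), Node("<O>", silent=True)], category=False)])
v_phrase = Node("vP", [vp, Node("v'", [Node("v"), Node("<VP>", silent=True)], category=False)])
moved = lca_pairs(v_phrase)

print(sorted(base))
print(sorted(moved))
```

Under these assumptions, the specifier precedes the head and the head precedes the contents of a complex complement, while the two lowest sisters Y and Z remain unordered, in line with the remark above about complements without internal structure. In the second structure, the pairs correspond to the order O > V > v, which arises only because each moved phrase asymmetrically c-commands the head it precedes.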
Furthermore, as also originally proposed by Kayne (1994:52–53), consistent head-final order is derived by ‘‘roll-up’’ (successive leftward movement of complements and categories containing moved complements). The derived structure of a ‘‘roll-up’’ derivation in CP is as shown in (64) (the internal structure of the copies of the rolled-up categories is not indicated for ease of exposition).
The LCA applied to this tree yields the string O > V > v > T > C. This is a harmonically head-final structure. A harmonically head-initial structure arises where no complement movement takes place (in the simplest case). Most importantly for our purposes, disharmonic orders result when some complements, and/or elements contained in those complements, undergo movement and others do not. If movement of an XP is always triggered by a property of some head, then what FOFC shows is that the distribution of the movement-triggering property is constrained, so that only one type of disharmony is allowed. More precisely, a typical FOFC violation will arise when a superordinate head triggers movement of its complement, but inside that complement the head does not trigger movement of its complement. Suppose, for example, that v triggers VP-movement but V does not trigger object movement. Then we have the structure in (65).
If v can contain an auxiliary, then this configuration gives surface V > O > Aux order (assuming O has internal structure, as mentioned above), the FOFC violation discussed in section 2.1. On the other hand, if V triggers movement of its complement, and v does not, the result is the FOFC-compliant disharmonic structure shown in (66).
Hence, if a superordinate head does not trigger movement, but the head of its complement does, the result is permissible, non-FOFC-violating disharmony. Note that this means that disharmonic languages are always partially harmonic: there is a node in the disharmonic extended projection such that above that node, they are harmonically head-initial, and below that node, they are harmonically head-final. This generalization is a direct consequence of FOFC.
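The relation between the distribution of the movement-triggering property and surface order can be made concrete with a small sketch. It abstracts away from tree structure and from the LCA itself: the function name `surface_order` and the list-based representation are assumptions made here for illustration, and the object O stands for a complement with internal structure.

```python
def surface_order(heads, triggers, lowest_complement):
    """Build the surface string bottom-up.

    heads: head labels from the lexical head upward, e.g. ["V", "v", "T", "C"]
    triggers: set of heads that trigger movement of their complement
    lowest_complement: the complement of the lowest head, as a list of labels
    """
    string = list(lowest_complement)
    for head in heads:
        if head in triggers:
            string = string + [head]   # complement moves leftward past the head
        else:
            string = [head] + string   # no movement: head precedes its complement
    return string

clause = ["V", "v", "T", "C"]
head_final = surface_order(clause, {"V", "v", "T", "C"}, ["O"])
head_initial = surface_order(clause, set(), ["O"])
fofc_violation = surface_order(["V", "v"], {"v"}, ["O"])   # v triggers, V does not
permitted = surface_order(["V", "v"], {"V"}, ["O"])        # V triggers, v does not

print(head_final, head_initial, fofc_violation, permitted)
```

With every head triggering movement, the sketch returns the roll-up order O V v T C; with no triggers, it returns C T v V O. The two mixed cases give V O v (the unattested V > O > Aux pattern when v hosts an auxiliary) and v O V (the permitted disharmonic order).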
We are now in a position to take a crucial step forward in understanding FOFC. It emerges from the above discussion that head-final order can be derived by complement movement, as long as, when iterated, complement movement starts at the bottom of the tree and iterates monotonically up the tree. The iterations can stop at any point (as designated in the grammar of the language), as long as the stopping is ‘‘permanent’’—that is, as long as complement movement does not ‘‘start again’’ in a higher position within the same extended projection. This is an informal, movement-based statement of FOFC. We now have to make the statement more formal and explain exactly why complement movement should be constrained in this way.
4.3 Our Proposal
4.3.1 FOFC, Feature Copying, and Movement
It should be clear from the previous section that our account of FOFC relies on movement, and in particular on the way in which movement is triggered. Accordingly, we adopt the following idea:
(67) Movement is triggered by a general movement-triggering feature. We use ^ (caret) as a symbol for this feature.
We take ^ to be a purely formal, arbitrary diacritic. In itself, it has no semantic content, and no connection to phonological or morphological properties beyond simply causing movement. Moreover, although it can be seen as a kind of formal feature, ^ differs in several important respects from formal features like φ-features. Unlike φ-features, which are arguably best seen as attribute-value pairs, it is privative, it has no internal structure, it cannot be valued or in any obvious way ‘‘checked off,’’ and as already mentioned it has no semantic or morphophonological effects.35
The idea that movement is triggered by a purely formal diacritic is widespread in the current literature. In different versions, and with different notations, it appears in Müller and Sternefeld 1993, Chomsky 2000, 2001, 2008, Pesetsky and Torrego 2001, and Roberts and Roussou 2003, among other works; the idea of a ‘‘spell-out’’ diacritic associated with certain positions is also found in the representational system proposed in Brody 1995.
Very much in the spirit of Müller and Sternefeld 1993, we take it that the properties of different types of movement depend on the features that ^ is associated with. Where the movement trigger ^ is associated with the uninterpretable φ-features of an active probe (e.g., finite T), it gives rise to A-movement; in this respect, it replaces the EPP-features of Chomsky 2000, 2001. Where ^ is associated with a phase head (e.g., C), it triggers Ā-movement (see Chomsky 2008: 144). Finally, and most important for our purposes, where ^ is associated with the categorial, extended projection–defining feature [±V], linearization movement takes place—that is, movement of the sister of a head as seen in the previous section.36 Examples of the different types of movement triggers are as follows:37
(68) a. T[uφ, ^] triggers movement of the goal of the probe [uφ] to Spec,TP.
     b. C[EF, ^] triggers Ā-movement to Spec,CP.
     c. V[+V, ^] triggers movement of the sister of V to Spec,VP.
We can now state FOFC in terms of movement more formally.
(69) The Final-over-Final Constraint (formal statement)
     If a head αi in the extended projection EP of a lexical head L, EP(L), has ^ associated with its [±V]-feature, then so does αi+1, where αi+1 is c-selected by αi in EP(L).
Where αi is L, (69) holds trivially, in the absence of a head αi+1. The hypothesis that the movement-triggering feature accompanies the extended projection–defining feature [±V] can explain why head-final order spreads from the bottom up, starting at the base of the extended projection: any property associated with [±V] will be a property of the lexical head L that defines the extended projection. This follows from what we take to be the intuitive notion of extended projection as the ‘‘inheritance’’ of core properties of the lexical head through the functional superstructure associated with that head. We define extended projection as in (71). As the definition involves the notion spine, we begin by defining this notion.38
(70) A sequence of nodes Σ = (α1, . . . , αn) is a spine if and only if
     a. αn is a lexical category and an X0; and
     b. for all αi (i < n) in Σ, either
        i. αi is a projection of and immediately dominates αi+1, or
        ii. αi is an X0 and the sister of αi+1.
This definition states that a lexical head, its projections, any category immediately dominating a projection of the lexical head, and any category immediately dominating such a category, as well as any head that is a sister of such a category, can be part of the spine. Call a spine whose final member is γ the spine from γ. We can now define extended projection as follows:
(71) Π is the extended projection of L if and only if Π is the maximal subsequence of Σ(L), the spine from L, such that
     a. L ∈ Π, and
     b. if α ∈ Π, then α and L have the same value for [±V].
Assume that for instance v c-selects [+V] and therefore merges with V(P). Assume, however, that v is not inherently valued [+V], but that this feature ‘‘spreads’’ to v from its sister VP.39 This can be iterated at the T level, making T [+V], whatever other features it has, and so on for C as well. So we see that [+V] spreads through the functional heads making up the core extended projection, thereby defining the extended projection in accordance with (71b). What is most important for our purposes concerns the interaction of ^ with selection. If a given head can select [+V] and inherit [+V], exactly the same applies in a system with [+V^]. In this situation, a higher head may select [+V] and inherit [+V] without ^, but, crucially, no head can inherit ^ without inheriting [+V]. The assumption behind this is that ^ cannot be selected alone, since it is not a categorial feature. Parametric variation in word order can then be encoded in terms of the highest head in the extended projection that selects [±V^].
It now follows that if X c-selects Y, and X and Y share the same value for [±V], if X is ^ then Y is also ^. FOFC is thus a consequence of the locality of c-selection. The configuration (72) is ruled out.
Once movement has applied to (72), the resulting structure would be (73): a head-final phrase XP immediately dominating a head-initial phrase YP (itself containing a head-final ZP)—that is, a FOFC violation, and a structure not attested among the world’s languages, if we are right.
Since FOFC is a consequence of the locality of c-selection, we can ask what makes this relation so local. We propose that this is an effect of Relativized Minimality, which we state as follows:
(74) Relativized Minimality (adapted from Rizzi 2001)
     In a configuration X . . . Y . . . Z, where X asymmetrically c-commands Z, no syntactic relation R can hold between X and Z if Y asymmetrically c-commands Z but does not c-command X, and R potentially holds between X and Y.
By Relativized Minimality, X in (72) cannot enter a selection relation with Z directly; hence, ^ cannot spread to X from Z, and the FOFC-violating structure (73) cannot be derived. ^ can only spread to X if it also spreads to Y.
Thus, FOFC is a result of the following syntactic conditions:
(75) a. Head-finality is a consequence of the movement trigger ^ being paired with the categorial feature [±V], which enters the derivation with the head of the extended projection.
     b. The movement trigger ^ can spread with [±V] from head to head along the spine of the extended projection, subject to parametric variation.
     c. C-selection relations are subject to Relativized Minimality.
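The formal statement of FOFC and the definition of extended projection lend themselves to a simple check. The sketch below is offered only as an illustration under stated assumptions: the `Head` class, the encoding of [±V] as `True`/`False` with `None` for acategorial elements, and the representation of a spine as a top-down list ending in the lexical head are all conveniences introduced here, not part of the proposal itself.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional

@dataclass(frozen=True)
class Head:
    label: str
    v: Optional[bool]      # True = [+V], False = [-V], None = acategorial (assumed encoding)
    caret: bool = False    # True if ^ is paired with the head's categorial feature

def extended_projection(spine):
    """Heads on the spine (listed top-down) sharing the lexical head's value for [±V]."""
    lexical = spine[-1]
    return [h for h in spine if h.v is not None and h.v == lexical.v]

def satisfies_fofc(spine):
    """If a head in the extended projection has ^, so does the next head down."""
    ep = extended_projection(spine)
    return all(lower.caret for higher, lower in zip(ep, ep[1:]) if higher.caret)

# Enumerate all distributions of ^ over C, T, v, V and keep the compliant ones.
labels = ["C", "T", "v", "V"]
compliant = []
for carets in product([False, True], repeat=len(labels)):
    spine = [Head(l, True, c) for l, c in zip(labels, carets)]
    if satisfies_fofc(spine):
        compliant.append([h.label for h in spine if h.caret])

# T has ^ but v does not: excluded.
violation = [Head("T", True, True), Head("v", True, False), Head("V", True, False)]
# An acategorial Neg between T and v is not in the extended projection and is skipped.
with_neg = [Head("T", True, True), Head("Neg", None), Head("v", True, True), Head("V", True, True)]
# A head-initial nominal under a head-final V: V is outside the extended projection of N.
nominal = [Head("V", True, True), Head("D", False), Head("N", False)]

print(compliant)
print(satisfies_fofc(violation), satisfies_fofc(with_neg), satisfies_fofc(nominal))
```

Of the sixteen logically possible distributions of ^ over four heads, only five survive, each corresponding to a choice of the highest head bearing ^. The last two examples illustrate why acategorial elements and categorially distinct complements fall outside the scope of the constraint on this view.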
An interesting and, we think, desirable consequence of this approach to extended projections is that functional heads only have one categorial feature: what they c-select is what they are, in categorial terms. Lexical heads, on the other hand, have their intrinsic categorial feature and a distinct c-selection feature; for example, a canonical transitive verb is [+V] and [____ + N]. The richer specification of lexical categories can be connected directly to the fact that they have s-selection properties, which functional heads are typically thought to lack.
Some comment is also necessary in relation to (75c). As originally argued by Chomsky (1965), c-selection/subcategorization is subject to a sisterhood condition. In terms of bare phrase structure, this can be implemented by taking c-selection to be a constraint on (external) Merge. However, consider the case of negation in many languages: typical positions for clausal negation are between C and T (e.g., Italian, Spanish) or between T and v (Germanic). In general, C, T, and v make up the clausal extended projection, and T is selected by C and v is selected by T. We do not want to say that the c-selection properties of these heads are different in negative clauses. If we take negation to lack categorial properties, then, if c-selection is subject to Relativized Minimality, the negation head will be invisible to selection. It also follows from the account of FOFC given above that negative particles will form a class of systematic exceptions to FOFC, which, as far as we are aware, is true. The same logic can be carried over to topic and focus markers associated with selected constituents and thus, arguably, to particles of these kinds and to acategorial particles more generally (see also the discussion of conjunctions in section 3.2).
A question that arises, of course, is how acategorial elements like negation, topic, focus, and coordination markers can be merged into a structure when they are never c-selected by another element. Biberauer et al. (to appear) suggest that elements of this type are, for this reason, always the last of the elements in a given lexical array to be merged, meaning that we predict them to occupy peripheral positions in relation to phasal domains (assuming lexical arrays to define such domains). In the specific case of elements that can plausibly be thought to lack formal features entirely—basic coordination elements are a case in point (see section 2.3)—it might then be expected that these elements are always head-initial: they lack the features required in terms of (68) to host movement-triggering and thus potentially head-finality-generating ^.40
Before we demonstrate in more detail that the FOFC violations discussed in section 2 can be explained by this theory, let us consider an alternative theory that does not rely on the LCA.
4.3.2 An Alternative: Linearization without Movement
Consider the following theory, which has exactly the same elements as the theory above, except that ^ is not seen as a feature triggering movement, thus affecting linear order only indirectly by virtue of the LCA, but as a ‘‘direct linearization’’ feature. On this alternative theory, the feature ^, associated with the feature [±V] of a head, would be a PF instruction to linearize the head to the right of its sister. More precisely, according to this theory, a head H may or may not have ^ associated with its categorial feature [+V] or [−V]. In the absence of ^ (the unmarked case), H is linearized to the left of its sister. With ^, H is linearized to the right of its sister. As in the theory described in section 4.3.1, the feature [±V], with or without ^, originates on the lexical head, and the same spreading of [±V] and ^ along the spine as described in section 4.3.1 is assumed; FOFC is an effect of locality of selection between heads, and violations of FOFC are ruled out by Relativized Minimality. Thus, a tree with the distribution of ^ in (76a) will have the structure/order in (76b), as a result of the linearization instruction associated with [+V] (in the absence of any syntax-internal movement reordering the constituents).
A tree where a functional head has ^ paired with [±V] but the head of the extended projection does not is impossible, if [±V] spreads together with ^ along the spine of the extended projection. A tree where a functional head has ^ paired with [±V] but the next functional head down the spine does not is impossible because of Relativized Minimality. Thus, as in the theory sketched in section 4.3.1, FOFC violations are underivable.
On the face of it, (76b) is structurally simpler than its LCA-based counterpart (64). However, we contend that the theory behind (76) (call it the direct linearization theory) is not simpler or more elegant than the theory behind (64) (antisymmetry theory) as a theory of the mapping between structure and linear order. First and foremost, in the direct linearization theory the premise that head-final order is marked and head-initial order the default is purely a stipulation; this just happens to be the case in all languages of the world. In antisymmetry theory, it is explained by the LCA, a principle that also explains a number of other pervasive universal properties of grammar, including why specifiers typically (arguably always) precede the head, and why movement is typically (arguably always) leftward. Given the LCA, head-final order requires movement, hence a movement trigger, to shift the complement to a position where it asymmetrically c-commands the head, while head-initial order does not require complement movement (though other movements are possible). Second, movement is obviously important as a factor determining constituent order in a variety of structures: passives, wh-movement, topicalization, scrambling, and so on. Thus, under direct linearization, order will be determined by movement and direct linearization. Under antisymmetry, order can be determined by movement alone, given the LCA. Related to this, under direct linearization, languages operate with both movement and linearization diacritics, whereas on the LCA-based view, only the former are required.
We take this to be reason enough to prefer antisymmetry over direct linearization. However, the hypothesis that FOFC is an effect of a feature originating on lexical heads that spreads up the spine of the extended projection, subject to Relativized Minimality, can be seen as independent of antisymmetry. See Biberauer and Sheehan 2012 for further discussion of the shortcomings of a non-LCA-based approach to linearization.
5 Some Outstanding Issues
Four outstanding issues are discussed in the online appendix (see http://www.mitpressjournals.org/doi/suppl/10.1162/ling_a_00153).
We demonstrate that each of the FOFC violations summarized in section 2.7 is indeed accounted for by the mechanisms we have postulated. One configuration discussed in the appendix requires special attention: namely, when a head with categorial feature value α takes (or appears to take) a complement with the same feature value α, potentially raising an issue for the Relativized Minimality–based account of FOFC that we have proposed.
We discuss optional head-finality, as found in Finnish, for example. The main observation to be accounted for is that when the option of head-finality is taken, the grammar operates exactly as it does in languages with obligatory head-finality.
In this connection, we take up the issue of ‘‘leakage’’ from prehead to posthead position, observed in varying degrees in head-final languages/projections.
We consider alternative theoretical accounts of FOFC: one in terms of processing, based on proposals by Hawkins (1990, 1994, 2004, 2013), and another addressing FOFC within an optimality-theoretic model (see Philip 2013).
In this article, we have provided empirical motivation for FOFC, as an exceptionless syntactic universal, and we have presented our account of it in terms of (variants of) existing theoretical proposals.
In section 2, we presented FOFC and provided the principal empirical motivation for it. In section 3, we presented and accounted for certain apparent counterexamples. In section 4, we presented our theory of linear order and showed how FOFC can be derived from it; we also applied that analysis to the data introduced in sections 2 and 3. The central elements in our account of FOFC are
(77) a. the antisymmetric analysis of head-final orders;
b. the general movement-triggering diacritic ^;
c. the notion of extended projection;
d. the strong locality condition on selection, which derives from Relativized Minimality.
Assuming that our data are correct, and that we have not missed some significant and intractable set of counterexamples, our analysis can be seen as supporting the postulates in (77). In particular, our analysis motivates the relevance of Grimshaw’s notion of extended projection for the investigation of universals, and arguably the concomitant notion that syntactic categories should be decomposed into features. It additionally supports the postulation of an antisymmetric analysis of surface head-final orders.
In FOFC, then, we seem to have a generalization that gives an empirical indication regarding the nature of the language faculty.
The research reported here was funded by the Arts and Humanities Research Council of Great Britain (Award No. AH/E009239/1 ‘‘Structure and Linearization in Disharmonic Word Orders’’). It has been presented, in one variant or another, at the 32nd GLOW Colloquium, Nantes; Deutsche Gesellschaft für Sprachwissenschaft, Berlin; NELS 40, MIT; the 35th Incontro di Grammatica Generativa, Siena; the University of Groningen; the University of Cambridge; GLOW VI in Asia, Chinese University of Hong Kong; the 6th Seminar on English Historical Syntax, Leiden University; the International Conference on Linguistics in Korea (ICLK), Seoul; Nanzan University, Nagoya; the 33rd Incontro di Grammatica Generativa, Bologna; the 26th West Coast Conference on Formal Linguistics, University of California, Berkeley; the 18th Colloquium on Generative Grammar, Girona; the Autumn Meeting of the Linguistics Association of Great Britain, King’s College, London; Leiden University; University of Mannheim; Senshu University, Tokyo; and the Universitat Autònoma de Barcelona. Additionally, it was the subject of graduate seminars at the University of Cambridge and Université de Paris VII and the European Generative Grammar (EGG) summer school in Brno (2006). We would like to thank the audiences at all those presentations for their comments and criticisms, especially the following students past and present: Alastair Appleton, Laura Bailey, Tim Bazalgette, Silvio Cruschina, Iain Mobbs, Neil Myler, Norma Schifano, and George Walkden.
We would also like to thank the following for valuable comments: David Adger, Elena Anagnostopoulou, Josef Bayer, Ted Briscoe, Noam Chomsky, Guglielmo Cinque, Norbert Corver, Lieven Danckaert, Marcel den Dikken, Joseph Emonds, Ángel Gallego, John Hawkins, Richard Kayne, Olaf Koeneman, Adam Ledgeway, Krzysztof Migdalski, David Pesetsky, Gertjan Postma, Norvin Richards, Henk van Riemsdijk, Luigi Rizzi, Martin Salzmann, Peter Svenonius, Sten Vikner, Joel Wallenberg, Hedde Zeijlstra, and Jan-Wouter Zwart.
Particular thanks are due to Michelle Sheehan, our colleague on the AHRC-funded project, whose input is felt on almost every page of what follows, and an anonymous LI reviewer, whose immensely helpful, constructive criticisms went well beyond the call of duty, as well as Eric Reuland for his editorial forbearance. All errors remain our responsibility.
1 Emonds (1976:19) put forward a similar constraint, the Surface Recursion Restriction: phrases to the left of a head inside an XP have to be head-final under certain conditions (see also Williams’s (1981) Head-Final Filter). As Joseph Emonds points out (pers. comm.), the Surface Recursion Restriction does not make the same predictions as FOFC, but it is somewhat similar. We are grateful to Norbert Corver and Henk van Riemsdijk for drawing our attention to this and to Joseph Emonds for helpful discussion and references. David Pesetsky (pers. comm.) further drew our attention to the work of Hale, Jeanne, and Platero (1977:385), who observe a case of FOFC in Papago/Tohono O’odham and propose a surface structure constraint on the distribution of a #-boundary in order to account for it (see their (25), p. 391).
2 ‘‘Aux’’ may be either an auxiliary or a verb capable of triggering clause union/restructuring. We use the generic label Aux for these heads here.
3 At least since Haegeman and Van Riemsdijk 1986, this order has been known as ‘‘verb projection raising.’’
4 This corresponds to what has been known, since Evers 1975, as the ‘‘verb-raising’’ order in Dutch.
5 Afrikaans data cited without sources are ‘‘constructed’’ data, which have, however, been checked with a minimum of three native speakers, all of whom readily accepted the structures indicated as grammatical.
6 See section 3.3 on a restricted class of three-verb clusters that at first sight appear to undermine this generalization, in that they appear to show [AuxP1[AuxP2 Aux2 VP] Aux1] order. Nonetheless, the generalization as stated holds of all two-verb clusters.
7 In the case of Basque, as in the other cases discussed, we are making assumptions about the analysis of these word orders that are, we think, reasonable, but obviously not incontestable. We cannot actually be certain that, for example, [[V O] Aux] is the only analysis of (13b) and that therefore FOFC is the reason why it is ill-formed. A reviewer observes that we are here treating FOFC as if it were a necessarily ‘‘surface-true’’ generalization, in the manner of Greenberg’s word order universals, while claiming that it is a hierarchical universal. We will show later that FOFC is not always a surface-true generalization. On the other hand, if it were not the case that FOFC is quite often surface-true, we would probably not have discovered it in the first place.
8 In addition to the core texts systematically studied by Flobert (1975)—four from the Classical period, and one from the Late Latin period—Danckaert’s (2012a,b) corpus consists of 1,175,031 words, of which roughly half (560,534) come from the Late Latin period. The Classical Latin component of this corpus features 45 V-O-Aux structures, mostly from the works of Livy. See section 2.2 for further discussion.
It is worth noting that we ignore here the modal-containing V-O-Aux structures discussed in Danckaert 2012a,b on the grounds that the putative Aux in this case fairly uncontroversially instantiates the head of a distinct clause (it can be independently modified, negated, etc.).
9 To the extent that the form expressing ‘want’ need not be an auxiliary, this gap cannot be fully explained by appealing to FOFC, but it is nonetheless striking.
10 This does not seem to be just an areal effect, since Basque, although spoken in Europe, is not part of the European Sprachbund that Haspelmath (2001) designates ‘‘Standard Average European.’’
11 In this latter, possessive case, it is also clear that the initial DP realizing the possessor (here, the girl from next door) defines its own extended projection, which is distinct from that of the possessum (here, smile).
12 If transitive deponents are associated with a nondefective phasal domain, the expectation would further be that V would not be available for probing by T since VP would, in the terms of Chomsky’s (2000) Phase Impenetrability Condition, be spelled out upon completion of the vP phase (see Richards 2013 for discussion of the effects of phasal spell-out on the realization of Case/case). Even if it undergoes movement to the edge of vP, however, T would not be expected to be able to probe V (or O), given the prior Agree operation involving v.
13 There are a number of other combinations that are somewhat indeterminate in relation to our concerns. For example, Dryer (2011a) defines a ‘‘mixed’’ category for subordinators, which includes the combination of initial and final, as well as clause-internal (often second position) and suffixal. There are 31 VO languages with mixed subordinators, and clearly these need to be investigated. Furthermore, some languages are taken to have no dominant order between OV and VO: 4 of these have mixed subordinators, 3 have suffixal subordinators, and 2 have final subordinators. The numbers are small in every case, but again these cases should be investigated more closely.
The numbers of languages discussed here do not add up to 660. As explained in WALS, when features are combined (here, 94A Order of Adverbial Subordinator and Clause and 83A Order of Object and Verb), ‘‘the numbers for languages in each row or column do not necessarily add up to the total number of languages for the respective feature value. This is due to the fact that the sets of languages for two features are typically not identical’’ (see wals.info/feature/combined).
14 There are contexts in which it is possible for head-initial CPs to surface preverbally in West Germanic and other OV languages—namely, where these CPs have undergone Ā-movement, either to clause-initial position (see Koster 1978, Alrenga 2005, on sentential subjects) or to a clause-internal position, which may superficially appear to be the complement position. As Barbiers (2000) notes, however, this preverbal position can be shown to be higher than that associated with the unmoved preverbal complement, CPs located in this position being to the left of VP adverbs and satisfying the same diagnostics as scrambled elements.
16 It is not crucial that the offending complement (or adjunct) in (29b) is a PP. The same effect is found when the complement is a clause (the postposition jälkeen assigns genitive; see footnote 15).
‘after the decision’
päätös pitää kokous
decision hold meeting
‘the decision to hold a meeting’
*päätöksen pitää kokous jälkeen
decision hold meeting after
17 In fact, there are reasons to think that the apparent complement of rajan ‘border’ in these examples is a reduced relative, hence an adjunct. If so, the ungrammaticality of (29b) shows that FOFC also applies in cases where the sister of a lexical head is an adjunct (as countenanced by our definition in (1) and (3)). If we mark the sister of N explicitly as a relative clause, it is still excluded in postnominal position, when the NP is embedded under a postposition.
(i) *rajan [joka kulkee maitten välillä] yli
border which goes countries between across
Intended: ‘across the border which runs between the countries’
Similarly, in the case of verbs taking a locative adjunct, FOFC applies just as it does with complements. Compare (ii) with (11d).
(ii) *Milloin Jussi [VP uinut [PP Englannin kanaalin yli]] on?
when Jussi swum England’s Channel across has
Intended: ‘When has Jussi swum across the English Channel?’
A difference between (some) complements and adjuncts is that adjuncts ‘‘have more ways to avoid FOFC,’’ so to speak, since they are generally less rigidly ordered than complements. Thus, a relative clause can occur to the right of the higher postposition (see Sheehan 2013a,b, and Biberauer and Sheehan 2012 for discussion).
(iii) rajan yli [joka kulkee maitten välillä]
18 ‘‘Circumpositions’’—found in, among other places, West Germanic, the Gbe languages of West Africa, and Iranian contact varieties—appear to be somewhat different from the Finnish pre/postpositions discussed here. These are illustrated in (i).
(i) a. in den Laden rein
in the.ACC store R-in
‘into the store’
b. onder de brug door
under the bridge through
‘under the bridge (path)’
c. in die huis in
in the house in
‘into the house’
Here it appears that we have a head-initial PP in the complement of a postposition, in violation of FOFC: in each case, the postpositional element expresses directional meaning, which is universally encoded higher within the adpositional domain than the locative meaning encoded by the preposition (see Cinque and Rizzi 2010 for recent discussion).
However, the postpositions in these constructions appear to be a rather nonuniform set of elements, including adverbial or particle-like intransitive prepositions (see Svenonius 2003a,b, 2006, 2007, 2008, 2010). In cases where they cannot plausibly be viewed as integrated with the head-initial PP, they would not be heads taking the preceding PP as their complement, and hence no challenge to FOFC.
Where integration seems likely, as in (ia–c), it is striking that closer inspection of the relevant structures reveals that the preposition is dominated by a complete extended projection including the equivalent of a CP domain, while the postposition is not (see, e.g., Aelbrecht and Den Dikken 2011, Djamouri, Paul, and Whitman 2013, and Biberauer et al., to appear, for detailed discussion of relevant cases). This means that the postposition is defective and, as we will argue in section 3.2 in relation to defective elements more generally, not part of the extended projection of its PP/DP complement.
In the specific case of West Germanic structures such as those illustrated in (i), it is also highly plausible to assume, as Noonan (2010) does, that the defectiveness of the postpositional element in fact entails that it is integrated within a verbal projection headed by silent GO in roughly the manner schematized in (ii). (See Noonan 2010 for the full structure. Also see Van Riemsdijk 2002 for entirely independent arguments in favor of the existence of silent GO in West Germanic varieties. On the oft-noted difficulty of distinguishing adpositions and particles in these systems, see Neeleman 1994, Den Dikken 1995, Zeller 2001, and de Vos 2013, among others. Biberauer et al. (to appear) present a detailed discussion of all these points.)
Strikingly, independently occurring postpositions in West Germanic are always directional (see Biberauer 2006, Aelbrecht and Den Dikken 2011). Assuming them to take the form in (ii), it also becomes possible to understand the presence of postpositional elements in systems that, like Germanic, have head-initial DPs.
19 Cinque (2005) proposes that (35) is derived by NP-movement to Spec,NumP, applied to the underlying structure (33) (which Cinque, following Kayne 1994, takes to be the universal underlying structure).
(i) [DemP Dem [NumP NP [Num t]]]
(34) is then derived by a further movement of the NP, to Spec,DemP.
(ii) [DemP NP [Dem [NumP t [Num t]]]]
Cinque treats these cases as involving NP-movement, as he does not assume any head movement. He takes putative complements of N to be NP-external (see Cinque 2005:327n4).
20 Typical clausal specifiers (raised subjects, shifted objects) are in a distinct extended projection from the clausal functional categories and hence do not induce FOFC violations. Possessor arguments in DP are categorially nondistinct from the nominal they are embedded in, though, and we therefore expect them to be sensitive to FOFC, which they are not (e.g., in English: the author of the book’s agent). We take this to be a ‘‘freezing’’ or ‘‘multiple spell-out’’ effect (see Uriagereka 1999). This will arguably also exempt Ā-moved categories from FOFC (and in fact also subjects and shifted objects regardless of their categorial identity). See section 4.3.1 for discussion of the notion ‘‘extended projection.’’
21 Not all Pol heads appear to be structurally dominated by C, however, a possibility that Laka (1994) in fact allows for: in her account, it is a matter of parametric variation whether Pol surfaces within the IP or CP domain. Consideration of the distribution of question particles clearly shows that these elements are dominated by C (see also Bailey 2012); investigation of the behavior of clause-final negation and negative concord markers, such as Biberauer’s (2009, 2012) and Biberauer and Cyrino’s (2008, 2009), however, suggests that where these elements derive from structurally ‘‘high’’ elements like anaphoric negation or other speaker/hearer-oriented discourse markers, they are typically merged above C, at the very periphery of CP. Here they may (e.g., Afrikaans nie2) or may not (e.g., Brazilian Portuguese não) be integrated into the clausal extended projection, with the result that we do not always expect high negation markers to be superficially FOFC-compliant. See section 3.2 for brief further discussion of why this difference between CP-internal and CP-peripheral Pol might obtain.
22 This may be particularly interesting if one considers that adpositions may be reanalyzed as complementizers (see Roberts and Roussou 2003, van Gelderen 2004, 2010 for discussion and references). Given what is known about the distribution of head-final complementizers, it would appear that postpositions in head-initial languages systematically fail to undergo this diachronic process, another case of a FOFC-violating option being avoided, this time in the diachronic domain.
23 If we adopt the general approach to morphology articulated by Marantz (1997), typified by the slogan that, within the word, the structure is ‘‘syntax all the way down,’’ then we would expect that FOFC holds at and below the word level. Myler (2009) presents some evidence that this is the case. This supports Marantz’s thesis, as well as providing further evidence for FOFC; see Biberauer et al., to appear.
24 It should be noted, though, that among languages that have head-final VP but head-initial PP, most appear to systematically avoid having PP in preverbal position, typically resorting to extraposition instead, as reported by Sheehan (2008). Why this should be is a matter that we leave aside here, as the empirical point of central relevance to the present discussion is that head-initial PPs in systems like West Germanic unproblematically surface in preverbal position.
26 This appears to be a more generally attested pattern in languages. Consider, for example, the Jespersenian doubling that so frequently arises in negative contexts, the ‘‘forked modality’’ structures discussed by Cheng and Sybesma (2003), and the ‘‘definiteness’’ doubling found in many systems, including the Celtic cases discussed in section 2.4.
27 For example, the SVO language Thai has several final question particles, which, according to Yaisomanang (2012), are all derived by ellipsis from a disjunctive structure. The question (i) is derived from (roughly) the underlying structure (ii) with two disjoint Polarity Phrases (PolP). The question particle mǎy in (i) is in fact the spell-out of rǔu ‘or’ and mây ‘not’, taking the segmental form of the negation and the tone of the disjunction.
(i) nát khàp rót mǎy?
Nath drive car Q
‘Does Nath drive?’
(ii) [Q [IP nát I [[PolP khàp rót] rǔu [PolP mây khàp rót]]]]
28 See footnote 21.
29 As discussed by Biberauer (2009, 2012), speaker/hearer-oriented tags and anaphoric negation elements represent an underdiscussed source of negative reinforcement markers of the kind most famously associated with Jespersen’s (1917) work. Where they occur as final elements in (partially) head-initial systems, these structurally ‘‘high’’ elements naturally pose a potential challenge to FOFC and can be treated along the lines proposed here.
30 This proposal has been challenged, but plausible alternatives retain the assumption that ge- instantiated a verbal suffix. As noted in section 2.2, the systematic unavailability of Latin-type V-O-Aux structures involving participles also suggests that Germanic participles, unlike their earlier Latin counterparts, are [+V].
31 This variable distribution of clausal features on adjacent heads is precisely what is expected, both system-internally and crosslinguistically, if a ‘‘spanning’’ approach of the type discussed in Distributed Morphology terms by Bjorkman (2011) and in Nanosyntactic terms by Svenonius (2012) is on the right track.
32 As discussed by Baker (2008), although few languages are consistently head-initial or head-final, harmony is still preferred, so that languages cluster at both ends of the scale, where the endpoints are consistently head-initial and consistently head-final.
33 If ‘‘contain’’ is irreflexive, all c-command is asymmetric, and (62b) is not needed.
35 This raises the question of the status of ^ at the interfaces. It may be that LF simply ignores this element, since it has no denotation; unlike ignoring an unvalued/uninterpretable φ-feature, this has no deleterious effects. In PF, ^ has an effect, in that the head associated with it must have a category in its specifier (although that category may be a copy and so undergo deletion at some point).
36 Here we assume that head movement is not triggered by ^, either because it is not part of core syntax (Chomsky 2001:37–38) or because it is the consequence of a particular type of Agree relation—namely, the type where the goal is defective in relation to the probe in that its formal features are included in those of the probe (see Roberts 2010, where this idea is developed). Hence, ^ only triggers phrasal movement.
37 The phase-head-related EFs discussed here should not be confused with the generalized Merge features, also designated edge features, ascribed to every lexical item in Chomsky 2007, 2008. As languages do not differ with respect to the fact that their lexical items may undergo external Merge, whereas they do differ with respect to whether already merged, and thus EF-bearing, items can trigger movement (internal Merge), it may be necessary to draw a distinction here (contra Chomsky 2007:17, 2008:144; see Kandybowicz 2008, 2009 for a proposal along these lines). We leave open the possibility that non-Agree-driven movement simply involves a head associated with two EFs, that is, an external-Merge-triggering EF that bears a further internal-Merge-triggering EF as a secondary feature.
38 Thanks to an anonymous LI reviewer for clarifying and correcting these definitions.
39 Here we depart from widespread assumptions regarding the relation between v and V. While it is sometimes assumed that v determines the verbal nature of the lexical root V (see Chomsky 2001), we are claiming that the lexical root allows its verbal feature to be copied by v. Here we follow Myler (2009), who makes a distinction between the clausal v, which we are dealing with here, and sub-word-level verbalizing v, which renders a root verbal.
40 If the Merge-triggering EF can host ^, this is, of course, no longer true.