I argue that Bošković’s (2011c) generalization concerning the island-voiding effect of incorporation can be captured naturally within minimalist bare phrase structure if head movement (a) is a syntactic operation and (b) leaves no trace/copy. É. Kiss’s (2008) ‘‘domain-flattening’’ phenomena are also expected under the proposed account. Further empirical consequences are discussed.
Head movement (HM) has been subject to close scrutiny in recent syntactic theorizing. The reason is that some of its properties appear unusual in the context of the minimalist conception of grammar and its desiderata. As Chomsky (2001) points out, in contrast to XP-movement, HM (a) does not seem to affect interpretation; (b) has no clear triggers; (c) is acyclic; (d) leads to a configuration where the moved element does not c-command its trace; and (e) in long-distance contexts, proceeds in a ‘‘snowball’’ fashion, forming increasingly larger clusters with each step, rather than successive-cyclically as XPs would (in fact, successive-cyclic HM would require excorporation; see Roberts 1991). All of this creates a picture that does not fit easily into the mainstream conception of movement (internal Merge) as an operation that (A) affects interpretation; (B) is triggered; (C) is strictly cyclic; (D) observes the c-command requirement on traces; and (E) is successive-cyclic.
The observed discrepancy regarding HM has generated a number of proposals aimed at achieving greater coherence between HM and what is considered to be good minimalist design. These proposals diverge regarding whether HM should be treated as part of narrow syntax, and they fall roughly into three different types. The first, originally due to Chomsky (2001), takes the above-mentioned properties as a possible indication that HM is not part of the core engine performing syntactic computations, but is subject to requirements at the PF interface. Boeckx and Stjepanović (2001) explore this suggestion in the context of pseudogapping (see also Baltin 2002).1 Another, more radical view assumes that HM does not exist at all and that its effects are derivable essentially from a series of remnant XP-movements (e.g., Kayne 1994, Koopman and Szabolcsi 2000). Finally, the third approach argues that HM must be retained in core syntax and attempts to incorporate its seemingly unusual properties into the minimalist system via reinterpreting assumptions pertaining to XP-movement and combining them with certain morphological requirements (Matushansky 2006), exploring the theoretical potential of HM (e.g., Donati 2006, Roberts 2010, Surányi 2008), and strengthening independent evidence that HM indeed has syntactic status (Lechner 2006).
I argue that at least some cases of HM must be treated as part of core syntax by virtue of an important syntactic effect that it produces. These are cases of incorporation of D or P heads into a c-commanding lexical category (e.g., V). That incorporation exists has long been known (Baker 1988). The currently accumulating evidence suggests that when the incorporating D itself heads a syntactic island (e.g., complex DP), incorporation of that D effectively removes that island, so that a dependency such as wh-movement can be formed across the vanished boundary. As I will show, this seemingly surprising effect is in fact not surprising at all, but follows naturally from considerations of endocentricity in combination with the minimalist bare phrase structure.
2 Islands Headed by Traces
(1) A phrase that is normally a barrier to movement ceases to be a barrier if headed by a trace.
At issue here is an unusual situation where islandhood of a phrase is voided when that phrase is headed by a trace. Consider some relevant data.
2.1 Voiding DP Islands
Uriagereka (1988, 1996) shows that argument (2) and adjunct (3) wh-extraction from inside the DP in Galician are possible only if determiner incorporation into the selecting verb has taken place (examples from Uriagereka 1996:270–271; emphasis mine).
(2) (?)De quén liche-los mellores poemas de amigo?
of whom read.2SG-the best poems of friend
‘Who did you read the best poems of friendship by?’
cf. *De quén liches os mellores poemas de amigo?
of whom read.2SG the best poems of friend
(3) De que zonas liche-los mellores poemas de amigo?
of what areas read.2SG-the best poems of friend
‘What areas did you read the best poems of friendship from?’
cf. *De que zonas liches os mellores poemas de amigo?
of what areas read.2SG the best poems of friend
Such incorporated determiners act as ‘‘gates’’ for extraction from islands. Interestingly, D-incorporation in Galician is possible not only out of DP objects, but also out of subjects (4) and adjuncts (5). Bošković (2011b) claims that D-incorporation out of an adjunct in (5) opens the adjunct gate for extraction (see also below and footnote 3).
(4) Merda fixested-los fachas!
shit did-the fascists
‘You fascists did nothing!’
(5) Chegamo-la semana pasada.
arrived-the week last
‘We arrived last week.’
(6) ?De que semana chegastede-lo Luns?
of which week arrived.2PL-the Monday
‘Of which week did you guys arrive on the Monday?’
In essence, after D-incorporation the remaining phrase behaves, with respect to extraction, very much like a bare NP in languages without the category Determiner. A number of authors have shown that in languages that do not have strong overt determiners such as definite articles, the category D is not realized (Bošković 2008, Corver 1992, Stjepanović 1998), and instead of DP these languages feature just bare NPs (possibly with some additional functional structure, though never a DP; see Bošković 2011a). Such bare NPs allow wh-extraction out of them.
Left branch specifiers can be extracted from bare NPs, but not from DPs. This suggests that in D languages, after D-incorporation, DPs behave like NPs with respect to extraction, an observation that will become relevant in the following discussion. The generalization in (1) captures that intuition while making use of the notion of HM trace, along the lines of (8) (cf. (2)).2
2.2 Voiding PP Islands
P(reposition)-stranding languages provide another well-known illustration of the voiding effect—for example, in pseudopassives and wh-questions.
(9) This book_i has been frequently [referred to] t_i.
(10) What_i did he [talk about] t_i?
Assuming PPs to be generally islands (cf. Van Riemsdijk’s (1978) Head Constraint), a special rule of reanalysis was proposed in early Government-Binding Theory to account for P-stranding (Hornstein and Weinberg 1981). This rule reanalyzes the PP, whereby the preposition becomes part of the selecting predicate. Pseudopassivization and wh-movement are possible only out of reanalyzed PPs.
(11) *This city_i was frequently traveled [to t_i].
(12) *Which piece_i did John fall asleep [during t_i]?
Even though conditions regulating reanalysis remain somewhat poorly understood, HM is a very good candidate to be part of the explanation of the relevant phenomena. Bošković (2011c) relates P-stranding to the generalization in (1) on the basis of a larger sample of crosslinguistic examples involving P-incorporation, many of which again involve incorporation out of an adjunct.3 If a preposition incorporates into a verb, then its complement behaves like a bare NP and can be either A- or Ā-moved. Once again, incorporation somehow frees the structure, removing the island boundary.
Bošković (2011c) offers an interpretation of (1) via the ‘‘repair-by-PF deletion’’ strategy. The original ‘‘repair-by-deletion’’ account of Chomsky (1972) was concerned with the important observation made by Ross (1969) that ellipsis voids island effects. Starting from Merchant 1999, the issue has received renewed attention in the literature (cf., e.g., Lasnik 2001 and Merchant 2008). This is illustrated in (13) (deleted material is enclosed in angle brackets).
(13) a. *That he will hire someone is possible, but I will not divulge who_i [[Island that he will hire t_i] is possible].
b. That he will hire someone is possible, but I will not divulge who_i <[[Island that he will hire t_i] is possible]>.
Chomsky originally suggested putting a # on an island crossed by a movement operation. If the # remains in the final structure, a violation occurs. However, if some later operation such as ellipsis erases the #, then no information about a possible violation is retained in the final representation. Bošković (2011c) explores this idea in the context of copy deletion and suggests that (1) can be reduced to the ‘‘repair-by-deletion’’ scheme if one assumes that the # (Bošković and other authors use a star * instead) is assigned not to an island boundary (e.g., CP) but to its head. In particular, in (13b) the star is placed on t_i upon wh-movement. PF operations delete t_i, which actually is a silent copy of the moved head, thus avoiding incurring a violation.
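The bookkeeping behind this scheme can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: the names `Head`, `cross_island`, `pf_delete`, and `converges` are hypothetical and are not taken from Chomsky 1972 or Bošković 2011c, and the sketch models nothing beyond the placing and erasing of the diacritic.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Head:
    """A head (or a silent copy of one) that may head an island."""
    form: str
    starred: bool = False   # the #/* diacritic (assumed to sit on the head)
    deleted: bool = False   # True once PF deletion has applied


def cross_island(island_head: Head) -> None:
    """Movement across an island marks the head of that island."""
    island_head.starred = True


def pf_delete(head: Head) -> None:
    """PF deletion (ellipsis or copy deletion) removes the head at PF."""
    head.deleted = True


def converges(heads: Iterable[Head]) -> bool:
    """A violation is registered only if a starred head survives to PF."""
    return not any(h.starred and not h.deleted for h in heads)


# Toy run: a starred island head yields a violation unless PF deletes it.
island_head = Head("t")
cross_island(island_head)
violation_before_deletion = not converges([island_head])
pf_delete(island_head)
repaired_after_deletion = converges([island_head])
```

On this toy run the star survives before deletion and is gone afterwards, which is all the ‘‘repair’’ amounts to in this scheme.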
Attributing the voiding effect to the ‘‘repair-by-deletion’’-like scenario captures obvious similarities between ellipsis and deletion of copies, but the resulting picture is unnecessarily complicated in the minimalist context, while requiring a fair amount of stipulation. It is not clear whether both syntactic and PF components should have to be involved here. Importantly, the star-assigning convention violates the Inclusiveness Condition (the ban on introducing entities not present in the numeration), a fundamental piece of the minimalist architecture.4 In addition, the account bears on rather technical issues concerning where the star is assigned, which are difficult to substantiate. Below, I outline an alternative explanation of the island-voiding effect that avoids the complicating #/*-assigning convention altogether.
3 Islands Not Headed by Traces
(1) is based on the underlying assumption that HM leaves a trace/copy in the base position. The trace assumption is carried over from the Government-Binding framework and reflects general considerations of phrase structure preservation and derivational history. In particular, if some head α has projected to a phrase αP and then moved away from its phrase to adjoin to head β, the trace/copy symbol ensures that the derivation retains the information that αP is a projection of α.
(14) [βP . . . α+β . . . [αP . . . t_α . . . ]]
Several points about this analysis are noteworthy. Apart from the issues pointed out in section 1, a major difference between HM and XP-movement is that under HM, the element that moves is also the one that leaves behind a phrase carrying its label. In a sense, after movement head α belongs in two places simultaneously: αP and βP. This intuition was recognized a while ago and was encoded in the Government-Binding framework. One prominent example is Baker’s (1988) Government Transparency Corollary, whereby a lexical category with an item incorporated into it is assumed to govern everything that the incorporated item governed in its original structural position.
A similar intuition may in principle be reformulated in minimalism, if HM leaves a trace/copy behind (see Den Dikken 2007 and Gallego 2010 for recent, relevant discussion). Consider, however, the possibility that it does not. First, the entire line of thought attributing HM to the PF component implicitly entails that HM traces do not play a significant syntactic role. Second, previous discussion of the relation between the traces of moved elements and their LF interpretation (in particular, in the context of reconstruction) has concentrated mostly on XPs. Chomsky (1995:chap. 3) notes that reconstruction is a general property of Ā-movement creating operator-variable chains. With respect to A-movement, Chomsky notes the absence of reconstruction effects in general and suggests that traces of A-movement, unlike those of Ā-movement, are ignored by interpretive components such as LF. Lasnik (1999) further explores this argument, evaluating it on a wide range of relevant constructions, and suggests that the lack of reconstruction effects with A-movement indicates the absence of relevant traces/copies of A-movement altogether. This move is also appealing given the minimalist consideration of conceptual necessity: if an element plays no role in the computation, it may not be there at all.5
This reasoning is relevant in the case of HM to the extent that base copies of HM do not play a role in LF interpretation either. Two cases of HM have recently been claimed to have an effect on LF interpretation (see also Lambova 2004 for the claim that heads can undergo focus movement). One case concerns licensing negative polarity items (NPIs) under HM (Kayne 2000:44, Roberts 2011), as illustrated in (15).
(15) a. *Which one of them does anybody not like?
b. Which one of them doesn’t anybody t like?
Under the standard assumption that NPIs must be c-commanded by their licensers at LF, movement of negation changes c-command relations in a way that yields a well-formed LF representation. The other case concerns positioning of certain modals with respect to clausemate quantifiers, discussed in detail by Lechner (2006) and illustrated in (16).
In (16), the modal can take scope over the quantifier, resulting in the inverse scope configuration. Lechner further argues that the position in which the quantifier is interpreted (t_QP in (17)) is above the base-generated position of the modal.
(17) . . . QP . . . Mod . . . t_QP . . . t_Mod . . . t_QP
It follows that the modal must be interpreted in the derived position. Note, however, that in both these cases what gets interpreted under this analysis is the higher copy of the moved head. The lower copy that would correspond to the trace/copy still plays no role for LF interpretation.6
(18) [βP . . . α+β . . . [αP . . . ]]
But this configuration simply cannot arise, because it violates the fundamentally endocentric character of phrases. Unlike the case of A-movement, which originates from a specifier or complement position of some head α heading a phrase αP (i.e., from a nonprojecting position), traceless HM of head α does affect the core head-centered skeleton of αP in a fatal way: it ceases to exist. This is straightforward in the bare phrase structure framework, where labels (if they exist at all) are relationally determined. If α projects, the label of the resulting object is identical to the label of the head (i.e., α). When α moves away, the projection of α in the base position collapses, along with its label.
Consider abstractly what this implies for the structure of the vanishing phrase. Suppose a head α takes YP as a complement, forming αP, which is then selected by another head β. β probes for α, and α raises. αP collapses, and YP becomes a complement of the conglomerate head, as depicted in (19).8
HM of α to β may proceed either as simple head adjunction as in (19) or by formation of a complex α+β cluster and possible relabeling of the phrase, (α+β)P. The choice between these options in each particular case is possibly constrained by considerations of selection, feature sharing, θ-role assignment, and so on. For the moment, this is immaterial. The crucial point is the disappearance of αP. If αP happens to be an island—in particular, a DP or a PP island—then the island-voiding effect follows directly from the traceless HM.
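The collapse of αP can be illustrated with a toy model of bare phrase structure. The Python below is a minimal sketch under simplifying assumptions, not a formalization of the proposal: `Phrase`, `merge`, `traceless_hm`, and `dominating_labels` are hypothetical names, labels are simply copied from the projecting head, and only the simple head-adjunction option is modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Phrase:
    label: str                # label = label of the projecting head
    daughters: List["SO"]


SO = Union[str, Phrase]       # a syntactic object: a head or a phrase


def merge(head: SO, other: SO) -> Phrase:
    """Merge two objects; the first one projects and supplies the label."""
    label = head if isinstance(head, str) else head.label
    return Phrase(label, [head, other])


def traceless_hm(beta_p: Phrase) -> Phrase:
    """Input: [beta beta [alpha alpha YP]]. The head alpha adjoins to beta
    and leaves no copy, so no projection of alpha remains; YP becomes the
    complement of the complex head alpha+beta."""
    beta, alpha_p = beta_p.daughters
    assert isinstance(beta, str) and isinstance(alpha_p, Phrase)
    alpha = alpha_p.label
    remnant = [d for d in alpha_p.daughters
               if not (isinstance(d, str) and d == alpha)]
    return Phrase(beta, [f"{alpha}+{beta}"] + remnant)


def dominating_labels(so: SO, target: str) -> Optional[List[str]]:
    """Labels of the phrases dominating `target`, from the root down."""
    if isinstance(so, str):
        return [] if so == target else None
    for d in so.daughters:
        below = dominating_labels(d, target)
        if below is not None:
            return [so.label] + below
    return None


# A wh-element inside NP, inside DP, selected by V.
np_ = merge("N", "wh")
dp = merge("D", np_)
vp = merge("V", dp)
before = dominating_labels(vp, "wh")                # ['V', 'D', 'N']
after = dominating_labels(traceless_hm(vp), "wh")   # ['V', 'N']
```

Before incorporation, a projection labeled D intervenes between the wh-element and V; after traceless HM no such projection exists, so there is no DP boundary left to cross.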
In the end, Bošković’s (2011c) account of the voiding effect in (1) converges on the same conclusion: that traces do not head islands. But while for Bošković this results from a series of manipulations involving both syntax (leaving the trace, inspecting the island, assigning a star to its head) and the PF component (deleting the trace and the star), I suggest that it may not be necessary to use this set of conceptual tools to create a trace and then delete it. Rather, the voiding effect with HM largely follows from endocentricity and the relational character of labels in bare phrase structure, a minimal assumption if bare phrase structure reflects an optimal language design. The voiding effect arises exactly in those cases of HM that leave no trace (perhaps all), thus make no use of alleged lower copies.
Recall that properties (d) and (e) of HM mentioned in section 1 as oddities for minimalist syntax have to do with traces that it supposedly leaves: they are not c-commanded by the moving element, and they are not licensed (violating the Head Movement Constraint or the Empty Category Principle, depending on one’s view) in excorporation contexts that would otherwise appear to be natural HM counterparts of successive-cyclic XP-movement. If syntactic HM does not leave a trace/copy at all, then properties (d) and (e) may no longer be relevant in light of minimalist guidelines, and can be removed from the list of oddities for syntactic HM altogether.
There are reasons to think that the same can be said about the remaining properties (a)–(c). Property (a) is challenged by the ‘‘gate-opening’’ character of incorporation, illustrated in basic cases like (2) and (3) (as already pointed out by Uriagereka (1988)), as well as by Roberts’s (2011) and Lechner’s (2006) cases of NPI-licensing and modals, respectively. Concerning (c), note that there is no acyclicity problem from the probe-goal perspective, since the probing head c-commands the head it is probing (see Epstein 2001). Finally, the triggers argument (b) can be questioned as well. For instance, one may imagine postulating a trigger for syntactic V-to-T movement (e.g., T- or V-feature), which does not seem to be an obviously worse mechanism than, for instance, the [+wh] feature for wh-movement.
4 Residual Wh-Scope Marking
Stepanov and Stateva (2006) suggest an account much along the lines of (19) in their theory of successive-cyclic wh-movement in long-distance wh-questions as in (20) (possibly extendable to other Ā-movement contexts).
(20) Who_i do you believe t_i Peter likes t_i?
Stepanov and Stateva propose that the successive-cyclic property of long-distance wh-movement in languages like English is due to a residual wh-scope-marking structure in these languages. Consider the following typical wh-scope-marking question:
In (21), there are two clause-bound wh-dependencies, one of which is headed by the ‘‘wh-scope marker’’ was ‘what’, which, as the translation of (21) suggests, appears to mark the high (matrix) scope of the other wh-phrase. Under the analysis known as indirect dependency, however, the ‘‘wh-scope marker’’ is an independent contentful wh-phrase itself. Under some versions of indirect dependency, the wh-scope marker is a wh-head W that forms a constituent with the embedded wh-clause at D-Structure (and has a semantic type function that takes the embedded question as an argument at LF; see, e.g., Dayal 1996, 2000, Mahajan 2000, Stepanov 2000), as illustrated in (22).
(22) [CP[+Q] . . . V [WP W [CP[+Q] wh_i . . . t_i . . . ]]]
Note that the structure of WP is closely reminiscent of a complex DP island and is also in line with the views treating finite complementation in terms of an NP/DP-shell (Bayer 1996, Müller and Sternefeld 1995, Stepanov 2001). The main relevance of this type of question in the context of successive cyclicity is that in constructions such as (21) all CPs are marked [+Q], which provides a potentially relevant context for successive checking of all [+Q] features by a single wh-phrase, rather than locally by different elements (e.g., a wh-phrase and a wh-scope marker). The challenge is thus to circumvent the island. Stepanov and Stateva suggest that the D-Structure in wh-scope-marking languages as well as in ‘‘long-distance wh-movement’’ languages is basically the same, namely, that in (22). In the course of the derivation, W can either overtly move to the matrix CP domain, as in German, Russian, and Hungarian; stay in situ, as in Hindi; or incorporate into the selecting V (propositional attitude verb), as in English and other long-distance movement languages. Whether W incorporates into V or not depends on the morphological status of W: if W is an affix, it incorporates; if not (like German was), it does not. The incorporation option is realized along the lines of (19). In particular, with the derivation of (20) proceeding from the bottom up, an embedded-question CP is formed and local wh-movement takes place.
(23) [CP[+Q] who_i Peter likes t_i]
Then a W (in English, phonologically null) merges, taking the CP in (23) as a complement, forming a WP; and the matrix V merges. At this point, W undergoes traceless HM, incorporating into V and forming a complex predicate. The WP ceases to exist.
(24) believe [WP W [CP[+Q] who_i Peter likes t_i]] →
[VP believe+W [CP[+Q] who_i Peter likes t_i]]
The wh-phrase in the specifier of the embedded CP can now take a further step to the matrix Spec,CP (Stepanov and Stateva assume that the wh-phrase can check its [+wh] feature more than once). Successive checking of [+Q] features of Cs along the way is thus responsible for the successive-cyclic effect. The main advantage of this island-voiding perspective of successive cyclicity is that it allows one to unify seemingly unrelated types of interrogative constructions such as wh-scope-marking and long-distance wh-questions under a common derivational history and associate general principles of structure building with language-specific morphology (namely, the makeup of W), deriving the relevant patterns across a wide range of crosslinguistic material (see Stepanov and Stateva 2006 for details).9
5 Specifiers, Adjuncts, and Flattening Constituent Structure
One further consequence of traceless HM has to do with an interesting phrase-structural effect concerning disappearance of a phrase after incorporation of its head. Note that when the original αP in (19) collapses, the YP—the complement of α—automatically becomes a complement of the new conglomerate head that I designated as α+β. The question now arises, what happens in a more complex case when the original αP has a richer structure including not only a complement YP, but also specifier(s) and/or adjoined XPs?
These more complicated cases naturally fall into the ‘‘domain-flattening’’ account proposed by É. Kiss (2008) (though not directly into the generalization in (1)). É. Kiss proposes the following generalization:
(25) When a V is moved into a functional head, the maximal constituents in its internal domain become freely permutable sister nodes.
(É. Kiss 2008:459)
In line with Chomsky 1995, the internal domain of a V-chain includes the complement of V, the specifiers of intermediate verbal projections, and anything adjoined to intermediate verbal projections (but not subdomains thereof). (25) allows for a straightforward account of the otherwise puzzling patterns in Hungarian whereby word order is fixed in the preverbal domain but is free postverbally. Furthermore, É. Kiss shows that the free postverbal order correlates with a flat structure of verbal constituents that can be probed, in particular, by Condition C.10 É. Kiss argues that raising the verb as high as T (the verbal particle being in Spec,TP) leaves a verbal projection (PredP for É. Kiss) headless and causes its collapse; as a result, its major constituents are linearized at random in the syntactic component.11
‘The boys fell out with each other.’
(É. Kiss 2008:459, (54))
(27) When the head of a phase is moved into the head position of the next higher phase, the silent copies of the moved head and their projections are pruned.
(É. Kiss 2008:462)
Pruning is an extra operation, an add-on to bare phrase structure similar to PF deletion in Bošković’s (2011c) system. But under the proposed traceless HM, introduction of this operation, thus stipulating either (25) or (27), can be avoided without loss of generality, since there is trivially no need to worry about silent copies. The proposal I outline here thus unifies É. Kiss’s domain-flattening account and Bošković’s generalization in (1). Furthermore, if DPs and PPs can be shown to be phases, as they are often claimed to be (see, e.g., Abels 2003, Chomsky 2007, Svenonius 2004, Van Riemsdijk 1978), then É. Kiss’s conjecture about the relevance of phases can be straightforwardly reinterpreted so as to pertain to traceless HM in general. This seems a promising avenue for further exploration.
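The flattening effect discussed in this section can be sketched in the same toy terms. The Python below is an illustration under stated assumptions rather than an implementation of É. Kiss’s or the present analysis: `Phrase`, `internal_domain`, and `raise_and_flatten` are hypothetical names, the verbal projections are identified simply by sharing the label of the verb, and linear order among the resulting sisters is not modeled.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Phrase:
    label: str                 # label of the projecting head
    daughters: List["Node"]


Node = Union[str, Phrase]


def internal_domain(so: Node, head: str) -> List[Node]:
    """Maximal constituents inside the projections of `head`: complement,
    specifiers, and adjuncts, each kept whole (no subdomains)."""
    if isinstance(so, str):
        return [] if so == head else [so]
    if so.label != head:
        return [so]            # a non-verbal phrase counts as one unit
    collected: List[Node] = []
    for d in so.daughters:
        collected.extend(internal_domain(d, head))
    return collected


def raise_and_flatten(host: str, verbal_p: Phrase) -> Phrase:
    """The verb raises to `host` without leaving a copy; its projections
    cease to exist and their major constituents become sisters."""
    verb = verbal_p.label
    return Phrase(host, [f"{verb}+{host}"] + internal_domain(verbal_p, verb))


# [V subject [V [V V object] Adv]] with V raising to T.
subject = Phrase("D1", ["D1", "N1"])
obj = Phrase("D2", ["D2", "N2"])
vp = Phrase("V", [subject, Phrase("V", [Phrase("V", ["V", obj]), "Adv"])])
flat = raise_and_flatten("T", vp)
sisters = flat.daughters[1:]   # subject, object, and adjunct as sisters
```

In the output the subject, the object, and the adjunct are all daughters of the same node, which is the configuration that the symmetric c-command facts mentioned above would lead one to expect.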
6 Concluding Remarks
At least some cases of HM can be beneficially viewed as part of narrow syntax. This claim, under the bare phrase structure encoding of endocentricity and the idea that syntactic HM does not leave traces/copies, provides a natural account of collapsing XP domains responsible for the island-voiding effect. Put in other words, the island-voiding effect, seen in conjunction with the domain-flattening effect, suggests that if HM exists in the syntax, it may very well be traceless.
I thank Josef Bayer, Penka Stateva, and two anonymous LI reviewers for their helpful comments and suggestions. I was supported by Heisenberg Stipend STE 1758/2-1 from the German Research Foundation (DFG) during preparation of this work.
1 See also Bayer 2008 for an argument that the verb-second phenomenon in certain varieties of German involves displacement only of a phonological matrix of the verb.
3 This includes, for instance, P-incorporation out of manner adverbials in Kinyarwanda (i), but also the possibility of N/P-incorporation in reason adverbials in Chichewa, and passive by-phrases in Southern Tiwa and other languages, discussed in Baker 1988.
Bošković (2011b) uses these data to argue, contra Baker, that HM/incorporation out of islands, including adjuncts, is in fact possible, and that the previous explanations banning this kind of HM overlooked certain intervening factors. Note that among other things, this opens a potential way of accounting for stranding of an adjunct preposition under wh-movement in terms of the P-incorporation analysis of P-stranding (cf. Who did you come with?).
4 See Lasnik 2001 for a proposal dealing with this problem under minimalist assumptions.
5 See also Fox 1999. Fox argues that A-movement does reconstruct (and lower copies are interpreted) in some cases. He proposes that A-movement leaves a trace (not a copy) only in those cases. For present purposes, even this weaker version of A-movement reconstruction theory will suffice. Boeckx (2001) argues that cases of apparent reconstruction with A-movement can be accounted for by literal lowering, rather than interpreting the lower copy.
6 Note also that work on the semantics of incorporation in the framework of transparent LF suggests that it is the incorporated, not base, positions that are interpreted for semantic purposes—in particular, in the context of theories of indefinites (Van Geenhoven 1998, Wharram 2003). As an LI reviewer points out, Lechner (2006) also notes instances such as You can always [can] count on me where, as he claims, the wide scope of always with respect to can cannot be derived by raising always at LF. Whether the scope of adverbials like always can be reliably derived via their LF movement in the first place is not entirely clear, but instances like these should be taken into account in any case when ultimately delimiting the extent to which HM may be traceless.
10 For instance, the Hungarian counterpart of John_i’s mother loves him_i does not allow the indicated coreference, suggesting that the genitive specifier and verbal object c-command each other. The same is true of various V′ adjuncts. However, when a verb-related constituent is focused by moving it to a special Focus position preverbally, the usual asymmetric tests for constituent structure hold.
11 É. Kiss argues, more precisely, that when syntax does not force a particular linear order of constituents, the latter are ordered in Hungarian by increasing phonological weight. See É. Kiss 2008 for details. Note also that the domain-flattening account of V-to-T movement appears in line not with the more traditional ‘‘configurational’’ view of θ-role assignment (in this case, by V), but with the alternative perspective seeing θ-roles as features (e.g., Hornstein 2001, Lasnik 1995) that can be checked, for instance, prior to V-movement.
The question arises why other V-raising languages do not display the domain-flattening effect, similarly freeing the order of their constituents (É. Kiss tentatively mentions Scandinavian in this regard; an anonymous LI reviewer also mentions French). É. Kiss suggests that in those languages, unlike in Hungarian, nominative case checking requires the subject to move to Spec,IP/TP, which ensures that it will occupy a structurally higher position even after the flattening. Pursuing this line further, we may in principle conceive a similar scenario for direct and/or indirect objects, at least some of which must evacuate the VP, moving into higher functional projections (e.g., Agr-related) that are themselves hierarchically ordered, as for example in earlier minimalist conceptions of clause structure. See also Hoffman 1995 for a theory of scrambling along lines similar to that of É. Kiss.