A number of works have attempted to account for the interaction between movement and ellipsis in terms of an economy condition MaxElide. We show that the elimination of MaxElide leads to an empirically superior account of these interactions. We show that a number of the core effects attributed to MaxElide can be accounted for with a parallelism condition on ellipsis. The remaining cases are then treated with a generalized economy condition that favors shorter derivations over longer ones. The resulting analysis has no need for the ellipsis-specific economy constraint MaxElide.
A number of works have analyzed the interactions between ellipsis and movement in terms of MaxElide, an economy constraint that ensures that the biggest deletable constituent is elided within a given domain (e.g., Takahashi and Fox 2005, Merchant 2008, Hartman 2011). In this remark, we show that eliminating MaxElide allows us to provide an empirically superior account of restrictions on extraction from ellipsis. We show that a number of the core effects attributed to MaxElide can be explained in terms of parallelism alone; then, picking up on Fox and Lasnik’s (2003) proposal that movement may proceed in one fell swoop in certain elliptical contexts, we argue that the core MaxElide effects follow from more general conditions on the economy of derivations, with ellipsis derivations sometimes requiring fewer steps of movement than others. The resulting analysis has no need for the ellipsis-specific economy constraint MaxElide. Importantly, the analysis makes reference only to traces left by Ā-movement and head movement, with no place for A-movement. We argue that this fits with previous work on reconstruction, which has shown that A-movement often seems not to leave a trace (Chomsky 1995, Lasnik 1998, Fox 1999b, Takahashi and Hulsey 2009). Our analysis makes crucial use of a syntactic parallelism condition like the one proposed by Fiengo and May (1994) and Griffiths and Lipták (2014), since the semantic parallelism constraint assumed in the previous MaxElide literature proves insufficiently restrictive.
1 Previous Accounts
We begin by outlining the important details of Takahashi and Fox’s (2005) account of MaxElide and Hartman’s (2011) extension of it.1 The core fact that analyses of MaxElide aim to account for is illustrated in (1), where VP-ellipsis seems to be blocked when sluicing would also be possible in the same configuration.
(1) Mary was kissing someone, but I don’t know who (*she was).
Takahashi and Fox (2005) propose an account in terms of ellipsis parallelism, an identity condition on ellipsis, and MaxElide, a constraint that favors deletion of the largest constituent possible; crucially, the domain of application of MaxElide is the domain defined by the parallelism constraint, called a parallelism domain (PD). The Parallelism Condition (hereafter Parallelism) is given in (2)–(3), and MaxElide in (4).
(2) For ellipsis of EC [elided constituent] to be licensed, there must exist a constituent, which reflexively dominates EC, and satisfies the parallelism condition in (3). [Call this constituent the parallelism domain (PD).]
(3) PD satisfies the parallelism condition if PD is semantically identical to another constituent AC, modulo focus-marked constituents.
(4) Elide the biggest deletable constituent reflexively dominated by the PD.
Together with the assumption that wh-traces are interpreted as bound variables, this set of constraints derives (1): the variable in the object position ensures that the smallest possible PD is the constituent immediately dominated by its binder, since a free variable could not be semantically identical to the corresponding element in the AC,2 and applying MaxElide in this domain will ensure that the biggest deletable constituent is elided; in this PD, both VP-ellipsis and sluicing (TP-ellipsis) are possible in principle, so MaxElide chooses sluicing and VP-ellipsis is blocked. This is schematized for (1) in (5).
Thus, on this analysis VP-ellipsis is blocked because it is in competition with sluicing.
Before proceeding, we need to outline a few important aspects of this competition-based account. First, it predicts that VP-ellipsis should be possible when sluicing is ruled out. Takahashi and Fox (2005) note that this seems to be correct, since examples like (1) can be rendered grammatical if either the subject or the auxiliary is focused, as shown in (6); assuming that focus cannot be deleted, because of its prosodic prominence, this renders TP-ellipsis impossible, and thus application of MaxElide predicts the availability of VP-ellipsis.
(6) a. I don’t know who JOHN will kiss, but I know who SUSAN will.
    b. Mary doesn’t know who we can invite, but she knows who we CANNOT.
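The competition logic behind (1) and (6) can be made concrete with a small sketch. It is a toy model written for illustration, not Takahashi and Fox's (2005) formalization: the clause is reduced to a spine of uniquely labeled nodes, a rebound variable is represented only by the node that hosts its binder, and focus by the smallest node that contains the focused element. The function names and the list encoding are assumptions of the sketch.

```python
def parallelism_domain(spine, target, binder=None):
    """Smallest spine node that reflexively dominates `target` and also
    contains the binder of any variable inside it (toy version of (2)-(3))."""
    if binder is None:
        return target
    return spine[min(spine.index(target), spine.index(binder))]


def deletable(spine, node, targets, focus=None):
    """A node can be elided if it is a possible ellipsis target and does not
    contain focused material (focus is assumed to resist deletion)."""
    if node not in targets:
        return False
    return focus is None or spine.index(node) > spine.index(focus)


def max_elide_allows(spine, target, targets, binder=None, focus=None):
    """Toy version of (4): `target` may be elided only if it is the biggest
    deletable constituent reflexively dominated by its parallelism domain."""
    pd = parallelism_domain(spine, target, binder)
    inside_pd = spine[spine.index(pd):]  # listed from largest to smallest
    candidates = [n for n in inside_pd if deletable(spine, n, targets, focus)]
    return bool(candidates) and candidates[0] == target


SPINE = ["CP", "TP", "VP"]   # outermost node first; labels must be unique
TARGETS = ("TP", "VP")       # sluicing deletes TP, VP-ellipsis deletes VP

# (1): the wh-trace inside VP is bound from CP, and nothing is focused.
sluicing_in_1 = max_elide_allows(SPINE, "TP", TARGETS, binder="CP")
vpe_in_1 = max_elide_allows(SPINE, "VP", TARGETS, binder="CP")

# (6): same configuration, but the subject or auxiliary inside TP is focused.
vpe_in_6 = max_elide_allows(SPINE, "VP", TARGETS, binder="CP", focus="TP")
```

Under these assumptions the sketch licenses sluicing but not VP-ellipsis for (1), and licenses VP-ellipsis once focus inside TP removes TP from the set of deletable constituents, as in (6).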
Second, MaxElide only predicts competition to arise between different ellipsis options in a given PD, so it does not force ellipsis of a larger constituent whenever it is possible. Thus, examples like (7) are predicted to be grammatical even though ellipsis of the matrix VP is possible; this is because the lower VP is itself a PD, and so (2)–(3) are satisfied when it is elided.
(7) Mary [VP said you would [VP leave]], and Sue also [VP said you would [VP
For the most part, the conditions in (2)–(4) only have an effect in cases like (1), where there is a variable that is bound from outside a constituent that is a potential target for ellipsis, a configuration Takahashi and Fox (2005) call rebinding. Whether or not MaxElide applies in a given domain thus depends on the distribution of variables and their binders, and this in turn depends on which movement dependencies are posited to leave variables that could refine rebinding configurations.3
Hartman (2011) extends this system by using the distribution of MaxElide effects—that is, where sluicing blocks VP-ellipsis—to diagnose the distribution of rebinding configurations; considering the wider dataset, he concludes that traces of all kinds of movement must leave variables that count for calculating parallelism. The key data come from wh-adverbial questions. Building on work by Schuyler (2001) and Merchant (2008), Hartman observes that MaxElide effects are not observed with embedded wh-adverbial questions like (8), but they are observed in matrix wh-adverbial questions like (9), and he provides evidence from dialectal variation in English to indicate that the crucial difference between (8) and (9) is the T-to-C movement in the latter. But embedded wh-adverbial questions do show MaxElide effects when the adverbial is extracted from within the elided VP; that is, (10b), involving VP-ellipsis, is unacceptable on the reading where the question is about the time of leaving, where the adverbial is extracted from the lowest clause. The sluicing example in (10a), on the other hand, allows this reading.
(8) You say you’ll pay me back, but you haven’t told me when (you will).
(9) We know Anna is going to resign. The only question is: when (*will she)?
The interpretational possibilities in (10a) follow from the fact that the wh-adverbial is extracted from the VP that is the target for ellipsis; this makes the VP a rebinding configuration much like (1), and so the smallest possible PD is the one containing the binder of the wh-adverbial in Spec,CP. A simplified schematic is given in (11), with the base position of the wh-adverbial shown as a TP-adjunction position.
(11) [CP when λx [TP John [VP said [CP [TP x [TP he [T′ would [VP leave]]]]]]]]
Applying MaxElide here will derive sluicing and block VP-ellipsis, just as it does for wh-object questions, which also involve extraction from VP.
The account of (8)–(9) is more interesting. For (8), Hartman proposes analyzing the EC as in (12): the wh-adverbial is generated as an adjunct to TP, and so the binder of the trace of the raised subject demarcates a PD in which MaxElide can apply to derive VP-ellipsis (the smallest possible PD is underlined). The analysis for (9) (given in (13)) is broadly similar but with one crucial difference: the auxiliary moves from T to C and leaves a variable in TP that is rebound from C′. The interleaving of binding paths leads to a situation where the smallest PD is the one demarcated by the binder of the wh-trace. Applying MaxElide to this domain derives sluicing, as a result of which VP-ellipsis is blocked.
(12) [CP when λx [TP x [TP you λy [T′ will [VP y pay me back]]]]]
(13) [CP when λx [C′ will λy [TP x [TP she λz [T′ y [VP z resign]]]]]]
Importantly, all of the different trace types are implicated in (13): without representation of the A-trace, the VP would be a potential PD and applying MaxElide in this domain would incorrectly derive VP-ellipsis as an option for (9); without representation of the verb movement trace, (12) and (13) would be indistinguishable and again VP-ellipsis would incorrectly be predicted to be possible in (13), just as it is in (12).
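The way interleaved binding paths fix the smallest PD in (12) and (13) can be checked mechanically. The sketch below is an illustration only: it encodes the two schematics as nested tuples, treats `x`, `y`, and `z` as the only variables, and looks for the smallest constituent dominating VP that contains no free variable. The tuple encoding and the function names are assumptions of the sketch, not part of Hartman's (2011) proposal.

```python
VARIABLES = {"x", "y", "z"}


def free_vars(node):
    """Variables that occur in `node` without a binder inside `node`."""
    if isinstance(node, str):
        return {node} if node in VARIABLES else set()
    if node[0] == "λ":                     # ("λ", variable, body)
        _, var, body = node
        return free_vars(body) - {var}
    found = set()
    for child in node[1:]:                 # (label, child, child, ...)
        found |= free_vars(child)
    return found


def path_to(node, label, trail=()):
    """Chain of nodes from the root down to the first node labeled `label`."""
    if isinstance(node, str):
        return None
    trail = trail + (node,)
    if node[0] == label:
        return trail
    children = node[2:] if node[0] == "λ" else node[1:]
    for child in children:
        found = path_to(child, label, trail)
        if found:
            return found
    return None


def smallest_pd(root, label="VP"):
    """Smallest constituent dominating `label` that has no free variable."""
    for node in reversed(path_to(root, label)):
        if not free_vars(node):
            return node
    return None


# (12): embedded wh-adverbial question, no T-to-C movement.
LF_12 = ("CP", "when", ("λ", "x", ("TP", "x", ("TP", "you", ("λ", "y",
         ("T′", "will", ("VP", "y", "pay me back")))))))

# (13): matrix wh-adverbial question, with the trace of T-to-C movement in T.
LF_13 = ("CP", "when", ("λ", "x", ("C′", "will", ("λ", "y", ("TP", "x",
         ("TP", "she", ("λ", "z", ("T′", "y", ("VP", "z", "resign")))))))))

pd_12 = smallest_pd(LF_12)   # the λy constituent inside the lower TP
pd_13 = smallest_pd(LF_13)   # the λx constituent, just below "when"
```

In (12) the search stops at the binder of the subject trace, so only VP is available for deletion inside the PD. In (13) the variable left by the auxiliary keeps every smaller constituent open until the binder of the wh-trace is reached, which is the configuration in which MaxElide selects sluicing.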
2 Further Restrictions on Extraction from VP-Ellipsis
Hartman’s (2011) article closes by reflecting on a puzzle posed for the analysis by the role of intervening focus. Recall that VP-ellipsis was said to be grammatical in (6a–b) because the presence of focus in the IP domain ruled out sluicing as an ellipsis option, and so applying MaxElide to the PD (which encompassed the scope of the operator in Spec,CP) yielded deletion of VP, since this was the largest constituent that could be licitly deleted. Hartman’s account predicts that this ought to hold for all the other cases where sluicing would otherwise block VP-ellipsis. Hartman shows that this prediction is borne out with matrix wh-adverbial questions; this is illustrated in (14), where (14b) is an only slightly altered version of (9).
(14) a. Mary woke up at 7:00. When did JOHN?
     b. If Anna isn’t going to resign today, then when WILL she?
However, as Hartman himself notes, this account runs into trouble with matrix wh-object questions, 4 as these seem to show no amelioration effect with intervening focus, as (15a–b) illustrate (note that examples like these also disallow VP-ellipsis without intervening focus).
(15) a. Mary will kiss Bill. Who will JOHN *(kiss)?
     b. If you aren’t drinking water, then what ARE you *(drinking)?
Hartman then observes that matrix wh-adverbial questions show the same behavior when they are extracted from VP (as with (10a–b)): intervening focus in the IP domain seems not to save the VP-ellipsis option with the low construal, as shown in (16).
This seems to indicate that matrix extractions from VP cannot be ameliorated by intervening focus, contrary to what Hartman’s analysis of these data predicts.
Here we note four further problems facing Hartman’s MaxElide account. First, the problem with matrix wh-object questions seen in (15) has the character of a Parallelism violation, yet this is not expected on Hartman’s approach. To begin with, as (17a–b) show, VP-ellipsis is possible with matrix wh-object questions when parallel wh-movement and head movement take place in the AC.
(17) a. A: What’s he told you?
        B: What HASN’T he?5
     b. Who will Bill kiss, and who will JOHN?
This tells us that there is no fundamental incompatibility between matrix wh-object extraction and VP-ellipsis, and one interpretation of the facts is that the problem with (15a–b) is that Parallelism is violated, and that this is alleviated by overt parallel extraction in (17a–b). To see that this is a plausible analysis, consider the schematic in (18), which represents (15a); here we assume that focused DPs undergo Quantifier Raising (QR; Chomsky 1976, Krifka 2006), and we take it that the object John moves via QR to Spec,CP to take scope in parallel to the wh-phrase in the EC.
(18) AC: [CP John λx [C′ [TP Mary λz [T′ will [VP z kiss x]]]]]
     EC: [CP who λx [C′ will λy [TP Mary λz [T′ y [VP z kiss x]]]]]
Here, the putative PD in the EC contains a λ-operator that binds the variable left by T-to-C movement, but this is not matched by a similar binding relation in the AC; therefore, the AC and the EC are not strictly parallel with respect to the position of variables and their binders. But this is of no consequence for Hartman’s theory as formulated, since he adopts a semantic parallelism condition that would not distinguish the underlined constituents as required. Note, however, that we would be able to explain (15a–b) as Parallelism violations if we were to cast Parallelism as an LF isomorphism constraint, such as the one proposed by Griffiths and Lipták (2014).
One might propose to repair Hartman’s theory by bolstering it with an LF-isomorphism-based parallelism constraint. However, this would lead to the incorrect prediction that VP-ellipsis with a matrix wh-adverbial question would also be ruled out as a Parallelism violation. Consider again the schematic for (9), this time presented with its AC. Here, we represent the correlate of the wh-phrase in the AC as a covert indefinite roughly equivalent to ‘at some time’, which raises covertly to CP to take scope in parallel to the wh-phrase (see Chung, Ladusaw, and McCloskey 1995, Merchant 2001 for discussion of implicit indefinite correlates in sluicing).
(19) AC: [CP at-some-time λx [C′ [TP x [TP she λz [T′ will [VP z resign]]]]]]
     EC: [CP when λx [C′ will λy [TP x [TP she λz [T′ y [VP z resign]]]]]]
As in (18), here the underlined PD in the EC is not identical to that in the AC, since there is no T-to-C movement in the AC. If parallelism were indeed syntactic, and the schematic here correctly represented the LF structure for (9), then these examples would be predicted to be ungrammatical, contrary to fact. In what follows, we will argue that it’s the schematic in (19) that is incorrect, and that the syntactic parallelism constraint has an empirical advantage over the semantic one when it comes to explaining the restrictions on extraction from VP-ellipsis we have observed. That is, we will argue that it is indeed correct to analyze the matrix wh-object question examples as parallelism violations, and that the ungrammaticality of matrix wh-adverbial questions like (9) (where there is no intervening focus to save the VP-ellipsis option) is to be explained by different means. This takes us in the direction of a nonuniform analysis of restrictions on extraction from VP-ellipsis, with the cases where intervening focus has no effect on the grammaticality of VP-ellipsis having a parallelism-based analysis and the cases where intervening focus does have an effect being explained in another way. This would group Takahashi and Fox’s (2005) original cases in (6) with the matrix wh-adverbials in (9) and the cases in (17), while separating them from cases like those in (15)–(16). For terminological convenience, we will henceforth refer to the class of cases where intervening focus renders the VP-ellipsis option in a rebinding configuration grammatical as salvageable, picking up on the loose intuition that focus somehow “salvages” the VP-ellipsis option.6 We will therefore call the other class unsalvageable.
A second problem comes from considering the role of successive-cyclic movement. Hartman assumes (2011:374n11) that each step of successive-cyclic movement creates a new binder; moreover, he argues (p. 384) that the fact that there is no competition between high and low VP-ellipsis in raising clauses like (20) provides evidence for successive-cyclic A-movement through the embedded TP, as this would ensure that there is a PD in the infinitival complement, as in (21).
(20) John is likely to attend, and Bill is (likely to), also.
(21) Bill λx is likely [TP x λx′ to [VP x′ attend]]
While this account works for A-movement, it runs into a number of problems when it is applied to Ā-movement (which has been the primary source of evidence for successive cyclicity in the literature to date). For instance, if we assume that long-distance wh-movement passes through vP and CP (Chomsky 1986, 2000, Fox 1999b, Van Urk and Richards 2015), then we predict that VP-ellipsis ought to be possible in an embedded clause whenever there is extraction from that clause, since the λ-operator introduced by successive-cyclic movement through the embedded CP would create a PD in which the application of MaxElide would derive VP-ellipsis. (22) shows that the prediction is incorrect for object extractions, and (23) provides an illustrative LF structure, where the PD in the embedded clause is underlined; since sluicing is not an option in the intermediate position, as shown in (24), applying MaxElide in this PD would incorrectly derive VP-ellipsis.
(22) *John said you spoke to someone, but I don’t know who he said you did.
(23) [CP wh λx . . . [CP x λx′ . . . [VP V x′]]]
(24) *John said you spoke to someone, but I don’t know who he said [CP t [you spoke to t]].
Interestingly, the same effect can be seen with embedded subject extractions, as in (25). This is doubly surprising for the MaxElide approach, since not only does wh-movement through the embedded Spec,CP create a PD, A-movement of the subject to the embedded Spec,TP ought to do so as well, as (26) illustrates.
(25) *John thinks one of the teachers is leaving, but I don’t know which one he thinks is.
(26) [CP which one λx he thinks [CP x λx′ [TP x′ λx″ is [VP x″ leaving]]]]
This indicates that sluicing blocks VP-ellipsis in a wider set of situations than can be defined in terms of PDs as in the MaxElide approach.
Related to this, a third problem is that various kinds of nonlocal extractions from VP-ellipsis are unsalvageable. First, consider extraction from embedded clauses. Lasnik and Park (2013:240) observe that VP-ellipsis is not possible when long-distance extraction takes place from a clause contained within the ellipsis site, as shown by (27a); note that this is superficially similar to the regular object extraction cases like (6a–b), where intervening focus has a salvaging effect. (28) shows that such extractions from VP-ellipsis are in fact possible when there is overt parallel extraction from the AC.
(27) a. *Abby said they heard about a Balkan language, but I don’t know what kind of language BEN did.
     b. *John thinks you should kiss SARAH, but I don’t know who BILL does.
(28) I know who JOHN thinks you should kiss, but I don’t know who BILL does.
(29) shows that nonparallel extraction of wh-adverbials from embedded clauses is unsalvageable as well, and (30) again shows that overt parallel extraction cases are different. (As expected, the matrix readings for the adjuncts are still available here.)
The generalization here seems to be that nonparallel extraction from VP-ellipsis is only salvageable if it does not cross a clause boundary. This is not quite right, as nonparallel extraction is salvageable from at least some nonfinite clausal complements, such as control complements.
(31) ?John WILL try to kiss MARY, but I don’t know who he WON’T.
Thus, the generalization may seem to be that nonparallel extraction from a finite clause is unsalvageable.
Interestingly, though, nonparallel extraction from VP-ellipsis is subject to a number of other restrictions that do not involve finite clause boundaries. For instance, Lasnik and Park (2013) observe that subextraction from a DP in the object position in VP-ellipsis is unsalvageable, even when the object is contained in the highest VP in the ellipsis site, as in (32); (33) again shows that overt parallel extraction is different.
(32) *ABBY heard a lecture about a Balkan language, but I don’t know what kind of language BEN did.
(33) What did ABBY hear a lecture about, and what did BEN?
Similarly, nonparallel extraction from VP-ellipsis seems to be unsalvageable with certain kinds of wh-phrases, such as degree wh-phrases like how upset. Once more, extraction from VP-ellipsis is much better if there is overt parallel extraction in the AC (cf. Baltin 2011).
(34) *John became very upset, but I don’t know how upset BILL did.
(35) ?I know how upset JOHN became, but I don’t know how upset MARY did.
Thus, salvageable nonparallel extraction from VP-ellipsis seems to be restricted to a small set of local DP-extractions. This does not follow from the MaxElide account as things stand, and it indicates that we need to rethink exactly what the key factors are that make extraction from VP-ellipsis so limited.
Finally, there are also cases not involving Ā-extraction where the MaxElide theory incorrectly predicts that ellipsis of a larger category should block ellipsis of a smaller one. In particular, there are the simple cases where VP-ellipsis can optionally include nonfinite auxiliaries.
(36) a. John has been singing, and Mary has (been), too.
     b. John shouldn’t be drinking, and Mary shouldn’t (be), either.
In both of these cases, the MaxElide-based account incorrectly predicts that ellipsis of the larger constituent containing nonfinite be should block ellipsis of the smaller constituent. To see why, consider the following simplified schematic of the ellipsis clause in (36a) (ignoring movement of auxiliaries, which is immaterial here). As before, the smallest possible PD is underlined; applying MaxElide to this domain would derive VP-ellipsis of everything up to and including nonfinite been, with the option of retaining been being blocked.
(37) [TP Mary λx [T′ has [vP been [VP x singing]]]]
One might expect that this problem can be accounted for by breaking down the A-chain formed by movement of the subject into a number of short intermediate steps, in the spirit of Hartman’s (2011) account of the optionality of VP-ellipsis with raising structures like (20). Thus, one might propose that the correct structure is not (37) but (38), where A-movement passes through all the projections in the inflectional layer, including some other projection below been—say, VoiceP; applying MaxElide to the underlined PD would derive the option for smaller ellipsis, as desired.
(38) [TP Mary λx [T′ has [vP x λx′ [v′ been [VoiceP x′ λx″ [VP x″ singing]]]]]]
While this proposal would work for these cases, it would undermine the analysis of the crucial wh-adverbial data in (9), as schematized in (12). That is, allowing adjunction to intermediate projections would mean that it ought to be possible to analyze (12) as (39), and this would make the underlined portion a potential PD, incorrectly predicting VP-ellipsis to be an option in (9).7
(39) [CP when λx [C′ will λy [TP x [TP you λz [T′ y [VoiceP z λz′ [VP z′ pay me back]]]]]]]
Thus, it is not possible to provide both sets of facts with a unified analysis on this account.
As we see it, the crux of the matter with (36) is the proposal that A-traces count for calculating MaxElide, as this leads us to expect a far greater number of rebinding configurations. If we remove this component of the analysis, the data in (36) no longer present an immediate problem, although then we are left without an account of the matrix wh-adverbial cases in (9). As we noted earlier, however, the analysis of (9) needs to be rethought anyway, since it seems to involve a systematic violation of Parallelism, and the fact that VP-ellipsis is in fact salvageable in related examples like (14) tells us that this is a problematic analysis; that is, a structure that violates Parallelism ought not to be salvageable, since Parallelism is an inviolable constraint. If A-traces did not count for calculating parallelism, then this problem would go away, because then the projections below TP would form a PD and thus the lack of parallelism in the CP domain would be irrelevant in cases like (14), where what is elided is just the VP, as shown by the revised schematic in (40). That is, the elided VP would be parallel to its antecedent, and the fact that the two structures differ in the higher domain would not matter.
(40) [CP when λx [C′ WILL λy [TP x [TP you [T′ y [VP pay me back]]]]]]
While this might be a step in the right direction, the difference between matrix and embedded wh-adverbial questions would still be mysterious. For those cases, it would seem to be T-to-C movement that is crucial in conditioning whether or not sluicing blocks VP-ellipsis, but this does not follow from the MaxElide account once A-movement is taken out of the picture. The question, then, is how to account for this effect of T-to-C movement on ellipsis options without A-traces.
To summarize, we have shown that the MaxElide account faces a number of technical and empirical problems. Two important empirical results have emerged from this discussion. The first result is that nonparallel extraction from VP-ellipsis is highly restricted, with many cases of VP-ellipsis not being salvaged when intervening focus rules out the sluicing option, contrary to the predictions of MaxElide. This indicates that a number of the effects normally attributed to MaxElide may in fact be better analyzed as involving some other hard constraint on ellipsis. The fact that these extractions from VP-ellipsis are possible with overt parallel extraction indicates that the relevant hard constraint may be Parallelism. The second result is that allowing A-traces to count for calculating MaxElide runs into trouble with simple cases of VP-ellipsis; removing A-movement from the picture removes this problem, and while it undermines aspects of the analysis of matrix wh-adverbials, these are ripe for an alternative analysis anyway, since the previous account has its own problems.
3 An Alternative Account of MaxElide Effects
As established in the previous sections, MaxElide faces many problems; in many cases, extraction from VP-ellipsis is restricted much more than MaxElide would predict. In this section, we put forth a new analysis according to which the observed restrictions on extraction from VP-ellipsis have different sources—an analysis that can handle all of the problematic data.
Our analysis divides the data into two groups: the salvageable cases, where focusing the subject or an auxiliary saves the VP-ellipsis options, and the unsalvageable cases, where focus has no saving effect (while overt parallel extraction is still possible). For the unsalvageable cases, we provide an analysis in terms of Parallelism, which we take to be a hard constraint on ellipsis. An account of this kind was already offered in section 2, where we pointed out that the ungrammaticality of matrix wh-object questions like those in (15) follows from the fact that the AC and the EC mismatch with respect to the position of binders of head movement. Here, we show that mismatches with respect to movement paths created by Ā-movement also lead to parallelism failures. Once this large class of cases is taken out of the picture, only a small set of salvageable extractions from VP-ellipsis remains: local object extractions like those in (6) and matrix wh-adverbial questions like (14a–b). We analyze these in terms of generalized derivational economy. We argue that ellipsis bleeds certain movement steps, with the result that these derivations are shorter than competing derivations; this results in a preference for some elliptical derivations over others, and this preference interacts with hard constraints to derive the core MaxElide effects. The resulting analysis has no need for the ellipsis-specific constraint of MaxElide. In this analysis, moreover, only Ā-movement and head movement count for calculating parallelism.
3.1 Parallelism and Extraction from VP
In section 2, we proposed a solution to Hartman’s (2011) puzzle concerning matrix wh-object extraction out of VP-ellipsis. Hartman admits that the unsalvageability of this class of extraction under his MaxElide analysis is unexpected; however, we showed that extraction is possible when overt parallel wh-movement and head movement take place in the AC. This led us to conjecture that what is blocking VP-ellipsis with matrix object extraction is the inviolable constraint on ellipsis, Parallelism. In this section, we present a similar analysis for other cases of unsalvageable VP-ellipsis extractions; but before doing so, we must address the notion of parallelism that we adopt for this analysis to go through. As an anonymous reviewer points out, the required notion of parallelism must take into account the LF structures of the AC and the EC, not just their denotations. Parallelism as defined in section 1 does not do this. There are, however, a number of definitions of parallelism that do take structure into account (see, e.g., Fiengo and May 1994, Fox 1999a, 2000, Fox and Lasnik 2003, Thoms to appear b). For concreteness, we adopt Griffiths and Lipták’s (2014:210) definition of parallelism, stated in (41).8
(41) Scopal parallelism in ellipsis
     Variables in the antecedent and elided clause must be bound from parallel positions.
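To see how a structure-sensitive condition like (41) separates (15a) from (17b), consider the following sketch. It is only an illustration under simplifying assumptions: each LF is reduced to a set of (movement type, binder position, variable position) triples read off schematics like (18), position labels are informal strings, and A-movement is ignored, in line with the claim that only Ā-movement and head movement count for calculating parallelism. None of the names below come from Griffiths and Lipták (2014).

```python
def counted_bindings(lf, ignored=("A",)):
    """Binder-variable pairs that matter for parallelism: every dependency
    except those created by the ignored movement types."""
    return {(binder, var) for kind, binder, var in lf if kind not in ignored}


def scopally_parallel(ac, ec, ignored=("A",)):
    """Toy version of (41): the antecedent and the elided clause must show
    the same binder-variable relations, stated over positions."""
    return counted_bindings(ac, ignored) == counted_bindings(ec, ignored)


# EC shared by (15a) and (17b): "who will JOHN <kiss t>", as in (18).
EC_MATRIX_WH = {
    ("Abar", "Spec,CP", "object"),      # who ... x
    ("head", "C", "T"),                 # will ... y (T-to-C movement)
    ("A", "Spec,TP", "Spec,VP"),        # subject ... z
}

# AC of (15a): "Mary will kiss Bill", with QR of the correlate to Spec,CP
# and no T-to-C movement.
AC_15A = {
    ("Abar", "Spec,CP", "object"),
    ("A", "Spec,TP", "Spec,VP"),
}

# AC of (17b): "Who will Bill kiss", with overt wh-movement and T-to-C.
AC_17B = {
    ("Abar", "Spec,CP", "object"),
    ("head", "C", "T"),
    ("A", "Spec,TP", "Spec,VP"),
}

ok_15a = scopally_parallel(AC_15A, EC_MATRIX_WH)
ok_17b = scopally_parallel(AC_17B, EC_MATRIX_WH)
```

On this encoding the unmatched head-movement dependency in the EC makes (15a) a Parallelism violation, while overt parallel extraction and inversion in the AC of (17b) restore the match.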
With this definition of parallelism, we can now account for all of the unsalvageable extractions from VP-ellipsis.
Recall from section 2 that extraction from VP-ellipsis is highly restricted. Lasnik and Park (2013) note that long-distance object extraction from a finite clause is impossible even with intervening focus, contra the prediction of MaxElide. The relevant example, (27a), is repeated here.
(42) *Abby said they heard about a Balkan language, but I don’t know what kind of language BEN did.
We noted previously that the same restriction does not hold for control clauses. Long-distance extraction out of VP-ellipsis with a control complement is possible, as shown in (31), repeated here.
(43) ?John WILL try to kiss MARY, but I don’t know who he WON’T.
This asymmetry between finite and control clauses does not follow from applying the MaxElide constraint as things stand: VP is a possible target for deletion in both cases, and so MaxElide would predict both (42) and (43) to be grammatical. However, we can show that these restrictions follow from the parallelism requirement on ellipsis, independent of the application of MaxElide or any other such constraint. We propose that in most cases, the crucial factor is the form and position of the XP corresponding to the extracted wh-phrase, which we call the correlate. This proposal is broadly in line with the analysis in Thoms to appear b, where similar logic is applied to analyzing very similar restrictions on pseudogapping (see also Griffiths and Lipták 2014).
Recall from the previous sections that parallelism requires that the binding relations found at LF in the EC must match those found in the AC. Assuming successive-cyclic wh-movement through vP and CP, the binding relations for the EC in (43) would be those in (44).
(44) [CP who λx [TP he [T′ won’t [vP x λx′ [v′ try [TP PRO [T′ to [vP x′ λx″ [v′ kiss x″]]]]]]]]]
On the assumption mentioned above that focused DPs undergo QR, Parallelism is satisfied in this case by QR of the focused DP Mary in the AC. The LF structure for the AC is given in (45).
(45) [CP MARY λx [TP John [T′ will [vP x λx′ [v′ try [TP PRO [T′ to [vP x′ λx″ [v′ kiss x″]]]]]]]]]
(46) [CP what kind of language λx [TP BEN [T′ did [vP x λx′ [v′ say [CP x′ λx″ [TP they [T′ T [vP x″ λx′″ [v′ heard about x′″]]]]]]]]]]
In order for Parallelism to be satisfied, matching binding relations must be created at LF for the AC. However, these binding relations cannot be created via QR of the correlate DP in the AC, because finite clauses are typically barriers to QR (May 1985, Fox 2000, Johnson 2000).9 The inability of QR to cross a finite clause boundary is demonstrated by the fact that inverse scope cannot be obtained between the matrix subject and the embedded object in (47a). Compare this with the control infinitive example (47b), which allows for the inverse scope reading, indicating that QR is possible out of the infinitive.
Since the correlate in (42) cannot escape the finite clause by QR, it cannot create binding relations parallel to those in the EC; as a result, Parallelism is violated when the EC is elided and hence (42) is ungrammatical. The same explanation applies to the case of wh-adverbial extraction in (29) as well, where the only difference is that the correlate is the adjunct at noon rather than an object. In all cases, we correctly predict that intervening focus has no effect on the ungrammaticality of VP-ellipsis, since they all involve violations of Parallelism and this is an inviolable constraint. 10
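The contrast between control and finite complements can be restated procedurally. The sketch below is a rough illustration, assuming that wh-movement stops in every vP and CP on its path and that QR of the correlate can use the same landing sites but cannot leave a finite CP. Whether QR may target the edge of the embedded finite CP is left open here; all that matters is that it cannot go further. The path encoding and function names are assumptions of the sketch.

```python
LANDING = ("vP", "CP")  # assumed intermediate landing sites


def wh_sites(path):
    """Landing sites of successive-cyclic wh-movement along `path`,
    a bottom-up list of (label, is_finite_clause_edge) pairs."""
    return [label for label, _finite in path if label.startswith(LANDING)]


def qr_sites(path):
    """Landing sites available to QR of the correlate: the same sites,
    but QR cannot move on once it has reached a finite CP."""
    sites = []
    for label, finite in path:
        if label.startswith(LANDING):
            sites.append(label)
        if label.startswith("CP") and finite:
            break
    return sites


def qr_can_match(path):
    """Can QR in the AC recreate the binding chain of wh-movement in the EC?"""
    return qr_sites(path) == wh_sites(path)


# (43)/(44): extraction out of a control infinitive, no embedded CP.
CONTROL_PATH = [("vP_emb", False), ("TP_emb", False),
                ("vP_matrix", False), ("TP_matrix", False),
                ("CP_matrix", True)]

# (42)/(46): extraction out of a finite complement clause.
FINITE_PATH = [("vP_emb", False), ("TP_emb", False), ("CP_emb", True),
               ("vP_matrix", False), ("TP_matrix", False),
               ("CP_matrix", True)]

control_ok = qr_can_match(CONTROL_PATH)
finite_ok = qr_can_match(FINITE_PATH)
```

With these assumptions QR reproduces the full chain in the control case but stalls at the embedded clause edge in the finite case, so the binding relations of the AC and the EC cannot be matched there, whatever the distribution of focus.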
(48) Abby said they heard about a Balkan language, but I don’t know what KIND of Balkan language.
The problem is this: if the correlate must undergo QR out of the finite clause to create parallel binding relations with extraction from VP-ellipsis, then applying the same logic to sluicing ought to rule out (48) as a Parallelism violation as well, since the correlate will be just as clause-bound in the antecedent to sluicing as it is in the antecedent to VP-ellipsis. Indeed, as Fox and Lasnik (2003) note, the parallelism problem goes beyond sluicing with contrastively focused correlates: even in simple cases of sluicing like (49) where the correlate is a wide-scoping indefinite, the punctuated path of movement created by wh-movement will create binding relations in the EC that will not be identical to those created by the indefinite, which Fox and Lasnik (2003) and others assume takes wide scope by virtue of an in-situ scoping mechanism like choice functions (Reinhart 1997). This is shown in the diagrams in (50) (slightly modified from Fox and Lasnik 2003:149–150), where the lack of parallelism is laid bare.
(49) Fred said that I talked to a certain girl, but I don’t know which girl.
(50) AC: ∃ λf′ [Fred [said [that I [talked to f′(girl)]]]]
EC: which g girl λg′ [Fred [g′ λg″ said [g″ λg′″ that I [g′″ λg″″ talked to g″″(girl)]]]]
Fox and Lasnik (2003) propose that this problem disappears if we assume that the wh-movement in sluicing can proceed in one fell swoop from its base position to the landing site in Spec,CP, with no stop-offs at intermediate landing sites like Spec,vP or the embedded Spec,CP. On this analysis, the in-situ scoping correlate and the wh-operator create binding configurations that indeed respect Parallelism, as (51) shows.
(51) AC: ∃ λf′ [Fred [said [that I [talked to f′(girl)]]]]
EC: which g girl λg′ [Fred [said [that I [talked to g′(girl)]]]]
Fox and Lasnik propose that the claim that wh-movement in sluicing can proceed in one fell swoop is justified by the well-known difference between sluicing and VP-ellipsis with respect to island amelioration (Ross 1969, Merchant 2008),11 and Fox and Pesetsky (2005) make the same claim in the context of discussing the interaction of cyclic Spell-Out and linearization. We therefore follow Fox and Lasnik (2003) and Fox and Pesetsky (2005) in assuming that this is possible here. We will show in section 3.2 that this assumption plays an important role in our account of the remaining MaxElide effects, so insofar as the analysis holds together, it provides further support for this claim about wh-movement in sluicing.
Along with the assumption that contrastive foci are like indefinites and wh-in-situ in being able to take scope by in-situ mechanisms as well as by QR (Wold 1996, Reich 2004, Krifka 2006), this analysis allows us to account for the difference between VP-ellipsis and sluicing with respect to extraction from finite clauses. With nonparallel extraction from VP-ellipsis, the path of wh-movement from the EC is punctuated (just as it is without ellipsis), stopping off at intermediate adjunction positions like Spec,vP. In order for Parallelism to be satisfied, then, there must be parallel binding relations in the AC, and the only way for these relations to arise is for the correlate to undergo QR; it is not sufficient for the correlate to take scope by in-situ mechanisms in this case, because if it does there will be a mismatch of the kind seen in (50). As a consequence, nonparallel extraction from VP-ellipsis is tied to QR: extraction from VP-ellipsis will only be possible in those situations where the correlate can be extracted from the antecedent VP by QR. Note, however, that no such restrictions are expected to hold of overt parallel extraction from VP-ellipsis (that is, those cases where there is overt parallel wh-extraction from the antecedent as well), since the parallel extraction would of course ensure that Parallelism is satisfied: the wh-phrases in the AC and the EC will stop off at the same landing sites and will therefore create fully parallel movement paths. And of course no such restriction holds of sluicing either, since the wh-movement in sluicing may proceed in one fell swoop to the final Spec,CP, and so the correlate can take wide scope by in-situ mechanisms without violating Parallelism.
The claim that nonparallel extraction from VP-ellipsis is tied to QR makes two additional predictions. One prediction concerns finite clausal complements: if it were possible for an argument to take scope outside the embedded finite clause, then long extraction would also be available for that argument, since matching binding relations could be established in the AC by QR of the correlate. Kayne (1998) observes that when the matrix subject binds the embedded subject of an embedded finite clause, the embedded object can take scope outside the finite clause as shown in (52).12
Our account predicts that nonparallel extraction from VP-ellipsis should be possible in these configurations. (53) shows that this prediction is borne out: (53) is grammatical with extraction from VP-ellipsis, in striking contrast to the ungrammatical long extraction in (42).
(53) John_i said he_i kissed MARY but I don’t know who BILL_k did ⟨say he_k kissed t⟩.
The second prediction concerns extraction out of control complements. Recall from (47b) that QR is normally possible out of control infinitives; however, as Susi Wurmbrand notes (pers. comm., attributing the observation to Benjamin Bruening), QR appears to be blocked when the infinitive is extraposed, as demonstrated by the lack of inverse scope in (54) (cf. (47b), where inverse scope is available).
Since QR out of the complement is impossible in such configurations, we predict that nonparallel extraction will also be impossible. This prediction, too, is borne out. It is precisely in these cases that extraction from control complements of VP-ellipsis is substantially degraded, as shown in (55).
(55) *Mary tried, over the summer, to read MOBY-DICK, but I don’t know what BILL did ⟨try, over the summer, to read t⟩.
The above data suggest a strong correlation between the correlate’s ability to undergo QR and the ability to extract from VP-ellipsis, one that follows straightforwardly from parallelism with no appeal to MaxElide.
The parallelism-based analysis extends to the other cases in section 2 where asymmetric extraction from VP-ellipsis was impossible regardless of intervening focus. Recall that extraction out of a DP complement in VP-ellipsis is ungrammatical, as shown by (32), repeated here.
(56) *ABBY heard a lecture about a Balkan language, but I don’t know what kind of language BEN did.
Once more, the answer lies in considering the scopal properties of the correlate: in order to create parallel binding relations in the AC, it must move out of the DP via QR, but this is not possible because DP is a scope island (Larson 1985, May 1985, Charlow 2010; cf. Sauerland 2005) just as finite CPs are. Since the correlate cannot move out of the DP via QR, it is impossible to create the necessary binding relations in the AC, and so parallelism fails. A slightly different analysis holds for the case where the wh-phrase is a degree phrase like how upset. Recall from (34), repeated here, that such degree phrases cannot be extracted from VP-ellipsis even if the extraction is local.
(57) *John became very upset, but I don’t know how upset BILL did.
We propose that parallelism fails here because the correlate is a predicate, and predicates are nonquantificational and hence unable to undergo QR. As before, this means that the extraction in the EC is not matched by covert Ā-extraction in the AC, and so Parallelism is violated. No amount of intervening focus can salvage the VP-ellipsis option here, although the example is correctly predicted to be grammatical if parallel extraction takes place in the AC (as in (35)).13
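The correlation between QR and nonparallel extraction from VP-ellipsis can be restated as a small decision procedure. The following Python sketch is purely illustrative and is not part of the analysis itself: the boundary labels (`finite_CP`, `DP`, and so on) and the function names are hypothetical, and the set of QR barriers simply restates the assumptions made in the text.

```python
# Boundary types that the text treats as blocking QR of the correlate.
# The labels are hypothetical names chosen for this sketch.
QR_BARRIERS = {"finite_CP", "DP", "extraposed_infinitive"}


def correlate_can_qr(boundaries, quantificational=True):
    """True if the correlate can leave the antecedent VP by QR.

    `boundaries` lists the constituents the correlate would have to cross.
    Nonquantificational correlates (e.g., predicates) cannot undergo QR at all.
    """
    if not quantificational:
        return False
    return not any(b in QR_BARRIERS for b in boundaries)


def nonparallel_vpe_extraction_ok(boundaries, quantificational=True):
    """Nonparallel extraction from VP-ellipsis is tied to QR of the correlate."""
    return correlate_can_qr(boundaries, quantificational)


# Example configurations discussed in the text.
examples = {
    "(43) control infinitive": (["control_infinitive"], True),
    "(42) finite complement": (["finite_CP"], True),
    "(53) finite complement, bound embedded subject": (["finite_CP_bound_subject"], True),
    "(55) extraposed infinitive": (["extraposed_infinitive"], True),
    "(32) DP complement": (["DP"], True),
    "(57) predicate correlate": ([], False),
}

for label, (boundaries, quantificational) in examples.items():
    ok = nonparallel_vpe_extraction_ok(boundaries, quantificational)
    print(("  " if ok else "* ") + label)
```

The sketch deliberately says nothing about sluicing or about overt parallel extraction, since on the present account neither depends on QR of the correlate.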
(15) a. Mary will kiss Bill. Who will JOHN *(kiss)?
Going by the discussion at the beginning of section 2, we may assume that (15a–b) follow straightforwardly from parallelism, since there is no parallel head movement in the AC and the EC. This would indeed follow if we took the correct LF structure for the EC in (15a) to be as in (58): extraction of the object makes the VP a rebinding configuration, and since the binder of the variable in the object position is in Spec,CP, then it is this whole domain that needs to be taken into account in calculating parallelism. And since there is no parallel head movement in the AC, Parallelism will not be satisfied, even if the object undergoes QR in parallel to Spec,CP. Note that traces of A-movement need not be taken into account here.
(58) [CP who λx [C′ will λy [TP John [T′ y [VP kiss x]]]]]
However, this analysis begins to come apart at the seams once we assume, as we have done above, that wh-movement is successive-cyclic, stopping off at intermediate vP and CP projections on the way to the final scope position. If we add this assumption to our schematic, and assume as well (as Hartman (2011) does) that each intermediate step of movement creates a separate variable binding configuration, then the projection “closed off” by the binder left in the intermediate landing site, λx′ in (59), ought to create a PD in which ellipsis would apply to derive VP-ellipsis. This is not what we want, since VP-ellipsis is never possible in these configurations.
(59) [CP who λx [C′ will λy [TP John [T′ y [vP x λx′ [VP kiss x′]]]]]]
Note that reintroducing A-traces into the LF structures is not the way to go: while this would give us an account of matrix wh-objects, recall that it would lead us to expect the same behavior from matrix wh-adverbials, in that the latter would also be incorrectly predicted to be unsalvageable. Finally, recall that it does indeed seem to be T-to-C movement that is implicated in making (15a) unsalvageable, since VP-ellipsis can be salvaged with embedded wh-objects, and indeed with parallel T-to-C extraction alongside wh-extraction (see (17)). Thus, what we need is an analysis in which any extraction from VP “catches” the binding path left by T-to-C movement, resulting in an unsalvageable Parallelism violation with nonparallel extractions, while extraction of TP adjuncts in matrix wh-adverbial questions does not.
(60) [CP who λx [C′ will λy [TP John [T′ y λy′ [vP x λx′ [v′ y′ [VP kiss x′]]]]]]]
Here, the trace of v-to-T movement ensures that the vP is no longer a PD, since it contains a trace that is rebound from T, and the result is that the smallest possible PD is the one created by the second step of successive-cyclic wh-movement. But since this PD contains within it the path of T-to-C movement, the result is that the AC must also contain T-to-C movement in order for Parallelism to be satisfied. Thus, we correctly predict that matrix extraction of a wh-object from VP-ellipsis where the antecedent does not also involve inversion, as in (15), will always involve a Parallelism violation, since the binders created by the moved object and the moved auxiliary will always overlap and hence “extend” the smallest putative PD up to CP, where nonparallelism with respect to T-to-C movement is found. Crucially, this does not upset our analysis of matrix wh-adverbials or wh-subjects, since the paths of wh-movement and v-to-T movement will not intersect to extend the smallest possible PD all the way to the left periphery in the same way. To see this, consider the revised LF structure for the matrix wh-adverbial question in (14b), repeated here: the domain formed by v-to-T movement is a PD containing no rebound variables, and since this excludes the trace left by T-to-C movement, we do not expect nonparallelism in this domain to lead to an unsalvageable Parallelism violation.
(14) b. If Anna isn’t going to resign today, then when WILL she?
(61) [CP when λx [C′ will λy [TP x [TP she [T′ y λy′ [vP [v′ y′ [VP resign]]]]]]]]
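The reasoning about (59)–(61) can be checked mechanically. The Python sketch below is an illustration under simplifying assumptions, not a definitive implementation: it encodes the three LFs as small trees, treats only constituents closed off by a λ-binder as candidate PDs, and returns the smallest such constituent that dominates the elided VP and contains no variable bound from outside. The class and function names are hypothetical, and `x1` and `y1` stand in for x′ and y′.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


@dataclass(frozen=True)
class Phrase:
    label: str
    children: tuple


def P(label, *children):
    return Phrase(label, children)


def free_vars(node):
    """Variables inside `node` that are bound from outside it."""
    if isinstance(node, Var):
        return {node.name}
    if isinstance(node, Lam):
        return free_vars(node.body) - {node.var}
    if isinstance(node, Phrase):
        out = set()
        for child in node.children:
            out |= free_vars(child)
        return out
    return set()  # plain strings are lexical items


def binders(node):
    """Names of all lambda-binders contained in `node`."""
    if isinstance(node, Lam):
        return {node.var} | binders(node.body)
    if isinstance(node, Phrase):
        out = set()
        for child in node.children:
            out |= binders(child)
        return out
    return set()


def path_to(node, label):
    """Nodes from `node` down to the first Phrase labelled `label`, or None."""
    if isinstance(node, Phrase):
        if node.label == label:
            return [node]
        for child in node.children:
            sub = path_to(child, label)
            if sub is not None:
                return [node] + sub
    elif isinstance(node, Lam):
        sub = path_to(node.body, label)
        if sub is not None:
            return [node] + sub
    return None


def smallest_pd(root, ellipsis_label="VP"):
    """Smallest binder-closed constituent dominating the ellipsis site."""
    path = path_to(root, ellipsis_label) or []
    for node in reversed(path):
        if isinstance(node, Lam) and not free_vars(node):
            return node
    return None


# (59): successive-cyclic wh-movement, no trace of v-to-T movement
vp_59 = P("VP", "kiss", Var("x1"))
little_vp_59 = P("vP", Var("x"), Lam("x1", vp_59))
tp_59 = P("TP", "John", P("T'", Var("y"), little_vp_59))
lf59 = P("CP", "who", Lam("x", P("C'", "will", Lam("y", tp_59))))

# (60): as (59), plus v-to-T movement of the auxiliary (binder y1)
vp_60 = P("VP", "kiss", Var("x1"))
little_vp_60 = P("vP", Var("x"), Lam("x1", P("v'", Var("y1"), vp_60)))
tp_60 = P("TP", "John", P("T'", Var("y"), Lam("y1", little_vp_60)))
lf60 = P("CP", "who", Lam("x", P("C'", "will", Lam("y", tp_60))))

# (61): matrix wh-adverbial question, adverbial trace adjoined to TP
little_vp_61 = P("vP", P("v'", Var("y1"), P("VP", "resign")))
tp_61 = P("TP", Var("x"), P("TP", "she", P("T'", Var("y"), Lam("y1", little_vp_61))))
lf61 = P("CP", "when", Lam("x", P("C'", "will", Lam("y", tp_61))))

for name, lf in [("(59)", lf59), ("(60)", lf60), ("(61)", lf61)]:
    pd = smallest_pd(lf)
    # "y" is the binder created by T-to-C movement in all three LFs.
    print(name, "smallest PD closed off by binder", pd.var,
          "| contains T-to-C path:", "y" in binders(pd))
```

In this encoding, only (60) yields a smallest PD that contains the binder created by T-to-C movement, which is the configuration in which the AC would also need T-to-C movement for Parallelism to be satisfied.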
Thus, we neatly capture the distinction between VP-extractions on the one hand and IP domain extractions on the other. The only issue is that this solution comes at the expense of assuming that all auxiliaries, including the modals, do, and be/have, are heads of vP projections and that they uniformly move to T rather than being base-generated there. However, proposals have been made that support this view: Embick and Noyer (2001), Bjorkman (2011), and Thoms (to appear a) propose that do is a spell-out of v when it has moved to T, while Iatridou and Zeijlstra (2013) propose that the scopal interactions of modals and negation indicate that these, too, must be base-generated in a lower position and then moved to T. We therefore take this assumption to be well-supported and submit that this analysis of wh-object extraction, insofar as it is successful, can be taken to support this view of auxiliaries in English.
3.2 Ellipsis and Derivational Economy
To account for the remaining cases, we follow the core intuition of Merchant’s (2008) original proposal by offering an account in terms of derivational economy. However, we depart from Merchant’s proposal and subsequent ones in rejecting the ellipsis-specific economy constraint MaxElide, which states a preference for ellipsis of larger constituents over smaller ones, as we have shown that it faces several empirical problems that make it very difficult to state its domain of application in a way that doesn’t enforce larger ellipsis domains at all times. Rather, we propose that ellipsis processes and movement interact in such a way that derivations involving certain ellipsis processes can require fewer movement steps, with the result that the shorter derivation is preferred to the longer one, making the latter degraded (Chomsky 1991, 1993, Epstein 1992, Kitahara 1997).
In the context of the present analysis, this proposal is most straightforward in the case of local object extractions like (1), repeated here.
(1) Mary was kissing someone, but I don’t know who (*she was).
Recall from section 3.1 that we assume, following Fox and Lasnik (2003), that wh-movement can move in one fell swoop just in case sluicing applies; or, to put it another way, sluicing bleeds successive-cyclic movement, a fact that would follow from an approach whereby the need of moved phrases to pass through certain projections is phonological in nature (as in Fox and Pesetsky 2005; see also Bošković 2007). In the case of object extraction, this means that a derivation that involves sluicing will be more economical than one that involves VP-ellipsis, since the latter will require two steps of Ā-movement but the former will require only one, as the schematics in (62) show.14
We assume that VP-ellipsis and sluicing are competing derivations, having the same information structure properties and differing only in the size of the ellipsis site. Competition only applies when the two ellipsis options are both contained within the same PD, that is, if the target for VP-ellipsis is contained within the target for sluicing and it is fully parallel to the AC. If the two compete, it follows that the sluicing derivation will block the VP-ellipsis one here because it is more economical, requiring one fewer step of Ā-movement. Crucially, if we rule out sluicing by placing focus in the IP domain, hence making the information structure properties of sluicing and VP-ellipsis distinct, then sluicing will not compete and VP-ellipsis will be possible; hence, we correctly predict that intervening focus will save the VP-ellipsis option with embedded wh-objects and related extractions from VP.15
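The competition just described can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions rather than a formalization of the account: the `Derivation` record, its field names, and the information-structure labels are hypothetical, and the step counts are simply those given in the text for local object extraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    name: str
    a_bar_steps: int           # number of steps of A-bar movement
    info_structure: str        # hypothetical label for the focus profile
    in_shared_pd: bool = True  # both ellipsis targets lie within one PD


def competes(a, b):
    """Two ellipsis derivations compete only if they share a PD
    and have the same information structure properties."""
    return (a.in_shared_pd and b.in_shared_pd
            and a.info_structure == b.info_structure)


def blocked_by(candidate, rival):
    """`candidate` is blocked if a competing rival needs strictly fewer steps."""
    return competes(candidate, rival) and rival.a_bar_steps < candidate.a_bar_steps


# Local object extraction without intervening focus, as in (1).
sluice = Derivation("sluicing", a_bar_steps=1, info_structure="no focus in IP")
vpe = Derivation("VP-ellipsis", a_bar_steps=2, info_structure="no focus in IP")
print("VP-ellipsis blocked:", blocked_by(vpe, sluice))

# Intervening focus in the IP domain: the two options no longer compete.
vpe_focus = Derivation("VP-ellipsis", a_bar_steps=2, info_structure="focus in IP")
print("VP-ellipsis blocked with focus:", blocked_by(vpe_focus, sluice))
```

Because blocking requires a strictly shorter rival, two derivations with the same number of steps both survive, whatever their information structure.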
An important property of this account is that it only predicts sluicing to be more economical when the remnant is extracted from within the VP by successive-cyclic movement. If the remnant is extracted from the IP domain, as with subject questions or wh-adverbial questions where the adverbial modifies the IP domain, then the wh-phrase will not need to make any successive-cyclic stop-offs in either the sluicing or the VP-ellipsis derivation, and so the number of steps of Ā-movement in both derivations will be the same. This predicts there will be no competition between sluicing and VP-ellipsis when the remnant is extracted locally from the IP domain, as with subjects and IP-level wh-adverbial modifiers. As Schuyler (2001), Merchant (2008), and Hartman (2011) note, this is correct for cases like (63) and (64) (= (8)).
(63) Someone left, but I don’t know who (did).
(64) You say you’ll pay me back, but you haven’t told me when (you will).
Our account also improves on the MaxElide account by not predicting competition between the different VP-ellipses in examples like (36a–b), repeated here, since there is no Ā-extraction and there seems to be no good reason to believe that one ellipsis derivation would be more economical than the other.
(36) a. John has been singing, and Mary has (been), too.
b. John shouldn’t be drinking, and Mary shouldn’t (be), either.
Indeed, our account only predicts competition between ellipsis options when one of the options is sluicing; that is, it is not the presence of Ā-movement itself that brings about competition (as in Merchant 2008), but the one-fell-swoop derivation for sluicing. That this is so is shown by the fact that no competition arises between big and small ellipsis options in configurations where there is Ā-movement from the ellipsis targets but no option to apply sluicing; this is demonstrated by the fact that ellipsis of the VP containing the control complement in (65) (a variant of (31)) does not block the option to elide just the infinitival complement.16
(65) JOHN wants to kiss MARY, but I don’t know . . .
a. . . . who BILL does.
b. . . . who BILL wants to.
MaxElide would predict competition between (65a) and (65b), leading to the prediction that (65b) would be blocked by (65a). No such prediction is made by our account, since it predicts only competition between the economical option of one-fell-swoop sluicing and other elliptical derivations.
Now let us turn to matrix wh-adverbial questions. Recall that these differ from their embedded counterparts like (64) in requiring intervening focus to ensure that VP-ellipsis is possible; that is, sluicing seems to outcompete VP-ellipsis here in the matrix examples, but not in the embedded ones.
(9) We know Anna is going to resign. The only question is: when (*will she)?
(14) b. If Anna isn’t going to resign today, then when WILL she?
Hartman (2011) provides data from Indian and Irish dialects of English indicating that the crucial difference between (9) and (14b) is T-to-C movement: VP-ellipsis is blocked when T-to-C movement takes place, at least when both sluicing and VP-ellipsis are possible. We propose that sluicing blocks VP-ellipsis in (9) because sluicing bleeds T-to-C movement, resulting in a more economical derivation. Specifically, we argue that the landing site for T-to-C movement is within the sluicing site, and deletion of the landing site bleeds T-to-C movement since it is driven by a PF condition dictating that the null C+wh is affixal and must be supported by an overt head like T. This makes the sluicing derivation shorter than the VP-ellipsis derivation, since the latter involves an extra movement step, and so economy prefers the former, all other things being equal.
This predicts intervening focus to have an ameliorating effect with the VP-ellipsis option, and of course it also predicts that this effect will only be seen when T-to-C movement is involved.
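The way ellipsis shortens a derivation on this proposal can be made concrete. The Python sketch below is offered only as an illustration: it assumes, following the suggestions in the text, that intermediate steps of successive-cyclic Ā-movement and T-to-C movement are PF-driven, and that a PF-driven step is bled when its landing site falls inside the ellipsis site. The `Step` record and the site labels are hypothetical, and only the movement steps discussed in the text are listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    landing_site: str
    pf_driven: bool  # assumption: only PF-driven steps can be bled by ellipsis


def remaining_steps(steps, elided_sites):
    """Drop PF-driven steps whose landing site is inside the ellipsis site."""
    return [s for s in steps
            if not (s.pf_driven and s.landing_site in elided_sites)]


# Landing sites assumed to fall inside each ellipsis site.
SLUICING_SITE = frozenset({"C", "Spec,vP"})
VPE_SITE = frozenset()  # neither C nor Spec,vP is inside the elided VP

local_object = [
    Step("object to Spec,vP", "Spec,vP", pf_driven=True),
    Step("wh-phrase to Spec,CP", "Spec,CP", pf_driven=False),
]
matrix_adverbial = [
    Step("when to Spec,CP", "Spec,CP", pf_driven=False),
    Step("T-to-C movement", "C", pf_driven=True),
]
embedded_adverbial = [
    Step("when to Spec,CP", "Spec,CP", pf_driven=False),
]

for label, steps in [("local object, as in (1)", local_object),
                     ("matrix wh-adverbial, as in (9)", matrix_adverbial),
                     ("embedded wh-adverbial, as in (64)", embedded_adverbial)]:
    n_sluice = len(remaining_steps(steps, SLUICING_SITE))
    n_vpe = len(remaining_steps(steps, VPE_SITE))
    print(f"{label}: sluicing {n_sluice} step(s), VP-ellipsis {n_vpe} step(s)")
```

Under these assumptions the sluicing derivation is shorter exactly where the text takes sluicing to outcompete VP-ellipsis, and the two derivations tie in the embedded wh-adverbial case.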
To make this analysis convincing, we need to provide some support for its components, namely, (a) that the sluicing site contains the landing site for T-to-C movement, and (b) that T-to-C movement is motivated by PF conditions and so can be bled by ellipsis. The first component is motivated by Merchant’s (2001) “Sluicing-Comp Generalization,” which states that no nonoperator material may survive sluicing. This generalization is motivated by the fact that overt complementizers never occur to the left of the sluicing remnant, even when they can cooccur in the nonelliptical structures in the language in question (Merchant provides evidence from Slavic, Germanic, and Celtic languages), and it strongly suggests that the constituent deleted in sluicing is large enough to contain the complementizers in the CP domain. We do not dwell on the matter of how to explain the Sluicing-Comp Generalization,17 but we take it that any account of this restriction would generalize to account for missing complementizers in sluices and would also account for the absence of T-to-C movement, as suggested by Merchant (2001), and we posit that the most plausible analysis is one where the complementizers are contained in the ellipsis site. The second component of our analysis, the claim that T-to-C movement is driven by PF conditions and can be bled by ellipsis, is not new, having been advanced for English by Lasnik (1999, 2001) and for Hungarian by Van Craenenbroeck and Lipták (2008). Lasnik’s argument comes from the fact that matrix sluices do not retain the auxiliary, as in (67), but this is undermined by the Sluicing-Comp Generalization, which subsumes this effect under the deletion of complementizers.
(67) John kissed someone.
*[CP Who [C′ did ⟨[TP he kiss]⟩]]?
Nevertheless, the claim that null complementizers in English are affixes that need support from some other head in the structure has been made in different forms by Pesetsky (1991), Bošković and Lasnik (2003), and Kim (2008) (see also Bruening to appear for an alternative PF-based analysis of inversion), so we take it that our assumptions about C in English are reasonably well-founded. As for the claim that verb movement to such a target may be bled by ellipsis, Van Craenenbroeck and Lipták (2008) analyze an interesting set of facts in Hungarian that seems to provide compelling evidence for this. The evidence comes from so-called focus sluices like (68), where the remnant of sluicing is a focused non-wh XP in a yes/no question. As Van Craenenbroeck and Lipták note, the head that realizes C in yes/no questions, which normally surfaces adjoined to the verb in C, is found attached to the ellipsis remnant in the focus sluices (see (-e) in (68)). They interpret this as indicating that the verb has failed to undergo head movement to that C position because it has been bled by ellipsis.
(68) János meghívott egy lányt, de nem tudom hogy ANNÁT*(-e).
János invited a girl but not I.know COMP Anna-Q
‘János invited a girl, but I don’t know if it was Anna.’
Thus, it seems that our basic assumptions about T-to-C movement are reasonably well-supported, although clearly more work needs to be done to unearth further evidence for this effect.
So far, we have argued that ellipsis may bleed verb movement and successive-cyclic movement in certain contexts, and that this has consequences for the economy of derivation that are reflected in the data normally attributed to MaxElide. At this point, one may wonder whether there is a principled way of predicting which types of movement could be bled by ellipsis, as our analyses would be substantially complicated if it turned out that a larger class of movements was bled by ellipsis. The null hypothesis is that only movement that is motivated by PF constraints can be bled, and we have indicated that this may hold for the movements that were bled by ellipsis in our analyses, namely, intermediate steps in successive-cyclic Ā-movement and T-to-C movement. What other movement rules can be analyzed this way? One candidate is A-movement to Spec,TP, as there are proposals in the literature for PF-based accounts (Sauerland and Elbourne 2002, Landau 2007) and indeed it has been proposed that A-movement is bled by sluicing (Merchant 2001, Van Craenenbroeck and Den Dikken 2006). This would have implications for our analysis of subject wh-questions like (63): if ellipsis bled A-movement, then the derivation for a subject sluice might be able to omit the step of A-movement prior to wh-movement, making the sluice more economical and leading to the incorrect prediction that sluicing would outcompete VP-ellipsis. We do not take this particular case to be a problem here, since the PF-based theories of A-movement face a number of problems (see, e.g., Lasnik and Park 2003 and Barros, Elliott, and Thoms 2014 on the claim that A-movement is bled by sluicing), but it is illustrative of the wider issue for our account. We must leave this as a topic for future research, although we note optimistically that it may be possible to turn things around and use “MaxElide” effects as a probe for identifying movement rules that are driven by PF conditions.
In this reply, we have argued against accounts of interactions between wh-movement and ellipsis in terms of MaxElide, which enforces competition between sluicing and VP-ellipsis in narrowly defined domains. We showed that extraction from VP-ellipsis is more restricted than would be expected on the basis of MaxElide alone, with many extractions remaining ungrammatical even when the competing sluicing derivations are ruled out. We argued that this large class of cases can be explained in terms of parallelism alone, which then required a reassessment of which movement types count for the calculation of parallelism, according to which only Ā-traces and head traces were taken into account. This left just a small class of extractions where ruling out sluicing did affect the grammaticality of extraction from VP-ellipsis, and we argued that these can be analyzed in terms of general derivational economy.
Our analysis has three important implications. First, it allows us to dispense with the ellipsis-specific constraint MaxElide. This is a welcome theoretical result, since it is not clear how such a constraint could be learned, and it is also difficult to see how it could be said to derive from general functional pressures to “say less,” since there are many cases where such a constraint would seem inappropriate. Second, the data analyzed here provide strong evidence for a structural notion of parallelism, as the semantic definition found in the previous literature on MaxElide appeared to be too weak to account for many of the contrasts presented here. Obviously, more work needs to be done on this front, as there are data that seem to require a less stringent definition of parallelism. We believe that these data can be accounted for with a structural approach to parallelism that appeals to the mechanism of accommodation (see, e.g., Fox 1999a, Thoms 2015). Third, the data discussed here indicate that A-traces do not count for calculating parallelism, contrary to Hartman’s (2011) central claim that traces are uniform with respect to how they are interpreted at the syntax-semantics interface. Although a uniform analysis of movement is a laudable aim, we believe that this separation of A-movement from the other movement types is justified, as it is well-known that with respect to reconstruction, A-movement often behaves as if it does not leave a trace (Chomsky 1995, Lasnik 1998), although it is now well-established that A-movement cannot be analyzed as traceless movement altogether (Fox 1999b, Lebeaux 2009, Iatridou and Sichel 2011). Figuring out how to account for this nonuniform picture is a big topic for future research, and we speculate that examining the interaction between the two empirical phenomena considered here (reconstruction and ellipsis parallelism) may be the way to go.
For helpful discussion of the topics presented here, we would like to thank Matt Barros, Željko Bošković, Patrick Elliott, Jon Gajewski, Craig Sailor, and Susi Wurmbrand. Thanks also to two anonymous reviewers for helpful comments and suggestions. All remaining errors are ours.
3 Takahashi and Fox (2005) attempt to account for certain restrictions on the availability of sloppy readings of pronouns in terms of MaxElide. The data they attempt to account for are given in (i).
Takahashi and Fox (2005) assume that in order for the pronoun in the ellipsis site to be interpreted as sloppy, it must be bound by the λ-abstraction that composes with the subject (i.e., John in the AC and Bill in the EC); thus, this creates a rebinding configuration just as wh-movement does in (1). MaxElide then chooses the largest ellipsis target, hence ruling out the sloppy interpretation in (ib). This type of analysis predicts that if intervening focus blocked ellipsis of the larger VP, then MaxElide would be forced to choose ellipsis of the smaller VP and the sloppy interpretation should be possible in (ib), just as focus has an ameliorating effect in (6). As (ii) shows, this prediction is incorrect; the sloppy interpretation is still impossible here, suggesting that whatever is blocking this sloppy interpretation, it is not MaxElide (see Hardt 2006 for similar examples that lack a sloppy interpretation).
Hardt (2006) and Grant (2008) present even more evidence against a MaxElide account of the contrast between (ia) and (ib); for this reason, we limit ourselves to discussing configurations involving movement and leave the constraints on the availability of sloppy interpretations as a topic of future research.
4 An anonymous reviewer reports not finding Hartman’s (15a) wholly ungrammatical, and notes that other speakers consulted felt similarly. We believe this judgment may be due to a potential ambiguity in this example (also noted by the same reviewer), where who is the subject, John is the object, and the verb is removed by pseudogapping. This ought to be controlled for in examples like (i), which we find worse than (15a).
(i) Mary is eating cake. What is JOHN *(eating)?
Note also that the same problem does not trouble (15b).
5 Based on an attested example at http://www.fanfiction.net/s/4163642/16/Death, accessed 17 June 2014.
6 We use the term salvage to distinguish this effect from repair, which has a technical sense in the literature that we wish to avoid. In the account that follows, as in that of Takahashi and Fox (2005), Merchant (2008), and Hartman (2011), focus does not actually repair a “broken” extraction; rather, it rules out a competing, more economical option.
7 We follow Hartman (2011) in representing the base position of the subject as Spec,VP here and in other schematics above as well. Things would be complicated further if we were to follow much recent work in representing the base position of the subject as the specifier of a separate VP-shell projection like vP, as this would also lead to a situation where the lower VP-shell would not contain a rebound variable, again incorrectly predicting the availability of VP-ellipsis (at least if it turned out that VP-ellipsis were not to be reanalyzed as vP-ellipsis).
8 This definition is a simplification of what is needed, as there are well-known problems with such strict views of parallelism, such as the data in (i). In (ia) and (ib), a sloppy reading is available even though the DPs in the antecedent and elided clauses occupy different positions in (ib).
(i) a. John’s boss fired him and Bill’s boss did too.
b. The guy John works for fired him and Bill’s boss did too.
Rooth (1992) uses these data to argue against a structural view of parallelism; however, we believe that such examples could be made to fit within a structural view of parallelism if accommodation of a new antecedent were allowed for, as in Fox 1999a and Thoms 2015.
10 As a reviewer notes, long-distance object extractions are not unsalvageable when VP-ellipsis targets the lower VP.
(i) I know you said John spoke to someone, but I don’t know who you said MARY did.
Our analysis does in fact predict that VP-ellipsis will be salvageable in the embedded clause, since with successive-cyclic movement the embedded clause of a long-distance extraction will look broadly similar to regular local object extraction: the ellipsis clause will contain a λ-binder in the Spec,CP local to the VP-ellipsis site, and in the antecedent clause the indefinite correlate will undergo QR to the embedded Spec,CP, thus satisfying Parallelism.
12 The correlation between the exceptional wide scope of universal QPs in embedded contexts and the availability of long-distance extraction from VP-ellipsis is not perfect. For instance, Farkas and Giannakidou (1996) note that universal QP subjects can take “extrawide scope” out of finite complements of predicates like make sure, yet the same complements do not allow long-distance extraction from VP-ellipsis.
It is perhaps relevant that Farkas and Giannakidou’s account of (i) is stated, not in terms of covert movement, but in terms of how the lexical semantics of the embedding predicate ensures that the embedded subject and matrix subject behave as if they were coarguments. Thus, it could be the case that these finite clauses are barriers for QR, allowing us to retain the parallelism-based account of (ii), while some other mechanism ensures inverse scope of the two quasi coarguments. Clearly, more research is needed to make this argument work, though.
13 Supporting evidence comes from the fact that DP predicates are also degraded as remnants of extraction from VP-ellipsis.
(i) *I’m sure JOHN will become A FOOTBALLER, but I don’t know what HIS BROTHER will.
This indicates that it is not necessarily the categorial status of AP remnants that rules them out as remnants; rather, it is restrictions on what can undergo QR.
14 A reviewer notes that this argument only goes through if we adopt the view of derivational economy according to which the metric employed for effort counts the number of derivational steps (e.g., Chomsky 1995). As the reviewer points out, this metric may not be correct, as it is plausible that the correct notion of economy measures the length of dependencies in terms of nodes crossed, with direct consequences for our account. Furthermore, the reviewer notes that the general preference for shorter dependencies over longer ones evidenced by psycholinguistic studies (e.g., Crain and Fodor 1985, Gibson 2000, Phillips, Kazanina, and Abada 2005) could be a factor that affects offline judgments of the kind discussed above. We agree that this issue should be considered when it comes to economy-based arguments. However, it is not clear to us that the preference for shorter dependencies would necessarily lead to a preference for derivations involving a sequence of shorter steps over ones with longer steps when the two ultimately create a global dependency of the same length. Evidence for the short-step preference comes primarily from experiments that show preferences for creating shorter global dependencies than the ones that are required (i.e., preferences for subject relativization over object relativization).
15 As an anonymous reviewer notes, we must find a way for sluicing to block VP-ellipsis, but not block a sentence without any ellipsis. As (i) shows, sluicing in (ia) does not block (ib), though presumably (ib) contains more steps of movement than (ia).
(i) a. John was kissing someone, but I don’t know who.
b. John was kissing someone, but I don’t know who he was kissing.
Intuitively, we only want derivations that include the process of ellipsis to be competitors. But how do we formalize this? Here is one way. Let’s assume that only derivations that have the same numerations compete. Let’s also assume, following Merchant (2001), that ellipsis is licensed by an E(llipsis)-feature. Departing from Merchant slightly, let’s assume that the E-feature is a morpheme that merges onto certain functional heads and triggers nonpronunciation of those heads’ complements. For both the sluicing and the VP-ellipsis derivations, the numerations would contain the ellipsis-licensing E-feature, the difference between the two being which functional head the feature merges with (C for sluicing and T for VP-ellipsis). A sentence with no ellipsis would not have the E-feature in its numeration and thus would not compete against the ellipsis derivations.
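The competition just described can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not part of the analysis itself: the class Derivation, the function survivors, the representation of a numeration as a set of labels, and the movement-step counts are all hypothetical and chosen only to mirror the reasoning above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    label: str
    numeration: frozenset  # simplified: a set of item labels (assumption)
    movement_steps: int    # hypothetical step counts, for illustration only


def survivors(derivations):
    """Group derivations by numeration and keep, within each group,
    those with the fewest movement steps."""
    groups = {}
    for d in derivations:
        groups.setdefault(d.numeration, []).append(d)
    winners = []
    for group in groups.values():
        fewest = min(d.movement_steps for d in group)
        winners.extend(d for d in group if d.movement_steps == fewest)
    return winners


# Items shared by all three derivations of "... who (he was kissing)".
base = frozenset({"C", "T", "who", "he", "was", "kissing"})
# Both ellipsis derivations add the E-feature; they differ only in
# which head E merges with, which the numeration does not record.
with_e = base | {"E"}

candidates = [
    Derivation("sluicing", with_e, movement_steps=1),     # E merges with C
    Derivation("VP-ellipsis", with_e, movement_steps=2),  # E merges with T
    Derivation("no ellipsis", base, movement_steps=2),    # no E-feature
]

print([d.label for d in survivors(candidates)])
```

On these assumed step counts, sluicing blocks VP-ellipsis because the two share a numeration, while the nonelliptical sentence survives because it belongs to a different comparison set.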
16 Takahashi and Fox (2005) argue that the opposite effect holds in examples like (i), with parallel extraction in the AC and the EC.
(i) a. *I don’t know which puppy you should agree to adopt, but I know which one you should NOT agree to.
b. I don’t know which puppy you should agree to adopt, but I know which one you should NOT.
We are unsure why there is variation in these effects. We note, however, that the contrast in (i) would not follow from any of the other accounts if they were to adopt the widely held assumption that wh-movement is cyclic, since under this assumption the λ-operators left by cyclic movement through the embedded VP or CP would demarcate PDs in which the application of MaxElide would derive (ia) as grammatical.
17 There are different ways to capture this generalization. One is to follow Rizzi (1997) in assuming that the CP domain is split into a number of different projections, and to further assume that the landing site for wh-movement is in a higher CP projection than the one that is targeted by T-to-C movement, with a head of one of the higher projections licensing sluicing. Such an approach is explored by Baltin (2010) and Van Craenenbroeck (2010). It has a few problems, though; for example, it divorces T-to-C movement from movement to the specifier of the same projection, losing the core insight of criterial accounts that hold that one movement causes the other. It also leads us to expect that in languages with highly “isolating” CP fields with overt realizations for multiple distinct heads of CP projections, the complementizer that heads the projection with the wh-phrase in its specifier will survive sluicing. This does not seem to be correct, though, as Welsh obeys the Sluicing-Comp Generalization, yet it seems to be a good candidate for a language with an isolating CP field, as it may realize up to three distinct C heads simultaneously (Hendrick 2000).
An alternative analysis of the Sluicing-Comp Generalization is provided by Thoms (2010). Thoms rejects the idea that ellipsis is licensed by a set of lexically specified heads, like T in the case of VP-ellipsis and C in the case of sluicing; instead, he proposes that ellipsis is generally licensed by overt movement, with ellipsis effectively being another way of doing copy deletion. According to this analysis, the licensor of ellipsis in sluicing is the moved wh-phrase itself, which licenses deletion of its structural complement; this includes the complementizer, thus deriving the Sluicing-Comp Generalization.