## 1 Introduction

Gapping removes the finite element (T and its host), possibly along with additional material, in the second and subsequent coordinates of a coordination structure, leaving behind two remnants. (The material that has gone missing—in (1), the finite auxiliary *had* and the main verb *ordered*—is represented with ∆.)

(1) Some had ordered mussels, and others ∆ swordfish.

Building on a long tradition of earlier work, Johnson (2009:293) identifies three unique properties that distinguish gapping from superficially similar elliptical constructions, such as pseudogapping (e.g., *Some had ordered mussels, and others had* ∆ *swordfish*). First, gapping is restricted to coordination structures (2) (Jackendoff 1971:22, Hankamer 1979:18–19). Second, the gap in gapping cannot be embedded (3) (Hankamer 1979:19). Third, the antecedent in gapping cannot be embedded (4) (Hankamer 1979:20).

(2) *Some had eaten mussels, because others ∆ shrimp.

(3) *Some had eaten mussels, and she claims that others ∆ shrimp.

(4) *She’s said Peter has eaten his peas, and Sally ∆ her green beans, so now we can have dessert. Intended: ‘She has said that Peter has eaten his peas; Sally has eaten her green beans.’ (Johnson 2009:293)

Crucially, (4) is ungrammatical under an interpretation in which only the antecedent clause—not the gapped clause—is embedded. (The sentence is grammatical with a different interpretation, one that is not relevant here, where the entire conjunction is embedded.)

There are a number of theories of gapping that successfully account for at least the first two of these properties (Coppock 2001, Lin 2001, Johnson 2004, 2009). These all share one analytical ingredient: they appeal to low coordination of small verbal constituents, such as vPs, under a single T head. (See Kubota and Levine to appear, though, for a different approach within Categorial Grammar.)^{1} T goes missing in the second and subsequent coordinates because it was never present inside them to begin with. Some other mechanism gets rid of any additional material that goes missing: either some kind of ellipsis (for Coppock and Lin) or across-the-board movement (for Johnson).

As Johnson (2009:296–300) observes, low coordination easily derives the first property of gapping, because two or more vPs can be embedded under a single T head only through coordination. It also derives the second property: a single T head cannot be shared with a vP that is embedded *inside* a coordinate. But deriving the third property is less straightforward. Under any theory of gapping that uses low coordination, the notion of an “antecedent” is a complex one. Since T is removed through a different mechanism than any additional material that goes missing, the antecedent of T is analytically distinct from the antecedent of additional missing material.

I will take a closer look here at the third property of gapping in light of low-coordination theories of gapping. In section 2, I show that while the syntax of low coordination can explain why the antecedent of T cannot be embedded inside the first coordinate, it has nothing to say about why the antecedent of any additional missing material also cannot be embedded. Then, in section 3, I propose a new generalization about what can be embedded inside the first coordinate, which I call the No Correlate Embedding Generalization, whose source does not lie in the syntax of low coordination. In section 4, I suggest that this generalization arises from a constraint on the focus structures of the coordination structures in which gapping appears, Low-Coordinate Parallelism. Finally, in section 5, I examine a prediction of this account: Low-Coordinate Parallelism should be satisfied when the embedding material in the first coordinate is itself contained inside a focus.

## 2 Taking a Closer Look at the Third Property

In its traditional formulation, the third property of gapping is usually stated along the following lines: the antecedent in gapping cannot be embedded inside the first coordinate (see, e.g., Hankamer 1979:20). Theories of gapping that appeal to low coordination remove T and any additional material in different ways. Consequently, the antecedent of T is analytically distinct from the antecedent of any additional missing material.

Does the third property of gapping hold for both antecedents? We can start by looking at an example in which only T goes missing. In (5), the antecedent of T cannot be located inside an embedded clause in the first coordinate. (Here and below, the antecedent of T is underlined.)

(5) *She will say that Peter has eaten his peas, and Sally ∆ eaten her green beans. Intended: ‘She will say that Peter has eaten his peas; Sally has eaten her green beans.’

It is easy to see how this follows from the syntax of low coordination. T_{2} in (6), contained within the embedded clause in the first coordinate, cannot be shared with the second vP coordinate.

To test whether the antecedent of any additional missing material can be embedded, we need an example like Johnson’s (2009) sentence in (4). In the simplified version in (7), the additional missing material in the second coordinate—the main verb *eaten*—is identical to material that is contained within an embedded clause in the first coordinate. (The antecedent of additional missing material is boxed.)

Since (7) is ungrammatical under the intended interpretation, it appears that any additional material that goes missing also cannot be embedded inside the first coordinate.

Importantly, the syntax of low coordination does not account for this fact. In (7), the second coordinate can share the T of the matrix clause. So, the restriction on embedding the antecedent of additional missing material must arise from another source. This might, in principle, be a product of the operation that removes this material, namely, ellipsis or across-the-board movement. But I suggest below that it is in fact a specific case of a more general constraint on gapping, which prevents *correlates* from being embedded inside the first coordinate.

## 3 The No Correlate Embedding Generalization

Given this account, from the syntax of low coordination, of why T’s antecedent cannot be embedded inside the first coordinate, the sentence in (8) should be fully grammatical: it is parallel to (7), except that the main verb does not go missing in the second coordinate.

(8) *She has said that Peter has eaten his peas, and Sally ∆ drunk her milk. Intended: ‘She has said that Peter has eaten his peas; Sally has drunk her milk.’

Since T of the *matrix* clause in the first coordinate—the auxiliary *has*—selects for participial morphology, it can be shared with the second coordinate. (In more traditional terminology, the T that has gone missing can find as its antecedent the T of the matrix clause in the first coordinate.)

So why isn’t (8) grammatical? I propose that the traditional generalization about embedding the antecedent of gapping should be subsumed by a more inclusive generalization. The remnants stand in contrastive relationships with elements in the first coordinate: the remnants *Sally* and *drunk her milk* contrast with the correlates *Peter* and *eaten his peas*. As shown in (9), each of these can bear a pitch accent.

(9) *She has said that [PETER] has [eaten his PEAS], and [SALLY] ∆ [drunk her MILK]. Intended: ‘She has said that Peter has eaten his peas; Sally has drunk her milk.’

I propose that it is the *correlates* that cannot be embedded under material that is present only in the first coordinate.

(10) No Correlate Embedding Generalization: The correlates in gapping cannot be embedded inside the first coordinate.

It is not clear how the syntax of low coordination could ever account for the No Correlate Embedding Generalization, as the correlates do not form a natural class syntactically. They are simply the elements in the first coordinate that stand in a contrastive relationship with the remnants.

## 4 The Role of Parallelism

Where else could the source for the No Correlate Embedding Generalization lie? I will suggest that the generalization arises from the same grammatical principles that enforce a contrastive relationship between each remnant and its correlate. As many have observed, gapping has a characteristic information structure that is reflected in its prosody (Kuno 1976:310, Sag 1976:287, Hankamer 1979:183–184, Levin and Prince 1986, Hartmann 2001:162–166, Kehler 2002:81–100, Winkler 2005:191–194, Repp 2009:83–148).

(11) [SOME] had ordered [MUSSELS], and [OTHERS] ~~ordered~~ [SWORDFISH].

In (11), the remnants *others* and *swordfish* bear a pitch accent and contrast with the corresponding phrases in the first coordinate. These correlates are often new information and can also be realized with pitch accents, as *some* and *mussels* are in (11).

I propose that this contrastive relationship arises from a more general constraint on the focus structures of the coordination structures in which gapping occurs. This constraint can be stated using Rooth’s (1985, 1992) alternative semantics for focus.

(12) Low-Coordinate Parallelism: For vPs $\alpha$ and $\beta$, if $\alpha$ and $\beta$ are coordinated, then $[\![\alpha]\!] \in \mathit{ALT}(\beta)$ and $[\![\beta]\!] \in \mathit{ALT}(\alpha)$.

The alternative set for a linguistic expression, given by the function *ALT*, is the set of ordinary meanings derived by replacing a focus-marked constituent with any expression of the same type. Low-Coordinate Parallelism thus requires that the coordinates in a low-coordination structure be alternatives to one another.
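For concreteness, one standard way of computing such alternative sets recursively in Rooth-style alternative semantics is sketched below. It assumes the usual textbook formulation (pointwise function application, with $D_\tau$ the domain of type $\tau$ and $\alpha$ the functor in the third clause) and is offered as an illustration, not as a definition drawn from the text above.

```latex
\begin{align*}
\mathit{ALT}([\alpha]_F) &= D_\tau
  && \text{where } \tau \text{ is the type of } \alpha \\
\mathit{ALT}(\alpha) &= \{[\![\alpha]\!]\}
  && \text{for a nonfocused lexical item } \alpha \\
\mathit{ALT}(\alpha\ \beta) &= \{a(b) \mid a \in \mathit{ALT}(\alpha),\ b \in \mathit{ALT}(\beta)\}
  && \text{for a nonfocused branching node}
\end{align*}
```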

As a consequence, assuming that each remnant contains a focus, the coordinates in a low coordination will have to have parallel focus structures. In addition, any nonfocused material they contain will have to be semantically identical. To illustrate this, the focus structure for (11) is given in (13).

Low-Coordinate Parallelism is satisfied, because the first coordinate is in the alternative set of the second coordinate, and vice versa.^{2}

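(14)

- a. $[\![\mathrm{vP}_1]\!] = \mathit{order}(\mathit{mussels})(\mathit{some}) \in \mathit{ALT}(\mathrm{vP}_2) = \{\mathit{order}(x)(y) \mid x, y \in D_e\}$
- b. $[\![\mathrm{vP}_2]\!] = \mathit{order}(\mathit{swordfish})(\mathit{others}) \in \mathit{ALT}(\mathrm{vP}_1) = \{\mathit{order}(x)(y) \mid x, y \in D_e\}$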

In fact, the alternative sets for the two coordinates, *ALT*(vP_{1}) and *ALT*(vP_{2}), are the same, since they have the same focus structure and contain semantically identical nonfocused material.

It should now be clear how Low-Coordinate Parallelism derives the No Correlate Embedding Generalization. If the first coordinate contains any nonfocused material that is not present in another coordinate, the parallelism constraint will not be satisfied. To illustrate this, the focus structure for the ungrammatical gapping sentence in (9), in which the correlates are embedded in the first coordinate, is given in (15).

There is some nonfocused material present in the first coordinate that is not found in the second coordinate. The sentence thus violates Low-Coordinate Parallelism and is ungrammatical.

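(16)

- a. $[\![\mathrm{vP}_1]\!] = \mathit{say}(\mathit{eat}(\mathit{his\text{-}peas})(\mathit{peter}))(\mathit{she}) \in \mathit{ALT}(\mathrm{vP}_2) = \{f(x) \mid x \in D_e \wedge f \in D_{\langle e,t \rangle}\}$
- b. $[\![\mathrm{vP}_2]\!] = \mathit{drink}(\mathit{her\text{-}milk})(\mathit{sally}) \notin \mathit{ALT}(\mathrm{vP}_1) = \{\mathit{say}(f(x))(\mathit{she}) \mid x \in D_e \wedge f \in D_{\langle e,t \rangle}\}$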

Since there is a focus on the entire VP remnant, the first coordinate is an alternative to the second coordinate. The proposition that she said that Peter ate his peas is a proposition of the form “*x f*,” where *x* is some individual and *f* is some property of individuals. However, the second coordinate is *not* an alternative to the first coordinate: it is not a proposition of the form “she said that *x f*.”
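The membership checks in (14) and (16) can be sketched in code. The following is a minimal illustration rather than part of the analysis: it approximates alternative sets syntactically, treating each focus-marked constituent as a slot that any term of the same semantic type can fill. The names `Const`, `App`, `Focus`, `in_alt`, and `parallel` are hypothetical, and subjects such as *some* are treated as type e, as in (14).

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Semantic types: "e", "t", or a pair (input, output). Assumed encoding.
Type = Union[str, Tuple["Type", "Type"]]
E, T = "e", "t"
ET = (E, T)


@dataclass(frozen=True)
class Const:
    name: str
    typ: Type


@dataclass(frozen=True)
class App:
    fn: "Term"   # functor
    arg: "Term"  # argument


@dataclass(frozen=True)
class Focus:
    body: "Term"  # an F-marked constituent


Term = Union[Const, App, Focus]


def typ(term):
    """Semantic type of a term; focus marking does not change the type."""
    if isinstance(term, Const):
        return term.typ
    if isinstance(term, Focus):
        return typ(term.body)
    fn_type = typ(term.fn)
    assert fn_type[0] == typ(term.arg), "type mismatch in application"
    return fn_type[1]


def ordinary(term):
    """Ordinary meaning: the same term with all focus marking stripped."""
    if isinstance(term, Focus):
        return ordinary(term.body)
    if isinstance(term, App):
        return App(ordinary(term.fn), ordinary(term.arg))
    return term


def in_alt(meaning, pattern):
    """Is the focus-free `meaning` obtainable from `pattern` by replacing
    each focused constituent with something of the same type?"""
    if isinstance(pattern, Focus):
        return typ(meaning) == typ(pattern)
    if isinstance(pattern, App):
        return (isinstance(meaning, App)
                and in_alt(meaning.fn, pattern.fn)
                and in_alt(meaning.arg, pattern.arg))
    return meaning == pattern  # nonfocused material must be identical


def parallel(a, b):
    """Low-Coordinate Parallelism, checked in both directions."""
    return in_alt(ordinary(a), b) and in_alt(ordinary(b), a)


def ind(name):
    return Const(name, E)


order, eat, drink = (Const(n, (E, ET)) for n in ("order", "eat", "drink"))
say = Const("say", (T, ET))

# Simple gap, as in (14): foci on subject and object.
simple_vp1 = App(App(order, Focus(ind("mussels"))), Focus(ind("some")))
simple_vp2 = App(App(order, Focus(ind("swordfish"))), Focus(ind("others")))

# Embedded correlates, as in (16): 'say' and 'she' are nonfocused.
emb_vp1 = App(App(say, App(Focus(App(eat, ind("his-peas"))),
                           Focus(ind("peter")))), ind("she"))
emb_vp2 = App(Focus(App(drink, ind("her-milk"))), Focus(ind("sally")))

print(parallel(simple_vp1, simple_vp2))      # True
print(in_alt(ordinary(emb_vp1), emb_vp2))    # True
print(in_alt(ordinary(emb_vp2), emb_vp1))    # False
print(parallel(emb_vp1, emb_vp2))            # False
```

Under these assumptions, the simple gap satisfies the constraint in both directions, while the embedded-correlate case fails only in the direction from the second coordinate to the first, matching the asymmetry described above.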

Low-Coordinate Parallelism imposes the same parallelism constraint that Rooth’s (1992:102–107) squiggle operator (~) does on so-called bare remnant ellipsis (or stripping). In Rooth’s proposal, one of these operators is adjoined to the clause containing the ellipsis and another is adjoined to the antecedent clause, thus requiring the two clauses to be alternatives to one another. For this reason, it is possible to view Low-Coordinate Parallelism as simply a statement about where squiggle operators must adjoin: there has to be one operator adjoined to each vP coordinate in a low-coordination structure. This derives the parallelism between the coordinates in a more general fashion, in terms of the distribution of Roothian squiggle operators.

However this parallelism constraint arises, it probably holds only of low-coordination structures. As Johnson (2009:293) shows, the antecedent of VP-ellipsis in pseudogapping can be embedded relatively easily inside the first coordinate.

(17) ?She’s said Peter has eaten his peas, and Sally has ∆ her green beans, so now we can have dessert. ‘She has said that Peter has eaten his peas; Sally has eaten her green beans.’

## 5 A Prediction

Low-Coordinate Parallelism would seem to predict, for the gapping sentence in (9), that if the embedding material in the first coordinate were contained inside a focus, it would become well-formed. For instance, the correlates in the first coordinate could be the higher subject DP—replaced by the proper name *Mike* so it bears a pitch accent more easily—and the larger VP *said that Peter has eaten his peas*. (Broad foci like this one are often associated with more than one pitch accent, even though it is in principle possible for a single pitch accent to project focus onto such a large phrase.)

(18) ??[MIKE]_{F} has [SAID that PETER has eaten his PEAS]_{F}, and [SALLY]_{F} [drunk her MILK]_{F}. Intended: ‘Mike has said that Peter has eaten his peas; Sally has drunk her milk.’

With this focus structure, Low-Coordinate Parallelism is satisfied. The embedding predicate in the first coordinate is contained inside a correlate, so that all nonfocused material in both coordinates is semantically identical.

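(19)

- a. $[\![\mathrm{vP}_1]\!] = \mathit{say}(\mathit{eat}(\mathit{his\text{-}peas})(\mathit{peter}))(\mathit{mike}) \in \mathit{ALT}(\mathrm{vP}_2) = \{f(x) \mid x \in D_e \wedge f \in D_{\langle e,t \rangle}\}$
- b. $[\![\mathrm{vP}_2]\!] = \mathit{drink}(\mathit{her\text{-}milk})(\mathit{sally}) \in \mathit{ALT}(\mathrm{vP}_1) = \{f(x) \mid x \in D_e \wedge f \in D_{\langle e,t \rangle}\}$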

As can easily be verified, each coordinate here expresses a proposition of the form “*x f*.”
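The verification can be spelled out by naming a witness for $x$ and $f$ in each coordinate. The $\lambda$-notation and the subscripted witness names below are assumed here for illustration; they are one way of making the check explicit.

```latex
\begin{align*}
[\![\mathrm{vP}_1]\!] &= f_1(x_1), &
  x_1 &= \mathit{mike}, &
  f_1 &= \lambda y.\,\mathit{say}(\mathit{eat}(\mathit{his\text{-}peas})(\mathit{peter}))(y) \\
[\![\mathrm{vP}_2]\!] &= f_2(x_2), &
  x_2 &= \mathit{sally}, &
  f_2 &= \lambda y.\,\mathit{drink}(\mathit{her\text{-}milk})(y)
\end{align*}
```

Since $x_1, x_2 \in D_e$ and $f_1, f_2 \in D_{\langle e,t \rangle}$, both meanings belong to the shared set $\{f(x) \mid x \in D_e \wedge f \in D_{\langle e,t \rangle}\}$.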

But is (18) actually felicitous? Four native speakers of English whom I consulted did not generally accept it. On a 7-point scale (where 1 is completely ungrammatical and 7 is completely grammatical), three rated it as 1 or 2, and the fourth as 4. This is somewhat surprising since (18) has the same focus structure as the uncontroversial “simple gap” in (20); see the parallel examples in Siegel 1987:54 and Johnson 2004:33. (Three of the same speakers rated the sentence in (20) as 6 or 7, including the fourth speaker mentioned above; the final speaker rated it as 5.)

(20) [PETER]_{F} has [eaten his PEAS]_{F}, and [SALLY]_{F} [drunk her MILK]_{F}.

Just as in (18), one remnant in (20) is a VP that corresponds to a correlate that is a VP. It satisfies Low-Coordinate Parallelism in the same way, but it is clearly well-formed.

There are two plausible reasons why speakers identify a contrast between (18) and (20), both of which are independent of Low-Coordinate Parallelism. First, as one reviewer suggests, gapping might require small correlates and remnants, each bearing a narrow focus. This might even be a more general constraint on information structure, since it evokes Schwarzschild’s (1999) AVOIDF constraint, which favors foci that are as small as possible.

Second, it seems likely to me that the VP *eaten his peas* is a more salient alternative to the remnant VP *drunk her milk* than the larger VP *said that Peter has eaten his peas*. The availability of such an alternative in the first coordinate in (18) might interfere with a speaker’s ability to infer the necessary contrastive relation, contributing to its degraded status. Indeed, when the larger VP in the first coordinate is a more salient alternative than the smaller VP contained within it, judgments improve. In (21), the context makes the VPs *ask whether he can get a deadline extension* and *submit her essay* contrasting alternatives.

(21)

Q: What will each student do when they see the professor?

A: [Sam]_{F} will [ask whether he can get a deadline extension]_{F}, and [Alex]_{F} [submit her essay]_{F}.

Three of four speakers rated the answer as 6 (out of 7). The fourth speaker, who also judged the basic gapping sentence in (20) to be less grammatical than the others, rated it as 3. This pattern is consistent with what Low-Coordinate Parallelism predicts.

## 6 Conclusion

After taking a closer look at the third property of gapping mentioned in section 1, I argued that it is a special case of a more inclusive generalization, the No Correlate Embedding Generalization: the correlates in gapping cannot be embedded inside the first coordinate. I proposed that this generalization might arise because of a constraint on the focus structures of the coordination structures in which gapping appears. This constraint, which I called Low-Coordinate Parallelism, requires vP coordinates to be focus alternatives to one another. It remains now to figure out why low coordinations might be subject to such an information-structural constraint in the first place, and how this constraint would interact with other constraints on focus.

## Notes

I am grateful to Nate Clair, Elizabeth Coppock, Annahita Farudi, Danny Fox, Kyle Johnson, Laura Kertz, Ben Mericli, David Pesetsky, Craig Sailor, Bern Samko, and audiences at Cornell University, MIT, Northwestern University, the University of California, Santa Cruz, the University of Rochester, and Wayne State University for their helpful questions and suggestions. I also appreciate the comments of two anonymous reviewers, which greatly improved the squib. This research was assisted by a New Faculty Fellowship from the American Council of Learned Societies, funded by the Andrew W. Mellon Foundation.

^{1} As a consequence, the subject of the first coordinate must raise asymmetrically to Spec,TP, while the subjects of other coordinates stay in situ. This accounts for Siegel’s (1987) observation that the subject of the first coordinate c-commands—and hence can bind into—the subject of subsequent coordinates.

(i) No woman_{1} can join the army, and her_{1} girlfriend ∆ the navy. (Johnson 2009:293)

Lin (2002:58–94) offers one way of understanding this asymmetrical subject movement (see also Johnson 2004:41–49): if the Coordinate Structure Constraint holds of LF representations (Fox 2000:51–58), and if A-movement can reconstruct, it will not be subject to the island constraint in the first place.

^{2} Importantly, this calculation requires the subject of the first coordinate to be interpreted within the first coordinate, as shown in (11). But for the purposes of variable binding, it must also take scope over the entire coordination, so that it can bind into the second coordinate (see footnote 1). Erlewine (2014:101–105) offers one way of resolving this apparent paradox. He argues that it is possible to interpret a lower copy of movement, like the one the subject leaves in Spec,vP, for calculating the presuppositional meaning of the focus-sensitive operator *even*. This does not affect the sentence’s at-issue content, which arises through composition with the highest copy of the subject, as usual. By analogy, Low-Coordinate Parallelism could be treated as a presupposition that is calculated at the vP level, using the copy of the subject present within the first coordinate.