We make these observations: (a) The direct embedding of a syntactic category X in itself (X-within-X) is surprisingly rare in human language, if it exists at all. (b) Indirect self-embedding (mediated by a sequence of other categories, and usually a phase boundary) systematically goes along with intensionality effects; the embedding and the embedded XP exhibit different behavior at the semantic interface. We argue that these constraints on recursion follow from the way in which single-cycle derivations organize semantic information in grammar.
Recursivity has long been regarded as a core feature of human language design and more recently as the major innovation on the evolutionary path to language (Hauser, Chomsky, and Fitch 2002). In recent Minimalist Program work (Chomsky 2007, 2008), recursion is encapsulated in the primitive combinatorial operation Merge. Our narrow focus here is not recursion as such but recursion in the specific sense of the apparent occurrence of “self-similar” structures involving the recurrence of some syntactic category X within itself.1 Merge defined simply as the formation of binary sets is silent on type recursion in this sense: there could have been hierarchical structures in language generated by Merge without any recurrence in the sense illustrated in (1), where it appears that an object of one category (CP) is hierarchically embedded in an object of the same type, which in turn is embedded in another object of the same type, and so on.
(1) a. [CP I saw the woman [CP that saw the woman [CP that saw the woman . . . ]]]
    b. [CP [The window [[CP the neighbor [CP the dog bit]] broke]] fell down].
    c. [CP John knew [CP that Peter believed [CP that Mary liked him]]].
In short, Merge, which amounts to an analysis of discrete infinity that has been claimed to be “minimal,” tells us no more and no less than that language is recursive. The question of why language is recursive in the particular ways in which it is must come from elsewhere. Here we argue that the answer lies in how forms of grammatical complexity mediate the organization of grammatical meaning. Linguistic recursion arises from how narrow syntax and semantic interpretation interlock, not from narrow syntax as such.2
Under the assumption of syntactic autonomy, grammatical objects are described as hierarchically organized and generated by such operations as concatenation, Merge, and Label, disregarding semantic interpretation, which is outsourced from grammar and relegated to the operations of an independent “conceptual-intentional (C-I) system” (or the “semantic component”). On the other hand, syntactic structures are sometimes also viewed as instructions for building or constructing certain concepts expressed in language (Pietroski 2011), making them partially generative for a certain (grammatical) semantics as opposed to merely answering the constraints imposed by an independent “semantic component,” in which all forms of semantic complexity already exist. In line with this broad perspective (Hinzen 2006), we argue here that rather than answering constraints from an independent semantics, the grammar imposes its forms on linguistically specific meanings.
In section 2, we survey a number of surprising restrictions on recursion in human language and interpret them against the background of a phase-based grammatical architecture. In section 3, we show how the architecture of single-phase derivation (Chomsky 2007, 2008) makes the rarity of the X-within-X configuration expected, given a specific semantic conception that we present of what phases are. We contrast this account with a Merge-based syntax combined with an appeal to (often un(der)specified and highly hypothetical) “interface demands,” as well as the traditional notion of selection. Section 4 concludes.
2 Restrictions on Recursion in Human Language
In this article, we distinguish between (a) the “direct” (or “immediate”) embedding of a syntactic category X within the same category X and (b) the “indirect” embedding of X within X, where the occurrence of X within X is mediated by a different category, Y. Our central observations are these:
(a) The direct embedding of X within X is extremely rare or absent in human grammar.
(b) Even indirect recursion is never strict: the embedding and embedded categories never behave strictly alike, as their behavior at the interfaces and their syntax suggest.
2.1 On the Direct Embedding of a Category X in Itself: C, D, N, v, and V
Starting with X = C, note that C never embeds in C directly. A sequence in which Cs occur within Cs really is a [C-v . . . [C-v . . . [C-v]]] sequence, as seen in (2a), or even a [C-v-D . . . [C-v-D . . . [C-v-D . . . ]]] sequence, as seen in (2b).
(2) a. [CP Allegedly, [TP John will [VP deny [CP that [TP Bill has ever [VP said [CP that . . . ]]]]]]]
    b. [CP Allegedly, [TP John will [VP deny [DP the very possibility [CP that [TP Bill has ever [VP defended [DP the claim [CP that . . . ]]]]]]]]]
So the occurrence of C within C is rigorously mediated by a rigid, linear, and finite sequence of non-Cs. Before a lower C embeds in a higher C, first V (and perhaps also N) has to reach the end of its extended projection (v and D, respectively).3 A similar argument can be made for other, higher, functional categories, such as D and v. This is trivially true if D and v are defined as functional projections of the lexical heads N and V, respectively, with which they form single phasal units in a minimalist framework such as that outlined in Chomsky 2008. Empirically speaking, counterexamples, which would involve adjacent articles (the a book), recursive transitivity (John kissed Mary Bill), or recursive causation (Bathing mudded the pigs the floor), are not found in human language.
The situation is initially less clear with the lexical categories V and N. Consider serial verb constructions. The process of serialization is clearly not unbounded: a limited number of verbs may appear in one serial verb construction. Moreover, serialization is of a templatic nature: not by accident, the term used for it includes the word construction (Pawley (2006:3), with respect to Kalam, argues that “there are certain advantages in a construction-based treatment” of all types of serial verb constructions). Finally, standard syntactic analyses (e.g., Baker and Stewart 2002, Muysken and Veenstra 2006) assign serial verb constructions a structure in which each verb is represented by a separate VP, where VPs adjoin to each other or are paratactically conjoined—hence, are not directly embedded.
Observe the structure in (3), however. Let us suppose (not necessarily plausibly) that no complex underlying structure (e.g., with reduced predicative relatives) is involved and that a set of bare nominals simply enters a strictly recursive structure—without cyclic boundaries or intervening (unlexicalized) categories.4
(3) [N [N [N [N warN filmN] studioN] committeeN] sessionN]
Every round of Merge then applies between two elements of the same category and produces another entity of the same category, effectively deriving a binary version of the arithmetical successor function with which Chomsky (2007) analogizes recursion/Merge in language.
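The analogy can be made concrete in a short sketch (ours, for illustration only; the set representation and the helper names `merge` and `depth` are expository assumptions, not part of the analysis in the text): Merge is modeled as binary set formation, and the compound in (3) is built by repeatedly merging an object of category N with another N, so that each round adds exactly one level of nesting, like the successor function.

```python
# Illustrative formalization (not from the text): Merge as binary set
# formation, applied to the N-within-N compound in (3).

def merge(x, y):
    # Merge as the formation of a binary, unordered set {x, y}.
    return frozenset([x, y])

def depth(obj):
    # Nesting depth of a Merge object; a bare lexical item has depth 0.
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0

# Building (3) bottom-up: every round of Merge combines two objects of
# category N and yields another N, so each round acts like a binary
# successor function on the structure built so far.
compound = "war"
for noun in ["film", "studio", "committee", "session"]:
    compound = merge(compound, noun)

print(depth(compound))  # 4: one level of nesting per round of Merge
```

Nothing in this sketch blocks merging N with N indefinitely, and that is precisely the point: on the view defended here, the ban on direct unbounded recursion must come from cyclicity, not from Merge itself.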
This situation is implied by the analysis of Roeper and Snyder (2005), under which a compound like (3) is generated as follows. First, war is base-generated as an abstract clitic complement of the head film. It then moves from this complement position to left-adjoin to the head, and its trace is deleted. The abstract clitic complement position is made available for the whole structure again; studio is inserted there and remerges at the root of the tree. The same procedure recurs for each new noun that enters the structure (committee, session, etc.). Even this analysis establishes a link between direct unbounded recursion and cyclicity in its traditional sense, because it involves countercyclic generation: recursive Merge in this case targets the bottom position of a structure that has already projected. The core of our argument is that the cyclicity characteristic of syntax bans direct unbounded recursion. Roeper and Snyder’s analysis thus points to the interesting possibility that there are countercyclic operations in syntax—perhaps only at the lexical category level, which thus emerges as a domain in which direct unbounded recursion is possible.
Note however that in principle, and no matter the number of rounds of embedding, a structure at the lexical category level can never have a referential use that is deictic in the sense that a speaker (grammatically) “points” to a specific object located in the speech context or discourse: this requires the projection of the functional layer on top of the structure generated in this fashion.5 And that functional layer will fully obey the principles we argue for here. In short, the one example we have found so far where one might conceivably argue that the “same embeds in the same” (directly) is also an example where the deictic potential of language is not yet fully realized. This anticipates our conclusion that the “antirecursion” effects investigated here precisely have to do with this deictic dimension of language.
So far, we have shown that a wide range of examples, including potential counterexamples to our argument, suggest that direct type recursion is banned in narrow syntax (or, if not banned, then very rare and limited to nondeictic domains). In other words, instances of type recursion in grammar tend to involve, and perhaps always involve, the intervention of other categories between two instances of the same category. One or more of the intervening categories usually are (in current terms) syntactic phases, which means that Transfer to semantics/discourse intervenes between a category and its recurrence; recurrence of category X in the same cycle/phase is therefore impossible. The recursive structure is thus not present within narrow cyclic syntax, a fact that anticipates our central claim elaborated below: syntactic type recursion is necessarily mediated by semantic access.
2.2 Type Recursion as Mediated by the Phase
In what follows, we in essence assume Chomsky’s notion of a phase, according to which phases are “(close to) functionally headed XPs” (2001:14). Beginning from a lexical root, a phase specifies a sequence of functional projections that strictly depends on the lexical category specified for the root in question (e.g., Fin would rarely if ever be encountered in a nominal phase, and a phase beginning with T is not completed by D). In this way, a given functional projection and a given lexical category form a unit of grammatical organization: D cannot occur without N, nor v without V. Note that a head T counts as a “lexical” head on this model (see Chomsky 2008), so C-T forms the same kind of unit as, say, v-V. The semantic task of functional layers on top of lexical categories is to regulate the embedding of lexical concepts in the speaker’s spatiotemporal and discursive frame. In particular, in the nominal domain they regulate the localization of the denotation of a given lexical head in relation to the speaker’s spatial location (the speaker’s Here; e.g., that man over there). In turn, verbal Aspect regulates the embedding of events described by verbal lexical heads in relation to the speaker’s speech time (the speaker’s Now). Finally, the C field is commonly considered to regulate how a configured proposition is embedded in the discourse.6
Given this phasal framework, we note that there does not appear to be phase-internal direct recursion. Indeed, every phase is a radically bounded object, with a lexical root at the bottom and a (short) finite and predetermined sequence of functional projections governing its use in discourse. Could there be indirect phase-internal type recursion?
Three prominent strategies are available to account for phase-internal structure. First, Richards (2007) argues that in a phase theory of syntax, the syntactic structure must grow as shown in (4), where only heads are represented, P stands for a phase head, and N stands for a nonphase head.
(4) P – N – P – N . . .
Here we are dealing with three relevant elements: the nonphase head, the phase head, and the phase edge (which is not represented in (4)). The phase head is a head that closes off the phase; the edge includes one or more specifiers of this head and its projection. On this view, there is no place for phase-internal type recursion.
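Richards’s alternation requirement can be stated as a simple check (an illustrative sketch of ours; the head inventory and its phase/nonphase classification below are expository assumptions, following the treatment of C, v, and D as phase heads in the text):

```python
# Illustrative sketch: checking that a sequence of heads alternates
# phase (P) and nonphase (N) heads, as in (4). The classification of
# heads below is an expository assumption, not a claim of the article.

PHASE_HEADS = {"C", "v", "D"}       # heads that close off a phase
NONPHASE_HEADS = {"T", "V", "N"}    # nonphase heads

def alternates(heads):
    # True iff adjacent heads always differ in phasehood: P-N-P-N . . .
    for first, second in zip(heads, heads[1:]):
        if (first in PHASE_HEADS) == (second in PHASE_HEADS):
            return False  # two phase heads (or two nonphase heads) in a row
    return True

# The clausal spine of (2a), one clause embedding the next:
print(alternates(["C", "T", "v", "V", "C", "T", "v", "V"]))  # True
# Direct C-within-C, the configuration argued above to be absent:
print(alternates(["C", "C"]))  # False: the alternation is broken
```

On this view, directly embedding a phase head in itself immediately violates the P-N-P-N pattern, whereas the mediated sequences in (2a-b) respect it.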
The opposite of this strategy is to adopt the view of cartographic approaches (e.g., Rizzi 2004) and identify a sequence of up to several dozen functional projections potentially appearing within each phase. Each projection bears a certain category, which specifies the projection’s contribution to the reference and the embedding in discourse of the lexical head in question. According to the rich empirical research conducted in this program, however, within a given sequence of functional projections, each category appears at most once—with a few potential exceptions.7
A third possible strategy lies somewhere between the two extremes just sketched. It would allow for more than one functional projection per phase, yet without a cartographic expansion. An approach like this, less sensitive to the particularities of each individual projection, could use the same label, hence assign the same category, to several or even all of the projections within the domain of one phase. However, this view is not able to capture the empirical base of our inquiry: typical instances of type recursion such as relative and complement clauses or possessive modification, and their relevant properties.
We conclude that the place to look for productive forms of recursion is between phases: recurrence of a category mediated by a phase boundary such as C, v, or D. Before we give a principled account of such recursion in section 3, we review other nonrecursive aspects of grammar.
2.3 Intensionality Effects in Indirect, Between-Phase Embedding
The semantic significance of the category (or field) C consists among other things in the specification of illocutionary force (e.g., assertion) and possibly additional speech-act-related material. But if one syntactic object of a certain category embeds another of the same category, something curious happens. If C embeds C, the ways in which embedded C can interface with the discourse and speech context are systematically restricted.8 In particular, no clause asserted by the speaker occurs embedded: the embedded clause John left in Bill said that John left, say, merely provides a descriptive condition for an event identified in the matrix clause, namely, an event of SAYING that John left. That this description holds of the event is what is asserted by the speaker.9
On the other hand, if C does not embed, or is a root, it necessarily does carry a specification of its speech-act-related status. Consequently, every assertoric utterance, even if it contains many clauses, necessarily has only one truth value. As Davidson (2005:87) notes, “Only whole sentences have a truth value.” Though any of the embedded propositions in (5) would have assertoric force if occurring as a root, and may be understood as potential assertions by a certain grammatical subject (such as John or Mary), only (5) as a whole is asserted by the speaker.
(5) John suspected [that Mary believed [that he was a policeman]].
Moreover, whether or not the matrix clause is true does not in any way depend on whether or not Mary believed that John was a policeman, or whether or not he indeed was—in essence, Frege’s (1892/1952, 1918/1956) observation about the intensionality of embedded clauses.10 The truth value of an expression is not composed from the truth values that its constituent clauses have when occurring as independent expressions. The embedded clauses do not carry any truth values into the evaluation of the truth value of the whole expression. In this predicament, compositionality can be technically restored by redefining the semantic values of clauses as functions from possible worlds to truth values, and by defining a new operation of Intensional Semantic Composition designed for such combinations (Heim and Kratzer 1998:chap. 4). In this case, matrix and embedded clauses are assigned the same semantic value. However, the explanatory question remains unanswered: why, when clauses occur hypotactically embedded, are they systematically deficient in the sense that they are evaluated only against possible worlds rather than the actual world? This observation is essentially equivalent to the one we started out with.
Moreover, the answer to this question does not seem to lie in the semantic values we assign to lexical ingredients of the relevant constructions; rather, it seems to lie in their grammar. Evidence for this claim is that the degree of intensionality of a clause systematically changes or even disappears when we reduce the degree of syntactic “connectivity” between the embedded and the embedding XP, as when moving from clauses that occur as syntactic arguments to clauses that occur as adjuncts, as coordinated clauses, or in a paratactic construction. Hence, intensionality plausibly is a consequence of a syntactic fact: that of (indirect, phasal) recursion in the domain of argument structure.11 For illustration, consider (6). Here the same proposition, that no one shot a bullet, is expressed in three different ways: as an argument (complement of a verb, (6a)); as an adjunct (modifying a clause, (6b)); and as a clause coordinated with another one (6c). While in (6a) the proposition’s truth value has no direct effect on the truth value of the entire sentence, in (6b) it restricts the worlds in which the main clause is true (but is not asserted). In (6c) it has its own independent truth value that is subject to assertion.12
(6) a. The witness believed that no one shot a bullet.
    b. If no one shot a bullet, then the witness was right.
    c. No one shot a bullet and the witness was right.
Finally, the phenomenon in question is not restricted to matrix verbs taking clausal complements; rather, it extends to other referential elements of language under the same structural conditions, including (finite) Tense and determiners (DPs). Taking the latter first, D is commonly regarded as regulating object reference (the equivalent of force/truth in the nominal domain). Observe what happens when D cyclically embeds in D: the embedded D, though undoubtedly a referential expression, functions very differently from its embedder at the interface.
(7) a. The vase on the table was green.
    b. John’s mother plays basketball.
The sentence in (7a) by no means entails that the table is green, or that some object composed of the vase and the table is of this color. The italicized expression in (7a) clearly can only be used to deictically refer to the object in the speech context described as the vase, and not (also) to the object described by the embedded DP the table. Similarly, the sentence in (7b) entails only that the mother plays basketball; it does not entail that John does so, too. Of course, the referent of the table in (7a) needs to be computed first, so that the referent of the root is identified. But, occurring embedded in the complex DP in (7a), it is no longer independently referential once the matrix DP is computed: exactly as in the case of embedded clauses, it provides a descriptive condition or property (effectively, a location, understood as the upper surface of the table in question) for the identification of a referent computed in the matrix DP. And here again, if we replace the syntactic hypotaxis with a parataxis, an independent referent reappears (e.g., The vase, the table, were/*was green).
In sum, just as the truth does not embed in the truth, so reference does not embed in reference. Every sentence, no matter how complex, is assigned truth only once, at the root, and the same applies to DPs containing other DPs, with respect to reference. Analogous limitations on recursive embedding of referential expressions can be observed in the domain of Tense. Embedded Tenses do not function in general as matrix Tenses do, as sequence-of-tense effects illustrate (Giorgi 2010). Suppose Bill says, “John said he lied.” Then the truth of this sentence is consistent with John’s having said either “I lie” or “I lied.” Hence, the embedded Past does not fix this aspect of temporal interpretation, as it does when it occurs unembedded. Deictically, embedded Tenses function differently from matrix ones, by interacting with the speech context in different ways.
In conclusion, in a number of different domains, united under some such label as deictic reference (referring to an object located in space, time, or discourse, in relation to the speaker’s deictic frame), it is necessarily the case that at any point in derivation, there can be only one item that has the relevant referentiality. There is no recursiveness in this domain in this sense. It is clear that nothing in the structure of Merge makes us expect that one XP embedded in another XP as a phasal argument will function differently at the interface from the embedding XP. On the contrary, if we wed the idea of Merge to the idea of semantic compositionality, then the embedded XP in fact should preserve its semantic interpretation independently of what other syntactic object it embeds in (otherwise, the semantic value of the whole could not be strictly composed from the semantic values of its parts, and the meaning of the parts would depend on their grammatical relations).13 Formally, no Merge object is defined through its relations with any other syntactic object. Consequently, Merge-based explanations of intensionality will have to appeal to specific semantic types of certain words and novel operations of semantic composition applying to such words only. On the model we will present in section 3, the fact just noted is an immediate consequence of the architectural fact that at any point in derivation, the computational system deals with a single phase, and the phase is a unit of deictic/referential significance.
2.4 Section Summary
By and large, grammar appears to ban direct embedding of categories in themselves. The forms of recursion that exist tend to be mediated by phase boundaries. No type-recursive structure (i.e., no structure with two nodes of the same category along the same projection line) is derived within a single phase. It is therefore the phase (or the cycle) that is instrumental in the genesis of the forms of recursion that occur (in hypotactic contexts). However, when we look at these “indirect” forms of recursion in the realm of argument structure, we see that they are never strict: it is never the case that the embedding category behaves like the embedded one in regard to referential or extensional semantic interpretation, even if we assign the same formal label to both. A principled account of the structure-building process is needed to make sense of this phenomenon—that is, of why recursion is subject to certain restrictions, none of which obtain in the world of adjuncts, coordination, and parataxis, which is unrestricted and lacks precisely the intensionality effects we have noted. We now argue that contrary to a widespread perception, the phasal architecture of grammar developed in Chomsky 2008—with a twist in regard to how a phase is understood as a unit of deixis—provides exactly the elements we need to make sense of the data presented above.
3 Recursion in Single-Cycle Derivation
3.1 Merge and “Interface Explanations”
As anticipated above, a Merge-based view of narrow syntax is often wedded to strong assumptions about the demands imposed by the semantic interface. However, the linguistic specificity of a Merge-based system cannot come from systems that are, by definition, not linguistically specific. Moreover, a problem with any view that maximizes the explanatory contribution of the C-I systems is the absence of evidence that nonhuman animal thought exhibits the relevant forms of structural semantic complexity. For example, suppose, with Chomsky (2001), that the grammar has a phasal organization and that phases are “propositional” configurations, because the C-I system is propositionally structured and narrow syntax is therefore constrained to deliver propositional configurations at the interface, explaining why phase boundaries lie where they do. Then we ipso facto assume that propositionality in thought is available in the absence of specific grammatical configurations that have a propositional meaning and are truth-evaluable (in essence, CPs). Effectively, we assume a full generative system outside of the language faculty, which generates more or less what narrow syntax generates: sentential configurations.
The comparative literature provides strong evidence to the contrary (Terrace 2005, Penn, Holyoak, and Povinelli 2008, Tomasello 2008). Overall, while nonhuman animals may share forms of symbolic representation covarying with environmental features in ways necessary to their survival (Gallistel 2009), they do not seem to think like us, or have a concept of thought, allowing them to think thoughts about thoughts, of the sort that a recursive structure encodes. The primitives of their symbolic representations also do not function as words do, and they do not point or refer deictically. Grammar thus makes a difference. A principled account of what the intrusion of a grammar into the brain adds to such modes of thought is therefore needed, as it would also shed more light on the precise requirements of the interfaces. Chomsky (2008: 141) argues that “C-I incorporates a dual semantics, with generalized argument structure as one component, the other one being discourse-related and scopal properties. Language seeks to satisfy the duality in the optimal way.” No evidence for this hypothesis is provided, and prospects for finding evidence that such a semantics—in essence, all of propositional thought as it is manifest in the semantics of human languages—is present in nonlinguistic systems of thought appear slim. Clearly, the alternative proposal, that the relevant forms of thought come from grammar (rather than externally constraining it), should be explored. Indeed, the grammaticalization of the brain remains a plausible explanation for the Aurignacian revolution and the entirely novel mode of thought it reflected, if not for the human speciation event around 150,000 years ago itself (Crow 2004, 2008).
In the following section, we precisely specify the interface constraints that we argue have led to the cyclic design of derivations. As we aim to show, these are constraints that reflect the workings of grammar itself and are not external to it (Hinzen 2006, 2007). Grammar, by inventing the cycle, has fundamentally changed the format of human thought.
3.2 Deriving Recursion in a Single-Cycle Syntax
With hierarchy at its disposal, a system of thought could distinguish syntactic objects occurring in different configurational positions. Yet how these configurations will be used at the semantic interface—the form-meaning mapping—is not clear from the mere existence of a hierarchy. Concluding that the mapping is unmotivated, arbitrary, or merely conventional should be a last resort. Ideally, the organization of grammar should feed directly and systematically into grammatically specific forms of semantic organization. We argue that this is the case: grammar “carves out” semantic space. More specifically, the concept of a phase, thought of in a slightly novel way, provides exactly the right ingredients to explain the ban on direct recursion and the intensionality effects we have noted.
Nonhuman animals represent abstract entities such as numbers or numerosities via mental representations, without using words to make reference to any instances of these. Long before language, the brain has a system of tracking up to four objects (e.g., Pylyshyn 2001, Vogel, Woodman, and Luck 2001), which requires symbols for the objects being tracked. Again, though, having words for referring to these objects is a different matter, and grammar clearly vastly expands on the deictic possibilities that nonhuman animals as well as preverbal infants have. Let us think of grammar, then, as a mechanism for using concepts that have become lexicalized (i.e., words) to engage in novel acts of referring to the world, all of which are rooted in the speaker’s deictic frame (the speaker’s Self, Here, and Now at a moment of speech), while pointing arbitrarily beyond it.
A grammaticalized form of deixis arises the moment that a given percept is grammaticalized as a noun or verb. If RUN is a lexicalized root percept, say, then the noun run and the verb run will encode a crucial difference: not in what is being referred to, but in how a speaker refers to it (the mode of signification)—in this case either as an individual (a run) or as an ongoing event colocated with the point of speech, as in Mary runs. The “parts of speech,” then, once they emerge, encode a difference in how we take the external world into view: what referential perspective we adopt, in a way that does not track independently identifiable physical features of the external world.14
To grammaticalize root concepts as parts of speech is to obtain elements that as such can enter grammatical constructions. These combine the parts of speech in a restricted fashion for purposes of more complex forms of deixis that are propositional in character. It is here that the cycle, as the next evolutionary innovation in the progress toward a grammaticalized mind, comes to the fore. In any such cycle, a given lexical item will first only be a predicative root—something that can be predicated of something. Lexical items, therefore, when they are fed into a syntactic derivation, begin their grammatical life as predicates. We assume this is true for every cycle. Qua lexically stored percepts, lexical items are not predicates or arguments: the notions “predicate” and “argument” are grammatical ones, and nothing is either the one or the other before it enters a grammatical construction. A derivation then reflects the process of evaluating a given lexical head referentially by providing further structure that allows a referent to be determined for it at the syntax-discourse interface: this old man, all men, a man, man (interpreted as kind-referring, as in Man comes from Africa), and so on. When this process is finished, the predicate has become a syntactic argument: it is referentially complete. We suggest that this happens at phase boundaries: they are the points in the derivation where the process of referentially evaluating a given predicative head becomes complete and an argument is created. It is complete when the descriptive information within a head is sufficient in a given discourse context for a hearer to identify the intended referent within the speaker’s deictic frame.
A phase, therefore, is a unit of deixis. Different phases correspond to different forms of deixis. This revises the original intuition behind the notion of a phase, yet stays close to its initial motivation. The motivation was that language is efficiently organized around certain units of computation that exhibit a certain measure of “completeness” (Chomsky 2001). In Chomsky’s terms, mentioned above, they are “propositional” in some sense. Yet DP arguably is a phase, and it is not propositional. In fact, vP, an even more paradigmatic phase, is not propositional either, since no vP can be evaluated as true or false. Needed for that is, minimally, Finiteness (deictic localization in discourse), which is specified higher than vP. While vP is therefore something that, contingent on combination with other phases, can become a fully extensional unit of deictic significance, it does not have that completeness in and of itself. Unlike the notion of a proposition, deixis directly connects with language use.
From the internal structure of the phases, it is clear what the relevant forms of deixis must be. As noted, nominal phases (deverbal nominalizations aside) do not take dependents and can only deictically localize a lexical head in space. Nominal denotations are thus not ordered in relation to the speech event (as going on while it takes place or as having ended before it, for example). Hence, their reference can only be an “individual” (assuming it is in the nature of individuals that they lack participants/dependents and do not proceed, happen, or last). The denotation of a verbal phase (vP), in turn, can only be an event. Yet it will always embed at least one object (the product of a nominal phase) and presuppose a denotation of that sort: any event has participants. Again it is clear that the split into a nominal and a verbal configuration encodes a difference, not in what these configurations refer to, but in how they refer: a difference in deixis.
Given such referential dependence and the asymmetry of the relations of presupposition between CP, vP, and DP, and given the same dependence and asymmetry between their conceptual correlates (propositions, events/states, and individuals, respectively), it transpires that the phases are ordered with regard to one another as parts are to wholes. There are essentially no choices here—either in the progression of functional categories internal to a phase in a given language (or even universally, if Cinque (2005) is right), or in how phases are ordered with respect to one another (no proposition occurs in the absence of an eventuality on which it is based).15 The entire architecture, therefore, is essentially templatic. The phases, and their asymmetrical hierarchical dependencies, are those templates.
Now, how does this phasal architecture give rise to recursion and indeed to the restricted forms of recursion we find? The answer is now fairly immediate. Phases are defined to be (a) units of computational and derivational memory, and (b) units of interpretation involving access to the interfaces. Point (a) entails that no such unit of interpretation contains another such unit—a situation that would erase the advantage of computational efficiency that phases were meant to provide. Ipso facto, there will be no phase-internal recursion of phases. If the recursions are based upon phasal units, and such units do not contain other such units, such recursion must be absent. But point (a) also entails that upon completion, such units cannot stay as wholes in the derivation: transfer of structure needs to take place; otherwise, the phase cannot close. But we know from point (b) that some semantic value needs to be determined at the boundaries of these units (they are units of deixis on the view taken here, and evidence exists independently of this specific view that all three phases involve access to the syntax-discourse interface; see Aboh 2004 for D; Jayaseelan 2008 for v). We know, moreover, that in many instances such a determination will necessarily fail to be fully extensional. As noted, this failure is immediate in the case of the v and D phasal heads; furthermore, within D, an extensional semantic value may fail to be reached, as in the case of nondefinite and nonspecific DPs. But C, too, often lacks elements that are needed to obtain a truth value, as in the case of nonfinite clauses, exceptional Case marking, and the like. Whenever such a failure takes place, the derivation, designed to yield referential completeness, cannot stop, as it would do if the whole phase were transferred, causing the derivation to terminate. 
In case the derivation continues, the head of the lower phase will have to remain in the workspace when the next phase is started, along with whatever material now appears at its left edge: all the material whose being transferred in the lower phase would have led to a crash, since it needs to play a role in the derivation higher up. The lower phase head will be the only element of the lower phase that the computational system now “sees.” The lower phase, as such, is gone. Therefore, at any one point in the derivation, there is only one phase.
The mechanism of phasal composition is depicted in (8). Two phases are shown, each consisting of a phase head P and an associated nonphase head N. LE is the left edge. Upon transfer of N in the first phase, LE becomes part of the second phase.
In this derivation, the lower phase head is interpreted twice: at the lower phase boundary, to determine a discourse referent (closed rectangle); and as a part of the next phase (open rectangle). But in that next phase, it can only function as a predicate that becomes part of a descriptive condition designed to identify the referent of that phase. If the lower P is D, say, as in this man, the referent computed in the higher phase may be mapped from v; that is, it will be an event, of which this man will then be a (thematic) predicate. The event might be identified as one of killing this man. Or consider our earlier example, repeated here:
(9) The vase on the table was green.
Inside the nominal phase, the table first determines a discourse referent at its own phase boundary—a particular familiar table. Then it is reintensionalized through the preposition on, now specifying a location (specifically, a surface) that serves to identify the referent of the matrix phase, the vase.
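The two-step interpretation just described—fix a referent at the lower phase boundary, then reuse the head as a descriptive predicate in the next phase—can be caricatured in a few lines of code. This is purely our own expository toy, not part of the article's formal apparatus; all names (`close_phase`, `reintensionalize`, the toy discourse model) are invented for illustration.

```python
# Toy model of phasal composition for "the vase on the table".
# A phase boundary referentially evaluates a predicative head against
# the discourse context; the resulting referent then survives only as
# a descriptive predicate of the next phase's referent.

discourse = {
    "table": "table#1",   # a familiar table already in the discourse
    "vase": "vase#7",     # the vase the speaker intends
}

def close_phase(head, context):
    """Phase boundary: evaluate the head deictically, returning a
    discourse referent. Phase-internal structure is transferred and is
    no longer visible to the computation."""
    return context[head]

def reintensionalize(referent, relation):
    """The lower phase head stays behind only as a predicate, here
    mediated by a relation such as 'on' (a locative condition)."""
    return lambda x: (relation, x, referent)

# First close the embedded nominal phase ("the table")...
table_ref = close_phase("table", discourse)
# ...then turn its referent into a locative predicate of the matrix phase.
on_the_table = reintensionalize(table_ref, "on")
# The matrix referent ("the vase") is identified partly via that
# descriptive condition contributed by the now-predicative lower phase.
vase_ref = close_phase("vase", discourse)
print(on_the_table(vase_ref))  # ('on', 'vase#7', 'table#1')
```

The point the sketch makes concrete is that after `close_phase` runs, the lower phase contributes nothing to later computation except the predicate built from its referent.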
Note, then, how and why there is no recursion within narrow syntax itself. In narrow syntax, there is only ever one cycle, as (8) directly suggests: at any one point, there is only ever one rectangle. When a lower phase is spelled out, the structure inside the closed rectangle in (8) disappears, and the only remaining structure is again one cycle, depicted by the open rectangle. Generation only ever sees a limited, cycle-internal structural configuration. No more than one “active” or projecting head can ever be present in the derivational workspace at any one time. Bound to the single phase in this fashion, the computational system (narrow syntax) never computes over a recursive structure consisting of whole cycles.
It also follows that no embedded argument can ever be interpreted solely referentially (as opposed to predicatively); and this model explains why there also can only ever be a single predicate, and a single referent, within a phase. Any argument contributes a descriptive predicate in regard to the referent of the phase it becomes part of, playing a thematic role in relation to it. There are a restricted number of such roles that nominals can play in events, or that nominals can play in other nominals (such as possessor or location constructions). If the information contributed by an embedded argument that plays a further role in the derivation is always descriptive, it is also always (to some degree) “intensional”: on this model, no argument can ever be fully extensional, as part of a larger expression. This directly speaks to (neo-)Davidsonian event representations and makes syntactic sense of them: the event variable establishes reference via its predicate, which describes a particular referent (in this case an event, owing to the nature of the verbal phase); each argument specifying a participant in this event is embedded through a two-place predicate specifying its thematic role, effectively turning the argument into a predicate with respect to the event argument.16
(10) a. John killed Bill.
     b. ∃e. before(e, R) & kill(e) & agent(e, John) & patient(e, Bill)
     c. [VP . . . [AgentP [John] agent . . . [PatientP [Bill] patient]]]
In syntactic terms, as in the simplified representation in (10c), an argument can merge in a VP only if the structure contains a head that specifies a certain thematic role for its specifier, thus turning the (potentially referential) argument in its specifier into a predicate. If there is no specification of how an argument should be turned into a predicate, the argument cannot be merged. This allows us to explain selection in terms of a more fundamental restriction: only “intensional(ized)” expressions can be subject to syntactic structure building within a phase. Full extensionality is achieved only when the last phase is reached, given the inherent limitations of nouns and verbs in establishing reference. Even a proper name, which is the maximally extensional form of reference in the nominal case and will identify a discourse referent, needs to be reintensionalized (turned into a predicate). In other cases, the embedded phase will need to get relativized, or a preposition will attach to it. Other defective phases (e.g., nonfinite clauses) never become fully extensional: they cannot anchor in time or discourse.
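The neo-Davidsonian representation above lends itself to a direct computational rendering, which makes the "arguments as event predicates" point concrete. This is a hypothetical sketch of our own (the event records and the `holds` helper are invented for exposition), not anything proposed in the text.

```python
# ∃e. kill(e) & agent(e, John) & patient(e, Bill), evaluated over a toy
# event model: each argument enters via a two-place thematic predicate
# of the event variable, i.e., as a predicate rather than a bare referent.

events = [
    {"type": "kill", "agent": "John", "patient": "Bill"},
    {"type": "run", "agent": "Mary"},
]

def holds(e, pred, *args):
    """One-place event predicate (kill(e)) or two-place thematic
    predicate (agent(e, John)) checked against a toy event record."""
    if not args:
        return e["type"] == pred
    return e.get(pred) == args[0]

# Existential closure over the event variable: the arguments John and
# Bill figure only inside descriptive conditions on e.
result = any(
    holds(e, "kill") and holds(e, "agent", "John") and holds(e, "patient", "Bill")
    for e in events
)
print(result)  # True
```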
It is also immediately clear why truth values, once assigned, cannot recur or embed—why they cannot become arguments. Recursion in grammar is organized phasally, phases are units of deixis, and pointing to a truth is the most extensional form of reference that human language achieves. This is equivalent to saying that Transfer is now complete: it now necessarily includes the phasal head, and the derivation is gone (i.e., transferred to the interfaces).17
The above model provides a grammatical explanation of why indirect recursion goes with intensionality effects, and why recursion is indirect: why v interleaves, say, between C and another occurrence of C. Phases are units of deictic reference, and reference to a unit of discourse such as a proposition presupposes prior reference to an event on which the proposition is based. It is necessarily the case, then, that before a maximal form of deixis is reached (as in the right kind of specification of C), all weaker forms of deixis that this form of deixis architecturally presupposes will need to be computed. In understanding the recursive architecture of grammar, the instrumental factor is how grammar is organized around deictic distinctions. These reflect forms of reference that human grammar generates or enables, including a uniquely “propositional” form of positioning ourselves toward the world. From this point of view, to posit propositions outside of grammar (in the “C-I systems”) is to deprive us of the very tool that provides us with such structures.
In summary, while it looks on the surface as though cycles recursively embed in one another, this is in fact not the case in the phasal architecture, and it goes against the very conception of single-phase derivation in the sense of Chomsky 2008. In line with the Strong Minimalist Thesis, the recursions the grammar creates are not purposeless; rather, they follow a scheme for determining increasingly complex forms of language use in communication and discourse. Each cycle allows the speaker to perform an act of reference, after which the cycle is eliminated from the process of generation, while the head that corresponds to this referent stays behind and contributes a descriptive predicate to the next phase.
In a biolinguistic perspective, it is necessary to explain the specific forms of recursion that occur in the domain of language, and their inherent restrictions. We have shown that “directly” or immediately recursive structures do not seem to exist in human language. “Indirectly” recursive structures do exist but are mediated by phasing and semantic access; intensionality systematically arises in them, and extensional forms of semantic evaluation are characterized by nonrecurrence within a derivation. We have argued that these facts must find a grammatical explanation. One may try to provide this explanation by combining a generic operation Merge with language-specific restrictions on the structure-building process. The latter must then explain the cyclic design of grammar. Here, standard minimalist conceptions invoke phases, and these are supposed to have an interface rationale—that is, to derive from conditions imposed on grammar by the C-I systems. In this article, we have suggested that forms of semantic complexity corresponding to the cycle in grammar are precisely what the grammar contributes to the organization of the C-I systems, reorganizing them in ways that explain a uniquely grammatical form of thought unavailable lexically or on any nonlinguistic side of the interface.
On this model, the cycle is a basic operation of the grammar itself. One could decompose it into smaller units, which the operation Merge then composes into the cycle. However, cycles thought of as phases are unified and single units of composition, and whatever parts they contain are not independently interpreted: the phase is interpreted only as a whole and acquires its referential semantic identity at the root. The proposal also leaves unexplained why cycles, with a specific deictic significance, arise. One could take it as a brute fact that some lexical heads are phasal ones that trigger internal Merge. But the lexicon, which we assume consists only of unspecified roots, is not the place where phases belong. It makes sense, therefore, that the basic unit of grammatical organization is the phase, not the phrase.
Phases are rigid and templatic progressions of functional projections from a given lexical head, which systematically regulate what form of reference is obtained at their edge. Given the templatic and finite nature of progressions of functional projections emanating from lexical heads, it makes little sense to speak of processes of “selection” within a phase (see also Chametzky 2003). If C presupposes T (or contains it), there is no point in saying that it “selects” it: it would be equally strange to say that the number 3, by containing or being built from the number 2, “selects” it. Even between phases, the operation of selection makes little conceptual sense, given that phases themselves stand in asymmetrical relationships of presupposition, reflecting a conceptual directionality. One could try to account for phases merely in terms of computational efficiency, but as we have argued, this leaves out the essential semantic contribution that phases make: they systematically organize forms of deixis, all apparently unavailable in nonlinguistic beings, or lexically.
We have therefore proposed that the basic unit of grammatical organization is the single phase. Given different lexical specifications in its initial head, a phase generates different forms of deixis and automatically, as well, the forms of recursion found in human language. Being confined to the single phase, the computational system can never compute recursively over two phases at the same time. Given hierarchical relations and semantic relations of presupposition between the phases, and the general purpose of grammatical generation (ultimately reference to a truth value), a phase will never embed a phase of the same category. Given the way a phase can become part of another phase (namely, in virtue of its head and periphery figuring as a predicate in the next phase), intensionality effects in embedded contexts are predicted.
Funding from the following grants is gratefully acknowledged: The Origins of Truth (NWO 360-20-150), Un-Cartesian Linguistics (AHRC/DFG, AH/H50009X/1), Natural Language Ontology and the Semantic Representation of Abstract Objects (MICINN, FFI2010-15006, and JCI-2008-2699). We are also deeply obliged to Eric Reuland for his continued interest in this project.
1 This appears to be the focus of discussions of recursion in linguistics (for different senses of recursion, see Fitch 2010). Thus, Everett’s claim that Pirahã is not recursive is based on the observation that Pirahã is “the only language known without embedding (putting one phrase inside another of the same type or lower level, e.g. noun phrases in noun phrases, sentences in sentences, etc.)” (2005:622).
2 This is consistent with Fitch’s (2010:78) suggestion that only if we “combine intuitions about whether a string is grammatical or not, and whether the units are of the same type of referent” (our emphasis) can we approach some sort of empirical test for whether recursion in the sense of self-embedding is really present in language. Fitch suggests “us[ing] the correct [semantic] interpretation to probe the underlying rule system.”
4 Note that in a system like Di Sciullo’s (2005), each of these rounds of embedding would probably be followed by a morphological phase; this is consistent with what we propose below.
5 In what follows, we use the notion “deictic reference” for any nonpredicative and nonmodifying uses of expressions, insofar as these involve access to the discourse and establish relations between objects referred to and the deictic center of the speaker (the speaker’s Here, Now, and Self: see Bühler 1934). This includes quantificational uses insofar as these involve the projection of a functional layer, as well as discourse and presupposition relations. Each phase has an inherent—and inherently limited—deictic potential: for example, a nominal phase can maximally refer to a specific individual object; it can never refer to a truth.
7 One exception is the possibility that functional projections of Topic (and perhaps also Focus) may appear more than once within the C domain (Rizzi 1997, and much subsequent work). Even in such a case, the number of recurring instances of the category Topic on the projection line within one phase would be strictly limited: there can be at most two of them; moreover, they are argued to attract different kinds of topics (e.g., Westergaard 2007), hence corresponding to closely related but still not identical categories. Another exception is the appearance of two PP projections within an expanded functional sequence of PPs—for instance, Den Dikken’s (2009) locational and directional PPs, [PPDir . . . [PPLoc . . . ]]. But the above arguments apply here as well.
8 Although typical root phenomena are known to occur in embedded contexts (Heycock 2006), they are systematically restricted and can disappear completely: for example, after matrix verbs of the regret/resent type, as in *John resents that THIS BOOK Mary chose (cf. Haegeman and Ürögdi 2010:122). The model we propose here explains this asymmetry between interpretations at the root level and at nonroot levels: root-level phenomena are the properties of fully extensionalized syntactic cycles.
9 In a factive construction like John regrets that he stole the money, the speaker refers to a fact—the fact that he stole the money—that is presupposed to obtain. But the proposition that he stole the money is not asserted in this speech act. Rather, the existence of the fact in question is a felicity condition on the speech act as a whole: where no such fact obtains, the regret in question is pointless. A factive clause in this sense is an instance of a referential expression, as on the accounts of both Haegeman and Ürögdi (2010) and Sheehan and Hinzen (2011). Other factive complements allow root phenomena; however, the degree of extensionality always falls short of reaching full assertoric Force, as long as the embedded clause is a syntactic argument rather than a conjoined adjunct or paratactic constituent.
11 An anonymous reviewer notes that embedded clauses do sometimes show certain root-level effects, such as the relative first person in (i):

(i) J̌on J̌әgna nә-ññ y
    John hero be.PF-1sO 3M.say-AUX.3M
    ‘John_i says that he_i is a hero.’
    (Lit.: ‘John_i says that I_i am a hero.’)
This observation in fact reinforces the point made above: a first person pronoun can never be bound by anyone but the speaker in root clauses, unlike what happens in embedded contexts such as (i).
11 Another reason for the same conclusion is that the lexical semantics of a matrix verb such as believes is consistent with the absence of intensionality in its complement. That is, believes could mean whatever it does, lexically, and Bob believes she is wrong could still mean ‘She is wrong and Bob believes so’. This, obviously, is the wrong semantics, but the point is that the lexical meaning of believes does not seem to be the reason.
12 Other forms of syntactic dependency are those between tags or evidentials and main clauses. These leave the assertoric force and truth value of the main clauses essentially unaffected—exactly as one would predict on the basis of the assumption that neither tags nor evidentials take main clauses as syntactic arguments.
14 The external data could be the same, yet on one occasion we use a noun, on another a verb. So a form of stimulus independence arises together with this grammatical “trick”—but not with Merge, which has no intrinsic connection to a creative use of language at all, as it could obviously be implemented in a computer or a robot.
15 While there are essentially no choices, there can be some delay: a v can take another v, before it takes an N, say.
16 In this sense, phase boundaries are syntactic correlates of generalized Davidsonian arguments: an individual, event, or proposition variable is introduced at each phase boundary and is sent to the discourse interface to identify a referent, on the basis of the description specified by the predicate it is assigned.
17 Some further elements of structure can be added at this point, such as tags or evidentials, but none of these adds any compositional content to the proposition already configured, none is recursive, none creates an argument structure, and none changes the truth value already assigned.