This article argues that there can only be one chunk-defining device in grammar: a theory cannot afford to have the same work done twice, once by phases, a second time by prosodic constituency. As it stands, however, phase theory is unable to describe all phonologically relevant chunks; these are too small and too diverse to be delineated. To qualify as the only chunk-defining device in grammar, phase theory therefore needs to be made more flexible—that is, to be adapted to the demands of phonology. To allow phase theory to describe all phonologically relevant chunks, we propose the separation of the Spell-Out operation from the Phase Impenetrability Condition (PIC). When Spell-Out occurs, every access point may or may not be associated with a PIC at PF, and the same optional endowment with a PIC holds for syntax. This is what we call Modular PIC. Empirically, on the basis of Abruzzese raddoppiamento fonosintattico and data from Bantu, we show that PIC effects in syntax and phonology are entirely independent: a given Spell-Out operation may leave traces in both modules, in either one, or in neither.
Domains of phonological computation (i.e., strings that are computed by phonology in one go) can be identified in two ways: derivationally and representationally. In earlier models of the architecture of grammar, the derivational way of identifying domains is carried out by cyclic derivation (the transformational cycle in Chomsky and Halle’s (1968) Sound Pattern of English, henceforth SPE) or by levels in Lexical Phonology (Pesetsky 1979, Kiparsky 1982 and later works); under current theories, it is achieved through strata in Stratal Optimality Theory (Kiparsky 2000, Bermúdez-Otero to appear) or phases (Chomsky 2000 and later works). In the representational approach, phonologically relevant chunks are expressed as juncture phonemes in structuralism, as hash marks (#) in SPE, and, since the early 1980s, as prosodic constituency (Prosodic Word, etc.) (Selkirk 1981, 1984, Nespor and Vogel 1986).
Since SPE, both ways of defining chunks have been considered to be compatible, and since the development of Prosodic Phonology in the early 1980s, the division of labor has been roughly defined by word size: cycles determine domains below the word level, while prosodic constituents delineate domains of word size or larger. That is, roughly, Lexical Phonology can handle strings of morphemes but cannot identify larger units because postlexical phonology is noncyclic (Kiparsky 1982:131). Strings of words are therefore structured by prosodic constituency. This complementary distribution of competences is made explicit by Hayes (1989:207), among others.1
If phase theory is on the right track, this division of labor must be wrong: the very essence of phase theory is to define items that are bigger than the word, and to send them to PF (and LF). Against this backdrop, our first goal is to show that there can only be one chunk-defining device: a theory cannot afford to have the same work done twice. We argue that this unification must be in favor of the derivational mechanism: phase theory has independent syntactic motivation, while prosodic constituency on the phonological side does not.
Our second goal follows from this idea. As it stands, phase theory is unable to describe all phonologically relevant chunks, as these are too small and too diverse to be delineated in the current system. To qualify as the only chunk-defining device in grammar, phase theory needs to be made more flexible; in other words, it needs to be adapted to the demands of phonology.
Both goals are Minimalist in kind: parallel and competing grammatical devices are shrunk into one, and a central piece of current syntactic thinking, phase theory, is adapted to interface conditions. In sum, the existence of phase theory triggers a substantial modification of the phonological landscape (prosodic constituency has to go), and this result then circles back to syntax to require amending phase theory itself. This back and forth is expected in an interface-driven environment. It also arbitrates between competing views in one module by bringing the properties of another module to bear (intermodular argumentation; Scheer 2008, 2009).
To allow phase theory to describe all phonologically relevant stretches of the linear string, we propose separating the Spell-Out operation from the Phase Impenetrability Condition (PIC). The way phase theory currently works, the two necessarily cooccur. In the amended version outlined below, there is a language-specific set of phase heads, which we call the phase skeleton. When Spell-Out occurs, every individual access point may or may not be associated with a PIC at PF, and the same holds for syntax.
As a matter of fact, Spell-Out itself does not leave any trace in phonology or syntax; it is only when it is endowed with a freezing effect that distinguishes ‘‘old’’ (already computed) from ‘‘new’’ (not yet computed) strings that an opacity effect is observed. This freezing effect is brought about by the PIC. It follows that the system is bicompositional and in principle allows for Spell-Out to occur vacuously—in other words, without enforcing the PIC. That is, it is possible for Spell-Out not to leave any footprint. It remains true, however, that the (phonological and syntactic) opacity effects of cyclic derivation are necessarily caused by a Spell-Out operation. We show in section 3.2 that a great many situations in phonology correspond to this description and that PF-neutral Spell-Out is actually the unmarked case. In section 3.3, we also recall that the absence of any footprint left by Spell-Out in either syntax or phonology is quite a trivial situation that is encountered under current assumptions.
We refer to this modified version of phase theory as Modular PIC because it allows for the PIC to produce an effect in one module, but not in another. The module-specificity of the PIC is illustrated by the threefold pattern of the Abruzzese data discussed in section 4: a PIC effect on raddoppiamento fonosintattico (RF) occurs in both syntax and phonology (transitive actives), or in neither (passives), or in phonology but not in syntax (unaccusatives). The fourth logical possibility is that a PIC effect constrains syntax at some access point, but not phonology. An example of this is English t-flapping, which operates across any word boundary, including vP (see section 3.2).
The second set of data, discussed in section 5, comes from Bantu. Cheng and Downing (2007, 2009, 2012) use relevant configurations to demonstrate that phase theory as it stands is unable to delineate some phonologically relevant chunks. We agree, but instead of interpreting this as motivation for an extra grammatical device (prosodic constituency), as Cheng and Downing do, we show that our amended version of phase theory is able to cover the patterns observed.
The structure of the article is as follows. Section 2 provides relevant background information. Section 3 outlines Modular PIC, whose workings are introduced in section 4 on the basis of Abruzzese external sandhi. Section 5 presents and reanalyzes Cheng and Downing’s data from Bantu, and section 6 provides concluding remarks.
2.1 Modularity Requires Translation: Representational and Derivational Incarnations
Translation, or mapping, is a necessary consequence of modularity—namely, the idea that the mind and grammar are organized in a number of distinct computational units, each of which works with a domain-specific vocabulary (Fodor 1983). In the generative tradition, the modular architecture of grammar manifests itself as the inverted T model (Chomsky 1965:15–18). On modular assumptions, there is no way in which phonological computation could understand, parse, or process morphosyntactic vocabulary (e.g., adjunct). This is because every computational system works with a specific vocabulary and hence cannot understand or parse any other. In cognitive science, the symbolic nature of computation is called domain specificity (e.g., Fodor 2000, Gerrans 2002).
Since the nineteenth century, translation in the modular sense has been achieved representationally. Objects were inserted into phonological representations that were nonphonological in nature but carried morphosyntactic information: structuralist juncture phonemes, SPE-type diacritics (#, +), and finally prosodic constituency (Selkirk 1981, 1984). Derivational translation, a genuinely generative contribution to linguistic thinking, was introduced by Chomsky, Halle, and Lukoff (1956:75). Derivationally defined cycles have been given various names: the transformational cycle in SPE, the phonological cycle in the 1970s (Mascaró 1976), levels in Lexical Phonology (Kiparsky 1982 and later works), and finally phases today.
On the representational side, the central claim of Prosodic Phonology in the 1980s was Indirect Reference (i.e., the implementation of the modular requirement for translation): namely, computational instructions in phonology (rules or constraints) must not make reference to any morphosyntactic categories (‘‘dative,’’ ‘‘3rd person,’’ ‘‘adjunct,’’ etc.). A corollary of Indirect Reference was nonisomorphism (Selkirk 1981:138, Nespor and Vogel 1986): following the idea implemented in SPE’s readjustment component, the output of morphosyntax is sometimes not ready to be used as the input to phonology and therefore needs to be readjusted. The fact that morphosyntactic and phonological structure may not coincide has been established in the literature since Chomsky and Halle 1968:371–372, where the oft-quoted cat-rat-cheese example comes from (see Samuels 2011a, Scheer 2011:sec. 416).
2.2 Prosodic Islands, Isomorphism, and Nonisomorphism
As a reaction to syntactic phase theory, Prosodic Phonology has developed prosodic islands. The idea is that the procedural and representational means to define chunks converge: the constituents of the Prosodic Hierarchy roughly correspond to syntactically defined phases. Studies following this line of thought include Dobashi 2003, Piggott and Newell 2006, Ishihara 2007, and Kahnemuyipour 2009 (Elordieta 2008 also offers an informed survey).
Kratzer and Selkirk (2007:106), for example, propose that ‘‘the highest phrase within the spellout domain is spelled out as a prosodic major phrase’’ (emphasis in original). They assume that only CP and vP are phases and that CPs and vPs therefore correspond to major phrases on the phonological side; this equivalence should be universal. Language-specific variation in prosodic phrasing is then obtained not by the syntax-phonology mapping as it was previously (see footnote 3), but purely phonologically by ‘‘prosodic markedness constraints, which operate to produce surface prosodic structures that are more nearly phonologically ideal’’ (Kratzer and Selkirk 2007:126). This is a significant departure from a Prosodic Phonology essential: mapping becomes universal and phase-driven, while the substantial language-specific variation in prosodic phrasing (i.e., chunk definition) is achieved in the phonology by purely phonological mechanisms.
The idea that phases (which did not exist in the 1980s when Prosodic Phonology was developed) and constituents of the Prosodic Hierarchy are isomorphic may indeed seem appealing. Both delineate chunks of the linear string that serve as domains for the application of phonological processes; this is what prosodic constituency is all about.
However, the question then arises of why the chunk-defining job should be duplicated: if chunks can be defined by phases alone, then what is the purpose of prosodic constituents? It should also be noted that the position of a prosodic islands theory is exactly the reverse of both the SPE and regular Prosodic Phonology positions in claiming that morphosyntactic chunking (phases) and phonologically relevant domains (prosodic constituency) are isomorphic: as mentioned above, nonisomorphism was a central claim of Prosodic Phonology in the 1980s and 1990s.
2.3 Unlike Prosodic Constituents, Phases Have Independent Syntactic Motivation
The definition of phases has varied over the years: Chomsky characterizes them as propositional (Chomsky 2000), in terms of lexical subarrays (Chomsky 2000), as an escape hatch for movement (Chomsky 2004), or as domains for the valuation of unvalued features (Chomsky 2008). A common ground, though, is the relevance of phases for the cyclicity of movement. The relevance of phases has also recently been discussed in relation to interface conditions, with the specific proposal that phases have independent syntactic motivation because of the requirement to immediately dispose of recently valued, previously unvalued features (Richards 2007, Gallego 2010, Uriagereka 2011).
Pak (2008), in a Distributed Morphology environment, and Samuels (2011a,b) argue that unlike phases, prosodic constituency has no syntactic import, claiming that if phase structure provides all relevant information, the Prosodic Hierarchy is redundant and needs to be eliminated (see also Seidl 2001); the Prosodic Hierarchy can be reduced to phases, but phases cannot be reduced to the Prosodic Hierarchy.
2.4 Phase Theory Needs to Be More Flexible
Below, we aim to show that phase theory can potentially do everything that the Prosodic Hierarchy can do. In other words, phase structure and the prosodic constituency that has been devised in order to define phonologically relevant chunks are always isomorphic.
Our project will fail if it can be shown that prosodic constituency carries out work in phonology that could not possibly be taken over by phase theory. This is precisely the argument against prosodic islands made by Cheng and Downing (2007, 2009, 2012) on the grounds of Bantu data, particularly in their paper titled ‘‘Prosodic Domains Do Not Match Spell-Out Domains’’ (Cheng and Downing 2012). They conclude that there are phonologically relevant domains that cannot be described by phase structure as it stands. We agree, but emphasize as it stands: to be able to take over the chunk-defining function from prosodic constituency, phase theory needs to evolve. That is, phases must be small enough to allow every phonologically relevant stretch of the linear string to be described. Phonologically relevant chunks are quite diverse across languages, and this variation is classically expressed by language-specific prosodic phrasing.3
A brief look at the evolution of phasehood in recent syntactic discussion is encouraging in the sense that it converges with the phonological demand for the definition of relatively small chunks. Chomsky’s (2000) original take on phasehood identifies C and v, and maybe D (Chomsky 2005:17–18), as phase heads. Since then there has been a constant trend toward granting phasehood to smaller and smaller chunks (Den Dikken (2007:33) and Grohmann (2007) provide an overview; see also Scheer 2011:sec. 773): the idea of a DP phase head is followed, but DP-internal phases are also argued for (Matushansky 2005). TP is controversial: while Chomsky (e.g., 2000:106, 2004:124) explicitly states that TP does not qualify as a phase head (because it is not propositional), Den Dikken (2007) points out that, according to Chomsky’s own criteria, this conclusion is far from obvious. TP is indeed assumed to act as a phase head in a growing body of literature, and nodes below TP such as Voice (Baltin 2007, Aelbrecht 2008) and AspP (Hinterhölzl 2006) are also granted phasehood. The vanishing point of the atomization of phasehood is a situation in which all nodes trigger interpretation—or, in other words, where interpretation occurs upon every application of Merge. This radical position—spell-out-as-you-merge—is defended by Samuel David Epstein and colleagues (Epstein et al. 1998, Epstein and Seely 2002, 2006).
3 Modular PIC: Spell-Out May or May Not Be Associated with a PIC
3.1 Spell-Out Only Produces Effects because of the PIC
On the phonological side, phase theory eliminates the deeply rooted idea that there are no derivationally defined chunks above the word level—in other words, that postlexical phonology is noncyclic (Kiparsky 1982). In Chomsky’s (2000) initial and most conservative incarnation of phase theory, v and C are phase heads that define Spell-Out chunks that are bigger than words. From a modular perspective, it is certainly reasonable to think of a computational system as being shaped by its input conditions. It is not reasonable, therefore, to assume that phonological computation is constantly accessed by chunks of increasing size above the word level without showing any effect of this piecemeal input. Hence, if phase theory is on the right track, postlexical phonology can hardly be noncyclic.4
Spell-Out, however, does not itself produce any effect. It only leaves a footprint because of the PIC in (1).
(1) ‘‘In a phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations.’’ (Chomsky 2000:108)
The PIC identifies the domains that are visible for syntactic computation: the phase head H and its ‘‘edge’’ (i.e., the set of its specifiers). It also identifies those items that are invisible to syntactic computation because they have been spelled out: the complement of the phase head (the TP for C, the VP for v). According to Chomsky (2012:5), there is a strict relation between the PIC and Transfer (i.e., Spell-Out): Transfer ensures that syntactic material is no longer in syntax and hence that it becomes invisible for syntactic computation.
(2) ‘‘PIC is guaranteed by Transfer to the interfaces of all information that would allow the interior to be modified by G. This principle must be defined with care—more care than in my own publications on the topic—to ensure that the interior, while not further modified, can nevertheless be interpreted in other positions (see Obata 201).’’ (Chomsky 2012:5)
Taking this remark as our starting point, we propose that Transfer is not the only way to ensure a PIC: there can be a PIC associated with a phase head that does not result in the disappearance of the relevant material from the module, in our case PF.
Note that the PIC is only the latest incarnation of what may be called ‘‘no look-back’’ devices: since Chomsky 1973, there has been a tradition of instruments that restrict the access of current computation to already computed strings. Relevant references on the phonological side include Kean 1974 and Mascaró 1976 (the phonological cycle), Kiparsky 1982 (the Strict Cycle Condition), Mohanan 1986 (levels and bracket erasure), and Kaye 1995 (analytic vs. nonanalytic domains). In each case, relevant domains are defined cyclically, but phonological effects are only the result of PIC-type restrictions on computation (see Scheer 2011:sec. 287).
Finally, observe that the formulation of Modular PIC is quite similar to the formulation of a ‘‘weak phase’’ in Chomsky 2001. According to Chomsky, passives and unaccusatives exhibit a weak v phase head: a head that is a phase from a propositional point of view, but that does not trigger the Spell-Out of its complement. This idea was developed in order to extend v to all verbs (in the Distributed Morphology ‘‘verbalizer’’ tradition) while keeping the contrast between a transitive, Burzio’s-Generalization-encoding head and a ‘‘defective’’ head, which can neither license an external argument nor assign accusative to its complement. Crucially, for Chomsky the PIC and Spell-Out occur together, which means that a weak phase head triggers neither Spell-Out nor a PIC effect. We wish to propose instead that Spell-Out and PIC-induced opacity effects are separate, and hence that Spell-Out does take place even if opacity effects are not visible.
This might help solve the issues raised by the concept of weak phase, as pointed out by Legate (2003) and subsequently by Richards (2011). According to Legate, unaccusative and passive vPs also feature a phase head. On the basis of four diagnostics (the existence of a reconstruction site at the V level in unaccusatives; licensing of negative polarity items by raised quantifiers that cannot have raised to CP but must have targeted the vP; parasitic gaps; and Nuclear Stress Rule effects on moved elements), Legate concludes that v in unaccusatives and passives is a phase head, as it defines a cyclicity domain. Building on Legate’s work, Richards argues that ‘‘weak’’ v phase heads behave exactly like transitive ones, for instance by providing a reconstruction site that is identical to those found for intermediate movement with transitive verbs, or for linearization purposes. In Richards’s view, the main problem with weak phases is that weak phase heads appear to behave exactly like strong phase heads in terms of cyclicity, while being defective for case assignment and argument licensing. Our model captures this difference by proposing that Spell-Out does take place at every phase (for a given language), while opacity can vary. In this sense, there is always a site for reconstruction because Spell-Out has always taken place. What we do not always see is the reflex of this Spell-Out at PF (and sometimes in syntax itself, as in the case of long-distance agreement or case assignment, as discussed in section 3.3).
3.2 Selective Footprints in Phonology
Modular PIC separates the Spell-Out operation from the PIC: a PIC may or may not correlate with a phase. The effect is that phases endowed with a PIC at PF will leave a phonological trace (will be visible in phonology), while bare phases with a PIC only at syntax will not. This is parallel to what is known from the interaction of morphology and phonology: some morphological boundaries are visible to the phonology (e.g., class 2 affixes in English: párent-hood, where stress is computed only over the root), while others are invisible (e.g., class 1 affixes: parént-al, where stress is computed over the entire word, which behaves as if it were monomorphemic; see section 3.4).
That there are phases that are not associated with a PIC at PF is empirically supported. If every Spell-Out operation were associated with a PIC at PF as currently assumed, a cyclic effect would be expected at all phase boundaries: material contained in the lower Spell-Out domain should be frozen, and hence phonological processes applying across such domain boundaries should be blocked. However, external sandhi across phase boundaries is a crosslinguistically unremarkable situation. A case in point is the vP in English. In this language, t-flapping is reported to operate across all word boundaries regardless of the syntactic relationship between the words (provided the /t/ is word-final and intervocalic). Some examples from descriptions of relevant American varieties (Kahn 1976, Kiparsky 1979, Kaisse 1985:25 et passim, Nespor and Vogel 1986:46, 224, et passim) are at issue, a white owl, invite Olivia, at eleven, just the other night a raccoon was spotted in our neighbourhood. Jensen (2000:208) specifically mentions a case where flapping applies across a vP boundary: a very dangerous wild cat escaped from the zoo.5
On the other hand, several scholars have discussed the matter of a PIC associated with v in syntax, especially for successive-cyclic movement or for some sort of ‘‘structural rhythm,’’ in Uriagereka’s (2011:256) terms. Richards (2007), for instance, shows that the PIC associated with v is a necessary condition imposed by the Strong Minimalist Thesis, in order for the derivation to be able to proceed (see Gallego 2010 and Uriagereka 2011 for the PIC associated with v).
Crosslinguistically, phases that do not leave any footprint in phonology are perfectly unremarkable and are in fact probably more common than those that do leave one. It should also be noted that the evolution of phase theory described in section 2.4 significantly increases the number of phases that do not have any phonological effect.
Even though Chomsky’s (2000) original idea was that phases and the PIC are inseparable, phenomena such as English t-flapping leave us with only two logical solutions: either there is no phase/PIC at vP, or there is a phase at vP, but it is not associated with a PIC on the phonological end. It is the latter option that we wish to explore: the phase skeleton (i.e., the set of phase heads) is invariable for a given language, and for each phase head, it is decided whether or not that head is endowed with a PIC at PF (and, recall from footnote 5, this decision is also specific to each phonological process). Whether or not a particular phase head is associated with a PIC is part of its lexical properties. Thus, two languages may have the same phase skeleton (i.e., identical sets of phase heads), but differ with respect to which access points are associated with a PIC at PF. This is shown in (3).
There is reason to believe that the phasehood of a functional head is language-specific (Gallego 2010). This means that every language selects its own set (possibly marshaled by some imposed and some impossible choices), which then constitutes what we call the phase skeleton. For each language, a head can thus be specified as being a phase head or not, and as inducing a PIC at PF or not. The phasehood of heads is thus a parametric choice made by languages. This is in line with the Borer-Chomsky conjecture, as Baker (2008:156) calls it, according to which ‘‘[a]ll parameters of variation are attributable to differences in the features of particular items (e.g., the functional heads) in the lexicon’’ (see also Biberauer 2008, Roberts and Holmberg 2010). Variation is encoded on functional heads, and it is specified in the lexicon. Phasehood is one of the specifications on a functional head.
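The parametrization described above can be pictured as a small data structure. The following is only an illustrative sketch, not part of the proposal itself: the class `PhaseHead`, the function `sandhi_applies`, and the two toy languages are hypothetical names introduced here. The setting for v in `language_a` mirrors what the text says about English v (a PIC in syntax but none at PF); all other values are arbitrary assumptions. The sketch also ignores the point that the PIC decision may be specific to each phonological process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseHead:
    """A functional head that triggers Spell-Out (hypothetical encoding)."""
    label: str
    pic_syntax: bool  # does Spell-Out at this head freeze material in syntax?
    pic_pf: bool      # does Spell-Out at this head freeze material at PF?


def sandhi_applies(boundary_head, skeleton):
    """Return True if an external sandhi process may apply across the
    boundary introduced by `boundary_head` in a language with `skeleton`.

    A boundary blocks sandhi only if the head is a phase head AND that
    phase head is endowed with a PIC at PF.
    """
    head = skeleton.get(boundary_head)
    if head is None:
        # Not a phase head in this language: no Spell-Out, nothing to freeze.
        return True
    return not head.pic_pf


# Two toy languages with the SAME phase skeleton (identical sets of phase
# heads) that differ only in whether v carries a PIC at PF.
# All values are assumptions made for illustration.
language_a = {
    "C": PhaseHead("C", pic_syntax=True, pic_pf=True),
    "v": PhaseHead("v", pic_syntax=True, pic_pf=False),  # like English v in the text
}
language_b = {
    "C": PhaseHead("C", pic_syntax=True, pic_pf=True),
    "v": PhaseHead("v", pic_syntax=True, pic_pf=True),
}

if __name__ == "__main__":
    for name, lang in (("A", language_a), ("B", language_b)):
        print(name, "sandhi across vP:", sandhi_applies("v", lang))
```

Under these assumptions, the two languages share a skeleton but only language A lets a process such as t-flapping apply across the vP boundary.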
In current practice, phase heads, and hence phase structure, are determined on the basis of morphosyntactic evidence alone. From the perspective of a unified interface theory in which the same mechanism defines syntactically and phonologically relevant chunks, phonological evidence for phases needs to be taken just as seriously as syntactic evidence. As mentioned previously, a situation in which a computational system is insensitive to its input conditions—that is, never marks the boundaries of its input string—appears to be implausible. Hence, if phases transport chunks between morphosyntax and phonology, it is to be expected that they leave footprints in the latter.
That syntactic phases should be informed by phonology in fact follows from the Strong Minimalist Thesis as formulated by Chomsky.
(4) ‘‘Language is an optimal solution to legibility conditions.’’ (Chomsky 2000:96)
The Strong Minimalist Thesis outlines a methodological procedure for the definition of language, which can be understood here as the core computational system of syntax. In order for an expression π to meet legibility conditions (i.e., for it to be legible at the interfaces), it must satisfy the conditions of Full Interpretation. The question is how this Full Interpretation can be granted. The answer, we propose, is that PF instructs syntax on Full Interpretation conditions for any expression π that syntax produces. Therefore, phonologically relevant chunks can and must be reflected in syntax.
3.3 PIC Footprints Are Selective in Syntax As Well
The Abruzzese data discussed in section 4 show that the reverse configuration also exists: there are cases where Spell-Out produces a phonological effect but no syntactic effect. Hence, in terms of Modular PIC, the phase head in question is endowed with a PIC at PF, but not in syntax.
As we will show, Abruzzese also illustrates the three other logical possibilities of Spell-Out – PIC (mis)match. The overall situation is shown in table 1. Heuristically, then, in an environment where Spell-Out may be vacuous, the study of a language must identify two things: (a) the phase skeleton and (b) the association of a PIC to a given phase head in syntax and phonology.6 Evidence for (b) may be found in the footprints that are left: the presence or absence of a PIC for a given phase needs to be worked out independently for each module and must be based on evidence from that module alone. Evidence for (a) comes from the combined effects of (b): whenever there is a syntactic or a phonological footprint, there must be a phase boundary (endowed with a PIC). The reverse, however, is not true: there can be phases that are vacuous.
| PIC at syntax | PIC at PF | Illustration | Phonological phenomenon |
|---|---|---|---|
| + | + | Abruzzese: transitive active v | RF |
| − | + | Abruzzese: unaccusative v | RF |
| − | − | Abruzzese: passive v | RF |
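The heuristic stated before the table (work out the PIC settings module by module, then infer the phase skeleton from their combined effects) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `infer_settings` and the dictionary encoding are hypothetical, and the only data used are the three Abruzzese rows of table 1.

```python
def infer_settings(footprints):
    """Infer PIC settings and attested phase heads from observed footprints.

    `footprints` maps a head to a pair (syntactic_footprint, pf_footprint).
    A footprint in a module is taken as evidence for a PIC in that module.
    Any footprint at all is evidence that the head is a phase head; the
    absence of footprints proves nothing, since Spell-Out may be vacuous.
    """
    pic = {}
    attested_phase_heads = set()
    for head, (syntactic, phonological) in footprints.items():
        pic[head] = {"syntax": syntactic, "PF": phonological}
        if syntactic or phonological:
            attested_phase_heads.add(head)
    return pic, attested_phase_heads


# The three Abruzzese rows of table 1, encoded as (PIC at syntax, PIC at PF).
table_1 = {
    "transitive active v": (True, True),
    "unaccusative v": (False, True),
    "passive v": (False, False),
}

if __name__ == "__main__":
    pic, attested = infer_settings(table_1)
    print(sorted(attested))
```

Note that passive v does not come out as an attested phase head here: on footprint evidence alone a vacuous phase cannot be told apart from a non-phase, which is why its phasehood has to be established on other grounds.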
Finally, note that vacuous Spell-Out (i.e., an instance that leaves no trace in either syntax or phonology) is not an innovation introduced by Modular PIC: it also occurs on many occasions in the current model when no opacity effect is encountered in either module. One example is long-distance Case/agreement assignment in restructuring infinitive constructions in Japanese and German, analyzed by Bobaljik and Wurmbrand (2005). The authors note that the case properties of the embedded complement depend on the nature of the matrix predicate. Specifically, when the matrix verb is transitive active, the object of the embedded clause is marked accusative Case; when the matrix verb is passive or unaccusative, it is marked nominative Case. Consider for example the Japanese sentences in (5) and (6). In (5), the verb ‘eat’ is not able to assign nominative Case to its object (it assigns accusative Case). In the restructuring context in (6), however, the object receives nominative Case. On the assumption that the sentence involves a biclausal structure, Bobaljik and Wurmbrand conclude that it is the matrix verb that assigns nominative Case to the object. For present purposes, this means that v has not blocked assignment of Case from the matrix verb into the embedded clause, hence that no PIC effect has been induced by a transitive v.
(5) Emi-ga ringo-o/*ringo-ga tabe-ta.
    Emi-NOM apple-ACC/*apple-NOM eat-PST
    ‘Emi ate apples.’
(6) Emi-ga ringo-ga tabe-rare-ta.
    Emi-NOM apple-NOM eat-can-PST
    ‘Emi was able to eat apples.’
According to Bobaljik and Wurmbrand, the same reasoning holds for German long passive constructions of the sort illustrated in (7), where the matrix verb, which is marked as passive, assigns nominative Case to the object die Traktoren ‘the tractors’, whereas the embedded predicate, which is not marked as passive, could not possibly have done so.
(7) weil die Traktoren zu reparieren versucht wurden
    since the tractors.NOM to repair tried were.PL
    ‘since they tried to repair the tractors’
It goes without saying that transitive v in main clause contexts does show PIC effects both in German and in Japanese, as Bobaljik and Wurmbrand (2005) also argue. Their data thus show that in specific syntactic configurations, no opacity effects are associated with a phase head.7
Other long-distance contexts in which v does not induce opacity effects are quirky subject constructions in Icelandic (e.g., Zaenen, Maling, and Thráinsson 1985, Taraldsen 1995, Sigurðsson 1996) and long-distance agreement in Hindi/Urdu, exemplified in (8).
(8) Vivek-ne [kitaab parh-nii] chaah-ii.
    Vivek-ERG book.F read-INF.F want-PFV.F.SG
    ‘Vivek wanted to read the book.’
In (8), the matrix verb ‘want’ agrees long-distance with the object of the complement clause. Once again, neither the matrix v nor the embedded v has created any opacity effects.
3.4 Modular PIC below the Word Level
While this article is concerned with strings of pieces of word size or larger, we would briefly like to show how Modular PIC may also contribute to the understanding of cyclic structure below the word level. There is a long tradition of studying this area in phonology, especially in Lexical Phonology (Kiparsky 1982 and later works). We focus on a basic and well-known phenomenon, affix-class-based effects in English, to illustrate that Modular PIC has the potential to unify the analysis of strings of morphemes and strings of words.
The idea that morphology and syntax are run by the same computational system, rather than by two distinct modules, is a cornerstone of Distributed Morphology (DM) (single-engine approach; Marantz 1997). In DM, all category-modifying heads are phase heads (e.g., Marantz 2007, Embick and Marantz 2008:6). This proposal is at odds with regular cyclic structure inside the word. Affix-class-based phenomena prompt a contrast between [parent al] (class 1 suffix) and [[parent] hood] (class 2 suffix): the former receives transparent penultimate stress (paréntal), while the latter is stressed opaquely as if there were no suffix (párenthood). Kaye (1995) takes opacity as a diagnostic for phasehood: class 2 affixes project phase heads, while class 1 affixes do not (this is parallel to the phase edge mechanism in syntax; see Scheer 2008). Therefore, [parent al] is one single interpretational domain (to which stress assignment regularly applies), while [[parent] hood] consists of two domains: [parent] is first computed on its own and receives regular penultimate stress, which is then “frozen” by the PIC and hence cannot be further modified when passing through the outer cycle. Hence, contrary to the DM view, class 1 -al does not appear to be a phase head, since it does not introduce opacity.
The DM view of phasehood imposes much stronger freezing effects below the word level than above it: every xP is a phase head and associated with a PIC. In this environment, stress shift should also not be found in, for example, órigin – orígin-al₁ – origin-ál₁-ity₁, because the PIC ought to freeze stress on the root-initial vowel. Again, class 1 suffixes do not behave like phase heads. Marvin (2002:56–58) therefore concludes that (primary) stress is an exception to the PIC, which does not apply to this particular phenomenon.
This, however, is simply a different wording of Modular PIC, except that Marvin’s conclusion rests on a local and unsystematic “exception.” Modular PIC may thus reconcile the DM outlook on cyclic structure below the word level (every xP is a phase head) with basic affix-class-based phenomena: rather than by phasehood (as Kaye (1995) has it), class 1 and class 2 affixes are opposed by inducing (class 2) or not inducing (class 1) a PIC. Both trigger Spell-Out.
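The contrast between paréntal and párenthood under Modular PIC can be sketched in a few lines of Python. This is only an illustrative toy, not an analysis of English stress: the penultimate rule, the function names, the syllable lists, and the `pic_at_pf` flag are all our assumptions, and resyllabification is ignored. Every suffix triggers Spell-Out of its sister, but only a suffix carrying a PIC freezes the stress computed there.

```python
def penultimate_stress(syllables):
    """Toy stress rule (assumption): index of the penultimate syllable."""
    return max(len(syllables) - 2, 0)


def derive_stress(root, suffixes):
    """Cyclic derivation of stress under a toy version of Modular PIC.

    root     -- list of syllables
    suffixes -- list of (syllable, pic_at_pf) pairs, innermost first
    Every suffix triggers Spell-Out of its sister; only a suffix with
    pic_at_pf=True freezes the stress computed on that sister.
    """
    word = list(root)
    stress = None
    frozen = False
    for syllable, pic_at_pf in suffixes:
        if not frozen:
            stress = penultimate_stress(word)  # Spell-Out of the sister
        if pic_at_pf:
            frozen = True  # PIC: later cycles cannot modify this stress
        word.append(syllable)
    if not frozen:
        stress = penultimate_stress(word)  # outermost cycle
    return word, stress


# Class 1 -al: Spell-Out without PIC, so stress is recomputed on the whole word.
print(derive_stress(["pa", "rent"], [("al", False)]))
# Class 2 -hood: Spell-Out with PIC, so the stress of [parent] is frozen.
print(derive_stress(["pa", "rent"], [("hood", True)]))
```

The first call returns stress index 1 (the penultimate syllable of the suffixed word), the second returns index 0 (the stress of the bare root). The only thing distinguishing the two suffixes is the lexical PIC flag.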
3.5 Active Memory Economy in Syntactic and Phonological Computation
Phases are the result of converging concepts developed over the last few decades. According to Chomsky (2012), phases reconcile locality requirements (which stem from the notion of minimal computation) with cyclicity (which is defined as “the intuition that properties of larger linguistic units depend on properties of their parts”; Chomsky 2012:1). Cyclicity serves to impede infinite computation of the same element, by rendering the element itself invisible at some point. Still following Chomsky, phases have the task of distinguishing copies of an element from repetitions of that element. Ideally, all syntactic operations should take place within a phase so that copies can be distinguished from repetitions (the information on copies is available locally within a phase). Locality, on the other hand, can be seen as identifying domains of computation, that is, domains within which syntactic operations can take place and syntactic relations hold.
Phase theory is also motivated by the Minimalist concern for (computational) economy, previously invoked in Uriagereka’s (1999) multiple Spell-Out model. Chomsky (2000:101) holds that the faculty of language has optimal design properties and operates with economy principles. This means, among other things, that computational complexity is unwarranted.
(9) “[T]here is mounting evidence that the design of FL [faculty of language] reduces computational complexity. That is no a priori requirement, but (if true) an empirical discovery, interesting and unexpected. One indication that it may be true is that principles that introduce computational complexity have repeatedly been shown to be empirically false.” (Chomsky 2001:15)
In the interface-driven Minimalist environment, then, Chomsky argues that the bias against computational complexity has an extralinguistic cause (i.e., a third-factor explanation): computation uses active memory (workbench memory), a very limited cognitive resource. In a linguistic derivation, a whole sentence is too big and too demanding in computational terms to be processed in one go. If sentences are built step by step, the burden imposed on active memory by the computation of successive pieces is reduced. Chomsky is explicit that this also holds for phonological computation.8
“The computational burden is further reduced if the phonological component too can ‘forget’ earlier stages of derivation.” (Chomsky 2001:12–13)
“If such ideas prove correct, we have a further sharpening of the choices made by FL [faculty of language] within the range of design optimization: the selected conditions reduce computational burden for narrow syntax and phonology.” (Chomsky 2001:15)
“Φ [the phonological component] is greatly simplified if it can ‘forget about’ what has been transferred to it at earlier phases; otherwise, the advantages of cyclic computation are lost. Although the assumption may be somewhat too strong, let us assume it to be basically true.” (Chomsky 2004:107–108)
The idea is thus that successive chunks of a sentence are computed and that the output of each computation is stored (and thereby “frozen in place” (Chomsky 2001:6) or “forgotten” (Chomsky 2001:12)), so that active memory is vacated for a new computational round. When all pieces of a sentence are computed, they are concatenated and pronounced.
The critical instrument for reducing computational complexity and saving active memory is the PIC: it is only when the already computed string can no longer participate in further computation that these goals are achieved.9 In the original conception whereby a PIC is associated with every phase, it does not matter whether the phase itself or the PIC is actually responsible for active memory economy. It does matter in a system with Modular PIC, since the bare occurrence of Spell-Out does not guarantee any economy; it may be vacuous. That is, in the absence of a PIC the chunk that is spelled out will be able to be further modified when the next higher phase is computed.
This is precisely the definition of a phonologically relevant chunk: chunks are defined that are relevant for phonological computation, and these may or may not coincide with chunks that are relevant for syntactic computation. Recall from section 2.1 that this is the insight of SPE’s readjustment component.
While identifying domains for reducing syntactic computation is a worthwhile enterprise, there are no studies attesting the exact computational load for different parts of structures (see footnote 8). Defining a phase as a domain of syntactic computation, without knowing exactly what this computation amounts to, leaves an important question unanswered.
This issue is addressed by Chomsky (2008), who puts forward a definition that does not follow from active memory or computational economy, but instead builds on the content of syntactic computation. On the assumption that computation is driven by the need to eliminate features from syntax that would be uninterpretable at the interface with LF and PF (Chomsky 2008 and later works), phase heads can be defined as the loci where uninterpretable features are first-merged. As such, they are the core of computation—that is, the locus from which everything departs. Uninterpretable features are merged on phase heads and then inherited by other functional projections (e.g., T), according to a feature inheritance mechanism described by Chomsky (2008). Following this line of thought, we will assume here that phase heads are defined, in syntax, as the locus of merger of uninterpretable features. In addition, phase heads may be identified by PF mapping, as we will show.
The definition of phase heads as functional elements hosting uninterpretable features, though seemingly flexible, is not unconstrained. Features are not uninterpretable or interpretable tout court; rather, they may be so depending on the item on which they appear.
ϕ-features are, for instance, uninterpretable on “verbal” heads, like T or v, but interpretable on “nominal” heads, like N (and consequently on NPs). According to Chomsky (1995:277), “Among the Interpretable features are categorial features and the ϕ-features on nominals.” Tense, aspect, and mood features are instead interpretable on T. This is, according to Chomsky (1995), precisely what drives Agree: the need for uninterpretable ϕ-features on T to become interpretable before the interface level is reached. In this sense, the definition of what can be a phase head is, although linked to lexical categories, deterministic rather than random.
3.6 (A)symmetric Spell-Out
A basic (if often tacit) assumption of phase theory is that LF and PF phases are always concomitant: when a given node is spelled out, its content is sent to, and interpreted at, both LF and PF. Phase theory would be significantly weakened if it turned out that a given node could be independently spelled out at LF and PF. Chomsky (2004) is explicit in this regard.
(11) “Assume that all three components are cyclic. . . . In the worst case, the three cycles are independent; the best case is that there is a single cycle only. Assume that to be true. Then Φ [the phonological component] and Σ [the semantic component] apply to units constructed by NS [narrow syntax], and the three components of the derivation of 〈PHON, SEM〉 proceed cyclically in parallel. L [language] contains operations that transfer each unit to Φ and Σ. In the best case, these apply at the same stage of the cycle. . . . In this conception there is no LF: rather, the computation maps LA [lexical array] to 〈PHON, SEM〉 piece-by-piece cyclically.” (Chomsky 2004:107)
In response to empirical pressure from various sides, though, independent access to LF and PF is proposed or considered by, among others, Megerdoomian (2003), Felser (2004), Marušič (2005), Matushansky (2005), Marušič and Žaucer (2006), Den Dikken (2007), and Caha and Scheer (2008).
Modular PIC allows us to maintain strictly symmetric Spell-Out, while being able to generate the effects that the asymmetric Spell-Out literature tries to account for: the interpretation of material that is shipped to PF and LF may be vacuous. That is, the phase structure of a sentence is uniquely defined at the morphosyntactic level. Every time a phase head triggers Spell-Out, its complement is sent to both LF and PF. Every phase is thus processed by both interpretational modules, but this does not mean that an effect is systematically produced: there are some vacuous correspondences. In other words, the Spell-Out mechanism treats all phases in the same way, but not necessarily at PF and LF (or in syntax for that matter).
4 Abruzzese Raddoppiamento Fonosintattico
4.1 Raddoppiamento Fonosintattico
Raddoppiamento fonosintattico (RF) is an external sandhi phenomenon that is found in most central and southern Italian varieties as well as in Standard Italian, whereby the initial consonant of a word geminates, depending on the properties of the preceding word and/or the syntactic relationship between them. There is a significant body of literature, both descriptive and analytical, that addresses RF, most notably Rohlfs 1966, Vogel 1978, Nespor 1988, 1993, Fanciullo 1997, and Loporcaro 1997a,b, among many others. The phenomenon is best known in its Tuscan version (Nespor and Vogel 1979, 1986, Chierchia 1986), where stress is an important conditioning factor: RF is triggered when the preceding word ends in a stressed vowel, but does not occur after unstressed vowels. This is exemplified in (12a).
In other, most notably southern varieties such as Abruzzese (spoken in Abruzzo, a central region of Italy), RF is not stress-conditioned (see (12b)); here, oxytones never trigger the gemination of the following word-initial consonant (Leone 1984, Lepschy and Lepschy 1988:67–69, Nespor 1993, Loporcaro 1997b, Borrelli 2002, and many others).
While stress may or may not trigger RF across dialects, all varieties have lexical triggers; that is, RF is observed after a lexically defined set of words. Membership in this set is arbitrary and varies from system to system in unpredictable ways; monosyllabic function words are typically involved. For example, tre ‘three’, come ‘like’, and a ‘to (prep.)’ are lexical triggers in Tuscan: tre ccase ‘three houses’, come mme ‘like me’, a mme ‘to me’, and so on. Loporcaro (1997a,b) and Passino (2013) provide an overview of the cross-dialectal variation with respect to the lexical set of RF triggers. Hastings (2001) offers a list of lexical triggers for Tollo (11 km from Arielli, where our data originate). In Abruzzese, the set includes gne ‘like, with’, pi ‘for’, gna ‘how’, nghi ‘with’, a ‘at’, llà ‘there’ (minimal pair with the fem. sg. determiner la, which does not trigger RF), qua ‘here’, (a)ccuscì ‘so’, si ‘if’, ni (negation).
Loporcaro (1997a,b) argues that the origin of RF is the loss of Latin word-final consonants. He postulates a period of regressive assimilation of the final consonant of the first word to the initial consonant of the second word. For example, DAT PANE(M) ‘give.3SG bread.ACC’ was pronounced dappane. When final consonants disappeared almost completely in Late Latin – Protoromance, the sequence with regressive assimilation was reanalyzed as a sequence of a truncated word (bearing stress) and a word with an initial geminate: da ppane (see also Vincent 1988 and Passino 2013).
Regarding the representational analysis of RF, we follow the classical autosegmental approach (Chierchia 1986, Loporcaro 1988, 1997a,b): lexical triggers are lexically endowed with extra syllabic space at their right edge, on which the initial consonant of the following word geminates. Nothing in the development below hinges on specific assumptions regarding representations or the kind of computation (by rules or constraints). We therefore keep our analysis as theory-neutral as possible in these two respects. On the representational side, we use minimal syllabic vocabulary that should be consensual in an autosegmental frame: x-slots without further syllabic specifications are enough to express all relevant information. The (only) difference between a lexical item that triggers RF (see (13a)) and one that does not (see (13b)), then, is that the former possesses an x-slot to the right of the last vowel, while the latter does not. In dialects where final stressed vowels trigger RF, the extra x-slot is the exponent of stress (i.e., stress materializes as syllabic space; Chierchia 1986, Ségéral and Scheer 2008).
Certain varieties have also been described as imposing syntactic conditions on RF (see, e.g., Napoli and Nespor 1979, Kaisse 1985, Nespor and Vogel 1986). The empirical validity of these descriptions, however, has been called into question by, among others, Agostiniani (1992) and Loporcaro (1997a,b).
4.2 Syntactic Conditions on RF in Abruzzese: Actives vs. Passives
In light of the discussion above, the baseline situation in Abruzzese is the presence of a lexically defined set of words that trigger RF, with stress playing no role. Below, we describe an additional syntactic filter: RF occurs with lexical triggers only when these are in a specific syntactic relationship with the following word.
The pattern discussed here was first described by Biberauer and D’Alessandro (2006) for the village of Arielli, where voice alternations are conveyed by means of RF. While active periphrastic forms do not exhibit RF between the auxiliary and the past participle, passive forms do, as exemplified in (14) and (15).
‘I have seen’
‘I am seen’
‘you have respected’
‘you are respected’
Most varieties of Abruzzese exhibit person-driven auxiliary selection, whereby the auxiliary selected to form the present perfect depends on the subject person specification. While 1st and 2nd person select BE, 3rd person invariably selects HAVE, independently of verb class (D’Alessandro and Roberts 2010). The auxiliary selected for the passive is BE. This means that underlyingly, the 1st and 2nd person present perfect and passive have the same form, BE + participle. The voice alternation, however, is still present, as it is encoded by (the sole) means of RF: while active voice does not feature RF, passive does.
The monosyllabic exponents of the auxiliary BE are RF triggers in Abruzzese: 1st, 2nd, and 3rd person singular, as well as 3rd person plural. An example of 3rd person singular/plural passive appears in (16).
‘he is seen/they are seen’
HAVE, on the other hand, is not a lexical trigger for RF, even when it appears in a monosyllabic form. (17a) and (17b) offer the full paradigm for active and passive voice of the verb warda’ ‘to look’.
Biberauer and D’Alessandro (2006) give several arguments supporting the claim that the active and passive auxiliary are the same lexical item—that is, that the voice alternation in question is entirely determined in the syntax. This means that in the presence of a lexical trigger, specific syntactic conditions need to apply for RF to take place.
For the analysis of passives, we follow the standard Minimalist view according to which they present a defective v (Chomsky 2000, 2001). The relation between actives and passives has lost its transformational flavor in the Minimalist Program; there is no way to derive one voice from the other in the syntax. Hence, v is specified as fully encoding Burzio’s Generalization in the case of active transitives, and as defective in the case of passives, and passive is in any case not derivable in the syntax (cf. the syntactic approach to passives in Baker, Johnson, and Roberts 1989). Following Richards (2007), Gallego (2010), Uriagereka (2011), and Chomsky (2012:4–5), we assume that phases are defined as domains in which structural case and unvalued features are valued (see section 3.5). Hence, transitive v and C are phase heads, while unaccusative and passive v are defective. This means that v (or Voice, if we wish to follow Collins 2005) needs to be specified as a defective head in the lexicon prior to Merge. Biberauer and D’Alessandro (2006) propose that deficiency is equivalent to the lack of the PIC: a nondefective transitive active v will be endowed with a PIC—that is, its complement will be spelled out when the next phase head up (C) is merged. On these assumptions, an active sentence such as (18) is derived as in (19).
So rəspəttatə la leggə.
am.1SG respected.SG the.F.SG law.F.SG
‘I have respected the law.’
The auxiliary and the participle are in different Spell-Out domains and therefore belong to two different chunks at PF. Since transitive active v is endowed with a PIC at PF, RF will be blocked between the auxiliary and the participle (Biberauer and D’Alessandro 2006). If however v is defective, as it is in passives, it is not a phase head. This means that, vacuously, no PIC can be associated with it, and that the whole complement of C will be spelled out in one single chunk. Consequently, the auxiliary and the participle belong to the same PF domain and RF will be able to operate. The derivation of a relevant passive sentence, (20), is shown in (21).
So rrəspəttatə (da tuttə quində).
am.1SG respected.SG by all
‘I am respected by everybody.’
The active/passive distinction can be straightforwardly analyzed as shown—that is, with a minimal inventory of phase heads and full concomitance of Spell-Out and the PIC. However, there is reason to believe that this toolbox is insufficient.
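The active/passive contrast in (18) and (20) can be restated as a small Python sketch. It is a toy under stated assumptions: the function name `apply_rf`, the `is_rf_trigger` flag, and the `pic_after` set are hypothetical devices of ours, and gemination is rendered orthographically by doubling the first character of the following word, which is assumed to be a consonant. RF applies only when a lexical trigger and the following word are in the same PF domain.

```python
def apply_rf(words, pic_after):
    """Apply raddoppiamento fonosintattico to a string of words.

    words     -- list of (form, is_rf_trigger) pairs
    pic_after -- set of indices i such that a PIC at PF separates
                 word i from word i + 1
    """
    output = []
    for i, (form, _) in enumerate(words):
        if i > 0:
            prev_is_trigger = words[i - 1][1]
            same_pf_domain = (i - 1) not in pic_after
            if prev_is_trigger and same_pf_domain:
                form = form[0] + form  # geminate the initial consonant
        output.append(form)
    return " ".join(output)


aux_participle = [("so", True), ("rəspəttatə", False)]

# Active: transitive v carries a PIC at PF between auxiliary and participle.
print(apply_rf(aux_participle, pic_after={0}))
# Passive: no PIC intervenes, so the trigger geminates the participle.
print(apply_rf(aux_participle, pic_after=set()))
```

The two calls yield so rəspəttatə and so rrəspəttatə, respectively: the same lexical trigger produces different surface forms depending solely on whether a PIC intervenes.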
4.3 Abruzzese Unaccusatives
The nature and even existence of v for unaccusative verbs is a widely debated topic. According to Chomsky (1995:315–316), unaccusatives do not feature a v. Many studies have investigated the accuracy of this statement, and whether the nature of v in unaccusatives, if it is present, differs from the nature of v in unergatives and active transitives. The volume edited by Alexiadou, Anagnostopoulou, and Everaert (2004) is almost completely devoted to this topic, while Borer (1994), Kratzer (1996), Van Hout (1996), Marantz (1997), Ramchand (1997), Ritter and Rosen (1998), Harley and Noyer (2000), Travis (2000), and Legate (2003) all discuss the nature of the functional head(s) above the VP. Most authors agree that v is defective in unaccusatives; this assumption is sustained by θ-related, argument-structure-related, and feature-related considerations. For instance, the internal argument of unaccusative verbs shows ϕ-agreement with the finite verb and carries nominative Case, which means that it must have agreed with T. This is possibly due to the fact that v in unaccusative verbs is defective and cannot license its internal argument (nor assign case to it) (see Chomsky 2005, 2012, Gallego 2010).11 Under the assumption that Spell-Out and the PIC are concomitant, this means that unaccusatives should parallel passives as far as RF is concerned: being defective, unaccusative v is not a phase head. Spell-Out thus occurs only at C, which means that the entire TP ought to be transparent for syntactic and phonological computation.
However, contrary to expectation, RF in Abruzzese is blocked with unaccusatives, as shown in (22).
‘I have stayed’
Unaccusatives differ from passives not only in not triggering RF, but also in auxiliary selection. In the 3rd person, unaccusatives pattern with transitives in selecting the auxiliary HAVE, while passives select the auxiliary BE. Observe the contrast between (23), where the verb is unaccusative (3rd person), and (24), where the verb is transitive passive (3rd person).
‘he/she has stayed’
‘he/she is seen’
While the unaccusative verb in (23) has person-driven auxiliary selection like the transitive active verbs in (17), passives select the auxiliary BE throughout. This suggests that the feature that is connected to a PIC effect is voice, not transitivity. This PIC effect is visible only at PF, though, not at syntax.
In other words, we are facing a syntax-phonology mismatch: syntactically, unaccusatives appear to represent one single Spell-Out domain, but phonologically, they behave as if there were two. Or, translated into phase-theoretic terms, there is a PIC effect in phonology, but not in syntax. Modular PIC takes this statement literally: Spell-Out does occur at vP, and a PIC is associated with this access point at PF. In syntax, however, the Spell-Out is vacuous; no PIC is associated with v, and hence everything below C represents one single computational domain. The corresponding analysis for (22) appears in (25).12
Passives and unaccusatives are syntactically different, not only because the former do not induce a PIC effect at PF while the latter do but also, as shown in (23)–(24), because they select different auxiliaries. While unaccusatives follow the transitive pattern by selecting the auxiliary according to the subject person specification, passives always select the auxiliary BE. This suggests that the key factor in syntax, corresponding to the PIC effect at PF, is voice rather than transitivity (Biberauer and D’Alessandro (2006) arrive at the same conclusion). We take this to mean that the PIC at PF is linked to an active value for the [voice] feature on v. This feature value seems to be the syntactic correlate of the PIC effect at PF.13
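The mismatch just described can be made concrete with a minimal Python sketch. All names (`PhaseHead`, `pic_syntax`, `pic_pf`, `split_by_module`) and the flat, linear representation are illustrative assumptions of ours. Each phase head is a Spell-Out point carrying two independent flags, and the domains of each module are obtained by cutting the string only at heads whose PIC is active in that module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseHead:
    """A Spell-Out point with module-specific PIC settings."""
    name: str
    pic_syntax: bool
    pic_pf: bool


def split_by_module(items, module):
    """Split a linear sequence into computational domains for one module.

    items  -- strings (terminals) interleaved with PhaseHead objects
    module -- "syntax" or "pf"
    """
    if module not in ("syntax", "pf"):
        raise ValueError("module must be 'syntax' or 'pf'")
    result, current = [], []
    for item in items:
        if isinstance(item, PhaseHead):
            has_pic = item.pic_syntax if module == "syntax" else item.pic_pf
            if has_pic and current:
                result.append(current)  # the PIC closes the current domain
                current = []
        else:
            current.append(item)
    if current:
        result.append(current)
    return result


# Unaccusative v: Spell-Out is vacuous in syntax but leaves a PIC at PF.
v_unacc = PhaseHead("v_unaccusative", pic_syntax=False, pic_pf=True)
string = ["AUX", v_unacc, "PARTICIPLE"]

print(split_by_module(string, "syntax"))  # one syntactic domain
print(split_by_module(string, "pf"))      # two phonological domains
```

An active transitive v would set both flags to `True` and produce two domains in either module. The point of the sketch is only that the two flags are set independently of each other.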
4.4 The Fourth Pattern: PIC in Syntax, but No Footprint at PF
We have so far illustrated the first three lines of table 1. Abruzzese also exemplifies the fourth logical combination: a PIC associated with Spell-Out at syntax, but not at PF. This is the crosslinguistically common trivial case in which a syntactically motivated phase does not leave any footprint in phonology (as in English t-flapping). Consider (26).
Jè mmeje chə vve.
is better that come.3SG
‘It’s better that he/she comes.’
Jè mmeje chə nni vve.
is better that not come.3SG
‘It’s better that he/she doesn’t come.’
Assuming that chə occupies C14 and that C is a phase head in syntax (Chomsky 2000), RF is free to operate between the complementizer chə and the finite verb in T (26), as well as between chə and negation (27). Note that in the presence of a different complementizer, ca, which is not an RF trigger, the finite verb does not exhibit RF.
Penzə ca ve.
think that come(s)
‘I think that he/she comes/they come.’
Negation also does not geminate after ca.
Penzə ca ni vve.
think that not come(s)
‘I think that he/she doesn’t come/they don’t come.’
Further evidence for this perspective comes from relative clauses. In (30), RF applies between the relative pronoun chə and the subject in Spec,TP. Assuming that chə is in Spec,C, we must conclude that even though there is a phase boundary between the relative pronoun and the subject, it does not serve as a barrier to phonological computation, suggesting that the C head is not endowed with a PIC at PF.
lu waglionə chə ttu si vistə
the boy whom you are seen
‘the boy whom you saw’
5 Bantu: Phonological Domains Identified by Vowel Lengthening
5.1 Cheng and Downing’s Data and Analysis
In a number of papers, Cheng and Downing (2007, 2009, 2012) and Downing (2010, 2011) outline arguments in favor of nonisomorphism—namely, the absence of one-to-one correspondence between prosodic constituents and syntactic Spell-Out domains (see sections 2.1 and 2.2).
In Bantu, the right edge of phonologically relevant domains is generally marked by penultimate vowel lengthening (Kanerva 1990). This is the case for example in Zulu and Chichewa, studied by Cheng and Downing (2012): the next-to-last vowel of a domain is long (and there are no other long vowels). Hence, the presence of a long vowel indicates a domain boundary that occurs to the right of the following vowel. On the other hand, its absence ensures the absence of a domain boundary. Applying this diagnostic, (31a) shows that the verb always belongs to the same domain as its objects: there is only one long vowel in the sentence (in penultimate position). (31b) demonstrates that subjects are isolated and appear in a domain on their own when they receive a topic interpretation; in addition to the penultimate long vowel of the string, another long vowel identifies the domain of the subject. The control sentence in (31c), where the subject is not interpreted as a topic, shows that topichood is really responsible for the subject isolation in (31b): the subject remains unmarked and there is only one long vowel in the entire string.
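The diagnostic just described, whereby a long vowel signals a domain boundary one syllable to its right, can be stated as a short Python sketch. The function name `split_domains` and the input format are our assumptions, and the schematic CV syllables in the usage example are placeholders, not Zulu or Chichewa data.

```python
def split_domains(syllables):
    """Recover phonological domains from penultimate lengthening.

    syllables -- list of (text, is_long) pairs in linear order
    A long vowel is taken to be domain-penultimate, so the domain
    closes after the syllable that follows it.
    """
    domains, current = [], []
    close_after_next = False
    for text, is_long in syllables:
        current.append(text)
        if close_after_next:
            domains.append(current)  # right edge of the domain
            current = []
            close_after_next = False
        elif is_long:
            close_after_next = True
    if current:
        domains.append(current)  # leftover material without a long vowel
    return domains


# Schematic string with two long vowels, hence two domains.
schematic = [("ta", False), ("kaa", True), ("ma", False),
             ("pa", False), ("lii", True), ("sa", False)]
print(split_domains(schematic))
```

A string with a single long vowel in penultimate position is parsed as one domain, which is the situation described for (31a) and (31c). A second long vowel earlier in the string isolates an additional domain, as with the topic subject in (31b).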
Cheng and Downing provide examples of similar patterns in Kinyambo, where the cue for the detection of domains is High Tone Deletion, and in Luganda, where the cue is High Tone Anticipation. The patterns are essentially identical, and they all show the same thing: that domain boundaries do not occur where phase theory predicts they do. One relevant data set in this respect is relative clauses.
We invited [DP the [CP students [C′ that [TP Tracy taught to ski]]]] to visit the Alps.
Given the restrictive relative clause structure in (32), current phase theory predicts that the head and specifier of the CP (in boldface) are spelled out separately from the TP (in italics). Hence, under an isomorphic mapping system, phonological material representing TP on the one hand and C as well as Spec,C on the other should fall within separate phonological domains. However, this prediction runs afoul of the workings of Chichewa, as shown in (33).
CL6-parent 6SUBJ-PST1-give CL1.child 1-REL
([DP ndalámá zá mú-longo wáake]).
1SUBJ-PST2-6OBJ-visit CL10.money 10.of CL1-sister 1.her
‘The parents gave [the child who visited them] money for her sister.’
As before, phonologically relevant chunks are identified by penultimate vowel lengthening, which marks right domain edges. This identifies two domains (ending in chezéera and wáake). The fact that no other vowel lengthens to the left of chezéera establishes that the domain runs all the way to the beginning of the matrix sentence. Cheng and Downing conclude that phonologically relevant domains in Chichewa cannot be identified by phase structure. They therefore argue in favor of the classical Prosodic Phonology architecture whereby the output of Spell-Out needs to be mapped (or readjusted, in SPE terms) by a specific mechanism that creates extra structure, that is, prosodic constituency. In the current OT-based environment, the mapping mechanism manifests itself as ALIGN constraints, which in Cheng and Downing’s analysis can be expressed as follows for Chichewa:
ALIGNR (IP/TP, IntPh)
Every syntactic IP/TP is right-aligned with a prosodic Intonation Phrase.
ALIGNR (IntPh, IP/TP)
Every prosodic Intonation Phrase is right-aligned with a syntactic IP/TP.
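How such right-alignment constraints are evaluated can be illustrated with a minimal Python sketch. The function name `align_r_violations` and the representation of right edges as integer string positions are our assumptions, the positions in the usage example are hypothetical, and nothing here reproduces Cheng and Downing’s actual tableaux. A violation is counted for every member of the first category whose right edge does not coincide with the right edge of some member of the second.

```python
def align_r_violations(source_edges, target_edges):
    """Count right-alignment violations.

    source_edges -- right-edge positions of the category that must be aligned
    target_edges -- right-edge positions of the category it must align with
    """
    targets = set(target_edges)
    return sum(1 for edge in source_edges if edge not in targets)


# Hypothetical positions: two TPs end at 4 and 9, one Intonation Phrase ends at 9.
tp_right_edges = [4, 9]
intph_right_edges = [9]

# (IP/TP, IntPh): the TP ending at position 4 has no matching IntPh edge.
print(align_r_violations(tp_right_edges, intph_right_edges))
# (IntPh, IP/TP): the single IntPh edge coincides with a TP edge.
print(align_r_violations(intph_right_edges, tp_right_edges))
```

The two calls return 1 and 0, showing that the two directions of alignment are evaluated independently of each other.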
The same holds for the simple clause. In Bantu, the finite verb moves to T, while whenever an argument leaves the VP, a clitic appears on the verb. This makes identification of the first-Merge or landing sites of arguments quite straightforward. In a simple ditransitive sentence, if the verb has moved to T while the direct and indirect objects remain in situ, we would expect the verb (in T, boldfaced in (35)) on the one hand and the direct/indirect objects (in VP, italicized) on the other to belong to different Spell-Out domains. In current isomorphic phase theory, this then also means that they must belong to two distinct phonological domains.
[CP [TP subject verb [vP [VP IO DO]]]]
We have already seen in (31a) that contrary to the isomorphism-based prediction, the verb in Zulu belongs to the same phonological domain as IO and DO. As before, Cheng and Downing conclude that phonologically relevant chunks may be larger than what is predicted by phase structure, and therefore need to be defined independently from Spell-Out domains.
5.2 Phonologically Vacuous Spell-Out
Spell-Out and phonologically relevant domains do not coincide in Cheng and Downing’s (2012) analysis only if it is assumed that every Spell-Out necessarily leaves a trace in phonology (i.e., is associated with a PIC). On the assumptions of Modular PIC, this is not the case. Therefore, there is no need to postulate any mapping or transformation of syntactic Spell-Out domains into extra structure (prosodic constituency). In the case of both relative clauses and simple ditransitives, there is no PIC associated with the phase heads C and v, respectively, at PF. That is, phonology does receive two distinct chunks, but these are computed together. In the case of (31a), for instance, Spell-Out takes place in syntax, meaning that both the IO and the DO become invisible for further syntactic computation and can no longer move out of the vP, but at PF they are still visible to the verb and the subject. Therefore, there is no additional long vowel identifying the Spell-Out domain that Cheng and Downing’s analysis expects to find.
Note that the Modular PIC analysis may be falsified language-internally. If a particular phenomenon suggests that a phase head—say, v—lacks or is endowed with a PIC at PF, the PIC is expected to be lacking (or to be present) in all constructions involving the head and that phenomenon in this particular language. This means that in Bantu we expect v never to produce domain-identifying long vowels, while in Abruzzese we expect active v always to block RF when expressing voice. This prediction seems to be borne out.
Modular PIC is a means of advancing phase theory in the sense of the Strong Minimalist Thesis: it is adapted to the needs of an interface, PF. Its benefits are also Minimalist in kind: a unified interface theory emerges in which the number of devices required to run grammar is cut down. Rather than two chunk-defining devices, there is only one—phases—for both syntax and phonology, which means that the same work is not done twice. The same is true for the corresponding structure: instead of phase structure and parallel prosodic constituency, only the former is operational.
Modular PIC makes phase theory more flexible, which allows us to gain ground on the Minimalist side while maintaining the empirical insight that syntactically and phonologically relevant chunks may not coincide (nonisomorphism, which goes back to SPE and the 1980s). We have shown that each configuration of the four-way parametric space that is opened by Modular PIC (see table 1) has an empirical echo: phase heads are always spelled out (there is a unique phase skeleton for every language), but any given phase head may or may not be associated with a PIC in syntax, and may or may not be associated with a PIC in phonology.
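The four-way space can be made concrete with a small enumeration. The following Python sketch is purely illustrative and not part of the analysis: the class `PhaseHead`, its fields `pic_syntax` and `pic_phonology`, and the helper `parametric_space` are hypothetical names introduced here. It assumes only what is stated above, namely that every phase head is spelled out and that the PIC is set independently in each module.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class PhaseHead:
    """A phase head: always spelled out, with one PIC setting per module.

    All names here are hypothetical labels chosen for illustration.
    """
    label: str
    pic_syntax: bool     # is Spell-Out associated with a PIC in syntax?
    pic_phonology: bool  # is Spell-Out associated with a PIC at PF?

    def footprint(self) -> str:
        """Name the module(s) in which Spell-Out of this head leaves a trace."""
        if self.pic_syntax and self.pic_phonology:
            return "both"
        if self.pic_syntax:
            return "syntax only"
        if self.pic_phonology:
            return "phonology only"
        return "neither"


def parametric_space(label: str) -> List[PhaseHead]:
    """Enumerate the four PIC configurations available to one phase head."""
    return [PhaseHead(label, s, p) for s, p in product((True, False), repeat=2)]


configs = parametric_space("v")
for head in configs:
    print(head.label, head.footprint())
```

On this encoding, the Bantu v in simple ditransitives (Spell-Out blocks movement out of the vP in syntax but produces no domain-identifying long vowel at PF) would correspond to `PhaseHead("v", pic_syntax=True, pic_phonology=False)`, whose footprint is "syntax only".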
Another effect that is certainly welcome in a Minimalist environment is that phonological evidence now contributes to the discovery of phase heads too: while only syntactic evidence is currently used in order to identify which nodes are phase heads, any time a footprint is left in phonology, we gain information about the makeup of the phase skeleton of the language. Note that this bimodular control of phase structure also opens the way for intermodular argumentation: competing theories in one module may be refereed by evidence from another module (Scheer 2008, 2009).
Looked at from below (i.e., from phonology), Modular PIC shifts the perspective to an angle where the interface is conceived in terms of visibility: just as in morphology, where boundaries may or may not be visible to the phonology, the language makes a choice about whether to flag or not flag a given phase boundary on the (PF) surface.
Modular PIC also supports the Chomsky-Borer conjecture (e.g., Baker 2008, Biberauer 2008, Roberts and Holmberg 2010). Whether or not cyclic derivation leaves a footprint in either syntax or phonology depends only on lexical information: whether or not a given item is a phase head, and whether or not a given phase head is endowed with a PIC (at syntax and/or at phonology). Note that the current means of expressing nonisomorphism (the noncoincidence of syntactic and phonological domains) is computational, rather than lexical: an extra mapping mechanism (ALIGN constraints in OT) defines which syntactic information is transformed into prosodic constituency.
Finally, consequences for phonological theory are significant: besides the fact that postlexical phonology (i.e., phonology across word boundaries) may make reference to cyclic structure (while it has been thought to be noncyclic since Kiparsky 1982), the entire Prosodic Hierarchy is superfluous and has to go. Its function is taken over by a version of phase theory made more flexible. A side effect of this move provides another, wider benefit, though one that may produce shockwaves in the architecture of OT: ALIGN constraints are a major source of modularity violations (Scheer 2011:sec. 523). Unlike in the original architecture of Prosodic Phonology, they put mapping (i.e., the translation of morphosyntactic structure into phonological structure) into the phonology. ALIGN constraints make regular and necessary reference to morphosyntactic categories, but are a piece of phonological computation (they are interspersed with other phonological constraints in the same constraint hierarchy). Hence, they violate Indirect Reference, which is the incarnation of the modular requirement of domain specificity (see section 2.1). Modular PIC eliminates prosodic constituency and the extra computation by which it is created (i.e., ALIGN constraints in OT). ALIGN being a central device in OT—in historical terms too (e.g., Itô and Mester 1999, McCarthy and Prince 2001:vii)—its elimination offers an opportunity to reposition the theory in a modular perspective, but certainly represents a challenge: what would OT look like without ALIGN?
This research was sponsored by the NWO (Netherlands Organisation for Scientific Research) VIDI program, project 276-70-021 (Splitting and Clustering Grammatical Information), which is hereby acknowledged.
The authors would like to thank the audiences of NELS 43, the Exploring the Interface workshop at McGill University, and OCP 9 in Berlin. We also wish to thank Marc van Oostendorp, Diana Passino, and two anonymous reviewers for their useful suggestions.
1 See Scheer 2011:sec. 423, 2012 for a more detailed study of how cyclic and prosodic chunk definition coexisted in the 1980s and 1990s.
2 In addition to introducing morphosyntactic information into phonology, prosodic constituency is held to be responsible for eurhythmy and rhythm by, for example, Selkirk (1981:126–128, 1984:8–22, 2000) and Ghini (2001). This position is debated, though: there is reason to doubt that rhythm is a linguistic property. For example, Hayes (1984) holds that rhythm is an emanation of metrical poetry and music, rather than of the linguistic system. He writes that ‘‘grids are not strictly speaking a linguistic representation at all’’ (p. 65) and concludes (p. 69) that rhythm and linguistic structure such as stress or the Prosodic Hierarchy belong to separate cognitive domains. Nespor (1988:228) adds that rhythm ‘‘is, in fact, not properly a phenomenon of language, but rather of all temporally organized events.’’ This view is also expressed by Nespor and Vogel (1986, 1989:87–88) and Selkirk (1986): rhythmic structure materializes as the metrical grid, which is produced by a secondary mapping that takes the Prosodic Hierarchy as an input.
4 An anonymous reviewer asks about the input conditions to syntactic computation. To our knowledge, this issue is hardly discussed in the literature, one exception being Uriagereka’s (2002) and Uriagereka and Pietroski’s (2002) warping operation in the faculty of language, that is, the ‘‘topological’’ ability to shift dimensions (e.g., from Euclidean spaces to non-Euclidean ones, or from two-dimensional geometries to three-dimensional ones), which the faculty of language shares with the arithmetic and geometric systems. On the basis of this idea, Munakata (2006) proposes that the split CP is a consequence of the need to realize the fourth dimension (discourse) in a two-dimensional (or three-dimensional) syntax. This whole line of reasoning is mainly based on syntactic and semantic considerations. Phases, in this respect, could be a way of mapping different dimensions into narrow syntax (first two-dimensional, then three-dimensional, and finally four-dimensional elements), or they could even be marking dimensions (from argument structure, v, to discourse, C). We leave this argument aside, as we wish to focus on the syntax-PF interface, and further elaboration would be too speculative.
5 A point that needs to be made in this context can only be briefly mentioned in this footnote: in addition to being module- and phase-head-specific, the PIC is process-specific. This is a well-known (but often unmentioned) fact about sandhi phonology. In (relevant varieties of) English, for example, t-flapping is unbounded by morphosyntactic divisions, but other phenomena such as word stress assignment apply only within words. Hence, the visibility of the word boundary is process-specific. This is more generally true for all boundaries and all processes in all languages: typically, a given morphosyntactic division blocks some phonological processes (word stress assignment in our example), while being permeable to others (t-flapping). Hence, phonological processes need to ‘‘know’’ whether or not they can apply across any given morphosyntactic division. Classically in Lexical Phonology (and Optimality Theory (OT) versions thereof), the process specificity of boundaries is dealt with by assuming distinct computational systems: in the English case, the rule assigning word stress is present in the lexicon (which assesses strings of morphemes), but absent from postlexical phonology (where strings of words are computed). By contrast, t-flapping is present in both rule systems. What is at stake and requires discussion is thus the opportunity to split phonological computation into several independent computational systems according to the size (lexical vs. postlexical) and/or the nature (lexical strata) of the pieces involved. This touches upon the much-debated question of whether morphology and syntax are run by the same or different computational systems (see the DM position discussed in section 3.4). Detailed discussion of this question is beyond the scope of the present article (see Scheer 2011:sec. 823 for more detail).
6 Following the logic of the system, Spell-Out should also be able to be vacuous at LF, but this is pure speculation at this point.
7 An anonymous reviewer observes that, crucially, Bobaljik and Wurmbrand (2005) assume that v is not projected at all in restructuring contexts. We assume instead that it is, and that in one case (simple clause) it is associated with a PIC effect, while in another (restructuring) it is not. This is precisely the point we are trying to make: the PIC effect can depend on the structure. That is, it is usually specified on a head, given its nature (e.g., a transitive v), but it can be structure-sensitive. According to Cinque (2008:12), for instance, ‘‘only those verbs that happen to match semantically the context of a certain functional head admit of two distinct possibilities’’; that is, there must be some sort of semantic agreement between the v of the modal and the v of the main verb in order for restructuring to occur. We take this to mean that the phasehood of the lower v can be influenced by interaction with the phasehood of the higher v. Another option could be that in restructuring, because of this ‘‘agreement’’ between the two vs, some sort of phase sliding is at work (see Gallego 2010), whereby the phasehood of the lower v is transferred to the higher one.
8 It should be noted that Chomsky’s (2001) claim that an entire CP is computationally too complex to be processed by active memory in one go is by and large speculative. No measure is available today that assesses the computational complexity of a CP (in absolute numbers), and the capacity of the human active memory is also far from being computable (in absolute numbers: see, e.g., Mathy and Feldman 2012). The situation in phonology is much the same. In OT, computational complexity has always been an issue. The theory is known to be quite expensive on this side (and sometimes claimed to be uncomputable; see Idsardi 2006), but there is also no overall calculus available that would produce absolute numbers (owing among other things to the GEN function, which in classical OT is supposed to create an infinite candidate set; see Frank and Satta 1998).
9 Note that in Chomsky’s (2001) original view, the goal that the PIC is supposed to achieve is not to reduce computational complexity to an absolute minimum (this would coincide with Epstein and Seely’s (2006) program). Rather, the reduction is supposed to make computation doable in active memory. The precise capacities and limitations of active memory being unclear at the present stage of our understanding (see footnote 8), the appropriate complexity may well define pieces that are bigger than those that are delineated by Epstein and Seely’s spell-out-as-you-merge. In other words, anything that is within the range of what active memory can compute is a possible phase.
10 Unless otherwise stated, the examples from Abruzzese belong to the variety spoken in Arielli (Chieti), Abruzzo, classified as an eastern upper-southern Italian dialect.
11 By contrast, Legate (2003) argues for the fully phasal nature of what she calls the VP, meaning some sort of vP associated with unaccusatives and passives.
12 In (25) and below, we use an asterisk (here, v*) to mark phase heads that are lexically endowed with a PIC at PF.
13 Recall, however, that it is not always possible to find a featural syntactic correlate for a PIC effect at PF. To account for the same data, Biberauer and D’Alessandro (2006) propose the existence of a Voice head that, when active, is a phase and has an effect on PF. Under this view, unaccusatives are expected to pattern with transitive actives and not with passives. On the basis of participial agreement and the distribution of auxiliaries, however, D’Alessandro and Roberts (2010) and D’Alessandro and Ledgeway (2010) show that the Voice head (or the higher v, as they call it) is not a phase head. We therefore follow the argumental view according to which passives and unaccusatives share an internal argument and it is simply the value specification of the [voice] feature on v that correlates with the PIC at PF.
14 Abruzzese has three complementizers (Rohlfs 1969:190, 1983, Ledgeway 2009:sec. 4.3, D’Alessandro and Ledgeway 2010): the declarative ca, the irrealis chə, and the jussive ocche. Chə is an irrealis complementizer introducing unselected clauses. We will not pursue a fine-structural analysis of the left periphery of the clause, and simply assume that both chə and ca occupy C (or a C head).