Although in many interface theories, the domains of phrasal phonological processes are defined in terms of prosodic constituents, D’Alessandro and Scheer (2015) argue that their proposed modification of phase theory, Modular PIC, renders prosodic constituents superfluous. Phrasal phonological domains can instead be defined directly in the syntax. In this response, we argue that Modular PIC does not provide a convincing new approach to the syntax-phonology interface, as it is both too powerful and too restrictive. We show that the analysis offered of raddoppiamento fonosintattico in Eastern Abruzzese does not justify the loss of restrictiveness Modular PIC brings to phase theory. We also show that Modular PIC is too restrictive to account for phenomena, from Bantu languages and others, that have received satisfactory analyses within interface theories that appeal to prosodic constituents. We conclude that Modular PIC does not successfully replace prosodic constituent approaches to the interface.
Since the late 1980s (see, e.g., Chen 1987, Hayes 1989, Nespor and Vogel 1986, Selkirk 1986), many theories of the syntax-phonology interface have argued for an indirect-reference approach to defining the domains of phrasal phonological processes. The central argument for this approach is that there is often a mismatch between domains of phonological processes and domains defined by the syntax. To account for these mismatches, it is proposed that phonology accesses syntax only indirectly. A mapping procedure allows restricted aspects of syntactic structure (as well as prosodic principles) to define constituents of the Prosodic Hierarchy, like the phonological phrase (φ), the intonational phrase (ι), and the utterance. Only those prosodic constituents, and not any specific syntactic information (constituents, categories, or features), can be referred to in the context of application of any given phonological process. However, also since the late 1980s (see, e.g., Kaisse 1985, Odden 1995), there has been a countervailing tradition of work on the interface arguing that phonology must have direct access to specific kinds of syntactic information and that the prosodic constituents in the Prosodic Hierarchy are insufficient and even superfluous in accounting for phrasal domains. Instead, in a direct-reference approach, the constituents, features, and relations provided by the syntax are all that are needed to define the domains for phonological processes. (See Elordieta 2007b, 2008 and Selkirk 2011 for critical overviews of work on the interface.)
As Elordieta (2007b, 2008) observes, phase-based syntax (Chomsky 2001) has brought renewed interest to exploring the limits of the direct-reference approach. In phase theory, phonological constituents are derived through the operation Spell-Out (Transfer), the idea being that once a relevant syntactic chunk has been computed, it is sent to the interfaces to be interpreted. A reasonable hypothesis, as defended by Cheng and Downing (2012b, 2016), Dobashi (2004, 2009), Ishihara (2003, 2007), Kratzer and Selkirk (2007), and Selkirk (2011), among others, is that phases are necessarily relevant for phonological constituency and play a crucial role in the determination of prosodic domains. A more radical hypothesis—put forward in work by Adger (2007), Kahnemuyipour (2009), Pak (2008), and Seidl (2001), among others—is that the domains the phonology must refer to are directly delimited by phases alone, and that therefore both the Prosodic Hierarchy and reference to prosodic constituents are superfluous. This is the hypothesis adopted by D’Alessandro and Scheer (D&S) (2015).
A number of works (e.g., Cheng and Downing 2007, 2009, 2012a,b, 2016, Dobashi 2010, Selkirk 2011) have, however, demonstrated mismatches between Spell-Out domains and phonological domains, arguing that it is necessary to parse the string into prosodic constituents, such as intonational phrase or phonological phrase, to account for the attested mismatches. D&S’s goal is to motivate an approach within phase theory that can account for the same range of data without appealing to prosodic constituents. To achieve this goal, D&S propose that the Spell-Out operation should be separated from the Phase Impenetrability Condition (PIC). They call their approach Modular PIC. This response provides a critique of Modular PIC.
The response is organized as follows. In section 2, we first lay out the alternations concerning raddoppiamento fonosintattico in Eastern Abruzzese (ARF) that constitute the only well-developed argument D&S provide for Modular PIC; we then summarize the key innovative aspects of Modular PIC, showing how they allow phase theory to account for the ARF data. The remainder of the response takes up a critique of Modular PIC. In section 3, we argue that Modular PIC is unnecessarily powerful: it introduces an excessive lack of restrictiveness into phase theory, while not improving on existing alternative analyses of ARF. In section 4, we argue that, at the same time, Modular PIC is too restrictive because it cannot actually account for the Bantu language data D&S cite in support of their proposal. Furthermore, it ignores the role of well-documented nonsyntactic factors in determining the domains for phrasal phonological processes. In section 5, we sum up the main points of the response.
2 Modular PIC and Its Empirical Motivation
2.1 Raddoppiamento Fonosintattico in Eastern Abruzzese
The only empirical argument for Modular PIC that D&S work out in detail comes from raddoppiamento fonosintattico in Eastern Abruzzese, an Italo-Romance variety spoken in central Italy. Unlike the better-known case of raddoppiamento in Tuscan, in which gemination (raddoppiamento) is both stress-conditioned and lexically conditioned, in Eastern Abruzzese and other central-southern varieties, it is conditioned only lexically. This means that RF only affects the initial consonant of a word that appears after a closed set of lexical items, like llà ‘there’, so ‘am’, ni ‘not’, chə ‘that’. This is illustrated in (1), where the affected consonant appears in boldface.
‘I am seen’
‘she/he doesn’t/they don’t come’
In formulating the phonological process, D&S assume a more or less standard approach, according to which there is an extra timing slot at the end of these lexical items (llà, so, ni, etc.) in their underlying representation. This extra slot associates phonologically to the initial consonant of the following word when both trigger and target are within the appropriate domain. For convenience, we represent this lexical specification using X, as in (2a); compare these examples with the ones in (2b), which lack the timing slot. All relevant items are italicized.
Examples of lexical triggers for RF
Examples of nontriggers
Furthermore, D&S suggest that RF in Arielli Abruzzese (henceforth, ARF) occurs with these lexical triggers only when they are in a specific syntactic, phase-related relationship with the following word. While the participle following a passive auxiliary shows gemination (3a), the same participle following the perfect auxiliary does not, for either transitives or unaccusatives (3b–c). Some complementizers also trigger ARF, as shown in (3d); see also (2aii). In these examples, relevant potential triggers of ARF are shown with the associated extra X-slot assumed in D&S’s analysis, and actual occurrences of gemination are boldfaced.
‘I am respected.’
SoX rəspəttatə la leggə.
am.1SG respected.SG the.F.SG law.F.SG
‘I have respected the law.’
‘I have stayed.’
Jè mmeje chəX vve.
is better that come.3SG
‘It’s better that he/she comes.’
Raddoppiamento can take place only if the X-slot and the initial consonant of the following word are both within the same domain. In a direct-reference approach that relies on phases as the only “chunk-defining device” (D&S 2015:594), that means that the trigger and the target of gemination (i.e., the final X-slot of a word and the initial consonant of the following word) must be within the same Spell-Out domain. This is because Chomsky’s (2001) Phase Impenetrability Condition (PIC) renders Spell-Out domains (designated domains that start with the complement of a phase head) impenetrable to subsequent operations involving both higher domains and the particular Spell-Out domain. Thus, in [ZP Z . . . [HP α [H YP]]], if H is a phase head with YP its complement, “The domain of H [i.e., YP] is not accessible to operations outside HP; only H and its edge [α] are accessible to such operations” (Chomsky 2001:13).1
The relevant syntactic structures corresponding to (3) are shown in (4), with the Spell-Out domains expected in a standard phase-based system (Chomsky 2001; see also Chomsky 2008) shown in boldface.2 The PF domains that would be derived in this standard view of phases and Spell-Out are also shown in (4). The last two columns in (4) compare the presence or absence of gemination expected in this standard phase-based system with the actual output.
As (4a) indicates, because the VP complement would be subject to Spell-Out in the case of transitives, the v (with the preceding material) and its complement are in two separate domains at PF: [ . . . soX] [rəspəttatə]. Gemination would therefore not be expected because the initial r of the participle, [rəspəttatə], is not accessible to the X of [soX]. As shown in (4b–c), in the case of passives and unaccusatives, which have a defective v,3 the auxiliary and the participle do belong to the same Spell-Out domain: [ . . . soX rəspəttatə] and [ . . . soX rəmastə], respectively. Therefore, gemination is expected. Finally, in (4d), the complementizer chə would also trigger Spell-Out of its complement, TP. Hence, the following verb would not be expected to show gemination, as the two are in separate Spell-Out domains.
The view of Spell-Out domains assumed in (4) correctly predicts lack of gemination in (4a) and presence of gemination in (4b), but incorrectly predicts presence of gemination with unaccusatives as in (4c) ([ . . . soX rəmastə] → *so rrəmastə) and absence of gemination after complementizers as in (4d) ([chəX] [ve] → *chə ve). This is exactly the opposite of the pattern found in the data (cf. (3c–d)).
2.2 Modular PIC and Its Application to Eastern Abruzzese
As mentioned above, in the spirit of direct approaches to the interface, D&S’s goal is to motivate a single domain-defining mechanism for both syntax and phonology, in order to dispense with appealing to prosodic constituents, like the phonological phrase or the intonational phrase, distinct from syntactic ones. To account for challenges like the ones that ARF poses for standard phase-based interface analyses, D&S modify phase theory in three crucial ways: (a) there is no restriction on phase heads; (b) the PIC is parameterized to apply in the syntax, the phonology, or both; and (c) this parameterization can make reference to specific syntactic features. We elaborate on these points in turn.
Beginning with the question of restrictions on phase heads, since Chomsky’s (2000) original proposal that only C and v are phase heads, subsequent work has expanded the number of phase heads (see Den Dikken 2007 and Grohmann 2007, as well as the discussion in D&S 2015). D&S take this expansion to its logical conclusion, proposing that any head can be a phase head, the choice depending on the language.
With respect to the parameterization of the PIC, phases in D&S’s sense can have not only syntactic effects (and motivation) but also purely phonological ones. This is achieved through a new kind of lexical marking on heads (diacritic features), which determines whether a given phase head in this new sense is endowed with a syntactic PIC effect or not, and whether the same given phase head is endowed with a phonological PIC effect or not.4 We will use the notation [±PICsyn] for the former and [±PICpho] for the latter. In (5), we show the four possibilities that these features define. The set of heads, with values for these features, is what D&S call the phase skeleton.
(5) Possibilities for any given phase head H
[+PICsyn]: the domain of H is impenetrable to syntax
[−PICsyn]: the domain of H is accessible to syntax
[+PICpho]: the domain of H is impenetrable to PF
[−PICpho]: the domain of H is accessible to PF
Note that (5a) corresponds to the standard notion of phase heads in syntax, and (5b) to any head that is not a syntactic phase head, also a possibility within the standard view. In contrast, (5c–d) formalize D&S’s expansion of phase theory to include “phonological phases.” According to D&S, all possible combinations of these four parameters are attested: [+PICsyn, +PICpho], [+PICsyn, −PICpho], [−PICsyn, +PICpho], [−PICsyn, −PICpho].5 The combinations [+PICsyn, +PICpho] and [−PICsyn, −PICpho] would be the expected options in other direct-reference approaches, with coinciding syntactic and phonological chunks. However, the combinations [−PICsyn, +PICpho] and [+PICsyn, −PICpho] for a given head can give rise to nonmatching syntactic and phonological domains because the chunking seen in the syntax can be ignored in the phonology, and vice versa.6 Recall from section 1 that a mismatch between prosodic and syntactic constituents has been a central motivation in indirect approaches for mapping syntactic structure to prosodic constituents. (More on this in section 4.) With the settings [−PICsyn, +PICpho] and [+PICsyn, −PICpho], Modular PIC provides a way of formalizing (some types of) nonmatching that does not require recourse to prosodic constituents. Contrary to most indirect approaches, Modular PIC can refer to specific syntactic categories (and even specific features; see below).
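The four settings just discussed can be enumerated mechanically. The following is only an illustrative sketch: the function names `classify` and `sign` and the labels "matching" and "nonmatching" are our own shorthand, not part of D&S's proposal; only the feature names [±PICsyn] and [±PICpho] come from the text.

```python
from itertools import product


def classify(pic_syn: bool, pic_pho: bool) -> str:
    """Label a setting by whether syntactic and phonological chunking coincide."""
    return "matching" if pic_syn == pic_pho else "nonmatching"


def sign(value: bool) -> str:
    """Render a Boolean feature value as '+' or '−'."""
    return "+" if value else "−"


# All four combinations of the two binary features on a head H.
combinations = [
    (f"[{sign(syn)}PICsyn, {sign(pho)}PICpho]", classify(syn, pho))
    for syn, pho in product([True, False], repeat=2)
]

for label, status in combinations:
    print(label, status)
```

Two of the four settings come out as nonmatching; these are the ones that let Modular PIC encode syntax-phonology mismatches without prosodic constituents.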
Though introduced only in a footnote (D&S 2015:600n5), a third element of the theoretical architecture turns out to be crucial for Modular PIC: for any given language, any given phase head, in addition to its specification as [±PICsyn] and [±PICpho], must specify which phonological process this information is relevant for. Therefore, the typology in (5) is actually much more complex, especially on the phonological side. This is shown in (6), where different phase heads (Hi, Hj) are specified for different sets of phonological processes (P1, P2, . . .).7
(6) Specifications for different phase heads with [+PICpho]
Hi: [+PICpho] for P1, P3, P4, . . .
Hj: [+PICpho] for P1, P5, P6, . . .
As illustrated in (6), not only can a given phase head be specified for different phonological processes—a given phonological process can be encoded in more than one phase head, as well.
To summarize Modular PIC, we list in (7), in the form of questions, the options that are relevant for each functional head H.
(7) Options for each functional head
Q1 Is H a phase head or not?
If it is a phase head,
Q2 Does H induce the PIC in syntax?
Q3 Does H induce the PIC in phonology?
If H does induce the PIC in phonology,
Q4 For which phonological processes P1, P2, . . . , Pn does H induce the PIC in phonology?
Q5 For every process P1, P2, . . . , Pn, which syntactic features does H need to carry in order for H to have a prosodic effect?
Next, we show how Modular PIC deals with the ARF cases exemplified in (4), repeated in (8). Recall that the last column reflects the actual realization, while the penultimate column shows the expected results for a direct-reference approach based on a more standard theory of phases, like Chomsky’s (2008).
D&S propose to account for all the cases in (8) by assigning the features shown in (9), where the values for [±PICsyn] coincide with those widely assumed in other syntactic approaches. The notation vdef is used for passives and unaccusatives, which are weak phase heads in Chomsky 2001.
For (9a–b), the values for [±PICsyn] and [±PICpho] coincide; hence, the domain of gemination coincides with the one predicted by a more standard theory of phases, as reflected in (8a–b). The mismatches between phonology and syntax are defined in (9c–d) and exemplified in (8c–d). In (8c), the vP head, v, has a [+PICpho] feature (see (9c)), despite being [−PICsyn]. Therefore, even though there is no PIC effect in syntax, phonologically the VP is impenetrable, and lack of gemination is expected: so and rəmastə belong to two different domains and hence the participle cannot geminate (*rrəmastə). In the case of (8d), while the TP is syntactically a phase complement and the PIC should block ARF, the presence of a [−PICpho] feature on C (see (9d)) makes it a phonologically transparent domain and gemination correctly takes place (vve). Recall that the inaccessibility effect created by a [+PICpho] head is also process-specific in Modular PIC: phase heads must be marked with this feature with respect to the ARF process, but they could have different values for other phonological processes. In addition, ARF is conditioned by the syntactic features of the head: v heads with “an active value for the [voice] feature on v” (D&S 2015:614), like (9a,c), are specified as [+PICpho] for ARF, while v heads with the opposite value, like (9b), are specified as [−PICpho] for ARF.
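The reasoning in this paragraph can be restated as a small sketch. It assumes, following D&S's analysis, that the trigger always carries the X-slot, so that gemination depends only on whether the intervening head blocks access. The feature values are those described in the prose for (9a–d); the class `Head`, the dictionary keys, and the two prediction functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    name: str
    pic_syn: bool  # True = [+PICsyn]
    pic_pho: bool  # True = [+PICpho] with respect to ARF


# Feature values as described in the text for (9a-d).
HEADS = {
    "transitive": Head("v [active]", pic_syn=True, pic_pho=True),
    "passive": Head("vdef [passive]", pic_syn=False, pic_pho=False),
    "unaccusative": Head("vdef (unaccusative)", pic_syn=False, pic_pho=True),
    "complementizer": Head("C", pic_syn=True, pic_pho=False),
}

# Attested gemination in (3a-d): True = the following consonant geminates.
ATTESTED = {
    "transitive": False,
    "passive": True,
    "unaccusative": False,
    "complementizer": True,
}


def standard_prediction(head: Head) -> bool:
    """Standard phases: gemination only if the syntactic PIC does not intervene."""
    return not head.pic_syn


def modular_prediction(head: Head) -> bool:
    """Modular PIC: gemination only if the phonological PIC does not intervene."""
    return not head.pic_pho


for context, head in HEADS.items():
    print(
        context,
        "standard ok:", standard_prediction(head) == ATTESTED[context],
        "modular ok:", modular_prediction(head) == ATTESTED[context],
    )
```

Under these assumptions the standard view fails exactly for the unaccusative and complementizer cases, while the [±PICpho] values match all four attested outcomes, which is the result D&S's feature assignment is designed to achieve.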
In section 3, we argue that Modular PIC is too unrestricted—and unnecessarily so, since the ARF data can be reanalyzed without having to modify phase theory.
3 Modular PIC Is Unnecessarily Powerful
3.1 Excessive Power Illustrated
Even though Modular PIC provides a coherent account of the ARF data, an immediate concern is its excessive power, given that for any phonological process, any head (with some specific feature) can have any of the four combinations provided by the features [±PICsyn] and [±PICpho].
The loss of restrictiveness can be seen by considering the typological consequences of variation in feature specifications for a phase head like v. In a standard phasal analysis, nondefective v induces a PIC effect in the phonology and defective v does not. In Modular PIC, v [active], vdef (unaccusatives), and vdef [passive] can, on a language-particular basis, induce or not induce a phonological PIC effect, as shown in (10). In addition to the ARF system, seven other typological possibilities are predicted (2 × 2 × 2 = 8 in all).
Thus, in addition to the ARF outcomes so rəspəttatə, so rrəspəttatə, and so rəmastə, we predict a variety with so rəspəttatə, so rəspəttatə, so rəmastə; another one with so rəspəttatə, so rəspəttatə, so rrəmastə; another one with so rəspəttatə, so rrəspəttatə, so rrəmastə; and so on. At the same time, ARF could be linked to other syntactic features, possibly feature combinations, further expanding the factorial typological possibilities. In effect, Modular PIC allows a phonological process like ARF to be made specific to individual syntactic constructions.
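The predicted varieties can be listed exhaustively. The sketch below is illustrative only: it assumes that each of the three v types is independently set to [+PICpho] or [−PICpho] for the process, and it represents gemination orthographically by doubling the initial consonant, as in the examples in the text. The names `CONTEXTS`, `FORMS`, and `realize` are hypothetical.

```python
from itertools import product

# The three v types whose [±PICpho] value may vary by language.
CONTEXTS = ["v [active]", "vdef [passive]", "vdef (unaccusative)"]

# Auxiliary and participle for each context, taken from the examples in the text.
FORMS = [("so", "rəspəttatə"), ("so", "rəspəttatə"), ("so", "rəmastə")]


def realize(aux: str, participle: str, pic_pho: bool) -> str:
    """Return the surface string: no gemination if the head is [+PICpho]."""
    if pic_pho:
        return f"{aux} {participle}"
    return f"{aux} {participle[0]}{participle}"


# One predicted variety per assignment of [±PICpho] to the three contexts.
systems = [
    tuple(realize(aux, part, value) for (aux, part), value in zip(FORMS, values))
    for values in product([True, False], repeat=3)
]

# The attested ARF system: active [+PICpho], passive [−PICpho], unaccusative [+PICpho].
ARF = tuple(
    realize(aux, part, value)
    for (aux, part), value in zip(FORMS, (True, False, True))
)
others = [system for system in systems if system != ARF]

for system in systems:
    print(", ".join(system), "(ARF)" if system == ARF else "")
```

The enumeration yields the attested system plus seven further varieties, before any additional syntactic features are brought in.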
The concern raised in (10) for raddoppiamento generalizes to any type of phenomenon. To illustrate the unrestricted power of Modular PIC, let us take a language L with the phase skeleton [C, v, D, P] and a phonological process Ph, and let the domain of application of Ph vary as predicted by Modular PIC. For example, v can be specified either as [−PICpho] or as [+PICpho], and in this case it can be restricted to v’s with specific syntactic featural specifications like [active], possibly combined with other features. Hence, the number of variations in the contexts in language L of process Ph with respect to v is N_v + 2 (N_v, that is, the number of features on v; plus v with no feature; plus the lack of v in the skeleton), and the total number of options for L will be (N_C + 2) × (N_v + 2) × (N_D + 2) × (N_P + 2). This vast number of possible variations is excluded in theories based on the Prosodic Hierarchy, where phrasal processes apply within a restricted set of prosodic constituents, which are defined with reference to a restricted set of general syntactic constituent types. (See, for example, Selkirk 2011 for discussion.)
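To give a sense of scale, the product above can be computed for a toy case. The feature counts used here are invented purely for illustration and are not claims about any language; the function names are likewise hypothetical.

```python
from math import prod


def options_per_head(n_features: int) -> int:
    """N feature-restricted settings, plus the head with no feature,
    plus the head being absent from the skeleton."""
    return n_features + 2


def total_options(feature_counts: dict[str, int]) -> int:
    """Multiply the per-head options across the phase skeleton."""
    return prod(options_per_head(n) for n in feature_counts.values())


# Hypothetical numbers of relevant syntactic features per head (assumed, not attested).
toy_counts = {"C": 2, "v": 3, "D": 2, "P": 1}

print(total_options(toy_counts))  # (2+2) * (3+2) * (2+2) * (1+2) = 240
```

Even with these small assumed counts, a single process in a single language has 240 possible domain specifications, and the figure grows multiplicatively with every additional head or feature.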
Due to this lack of restrictiveness, Modular PIC predicts patterns that are unattested, as far as we know, in human language. To give some examples, for a given external sandhi process sP, we would expect to find a language in which all (phase) heads are [−PICpho] except for v, which is [+PICpho]. Assuming that the verb moves to v, in that language sP would apply across the board except between a verb and its complement, where it would be blocked. Similarly, we would expect a language with the same situation but with only D as [+PICpho]. Here, the process would again apply across the board except between a determiner and its complement. In a language where, instead, all heads were [+PICpho] except for v, the process sP would only apply between a verb and the first word of its complement (assuming again that the verb moves to v). We would also expect languages in which both D and v are [−PICpho] while the remaining heads are [+PICpho]. In such a language, sP would apply across D and across v but not across other heads.
In addition, a language could have a phonological process applying across the board except across passive v—that is, a head with a specific feature. Many other examples can be constructed with similar specifications based on other heads or on any combination of heads, and the patterns that most of them predict do not seem to exist. Many external sandhi processes, like flapping in English (mentioned in D&S 2015), spirantization in Spanish (see, e.g., Mascaró 1984, Piñeros 2002), or several types of assimilation, apply essentially across the board. While other processes, like raddoppiamento or French liaison, have a more restricted and complex distribution, it is not of the kinds described above.
3.2 Alternative Analyses and Empirical Problems
Perhaps this loss of restrictiveness would have to be considered an acceptable price to pay, if there were no alternative account available for patterns like ARF. But ARF can, in fact, be reanalyzed in at least two ways without having to resort to the drastic modification of phase theory that Modular PIC embodies.
D&S’s argument depends crucially on the assumption that so ‘am’ and si ‘are.2SG’ in passives and in the present perfect (transitives or unaccusatives) are instances of the same lexical verb, which in turn depends on the existence of auxiliary selection by person. According to D&S, in passive constructions the auxiliary is always the verb BE. However, as shown in (11), in the present perfect, the 1SG/PL and 2SG/PL select BE while the 3SG/PL selects HAVE.
We can safely assume that in Eastern Abruzzese, both passive and copular constructions contain the same verb BE, as is generally the case in Romance: so rrəspəttatə ‘I am respected’, so vvikkjə ‘I am old’.8 But in the present perfect, it is not clear that synchronically the auxiliaries should be considered members of two different verb paradigms, syntactically selected by person, even though diachronically the auxiliaries derive from both BE and HAVE.9 The verb HAVE indicating possession (Italian avere) is a different verb in Eastern Abruzzese, tenə, so the auxiliary HAVE seems to consist of the single form a of the 3SG/PL.10 Furthermore, serious doubts have been cast on the syntactic character of the selection of auxiliary by person; see Bentley 2006:55–59, 61–64, Bentley and Eythórsson 2001, and the detailed studies in Loporcaro 2001, 2007 for arguments against a syntactic analysis.11
Instead of assuming that synchronically Ariellese, and Abruzzese in general, uses two different auxiliaries for the perfect, with selection by person, one could plausibly assume that, as in English and other Romance languages, there is a single perfect auxiliary, different from the one used in passives and in copular constructions, as shown in (12). This option is considered by Biberauer and D’Alessandro (2006), but rejected on grounds of complexity that are not very convincing.12
In (12), two forms of the HAVE auxiliary are identical to the corresponding forms of BE (semə, setə), two forms differ in the presence or absence of the extra timing slot (soX/so, siX/si), and the other two are totally different (jèX/a). Under this alternative interpretation of the auxiliary system, the presence or absence of gemination would have a straightforward explanation in terms of lexical marking, needed anyway, and would have nothing to do with phases. There is gemination in the passive because the passive auxiliaries soX, siX, and jèX have the X timing slot; there is no gemination in the present perfect because its auxiliary forms lack the X-slot.13
Another alternative analysis, suggested by Van Oostendorp (2015), is to consider that there is indeed auxiliary selection by person, but that the difference between the presence of raddoppiamento in passives and its absence in actives (i.e., transitives, unaccusatives) is the manifestation of a floating mora related to a specific morphosyntactic featural specification, like [passive]. The auxiliary itself need not be lexically specified for an extra timing slot (or mora). Under this view, again there is no need to resort to the notion of phase in any of its versions. Further support for an analysis based on floating moras as the realization of morphosyntactic features comes from another Italo-Romance variety spoken in Calvello (Basilicata), where a masculine noun surfaces with an initial geminate consonant when it has a mass interpretation and is preceded by the definite article; otherwise, it surfaces with a singleton consonant. The data in (13), from Gioscio 1985, are analyzed in Mascaró 2016 in terms of a floating timing slot (a C, or alternatively a mora) that constitutes the realization of the morphosyntactic feature [MASS].
lu panə lu ppanə ‘the bread’
lu fjerrə lu ffjerrə ‘the iron’
There are other problems of an empirical sort. If we had a general analysis of the domains of application of ARF that followed from Modular PIC, we would have a piece of evidence in favor of the theory. But the evidence presented regarding ARF is at best dubious given that it is limited to the verbal forms so and si, a very small subset of the lexical elements triggering ARF. In his analysis of Tollo (15 km from Arielli), Hastings (2001:271–272) lists 37 triggers belonging to different classes: auxiliary forms, complementizers, imperatives, adverbs, prepositions, articles, demonstratives, numerals, negation, and quantifiers. These data are particularly relevant as a test for Modular PIC, given that gemination in ARF, and in central-southern varieties in general, has been described as restricted to a small prosodic domain that does not appear to be amenable to syntactic analysis as a phase complement domain. Indeed, according to Fanciullo (1986:87–88; see also 82–83, 85–86), “lexical elements, which constantly produce reinforcement in the center-south dialects, occupy, in the phrase, a well-defined place: they are not conceivable if they are not connected rather rigidly to the items to which they refer, with which they come to form a minimal phrase . . . —a kind of hierarchically superior word.”14 Before it can be concluded that ARF is more adequately and straightforwardly accounted for by Modular PIC than by alternative analyses, it is necessary to test the theory on all the lexical triggers of raddoppiamento, not only the few cases discussed in D&S 2015.
As we have shown in this section, the empirical evidence that motivates Modular PIC reduces to a small subset of the relevant evidence in Eastern Abruzzese and is amenable to alternative analyses. For Modular PIC to be considered a tenable approach to the interface, it must also be shown that a wide range of the data analyzed within indirect-reference models for the last 30 years is amenable to a Modular PIC analysis. In the next section we argue, with data from Bantu languages and others, that Modular PIC is too restrictive to account for a number of familiar cases.
4 Modular PIC Is Too Restrictive
As we have just demonstrated, Modular PIC is too powerful. In this section we argue that, paradoxically, Modular PIC is also too restrictive. D&S claim that Modular PIC provides sufficient machinery to determine the domain of all phrasal phonological processes, so that, as they put it, “the entire Prosodic Hierarchy is superfluous and has to go” (D&S 2015:618). We show, however, that Modular PIC makes wrong predictions in four different types of cases that receive a straightforward analysis in prosodic constituent theory. We begin with data from some Bantu languages, the focus of D&S’s section 5.
4.1 The Problem of Edges: Data from Zulu and Chicheŵa
To understand the problem for Modular PIC presented by the Bantu languages Zulu (S.30) and Chicheŵa (N.31), cited in D&S 2015, it is necessary to properly understand the core data. (D&S 2015 contains quite a number of factual mistakes; see footnote 15.) In both Zulu and Chicheŵa, vowel length is not contrastive; penultimate vowels are lengthened as a correlate of phrasal stress.15 More specifically, a prosodic phrasal domain boundary (indicated by parentheses) follows the word that has penultimate vowel lengthening. Consider first the phrasing in the simple sentences in (14) and (15), where lengthened penultimate vowels are boldfaced.16
(ú-Síph’ ú-phékél’ ú-Thánd’ in-kúukhu)
CL1-Sipho 1SBJ-cooked.for CL1-Thandi CL9-chicken
‘Sipho cooked chicken for Thandi.’
((bá-ník’ ú-Síph’ í-bhayisékiili) namhláanje)
2SBJ-gave CL1-Sipho CL5-bicycle today
‘They gave Sipho a bicycle today.’
(ma-kóló a-na-pátsíra mwaná ndalámá zá mú-longo wáake)
CL6-parent 6SBJ-TAM-give CL1.child CL10.money 10.of CL1-sister CL1.her
‘The parents gave the child money for her sister.’
(Báanda) ((a-ná-wá-ona a-leéndó) dzuulo)
CL1.Banda 1SBJ-PST2-2OBJ-see CL2-visitor yesterday
‘Banda saw the visitors yesterday.’
A simplified syntactic structure for (14) and (15) is given in (16), where // indicates a Spell-Out domain based on the complement of a phase head, as Modular PIC assumes.17 (IO = indirect object, DO = direct object)
(16) [CP C0 // ([TP Subject T0 verb [vP v0 // ([VP IO DO]]) // Adv] // ])
What (14) and (15) show is that (a) the subject typically phrases with the remainder of the sentence;18 (b) the verb plus following (nonmodified) complements are phrased together in a prosodic phrase; (c) temporal adjuncts phrase separately from the constituents in the verb phrase. From (16), it is clear that basing prosodic domains on Spell-Out domains (i.e., on complements of a phase head) would predict that the verb is phrased separately from the objects. In other words, the prosodic phrases in Zulu and Chicheŵa are typically bigger than phases. It should be noted that even if Spell-Out domains are based on the whole phase (and not just the complement of the phase head), the outcome is still too restrictive; for instance, the verb and the subject would still be expected to be phrased separately from the two objects. (See Cheng and Downing 2016 for detailed discussion.)
Turning to more complex sentences, consider the Zulu relative clause in (17) and its corresponding schematic structure in (18), again with the Spell-Out domains (using complements of a phase head, following D&S) indicated by //, assuming that D0 is also a phase head.
((ín-dod’ é-gqokê ísí-gqooko) í-boné ízi-vakááshi)
CL9-man REL.9SBJ-wear CL7-hat 9SBJ-see CL8-visitor
‘The man who is wearing a hat saw the visitors.’
(18) [[DP D0 // [CP Head N C0 // [TP . . . T0 verb [vP v0 // [VP DO]]]]] T0 verb [vP v0 // [VP DO]]]
As (17) shows, the head noun of the relative clause is phrased together with the whole relative clause, as penultimate vowel lengthening is found only at the right edge of the relative clause (and the right edge of the matrix clause). In contrast, (18) shows that a Spell-Out domain account would split the head noun from the rest of the relative clause, and split the verb from its object, giving the wrong predictions. The relative clause example illustrates again that the prosodic domains defined by penultimate vowel lengthening in Zulu are bigger than the Spell-Out domains.19
(19) a. Prosodic phrasing can be conditioned by phases, and phonology accesses the final output of the syntactic representation (with phase edges).
b. Spell-Out domains do not match prosodic domains.
c. This mismatch holds true regardless of whether a Spell-Out domain corresponds only to the complement of a phase head or to the XP headed by the phase head. (See Cheng and Downing 2016 for detailed discussion.)
From (14)–(18), we see that the standard conception of Spell-Out domains as prosodic domains will parse the simple sentences and relative clauses incorrectly. Essentially, it predicts prosodic boundaries where there are none. What Cheng and Downing (2012b, 2016) argue for, following other work on the phonology-syntax interface like that of Dobashi (2010) and Selkirk (2011), is nonisomorphism between syntactic and prosodic structure.
D&S acknowledge the nonisomorphism and claim that Modular PIC is able to account for the type of nonisomorphism found in Bantu language data like those just cited. Let us now evaluate their claim. The data reviewed above are schematized in (20), where the elements that actually have penultimate vowel lengthening are subscripted with PVL. We indicate the prosodic boundaries expected given the occurrence of penultimate vowel lengthening with a superscript π.
(20) a. [CP C0 [TP Subject T0 verb [vP v0 [VP IO DO_PVL]^π] Adv_PVL]^π]
b. [[DP D0 [CP Head N C0 [TP . . . T0 V [vP v0 [VP DO_PVL]^π]]]] T0 V [vP v0 [VP DO_PVL]^π]]
D&S focus on the absence of lengthening where more conventional phase theory would predict its presence: on the verb in (16)/(20a) and on the head noun (or the relative prefix) in (18)/(20b). To account for these types of Bantu data, D&S propose that within Modular PIC it can be assumed that both C0 and v0 are [−PICpho] for the lengthening process. Then it follows that in (20a) there is no lengthening on the verb because at PF both v0 and its complement are part of the same phonological domain. In (20b), there is no lengthening on the head noun (or the relative prefix) because C0 does not cause phonological chunking either. In other words, given Modular PIC, even if C0 and v0 are subject to Spell-Out in narrow syntax, they are not endowed with a PIC feature at PF, and hence the phonology will not consider them domain-final.
However, D&S’s proposal does not account for the Bantu data, because it fails to consider the presence of lengthening before the adverb in (20a) and the presence of lengthening at the end of the relative clause in (20b). If both C0 and v0 are [−PICpho], neither the direct object in (20a) nor the direct object in (20b) would be expected to be lengthened. To account for the penultimate vowel lengthening of the direct objects and the adverb, D&S could try to say that it is the whole phase (not the complement of the phase head) that matters and that both vP and CP are [+PICpho]. This would account for the presence of lengthening at the end of the vP (before the adverb) in (20a) and also for its presence at the end of the relative clause in (20b). But this would also predict, contrary to fact, that there should be lengthening before the vP in (20a)—that is, on the verb and on the head noun. If one doubts that the verb has raised so high, the same question can be asked about the subject in (20a), or the head noun in (20b). What is clear from the data above, and from (20a–b), is that penultimate vowel lengthening occurs only on words at the right edges of phases.
The problem that we raise here has to do with this edge asymmetry. That is, the Bantu data above show that in the prosodic phrasing in Zulu and Chicheŵa, only the right edge appears to be active, in the sense that only the right edge conditions phonological processes such as penultimate vowel lengthening. Such an edge asymmetry is problematic for the Modular PIC approach (and for other direct-reference approaches). As Cheng and Downing (2016:186) note, “[T]he problem does not just involve a lack of direct mapping between a phase-cycle and a prosodic cycle; rather, there is also an asymmetry between the left and right edges of the phase.” This is because linking any prosodic phrasing directly to phases that are associated with the PIC predicts that the whole phase (or the whole complement of the phase head) is relevant when it comes to the PIC, rather than just its right or left edge. In short, given the data, we cannot conclude that phases form prosodic islands, because it is not the whole phase that functions as an island. Rather, it is only the right (or, less commonly, left) edge of a phase that conditions a prosodic phrase boundary. In the case of Zulu and Chicheŵa, left edges of phases do not play a systematic role in conditioning prosodic boundaries.20 Modular PIC is too restrictive to allow for this asymmetry.
4.2 Prosodic Influence on the Size of Domains
Modular PIC, like other direct-reference theories, is also too restrictive in the sense that it does not allow for prosodic—as opposed to purely morphosyntactic—information to condition prosodic domain formation. In this section, we illustrate the importance of prosodic information with Lekeitio Basque (for more detailed descriptions and for analyses assuming the Prosodic Hierarchy, see Elordieta 1997, 2007a, 2015, and Selkirk 2011, among others); Tokyo Japanese shows a similar pattern (see Kawahara 2015 for a detailed description).
In Lekeitio Basque, a lexical distinction is made between accented words and unaccented words. Phonological domains are determined as follows. Each accented word constitutes a prosodic domain. Unaccented words, however, must be grouped into a single prosodic domain. These domains are called accentual phrases in Elordieta 1997 and φmin in Elordieta 2015. Examples (21a–b), adapted from Elordieta 1997:18, illustrate different prosodic groupings with two DPs that are segmentally homophonous (similar examples can be found in Selkirk 2011). Each prosodic domain is marked with parentheses. Following Elordieta 1997, syllable boundaries are indicated with periods.
(21) a. L% H*L L% H*L
(la.gú.nen) (di.ru.a)
‘the friends’ money’
b. (la.gu.nen di.ru.a)
‘the friend’s money’
Accented words have a falling pitch accent, H*L, on the penultimate syllable; prosodic domains are characterized by an L% boundary tone on the first syllable plus the lexical H*L tone. In (21a), a lexically accented word (a genitive plural noun, lagúnen ‘of the friends’) is followed by an unaccented word (dirua ‘money’). Since lagúnen is accented, it has the H*L pitch accent on its penultimate syllable and constitutes a prosodic domain. Consequently, the rest of the utterance, dirua, is a separate prosodic domain. The unaccented word dirua also appears with an H*L tone, mandatory in all prosodic domains of type φmin; however, unlike with lexically accented words, in this case the contour tone is located on the last syllable, not the penultimate one. In (21b), both words are unaccented, lagunen being genitive singular here. In this case, the two words together constitute a single prosodic domain.21 The domain starts with the L% boundary tone and ends with the H*L on the last syllable.
Example (22) contains several unaccented words. In such cases, there is a single prosodic domain up to the verb, which, together with the auxiliary, constitutes a separate prosodic domain.
(22) L% H*L L% H*L
(nire amen dirua) (galdu dot)
my mother.GEN.SG money.ABS.SG lose have.1SG
‘I have lost my mother’s money.’
As Elordieta (1997, 2007a, 2015) and Selkirk (2011) argue, prosodic domains at different levels in Lekeitio Basque are largely determined by syntactic structure but, crucially, also by the lexical prosodic factors outlined here. Within Modular PIC, for (21a–b) one could tentatively say that a Number head with the feature [plural] (not the singular) is [+PICpho], hence causes the following word to be phrased separately in (21a). But the presence or absence of accent is also an idiosyncratic property of roots: while lagun ‘friend’ (in (21a–b)) and etxe ‘house’ are unaccented and thus do not trigger the formation of a separate domain in the singular, as in (21b), other roots, like léku ‘place’ and átze ‘back’, are accented and trigger the formation of a separate prosodic domain, even in the singular. Modular PIC, with its rejection of the Prosodic Hierarchy and its reliance on heads and phases as the sole chunk-defining device, cannot refer to idiosyncratic phonological properties such as the presence or absence of accent on roots and affixes, hence cannot account for prosodic domains in Lekeitio Basque.
4.3 The Problem of Modifiers
Modular PIC also cannot account for differences in the prosodic phrasing of modified vs. unmodified noun phrases, which challenge D&S’s claim that “[i]f a particular phenomenon suggests that a phase head—say, v—lacks or is endowed with a PIC at PF, the PIC is expected to be lacking (or to be present) in all constructions involving the head and that phenomenon in this particular language” (D&S 2015:617). First, consider data from Kinyambo (a Bantu language spoken in Tanzania). In this language, the process of High tone deletion is conditioned by prosodic domains: a High tone is deleted if followed by a High tone in the following word in the same prosodic phrase. (See Bickmore 1990 for detailed discussion.) In (23), the vowel bearing the High tone that undergoes deletion is underlined. As Bickmore (1990) shows, in cases like (23a) the High tone of the unmodified subject noun is deleted before the verb, providing evidence that the subject is phrased with the verb. In cases like (23b), only the High tone of the subject noun, but not that of the following modifier, undergoes High tone deletion, providing evidence for a phrase break between the subject noun + modifier phrase and the verb.
(23) Kinyambo phrasing
a. /aba-kózi bá-ka-júna / → (abakozi bákajúna)
‘The workers helped.’
b. /aba-kózi bakúru bá-ka-júna / → (abakozi bakúru) (bákajúna)
CL2-workers 2.mature 2SBJ-TAM-help
‘The mature workers helped.’
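The High tone deletion pattern in (23) can be illustrated with a minimal Python sketch. It is only an illustration of the descriptive generalization stated above, not Bickmore's (1990) analysis. It assumes that the prosodic phrasing is already given as input, that a High tone is encoded by an acute accent, and that deletion removes every High tone of the affected word; the function names are hypothetical.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent; assumed here to encode a High tone


def has_high(word):
    """Return True if the word carries at least one High tone (acute accent)."""
    return ACUTE in unicodedata.normalize("NFD", word)


def delete_high(word):
    """Remove every High tone from the word (a simplifying assumption)."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def apply_high_deletion(phrases):
    """phrases: list of prosodic phrases, each a list of underlying word forms.

    A word loses its High tone if the next word in the SAME phrase has a
    High tone in its underlying form. Phrase-final words are never affected.
    """
    output = []
    for phrase in phrases:
        surface = []
        for i, word in enumerate(phrase):
            next_has_high = i + 1 < len(phrase) and has_high(phrase[i + 1])
            if next_has_high:
                surface.append(delete_high(word))
            else:
                surface.append(unicodedata.normalize("NFC", word))
        output.append(surface)
    return output


# (23a): subject and verb in one phrase; (23b): modified subject phrased apart.
print(apply_high_deletion([["abakózi", "bákajúna"]]))
print(apply_high_deletion([["abakózi", "bakúru"], ["bákajúna"]]))
```

Because the rule only looks rightward inside a phrase, the modifier in (23b) keeps its High tone while the head noun loses its own, which is the right-edge effect at issue in the discussion.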
The contrast between (23a) and (23b) clearly shows that D&S’s claim is incorrect, as modified nominal subjects (23b) phrase differently from nonmodified ones (23a). Under Modular PIC, the phenomenon of High tone deletion in (23a) tells us that if D0 is a phase head, it is endowed with [−PICpho], since the subject DP and the verb belong to one prosodic phrase. However, Modular PIC runs into trouble in (23b): though the D0 is [−PICpho], the subject is nonetheless phrased separately from the verb. The only possible way for D&S to ensure that the additional modifier in (23b) yields a prosodic break is to allow the modifier to somehow induce a [+PICpho]. However, this move would again face the “edge asymmetry problem” because then High tone deletion would incorrectly be predicted to be blocked at the left edge of the modifier. As (23b) demonstrates, though, High tone deletion is only blocked after the modifier, not before it. This is the sort of language-internal evidence that falsifies Modular PIC, according to D&S’s own criteria.
The same problem arises in accounting for the distribution of penultimate vowel lengthening in Chicheŵa. As (24a) shows, the indirect object noun phrase anyaní ‘baboon’ is not separated from the direct object nsómba ‘fish’ by a prosodic boundary in a neutral sentence. However, when the indirect object noun phrase is modified, as in (24b), the noun phrase is prosodically parsed separately from the direct object. (Recall that penultimate vowel lengthening (boldfaced) is the correlate of prosodic phrasing.)
(24) a. (a-lendó a-na-dyétsa a-nyaní nsóomba)
CL2-visitor 2SBJ-TAM-feed CL2-baboon CL10-fish
‘The visitors fed the baboons fish.’
b. (a-lendó a-na-dyétsá a-nyaní á-saanu) (nsóomba)
CL2-visitor 2SBJ-TAM-feed CL2-baboon CL2-five CL10-fish
‘The visitors fed five baboons fish.’
As Cheng and Downing (2016) point out, it is quite common in Bantu languages for modified nouns to have different phrasing properties from unmodified ones. Besides the examples cited here, the effect of modifiers on phrasing in Tsonga has received attention, as it forms a central case study in Selkirk’s (2011) handbook chapter. Outside of Bantu languages, works such as D’Imperio et al. 2005, Elordieta, Frota, and Vigário 2005, Ghini 1993, Nespor and Vogel 1986, Prieto 2005, Sandalo and Truckenbrodt 2002, and Selkirk 2000 have shown the effect of nominal modifiers on the phrasing of subject and object nominal phrases in various Romance languages. However, D&S do not take this sort of data into account in testing Modular PIC.
It is, in fact, a challenge for direct-reference approaches to the interface in general to account for such data. For Modular PIC to be able to account for this pattern, a DP must be a phase when it is modified, but not a phase when it is not modified. This makes an account in terms of phases more and more difficult.22 In contrast, as Cheng and Downing (2016) point out, the interaction between syntactic and prosodic factors (like minimality, branchingness of a nominal phrase, or eurhythmy in a prosodic parse) is easy to model in indirect approaches, as one expects prosodic constituent formation to be subject to prosodic well-formedness constraints. (See Bickmore 1990, Downing and Mtenje 2011b, Nespor and Vogel 1986, Prieto 2005, and Selkirk 2011 for a variety of indirect-reference approaches to this problem.) Finally, it should be noted that in Chicheŵa, the left edge of the DP (either direct or indirect object) boundary is not active in (24a–b), illustrating again the edge asymmetry problem mentioned in the preceding sections.
4.4 The Problem of Chimwiini
D&S’s discussion of “Bantu” focuses on languages where the prosodic phrasing is typically larger than a phase. However, as D&S must be aware from work like Selkirk 1986, cited in their references, one also finds Bantu languages like Chimwiini where prosodic phrasing is smaller than a phase. (See, too, Kisseberth 2005, 2010a,b, 2017, Kisseberth and Abasheikh 1974, 2004, Truckenbrodt 1995, 1999.) The cue to phrasing in Chimwiini is the (potential) occurrence of a long vowel and obligatory accent (marked with an acute accent). As shown in (25), not every word has an accent; only words at the right edge of lexical XPs do so. Vowel length is also not freely distributed on the surface; rather, contrastively long vowels (boldfaced) can surface only in the antepenultimate or penultimate syllable of a prosodic phrase.
(25) Distribution of accent and vowel length in Chimwiini
On the basis of these patterns, Kisseberth (2017 and elsewhere), along with Selkirk (1986) and Truckenbrodt (1995, 1999), has argued that a prosodic phrase break follows every lexical XP; more examples illustrating this point are given in (26).23
(26) Prosodic phrasing in Chimwiini
a. (sultani úyu) ((sulile m-loza mw-aanáwe) mú-ke)
‘This sultan wanted to marry his son (to) a woman.’
b. ((ni-wa-pele w-aaná) maandá)
‘I gave the children bread.’
c. (Hamádi) (((mw-andikilile mw-áana) xáti) ka Núuru)
‘Hamadi wrote for the child a letter to Nuuru.’
As the examples indicate, there is always a phrase break separating the subject and the verb, and there is always a phrase break separating postverbal complements. As Cheng and Downing (2016) (following others) show, these are the breaks predicted by a constraint aligning prosodic phrase edges with lexical XP edges (Kisseberth 2010a,b, Selkirk 2000, Truckenbrodt 1995, 1999).
(27) ALIGNR(XP, Phonological Phrase)
Align the right edge of a lexical XP with the right edge of a Phonological Phrase.
However, these are clearly not the phrase breaks predicted by a classic Spell-Out domain approach to prosodic phrasing, as a Spell-Out domain potentially includes more than one lexical XP (see the examples from Zulu and Chicheŵa above).
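The right-edge alignment requirement can be made concrete with a small Python sketch. It is a simplified, non-recursive illustration rather than an implementation of any of the cited analyses: it assumes that each word is annotated in advance for whether it ends a lexical XP (the annotations below are read off the bracketing in (26) and are our own assumption), and it returns a flat list of phrases, ignoring the recursive structure. The function name is hypothetical.

```python
def phrase_right_edges(words):
    """Group words into flat phrases, closing a phrase after every word
    that ends a lexical XP.

    words: list of (form, ends_lexical_xp) pairs in linear order.
    Left edges of XPs play no role, which is the edge asymmetry.
    """
    phrases, current = [], []
    for form, ends_lexical_xp in words:
        current.append(form)
        if ends_lexical_xp:
            phrases.append(current)
            current = []
    if current:  # trailing material that does not end a lexical XP
        phrases.append(current)
    return phrases


# Assumed annotation of (26b): the verb does not end a lexical XP,
# each object does.
example_26b = [("ni-wa-pele", False), ("w-aaná", True), ("maandá", True)]

# Assumed annotation of (26c).
example_26c = [
    ("Hamádi", True),
    ("mw-andikilile", False),
    ("mw-áana", True),
    ("xáti", True),
    ("ka", False),
    ("Núuru", True),
]

print(phrase_right_edges(example_26b))
print(phrase_right_edges(example_26c))
```

The right edges produced this way coincide with the right brackets in (26b–c); the nesting shown there would require an additional recursive step that this sketch leaves out.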
Even though D&S (p. 597) propose that “phases must be small enough to allow every phonologically relevant stretch of the linear string to be described,” we think that Modular PIC cannot account for the Chimwiini phrasing pattern shown in (26). If D&S assume that every D0 is a phase head (and its NP complement would be the Spell-Out domain) in Chimwiini, we would not be able to explain why the DP sultani úyu ‘this sultan’ would be phrased together: the NP sultani should constitute the Spell-Out domain. Furthermore, we would still have the edge asymmetry problem: mwaanáwe ‘his son’, waaná ‘the children’, and mwáana ‘the child’ in (26a–c) are not separated on their left edge from the verb (but they are separated on their right edge from the other elements in the verb phrase). If we take only C0 and v0 to be phase heads, assuming that the verb has at least moved to v0, we would expect (a) the verb to be phrased separately from the objects, as the latter belong to a separate Spell-Out domain, and (b) the verb and the subject to be phrased together. Both expectations are contrary to the phrasings in (26). Finally, assuming a combination of D0 and v0 to be phase heads also does not help: we would still have the problems pointed out earlier connected to D0 being a phase head, as well as the edge asymmetry problem.
A further problem raised by the Chimwiini data, as Kisseberth (2005, 2010a,b, 2017) makes clear, is that of recursive assignment of final tonal accent. The position of tonal accent in Chimwiini is morphologically determined. Accent is assigned to the final syllable of verbs with a 1st or 2nd person subject prefix,24 and to the penult elsewhere.
(28) Accent assignment in Chimwiini
1st person n-jiilé ‘I ate’ chi-jiilé ‘we ate’
2nd person jiilé ‘you ate’ ni-jiilé ‘you pl. ate’
3rd person jíile ‘she/he ate’ wa-jíile ‘they ate’
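The word-level generalization in (28) can be sketched in Python as follows. This is only an illustration under stated assumptions: the verb is supplied as a list of syllables (the syllabification is ours), the acute accent is placed on the first vowel of the target syllable, morpheme hyphens are omitted from the output, and the relative-clause conjugation is not covered. The function names are hypothetical.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, marking the tonal accent
VOWELS = "aeiou"


def accent_syllable(syllable):
    """Place an acute accent on the first vowel of the syllable."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            accented = syllable[: i + 1] + ACUTE + syllable[i + 1:]
            return unicodedata.normalize("NFC", accented)
    return syllable  # no vowel found; leave the syllable unchanged


def assign_accent(syllables, subject_person):
    """Final accent with a 1st or 2nd person subject, penult accent elsewhere.

    syllables: list of syllable strings for an unaccented verb form.
    subject_person: 1, 2, or 3.
    """
    if subject_person in (1, 2):
        target = len(syllables) - 1
    else:
        target = max(len(syllables) - 2, 0)  # fall back to the only syllable
    result = list(syllables)
    result[target] = accent_syllable(result[target])
    return "".join(result)


print(assign_accent(["jii", "le"], 2))        # 2nd person singular form in (28)
print(assign_accent(["jii", "le"], 3))        # 3rd person singular form in (28)
print(assign_accent(["wa", "jii", "le"], 3))  # 3rd person plural form in (28)
```

The sketch covers only the choice between final and penult accent on the verb itself. The realization of that accent at the right edge of larger phrases is a separate matter, not modeled here.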
While the accent distinction is motivated by morphological properties of the verb, the actual accent need not be realized on the verb. Instead, it is realized on the final vowel of every phonological phrase within the scope of the verb phrase. As Kisseberth (2010a,b, 2017) argues, this pattern of accent assignment is best accounted for by an appeal to recursive phonological phrasing. Final accent assignment then targets every phrase-final vowel of a recursive phonological phrase that includes the trigger verb, as in (29d).
(29) Chimwiini recursive phrasing
a. ‘I ate meat.’
b. (sí) (chi-lele masku mazimá)
we we-slept night whole
‘We slept the whole night.’
c. ((ni-m-lisile mweenziwá) deení)
I-1OBJ-pay.to 1.friend debt
‘I paid my friend the debt.’
d. (((ni-m-tindilile mwaaná) namá) kaa chisú)
I-1OBJ-cut.for 1.child meat with knife
‘I cut for the child meat with a knife.’
The Chimwiini data are problematic for Modular PIC, then, not only in terms of having prosodic phrases that are smaller than Spell-Out domains, but also in terms of recursive processes such as final accent assignment, which would require that the grammar look inside each Spell-Out domain, in contradiction to the PIC. In relying on phases to define prosodic domains, Modular PIC is thus too restrictive to account for the data.
To sum up this section, because Modular PIC puts the focus on phase heads (in a very unrestrictive version of phase theory), it cannot account for phonological phrasing determined by other types of syntactic information, like phase edges, DP edges, or presence vs. absence of branching. Nor can Modular PIC account for phrasing determined, at least in part, by prosodic conditions and prosodic information, like presence of lexical accent, phrase length, or balance between prosodic constituents. (For further discussion of these kinds of issues and additional references, see Kentner and Féry 2013 and D’Imperio et al. 2005, among many others.)
It is uncontroversial that there is some transfer of information from syntax to phonology—phonology does not just compute a string of terminals. Phase theory creates two new types of constituents, the phase and the domain defined by the PIC, for phonology to refer to. It is reasonable to conjecture that either phases or phase complements (or both) are part of what phonology can refer to, either directly or indirectly (modulo prosodic constituent formation). Modular PIC takes a strong direct-reference stance: in a modified phase theory, phase complements are the only domains relevant in phonology, and prosodic constituents are superfluous. In this response to D&S 2015, we have demonstrated that Modular PIC does not provide a convincing alternative to prosodic-constituent-based theories of the interface, as it is both too powerful and not powerful enough. We have shown that D&S’s proposed analysis of raddoppiamento fonosintattico in Eastern Abruzzese does not justify the loss of restrictiveness Modular PIC brings to phase theory. Moreover, we have shown that Modular PIC is too restrictive to account for phenomena, from Bantu languages and others, that have received satisfactory analyses within interface theories that appeal to prosodic constituents. We conclude that Modular PIC does not successfully replace prosodic constituent approaches to the interface. Phase domains cannot constitute direct-access domains for phonology, and the PIC should retain its original, restrictive formulation (Chomsky 2001, and subsequent literature).
1 In Chomsky 2001, it is actually not very clear whether the domain of H or the phase itself is subject to Spell-Out. D&S adopt the version according to which Spell-Out affects the domain of H.
4 The diacritic nature of this parameterization is also noted by Manzini and Savoia (2016:231): “We note that ±phase or ±PIC are not lexical parameters, since they involve not bona fide properties of lexical items, but rather encode derivational instructions. In general, while the terminology of Chomsky (2001, 2007) is maintained, it is partially voided of its actual content.”
5 It is difficult to see how a phase head that is [−PICsyn, −PICpho] is different from a head that is not in the phase skeleton.
6 An anonymous reviewer points out that, because of the lack of restrictions on the possible combinations of [±PICsyn] and [±PICpho], the prosodic domains defined by [+PICpho] are only loosely related to syntactic phases. We agree, but we do not elaborate further on this point for reasons of space.
7 One could instead suppose that it is each phonological process that is specified for the [±PICpho] character of each phase head. But since the same type of specifications must be made for the syntax, it makes more sense to encode all specifications related to the PIC on phase heads.
8 D&S do not give examples of so, si in copular sentences; so vvikkjə is taken from Hastings 2001:239.
9 Most forms of the auxiliary (so, si, a, semə, setə, a) derive historically from BE (Latin sum: so, si, semə, setə) and one derives from HAVE (Latin habeo: a).
10 In the pluperfect, the auxiliary is not a past form of BE or HAVE. Diachronically, it derives from a sequence containing both Latin sum and habeo: that is, so ’vé 1SG, si ’vé 2SG, a ’vé 3SG/PL, with person marked on the first element, and s’avemə 1PL, s’avetə 2PL, best analyzed as compound forms (D’Alessandro and Ledgeway 2010).
11 To these arguments we should add the fact that in Eastern Abruzzese, and in general in systems with selection by person, it is usually only the indicative present perfect that is affected and not all perfect tenses (unlike in the case of selection by verb type found in Italian and French, for instance).
12 “A priori the latter option [the lexicon contains homophonous so (essere) and so (avere)] is less appealing since it necessitates the postulation of a more complex lexicon, namely one containing two pairs of homophonous auxiliaries, which do not differ in any aspect of their phonological make-up, but nevertheless have different RF-triggering capacities” (Biberauer and D’Alessandro 2006:90).
13 The presence of syncretism (or homophony) across some verbs is not uncommon. In Catalan, for instance, the verb anar ‘to go’ is used as an auxiliary, with the preposition a, with a future meaning, as in English. But identical forms from the present tense of the same verb, in all persons but first and second plural, are used as an auxiliary to form the past tense. So, vaig means ‘I go’, vaig a fer means ‘I’m going to do’, and vaig fer means ‘I did’ (but in 1st plural: anem ‘we go’ and anem a fer ‘we are going to do’ vs. vam fer ‘we did’). It is unlikely that the same lexical item can be involved in the realization of both future and past.
15 D&S attribute to Kanerva (1990) the claim that “[i]n Bantu, the right edge of phonologically relevant domains is generally marked by penultimate vowel lengthening” (p. 615). They use the term Bantu as if there were only one Bantu language, or as if Bantu languages were identical with respect to the phenomena discussed. There are between 300 and 600 Bantu languages (Nurse 2006), and detailed analyses of syntactically defined prosodic domains are available for only a small number of them. It is therefore inaccurate and misleading of D&S (p. 616) to suggest that “the patterns are essentially identical” in Bantu in general. On the contrary, one finds a great deal of variety in the phonological cues to prosodic domains (see our discussion of Kinyambo and Chimwiini in sections 4.3 and 4.4, for example), in the syntactic properties of the languages, and in the typical size of the prosodic domains. One need look no further than the languages and references cited in D&S 2015 to determine how misleadingly it represents “Bantu” prosodic patterns. See Cheng and Downing 2007, 2009, 2012a,b, 2016, Downing 2010, 2011, and Downing and Mtenje 2011a,b for more detailed discussion of the Zulu and Chicheŵa patterns surveyed here.
16 The phrasings provided in this section are taken over from the sources indicated. When the difference between recursive and nonrecursive phrasing is relevant, it is discussed in the text.
20 The left edge seems to play a role when we are dealing with left-dislocated topics or adjuncts such as nonrestrictive relative clauses. See Cheng and Downing 2009 for discussion.
21 H tones spread to the left up to a specified tone, and in (21a), the second L% is downstepped. These details are omitted for clarity.
22 Note that approaches with dynamic phases, such as Bošković’s (2014), will also not be able to account for these facts. For Bošković’s approach, as for Modular PIC, the difficult fact to account for is the lack of a prosodic boundary in unmodified noun phrases.
24 Accent is also assigned to the verb in a relative clause (which has a special conjugation).
This reply originated from two separate replies, one by Bonet and Mascaró, the other by Cheng and Downing. We thank Jay Keyser for suggesting that we combine them. We thank Peter Svenonius, Antonio Fábregas, and the audiences at the MIT Phonology Circle (May 2015), the Manchester Phonology Meeting 2015, Seminari de Lingüística Teòrica of the CLT, UAB (April 2016), and the 40th GLOW workshop Syntax-Phonology Interface (March 2017) for comments and discussion. Bonet and Mascaró acknowledge support from AEI/FEDER, EU (project FFI2016-76245-C3-1-P) and Generalitat de Catalunya (2017 SGR 634). We would also like to thank the reviewers, who provided very detailed comments and suggestions.