Does Local Coherence Lead to Targeted Regressions and Illusions of Grammaticality?

Local coherence effects arise when the human sentence processor is temporarily misled by a locally grammatical but globally ungrammatical analysis (The coach smiled at the player tossed a frisbee by the opposing team). It has been suggested that such effects occur either because sentence processing occurs in a bottom-up, self-organized manner rather than under constant grammatical supervision, or because local coherence can disrupt processing due to readers maintaining uncertainty about previous input. We report the results of an eye-tracking study in which subjects read German grammatical and ungrammatical sentences that either contained a locally coherent substring or not and gave binary grammaticality judgments. In our data, local coherence affected on-line processing immediately at the point of the manipulation. There was, however, no indication that local coherence led to illusions of grammaticality (a prediction of self-organization), and only weak, inconclusive support for local coherence leading to targeted regressions to critical context words (a prediction of the uncertain-input approach). We discuss implications for self-organized and noisy-channel models of local coherence.


INTRODUCTION
Ever since their discovery by , local coherence effects have posed an interesting conundrum for standard theories of human sentence processing. The sentence in (1a) has only one correct parse, in which the noun phrase the player is modified by a reduced relative clause and the agent of the verb toss remains unrealized. The exact same parse is available for sentence (1b), where the relative clause is unreduced, and for (1c), which replaces the passive participle with that of a semantically similar verb, throw.
(1) a. The coach smiled at the player tossed a frisbee by the opposing team.
b. The coach smiled at the player who was tossed a frisbee by the opposing team. c. The coach smiled at the player thrown a frisbee by the opposing team.
In a self-paced reading experiment  found that the words tossed/thrown a frisbee took longer to process in (1a) compared to (1b) and (1c). They replicated the effect in a second self-paced reading study and in a grammaticality judgment study, where (1a) was more likely to be rejected than (1b) and (1c). This pattern is consistent with the idea that in (1a), the human sentence processor is led astray due to the substring the player tossed a frisbee being a possible active main clause, even though it appears at a position in the sentence where such an analysis is ruled out by grammar.  argue that such local coherence effects may be best explained by parsing models that relax the principle of grammatical self-consistency, which requires that the parser only considers globally legal analyses at every point during the parse. A class of models that do not assume selfconsistency are self-organized parsing models such as SOPARSE (Tabor & Hutchins, 2004) and Unification Space ( Vosse & Kempen, 2000). In these models, lexical items bring with them small pieces of syntactic structure that may freely combine at any point during processing and attempt to form a globally well-formed tree structure (Smith, 2018;Smith et al., 2018;Villata & Franck, 2020). If multiple locally well-formed attachments compete with each other, processing difficulty arises. Slowed reading in locally coherent structures is naturally accounted for by this kind of model as the underlined local analysis in (1a) competes with the globally correct analysis. If the conflict between the two cannot be resolved, parsing failure may result.
An entirely different explanation of local coherence effects, the noisy-channel model, was proposed by Levy (2008b) and tested by Levy et al. (2009). The model rests on the assumption that readers maintain some uncertainty about the words they have read, and may consider the possibility that they did not read the word at in (1a), but rather the word and or the word as, or that at may have been a typo. The main intuition is that readers may change their assumptions about the identity of previously read words when subsequent input does not easily fit with the current parse. For instance, in (1a), changing at to as only requires substitution of one letter and results in the possibility of a high-probability alternative parse in which tossed is analyzed as an active verb (The coach smiled as the player tossed a frisbee). Levy (2008b) assumes that large changes in the reader's belief about the previous input result in slowed processing and, if possible, in regressive saccades, due to lingering uncertainty about the actual identity of the critical word at. Given that the alternative parse is not available in (1b, 1c), no revision of beliefs is triggered at tossed/thrown, and the difference between conditions is thus accounted for.
Given the assumed link between uncertainty about previous input and overt regressions in the noisy-channel model, data from eye tracking during reading are potentially informative when it comes to dissociating the two proposed explanations. 1 Research on temporarily ambiguous garden-path sentences has yielded some evidence that readers engage in targeted rereading during reanalysis (Frazier & Rayner, 1982;Meseguer et al., 2002), though less systematically than had initially been believed (Mitchell et al., 2008;von der Malsburg & Vasishth, 2011. For local coherence sentences, Levy et al. (2009) found that readers made more regressive eye movements and spent more time rereading earlier parts of the sentence in (1a) compared to (1c) after having first fixated the verb tossed/thrown. In (1a), readers were also more likely to refixate the preposition at than in (1c). This latter finding is consistent with readers trying to resolve the conflict between the player having been attached as the object of smiled at and the locally coherent attachment of the player as the subject of the 1 Note that we do not provide an exhaustive review of theories about local coherence effects that have been proposed in the literature. Other accounts include those of Gibson (2006), Bicknell et al. (2009), Morgan et al. (2010), and Hale (2011). See Paape and Vasishth (2016) for discussion. would-be active verb tossed. Crucially, Levy et al. (2009) did not observe the same effect when the preposition at was replaced with toward. They argue that the asymmetry is due to toward being less easily confused with as or and, thus supporting the Levy (2008b) account over the self-organized parsing account of .
In another set of two eye-tracking studies, Christianson et al. (2017) investigated targeted regressions in locally coherent structures and classic garden-path structures. In one of their experiments, they observed additional rereading of the region containing the preposition for local coherence sentences. However, regressions to precritical regions also occurred in control sentences similar to (1b), so it is not clear whether they were triggered by local coherence per se. Christianson et al. (2017) conclude that participants in their study largely used rereading to confirm their interpretation of the previous input, as opposed to revising their analysis. Furthermore, given that comprehension accuracy was low even when rereading had occurred, Christianson et al. (2017) suggest that participants engaged in superficial or "good enough" processing (e.g., Christianson et al., 2001;Ferreira et al., 2002), possibly adopting syntactic representations of the input. They speculate that "rereading may signal … a cobbling together of several smaller interpretations based on substrings that are not fully integrated into a single global structure" (p. 23). The notion of a "cobbled-together" analysis is compatible with a self-organization mechanism, where partial parses compete with each other. Depending on the implementation, the system may settle in a state in which either two globally incompatible attachments are active at the same time, or in a state in which there is no chain of attachments that spans the entire input (Smith, 2018). However, given that Christianson et al.'s (2017) interpretation of the reading patterns is speculative and based solely on the low comprehension accuracy observed in their study, it is not clear which kinds of representations subjects ended up with, or how they were derived.
In summary, evidence from eye-tracking studies does not decisively favor either the selforganization or the uncertainty-based view of local coherence effects. This is especially true given that regressions occur relatively rarely in the eye-tracking record, constituting only 10-15% of all saccades (Rayner, 2009), and therefore relatively many observations are needed to draw reliable conclusions about targeted regressions in particular. One aim of the current study is to investigate whether the pattern of targeted regressions discovered by Levy et al. (2009) can be replicated with a different kind of local coherence in a different language, namely, German. We deviate from previous studies in that each sentence is followed by a binary grammaticality judgment as opposed to a comprehension question. In theory, such a task may lead to more regressions compared to asking comprehension questions, as formal features of the sentence become more important than meaning-related ones. We also test a prediction derived from self-organized parsing, namely, that local coherence may result in illusions of grammaticality: If a given sentence is globally ungrammatical but strongly locally coherent, the local parse may act as the syntactic and semantic representation of the entire input, leading to incorrect positive grammaticality judgments. Smith (2018) reports simulation results from a self-organized parsing system that support this prediction: Given a specific set of modeling assumptions, the system settled on the local parse in 18 out of 100 trials for the English materials of .
Our study uses sentences such as (2).

OPEN MIND: Discoveries in Cognitive Science
Verb-final word order is mandatory in German embedded clauses headed by the complementizer dass, "that," unless the verb takes a sentential object. The embedded clause contains an adjectival participle form such as enttarnte, "exposed," that is superficially identical to the third-person singular active form of the same verb. This results in a locally coherent subjectverb-object (SVO) structure in which the syntactic object enttarnte Informanten, "exposed informants," could instead be analyzed as a verb-object sequence (see also Konieczny et al., 2009;Paape & Vasishth, 2016). The sentence in (2) is globally ungrammatical, as the number feature the subject noun phrase einer der Spitzel, "one of the snitches," does not match the number feature of the verb warnten, "warned.pl." When subjects read the sentence and judge its grammaticality, they may experience processing difficulty because there are two competing syntactic analyses of the input that are somewhat attractive but are nevertheless both incorrect. The two competing analyses are shown in (3) below. Note that when enttarnte, "exposed," is analyzed as a verb, the final verb warnten, "warned.pl," is left without an attachment site. (3) We now turn to our experimental study.

Participants
Seventy-seven German native speakers were recruited from the local student population at the University of Potsdam. 2 All had normal or corrected-to-normal eyesight. They were paid 15A as compensation for their participation.

Design and Stimuli
We preregistered our design and analysis, without theory-specific predictions, at https://osf.io /ersxn. Our experiment followed a 2 × 2 design adapted from Paape and Vasishth (2016). We manipulated the factors' grammaticality (grammatical versus ungrammatical) and local coherence (locally coherent versus not locally coherent). Grammaticality was manipulated by having the final verb of the embedded clause correctly agree with the subject in number or not. The would-be SOV structure is only locally coherent when the would-be active verb enttarnte, "exposed," agrees in number with the subject, and therefore local coherence can be manipulated via the subject's number feature. Thirty-two experimental items were created according to the scheme shown in (4). Diamonds indicate region-of-interest boundaries in the example stimulus.
one of.the snitches exposed informants warned.pl Ungrammatical, Not locally coherent … einige der Spitzel } enttarnte Informanten } […] } warnte. some of.the snitches exposed informants warned.sg The elided […] in the example was occupied by a prepositional phrase, in this case mit raffinierten Tricks, "with subtle ploys." The prepositional phrase was always three words long and semantically compatible with both the global and the locally coherent analyses ("expose with subtle ploys"/"warn with subtle ploys").

Procedure
After giving informed consent, participants read the stimulus sentences and indicated after each trial whether the sentence had been grammatical or not. The experimental sentences were presented according to a Latin-square design. They were mixed with 96 filler sentences containing different kinds of grammatical violations, including agreement violations, case-marking errors, as well as missing or added words. Overall, half of the sentences in the experiment were ungrammatical. Presentation order was randomized at runtime. Participants were seated at a distance of 60 cm from the presentation screen, which had a resolution of 1920 × 1080 pixels. Sentences were presented in 16 pt black Courier font on white background. Eye movements from the right eye were recorded with an EyeLink 1000 Tower Mount eye-tracking system at a sampling frequency of 1,000 Hz. Recalibration using a 9-point grid was carried out every 15 trials or as needed. Drift correction was performed every six trials. Each trial started with the presentation of a fixation cross at the position of the initial letter of the sentence. After participants fixated the cross, the sentence was presented. Participants indicated that they wanted to continue to the grammaticality judgment by looking in the lower right corner of the screen, then gave their judgment by pressing one of two buttons on a VPixx RESPONSEPixx response box. The experiment started with six practice sentences to familiarize participants with the procedure.

Data Analysis
Due to technical failures, no eye tracking data were collected for two participants. No accuracy data were collected for six participants due to a programming error. Data from one additional participant were removed because the participant failed to reach 50% accuracy on grammaticality judgments for the experimental items. All analyses of eye-tracking measures are thus based on the data from 74 participants, while analyses of question responses are based on data from 68 participants. 3 Fixations and saccades were computed from the raw data using the algorithm of Engbert and Mergenthaler (2006). Fixations shorter than 20 ms were discarded and reading time measures for the subject, object, post-object, and verb regions were computed using the em2 package (Logac ev & Vasishth, 2013) in R (R Core Team, 2019). In an additional step that we had not preregistered, for each reading time measure, data points below 80 ms were also discarded, as 80 ms have been estimated as the minimum amount of time within which linguistic information can influence oculomotor control (Altmann, 2011). Statistical analyses were performed by fitting fully hierarchical linear mixed models using the brms package (Bürkner, 2017). We analyzed log-transformed first-pass reading times, regression-path durations and total reading times, in addition to log-transformed judgment response times. 4 The analysis of judgment response times included an additional shift parameter to account for motor execution time (Rouder, 2005). Fully hierarchical logistic regression models were used to analyze response accuracy and regression probabilities. Both experimental factors were sum-coded, with grammatical and locally coherent coded as 1 and ungrammatical and not locally coherent coded as −1. Normal(0, 10) priors were used for intercepts and Normal(0, 1) priors 5 were used for the slopes across all analyses. LKJ priors (Lewandowski et al., 2009) were used for the correlation matrices, with the ν parameter set to 2, making correlations near ±1 a priori less likely. Four sampling chains with 2,000 iterations each were run for each model. The first 1,000 samples were discarded as warmup. We treat effects as reliable if 95% of the posterior probability are either above or below zero. Code and full posterior plots for all analyses, along with our experimental data and analysis code are available at https://osf.io/f6ntk. Levy et al. (2009), Christianson et al. (2017, and Müller and Konieczny (2019) indicate that processing disruption caused by local coherence should become visible in first-pass reading times and regression-path durations. In our materials, the local coherence begins at the object region, so that the disruption is predicted to occur in this region, either because an error signal is triggered (Levy, 2008b) or because the locally coherent analysis competes with the global one .

Self-Organized Parsing Models
There exists, to our knowledge, no implemented model that spells out the assumed connections between a self-organized parser and the eye movement control system. Accordingly, any experimental outcome in which the eye movement record shows an indication of processing difficulty-longer reading times and/or regressions-in the conditions where self-organized parsing predicts increased competition is, in principle, compatible with the theory. 3 As stated in the preregistration, all participants completed an operation-span test as a measure of working memory capacity prior to the experiment. These data were collected mainly for use in later stages of the project, where we intend to build a computational model that accounts for a variety of sentence-processing phenomena. We had no specific predictions regarding interactions between memory capacity and local coherence, and entering the memory measure into our current analyses yielded no informative interactions with the experimental manipulations, so that we refrain from reporting the results here. 4 First-pass reading time is the sum of all fixation durations from first entering a region from the left prior to leaving it in any direction. Regression-path duration, also called go-past time, is first-pass reading time plus the sum of all fixations on earlier regions before passing the region to the right. Total reading time is the sum of all fixations in a region over the entire trial. 5 We deviated from the preregistered Normal(0, 5) priors here because sampling is faster for more narrow priors and effects are not realistically expected to be larger than 2 log ms.
The predictions of the self-organized model depend on when the system settles on an analysis. If the system does not settle before the entire input has been processed, readers may experience prolonged processing difficulty until the final region, due to competition between the local and global analyses. Alternatively, if the competition is short-lived and the system settles on an analysis already at the object region, no prolonged difficulty is expected. Predictions for the sentence-final region also depend on whether the system has settled on an analysis: If the parser has settled on the locally coherent parse, ungrammaticality of the global analysis presumably should not matter, and processing difficulty should thus not differ between grammatical and ungrammatical sentences. This prediction assumes that the parser can settle on sufficiently well-formed ("harmonious") structures even if not all words in the sentence have been integrated into the structure, as in the implementation of Smith (2018). However, given that the local analysis does not constitute a grammatical analysis of the entire input string, processing difficulty should nevertheless be higher than in grammatical sentences. By contrast, if the system has not settled on an analysis, competition should be greater between the local and global analyses when the global analysis is ungrammatical, because the two analyses will be closer to each other in well-formedness. In grammatical sentences, on the other hand, the globally correct analysis should outcompete the local one more quickly. Under both scenarios, a statistical interaction between local coherence and grammaticality is predicted in the final region.
Regarding the off-line judgments, if readers sometimes settle on the locally coherent analysis in ungrammatical sentences, they may experience illusions of grammaticality, leading to ungrammatical, locally coherent sentences being incorrectly judged as grammatical. Nevertheless, due to the lower harmony of the local coherence structure, there should be fewer positive grammaticality judgments than for grammatical sentences. By contrast, if readers are merely temporarily confused by the local coherence but do not adopt the incorrect parse, there should be no illusions of grammaticality. Konieczny (2005) found that participants took longer to notice ungrammaticality in locally coherent versus not locally coherent sentences, so we expect to find the same pattern in our study. Depending on when readers finish evaluating the sentence's grammaticality, the effect may either appear in the final region of the sentence or in judgment response times.

Noisy-Channel Model
Unlike self-organized parsing models, the noisy-channel model has no way of settling on ungrammatical structures, but is bound by grammatical self-consistency. Thus, the only scenario in which illusions of grammaticality could arise for ungrammatical, locally coherent sentences is one in which participants come to doubt the previous input and then fail to verify their revised belief about previous words' identity through rereading. Essentially, in such trials the participant would come to believe that they are reading a main clause, and ignore the evidence to the contrary. Importantly, illusions of grammaticality should only arise if readers also ignore the sentence-final verb, because its presence is a strong indication that a main clause analysis is not tenable, even in ungrammatical sentences. Given that the noisy-channel model as formulated by Levy (2008b) only accounts for changes to previous but not to future input, it is thus unlikely that illusions of grammaticality should occur at all.
In the noisy-channel model, an error signal is raised when readers experience changes in their belief about previous input. The larger the change in belief, the larger the error signal. However, the strength of the error signal also depends on how easily the veridical input string can be edited into a "near-neighbor" string, such as at to as. The more distant the two strings are, the smaller the change in belief, given that smaller corruptions of the input in the noisy channel are more likely than larger corruptions. The model allows for deletions, insertions and substitutions of characters, and the penalty to the belief change incurred is proportional to the Levensthein distance between the veridical string and the transformed string. 6 For our materials, there is only one set of transformations that we can think of that would license the local parse: Editing out the complementizer, adding a colon 7 and capitalizing the first letter of the subject noun phrase, which turns the subordinate clause into a main clause with SVO word order, as shown in (5).
The edit distance here is twice as large as that assumed for the largest edits in the English materials, such as inserting who between player and tossed. 8 The strength of the error signal also depends on whether enttarnte, "exposed," is much more likely to appear as a finite verb in the edited main clause than as an adjective in the unedited subordinate clause. According to our seach in the TIGER treebank (Brants et al., 2002), the conditional probability of encountering a determiner-less, adjective-modified object noun phrase after a complementizer and a subject noun phrase (NP) is about 1.2%. By comparison, the conditional probability of encountering a finite verb directly after a colon and a subject NP is about 25%. In English, using the numbers obtained by Levy (2008b), the conditional probability of encountering a reduced relative clause after an NP is about 1.5%, compared to a 91% probability of encountering a finite verb after a subject NP. The local parse given the necessary edits is thus about 60 times more likely in the English materials and about 21 times more likely in the German materials. Based on the increased edit distance and the smaller change in conditional probability, the error signal should thus be much weaker in the German materials.
The hallmark prediction of the noisy-channel approach that was tested by Levy et al. (2009) is that readers make targeted regressions to previous words whose identity they have come to doubt. For the investigation of targeted regressions, the em2 package allows the computation of so-called conditional rereading measures, that is, measures that are contingent on a certain region in the sentence having been fixated prior to the rereading of an earlier region. Fixations on the "target" region are registered until the "source" region is passed to the right. We compute conditional rereading probabilities for three theoretically interesting target and source regions in the sentence and make predictions as shown in Table 1. 6 Assuming that all edit operations are equally likely, it is irrelevant for the edit distance measure whether the veridical or the nonveridical sentence is taken as the "source" representation: From the reader's point of view, the perceived sentence may appear like a corruption of an original, intended sentence-this is the idea behind the noisy-channel model-but from the experimenter's perspective, the reader's inferred representation is a corruption of the actual presented sentence. For reasons of expository ease, we present the experimenter's perspective. 7 See Levy (2011) for another application of the noisy-channel model that includes punctuation. 8 Note that the comparison rests on the assumption by Levy (2008b) that edits are character-based. If edits are instead assumed to be word-based (Gibson et al., 2013), the contrast between our materials and the English examples is somewhat less stark.
Prediction I arises straightforwardly under the noisy-channel view, because reading the locally coherent object region may cause readers to check whether they are, in fact, reading a subordinate clause headed by dass, "that," in which case the locally coherent parse is disallowed, in analogy to rereading the problematic word at in the English stimuli. Prediction II arises because encountering a verb at the end of the sentence is again clear evidence that the locally coherent region cannot also contain a verb, which may trigger confirmatory rereading. This prediction presupposes that the error signal triggered by the locally coherent adjective does not always immediately lead to regressions, but that readers may wait until further input is processed. Prediction III arises because encountering a verb with incorrect number marking could trigger uncertainty as to what the number feature of the subject phrase was, triggering confirmatory rereading. Table 2 shows grammaticality judgment accuracy and mean response times by condition. Figure 1 shows reading measures by condition and region of interest.

Judgment Response Times and Accuracy
There was no indication that the experimental manipulations affected judgment response times, but response accuracy was lower for ungrammatical compared to grammatical sentences (Δ = −4%, CrI [credible interval]: [−7%, 0%]).

Subject Region (one/some of the snitches)
Total reading times in the subject region were longer in ungrammatical compared to grammatical sentences (Δ = 105 ms, CrI: [51 ms, 160 ms]). 9 Table 1. Predictions for conditional rereading probabilities derived from the noisy-channel model. > indicates higher probability.

Source
Target Prediction I Object Complementizer Locally coherent > Not locally coherent exposed informants that II End of sentence Object Locally coherent > Not locally coherent warned exposed informants III End of sentence Subject Ungrammatical > Grammatical warned One/some of the snitches 9 There was also some indication that first-pass reading times were increased in locally coherent sentences in the subject region, with the effect being just slightly below our predefined reliability criterion (Δ = 28 ms, CrI: [−6 ms, 61 ms]). The difference may indicate a parafoveal preview effect, but a word-by-word analysis using successive differences contrasts reveals that conditions diverge at einer/einige, "one/some," after which the difference disappears and only reappears at the would-be verb of the local coherence (see Supplementary Materials at https://osf.io /f6ntk). We speculate that the effect may be due to einer, "one," being semantically more specific than einige, "some." Object Region (exposed informants) First-pass reading times in the object region were increased in locally coherent compared to not locally coherent sentences (Δ = 73 ms, CrI: [34 ms, 111 ms]). Regression-path durations in the object region were also longer in locally coherent compared to not locally coherent sentences (Δ = 38 ms, CrI: [−3 ms, 80 ms]).

Post-Object Region ( with subtle ploys)
There was an interaction between grammaticality and local coherence at the post-object region (

Conditional Refixations
In ungrammatical sentences, regression paths starting at the final verb were more likely to include the subject region (Δ = 15%, CrI: [9%, 21%]) but less likely to include the sentence preamble (Δ = −5%, CrI: [−9%, 0%]) compared to grammatical sentences. There was no indication of the other refixation patterns predicted by the noisy-channel approach. 10 However, we also conducted an exploratory analysis of all regression paths starting at or after the object region, coding trials with at least one regression from any postsubject region versus trials with no regressions. In this analysis, there was some weak indication that the 10 Our preregistration also contained a prediction for additional targeted regressions from the object region to the subject region in locally coherent sentences, due to participants checking agreement between the subject and the would-be verb. This prediction was not borne out either.

DISCUSSION
Local coherence led to increased first-pass reading times and regression-path durations in the object region (enttarnte Informanten, "exposed informants"), where the local parse became available, in line with findings by Bicknell et al. (2009), Christianson et al. (2017, and, more recently, by Müller and Konieczny (2019). There was some weak indication of local coherence triggering targeted regressions, as predicted by the noisy-channel model: While there was no indication that the local coherence immediately triggered regressions to the complementizer dass, "that," regression paths starting at any point from the object phrase onward were more likely to include the complementizer by about 3% (CrI: [−1%, 7%]) in the locally coherent conditions. The effect is small and its credible interval includes zero, despite our sample being almost twice as large as that of Levy et al. (2009) and the paradigm being relatively efficient at triggering regressions: Overall, at least one interregion regression was made on 79% of trials for experimental sentences (CrI: [73%, 85%]), and the mean number of refixated regions per trial for these items was 2.3 (CrI: [2, 2.6]). Our paradigm was also, in principle, able to trigger targeted rereading. This is supported by the observation of more refixations of the subject region (15%, CrI: [9%, 21%]) and fewer refixations of the sentence preamble (−5%, CrI: [−9%, 0%]) in ungrammatical sentences. Participants presumably targeted the subject region to check whether the sentence was indeed ungrammatical. The finding is in line with the noisy-channel model, but could also be captured by other models, assuming that there is a strong link between attention and gaze location.
There is a question of whether a 3% increase in refixations of the complementizer provides convincing support for the noisy-channel model. Bayes factors can be used to quantify the evidence in favor of such an effect over the evidence in favor of the null hypothesis, which is that there is no effect of local coherence on refixations of the complementizer. We computed Bayes factors using bridge sampling (Meng & Wong, 1996). Given that vague priors lead to bias in favor of the null model (Lee & Wagenmakers, 2014), we also computed Bayes factors for two narrower priors, as shown in Table 3.
Assuming that previous results by Levy et al. (2009), who observed an increase of about 4% in refixations of the critical preposition, generalize to our German construction, assuming a small effect with a positive sign seems reasonable. On the other hand, if one does not subscribe to the noisy-channel model, other accounts of local coherence effects may predict fixations to be drawn away from the complementizer to other regions of the sentence, such as the beginning of the sentence or the locally coherent region, suggesting a wider prior. As expected, the null hypothesis is strongly favored under a wide prior, with the evidence tipping in favor of the alternative hypothesis under the narrowest prior. Nevertheless, the evidence in favor of the alternative hypothesis is still "weak" or "anecdotal" under this prior (Jeffreys, 1939(Jeffreys, /1998Raftery, 1995). Our findings are thus compatible with the predictions of the noisy-channel model under a broad construal, but cannot be taken to directly support it, given that the analysis was exploratory in nature. Further confirmatory research is clearly needed.
Our findings are also compatible with the notion that rereading is largely confirmatory in local coherence structures (Christianson et al., 2017;Levy et al., 2009). For instance, there is no indication in our data that local coherence resulted in a complete revision of readers' beliefs about the previous input: If readers had sometimes come to believe that they were reading a main clause, the sentence-final verb should have been highly surprising on such trials. Contrary to this possible prediction, there was no main effect of local coherence in any measure for the final region of the sentence; in fact, the numerical tendency for each of the three reading-time measures points in the opposite direction. This is broadly consistent with a version of the model in which readers always act on and eventually overcome uncertainty by "checking" previous input (overtly or covertly) when given the opportunity. 11 There is no indication in our data that the availability of a locally coherent subparse leads to illusions of grammaticality in otherwise ungrammatical sentences. The prediction of selforganized processing models that the local analysis may outcompete the global one in ungrammatical sentences was thus not borne out in our stimuli. The null result is also in line with the findings of Konieczny (2005), who also found no indication of local coherence affecting the accuracy of grammatical error detection in German, albeit using a different type of structure. While we found no effects of the experimental manipulations on judgment response times, the pattern seen in total reading times in the sentence-final region is compatible with Konieczny's (2005) finding that ungrammatical sentences take longer to reject in the presence of local coherence.

Do Grammaticality and Local Coherence Interact? Investigating a Confound
The only robust interaction between the local coherence and grammaticality manipulations appeared across measures in the final region of the sentence, where agreement of the lexical verb with the sentential subject was either correct or not. The effect of local coherence in 11 Note, however, that this interpretation may be at odds with the findings of Gibson et al. (2013), where participants had ample opportunity to reread sentences and still gave interpretations that were inconsistent with the literal input. By contrast, it is not at odds with the findings of Levy (2011), where downstream effects of possible local edits were observed, as the self-paced reading paradigm did not allow readers to overtly regress. ungrammatical sentences is compatible with competition between the ungrammatical global analysis and the locally coherent analysis, as predicted by self-organized sentence processing. On the other hand, the speedup due to local coherence in grammatical sentences is unexpected. Readers may have suppressed the locally coherent parse during first-pass reading of the object region, forming a strong prediction about the upcoming verb's number feature. In grammatical sentences, the prediction is borne out because verb carries the expected number feature, and processing is faster the locally coherent condition. In ungrammatical sentences, the prediction is not borne out and processing is slower in the locally coherent condition due to increased surprisal (Hale, 2001;Levy, 2008a).
However, this speculative explanation is called into question by the fact that our study confounded grammaticality and local coherence with the number feature of the sentence-final verb: The two slowest conditions in total reading times had plural verbs while the two faster conditions had singular verbs. Recoding the experimental factors accordingly, the grammaticality × local coherence interaction turns into a main effect of verb plurality (see following). 12 Plural verbs may be more difficult to process than singular verbs because they are orthographically longer and morphologically more complex, or because plural number is more marked than singular number (e.g., Eberhard, 1997). In order to investigate this possibility, we conducted a web-based self-paced reading experiment with 69 participants that used the same materials (including the same fillers) and the same task as our eye-tracking study, but removed all locally coherent adjectives and replaced them with either definite articles, quantifiers, or unambiguous adjectives. 13 The reading time results for the sentence-final verb are compatible with verb plurality being the source of the interaction: Reading times were slower for ungrammatical sentences (Δ = 174 ms, CrI: [122 ms, 226 ms]) and also for sentences with plural verbs (Δ = 51 ms, CrI: [10 ms, 91 ms]), resulting in a pattern that qualitatively matches that found in the eye-tracking study, as shown in Figure 3.
These additional results cast heavy doubt on the interpretation of the reading patterns observed in the main experiment as being due to an interaction between local coherence and grammaticality. While such an interaction may underlyingly be present, it would need to be tested more thoroughly in a design without the number confound.

Self-Organized Processing and Noisy-Channel Models-The Broader Picture
When interpreting the current findings in relation to the larger theoretical picture, one also needs to keep in mind that the strength of local coherence effects may depend heavily on the type of structure used. Garden-path effects are known to differ in magnitude depending on the kind of ambiguity used (e.g., Sturt et al., 1999), with differences between ambiguity types going beyond what differences in predictability can account for (van Schijndel & Linzen, 2018). In this context, recall that  found more rejections of sentences containing English reduced relative clauses in the presence of local coherence, despite the fact that the sentences' actual grammaticality remained unchanged. It could thus be argued that the self-organized approach would have predicted illusions of ungrammaticality rather than grammaticality for our sentences. Neither illusion was observed in our data. However, unlike our stimuli, which are perfectly natural for German readers, reduced relative clauses are known to be troublesome for English readers (Bever, 1970) and have been argued to be of dubious grammaticality under some conditions (McKoon & Ratcliff, 2003). Given these differences, a self-organized sentence 12 For further reading on how interactions can be recast as main effects, see Vasishth (2018) andWestfall (2015). 13 Further information about the experiment can be found at https://osf.io/f6ntk. processing model with the right parameter settings may be able to account for all the available data from both English and German. As the self-organized approach to sentence processing is still relatively young and implementation details differ between iterations, it remains to be seen whether a consensus model will emerge anytime soon. One key challenge is to explain the data of Levy et al. (2009), which, besides showing some evidence for targeted regressions, also suggest a role of orthographic or phonological similarity to "neighboring" words, as in the case of at versus as. The self-organized model of Kukona et al. (2014), which includes word recognition in addition to parsing and has been used to model fixation patterns in visual world experiments, could conceivably be extended to explain these patterns.
For the noisy-channel model of Levy (2008b), there are also considerable degrees of freedom with regard to the assumed link between error signal strength and eye movements, as well as possible effects of task manipulations on the model's noise parameter, which influences the edit penalties. As our experiment contained ungrammatical sentences, participants may have inferred a higher level of noise in our stimuli compared to the studies of  and Levy et al. (2009) (see also Gibson et al., 2013). Relatedly, a more recent version of the noisychannel account (Futrell et al., 2020) has been argued to account for illusions of grammaticality in complex English sentences, so that evidence for such illusions in locally coherent sentences, had we found any, would not have decisively favored the self-organized approach. However, as the account of Futrell et al. (2020) rests on readers completely forgetting about earlier parts of the sentence, it is unclear whether it can also plausibly account for illusions in short, simple sentences. Overall, collecting more high-quality data from different languages, using different experimental paradigms and different structures, and deriving testable predictions from implemented processing models will help to further disentangle the different existing accounts of local coherence effects.