Abstract

In usage-based linguistic theories, the assumption that high-frequency language strings are mentally represented as unitary chunks has been invoked to account for a wide range of phenomena. However, neurocognitive evidence in support of this assumption is still lacking. In line with Gestalt psychological assumptions, we propose that a language string qualifies as a chunk if the following two conditions are simultaneously satisfied: The perception of the whole string does not involve strong activation of its individual component parts, but the component parts in isolation strongly evoke the whole. Against this background, we explore the relationship between different frequency metrics and the chunk status of derived words (e.g., “government,” “worthless”) in a masked visual priming experiment with two conditions of interest. One condition investigates “whole-to-part” priming (worthless–WORTH), whereas the other one analyzes “part-to-whole” priming (tear–TEARLESS). Both conditions combine mixed-effects regression analyses of lexical decision RTs with a parametric fMRI design. Relative frequency (the frequency of the whole word relative to that of its onset-embedded part) emerges as the only frequency metric to correlate with chunk status in behavioral terms. The fMRI results show that relative frequency modulates activity in regions that have been related to morphological (de)composition or general task performance difficulty (notably left inferior frontal areas) and in regions associated with competition between whole, undecomposed words (right inferior frontal areas). We conclude that relative frequency affects early stages of processing, thereby supporting the usage-based concept of frequency-induced chunks.

INTRODUCTION

In so-called usage-based accounts of language, the assumption that high-frequency expressions like “I don't know” are mentally represented as chunks has been invoked to account for a wide range of phenomena at different levels of linguistic analysis (Bybee, 2010; Diessel, 2007; Goldberg, 2006; Croft & Cruse, 2004; Tomasello, 2003; Ellis, 2002). High-frequency expressions preferentially undergo certain types of language change such as fusion (e.g., “God be with you” > “goodbye”) and reanalysis (e.g., “going to” can be used as a future marker with no sense of motion), while at the same time being distinctively resistant to regularization (e.g., the high-frequency irregular past tense form “went” has been retained, whereas lower-frequency “clomb” has regularized to “climbed”; Bybee, 2006). High-frequency expressions (e.g., “Iwannadoit”) tend to be mastered by children before their component parts (“I,” “want,” “to,” “do,” “it”) can be handled separately (Lieven, Salomo, & Tomasello, 2009). Psycholinguistic studies have demonstrated that high-frequency expressions are processed with greater speed and accuracy than matched low-frequency expressions (for an overview, see Siyanova-Chanturia, 2015), and converging strands of neuropsychological research attest to the fact that novel and ready-made expressions are differentially affected by different types of brain dysfunction or damage (Van Lancker Sidtis, 2012). These phenomena have often been attributed to the purported holistic status of high-frequency expressions (Blumenthal-Dramé, 2012, 2016).

Modeling Linguistic Chunks

In line with Gestalt psychology, which has focused on chunks in the visual modality, usage-based researchers have described chunk status as a continuous property that results from the interplay between complex wholes (or “configurations”) and their component parts (Hay & Baayen, 2005): A configuration is a holistic chunk if it has cognitive precedence over its component parts. The notion of cognitive precedence is typically operationalized in terms of greater ease of access to the whole than to its parts (Pomerantz & Cragin, 2015).

The idea that cognitive precedence is a function of usage frequency directly follows from the usage-based claim that every single string encountered in language use, no matter how complex (e.g., “Iwannadoit” or “governments”), leaves a memory trace in language users' mental network of linguistic knowledge (Abbot-Smith & Tomasello, 2006). Similar strings (e.g., different instances of “government” encountered in language use) are thought to be stored in a superpositional fashion, with autonomous representations for component parts like morphemes (the minimal meaning-bearing units of language such as “govern,” “-ment-,” and “-s”) arising because of partial overlap between different strings (Docherty & Foulkes, 2014).

The stronger the memory trace for a given expression is, the greater is its representational autonomy in the network, that is, the weaker are its connections to other network nodes (Bybee, 2010). As a result, the processing of a high-frequency expression like “I don't know” will involve relatively weak coactivation of nodes representing individual parts of the expression (“I,” “do,” “not,” and “know”) or pertinent grammatical knowledge (e.g., knowledge about negation or word order in declarative sentences). In other words, the expression will be perceived as a fairly self-contained unit that has precedence over its component parts. By contrast, the processing of a lower-frequency string (e.g., “You can't sing”) will involve proportionally stronger support from knowledge represented in connected nodes, which implies greater prominence of the parts.

Challenges and Desiderata

The usage-based prediction that high-frequency complex words such as “harmless” are stored and accessed as units is challenged by a major competing paradigm in theoretical linguistics: distributed morphology. Distributed morphology and related morphological processing accounts, so-called “early decomposition models,” claim that complex words are necessarily derived bottom–up by combining individual morphemes into higher-order structures (Halle & Marantz, 1994). In particular, early decomposition models posit that the first stages of visual word processing involve necessary access to individual morpheme-level constituents, irrespective of whole-word properties such as transparency and frequency (Fruchter & Marantz, 2015; Taft & Nillsen, 2013; Lavric, Elchlepp, & Rastle, 2012; Rastle & Davis, 2008). Thus, advocates of early decomposition assume that the visual recognition of “harmless” requires the combination of [harm] and [less]. Likewise, they assume that the first stages of recognizing pseudocomplex “corner” involve separate access to [corn] and [er], although for “corner,” this route leads to a garden path and has to be subsequently rejected.

The most serious challenge, however, is that the concept of frequency-induced chunk status still awaits comprehensive empirical investigation. Although there have been many psycholinguistic and neurolinguistic studies demonstrating frequency effects on the processing of transparent multimorphemic strings, these studies have not operationalized chunk status and have not explored the extent to which chunk status affects early stages of processing (for general overviews, cf. Blumenthal-Dramé, 2016; Siyanova-Chanturia, 2015; for an overview focusing on complex words, cf. Amenta & Crepaldi, 2012).

Operationalizing Chunk Status

This study aims to test the cognitive plausibility of frequency-related chunk status by bringing together two strands of research that have so far largely proceeded in parallel: psycholinguistic and neurocognitive research on frequency effects in language, and research on holistic Gestalt perception in nonlinguistic domains of cognition (e.g., face perception; Wagemans, 2015). Drawing on firmly established Gestalt psychological principles, it will be tested whether the “chunkedness” (i.e., chunk status, conceptualized as a gradable phenomenon) of transparent bimorphemic words such as “worthless,” “sweetish,” or “speakable” varies as a function of their usage frequency and other potentially relevant variables.

Chunkedness will be operationalized in terms of two chunk features known from the Gestalt psychological literature: weak access to individual component parts (Poljac, de-Wit, & Wagemans, 2012) and completion of “incomplete” percepts (Van Lier & Gerbino, 2015). The first feature is exemplified by so-called Navon figures (Navon, 1977; see Figure 1, left): When confronted with complex hierarchical stimuli such as big letters made up of smaller letters, participants without mental disabilities will not proceed from parts to wholes. Rather, as experiments have shown, the first and only necessary processing stage involves access to the global level (Kimchi, 2015).

Figure 1. 

Illustrations of the principle of global precedence. (Left) When processing a typical Navon letter, people will access the global before the local level. (Right) Under certain conditions, people will complete “incomplete” percepts to match them with a holistic representation.

The second chunk feature that will be capitalized on is perceptual completion, which refers to the fact that, under certain conditions, “missing” parts of percepts are filled in to match a holistic mental representation (see Figure 1, right, where participants perceive an illusory circle). Taken together, these two phenomena provide complementary perspectives on the principle of cognitive precedence, because they illustrate the fact that chunks exhibit asymmetric part–whole relationships: They weakly activate their component parts, whereas their component parts strongly evoke the whole.

For the purpose of our experiment, it will be assumed that analogous kinds of relationships hold between holistically represented sequences of language and their parts. This operationalization implies that a complex word like “government” will be considered to be highly chunked, if its processing involves weak separate activation of “govern” but, at the same time, the processing of “govern” triggers strong preactivation of “government.” The usage-based prediction that chunkedness correlates with usage frequency will be tested on the basis of priming, which explores how exposure to one stimulus affects reactions to a subsequent stimulus, thereby providing insights into the nature of cognitive relationships within pairs of stimuli. More specifically, we will use masked visual priming with lexical decision, which has proven to be a robust method for testing part–whole relationships at a neural and behavioral level in the context of other neurolinguistic and psycholinguistic debates (Feldman, Milin, Cho, Moscoso del Prado Martín, & O'Connor, 2015; Devlin, Jamison, Matthews, & Gonnerman, 2004). Our study will include two experimental conditions, with the first examining “whole-to-part” (WP) priming (e.g., gauntness–GAUNT) and the second using the reverse order of presentation, that is, “part-to-whole” (PW) priming (e.g., pale–PALENESS).

The role of frequency will be examined by subjecting the RTs for each experimental condition to linear mixed-effects regression analysis and the relevant event-related fMRI data to parametric analysis. The rationale behind the use of functional brain imaging is the hypothesis that, in each condition, the processing of more and less chunk-like representations should be associated with different brain activation patterns. If the individual morphemes of more chunk-like representations are more difficult to access, this difference should translate into differential brain activity in WP priming. Likewise, if more chunk-like representations are less difficult to complete from partial input, this should be reflected by differential brain activity in PW priming. More specifically, for both conditions, we expect a frequency-dependent modulation of BOLD activity in cortical areas that have been found to be activated in morphological processing, with the most likely candidates being the bilateral inferior frontal regions (cf. Bozic, Tyler, Su, Wingfield, & Marslen-Wilson, 2013).

This study builds on prior neuroimaging studies on the visual processing of English derivatives in several respects: It exploits a priming paradigm (Bozic, Marslen-Wilson, Stamatakis, Davis, & Tyler, 2007; Gold & Rastle, 2007; Devlin et al., 2004), and it tests correlations between corpus-derived frequency metrics and neural activation (Fruchter & Marantz, 2015; Lewis, Solomyak, & Marantz, 2011; Solomyak & Marantz, 2009).

Unlike many other studies, however, this study is not designed around comparing priming effects between groups of stimuli (e.g., morphologically related vs. formally and/or semantically related, or morphologically related vs. totally unrelated), but around tracking fine-grained frequency-dependent differences within a group of pairs for which priming effects at early stages of processing have long been established: pairs involving transparent bimorphemic derivatives and their bases (defined below; for reviews, cf. Feldman et al., 2015; Amenta & Crepaldi, 2012; Rastle & Davis, 2008).

METHODS AND RESULTS¹

Experimental Design

Participants

Nineteen right-handed native speakers of English with normal or corrected-to-normal vision who reported no neurological or developmental disorders were recruited (9 men and 10 women, mean age = 25.9 years, range = 19–61 years, SD = 9.9 years). None of the participants presented contraindications to fMRI scanning. Participants were paid for their participation. All participants gave written informed consent after the experimental procedure had been explained to them and all their questions had been answered in a way they considered to be satisfactory. The study was approved by the ethics committee of Freiburg University.

Stimuli

Two priming conditions were designed, with the first condition examining WP priming (e.g., gauntness–GAUNT) and the second investigating PW priming (e.g., pale–PALENESS). Both conditions focus on so-called derivatives (e.g., “kissable,” “worthless,” or “settlement”), which involve the derivation of a longer word (“kissable”) from a shorter word (“kiss”) via the addition of an affix (“-able”). All derivatives considered in the experiment consist of two morphemes (minimal meaning-bearing units of language), with the first morpheme representing a base (i.e., a word that can occur on its own, e.g., “kiss,” “worth,” “settle”) and the second morpheme being a derivational suffix (i.e., a morpheme attached to the right of bases to change their meaning, e.g., “-ment,” “-able,” or “-less”). The stimuli, along with information on their usage frequency, were extracted from the English version of the CELEX lexical database (Baayen, Piepenbrock, & Gulikers, 1995).

To make sure that the words under investigation could be processed at a single glance, only derivatives of two to three syllables and from 6 to 10 letters were chosen. Items with the following suffixes were considered: -able, -age, -al, -ance, -ant, -dom, -en, -er, -ess, -fold, -ful, -hood, -ify, -ish, -ism, -ist, -less, -let, -like, -ly, -ment, -ness, -or, -ous, -ry, and -y. Stimuli exhibiting any kind of peculiarity in meaning, orthography, or pronunciation were eliminated.
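The form-based part of these selection criteria can be sketched in a few lines of Python. This is only an illustration: the suffix inventory is copied from the text, but the function name and the example words are hypothetical, and the sketch checks neither syllable count nor morphological status, so a pseudocomplex item such as “corner” would still pass and would have to be excluded by hand or from database annotations.

```python
# Suffix inventory copied from the text; everything else is illustrative.
SUFFIXES = (
    "able", "age", "al", "ance", "ant", "dom", "en", "er", "ess", "fold",
    "ful", "hood", "ify", "ish", "ism", "ist", "less", "let", "like", "ly",
    "ment", "ness", "or", "ous", "ry", "y",
)


def passes_form_filter(word: str) -> bool:
    """First-pass check: 6 to 10 letters and one of the listed suffixes.

    Syllable count and true morphological structure are NOT checked here.
    """
    w = word.lower()
    return w.isalpha() and 6 <= len(w) <= 10 and w.endswith(SUFFIXES)


# Hypothetical candidate list for demonstration purposes.
candidates = ["worthless", "kissable", "settlement", "gaunt", "ticket"]
selected = [w for w in candidates if passes_form_filter(w)]
print(selected)
```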

Stimuli spanning the entire surface, base, and relative frequency spectrum were selected. The term “surface frequency” refers to the usage frequency of a derivative across the different forms it can take (e.g., the surface frequency for “government” would be the summed frequency of “government,” “governments,” “government's,” and “governments'”). The term “base frequency” refers to how frequently the base of a derivative occurs on its own and in other word forms (e.g., “govern,” “governed,” “governing,” “governs”). “Relative frequency” is a composite measure that results from dividing surface by base frequency (Lewis, Solomyak, & Marantz, 2011; Solomyak & Marantz, 2009; Hay, 2001).

Other variables known to potentially affect complex word processing were also collected for exploratory purposes (derivative–base letter ratio [DBLR; i.e., the number of letters of a derivative divided by the number of letters of its base]; different kinds of neighborhood variables; measures of length in letters, phonemes, and syllables; bigram counts; productivity metrics and morphological family measures). When these variables were not available from the CELEX database, they were extracted from the Hyperspace Analogue to Language corpus via the search engine of the English Lexicon Project (elexicon.wustl.edu/, Balota et al., 2007) or from other publications (Hay & Baayen, 2002). In compliance with common psycholinguistic practice, logarithmically transformed numerical values were used for most predictor variables (using the natural logarithm).
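As a worked illustration of the two composite predictors defined above, the following Python sketch computes log-transformed relative frequency (surface frequency divided by base frequency) and the derivative–base letter ratio. The function names and the frequency counts are hypothetical; only the definitions and the use of the natural logarithm come from the text.

```python
import math


def log_relative_frequency(surface_freq: float, base_freq: float) -> float:
    """Natural log of surface frequency divided by base frequency."""
    if surface_freq <= 0 or base_freq <= 0:
        raise ValueError("frequencies must be positive to take a logarithm")
    return math.log(surface_freq / base_freq)


def derivative_base_letter_ratio(derivative: str, base: str) -> float:
    """Number of letters of the derivative divided by that of its base."""
    return len(derivative) / len(base)


# Hypothetical counts: a derivative that is more frequent than its base
# gets a positive value, one that is less frequent gets a negative value.
print(log_relative_frequency(400, 100))
print(log_relative_frequency(10, 1000))
print(derivative_base_letter_ratio("worthless", "worth"))
```

On this scale, a value of zero means that the derivative and its base are equally frequent.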

In total, 216 stimuli of interest were selected. These stimuli were split up into two equal groups of 108 items matched as much as possible in terms of suffix, base and surface frequency, and length in letters of base and derivative, with one group being assigned to the WP priming condition (e.g., government–GOVERN; mean surface frequency = 366, SD = 761, range = 1–4483; mean base frequency = 2506, SD = 4739, range = 3–35,874; mean derivative length in letters = 7.72, SD = 1.21, range = 6–10; mean base length = 4.89, SD = 1, range = 3–8) and the other group providing the stimuli for the PW priming condition (e.g., settle–SETTLEMENT; mean surface frequency = 287, SD = 902.3, range = 1–7693; mean base frequency = 2187, SD = 3853, range = 1–26,038; mean derivative length = 7.73, SD = 1.17, range = 6–10; mean base length = 4.9, SD = 0.95, range = 3–7). Wilcoxon rank tests revealed no statistically significant differences between the two groups for any of these variables (surface frequency: p = .23; base frequency: p = .38; derivative length: p = .82; base length: p = .83).

In addition, to provide for the “no” answers, two lists with nonword targets were constructed, with the first list containing targets with jumbled letters (e.g., sugary–SGUAR, builder–BLUID) and the second list featuring targets with meaningless base–suffix combinations (e.g., ticket–TICKETMENT, pool–POOLAGE). The derivatives on both lists were extracted from CELEX, approximated letter lengths from the conditions of interest, and were representative of the whole surface frequency range. Moreover, the different suffixes used were proportional to their share in the conditions of interest.

Priming Paradigm

Both conditions use a classical masked priming paradigm exploiting the so-called “sandwich technique,” which involves inserting a very briefly displayed prime between a forward mask and a target functioning as a backward mask (Forster & Davis, 1984). At the behavioral level, RTs in priming experiments vary as a function of the degree of mental relatedness between prime and target such that more strongly associated stimuli yield lower RTs. The fMRI correlate of behavioral facilitation is the phenomenon of fMRI adaptation, which refers to increases or decreases in BOLD activity as a function of the degree of prime–target relatedness (for a review, see Segaert, Weber, de Lange, Petersson, & Hagoort, 2013).

The SOA used in this study (60 msec) was somewhat longer than that used in traditional masked priming paradigms (the detailed timing of the experimental trials is provided in Figure 2). At this SOA, primes are not available for report, and results cannot be distorted by conscious strategies. However, certain participants tend to perceive a kind of flash before the target stimulus. To avoid confusion during the experiment, this was anticipated by informing the participants about the presence of masked primes. The flash-perception prediction was confirmed by at least some participants on a debriefing questionnaire.

Figure 2. 

Schematic representation of a priming trial. Each experimental trial lasted 2560 msec. The forward mask, a row of hash marks, was displayed for 500 msec; the lowercase prime, for 60 msec; and the uppercase target, for maximally 600 msec (shorter if participants responded earlier). This sequence was replaced with a black screen, which left an additional 1400 msec for responding. The fact that targets disappeared as soon as participants responded was supposed to provide feedback to encourage quick answers. At the same time, the targets were not presented for longer than 600 msec to make sure that neural differences between stimuli triggering low and high RTs could not be attributed to significant differences in visual exposure.

To familiarize participants with the task, they were given an offline practice session of 40 prime–target pairs (20 WP and 20 PW, with half of the targets being nonwords, respectively). These pairs were similar but not identical to the stimuli used in the real experiment. Once in the scanner, participants completed an additional short practice block in an attempt to help them to get accustomed to the scanner and its noise. Accuracy feedback was provided only during the practice sessions.

Participants were instructed to make lexical decisions on the targets by pressing the right or left button of a two-button mouse with the index or middle finger of their right or left hand. They were asked to make their decisions as quickly and accurately as possible. RTs were recorded with tenth-of-a-millisecond accuracy from target onset. Stimulus presentation and data recording were controlled by the Presentation software (www.neurobs.com/) running on a PC. The order of presentation of different stimulus types was pseudorandomized in an event-related design (the same for each participant). The intertrial interval (black screen) varied pseudorandomly between 0 and 2190 msec (mean = 1095 msec) to enable fMRI jittering. The stimuli were displayed in Arial font, size 28, in white letters on a black background.
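The trial timing reported above can be checked with simple arithmetic. The Python sketch below adds up the component durations given for a single trial and combines them with the reported mean intertrial interval; the variable names are illustrative, and the mean onset-to-onset figure is a derived value that the text itself does not state.

```python
# Durations in msec, as reported for a single priming trial.
FORWARD_MASK_MS = 500
PRIME_MS = 60           # equals the prime-target SOA
TARGET_MAX_MS = 600     # targets disappear earlier after a response
RESPONSE_BLANK_MS = 1400

trial_ms = FORWARD_MASK_MS + PRIME_MS + TARGET_MAX_MS + RESPONSE_BLANK_MS

# Jittered intertrial interval: reported range and its midpoint.
ITI_MIN_MS, ITI_MAX_MS = 0, 2190
mean_iti_ms = (ITI_MIN_MS + ITI_MAX_MS) / 2

# Derived value (not stated in the text): average time from the start
# of one trial to the start of the next.
mean_onset_to_onset_ms = trial_ms + mean_iti_ms

print(trial_ms, mean_iti_ms, mean_onset_to_onset_ms)
```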

Imaging Parameters

Imaging was performed on a Siemens (Erlangen, Germany) Tim-Trio 3-T scanner at the Freiburg Brain Imaging Laboratory, Freiburg University Hospital. A standard 12-channel head coil with foam padding to restrict head motion was used. Functional images were acquired in interleaved order using a gradient echo-planar T2*-sensitive sequence with 36 oblique axial slices covering the whole brain (slice thickness = 3 mm, interslice distance = 0 mm, matrix size = 64 × 64 pixels, field of view = 192 × 192 mm², in-plane resolution = 3 × 3 mm², online motion and distortion correction, repetition time = 2.19 sec, echo time = 30 msec, flip angle = 75°, 3 × 3 × 3 mm³ voxel size). Before the functional scans, a structural high-resolution T1-weighted magnetization prepared rapid gradient echo sequence was collected for anatomical localization and spatial processing of the functional data (1 × 1 × 1 mm³ voxel size, repetition time = 2.2 sec, echo time = 2.15 msec, 176 slices in a 256 × 256 pixel matrix).
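The functional acquisition geometry is internally consistent, as the following short Python check shows. The variable names are illustrative, and the slab coverage is a derived figure rather than a value reported in the text.

```python
# Functional acquisition parameters as reported.
FOV_MM = 192
MATRIX = 64
SLICE_THICKNESS_MM = 3
INTERSLICE_GAP_MM = 0
N_SLICES = 36

# In-plane voxel size follows from field of view divided by matrix size.
in_plane_mm = FOV_MM / MATRIX

# Derived value: extent covered along the slice direction.
coverage_mm = N_SLICES * (SLICE_THICKNESS_MM + INTERSLICE_GAP_MM)

print(in_plane_mm, coverage_mm)
```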

Behavioral Analysis and Results

Behavioral Analysis

Our experiment aimed at testing whether chunkedness varies as a function of usage frequency and other psycholinguistic variables known to affect word processing. In line with our operationalization, we assumed that a higher degree of chunkedness goes along with a weaker degree of activation of the component parts when processing the whole and, at the same time, a stronger activation of the whole when processing its parts. Accordingly, only variables coming out as significant RT predictors in both priming directions while at the same time exhibiting mirror-inverted regression slopes were taken to gauge chunk status.
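The selection rule just described can be stated compactly in code. The Python sketch below is a minimal rendering of that rule under stated assumptions: the class and function names are hypothetical, and the slopes and p values in the example are invented for illustration rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Slope and p value of one predictor in one priming condition."""
    slope: float
    p: float


def indexes_chunkedness(wp: Fit, pw: Fit, alpha: float = 0.05) -> bool:
    """True if the predictor is significant in both priming directions
    and its regression slopes have opposite signs."""
    both_significant = wp.p < alpha and pw.p < alpha
    mirror_inverted = wp.slope * pw.slope < 0
    return both_significant and mirror_inverted


# Invented example values, for illustration only.
asymmetric = indexes_chunkedness(Fit(0.012, 0.002), Fit(-0.020, 0.0005))
same_sign = indexes_chunkedness(Fit(-0.015, 0.001), Fit(-0.018, 0.003))
one_sided = indexes_chunkedness(Fit(0.010, 0.200), Fit(-0.020, 0.001))
print(asymmetric, same_sign, one_sided)
```

A predictor that speeds up or slows down responses in both directions, or that matters in only one direction, is therefore not counted as a chunkedness index.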

For each priming condition, the relationship between the dependent variable (log-transformed RTs using the natural logarithm) and each variable mentioned in the Stimuli section (including surface, base, and relative frequency) was examined by means of a separate mixed-effects regression model. Analyses were conducted in R (version 3.1.0) with the function lmer from the “lme4” package and the function summary from the “lmerTest” package (R Core Team, 2014; Baayen, Davidson, & Bates, 2008). Incorrect responses were discarded from the analyses. Subject and target item were included in each model as crossed random variables, and likelihood ratio tests showed that they always contributed significantly to the goodness of fit. Data points whose standardized residuals lay more than 2.5 SDs from zero were treated as outliers and discarded from the models, leading to the removal of 1.29–2.76% of the data points. The assumptions of linearity, homoscedasticity, and normality of residuals and the normality of the random effects were checked on the basis of diagnostic plots. From among the models whose single fixed-effect factor survived the α level of p < .05 in both conditions, those whose coefficients had opposite signs were identified as indexing chunkedness.
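The residual-based trimming step mentioned above can be illustrated with a small Python sketch. It is not the authors' R code: the function name is hypothetical, the data are invented, and the sample standard deviation of the residuals is used here as a stand-in for the residual standard deviation that a fitted mixed-effects model would supply.

```python
import statistics


def keep_mask(observed, fitted, cutoff=2.5):
    """Flag data points whose standardized residual lies within the cutoff.

    Residuals are scaled by their sample SD (an assumption of this sketch)
    and compared with zero, not with their own mean.
    """
    residuals = [o - f for o, f in zip(observed, fitted)]
    sd = statistics.stdev(residuals)
    return [abs(r / sd) <= cutoff for r in residuals]


# Invented log RTs: twenty unremarkable trials and one extreme response.
fitted = [6.4] * 21
observed = [6.5] * 10 + [6.3] * 10 + [8.4]

mask = keep_mask(observed, fitted)
removed_share = 100 * mask.count(False) / len(mask)
print(mask.count(True), round(removed_share, 2))
```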

Behavioral Results

The behavioral analysis left us with two predictors that, according to our operationalization, must be seen as indexing chunkedness: log-transformed relative frequency (LogRelFreq; p = .0014 in WP priming, p < .001 in PW priming) and DBLR (p = .0088 in WP, p = .005 in PW). Only LogRelFreq passed the Bonferroni-corrected significance level for multiple testing (pcorr = .0018). Importantly, the two chunkedness predictors do not explain the same share of variance (the R function collin.fnc, which tests for collinearity, yielded a kappa value of 15.93 for the WP condition and 15.32 for the PW condition). The fact that the LogRelFreq predictor exhibits the asymmetric behavioral effects required by our operationalization is illustrated in Figure 3, produced with the function plotLMER.fnc from the “languageR” package in R version 2.13.0.²
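The Bonferroni comparison reported above can be retraced with the p values given in the text. In the Python sketch below, the PW value for LogRelFreq is entered as its reported upper bound (.001), and the implied number of tests is an inference from the corrected threshold, not a figure stated in the text.

```python
ALPHA = 0.05
P_CORR = 0.0018  # Bonferroni-corrected threshold as reported

# Inference only: a threshold of .0018 at alpha = .05 corresponds to
# roughly 28 tests.
implied_tests = ALPHA / P_CORR

# Reported p values; "p < .001" is represented by its upper bound.
p_values = {
    ("LogRelFreq", "WP"): 0.0014,
    ("LogRelFreq", "PW"): 0.001,
    ("DBLR", "WP"): 0.0088,
    ("DBLR", "PW"): 0.005,
}

survives = {key: p < P_CORR for key, p in p_values.items()}
for (predictor, condition), ok in survives.items():
    print(predictor, condition, "passes" if ok else "fails")
```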

Figure 3. 

Effect size for the single independent variable LogRelFreq in a linear mixed-effects model fitted to log RTs in WP priming (left) and PW priming (right). The solid black lines represent predicted log RT values. The dotted black lines visualize the Markov chain Monte Carlo–based highest posterior density intervals for these values. In WP priming, higher LogRelFreqs go along with higher RTs. By contrast, in PW priming, higher LogRelFreqs correlate with lower RTs.

Imaging Analysis and Results

Imaging Analysis

The major aim of our study was to investigate the relationship between usage frequency and chunkedness. Our behavioral analysis identified LogRelFreq as the most robust and only frequency-related chunkedness predictor; our parametric fMRI analysis therefore tracked potential correlations between this variable and BOLD signal in both conditions of interest (Hauk, Davis, & Pulvermüller, 2008).

The fMRI data were analyzed using SPM (SPM8, www.fil.ion.ucl.ac.uk/spm/) implemented in MATLAB 7.10.0 (R2010a; The MathWorks, Inc., Natick, MA; www.mathworks.com/). Before processing, the first six scans of each session were discarded to equilibrate T1 saturation effects. Slices were corrected for differences in acquisition time by shifting the signal measured in each slice relative to the acquisition of the 18th (middle) slice. The resulting volumes were spatially normalized to the Montreal Neurological Institute (MNI) reference brain by means of normalization parameters estimated by segmenting coregistered T1 anatomical scans (Ashburner & Friston, 2005). To account for interparticipant differences, all normalized images were smoothed using an isotropic 8-mm Gaussian kernel.

At the single-participant level, the two event types of interest were modeled as separate conditions. The prime–target pairs in each condition were coded for LogRelFreq in a parametric modulation adjusting the height of the impulse response marking the target onset. Movement parameters estimated at the realignment stage of preprocessing were included as covariates of no interest. A high-pass filter (cutoff = 128 sec) was applied. The onset of the target stimuli was convolved with the canonical hemodynamic response function. Voxel-wise regression coefficients for each of the conditions were estimated by means of weighted least squares.
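To make the logic of the parametric modulation concrete, the following Python sketch builds two regressors for one condition: a main regressor with unit-height impulses at target onsets and a modulated regressor whose impulse heights follow LogRelFreq. It is a simplified illustration, not SPM's implementation: the double-gamma function is only a rough stand-in for the canonical hemodynamic response function, mean-centring of the modulator is assumed, and the onsets, modulator values, and function names are hypothetical.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(dt, duration=32.0):
    """Rough double-gamma response: early peak minus a late undershoot."""
    n = int(round(duration / dt)) + 1
    return [gamma_pdf(i * dt, 6.0) - gamma_pdf(i * dt, 16.0) / 6.0
            for i in range(n)]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the signal length."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


def build_regressors(onsets_s, modulator, n_scans, tr=2.19, dt=0.1):
    """Return (main, modulated) regressors sampled once per scan."""
    n_bins = int(round(n_scans * tr / dt))
    centre = sum(modulator) / len(modulator)  # mean-centring (assumed)
    main = [0.0] * n_bins
    pmod = [0.0] * n_bins
    for onset, value in zip(onsets_s, modulator):
        idx = int(round(onset / dt))
        if 0 <= idx < n_bins:
            main[idx] += 1.0              # unit impulse at target onset
            pmod[idx] += value - centre   # impulse height tracks modulator
    hrf = double_gamma_hrf(dt)
    step = tr / dt

    def sample(x):
        return [x[int(round(i * step))] for i in range(n_scans)]

    return sample(convolve(main, hrf)), sample(convolve(pmod, hrf))


# Hypothetical onsets (sec) and LogRelFreq values for three trials.
main_reg, pmod_reg = build_regressors([5.0, 15.0, 25.0], [-1.0, 0.0, 1.0], 20)
```

In a model of this kind, the coefficient estimated for the modulated regressor expresses how strongly the response amplitude covaries with LogRelFreq over and above the average response to the condition.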

At the group level, the parametric effects of LogRelFreq for each condition were estimated by entering the parameter estimates from the single-participant models into random effects analyses (one-sample t tests). Differences in effect size were identified by entering the effects of LogRelFreq in each condition into a paired t test and inclusively masking a contrast of interest (negative correlation between LogRelFreq and BOLD in PW priming > positive correlation between LogRelFreq and BOLD in WP priming; threshold, p < .001) with the negative correlation between LogRelFreq and BOLD in PW priming (uncorrected mask threshold, p < .05), which involved areas showing overlap between the two conditions. Only peak voxels surviving the uncorrected statistical threshold of p < .001 within clusters passing a whole-brain corrected threshold of p < .05 will be reported.
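At a single voxel, the random effects analysis described above amounts to a one-sample t test on one parameter estimate per participant. The Python sketch below shows that computation; the function name and the input values are hypothetical, and no actual data from the study are reproduced.

```python
import math
import statistics


def one_sample_t(estimates):
    """t statistic and degrees of freedom for testing a mean against zero."""
    n = len(estimates)
    mean = statistics.fmean(estimates)
    standard_error = statistics.stdev(estimates) / math.sqrt(n)
    return mean / standard_error, n - 1


# Invented per-participant parameter estimates at one voxel.
t_value, df = one_sample_t([1.0, 2.0, 3.0])
print(round(t_value, 3), df)
```

With one estimate from each of the 19 participants, such a test has 18 degrees of freedom.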

Imaging Results

Parametric effects of LogRelFreq in WP priming

Whole-brain analysis revealed several brain regions exhibiting a positive correlation between BOLD signal and LogRelFreq in WP priming (cf. Table 1; Figure 4). In these regions, prime–target pairs like “harmless–HARM” (high LogRelFreq) elicit stronger activation than pairs like “gauntness–GAUNT” (low LogRelFreq). No effects were observed in the opposite direction (i.e., greater activity for lower LogRelFreqs).

Table 1. 

Regions Where BOLD Activation Increases as LogRelFreq Increases in WP Priming

Regions                                       Cluster    Extent  Voxel Level  t     MNI Coordinates
                                              pFWE-corr          pFWE-corr          x y z
L precentral gyrus                            .000       393     .019         7.35  −39 −4 55
  L inferior frontal gyrus (p. opercularis)                      .074         6.64  −36 28
  L Rolandic operculum                                           .237         6.06  −42 16
  Cluster extends into L inferior frontal gyrus (p. triangularis)
L lingual gyrus                               .001       79      .024         7.24  −15 −73 10
  L lingual gyrus                                                .066         6.70  −12 −70
L lingual gyrus                               .024       44      .865         4.88  −12 −91 −5
  L fusiform gyrus                                               .998         4.18  −24 −85 −11
  L fusiform gyrus                                               .998         4.16  −21 −91 −11
R inferior occipital gyrus                    .000       106     .162         6.25  36 −76 −5
  R inferior occipital gyrus                                     .184         6.18  33 −79 −8
  R lingual gyrus                                                .907         4.77  24 −85 −11
L inferior occipital gyrus                    .002       70      .633         5.32  −36 −70 −8
  L inferior occipital gyrus                                     .831         4.96  −42 −76 −5
  L fusiform gyrus                                               .917         4.75  −30 −67 −8
L SMA                                         .000       130     .331         5.90  −6 5 58
  R SMA                                                          .474         5.60  14 52
  R SMA                                                          .742         5.13  11 52
  Cluster extends into R superior frontal gyrus

Statistics are shown thresholded at .001 (peak level, uncorrected), after the whole-brain analysis. All clusters are significant at p < .05 after statistical correction for multiple comparisons. The highest three peaks within a cluster are shown on subsequent lines, with the most significant in boldface. Anatomical locations identified under “cluster extends into” refer to other voxels passing an uncorrected peak-level threshold of p < .001 within the same clusters. L = left; p. = pars; R = right.

Figure 4. 

Regions displaying a positive correlation between BOLD signal and LogRelFreq in WP priming, projected on sections of the canonical MNI single-participant template, shown at a peak-level threshold of p < .001 (uncorrected). Color bars indicate the range of the relevant voxel-level t values. Statistics and MNI coordinates are provided in Table 1.

Parametric effects of LogRelFreq in PW priming

Whole-brain analysis revealed several regions showing a negative linear correlation between BOLD signal and LogRelFreq (cf. Table 2; Figure 5, blue blobs). In these regions, prime–target pairs like “prophet–PROPHETESS” (low LogRelFreq) elicit more neural activity than pairs like “agree–AGREEMENT” (high LogRelFreq). Moreover, in this condition, three clusters showed a positive linear correlation between BOLD signal and LogRelFreq (cf. Table 3; Figure 5, red blobs). In these regions, prime–target pairs like “doubt–DOUBTFUL” (high LogRelFreq) are associated with a stronger BOLD signal than pairs like “soft–SOFTISH” (low LogRelFreq).

Table 2. 

Regions Where BOLD Activation Increases as LogRelFreq Decreases in PW Priming

| Region | Cluster pFWE-corr | Cluster extent | Voxel pFWE-corr | t | MNI coordinates (x y z) |
| --- | --- | --- | --- | --- | --- |
| L inferior frontal gyrus (p. triangularis) | .000 | 1609 | .000 | 11.26 | −45 20 25 |
| L inferior frontal gyrus (p. triangularis) | | | .000 | 11.17 | −48 17 28 |
| L inferior frontal gyrus (p. orbitalis) | | | .000 | 10.77 | −45 23 −8 |
| Cluster extends into L claustrum, L precentral gyrus, L temporal pole, and L middle frontal gyrus | | | | | |
| R claustrum | .000 | 262 | .003 | 8.41 | 30 26 1 |
| R claustrum | | | .003 | 8.32 | 30 29 −5 |
| R insula | | | .382 | 5.74 | 42 17 −2 |
| Cluster extends into R inferior frontal gyrus (p. triangularis) | | | | | |
| R inferior frontal gyrus (p. opercularis) | .001 | 89 | .441 | 5.62 | 54 20 28 |
| R inferior frontal gyrus (p. opercularis) | | | .996 | 4.19 | 42 29 16 |
| L superior frontal gyrus | .000 | 555 | .000 | 9.60 | 0 20 55 |
| R medial frontal gyrus | | | .009 | 7.77 | 20 43 |
| L superior frontal gyrus | | | .016 | 7.44 | −3 29 52 |
| Cluster extends into R cingulate gyrus, L cingulate gyrus, and L medial frontal gyrus | | | | | |
| L thalamus | .001 | 84 | .101 | 6.48 | −9 −16 10 |
| L pallidum | | | .951 | 4.58 | −12 −1 |
| L pallidum | | | .988 | 4.34 | −15 −4 −2 |
| Cluster extends into L caudate nucleus | | | | | |
| R thalamus | .000 | 137 | .132 | 6.35 | 9 −19 7 |
| R caudate nucleus | | | .150 | 6.29 | 15 13 |
| R caudate nucleus | | | .160 | 6.25 | 12 11 |
| R cuneus | .000 | 540 | .069 | 6.68 | 12 −82 13 |
| R lingual gyrus | | | .071 | 6.66 | −70 |
| L lingual gyrus | | | .110 | 6.44 | −9 −73 |
| Cluster extends into L declive, L parahippocampal gyrus, L cuneus, and R cuneus | | | | | |
| L fusiform gyrus | .004 | 67 | .204 | 6.13 | −42 −55 −11 |
| L inferior temporal gyrus | | | .220 | 6.10 | −45 −49 −17 |
| L inferior parietal lobule | .001 | 94 | .261 | 6.01 | −30 −55 43 |
| L superior parietal lobule | | | .614 | 5.31 | −30 −61 46 |
| L middle occipital gyrus | | | .678 | 5.20 | −30 −64 37 |

Statistics are shown thresholded at .001 (peak level, uncorrected), after the whole-brain analysis. All clusters are significant at p < .05 after statistical correction for multiple comparisons. The highest three peaks within a cluster are shown on subsequent lines, with the most significant in boldface. Anatomical locations identified under “cluster extends into” refer to other voxels passing an uncorrected peak-level threshold of p < .001 within the same clusters.

Figure 5. 

Regions displaying negative (blue blobs) and positive (red blobs) correlations between BOLD signal and LogRelFreq in PW priming, projected on sections of the canonical MNI single-participant template, rendered at a peak-level threshold of p < .001 (uncorrected). Color bars indicate the range of the relevant voxel-level t values. Statistics and MNI coordinates are provided in Tables 2 and 3.

Table 3. 

Regions Where BOLD Activation Increases as LogRelFreq Increases in PW Priming

| Region | Cluster pFWE-corr | Cluster extent | Voxel pFWE-corr | t | MNI coordinates (x y z) |
| --- | --- | --- | --- | --- | --- |
| R anterior cingulate | .000 | 150 | .048 | 6.86 | 9 50 1 |
| L precuneus | .000 | 640 | .175 | 6.21 | −6 −64 28 |
| R middle cingulate cortex | | | .183 | 6.19 | −37 49 |
| L precuneus | | | .185 | 6.18 | −12 −64 40 |
| Cluster extends into L middle cingulate cortex, L cuneus, and R precuneus | | | | | |
| R angular gyrus | .000 | 229 | .195 | 6.16 | 57 −55 31 |
| R supramarginal gyrus | | | .255 | 6.02 | 51 −34 34 |
| R supramarginal gyrus | | | .741 | 5.09 | 60 −34 43 |
| Cluster extends into R superior temporal gyrus | | | | | |

Statistics are shown thresholded at .001 (peak level, uncorrected), after the whole-brain analysis. All clusters are significant at p < .05 after statistical correction for multiple comparisons. The highest three peaks within a cluster are shown on subsequent lines, with the most significant in boldface. Anatomical locations identified under “cluster extends into” refer to other voxels passing an uncorrected peak-level threshold of p < .001 within the same clusters.

Differences in effect size between negative parametric LogRelFreq effects in PW priming and positive parametric LogRelFreq effects in WP priming

The analysis revealed two clusters where the negative parametric effects of LogRelFreq in PW priming are larger than the positive parametric effects of LogRelFreq in WP priming, within a mask of negative parametric LogRelFreq effects in PW priming (cf. Table 4, Figure 6).

Table 4. 

Regions Where Negative Parametric LogRelFreq Effects in PW Priming Are Significantly Larger than Positive Parametric LogRelFreq Effects in WP Priming

| Region | Cluster pFWE-corr | Cluster extent | Voxel pFWE-corr | t | MNI coordinates (x y z) |
| --- | --- | --- | --- | --- | --- |
| L inferior frontal gyrus (p. triangularis) | .000 | 506 | .034 | 7.04 | −57 26 7 |
| L claustrum | | | .039 | 6.97 | −30 23 1 |
| L inferior frontal gyrus (p. orbitalis) | | | .083 | 5.58 | −45 20 −8 |
| Cluster extends into L insula, L middle frontal gyrus, and L precentral gyrus | | | | | |
| L superior frontal gyrus | .018 | 48 | .0932 | 4.69 | 0 35 49 |
| L medial frontal gyrus | | | .991 | 4.32 | −9 38 31 |
| R medial frontal gyrus | | | .999 | 4.08 | 44 37 |

Statistics are shown thresholded at .001 (peak level, uncorrected), within a mask of negative parametric LogRelFreq effects in PW priming (uncorrected mask threshold, p < .05). All clusters are significant at p < .05 after statistical correction for multiple comparisons. The highest three peaks within a cluster are shown on subsequent lines, with the most significant in boldface. Anatomical locations identified under “cluster extends into” refer to other voxels passing an uncorrected peak-level threshold of p < .001 within the same clusters.
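The inclusive masking step can be sketched in Python as a simple conjunction: a voxel is retained only if it passes the contrast threshold and also lies within the mask. The p values below are invented for illustration, the function name is hypothetical, and the cluster-level correction applied in the actual analysis is not modeled.

```python
def inclusive_mask(contrast_p, mask_p, contrast_alpha=0.001, mask_alpha=0.05):
    """Return indices of voxels significant in the contrast AND inside the mask."""
    if len(contrast_p) != len(mask_p):
        raise ValueError("p-value maps must cover the same voxels")
    return [
        i
        for i, (p_c, p_m) in enumerate(zip(contrast_p, mask_p))
        if p_c < contrast_alpha and p_m < mask_alpha
    ]


# Hypothetical uncorrected p values for five voxels (illustration only).
p_contrast = [0.0004, 0.0200, 0.0009, 0.0001, 0.0500]  # PW (negative) > WP (positive)
p_mask = [0.0100, 0.0010, 0.2000, 0.0400, 0.0300]      # negative effect in PW priming

print(inclusive_mask(p_contrast, p_mask))
```

In this toy example, the third voxel passes the contrast threshold but is excluded because it falls outside the mask.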

Figure 6. 

Regions where the negative correlation between BOLD and LogRelFreq in PW priming has a stronger effect size than the positive correlation between BOLD and LogRelFreq in WP priming, masked by negative parametric LogRelFreq effects in PW priming (uncorrected mask threshold, p < .05). Activation is projected on sections of the canonical MNI single-participant template, rendered at a peak-level threshold of p < .001 (uncorrected). Color bars indicate the range of the relevant voxel-level t values. Statistics and MNI coordinates are provided in Table 4.

DISCUSSION

Chunkedness as a Relational Phenomenon

The behavioral analyses identified LogRelFreq as the only frequency-related chunkedness predictor. DBLR came out as a further, although less robust, predictor. Both of these predictors are inherently relational, which is consistent with the usage-based tenet that chunk status is a graded and context-dependent phenomenon that arises from the interaction between separate entries interconnected along different dimensions in language users' mental network of linguistic knowledge.

More concretely, our results show that items like “frighten” and “government,” which exhibit a relatively high LogRelFreq, are more chunk-like than items like “goodish” or “wearable,” which have a very low LogRelFreq. In other words, the more frequent a derivative is relative to its base in isolation, the more chunk-like it is perceived to be. The results for DBLR show that derivatives involving proportionally shorter suffixes (“gloomy,” “thoroughly”) are more chunk-like than those with longer suffixes (“tenfold,” “illness”). Longer suffixes might be perceptually more salient and make their bases stand out as perceptual units in their own right.
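The relational nature of this metric can be made concrete with a short Python sketch. It assumes that LogRelFreq is the natural logarithm of the ratio between the frequency of the derivative and the frequency of its base; the log base and any smoothing used in the study may differ. The corpus counts are made up for illustration and the function name is hypothetical.

```python
import math


def log_rel_freq(surface_freq, base_freq):
    """Log ratio of derivative (whole-word) frequency to base frequency.

    Assumption: natural log of the raw ratio; the study's exact
    transformation (log base, smoothing) may differ.
    """
    if surface_freq <= 0 or base_freq <= 0:
        raise ValueError("frequencies must be positive")
    return math.log(surface_freq / base_freq)


# Made-up corpus counts (derivative, base), for illustration only.
toy_counts = {
    "government": (9000, 1500),  # derivative more frequent than "govern"
    "goodish": (2, 40000),       # derivative far rarer than "good"
}

for word, (surface, base) in toy_counts.items():
    print(f"{word}: LogRelFreq = {log_rel_freq(surface, base):+.2f}")
```

Under this definition, a positive value indicates a derivative that is more frequent than its base, and a negative value indicates the reverse.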

Interlexical Competition, Ease of Decomposition, or General Task Performance Difficulty

WP Priming: Positive Correlation

Left inferior frontal activation

The fMRI analysis identified several clusters where neural activity increases as a function of LogRelFreq in WP priming. The largest activation cluster is located in left frontal regions around the left precentral gyrus and extends into the pars opercularis, the Rolandic operculum, and the pars triangularis. Prior neuroimaging studies have related left inferior frontal (LIF) activity to morphological decomposition (Pliatsikas, Wheeldon, Lahiri, & Hansen, 2014; Bozic et al., 2007; Marslen-Wilson & Tyler, 2007; Ullman, 2007). Our results could therefore be taken to indicate that WP priming triggers morphological decomposition and that this computation is not performed with the same ease across all morphologically transparent words. Rather, derivatives of higher LogRelFreq are more difficult to segment than those of lower LogRelFreq (e.g., “worthless” is more difficult to divide into two parts than “tearless”).

A competing view has attributed LIF activity to aspects of monomorphemic word retrieval. Thus, Hauk et al. (2008) found that, in BA 44, lower-frequency words trigger more activation than higher-frequency words, leading to the conclusion that the pars opercularis and surrounding areas underlie the integration of lexical, phonological, and semantic representations in whole-word recognition. Likewise, Buckner, Koutstaal, Schacter, and Rosen (2000) showed that behavioral priming for repeated monomorphemic items is associated with reduced BOLD activation in BA 44, BA 45, BA 47, and BA 6, indicating that these regions subserve the selection and retrieval of lexical knowledge. This suggests that the left frontal activation observed in the present experimental condition might not reflect difficulty of prime decomposition but rather difficulty of retrieving the target (e.g., “worth” vs. “tear”), given a more or less closely connected but undecomposed prime (“worthless” vs. “tearless”). On this account, WP priming triggers competition between derivatives and their onset-embedded part, with “worthless” impeding access to “worth” more than “tearless” does for “tear.”

A third view has related LIF activation to domain-general processes such as maintaining attention, keeping information in STM, or exerting executive control (Fedorenko, Duncan, & Kanwisher, 2012, 2013). The correlation between log RT, LogRelFreq, and BOLD activation was built into the design of this study. As a consequence, our results would also be readily compatible with an account in terms of BOLD activation reflecting domain-general (rather than language-specific) effects of task performance difficulty. However, this does not pose any problem for our interpretation, because our entrenchment operationalization is noncommittal as to whether increased difficulty in relating wholes to parts engages language-specific or domain-general processes.

Further activation clusters

We also observed four posterior activation clusters, located in the left lingual gyrus (two clusters) as well as the right and left inferior occipital gyri. In this experimental condition, increased BOLD activity in visual regions probably results from increased presentation durations for stimuli eliciting higher RTs (i.e., stimuli of higher LogRelFreq; correlation between stimulus duration and RT in this condition: rs = .5818, p < .001). This effect might have been enhanced by our specific selection of nonword targets, which for this task involved words including low-probability bigrams (e.g., SGUAR), thereby potentially drawing participants' attention to the visual features of stimuli.
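For readers unfamiliar with the rs statistic reported here, the following Python sketch computes Spearman's rank correlation as a Pearson correlation on average ranks. The durations and RTs are hypothetical values chosen for illustration, not trial data from this experiment, and the helper names are our own.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with n >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical trial data: stimulus duration (ms) and RT (ms), illustration only.
durations = [610, 580, 720, 650, 800, 700]
rts = [590, 600, 705, 640, 760, 745]
print(f"rs = {spearman_rs(durations, rts):.3f}")
```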

PW Priming: Negative Correlation

LIF activation

The parametric analyses for PW priming identified several clusters where neural activity increases as LogRelFreq decreases. The largest cluster, located in left frontal regions, peaks in the pars triangularis and extends into the pars orbitalis, the claustrum, the precentral gyrus, the temporal pole, and the middle frontal gyrus. Again, LIF activity could be related either to composition processes or to competition processes between holistically represented entries, and the relevant cognitive processes could either be domain-general or language-specific.

On the one hand, higher activity in the pars triangularis has been attributed to greater difficulty of semantic integration (Dominey, Hoen, & Inui, 2006; Hagoort, 2005). Applied to the experiment at hand, this suggests that pairs involving relatively “unchunked” targets might put higher semantic unification demands on the processor, which seems plausible, considering that items like “fourfold” or “wolfish” will presumably not be associated with a unitary, precompiled meaning, especially if they have only rarely been encountered in the past. By contrast, the processing of highly chunked entries such as “equipment” or “frighten” is likely to depend on direct, learned associations between whole forms and prepackaged meanings, making semantic unification unnecessary.

On the other hand, activation in this region has also variously been claimed to play a role in processes related to whole-word processing, in particular, lexical retrieval, semantic processing, and semantic working memory (Ullman, 2007; Ungerleider, Courtney, & Haxby, 1998). An alternative interpretation would therefore be that activity around the pars triangularis does not index difficulty of composition, but difficulty of whole-word retrieval. More specifically, it could be the case that the processing of an item like “wolf” does not trigger any preactivation of lower-frequency whole-word competitors like “wolfish,” making them relatively difficult to retrieve when they function as targets. By contrast, the processing of an item like “fright” might go along with strong preactivation of its higher-frequency competitor “frighten,” thereby facilitating its retrieval when it functions as a target.

Right frontal activation clusters

An explanation in terms of competition between whole-word entries at different levels of granularity (i.e., both morphologically complex and simple) would also be suited to account for the contralateral activation regions observed in the right inferior frontal gyrus. Thus, Bozic, Tyler, Ives, Randall, and Marslen-Wilson (2010) showed that monomorphemic spoken words with onset-embedded competitors (e.g., “claim” [clay]) elicit more activity in bilateral inferior frontal regions (BA 45/BA 47) than words with no competitors. More generally, it has been claimed that holistic access to complex words is accomplished by distributed bilateral systems, whereas left fronto-temporal regions are selectively engaged in decompositional access (Bozic et al., 2013). The fact that our data combine left-hemispheric dominance with significant right-hemispheric activation indicates that, in our task, combinatorial and competition processes might not be mutually exclusive. Rather, primes lacking highly predictable targets might trigger stronger competition effects, and the unpredictable targets themselves might require compositional processes.

Left superior frontal activation

The activation cluster peaking in the left superior frontal gyrus and extending into the bilateral medial frontal and cingulate gyri might reflect enhanced attention or response conflict for less predictable targets (i.e., those associated with lower LogRelFreqs and accordingly higher RTs; Aarts, Roelofs, & Van Turennout, 2009; Heekeren, Marrett, & Ungerleider, 2008).

Subcortical activation clusters

Two further clusters that might be related to competition processes are those around the thalami. As far as lexical processing is concerned, thalamic activity has been claimed to underlie inhibition in cases of strong competition between plausible alternatives (e.g., Marangolo & Piras, 2010) and, more generally, to reflect the difficulty of task demands or attentional processes (for a review, cf. Llano, 2013). In line with this, higher BOLD activation for words with lower-frequency bases has been reported in the thalami (Vannest, Newport, Newman, & Bavelier, 2011). However, thalamic activity has also been related to combinatorial processes. Thus, according to Ullman (2006), one of the channels linking the BG to Broca's area via the thalami projects to the pars opercularis and subserves sequential and combinatorial grammatical skills. In our task, competition and combinatorial activity are not necessarily mutually exclusive: Wholes that are more difficult to predict from their parts may give rise to stronger competitive effects, may require compositional operations, or both.

Posterior activation clusters

As in the opposite priming direction, increased activation in visual regions (here, around the left fusiform gyrus and the right cuneus) is likely to have resulted from increased presentation durations for targets eliciting higher RTs (i.e., stimuli of lower LogRelFreq; correlation between stimulus duration and RT in this condition: rs = .2195, p < .001).

PW Priming: Positive Correlation

Moreover, the fMRI analysis identified three clusters with a positive correlation between BOLD signal and LogRelFreq. One line of evidence suggests that stronger activation in these clusters might actually reflect weaker deactivation of the default mode network, that is, lower task demands. The default mode network, which includes the precuneus, the posterior cingulate cortex, and the medial pFC as well as inferior, lateral, and medial parts of the parietal cortex, is assumed to be active in wakeful resting states and to become deactivated when task-related demands increase (Broyd et al., 2009). Alternatively, activity around the cingulate cortex, the precuneus, and the angular gyri might reflect processes of semantic coactivation (Graves, Desai, Humphries, Seidenberg, & Binder, 2009). Both interpretations are plausible in the context of our experimental condition: Higher relative frequency and stronger chunk status imply stronger semantic associations between parts and wholes and, accordingly, lower cognitive demands in this priming direction.

Effect Size Differences between Negative LogRelFreq Effects in PW Priming and Positive LogRelFreq Effects in WP Priming

Our results show that LogRelFreq has stronger effects in PW priming than in the opposite priming direction and localize this difference to two regions: the LIF gyrus with a peak in pars triangularis and the left superior frontal gyrus. Although, as highlighted above, our experiment cannot adjudicate between competing accounts of LogRelFreq effects in terms of interlexical competition, ease of decomposition, or general task performance difficulty, this difference suggests that the two critical conditions rely on at least partially different cognitive processes that go beyond pure RT effects.

Conclusion

All in all, our study has shown that relative frequency modulates the early stages of visual word processing tapped by masked priming. The finding that information pertaining to the whole modulates early morphological processing stages is at odds with early decomposition models and supports usage-based models, which predict that derivatives like “tearless” (low LogRelFreq) should be perceived as being made up of two highly salient subparts ([tear] + [-less]), whereas derivatives like “worthless” (high LogRelFreq) should exhibit a stronger global bias ([worthless]).

It is crucial to highlight that strong global bias does not entail that the parts receive no activation at all throughout the recognition process. Several psycholinguistic studies focusing on different types of arguably holistic expressions (from idiomatic phrasal verbs as in “pull off a robbery” via complex prepositions like “in the hands of” to idiomatic phrases like “give someone the creeps”) have shown that their processing involves separate access both to their component parts and to structural knowledge (Arnon & Cohen Priva, 2014; Molinaro, Canal, Vespignani, Pesciarelli, & Cacciari, 2013; Boulenger, Shtyrov, & Pulvermüller, 2012; Snider & Arnon, 2012; Konopka & Bock, 2009). The assumption that holistic representations do not totally supersede separate representations for their component parts is consistent with the usage-based claim that transparent high-frequency expressions are cognitively represented in a redundant fashion (i.e., both holistically and compositionally). It is also necessary to accommodate the fact that many idiomatic (and therefore necessarily holistically stored) expressions allow for modification or variation of their parts (e.g., “to spill the beans” → “The beans were finally spilled”; Goldberg & Suttle, 2010; Goldberg, 2006; Croft, 2001, p. 182). We therefore hypothesize that weak base activation in our WP condition reflects weak morphological priming rather than the absence of priming. This prediction will have to be tested by follow-up studies that contrast derivatives exhibiting extremely high relative frequencies against an unrelated baseline and that track the whole time course of the recognition process.

To conclude, our study has demonstrated that the nature of representations accessed at early stages of language processing is affected by the frequency of transparently derived words, with higher relative frequencies leading to stronger global bias. Further studies will be needed to determine the perceptual and memory limits to chunks on the basis of longer language strings and to explore the extent to which our findings generalize to the auditory modality.

Acknowledgments

This paper is based on data also used in a doctoral dissertation published as a monograph (Blumenthal-Dramé, 2012). This research was supported by a grant from the German National Academic Foundation to A.B-D. Mariacristina Musso and Bernd Kortmann contributed equally to this article. We thank Hans-Jörg Mast and Rebecca Sautter for help with data collection and Julia Rochlitz for assistance in the preparation of the manuscript.

Reprint requests should be sent to Alice Blumenthal-Dramé, English Department, University of Freiburg, 79085 Freiburg, Germany, or via e-mail: alice.blumenthal@anglistik.uni-freiburg.de.

Notes

1. 

Because the modulator selected for the parametric fMRI analyses is contingent on the behavioral results, the section Behavioral analysis and results precedes the section Imaging analysis and results. As a consequence, it is impossible to keep the sections Methods and Results separate.

2. 

As expected, the predictor LogRelFreq had a significant positive correlation with the predictor log-transformed surface frequency (LogSurfFreq; rs = 0.7329, p < .001) and a significant negative correlation with the predictor log-transformed base frequency (LogBaseFreq; rs = −0.5592, p < .001). However, LogBaseFreq only emerged as a significant predictor in WP priming (p < .001), whereas LogSurfFreq only did so in PW priming (p < .001; both predictors correlated with reduced RTs). As a result, neither of these predictors could be taken to satisfy our entrenchment operationalization.

REFERENCES

Aarts
,
E.
,
Roelofs
,
A.
, &
Van Turennout
,
M.
(
2009
).
Attentional control of task and response in lateral and medial frontal cortex: Brain activity and reaction time distributions
.
Neuropsychologia
,
47
,
2089
2099
.
Abbot-Smith
,
K.
, &
Tomasello
,
M.
(
2006
).
Exemplar-learning and schematization in a usage-based account of syntactic acquisition
.
Linguistic Review
,
23
,
275
290
.
Amenta
,
S.
, &
Crepaldi
,
D.
(
2012
).
Morphological processing as we know it: An analytical review of morphological effects in visual word identification
.
Frontiers in Psychology
,
3
,
232
.
Arnon
,
I.
, &
Cohen Priva
,
U.
(
2014
).
Time and again: The changing effect of word and multiword frequency on phonetic duration for highly frequent sequences
.
Mental Lexicon
,
9
,
377
400
.
Ashburner
,
J.
, &
Friston
,
K. J.
(
2005
).
Unified segmentation
.
Neuroimage
,
26
,
839
851
.
Baayen
,
R. H.
,
Davidson
,
D. J.
, &
Bates
,
D. M.
(
2008
).
Mixed-effects modeling with crossed random effects for subjects and items
.
Journal of Memory and Language
,
59
,
390
412
.
Baayen
,
R. H.
,
Piepenbrock
,
R.
, &
Gulikers
,
L.
(
1995
).
The CELEX lexical database (release 2) [cd-rom]
.
Philadelphia, PA
:
Linguistic Data Consortium, University of Pennsylvania
.
Balota
,
D. A.
,
Yap
,
M. J.
,
Hutchison
,
K. A.
,
Cortese
,
M. J.
,
Kessler
,
B.
,
Loftis
,
B.
, et al
(
2007
).
The English Lexicon Project
.
Behavior Research Methods
,
39
,
445
459
.
Blumenthal-Dramé
,
A.
(
2012
).
Entrenchment in usage-based theories: What corpus data do and do not reveal about the mind
.
Berlin, Germany
:
de Gruyter Mouton
.
Blumenthal-Dramé
,
A.
(
2016
).
Entrenchment from a psycholinguistic and neurolinguistic perspective
. In
H.-J.
Schmid
(Ed.),
Entrenchment, memory and automaticity. The psychology of linguistic knowledge and language learning
.
Boston
:
APA and Walter de Gruyter
.
Boulenger
,
V.
,
Shtyrov
,
Y.
, &
Pulvermüller
,
F.
(
2012
).
When do you grasp the idea? MEG evidence for instantaneous idiom understanding
.
Neuroimage
,
59
,
3502
3513
.
Bozic
,
M.
,
Marslen-Wilson
,
W. D.
,
Stamatakis
,
E. A.
,
Davis
,
M. H.
, &
Tyler
,
L. K.
(
2007
).
Differentiating morphology, form, and meaning: Neural correlates of morphological complexity
.
Journal of Cognitive Neuroscience
,
19
,
1464
1475
.
Bozic
,
M.
,
Tyler
,
L. K.
,
Ives
,
D. T.
,
Randall
,
B.
, &
Marslen-Wilson
,
W. D.
(
2010
).
Bihemispheric foundations for human speech comprehension
.
Proceedings of the National Academy of Sciences, U.S.A.
,
107
,
17439
17444
.
Bozic
,
M.
,
Tyler
,
L. K.
,
Su
,
L.
,
Wingfield
,
C.
, &
Marslen-Wilson
,
W. D.
(
2013
).
Neurobiological systems for lexical representation and analysis in English
.
Journal of Cognitive Neuroscience
,
25
,
1678
1691
.
Broyd
,
S. J.
,
Demanuele
,
C.
,
Debener
,
S.
,
Helps
,
S. K.
,
James
,
C. J.
, &
Sonuga-Barke
,
E. J. S.
(
2009
).
Default-mode brain dysfunction in mental disorders: A systematic review
.
Neuroscience & Biobehavioral Reviews
,
33
,
279
296
.
Buckner
,
R. L.
,
Koutstaal
,
W.
,
Schacter
,
D. L.
, &
Rosen
,
B. R.
(
2000
).
Functional MRI evidence for a role of frontal and inferior temporal cortex in amodal components of priming
.
Brain
,
123
,
620
640
.
Bybee
,
J.
(
2006
).
From usage to grammar: The mind's response to repetition
.
Language
,
82
,
711
733
.
Bybee
,
J.
(
2010
).
Language, usage and cognition
.
Cambridge, UK
:
Cambridge University Press
.
Croft, W. (2001). Radical construction grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.
Croft, W., & Cruse, D. A. (2004). Cognitive linguistics. Cambridge, UK: Cambridge University Press.
Devlin, J. T., Jamison, H. L., Matthews, P. M., & Gonnerman, L. M. (2004). Morphology and the internal structure of words. Proceedings of the National Academy of Sciences, U.S.A., 101, 14984–14988.
Diessel, H. (2007). Frequency effects in language acquisition, language use, and diachronic change. New Ideas in Psychology, 25, 108–127.
Docherty, G. J., & Foulkes, P. (2014). An evaluation of usage-based approaches to the modelling of sociophonetic variability. Lingua, 142, 42–56.
Dominey, P. F., Hoen, M., & Inui, T. (2006). A neurolinguistic model of grammatical construction processing. Journal of Cognitive Neuroscience, 18, 2088–2107.
Ellis, N. C. (2002). Frequency effects in language processing. Studies in Second Language Acquisition, 24, 143–188.
Fedorenko, E., Duncan, J., & Kanwisher, N. (2012). Language-selective and domain-general regions lie side by side within Broca's area. Current Biology, 22, 2059–2062.
Fedorenko, E., Duncan, J., & Kanwisher, N. (2013). Broad domain generality in focal regions of frontal and parietal cortex. Proceedings of the National Academy of Sciences, U.S.A., 110, 16616–16621.
Feldman, L. B., Milin, P., Cho, K. W., Moscoso del Prado Martín, F., & O'Connor, P. A. (2015). Must analysis of meaning follow analysis of form? A time course analysis. Frontiers in Human Neuroscience, 9, 111.
Forster, K. I., & Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 680–698.
Fruchter, J., & Marantz, A. (2015). Decomposition, lookup, and recombination: MEG evidence for the full decomposition model of complex visual word recognition. Brain and Language, 143, 81–96.
Gold, B. T., & Rastle, K. (2007). Neural correlates of morphological decomposition during visual word recognition. Journal of Cognitive Neuroscience, 19, 1983–1993.
Goldberg, A., & Suttle, L. (2010). Construction grammar. Wiley Interdisciplinary Reviews: Cognitive Science, 1, 468–477.
Goldberg, A. E. (2006). Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.
Graves, W. W., Desai, R., Humphries, C., Seidenberg, M. S., & Binder, J. R. (2009). Neural systems for reading aloud: A multiparametric approach. Cerebral Cortex, 20, 1799–1815.
Hagoort, P. (2005). On Broca, brain, and binding: A new framework. Trends in Cognitive Sciences, 9, 416–423.
Halle, M., & Marantz, A. (1994). Some key features of distributed morphology. MIT Working Papers in Linguistics, 21, 275–288.
Hauk, O., Davis, M. H., & Pulvermüller, F. (2008). Modulation of brain activity by multiple lexical and word form variables in visual word recognition: A parametric fMRI study. Neuroimage, 42, 1185–1195.
Hay, J. (2001). Lexical frequency in morphology: Is everything relative? Linguistics, 39, 1041–1070.
Hay, J., & Baayen, H. (2002). Parsing and productivity. In G. Booij & J. V. Marle (Eds.), Yearbook of morphology 2001 (pp. 203–235). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Hay, J., & Baayen, H. (2005). Shifting paradigms: Gradient structure in morphology. Trends in Cognitive Sciences, 9, 342–348.
Heekeren, H. R., Marrett, S., & Ungerleider, L. G. (2008). The neural systems that mediate human perceptual decision making. Nature Reviews Neuroscience, 9, 467–479.
Kimchi, R. (2015). The perception of hierarchical structure. In J. Wagemans (Ed.), The Oxford handbook of perceptual organization (pp. 129–149). Oxford: Oxford University Press.
Konopka, A., & Bock, K. (2009). Lexical or syntactic control of sentence formulation? Structural generalizations from idiom production. Cognitive Psychology, 58, 68–101.
Lavric, A., Elchlepp, H., & Rastle, K. (2012). Tracking hierarchical processing in morphological decomposition with brain potentials. Journal of Experimental Psychology: Human Perception and Performance, 38, 811–816.
Lewis, G., Solomyak, O., & Marantz, A. (2011). The neural basis of obligatory decomposition of suffixed words. Brain and Language, 118, 118–127.
Lieven, E., Salomo, D., & Tomasello, M. (2009). Two-year-old children's production of multiword utterances: A usage-based analysis. Cognitive Linguistics, 20, 481–507.
Llano, D. A. (2013). Functional imaging of the thalamus in language. Brain and Language, 126, 62–72.
Marangolo, P., & Piras, F. (2010). Language and its interacting components: The right hemisphere hypothesis in derivational morphology. Brain Research, 1320, 114–122.
Marslen-Wilson, W. D., & Tyler, L. K. (2007). Morphology, language and the brain: The decompositional substrate for language comprehension. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 362, 823–836.
Molinaro, N., Canal, P., Vespignani, F., Pesciarelli, F., & Cacciari, C. (2013). Are complex function words processed as semantically empty strings? A reading time and ERP study of collocational complex prepositions. Language and Cognitive Processes, 28, 762–788.
Navon, D. (1977). Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 9, 353–383.
Pliatsikas, C., Wheeldon, L., Lahiri, A., & Hansen, P. C. (2014). Processing of zero-derived words in English: An fMRI investigation. Neuropsychologia, 53, 47–53.
Poljac, E., de-Wit, L., & Wagemans, J. (2012). Perceptual wholes can reduce the conscious accessibility of their parts. Cognition, 123, 308–312.
Pomerantz, J. R., & Cragin, A. I. (2015). Emergent features and feature combination. In J. Wagemans (Ed.), The Oxford handbook of perceptual organization (pp. 88–107). Oxford: Oxford University Press.
R Core Team. (2014). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Rastle, K., & Davis, M. H. (2008). Morphological decomposition based on the analysis of orthography. Language and Cognitive Processes, 23, 942–971.
Segaert, K., Weber, K., de Lange, F. P., Petersson, K. M., & Hagoort, P. (2013). The suppression of repetition enhancement: A review of fMRI studies. Neuropsychologia, 51, 59–66.
Siyanova-Chanturia, A. (2015). On the “holistic” nature of formulaic language. Corpus Linguistics and Linguistic Theory, 11, 285–301.
Snider, N., & Arnon, I. (2012). A unified lexicon and grammar? Compositional and non-compositional phrases in the lexicon. In S. Gries & D. Divjak (Eds.), Frequency effects in language (pp. 127–163). Berlin, Germany: Mouton de Gruyter.
Solomyak, O., & Marantz, A. (2009). Evidence for early morphological decomposition in visual word recognition. Journal of Cognitive Neuroscience, 22, 2042–2057.
Taft, M., & Nillsen, C. (2013). Morphological decomposition and the transposed-letter (TL) position effect. Language and Cognitive Processes, 28, 917–938.
Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Ullman, M. T. (2006). Is Broca's area part of a basal ganglia thalamocortical circuit? Cortex, 42, 480–485.
Ullman, M. T. (2007). The biocognition of the mental lexicon. In M. G. Gaskell & G. Altmann (Eds.), The Oxford handbook of psycholinguistics (pp. 267–286). Oxford: Oxford University Press.
Ungerleider, L. G., Courtney, S. M., & Haxby, J. V. (1998). A neural system for human visual working memory. Proceedings of the National Academy of Sciences, U.S.A., 95, 883–890.
Van Lancker Sidtis, D. (2012). Formulaic language and language disorders. Annual Review of Applied Linguistics, 32, 62–80.
Van Lier, R., & Gerbino, W. (2015). Perceptual completions. In J. Wagemans (Ed.), The Oxford handbook of perceptual organization (pp. 294–320). Oxford: Oxford University Press.
Vannest, J., Newport, E. L., Newman, A. J., & Bavelier, D. (2011). Interplay between morphology and frequency in lexical access: The case of the base frequency effect. Brain Research, 1373, 144–159.
Wagemans, J. (2015). The Oxford handbook of perceptual organization. Oxford: Oxford University Press.