Abstract

The cognitive processing of concepts, that is, abstract general ideas, has mostly been studied with language. However, other domains, such as music, can also convey concepts. Koelsch et al. [Koelsch, S., Kasper, E., Sammler, D., Schulze, K., Gunter, T., & Friederici, A. D. Music, language and meaning: Brain signatures of semantic processing. Nature Neuroscience, 7, 302–307, 2004] showed that 10 sec of music can influence the semantic processing of words. However, the length of the musical excerpts did not allow the authors to study the effect of words on musical targets. In this study, we replicated the findings of Koelsch et al. using 1-sec musical excerpts (Experiment 1). This allowed us to study the reverse influence, namely, of a linguistic context on the conceptual processing of musical excerpts (Experiment 2). In both experiments, we recorded behavioral and electrophysiological responses while participants were presented with 50 related and 50 unrelated pairs (context/target). Experiments 1 and 2 showed a larger N400 component of the event-related brain potentials to targets following a conceptually unrelated compared to a related context. The presence of an N400 effect with musical targets suggests that music may convey concepts. The relevance of these results for the comprehension of music as a structured set of conceptual units and for the domain specificity of the mechanisms underlying N400 effects is discussed.

INTRODUCTION

The cognitive processing of concepts has been widely addressed through research on language processing. A common assumption is that all word representations (e.g., the semantic, syntactic, and phonological properties) are stored in a long-term “semantic” memory within a “mental lexicon,” a “cognitive dictionary” wherein each word entry refers to all its representations (Ju & Luce, 2006; Elman, 2004; Jacobs & Carr, 1995). The term “semantic” is bound to language, whereas the term “concept” is not. Indeed, a concept is an abstract general idea. This abstract quality implies that it may be verbalizable (e.g., word concepts) or not (e.g., aesthetic concepts). Moreover, a concept need not emerge from the language domain but can be generated by other sources/domains, for instance, odors (Castle, Van Toller, & Milligan, 2000; Sarfarazi, Cave, Richardson, Behan, & Sedgwick, 1999), sounds (Griffiths & Warren, 2004; Rocchesso & Fontana, 2003; Blauert, 1996; Ballas, 1993; Gaver, 1993; Hartmann, 1983), and music (Koelsch et al., 2004). Whether the storage and retrieval of concepts are independent of the domain that carries them (e.g., language, music, odors) is still a matter of debate. In this research, we focused on conceptual processing in music as compared to language in order to see whether linguistic and musical concepts are handled by different or partly overlapping cognitive modules.

A full overlap is unlikely given that music might convey different kinds of concepts than language. Indeed, although the concepts of words are understood in relation to an extralinguistic designated space, music is considered mostly self-referential (Boucourechliev, 1993; Kivy, 1991; Jakobson, 1973; Meyer, 1956). The internal sense of music may be conceived as something that goes beyond any objective reference structure and the possibilities of verbal language (Piana, 1991). As stated by Leonard Meyer (1956): “Music means itself. That is, one musical event (be it a tone, a phrase or a whole section) has meaning because it points to and makes us expect another musical event” (p. 35). Note here that the term “meaning” was used where we would prefer “concepts,” because the latter is more abstract and general than the former (Barsalou et al., 1993). Another self-referential process has also been proposed as an additional source of musical concepts. In analogy to language, Jackendoff (1991) pointed to an automatic cognitive process referred to as a “parser,” which would analyze the musical structure. This analysis would help to resolve intrinsic points of instability (or tensions) that are inherent to the music and, thus, generate further concepts (for empirical evidence, see Steinbeis & Koelsch, 2008). A third possible source of musical concepts is emotional perception. Music could be described as a game (i.e., a pursuit of pleasure): The musician plays an instrument, the composer plays with sounds, and the listener plays with his perceptions and emotions (Leipp, 1977). From this point of view, musical concepts may come close to aesthetic pleasure. More generally, there is a strong link between musical concepts and the affective and gestural (bodily) states that humans need to communicate (Tagg, 1999).

Thus, musical concepts can apparently originate from several sources (e.g., self-referential expectation, structural analysis, aesthetic pleasure, and other emotional perceptions) that do not necessarily require verbalization. However, musical concepts can also arise through verbal means. Indeed, other sources of concepts must be taken into account that might involve labeling, that is, associations between the musical material and, for instance, the name of the composer, the musical form or style, as well as the names of objects, people, or other environmental aspects previously perceived and encoded in episodic memory as associated with the musical material. These forms of extramusical reference can be direct but also symbolic. For example, in opera, a direct association can be made between a particular theme and the appearance of the hero (Koelsch et al., 2004; Sloboda, 1985). A symbolic association can arise, for instance, from listening to a national anthem, which may recall the concept of a nation and a sense of unity.

The communication of concepts can be tested experimentally by looking at the effect of a context (e.g., a musical excerpt or a linguistic sentence, for instance, “The beer was too cold to”) on the brain activity elicited during the processing of a target stimulus (e.g., a word, for instance, “drink”). A conceptual context effect of music was first reported by Koelsch et al. (2004). Nonmusicians were presented with excerpts (mean duration = 10.5 sec) as musical contexts, followed 2 sec later by the display of a conceptually related or unrelated target word. In this event-related potential (ERP) study, subjects were asked to decide whether the musical context and the word were related or not by pressing one of two buttons. The ERPs showed a larger negativity to unrelated compared to related target words. The effect was significant between 300 and 500 msec poststimulus onset and was replicated with an indirect (memory) task. Furthermore, using sentences instead of music as contexts, the authors observed a similar ERP effect (i.e., a larger negativity to unrelated compared to related target words). This effect is known as the N400 effect: the modulation of a negative ERP component peaking around 400 msec poststimulus onset with a centro-parietal distribution (Kutas & Hillyard, 1980). Although widely used to investigate semantic processing in language, the N400 has also been used to test conceptual processing with other kinds of stimuli such as pictures (McPherson & Holcomb, 1999; Ganis, Kutas, & Sereno, 1996; Nigam, Hoffman, & Simons, 1992), series of numbers (Fogelson, Loukas, Brown, & Brown, 2004), arithmetic equations (Jost, Hennighausen, & Rösler, 2004), odors (Castle et al., 2000; Sarfarazi et al., 1999), environmental sounds (Orgs, Lange, Dombrowski, & Heil, 2006; Van Petten & Rheinfelder, 1995), and sounds used in musique concrète (Schön, Ystad, Kronland-Martinet, & Besson, submitted). It is noteworthy that despite previous attempts (Besson & Faïta, 1995; Paller, McCarthy, & Wood, 1992; Besson & Macar, 1987), including the recent study of Miranda and Ullman (2007), who found a negativity at around 200 msec to tone violations of familiar melodies, no study to date has reported an N400 effect to musical targets, even though another negative ERP component peaking around 500 msec, the N500, has recently been correlated with structural manipulations of music that are thought to generate musical concepts (Steinbeis & Koelsch, 2008).

Conceptual context effects can also be demonstrated without ERPs by a simple measure of reaction time (Meyer & Schvaneveldt, 1971). In a behavioral follow-up of the Koelsch et al. study, Poulin-Charronnat, Bock, Grieser, Meyer, and Koelsch (2006) presented musical contexts (with a mean duration of 11.7 sec) either related or unrelated to a target word to 50 nonmusicians who were asked to perform a lexical decision task. Reaction times were significantly faster to related compared to unrelated pairs.

Hence, Poulin-Charronnat et al. (2006) showed a behavioral effect of a conceptual context with music, and Koelsch et al. (2004), the corresponding ERP effect. However, these studies could only indirectly support conceptual processing during music perception because their effects were found on words as targets (with musical excerpts as contexts); they did not test musical excerpts as targets. Indeed, because the musical excerpts used in these studies were rather long (around 10 sec), it was very difficult to know precisely when the concept would emerge for each excerpt.

The purpose of the present study was to overcome this limitation by using short musical excerpts (1 sec). With this duration, it becomes possible to analyze ERPs to musical excerpts as targets. This analysis allows us to investigate the first few hundred milliseconds of brain activity following the presentation of a target excerpt related or unrelated to a word context. Because it was of interest to compare conceptual relatedness effects when the musical excerpt was the target (with a word context) and when it was the context (with a word target), we ran two experiments: one was a replication of the effect of a musical context on the processing of linguistic targets (as in Koelsch et al., 2004) but with shorter contexts (Experiment 1), and the other was a study of the effect of a linguistic context on the processing of musical targets (Experiment 2).

We hypothesized, first, that musical excerpts of 1 sec would be enough to influence the processing of a following word and, second, that concepts carried by words would influence the processing of a following musical excerpt. We also hypothesized that, in both cases, unrelated contexts would give rise to a larger N400 to targets compared to related contexts.

METHODS

Participants

Volunteer nonmusicians (i.e., who did not participate in extracurricular music lessons or performance or who did so for less than 10 years) were tested: 16 participants in Experiment 1 (i.e., with musical excerpt contexts and word targets) and 20 other participants in Experiment 2 (i.e., with word contexts and musical excerpt targets). Different groups of subjects were used in order to limit N400 repetition effects (Besson & Kutas, 1993). All were right-handed, neurologically normal, had normal or corrected-to-normal vision, normal audition, and were native French speakers. All participants were paid for their participation in the experiment. Because of artifacts in the ERP data of four participants (2 in Experiment 1 and 2 in Experiment 2), the data of only 14 subjects (age: M = 25.9 ± 1.4 years, education: 16.8 ± 0.5 years, 8 women) were retained for analysis in Experiment 1 and those of 18 subjects (age: M = 23.9 ± 0.9 years, education: 15.9 ± 0.5 years, 15 women) for analysis of Experiment 2.

Stimuli

In order to record ERPs to musical targets, the duration of the excerpts had to be as short as possible. However, the excerpts also had to be sufficiently “meaningful” for the listener to be able to communicate a concept. Using very short stimuli, for instance, 300 msec as did Orgs et al. (2006) with environmental sounds, would have had the advantage of reducing the latency jitter of excerpt recognition, but would have strongly impaired the musicality of the stimuli. Within this tradeoff, excerpts of 1-sec duration were chosen.

The material selection procedure followed four steps. First, a list of 100 French words was constructed. All words were nouns, had one or two syllables, and were highly frequent (frequency of occurrence in oral language above 12.4 per million, estimated with the film-subtitle database of Lexique 3.30; New, Pallier, Ferrand, & Matos, 2001) (e.g., “délire” [madness], “magie” [magic], “courage” [courage]; see other examples in the Appendix). Second, four musical experts (one in each of four musical styles: classical, jazz, pop, and traditional music) were asked to find a highly related 1-sec excerpt for each word (e.g., for the word “magie” [magic], an excerpt of Into a Dream by Pat Metheny, interpreted by Pat Metheny and Jim Hall on guitars, 1998; for the word “délire” [madness], an excerpt of Paraguay, Guarani-Nandeva et Ayoreo, traditional music; sound files and excerpt references for these and other examples can be downloaded at www.incm.cnrs-mrs.fr/pperso/attach/schon/daltrozzo_schon_ex.zip). Thus, each expert received a list of 25 words and found 25 related excerpts. Third, these 100 related pairs were presented to 15 nonmusicians (native French speakers, age: M = 29 years, 8 women). None of these participants were included in the ERP experiments. Participants were asked to rate each pair for relatedness on a 9-point scale. Based on the average score, we retained only the 50 most related pairs for the ERP experiments (i.e., those with a score higher than or equal to 5). This procedure was similar to the one used by Koelsch et al. (2004), who selected context (excerpt)–target (word) pairs on the basis of a context–target relatedness score rated by nonmusicians. The sound sources of the 50 selected excerpts were all from different musical pieces played with different instruments, for instance, violins (excerpt from Beethoven, Violin Concerto, 1st movement), synthetic sounds (excerpt from Benjamin Bex, Morsures), saxophone and percussion (e.g., excerpt from Ferreira & Heinhorn, Clouds), and guitars (e.g., excerpt from Pat Metheny, Into a Dream). Finally, 50 unrelated pairs were built from this material by matching the stimuli in a different order. The 50 related and the 50 unrelated pairs were presented in a pseudorandom order, so that each target appeared once with a related context and once with an unrelated context (order of presentation was counterbalanced across participants).
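The rating-based selection in the third step amounts to a simple top-50 cut on mean scores. A minimal sketch, assuming Python with pandas; the file name and column names ("pair_id", "rating") are hypothetical placeholders:

```python
# A minimal sketch of the pair-selection step, assuming pandas; the file and
# column names ("pair_id", "rating") are hypothetical placeholders.
import pandas as pd

ratings = pd.read_csv("relatedness_ratings.csv")  # one row per rater x pair (9-point scale)

mean_scores = ratings.groupby("pair_id")["rating"].mean()
selected = mean_scores[mean_scores >= 5].nlargest(50)  # the 50 most related pairs
print(selected)
```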

Procedure

Experiment 1: Musical Excerpt Contexts and Word Targets

Participants were comfortably seated in a Faraday box. The onset of the excerpt (duration range = 972–1299 msec) was followed 800 msec later (i.e., a stimulus onset asynchrony [SOA] of 800 msec) by the presentation of a visual word (duration = 200 msec). The visual words averaged 10 cm in length and 1.5 cm in height and were presented at about 70 cm viewing distance, subtending a vertical angle of 1.2° and a mean horizontal angle of 8.1°. The words were displayed in white lowercase on a dark background in the center of a 13-in. computer screen. Subjects were instructed to decide, as quickly as possible, whether the context and the target were related or not by pressing one of two buttons (i.e., the same two-way relatedness task as used by Koelsch et al., 2004). Note that there were no correct or incorrect responses, as conceptual relatedness can vary between subjects. We nevertheless computed behavioral accuracies, defined by the degree of match between the participants' responses and the expected related and unrelated pairs according to the material selection procedure (see above). The association between hand side (left or right) and response (yes or no) was balanced across participants. The word presentation was followed 2 sec later by the visual presentation of “XXXXX” (duration = 2.7 sec). Participants were instructed that they could blink during the presentation of this series of “X” and should avoid blinking at other times.
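A single trial's timeline could be scripted roughly as follows. This is a sketch assuming PsychoPy; the stimulus files and text are hypothetical, and response collection is omitted for brevity:

```python
# A minimal sketch of one Experiment 1 trial, assuming PsychoPy; file names
# and stimulus text are hypothetical, and button-press collection is omitted.
from psychopy import core, sound, visual

win = visual.Window(fullscr=False, color="black")
word = visual.TextStim(win, text="magie", color="white")
blink_cue = visual.TextStim(win, text="XXXXX", color="white")
excerpt = sound.Sound("excerpt_magie.wav")  # ~1-sec musical context

excerpt.play()                 # context onset
core.wait(0.8)                 # SOA of 800 msec between context and target onsets
word.draw(); win.flip()        # visual word target...
core.wait(0.2)                 # ...displayed for 200 msec
win.flip()                     # blank screen (relatedness response made here)
core.wait(2.0)                 # 2 sec later, the blink period begins
blink_cue.draw(); win.flip()   # "XXXXX": participants may blink now
core.wait(2.7)
win.close()
```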

Experiment 2: Word Contexts and Musical Excerpt Targets

The procedure was identical to the one used in the first experiment in all respects, except that the SOA between the word context and the excerpt target was 500 msec. This SOA is standard in priming experiments. In the first experiment, the choice of the SOA was a compromise between this standard value and the average duration (about 1100 msec) of the musical stimuli.

Data Acquisition and Analysis

The electroencephalogram (EEG) was recorded from 32 scalp electrodes located at standard left and right hemisphere positions over frontal, central, parietal, occipital, and temporal areas (International 10–20 System sites: Fz, Cz, Pz, Oz, Fp1, Fp2, AF3, AF4, F3, F4, C3, C4, P3, P4, PO3, PO4, O1, O2, F7, F8, T3, T4, T5, T6, FC5, FC1, FC2, FC6, CP5, CP1, CP2, and CP6). The EEG was amplified by BioSemi amplifiers (ActiveTwo System) with a band-pass filter of 0–102.4 Hz and was digitized at 512 Hz. The data were re-referenced off-line to the algebraic average of the left and right mastoids. Trials containing ocular artifacts, movement artifacts, or amplifier saturation were excluded from the averaged ERP waveforms on the basis of visual inspection.
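As an illustration of this preprocessing pipeline, the sketch below uses MNE-Python under stated assumptions: the .bdf file name, mastoid channel labels, event codes, and the amplitude criterion standing in for visual artifact inspection are all hypothetical.

```python
# A minimal preprocessing sketch, assuming MNE-Python and a BioSemi ActiveTwo
# recording saved as .bdf; file name, channel labels, and event codes are
# hypothetical placeholders.
import mne

raw = mne.io.read_raw_bdf("subject01.bdf", preload=True)  # 32 channels, 512 Hz

# Re-reference off-line to the algebraic average of the two mastoids
# (here assumed to be recorded on channels labeled "M1" and "M2").
raw.set_eeg_reference(ref_channels=["M1", "M2"])

# Epoch from -100 to +1000 msec around target onset; an amplitude criterion
# stands in for the visual artifact inspection used in the paper.
events = mne.find_events(raw)
epochs = mne.Epochs(raw, events, event_id={"related": 1, "unrelated": 2},
                    tmin=-0.1, tmax=1.0, baseline=(-0.1, 0.0),
                    reject=dict(eeg=100e-6), preload=True)
```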

ERP data were analyzed by computing the mean amplitude, starting 100 msec before target onset and ending 1000 msec after. Averages over related and unrelated pairs were based on the participants' responses. Repeated measures analyses of variance (ANOVAs) were used for statistical assessment. To test the distribution of the effects, six regions of interest (ROIs) were selected as levels of a topographic within-subjects factor: left (FC1, F3, F7) and right (FC2, F4, F8) frontal, left (C3, CP1, T7) and right (C4, CP2, T8) central, and left (CP5, P3, P7) and right (CP6, P4, P8) parietal (T7/T8 and P7/P8 correspond to T3/T4 and T5/T6 in the older 10–20 nomenclature used above). Note that ANOVAs including the midline electrodes were also performed. However, because no major differences were found between these two types of analyses, we report only those including the ROIs. We used latency windows of 50 msec in the 0–1000 msec range. All p values reported below were adjusted with the Greenhouse–Geisser correction for nonsphericity, when appropriate. Sidak tests were used for post hoc comparisons. The reported eta squared (η2) is a measure of effect size for ANOVAs (Olejnik & Algina, 2003). The statistical analyses were conducted with Cleave.
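Building on the hypothetical epochs object from the preprocessing sketch above, the windowed mean-amplitude measure and the ROI groupings could be computed roughly as follows (the ROI channel lists follow the paper; everything else is a simplification):

```python
# A minimal sketch of the ROI x 50-msec-window mean-amplitude measure; builds
# on the hypothetical "epochs" object from the preprocessing sketch above.
import numpy as np

rois = {
    "left_frontal":  ["FC1", "F3", "F7"],  "right_frontal":  ["FC2", "F4", "F8"],
    "left_central":  ["C3", "CP1", "T7"],  "right_central":  ["C4", "CP2", "T8"],
    "left_parietal": ["CP5", "P3", "P7"],  "right_parietal": ["CP6", "P4", "P8"],
}
windows = [(t / 1000.0, (t + 50) / 1000.0) for t in range(0, 1000, 50)]

def mean_amplitude(evoked, channels, tmin, tmax):
    """Mean voltage over a channel group within a latency window (in volts)."""
    picks = [evoked.ch_names.index(ch) for ch in channels]
    mask = (evoked.times >= tmin) & (evoked.times < tmax)
    return evoked.data[picks][:, mask].mean()

# e.g., unrelated targets, left parietal ROI, 400-450 msec window:
value = mean_amplitude(epochs["unrelated"].average(),
                       rois["left_parietal"], 0.400, 0.450)
```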

RESULTS

Behavioral Results

Experiment 1: Musical Excerpt Contexts and Word Targets

The participants judged 48.7% (±1.4) of the pairs to be related. Their average accuracy was 80.3% [i.e., 79.0% (±2.4) for related and 81.6% (± 1.5) for unrelated], that is, significantly above the 50% chance level (two-tailed Wilcoxon signed-rank test corrected for ties: W = 406, p < .0001, n = 28). Although there was a tendency for faster responses when the participants judged the pairs to be related (1044 ± 59 msec) compared to unrelated (1065 ± 52 msec), this difference did not reach significance.
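The chance-level comparison could be reproduced along the following lines. This is a sketch assuming SciPy; the accuracy values are hypothetical placeholders (one related and one unrelated accuracy per subject would be consistent with the reported n = 28):

```python
# A minimal sketch of the test against the 50% chance level, assuming SciPy;
# the accuracy values below are hypothetical placeholders.
import numpy as np
from scipy.stats import wilcoxon

accuracies = np.array([0.79, 0.82, 0.77, 0.84, 0.80, 0.76])  # proportions "correct"
w, p = wilcoxon(accuracies - 0.5)  # two-tailed signed-rank test vs. chance
print(f"W = {w:.0f}, p = {p:.4f}")
```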

Experiment 2: Word Contexts and Musical Excerpt Targets

The participants judged 49.4% (±2.6) of the pairs to be related. Their average accuracy was 77.9% [i.e., 77.6% (±3.6) for related and 78.3% (±2.1) for unrelated], that is, significantly above the 50% chance level (two-tailed Wilcoxon signed-rank test corrected for ties: W = 630, p < .0001, n = 36). These accuracies were not significantly different from those of the first experiment (unrelated: U = 96.5, ns; related: U = 121.5, ns; two-tailed Mann–Whitney U test). Although there was a tendency for faster responses when the participants judged the pairs to be related (1558 ± 74 msec) compared to unrelated (1583 ± 76 msec), this difference did not reach significance.

Event-related Brain Potentials Results

The grand-averaged ERPs to word targets with a musical excerpt context (Experiment 1) are presented in Figure 1. The visual word presentation elicited different ERPs in the two experimental conditions (i.e., when the pairs were judged related or unrelated). An N1 is clearly evident at most electrode sites, peaking around 90 msec. It is followed by a positive (150 msec)–negative (180 msec) complex (P2–N2). A negative component is then elicited. Most importantly, the amplitude of this component is larger for unrelated compared to related words; this effect has a bilateral parietal distribution (Figure 1), starts around 250 msec, and lasts until 600 msec.

Figure 1. 

Grand-averaged ERPs to related (thin lines) and unrelated (thick lines) word targets according to the participants' responses (n = 14 participants; vertical unit: μV; horizontal unit: msec) and isopotential map of the difference waves (grand-averaged ERPs to unrelated minus related targets according to the participants' responses) 450 msec poststimulus onset (unit: μV).

The grand-averaged ERPs to musical targets with a word context (Experiment 2) are presented in Figure 2. The musical excerpt presentation also elicited different ERPs in the two experimental conditions, depending upon whether the pairs were judged related or unrelated. An N1 is clearly evident at most electrode sites, peaking around 120 msec. It is followed by a positive (200 msec)–negative (280 msec) complex (P2–N2). A negative component is then elicited. Most importantly, the amplitude of this component is larger for unrelated compared to related excerpts; this effect has a bilateral parieto-temporal distribution (Figure 2), starts around 250 msec, and lasts until 550 msec.

Figure 2. 

Grand-averaged ERPs to related (thin lines) and unrelated (thick lines) excerpt targets according to the participants' responses (n = 18 participants; vertical unit: μV; horizontal unit: msec) and isopotential map of the difference waves (grand-averaged ERPs to unrelated minus related targets according to the participants' responses) 450 msec poststimulus onset (unit: μV).

To analyze in detail how these components were modulated by the independent variables, we computed repeated measures ANOVAs with target type (word/excerpt) as a between-subjects factor and relatedness (related/unrelated), anteroposterior (frontal, central, and parietal ROIs), and hemisphere (left/right) as within-subjects factors, using 50-msec windows (Table 1). Although the main effect of relatedness was not significant, the interaction between the relatedness and anteroposterior factors was significant within the 250 to 650 msec latency range [F(2, 60) = 15.6, p = .0001, η2 = 0.0045]. Post hoc comparisons indicated that this was due to a larger negativity to unrelated compared to related targets in the parietal region (M = −0.914 μV, p = .0004) but not in the central (M = −0.548 μV, ns) or frontal (M = 0.322 μV, ns) regions. The Target × Relatedness × Anteroposterior interaction was not significant, that is, the Relatedness × Anteroposterior interaction did not differ between the two experiments.

Table 1. 

Time Course of the Relatedness Effect

Time Range (msec)    Relatedness × Anteroposterior
250–300
300–350
350–400
400–450              **
450–500              **
500–550              **
550–600              **
600–650              **
650–700
700–750
750–800
800–850

Significance thresholds: **p < .01; *p < .05.
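A much-simplified version of this mixed-design analysis is sketched below, assuming the pingouin statistics package and a long-format table of windowed mean amplitudes with hypothetical column names. Unlike the full design, it tests only relatedness (within) against experiment (between) for one ROI and window, and pingouin reports partial eta squared (np2) rather than the generalized eta squared used here.

```python
# A simplified sketch of the mixed-design ANOVA, assuming pingouin; the CSV
# file and its columns (subject, experiment, relatedness, roi, tmin, amp) are
# hypothetical. The paper's full model also crossed anteroposterior ROI and
# hemisphere as within-subjects factors and applied the Greenhouse-Geisser
# correction where appropriate.
import pandas as pd
import pingouin as pg

df = pd.read_csv("mean_amplitudes.csv")

# One cell of the windowed analysis: parietal ROI, 400-450 msec window.
window = df[(df["tmin"] == 0.400) & (df["roi"] == "parietal")]

aov = pg.mixed_anova(data=window, dv="amp", within="relatedness",
                     subject="subject", between="experiment")
print(aov[["Source", "F", "p-unc", "np2"]])
```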

DISCUSSION

The purposes of the present study were, first, to replicate the effect of a musical context on the processing of a word target (Koelsch et al., 2004) but with excerpts of only 1 sec (Experiment 1) instead of 10.5 sec in Koelsch et al., and second, to analyze the influence of concepts carried by a word context on the processing of a following musical target (Experiment 2). We recorded ERPs to word targets in Experiment 1 and to musical targets in Experiment 2 and hypothesized that unrelated contexts would elicit a larger N400 to targets compared to related contexts. ERP responses showed a conceptual relatedness effect with a bilateral parietal distribution between 250 and 650 msec that did not differ significantly between Experiments 1 and 2. These latencies and topography, as well as the independent variable manipulated in our experiments (conceptual relatedness), suggest that this ERP effect is a modulation of the N400.

We will discuss four issues: (1) the behavioral results, (2) previous attempts to find an N400 effect to musical stimuli, (3) the relevance of the present results for the comprehension of music as a structured set of “atoms of musical concepts,” and (4) the mechanisms underlying the N400 to music as compared to other types of targets.

Behavioral Results

The attempt to replicate the Koelsch et al. (2004) N400 effect to word targets using a very similar two-alternative relatedness judgment task and shorter musical contexts (Experiment 1) yielded very similar accuracies: an average of 80.5% (i.e., 78% for related and 82% for unrelated targets) in the Koelsch et al. ERP experiment and an average of 80.3% (i.e., 79.0% for related and 81.6% for unrelated targets) in the present study, even though the musical context duration was reduced from 10.5 sec (Koelsch et al.) to 1 sec.

The reaction time data showed a nonsignificant trend for faster responses to related compared to unrelated pairs. This lack of effect is not so surprising insofar as the relatedness judgment is not the standard task used to observe conceptual context effects on reaction times. An effect would probably be found with a lexical decision task (Meyer & Schvaneveldt, 1971). Indeed, the behavioral follow-up of the Koelsch et al. (2004) study (Poulin-Charronnat et al., 2006), using musical contexts (duration = 11.7 sec) and word targets, found a reaction time relatedness effect with a lexical decision task. Unlike the behavioral data, the ERPs provided evidence of a conceptual relatedness effect in the form of a modulation of the N400 component.

Previous Attempts to Find an N400 Effect to Musical Stimuli

Experiment 2 was not the first attempt to find an N400 effect to musical stimuli. Similar research had been carried out by Paller et al. (1992) and Besson and Macar (1987). The assumption was that, if Kutas and Hillyard (1980) found an N400 to unexpected sentence-final words, an N400 would also be seen to unexpected melody-final tones. In other words, it was hypothesized that the N400 correlated with expectancy violation. These two studies reported P300-like late positivities to expectancy deviations but no N400 effect. The lack of an N400 effect might be explained in two nonexclusive manners. First, in those studies (and several of the following studies manipulating the musical structure), the ERPs were time-locked to the last note or chord of a musical sequence and each note was played with the same timbre, mostly synthetic piano. Therefore, even though a set of notes or chords may carry some concepts, it might be that the processing of a single note or chord within a highly controlled (monotonous and musically poor) sequence does not, by itself, carry conceptual information. Instead, the use of real, although short, recordings in the present study might have brought participants to a more global processing of music, which included conceptual processing. The second reason may lie in the type of experimental manipulation applied in most of the previous studies describing a P300-like effect. Indeed, the independent variable was the harmonic/tonal structure. Thus, the manipulation was probably more at a syntactic than at a conceptual level, although the two are probably not independent.

Atoms of Musical Concepts?

Our results, obtained with short duration excerpts, give rise to the following question: How much time does one need to extract concepts from music? The N400 effect observed in Experiment 1 indicates that a musical context of 1 sec is able to influence the processing of a following word. Accepting the view that the N400 effect is an index of conceptual processing (Barrett, Rugg, & Perrett, 1988), this result would mean that 1 sec of music can be enough to communicate a concept, which would then influence the conceptual processing of the word. Moreover, the N400 effect found in the second experiment suggests an even stronger conclusion: even less than half a second of music seems to be enough to communicate a concept, because the effect of conceptual relatedness on excerpt processing begins as early as 250 msec poststimulus onset. In other words, the minimum duration needed to convey a concept with music seems to fall to a few hundred milliseconds. A related question has been investigated by Bigand, Filipic, and Lalitte (2005) and Peretz, Gagnon, and Bouchard (1998). Indeed, their studies provide estimates of the smallest amount of music able to communicate an emotion. Peretz et al. (1998) found that subjects were able to assess the happy–sad character of musical excerpts lasting only 250 msec. Similarly, Bigand et al. (2005), by asking subjects to perform emotional judgments, concluded from their results that “250 msec of music may be enough to induce strong or weak feelings of emotion in listeners, whatever the musical style being played” (p. 434). Given that there seems to be a link between musical concepts and emotional perception (Patel, 2008; Koelsch & Siebel, 2005; Koelsch et al., 2004; Sollberger, Reber, & Eckstein, 2003; Tagg, 1999; Swain, 1997; Meyer, 1956), the results of Bigand et al. might be indicative of the minimal duration of a musical excerpt for conveying concepts.

It is noteworthy that Orgs et al. (2006) succeeded in finding an N400 effect to environmental sounds lasting 300 msec. Therefore, our results call for further experiments to replicate the present findings with even shorter excerpts, with the goal of estimating the minimal duration of a musical excerpt for the communication of a concept. The search for this minimal piece of musical information has already been a subject of interest and was earlier referred to as a “museme,” defined as the “minimal unit of musical discourse that is recurrent and meaningful in itself within the framework of any one musical genre” (Tagg, 1999, p. 32). Although it might seem speculative to discuss this issue, a better knowledge of the characteristics of a museme may have explanatory power for the results of earlier experiments and important consequences for future experimental designs.

For instance, the fact that a museme could be as short as a few hundred milliseconds suggests that the grammatical structure of music might not be necessary to convey some concepts. With this, we do not claim that musical grammar (harmony) cannot convey concepts in music, but rather that there must be other musical aspects that convey concepts within a very short lapse of time. It is difficult to know precisely what may explain our results and those of the previously cited studies (Bigand et al., 2005; Peretz et al., 1998). A probable candidate is timbre or, from a more psychological and global perspective, the energy, tension, and arousal carried by a sound or series of sounds (Sloboda, 2005). The processing of timbre is particularly fast. For instance, Warren, Gardner, Brubaker, and Bashford (1991) found that sequences of 10-msec tones with identical pitch but different timbres can be distinguished from comparison sequences with the same tones played in a different order. Similarly, Robinson and Patterson (1995) found fast timbre discrimination with a four-category instrument identification task using segments of musical notes of comparable durations. Furthermore, timbre seems to be particularly relevant for the identification of very short musical excerpts. Indeed, Schellenberg, Iverson, and McKinnon (1999) showed that when sound dynamics and spectral information are preserved, listeners are able to match 200-msec (and, to a lesser extent, 100-msec) excerpts with song titles and artist names. Therefore, it is reasonable to assume that similar cueing from the excerpt influenced the participants' relatedness judgments. More precisely, we propose that the conceptual relatedness effects reported here with short excerpts are due to a matching between the concepts carried by emotional responses to the excerpt's timbre and the concepts elicited by the word. Indeed, it is likely that concepts can be carried by emotional responses to music (Patel, 2008; Koelsch & Siebel, 2005; Koelsch et al., 2004; Sollberger et al., 2003; Tagg, 1999; Swain, 1997; Meyer, 1956).

Of course, conceptual processing was probably encouraged by the explicit and direct relatedness judgment task we used. This may have strengthened the matching process and, hence, increased the N400 effects via expectation-based mechanisms or retrospective checking processes (e.g., McNamara, 2005). However, if general mismatching mechanisms were at work, it would be difficult to explain why such a “mismatch” engendered an N400 (typically described for conceptual processing) and not another component such as the mismatch negativity, the early right anterior negativity, the right anterior temporal negativity, the N500, or the P600 described in the literature for musical “mismatches” (for a review, see Koelsch & Siebel, 2005). It seems that, in order to produce the N400 effect, participants had access to concepts that were possibly carried by emotional responses to music.

N400 to Music: The Question of the Domain Specificity of Conceptual Processing

Concerning the debate on the modularity/domain specificity of conceptual processing recalled in the Introduction, our data argue in favor of a domain-aspecific cognitive module of conceptual processing, or at least of a strong overlap between the cognitive modules for linguistic and musical concepts. Indeed, although the N400 effects in Experiments 1 and 2 seem to differ slightly in size, latency, and topography (Figures 1 and 2), they did not differ significantly. Although the present data add to the body of studies of conceptual processing that also reported a parietal ERP effect (e.g., Schön et al., submitted; Orgs et al., 2006; Koelsch et al., 2004; Federmeier & Kutas, 2001; Castle et al., 2000), other studies with pictures (Hamm, Johnson, & Kirk, 2002; West & Holcomb, 2002; McPherson & Holcomb, 1999; Ganis et al., 1996), odors (Sarfarazi et al., 1999), and occasionally with words (Van Petten & Rheinfelder, 1995) found N400 effects with a rather frontal distribution. Overall, although there seems to be a partial overlap of the N400 effect across studies, some differences are also present. On one hand, these differences may not be due to separate underlying networks involved in conceptual processing but, for instance, to different tasks, different subjects, or different modalities. On the other hand, they might reflect the presence of both domain-aspecific (i.e., independent of the source of the concept) and domain-specific (i.e., different if the concept comes from words, sounds, odors, or other media) generators of the N400 effect.

In our study, the similarity of the topography of the N400 to musical and linguistic targets suggests that conceptual processing is rather aspecific, because the N400 reflects, among other mechanisms, lexical access (Pylkkänen & Marantz, 2003). Our data are in agreement with an overlapping lexical access for linguistic and musical concepts. This result can hardly be explained by the musical lexicon as defined by Peretz and Coltheart (2003), whereby lexical entries (pointing to representations) are only melodies to which one has been exposed during life. Indeed, in our study, the melodies were mostly unfamiliar to the participants. Therefore, we propose an extended view of the musical lexicon that may account for conceptual processing of unfamiliar melodies. In this view, the musical lexicon does not simply contain the representations of known melodies but also stores representations of musical features, for instance, timbre. In this way, hearing a novel melody can activate a composite pattern of musical features in the mental lexicon and, thus, activate the corresponding representations. To account for the conceptual processing of music described in our study, we propose that one of these representations is at the conceptual level. Note that this is suggested by analogy with word representations (e.g., conceptual, phonological, syntactic) (Ju & Luce, 2006; Elman, 2004; Jacobs & Carr, 1995). Within this extended view of the musical lexicon, the overlapping lexical access for linguistic and musical concepts would be due to overlapping representations. Accordingly, our data would reflect the following: (1) the word entry in the linguistic lexicon points to related concepts; (2) the excerpt timbre entry in the musical lexicon points to (emotional) concepts; (3) the concepts derived from the words and excerpts were or were not related; and (4) these two types of concepts were processed within the same or highly overlapping cognitive modules.

The present study confirms that the processing of a word is influenced by its conceptual relatedness to a musical context even when this context lasts only 1 sec. Furthermore, the data show, for the first time, that concepts carried by words can influence the processing of a following musical excerpt and suggest that 250 msec of music might be enough to communicate musical concepts.

APPENDIX

Examples of words and related excerpts from the material (sound files of these examples can be downloaded at www.incm.cnrs-mrs.fr/pperso/attach/schon/daltrozzo_schon_ex.zip):

Words: Related Excerpt References

magie (magic): Into a Dream, Pat Metheny, interpreted by Pat Metheny and Jim Hall, guitars (1998)
délire (madness): Paraguay, Guarani-Nandeva et Ayoreo, Ocora. Ayoreo people, Chaco desert, Paraguay
blessant (wounding): Schnittke, Concerto Grosso
courage (courage): Szàszcsàvàs Band 3, Szekely Verbunk. Thermal Confort, Budapest
heureux (happy): The Days of Wine and Roses, Mancini/Mercer, interpreted by Dexter Gordon, saxophone (1975)
laid (ugly): Mr. P.C., John Coltrane, interpreted by John Coltrane, saxophone (1965)
hâte (hasten): Donna Lee, Charlie Parker, interpreted by Bireli Lagrene, guitar (1994)
léger (light): Young One, Jim Hall, interpreted by Jim Hall, guitar, and Gil Goldstein, piano (1989)

Acknowledgments

This research was supported by a grant from the French National Agency for Research (ANR 2005-8 “Music & Memory”). We thank Emmanuel Bigand, Bénédicte Poulin-Charronnat, and Barbara Tillmann for helpful discussions on the design preparation and Mitsuko Aramaki, Mireille Besson, Emmanuel Bigand, Bénédicte Poulin-Charronnat, Séverine Samson, Barbara Tillmann, and three anonymous reviewers for their valuable comments on a previous version of the manuscript.

Reprint requests should be sent to Jérôme Daltrozzo, INCM-CNRS, 31 Chemin Joseph Aiguier, 13402 Marseille cedex 20, France, or via e-mail: daltrozzo@incm.cnrs-mrs.fr.

REFERENCES

Ballas, J. A. (1993). Common factors in the identification of an assortment of brief everyday sounds. Journal of Experimental Psychology: Human Perception and Performance, 19, 250–267.

Barrett, S. E., Rugg, M. D., & Perrett, D. I. (1988). Event-related potentials and the matching of familiar and unfamiliar faces. Neuropsychologia, 26, 105–117.

Barsalou, L. W., Yeh, W., Luka, B. J., Olseth, K. L., Mix, K. S., & Wu, L. (1993). Concepts and meaning. In K. Beals, G. Cooke, D. Kathman, K. E. McCullough, S. Kita, & D. Testen (Eds.), Chicago Linguistics Society 29: Papers from the parasession on conceptual representations (pp. 23–61). Chicago: University of Chicago, Chicago Linguistics Society.

Besson, M., & Faïta, F. (1995). An event-related potential (ERP) study of musical expectancy: Comparisons of musicians with non-musicians. Journal of Experimental Psychology: Human Perception and Performance, 21, 1278–1296.

Besson, M., & Kutas, M. (1993). An event-related potential (ERP) analysis of the effect of sentence context on the repetition of ambiguous words. In H. J. Heinze, T. F. Munte, & G. R. Mangun (Eds.), New developments in event-related potentials (pp. 17–24). Boston: Birkhäuser.

Besson, M., & Macar, F. (1987). An event-related potential analysis of incongruity in music and other non-linguistic contexts. Psychophysiology, 24, 14–25.

Bigand, E., Filipic, S., & Lalitte, P. (2005). The time course of emotional responses to music. Annals of the New York Academy of Sciences, 1060, 429–437.

Blauert, J. (1996). Spatial hearing: The psychophysics of human sound localization (revised ed.). Cambridge, MA: MIT Press.

Boucourechliev, A. (1993). Le langage musical. Collections les chemins de la musique. Paris: Fayard.

Castle, P. C., Van Toller, S., & Milligan, G. J. (2000). The effect of odour priming on cortical EEG and visual ERP responses. International Journal of Psychophysiology, 36, 123–131.

Elman, J. L. (2004). An alternative view of the mental lexicon. Trends in Cognitive Sciences, 8, 301–306.

Federmeier, K. D., & Kutas, M. (2001). Meaning and modality: Influences of context, semantic memory organization, and perceptual predictability on picture processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 202–224.

Fogelson, N., Loukas, C., Brown, J., & Brown, P. (2004). A common N400 EEG component reflecting contextual integration irrespective of symbolic form. Clinical Neurophysiology, 115, 1349–1358.

Ganis, G., Kutas, M., & Sereno, M. I. (1996). The search for “common sense”: An electrophysiological study of the comprehension of words and pictures in reading. Journal of Cognitive Neuroscience, 8, 89–106.

Gaver, W. W. (1993). What in the world do we hear? An ecological approach to auditory source perception. Ecological Psychology, 5, 1–29.

Griffiths, T. D., & Warren, J. D. (2004). What is an auditory object? Nature Reviews Neuroscience, 5, 887–892.

Hamm, J. P., Johnson, B. W., & Kirk, I. J. (2002). Comparison of the N300 and N400 ERPs to picture stimuli in congruent and incongruent contexts. Clinical Neurophysiology, 113, 1339–1350.

Hartmann, W. M. (1983). Localization of sound in rooms. Journal of the Acoustical Society of America, 74, 1380–1391.

Jackendoff, R. (1991). Musical parsing and musical affect. Music Perception, 9, 199–230.

Jacobs, A. M., & Carr, T. H. (1995). Mind mappers and cognitive modelers: Toward cross-fertilization. Behavioral and Brain Sciences, 18, 362–363.

Jakobson, R. (1973). Essais de linguistique générale: II. Rapports internes et externes du langage. Paris: Minuit.

Jost, K., Hennighausen, E., & Rösler, F. (2004). Comparing arithmetic and semantic fact retrieval: Effects of problem size and sentence constraint on event-related brain potentials. Psychophysiology, 41, 46–59.

Ju, M., & Luce, P. A. (2006). Representational specificity of within-category phonetic variation in the long-term mental lexicon. Journal of Experimental Psychology: Human Perception and Performance, 32, 120–138.

Kivy, P. (1991). Music alone: Philosophical reflections on the purely musical experience. Ithaca, NY: Cornell University Press.

Koelsch, S., Kasper, E., Sammler, D., Schulze, K., Gunter, T., & Friederici, A. D. (2004). Music, language and meaning: Brain signatures of semantic processing. Nature Neuroscience, 7, 302–307.

Koelsch, S., & Siebel, W. A. (2005). Towards a neural basis of music perception. Trends in Cognitive Sciences, 9, 578–584.

Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.

Leipp, E. (1977). La machine à écouter: Essais de psychoacoustique. Paris: Masson.

McNamara, T. P. (2005). Semantic priming: Perspectives from memory and word recognition. New York: Psychology Press.

McPherson, W. B., & Holcomb, P. J. (1999). An electrophysiological investigation of semantic priming with pictures of real objects. Psychophysiology, 36, 53–65.

Meyer, D. E., & Schvaneveldt, R. W. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227–234.

Meyer, L. (1956). Emotion and meaning in music. Chicago: University of Chicago Press.

Miranda, R. A., & Ullman, M. T. (2007). Double dissociation between rules and memory in music: An event-related potential study. Neuroimage, 38, 331–345.

New, B., Pallier, C., Ferrand, L., & Matos, R. (2001). Une base de données lexicales du français contemporain sur internet: LEXIQUE [A lexical database of contemporary French on the internet: LEXIQUE]. L'Année Psychologique, 101, 447–462.

Nigam, A., Hoffman, J. E., & Simons, R. F. (1992). N400 to semantically anomalous pictures and words. Journal of Cognitive Neuroscience, 4, 15–22.

Olejnik, S., & Algina, J. (2003). Generalized eta and omega squared statistics: Measures of effect size for some common research designs. Psychological Methods, 8, 434–447.

Orgs, G., Lange, K., Dombrowski, J., & Heil, M. (2006). Conceptual priming for environmental sounds and words: An ERP study. Brain and Cognition, 62, 267–272.

Paller, K. A., McCarthy, G., & Wood, C. C. (1992). Event-related potentials elicited by deviant endings to melodies. Psychophysiology, 29, 202–206.

Patel, A. D. (2008). Music, language, and the brain. New York: Oxford University Press.

Peretz, I., & Coltheart, M. (2003). Modularity of music processing. Nature Neuroscience, 6, 688–691.

Peretz, I., Gagnon, L., & Bouchard, B. (1998). Music and emotion: Perceptual determinants, immediacy, and isolation after brain damage. Cognition, 68, 111–141.

Piana, G. (1991). Filosofia della musica. Milan, Italy: Guerini e Associati.

Poulin-Charronnat, B., Bock, B., Grieser, J., Meyer, K., & Koelsch, S. (2006). More about music, language and meaning: The follow-up of Koelsch et al. (2004). In M. Baroni, A. R. Addessi, R. Caterina, & M. Costa (Eds.), Proceedings of the 9th International Conference on Music Perception and Cognition (ICMPC9), Bologna, Italy, August 22–26, 2006 (p. 1855).

Pylkkänen, L., & Marantz, A. (2003). Tracking the time course of word recognition with MEG. Trends in Cognitive Sciences, 7, 187–189.

Robinson, K., & Patterson, R. D. (1995). The duration required to identify the instrument, the octave, or the pitch chroma of a musical note. Music Perception, 13, 1–15.

Rocchesso, D., & Fontana, F. (2003). The sounding object. Retrieved January 28, 2008, from www.soundobject.org/.

Sarfarazi, M., Cave, B., Richardson, A., Behan, J., & Sedgwick, E. M. (1999). Visual event related potentials modulated by contextually relevant and irrelevant olfactory primes. Chemical Senses, 24, 145–154.

Schellenberg, E. G., Iverson, P., & McKinnon, M. C. (1999). Name that tune: Identifying popular recordings from brief excerpts. Psychonomic Bulletin & Review, 6, 641–646.

Schön, D., Ystad, S., Kronland-Martinet, R., & Besson, M. (submitted). The evocative power of sounds: Conceptual relationship between words and non-identifiable nonverbal sounds.

Sloboda, J. A. (1985). The musical mind: The cognitive psychology of music. Oxford: Clarendon Press.

Sloboda, J. A. (2005). Exploring the musical mind: Cognition, emotion, ability, function. Oxford: Oxford University Press.

Sollberger, B., Reber, R., & Eckstein, D. (2003). Musical chords as affective priming context in a word-evaluation task. Music Perception, 20, 263–282.

Steinbeis, N., & Koelsch, S. (2008). Shared neural resources between music and language indicate semantic processing of musical tension–resolution patterns. Cerebral Cortex, 18, 1169–1178.

Swain, J. (1997). Musical languages. New York: Norton.

Tagg, P. (1999). Introductory notes to the semiotics of music. Retrieved December 4, 2007, from www.tagg.org/xpdfs/semiotug.pdf.

Van Petten, C., & Rheinfelder, H. (1995). Conceptual relationships between spoken words and environmental sounds: Event-related brain potential measures. Neuropsychologia, 33, 485–508.

Warren, R. M., Gardner, D. A., Brubaker, B. S., & Bashford, J. A. (1991). Melodic and nonmelodic sequences of tones: Effects of duration on perception. Music Perception, 8, 277–289.

West, W. C., & Holcomb, P. J. (2002). Event-related potentials during discourse-level semantic integration of complex pictures. Cognitive Brain Research, 13, 363–375.