Abstract

Music is a peculiar human behavior, yet we still know little about why and how it emerged. For centuries, the study of music was the sole prerogative of the humanities. Lately, however, music has been increasingly investigated by psychologists, neuroscientists, biologists, and computer scientists. One approach to studying the origins of music is to empirically test hypotheses about the mechanisms behind this structured behavior. Recent lab experiments show how musical rhythm and melody can emerge via the process of cultural transmission. In particular, Lumaca and Baggio (2017) tested the emergence of a sound system at the boundary between music and language. In this study, participants were given random signal-meaning pairs; when participants negotiated the meaning of these pairs and played a “game of telephone” with them, the pairs became more structured and systematic. Over time, the small biases introduced in each artificial transmission step accumulated, displaying quantitative trends, including the emergence, over the course of artificial human generations, of features resembling properties of language and music. In this Note, we highlight the importance of Lumaca and Baggio's experiment, place it in the broader literature on the evolution of language and music, and suggest refinements for future experiments. We conclude that, while psychological evidence for the emergence of proto-musical features is accumulating, complementary work is needed: Mathematical modeling and computer simulations should be used to test the internal consistency of experimentally generated hypotheses and to make new predictions.

Language and music are peculiar human behaviors. We spend a large portion of our lives speaking, reading, processing speech, performing music, and listening to tunes. At the same time, we still know very little about why and how these structured behaviors emerged in our species. In the case of music, the mystery is even greater than for language. Music is a widespread human behavior that does not seem to confer any evolutionary advantage. A possible approach to studying the origins of music is to hypothesize and empirically test the mechanisms behind this structured behavior [18, 28]. For language, potential mechanisms were first tested in silico [17, 20], showing how random pairs of signals and meanings become more structured and systematic when artificial agents play a “game of telephone” with them. These results were replicated with human participants evolving a language-like system [19], confirming the importance of computer simulations in testing hypotheses on the cultural evolution of human behavior. Finally, recent work applied this approach to musical rhythm [27], showing that musical structures can indeed emerge via cultural transmission.

Lumaca and Baggio (2017) adopted this methodological approach, testing the emergence of a sound system at the boundary between music and language [23]. As in communication systems found in humans, in other animals, and in in-silico experiments, a meaning space was paired with a signal space. The meaning space consisted of a set of pictures showing different facial emotional expressions. The signal space was a set of five-note patterns. Crucially, the experimenters randomly paired meanings with signals, and the signals themselves were randomly structured. These random pairings of emotional expressions and random note sequences were then used in signaling games, where pairs of participants used note sequences to communicate emotional states. The resulting signal-meaning pairs, with all their human-introduced variations, were then used in new signaling games with new participants. Over time, the small biases introduced in each artificial transmission step accumulated, displaying quantitative trends. In particular, Lumaca and Baggio found the emergence, over the course of artificial human generations, of features resembling some properties of language and music.
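The logic of such a transmission chain can be illustrated with a toy simulation. The sketch below is not Lumaca and Baggio's procedure or analysis. It assumes a hypothetical learner (`noisy_copy`) who reproduces each five-note pattern imperfectly and, whenever a note is misremembered, replaces it with a pitch close to the preceding note. The function names, the number of signals, the error rate, and the bias itself are all illustrative assumptions.

```python
import random

N_PITCHES = 5      # assumed size of the tone set (scale steps 0..4)
PATTERN_LEN = 5    # five-note patterns, as in the experiment
N_SIGNALS = 5      # assumed number of signals in the system


def random_system(rng):
    """Generation 0: signals made of randomly chosen notes."""
    return [[rng.randrange(N_PITCHES) for _ in range(PATTERN_LEN)]
            for _ in range(N_SIGNALS)]


def noisy_copy(pattern, rng, error_rate=0.2):
    """Hypothetical learner: each note after the first is misremembered with
    probability error_rate and then replaced by a pitch within one scale
    step of the preceding note (an assumed bias towards small intervals)."""
    copy = [pattern[0]]
    for note in pattern[1:]:
        if rng.random() < error_rate:
            note = copy[-1] + rng.choice([-1, 0, 1])
            note = max(0, min(N_PITCHES - 1, note))  # stay inside the tone set
        copy.append(note)
    return copy


def mean_interval(system):
    """Mean absolute step between consecutive notes, over all signals."""
    steps = [abs(b - a) for p in system for a, b in zip(p, p[1:])]
    return sum(steps) / len(steps)


def run_chain(n_generations=10, seed=0):
    """Pass the system down a chain; record the mean interval per generation."""
    rng = random.Random(seed)
    system = random_system(rng)
    history = [mean_interval(system)]
    for _ in range(n_generations):
        system = [noisy_copy(p, rng) for p in system]
        history.append(mean_interval(system))
    return history


if __name__ == "__main__":
    for generation, value in enumerate(run_chain()):
        print(generation, round(value, 2))
```

Under a bias of this kind, the mean interval tends to shrink along the chain, although individual runs fluctuate. This is the general sense in which small per-step biases can accumulate into a quantitative trend.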

Several methodological choices make the article particularly valuable. For instance, an obvious potential confound when mapping proto-musical structures to emotional states is that some of these mappings are already ingrained and universal in human cognition [10]. The authors circumvent this potential confound by using the Bohlen-Pierce (BP) scale [24], to which most listeners have never been exposed, rather than the common 12-tone equal temperament scale (the black and white piano keys).
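The contrast between the two scales can be made concrete with a short calculation. The sketch below assumes the equal-tempered variant of the BP scale, which divides the 3:1 frequency ratio (the "tritave") into 13 equal steps, whereas 12-tone equal temperament divides the 2:1 octave into 12. The base frequency of 220 Hz and the function names are arbitrary choices for illustration, not values taken from [23] or [24].

```python
import math


def step_in_cents(interval_ratio, n_steps):
    """Size in cents of one step when interval_ratio is split into n_steps equal steps."""
    return 1200 * math.log2(interval_ratio) / n_steps


def scale_frequencies(base_hz, interval_ratio, n_steps):
    """Frequencies of one full cycle of an equal-step scale, starting at base_hz."""
    return [base_hz * interval_ratio ** (k / n_steps) for k in range(n_steps + 1)]


bp_step = step_in_cents(3, 13)    # equal-tempered Bohlen-Pierce: about 146.3 cents
tet_step = step_in_cents(2, 12)   # 12-tone equal temperament: 100 cents

bp_scale = scale_frequencies(220.0, 3, 13)   # 220 Hz is an arbitrary base
print(round(bp_step, 1), round(tet_step, 1))
print([round(f, 1) for f in bp_scale])
```

A BP step of roughly 146 cents is not a multiple of the 100-cent semitone, which is one way to see why BP melodies are unlikely to map onto familiar tonal patterns.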

For centuries, the study of music has been the sole prerogative of the humanities. Lately, however, music research is increasingly performed by psychologists, neuroscientists, biologists, and computer scientists [2, 6–9, 11–14, 16, 25, 26]. Scientists need rigorous and operational definitions. One way to define music is to look at the most recurrent properties of musical cultures around the world [1, 29]. The resulting properties, called (statistical) musical universals, are features of music that appear above chance in all world cultures. Recently, comparative musicologists compiled a list of 32 potential universals. Then, 304 music recordings from all over the world were coded and compared for presence or absence of each of these features. Across human cultures, 21 features qualified as global universals, spanning melody, rhythm, social context, performance style, and so on [29].

How do these universals arise? Some have hypothesized that the process of cultural transmission might slowly shape random signals into proto-musical systems [31]. Indeed, research tracing how rhythmic patterns are learned and transmitted in the lab showed convergence towards all six universals previously found for rhythm [27]. Random sequences of durations are shaped into music-like rhythmic sequences [9]. Is this also the case for melodic properties? Lumaca and Baggio's article did not tackle this question directly [23], but might still provide insights into the emergence of melodic universals in human minds. Although the original focus of the article was not on musical universals, it did investigate the intergenerational transmission of melodic structures. When the musical universals framework is applied to the results of Lumaca and Baggio's signaling games, the signals appear to change in the direction of some of the melodic universals (see Table 1). In particular, the data provide indirect evidence for the emergence of melodic-like patterns that (a) exhibit arched contours, (b) span small frequency intervals, and (c) possibly repeat within the system.

Table 1. How melodic universals in [29] compare to evolved melodies in [23].

| Melodic universal (adapted from [29]) | Evidence from [23] |
| --- | --- |
| (1) Sound systems show discrete pitches. | The emergence of this universal cannot be shown from data in [23], as the pitches were discrete to start with. |
| (2) Sound elements are organized in scales of few (≤7) elements per octave. | This universal cannot be shown from data in [23], as the unique elements were five or less to start with. |
| (3) Sound elements are distributed over non-equidistant frequencies. | This universal cannot be shown from data in [23], as the frequency distance between sound elements was experimentally fixed. |
| (4) Evolved melodies show descending or arched contours. | There is an increase of mirror patterns over generations, suggesting the emergence of arched contours [23, Section 3.5, p. 416]. No evidence for descending contours is provided. |
| (5) Melody contours span small frequency intervals (≤750 cents, i.e., a musical fifth). | The mean interval size decreases over generations [23, Section 3.3, p. 415]. |
| (6) Sound sequences show presence of motivic patterns. | This universal cannot be shown from data in [23], as the duration of tones was fixed. |
| (7) Sound sequences consist of short phrases (≤9 seconds). | This universal cannot be shown from data in [23], as the length of phrases was experimentally constrained. |
| (8) Phrases are repeated. | The tone system becomes increasingly compressible. Increased compressibility could be achieved by having sequences that are either probabilistically or deterministically more predictable. The latter case would correspond to repeating phrases, hence indirectly suggesting that some motifs might repeat across patterns [23, Section 3.8, p. 416]. Notice, however, that the universal refers to patterns repeating within one unit of the system, not across units of the system. |
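The measures behind universals (4), (5), and (8) in Table 1 can be operationalized in several ways. The sketch below shows one possible set of definitions. These are assumptions for illustration and not necessarily the definitions used in [23]: a pattern counts as arched if it rises to an interior peak and then falls, interval size is the mean absolute step in cents (assuming an equal-step scale with a hypothetical step of 146.3 cents), and compressibility is approximated by a zlib compression ratio.

```python
import zlib

STEP_CENTS = 146.3  # assumed size of one scale step, in cents


def is_arched(pattern):
    """True if the pattern rises (weakly) to an interior peak, then falls (weakly)."""
    peak = pattern.index(max(pattern))
    if peak == 0 or peak == len(pattern) - 1:
        return False  # peak at an edge: flat, purely rising, or purely falling
    rising = all(a <= b for a, b in zip(pattern[:peak + 1], pattern[1:peak + 1]))
    falling = all(a >= b for a, b in zip(pattern[peak:], pattern[peak + 1:]))
    return rising and falling


def mean_interval_cents(patterns):
    """Mean absolute interval between consecutive notes, converted to cents."""
    steps = [abs(b - a) for p in patterns for a, b in zip(p, p[1:])]
    return STEP_CENTS * sum(steps) / len(steps)


def compression_ratio(patterns):
    """Compressed size divided by raw size for the whole system (lower means
    more compressible). zlib adds a fixed overhead, so for small systems only
    comparisons between systems of equal size are meaningful."""
    raw = bytes(note for p in patterns for note in p)
    return len(zlib.compress(raw, 9)) / len(raw)


# Notes are scale-step indices; this small system is made up for illustration.
system = [[0, 2, 4, 2, 0], [1, 2, 3, 2, 1], [0, 1, 2, 3, 4]]
print(sum(is_arched(p) for p in system) / len(system))  # proportion of arched patterns
print(mean_interval_cents(system))
print(compression_ratio(system))
```

Tracking these three quantities across generations of a transmission chain would give a rough, quantitative counterpart to the qualitative comparison in Table 1.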

Lumaca and Baggio's work is an important and innovative piece of science on many levels. Notably, it provides some indirect, preliminary evidence for how the process of cultural transmission can make some acoustic features converge towards melodic universals. At the same time, a few experimental decisions make this work difficult to interpret robustly in light of all musical universals. First of all, the introduction of meaning in a work on musical structure is likely to divide readers. In fact, the field of music cognition is split between those arguing that music has no referential meaning attached, and others who believe music and meaning are indissoluble [4, 21, 30]. Moreover, even those who assume a large role for meaning in music would likely not think of that role as functioning in the way it is presented in the article, where melodic signals are negotiated to refer to, and help a partner guess, a set of very specific referents [21]. Second, the experimental design puts strong constraints on the types of pattern seen by first-generation participants. Musical tones are already discretized, and limited in duration and number of elements (exactly five tones per pattern). These a priori experimental constraints limit the dimensions along which the sound system could evolve. This in turn precludes testing the emergence of most universals, because they are already built into the system [15, 27].

These remarks might be useful to inform future research on the evolution of melodic universals. In particular, we suggest that an experimental design closer to Verhoef's work might be more appropriate to study the evolution of melodic universals [5, 32, 33]. If signal-meaning associations are removed and universals still emerge, the presence of meaning will be shown to be unnecessary for music to appear [27]; in fact, previous work by Lumaca and Baggio suggests that meaning should not play a role [22]. Similarly, if first-generation participants are exposed to patterns that are free to vary in duration and not already discretized, the emergence of all melodic universals (including the transition from continuous to discrete pitches, organization of scales, and constraints on phrase length) will become empirically testable [32, 33]. Finally, future experiments could replace system learning with immediate recall. In other words, instead of having participants learn a whole system and then transmit it to the next participant, participants could learn and transmit individual patterns [3, 27]. Transmitting individual patterns (immediate recall) makes it harder for self-consistent systems of signals to emerge. Hence, if a system emerges nonetheless when employing the more conservative immediate recall method, a stronger pressure for systematicity must exist.

To conclude, a number of recent experiments have found direct or suggestive evidence for the emergence of proto-musical features in the lab [9, 15, 23, 27, 32, 33]. Given this amount of experimental human work, closed-form mathematical modeling and computer simulations are now needed to both test the internal consistency of experimentally generated hypotheses and make new predictions.

Acknowledgments

We are grateful to Bart de Boer and Massimo Lumaca for useful comments on previous versions of this manuscript. This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 665501 with the Research Foundation Flanders (FWO) ([PEGASUS]² Marie Curie fellowship 12N5517N awarded to AR), a visiting fellowship in Language Evolution from the Max Planck Society (awarded to AR), ERC grant 283435 ABACUS (awarded to Bart de Boer), and NWO Veni grant 275-89-031 (awarded to TV).

References

[1] Brown, S., & Jordania, J. (2013). Universals in the world's musics. Psychology of Music, 41(2), 229–248.
[2] Cason, N., & Schön, D. (2012). Rhythmic priming enhances the phonological processing of speech. Neuropsychologia, 50(11), 2652–2658.
[3] Cornish, H., Smith, K., & Kirby, S. (2013). Systems from sequences: An iterated learning account of the emergence of systematic structure in a non-linguistic task. In M. Knauff (Ed.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (pp. 340–345). New York: Curran Associates.
[4] Cunningham, J. G., & Sterling, R. S. (1988). Developmental change in the understanding of affective meaning in music. Motivation and Emotion, 12(4), 399–413.
[5] Eryilmaz, K., & Little, H. (2016). Using leap motion to investigate the emergence of structure in speech and language. Behavior Research Methods, 49(5), 1748–1768.
[6] Fitch, W. T. (2009). The biology and evolution of rhythm: Unraveling a paradox. In P. Rebuschat, M. Rohmeier, J. A. Hawkins, & I. Cross (Eds.), Language and music as cognitive systems (pp. 73–95). Oxford, UK: Oxford University Press.
[7] Fitch, W. T. (2013). Rhythmic cognition in humans and animals: Distinguishing meter and pulse perception. Frontiers in Systems Neuroscience, 7, 1–16.
[8] Fitch, W. T. (2015). Four principles of bio-musicology. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 370(1664), 20140091.
[9] Fitch, W. T. (2017). Cultural evolution: Lab-cultured musical universals. Nature Human Behaviour, 1, 1–2.
[10] Fritz, T., Jenschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., Friederici, A. D., & Koelsch, S. (2009). Universal recognition of three basic emotions in music. Current Biology, 19(7), 573–576.
[11] Hagen, E. H., & Hammerstein, P. (2009). Did Neanderthals and other early humans sing? Seeking the biological roots of music in the territorial advertisements of primates, lions, hyenas, and wolves. Musicae Scientiae, 13(2 suppl), 291–320.
[12] Hoeschele, M., Merchant, H., Kikuchi, Y., Hattori, Y., & ten Cate, C. (2015). Searching for the origins of musicality across species. Philosophical Transactions of the Royal Society B: Biological Sciences, 370(1664), 20140094.
[13] Honing, H., Merchant, H., Háden, G. P., Prado, L., & Bartolo, R. (2012). Rhesus monkeys (Macaca mulatta) detect rhythmic groups in music, but not the beat. PLoS ONE, 7(12), e51369.
[14] Honing, H., ten Cate, C., Peretz, I., & Trehub, S. E. (2015). Without it no music: Cognition, biology and evolution of musicality. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 370(1664), 1–8.
[15] Jacoby, N., & McDermott, J. H. (2017). Integer ratio priors on musical rhythm revealed cross-culturally by iterated reproduction. Current Biology, 27(3), 359–370.
[16] Kanduri, C., Kuusi, T., Ahvenainen, M., Philips, A. K., Lädesmäki, H., & Järvelä, I. (2015). The effect of music performance on the transcriptome of professional musicians. Scientific Reports, 5, 1–7.
[17] Kirby, S. (2002). Natural language from artificial life. Artificial Life, 8(2), 185–215.
[18] Kirby, S. (2017). Culture and biology in the origins of linguistic structure. Psychonomic Bulletin & Review, 24(1), 118–137.
[19] Kirby, S., Cornish, H., & Smith, K. (2008). Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proceedings of the National Academy of Sciences of the U.S.A., 105(31), 10681–10686.
[20] Kirby, S., & Hurford, J. R. (2002). The emergence of linguistic structure: An overview of the iterated learning model. In A. Cangelosi & D. Parisi (Eds.), Simulating the evolution of language (pp. 121–147). London: Springer.
[21] Koelsch, S., Kasper, E., Sammler, D., Schulze, K., Gunter, T., & Friederici, A. D. (2004). Music, language and meaning: Brain signatures of semantic processing. Nature Neuroscience, 7(3), 302–307.
[22] Lumaca, M., & Baggio, G. (2016). Brain potentials predict learning, transmission and modification of an artificial symbolic system. Social Cognitive and Affective Neuroscience, 11(12), 1970–1979.
[23] Lumaca, M., & Baggio, G. (2017). Cultural transmission and evolution of melodic structures in multi-generational signaling games. Artificial Life, 23(3), 406–423.
[24] Mathews, M. V., & Pierce, J. R. (1988). Theoretical and experimental explorations of the Bohlen–Pierce scale. The Journal of the Acoustical Society of America, 84(4), 1214–1222.
[25] Merker, B. H., Madison, G. S., & Eckerdal, P. (2009). On the role and origin of isochrony in human rhythmic entrainment. Cortex, 45(1), 4–17.
[26] Miranda, E. R. (2003). On computational models of the evolution of music: From the origins of musical taste to the emergence of grammars. Contemporary Music Review, 22(3), 91–111.
[27] Ravignani, A., Delgado, T., & Kirby, S. (2016). Musical evolution in the lab exhibits rhythmic universals. Nature Human Behaviour, 1, 1–7.
[28] Ravignani, A., Honing, H., & Kotz, S. A. (2017). The evolution of rhythm cognition: Timing in music and speech. Frontiers in Human Neuroscience, 11, 1–8.
[29] Savage, P. E., Brown, S., Sakai, E., & Currie, T. E. (2015). Statistical universals reveal the structures and functions of human music. Proceedings of the National Academy of Sciences of the U.S.A., 112(29), 8987–8992.
[30] Tolbert, E. (2001). Music and meaning: An evolutionary story. Psychology of Music, 29(1), 84–94.
[31] Trehub, S. E. (2015). Cross-cultural convergence of musical features. Proceedings of the National Academy of Sciences of the U.S.A., 112(29), 8809–8810.
[32] Verhoef, T. (2012). The origins of duality of patterning in artificial whistled languages. Language and Cognition, 4(4), 357–380.
[33] Verhoef, T., Kirby, S., & de Boer, B. (2014). Emergence of combinatorial structure and economy through iterated learning with continuous acoustic signals. Journal of Phonetics, 43, 57–68.

Author notes

See Lumaca & Baggio (2017) [23].