Semantics is commonly defined as “the study of meaning” and pragmatics is generally referred to as “the study of meaning in context.” In other words, semantics deals with sentence meaning (e.g., the literal meaning of “How are you?”), whereas pragmatics is more concerned with speaker/utterance meaning (e.g., the greeting meaning of “How are you?”). Both fields interact with each other and with lower layers of language such as phonology, morphology, and syntax in many ways. Semantics and pragmatics were established as research fields long ago and have been widely studied ever since, as evidenced by a number of introductory textbooks dating from the 1980s to today (Levinson 1983; Cruse 2000; Kadmon 2001; Birner 2012/2013; Kroeger 2018).
Previous introductory books are remarkably rich with linguistic theorems and provide a comprehensive introduction to the topic. However, the majority offer a linguistic perspective that may be hard to understand for younger generations of NLP researchers because of the changing dynamics of the NLP field. As the field has come to be dominated by end-to-end data-driven models, the linguistic phenomena that may help NLP researchers in various ways (e.g., more insightful analysis, more sensible evaluation techniques, more informed model designs, more diverse data collection/annotation schemes) have been mostly neglected. This book, authored by Emily Bender and Alex Lascarides, aims to fill this gap and to create a common language between linguists and today’s NLP researchers to ease collaboration. The book covers most of the key issues in semantics and pragmatics, ranging from “meaning of words” to “meaning of utterances in dialogue,” including discourse structure and coherence relations, for which not many resources were previously available. The book contains 14 chapters and 101 essentials/sections. Each chapter is organized as a collection of self-explanatory sections with sentence headers like “#10 There is ambiguity at all levels,”1 such that a summary of the book can easily be generated by concatenating the headers. The book is organized cleverly: Even though the chapters are kept connected throughout the book, most sections can still be grasped when read separately. Plenty of examples are provided in a variety of languages for each concept, leaving the reader with a feeling of appreciation and an increased awareness of linguistic diversity. The examples are chosen intelligently: They emphasize either the complexity or the predictability of the linguistic phenomena. Most sections provide hints for NLP researchers on how and when to use the linguistic property in question, or point out specific cases that NLP researchers should be aware of.
The book is written elegantly; however, it would greatly benefit from simpler language. I believe the following may make some parts of the book harder to understand: (1) dense information in a single section, (2) long sentences that sometimes span four lines, and (3) undefined linguistic terminology. The audience of the book is broad, ranging from advanced undergraduate to graduate-level readers who have a basic linguistic background and a genuine interest in natural language processing. As an NLP researcher with a computer science background, I really enjoyed reading the book and found it informative.
Chapter 1 introduces the concepts of semantics and pragmatics and shows readers how they can help NLP researchers build better Natural Language Understanding (NLU) and Natural Language Generation (NLG) systems. Chapter 2 starts with how “meaning” can be modeled with formal semantics, then introduces the three layers of “meaning.” The first layer is purely derivable from the linguistic form (i.e., relying only on lexical and syntactic tools). The second layer contains discourse coherence and commonsense reasoning, namely, positioning the sentence in the given context via (1) resolving its discourse relation to previous utterances and (2) performing reasoning with commonsense knowledge. The final layer takes the cognitive states of the speaker and the interlocutor into account. The chapter then extends this definition of linguistic meaning to include emotional and social content and draws attention to the complex interactions between non-linguistic perception, such as posture, and linguistic meaning.
Chapter 3 starts with an overview of the field of lexical semantics, which deals with the meaning of open-class words, that is, non-logical words that cannot be exploited by formal semantics. The subfields of lexical semantics (word senses, semantic roles, and multiword expressions) are further investigated in Chapters 4, 5, and 6, respectively. Chapter 4 gives an overall picture of word senses and the various ways they interact with each other, such as regular polysemy (e.g., fish as animal and food; book as the physical object and the content) and homonymy (e.g., bank as institution or mound of earth). It discusses a wide range of challenges raised by word senses, such as sense changes through time, extensions via metaphors, and blocking of predictable changes by high-frequency words (e.g., pig vs. pork). The chapter further introduces two less commonly known but important linguistic processes: meaning transfer and defeasible dimensions of word senses. The first is the shift in meaning due to the relation between semantic arguments of the predicate (e.g., we referring to people instead of cars in We are parked out back); the second is the default interpretation of arguments, such as alcohol in I drank all night. Finally, the chapter discusses the impact of these phenomena on distributional semantics, such as antonyms not being distinguished, confusion caused by high-frequency words, and noise introduced by meaning transfer. Chapter 5 briefly introduces semantic roles and some of the available schemes that define them at different levels of granularity, such as VerbNet, FrameNet, and PropBank. Aspects of the realization of semantic roles, such as soft constraints (e.g., dance takes an animate argument) and implicit realization (e.g., marking the predicate with person information via inflection), are then discussed.
Chapter 6 provides definitions for collocations and multiword expressions (MWEs) along with some key properties, such as being dependent on word forms (e.g., strong tea but not powerful tea) and being less ambiguous than individual word forms. The authors then make the connection to Chapter 4, since MWEs inherit many of the properties of individual words, such as having multiple senses and showing predictable patterns of new MWEs and meaning shift. Some of the challenges are then listed, such as varying syntactic flexibility (i.e., some have fixed word order, some do not) and representing the relation between the idiom and its parts.
Chapter 7 starts by defining the area of compositional semantics around predicate–argument structures and their derivational mechanisms using Boolean formulae, illustrating how it helps resolve some syntactic ambiguities. The chapter discusses the challenges posed by quantifiers and other scopal operators like negation or adverbs, such as the difficulties inherent in resolving scope ambiguity and the variety of ways they are encoded in different languages. Furthermore, the authors elaborate on comparatives and coordinate structures, which are central to “sentence meaning,” and discuss whether syntax provides enough clues to solve the issues they raise. Throughout the chapter, links to the previous chapters on lexical semantics are provided to explain how these two fields interact. The final subsection is dedicated to the relatively recent literature on distributional semantics approaches to “composing meaning,” ranging from studies that rely solely on lexical information to works that make use of grammar theory.
Chapter 8 discusses how compositional semantics is not made up of predicate–argument structures alone, but also contains concepts that are realized within the grammar, such as tense, aspect, evidentiality, and politeness. The authors provide plenty of examples in a variety of languages for each concept, with a historical overview when necessary. One learns about verbal inflections that denote the “habitual aspect” in Wambaya, an extensive evidentiality system in the Foe language (different inflections for visual, non-visual, and four other evidence types), and the Japanese politeness markers that encode the social distance between the speaker and the addressee.
Chapter 9 goes beyond the sentence and starts with the challenges and necessary elements of extracting meaning from discourse. The authors discuss how coherence relations structure the discourse and how lexical semantics interacts with discourse (e.g., an explanation sentence is expected after a psych verb such as annoy). Finally, the need for dynamic interpretation of discourse semantics (e.g., in cases when commonsense knowledge or logical deduction is required) is emphasized.
Chapter 10 starts with a general definition of reference resolution, that is, it extends the common definition of pronoun coreference resolution to other types such as reflexives and events, and widens the range of relations between the expression and the referent from identical to semantically related (e.g., meal–salmon, car–engine). The authors demonstrate how and why reference resolution is crucial for NLP applications such as machine translation, information extraction, and dialogue systems. The challenges of reference resolution are discussed in detail: how the type of the referent, the grammatical features of the expression, the logical form of the sentence, or the discourse structure affect the process. Chapter 11 introduces “presuppositions” by clearly defining their relation to entailment, another type of implied information, and provides a test to identify presuppositions. The chapter continues with various types of “presupposition triggers,” namely, linguistic phrases that commonly introduce presuppositions, such as implicative verbs (e.g., to lie presupposes a saying event), proper names (e.g., Kim slept presupposes the existence of Kim), and many more. It then discusses the specific cases of complex sentences, such as situations in which a presupposition is completely shared, partially shared, or not shared at all among sub-sentences. Finally, the strong ties between presupposition and anaphora, as well as between presupposition and discourse coherence, are discussed in detail.
Chapter 12 tackles the topic of “information status,” which can be defined in simpler terms as the degree of “newness” of the information. The authors discuss the variety of ways in which “information status” is reflected by means of morphology and syntax across a diverse set of languages. The chapter then introduces the term “information structure” and explains how it differs from “information status”: Status deals with referents (e.g., a dog–the dog), whereas structure handles propositional content (e.g., who gave the talk–a professor from Darmstadt). The authors then dive deeper into “information structure” and describe its basic components such as topic, background, focus, and contrast. As in previous chapters, they draw attention to the various ways the structure is marked in different languages, such as lexical markers, syntactic positioning, and intonation. The chapter ends with links to formal semantics.
Chapter 13 goes beyond what is explicitly expressed, that is, “sentence meaning,” and starts tackling the underlying implied meaning, that is, “speaker meaning.” The authors define the gap between speaker meaning and sentence meaning as an “implicature” and later elaborate on two types of implicatures: conversational and conventional. They then discuss the role of the speaker’s cognitive state in recognizing implicatures (i.e., when it comes into play) by providing a historical background on the topic. The authors emphasize that the ability to construct valid semantic representations does not guarantee entailment, that is, these representations require consistency checks in some particular cases. They also note that implicatures can be rejected or accepted, and that the rejections and agreements can be explicit or implicit (e.g., via silence, intonation, or other implicatures).
Chapter 14 introduces a wide range of multilingual resources available in the literature on the covered topics, such as lexical resources for word senses and semantic roles, as well as resources with sentence-level semantic annotations and various corpora with discourse information. The chapter not only provides references to these resources, but also discusses their similarities and differences in chronological order. Furthermore, the authors introduce some of the available tools and software packages for semantic parsing.
To summarize, the book by Bender and Lascarides is a one-of-a-kind reference book for NLP researchers, containing most of the fundamental phenomena in semantics and pragmatics. It serves the purpose of raising linguistic awareness and providing entry points to complex topics for NLP researchers. It is also worth noting that the book contains valuable references for further reading. I share the authors’ hope that this book will “facilitate the collaboration between linguists and NLP researchers.”
1. The section indexing starts from zero, giving readers with a computer science background a feeling of familiarity.