Marie-Catherine de Marneffe: Do you know that there’s still a chance? Identifying speaker commitment for natural language understanding
Ohio State University
When we communicate, we infer a lot beyond the literal meaning of the words we hear or read. In particular, our understanding of an utterance depends on assessing the extent to which the speaker stands by the event she describes. An unadorned declarative like “The cancer has spread” conveys firm speaker commitment to the cancer having spread, whereas “There are some indicators that the cancer has spread” imbues the claim with uncertainty. It is not only the absence vs. presence of embedding material that determines whether or not a speaker is committed to the event described. From (1) we will infer that the speaker is committed to there being war, whereas from (2) we will infer that the speaker is committed to relocating species not being a panacea, even though the clauses that describe the events in (1) and (2) are both embedded under “(s)he doesn’t believe”.
(1) The problem, I’m afraid, with my colleague here, he really doesn’t believe that it’s war.
(2) Transplanting an ecosystem can be risky, as history shows. Hellmann doesn’t believe that relocating species threatened by climate change is a panacea.
In this talk, I will first illustrate how pragmatic information about what speakers are committed to can improve NLP applications. Previous work has tried to predict the outcome of contests (such as the Oscars or elections) from tweets. I will show that by distinguishing tweets that convey firm speaker commitment toward a given outcome (e.g., “Dunkirk will win Best Picture in 2018”) from ones that only suggest the outcome (e.g., “Dunkirk might have a shot at the 2018 Oscars”) or tweets that convey the negation of the event (e.g., “Dunkirk is good but not academy level good for the Oscars”), we can outperform previous methods. Second, I will evaluate current models of speaker commitment using the CommitmentBank, a dataset of naturally occurring discourses developed to deepen our understanding of the factors at play in identifying speaker commitment. We found that a linguistically informed model outperforms an LSTM-based one, suggesting that linguistic knowledge is needed to achieve robust language understanding. Both models, however, fail to generalize to the diverse linguistic constructions present in natural language, highlighting directions for improvement.
Marie-Catherine de Marneffe is an Associate Professor in Linguistics at The Ohio State University. She received her PhD from Stanford University in December 2012 under the supervision of Christopher D. Manning. She is developing computational linguistic methods that capture what is conveyed by speakers beyond the literal meaning of the words they say. Primarily, she wants to ground meanings in corpus data and show how such meanings can drive pragmatic inference. She has also worked on Recognizing Textual Entailment and contributed to defining the Stanford Dependencies and the Universal Dependencies representations. She is the recipient of a Google Research Faculty award, an NSF CRII award and, recently, an NSF CAREER award. She serves as a member of the NAACL board.
Learning to communicate in natural language is one of the unique human abilities that are at the same time extraordinarily important and extraordinarily difficult to reproduce in silico. Substantial progress has been achieved in some specific data-rich and constrained cases, such as automatic speech recognition or machine translation. However, the general problem of learning to use natural language with weak and noisy supervision in a grounded setting is still open. In this talk, I will present recent work that addresses this challenge using deep recurrent neural network models. I will then focus on analytical methods that allow us to better understand the nature and localization of representations emerging in such architectures.
Grzegorz is an Assistant Professor at the Department of Cognitive Science and Artificial Intelligence at Tilburg University. Previously, he did postdoctoral research at the Spoken Language Systems group at Saarland University. He received his doctoral degree from the School of Computing at Dublin City University in 2008. In his recent research, he has focused on computational models of language learning from multimodal signals such as speech and vision, and on the analysis and interpretability of representations emerging in multilayer recurrent neural networks. He regularly serves on program committees of major NLP and AI conferences, workshops and journals. He was an area chair at ACL 2017 (Machine Learning) and at EMNLP 2018 (Multimodal NLP and Speech), a general chair for Benelearn 2018, and a co-organizer of BlackboxNLP 2018 (workshop on analyzing and interpreting neural networks for NLP).