Invited Talk 1 at CTTS-2021: Aline Villavicencio

Title: What if the whole is greater than the sum of the parts? Modelling Complex (Multiword) Expressions

Abstract

Multiword Expressions (MWEs) such as idioms (make ends meet), light verb constructions (give a sigh), verb-particle constructions (shake up) and noun compounds (loan shark) are an integral part of the mental lexicon of native speakers, often used to express complex ideas in a simple and conventionalised way accepted by a given linguistic community. As they may display a wealth of idiosyncrasies, from lexical, syntactic and semantic to statistical, they have represented a real challenge for natural language processing. However, their accurate integration has the potential to improve the precision, naturalness and fluency of downstream tasks like text simplification. In this talk I will present an overview of advances in the identification and modelling of MWEs. I will concentrate on techniques for identifying their degree of idiomaticity and approximating their meaning, as their interpretation often needs more knowledge than can be gathered from their individual components and their combinations. Such knowledge is needed to differentiate combinations whose meaning can be (partly) inferred from their parts (such as apple juice: juice made of apples) from those whose meaning cannot (such as dark horse: an unknown candidate who unexpectedly succeeds). In particular, I will discuss results obtained with contextualised word representation models, which have been successfully used for capturing different word usages and could therefore provide an attractive alternative for representing idiomaticity in language.

Short Bio

Aline Villavicencio is the Chair in Natural Language Processing at the Department of Computer Science, University of Sheffield (UK) and is affiliated with the Institute of Informatics, Federal University of Rio Grande do Sul (Brazil). She received her PhD from the University of Cambridge (UK) in 2001, and held postdoc positions at the University of Cambridge and the University of Essex (UK). She was a Visiting Scholar at the Massachusetts Institute of Technology (USA, 2011-2012 and 2014-2015), at the École Normale Supérieure (France, 2014), an Erasmus-Mundus Visiting Scholar at Saarland University (Germany, 2012/2013) and at the University of Bath (UK, 2006-2009). She held a Research Fellowship from the Brazilian National Council for Scientific and Technological Development (Brazil, 2009-2017). She is a member of the editorial boards of Computational Linguistics, TACL and JNLE, is the PC Co-Chair of ACL 2022, and was the PC Chair of CoNLL-2019, Senior Area Chair for ACL-2020 and ACL-2019, among others, and General Co-Chair for the 2018 International Conference on Computational Processing of Portuguese. She is also a member of the NAACL board, the SIGLEX board and the program committees of various *ACL and AI conferences, and has co-chaired several *ACL workshops on Cognitive Aspects of Computational Language Acquisition and on Multiword Expressions. Her research interests include lexical semantics, multilinguality, multiword expressions and cognitively motivated NLP, and she has co-edited special issues and books dedicated to these topics.