Workshop on Multilingual Surface Realization

Melbourne, July 19th-20th 2018



The first workshop on multilingual surface realization aims to bring together people who are interested in surface-oriented Natural Language Generation problems such as word order determination, inflection, functional word determination, and paraphrasing. It will host the presentation of the results of the Surface Realization Shared Task 2018 and of a number of technical papers on the topic.

The workshop will be held at ACL'18 in Melbourne, Australia, on July 19th-20th 2018.

Call for papers

Natural Language Generation (NLG) is in the ascendant both as a stand-alone data-to-text or text-to-text task and as part of downstream applications (e.g., abstractive summarization, dialogue-based interaction, and question answering). In 2017 alone, three “deep” NLG shared tasks that focused on language generation from abstract semantic representations were organized (although for English only): WebNLG, SemEval Task 9, and E2E. However, when compared to, e.g., parsing or machine translation, NLG still lags behind in terms of theoretical advances. Thus, while recent years have witnessed a shift of the processing paradigm in these areas from traditional supervised machine learning techniques to deep learning techniques, NLG has not yet fully made this shift. Similarly, NLG still does not make full use of the available resources in the way that, e.g., parsing does. For instance, the multilingual Universal Dependencies (UD) dataset has already been used for the CoNLL'17 parsing shared task. This dataset, which currently consists of 102 treebanks covering about 60 languages and can be downloaded freely, facilitates the development of large-scale applications that can potentially work across all of the UD treebank languages in a uniform fashion.

MSR-WS aims to change this situation and put NLG, and, in particular, surface generation, onto the mainstream research agenda of Computational Linguistics, bringing together communities that have hardly collaborated so far. It will provide a forum for the presentation of the results of the currently open multilingual Surface Realization Shared Task 2018 (SR’18) and of high-quality papers on surface realization and related topics. SR’18 focuses on multilingual surface generation starting from UD treebanks. Since UDs are structures with a degree of abstraction that is targeted by state-of-the-art parsing, the challenge of reversing neural network parsing algorithms for generation becomes a plausible research question. SR’18 therefore solicits, apart from genuine generation approaches, contributions from the parsing community. SR’18 also aims to attract participants from other areas such as Computer Assisted Language Learning and, in particular, grammatical error correction: one of the SR’18 tracks is the generation of functional words such as bound prepositions and auxiliaries, whose correct introduction or omission is one of the primary challenges for language learners.

To complement the presentation of the SR’18 results, MSR-WS solicits contributions on all topics that are related to surface realization in NLG. We seek presentations of cutting-edge approaches that address problems of surface-oriented generation such as grammatical and/or information structure-driven word order determination, inflection, functional word determination, and paraphrasing. The presented works are expected to make a clear contribution to progress in robust multilingual surface generation, i.e., to be language-independent or easily portable from one language to another, and clearly scalable. The topics of interest include, but are not limited to:

Publication: To encourage inclusiveness and the presentation of speculative and recent work, inclusion in the conference proceedings will be made optional. The author’s preference should be indicated with the final submission.

Shared Task (endorsed by SIGGEN)

Details for the Shared Task can be found on the Task Page.

Important Dates


Programme Committee (to be completed)


Programme

The workshop will consist of technical presentations, a poster session with workshop papers and shared task systems, the presentation of the shared task results, a round table, and an invited talk.

Invited talk: Hadar Shemtov – Generation and Dialog Specialist at Google, Head of NLG, dialog and summarization groups at Google

Registration

To be announced.

Organizers

Simon Mille, TALN, Pompeu Fabra University, Barcelona, Spain
Bernd Bohnet, Google Research, London, UK
Leo Wanner, TALN, Pompeu Fabra University and ICREA, Barcelona, Spain
Anya Belz, University of Brighton, Brighton, UK
Emily Pitler, Google Research, New York, NY, USA