Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)
Fully Virtual Workshop (Workshop Program)
The Text Simplification, Accessibility, and Readability (TSAR) workshop aims to bring together researchers, developers and industries of assistive technologies, representatives of public organizations, and other parties interested in making information more accessible to all citizens. We will discuss recent trends and developments in automatic text simplification, automatic readability assessment, and language resources and evaluation for text simplification, among other topics. The workshop will be an online or hybrid event (depending on the evolution of the COVID pandemic) held during the EMNLP-2022 conference on December 8, 2022.
The Web provides an abundance of knowledge and information that can reach large populations. However, the way in which a text is written (vocabulary, syntax, or text organization/structure) or presented can make it inaccessible to many people, especially non-native speakers, people with low literacy, and people with cognitive or linguistic impairments. The results of the Adult Literacy Survey (OECD, 2013) indicate that approximately 16.7% of the adult population (averaged over 24 highly developed countries) requires lexical, 50% syntactic, and 89.4% conceptual simplification of everyday texts (Štajner, 2021).
Research on automatic text simplification (TS), textual accessibility, and readability thus has the potential to improve the social inclusion of marginalized populations. These related research areas have attracted growing attention over the past ten years, as evidenced by the increasing number of publications at NLP conferences. While only about 300 articles in Google Scholar mentioned TS in 2010, this number increased to about 600 in 2015 and to more than 1000 in 2020 (Štajner, 2021).
Recent research in automatic text simplification has mostly focused on methods derived from the deep learning paradigm (Glavaš and Štajner, 2015; Paetzold and Specia, 2016; Nisioi et al., 2017; Zhang and Lapata, 2017; Martin et al., 2020; Maddela et al., 2021; Sheang and Saggion, 2021). However, many important aspects of automatic text simplification still need the attention of our community: the design of appropriate evaluation metrics, the development of context-aware simplification solutions, the creation of appropriate language resources to support research and evaluation, the deployment of simplification in real environments for real users, the study of discourse factors in text simplification, the identification of factors affecting the readability of a text, etc. Addressing these issues requires collaboration among CL/NLP researchers, machine learning and deep learning researchers, UI/UX and accessibility professionals, and representatives of public organizations (Štajner, 2021).
The proposed TSAR workshop builds upon the recent success of several regional workshops that covered a subset of our topics of interest, including the SEPLN 2021 Current Trends in Text Simplification (CTTS) and the SimpleText workshop at CLEF 2021, as well as the birds-of-a-feather event on Text Simplification at NAACL 2021 (over 50 participants).
The TSAR workshop aims to foster collaboration among all parties interested in making information more accessible to all people. Through two invited talks, a shared task on lexical simplification, a round table discussion, and regular oral and poster presentations of workshop papers, we will discuss recent trends and developments in automatic text simplification, text accessibility, automatic readability assessment, and language resources and evaluation for text simplification.
We welcome two types of papers: long papers and short papers. Submissions should be made through the Softconf submission management system: https://softconf.com/emnlp2022/tsar. Papers should present novel research. The review will be double-blind, so all submissions should be anonymized.
Rochester Institute of Technology (RIT)
National Research Council of Canada
Thursday, December 8, 2022 (GMT+4, Abu Dhabi time zone)
09:30 - 09:45 | Opening Remarks |
09:45 - 10:30 | Session 1 |
09:45 - 10:00 | Parallel Corpus Filtering for Japanese Text Simplification |
10:00 - 10:15 | Patient-friendly Clinical Notes: Towards a new Text Simplification Dataset |
10:15 - 10:30 | IrekiaLF_es: a New Open Benchmark and Baseline Systems for Spanish Automatic Text Simplification |
10:30 - 11:00 | Coffee Break |
11:00 - 12:30 | Session 2 |
11:00 - 11:15 | Lexically Constrained Decoding with Edit Operation Prediction for Controllable Text Simplification |
11:15 - 11:30 | (Psycho-)Linguistic Features Meet Transformer Models for Improved Explainable and Controllable Text Simplification |
11:30 - 11:45 | A Dataset of Word-Complexity Judgements from Deaf and Hard-of-Hearing Adults for Text Simplification |
11:45 - 12:00 | Eye-tracking based classification of Mandarin Chinese readers with and without dyslexia using neural sequence models |
12:00 - 12:15 | Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification |
12:15 - 12:30 | UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification? |
12:30 - 14:00 | Lunch Break |
14:00 - 15:30 | Session 3 (Posters) |
15:30 - 16:00 | Coffee Break |
16:00 - 16:30 | Round Table |
Rémi Cardon, CENTAL, Université catholique de Louvain |
Fernando Alva-Manchego, Cardiff NLP, Cardiff University |
Sian Gooding, Natural Language and Information Processing group, University of Cambridge |
16:30 - 17:30 | Invited talk 1: Matt Huenerfauth |
17:30 - 17:45 | Coffee Break |
17:45 - 18:45 | Invited talk 2: Sowmya Vajjala |
18:45 - 19:00 | Closing Statements |
Workshop Papers
The Fewer Splits are Better: Deconstructing Readability in Sentence Splitting (Poster)
Parallel Corpus Filtering for Japanese Text Simplification (Oral)
Patient-friendly Clinical Notes: Towards a new Text Simplification Dataset (Oral)
Target-Level Sentence Simplification as Controlled Paraphrasing (Poster)
Conciseness: An Overlooked Language Task (Poster)
Revision for Concision: A Constrained Paraphrase Generation Task (Poster)
Controlling Japanese Machine Translation Output by Using JLPT Vocabulary Levels (Poster)
IrekiaLF_es: a New Open Benchmark and Baseline Systems for Spanish Automatic Text Simplification (Oral)
Lexical Simplification in Foreign Language Learning: Creating Pedagogically Suitable Simplified Example Sentences (Poster)
Eye-tracking based classification of Mandarin Chinese readers with and without dyslexia using neural sequence models (Oral)
A Dataset of Word-Complexity Judgements from Deaf and Hard-of-Hearing Adults for Text Simplification (Oral)
(Psycho-)Linguistic Features Meet Transformer Models for Improved Explainable and Controllable Text Simplification (Oral)
Lexically Constrained Decoding with Edit Operation Prediction for Controllable Text Simplification (Oral)
An Investigation into the Effect of Control Tokens on Text Simplification (Poster)
Divide-and-Conquer Text Simplification by Scalable Data Enhancement (Poster)
Improving Text Simplification with Factuality Error Detection (Poster)
JADES: New Text Simplification Dataset in Japanese Targeted at Non-Native Speakers (Poster)
A Benchmark for Neural Readability Assessment of Texts in Spanish (Poster)
Controllable Lexical Simplification for English (Poster)
Shared Task Papers
Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification (Oral)
CILS at TSAR-2022 Shared Task: Investigating the Applicability of Lexical Substitution Methods for Lexical Simplification (Poster)
PresiUniv at TSAR-2022 Shared Task: Generation and Ranking of Simplification Substitutes of Complex Words in Multiple Languages (Poster)
UoM&MMU at TSAR-2022 Shared Task: Prompt Learning for Lexical Simplification (Poster)
PolyU-CBS at TSAR-2022 Shared Task: A Simple, Rank-Based Method for Complex Word Substitution in Two Steps (Poster)
CENTAL at TSAR-2022 Shared Task: How Does Context Impact BERT-Generated Substitutions for Lexical Simplification? (Poster)
teamPN at TSAR-2022 Shared Task: Lexical Simplification using Multi-Level and Modular Approach (Poster)
MANTIS at TSAR-2022 Shared Task: Improved Unsupervised Lexical Simplification with Pretrained Encoders (Poster)
UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification? (Oral)
RCML at TSAR-2022 Shared Task: Lexical Simplification With Modular Substitution Candidate Ranking (Poster)
GMU-WLV at TSAR-2022 Shared Task: Evaluating Lexical Simplification Models (Poster)
Sanja Štajner
NLP Researcher, Germany
Horacio Saggion
Chair in Computer Science and Artificial Intelligence and Head of the LaSTUS Lab in the TALN-DTIC, Universitat Pompeu Fabra
Wei Xu
Assistant Professor at School of Interactive Computing, Georgia Institute of Technology
Marcos Zampieri
Assistant Professor at the Rochester Institute of Technology
Matthew Shardlow
Senior Lecturer at Manchester Metropolitan University
Daniel Ferrés
Post-Doctoral Research Assistant at LaSTUS Lab. at TALN-DTIC, Universitat Pompeu Fabra
Kai North
Ph.D. student at the Rochester Institute of Technology
Kim Cheng Sheang
Ph.D. student at LaSTUS Lab. at TALN-DTIC, Universitat Pompeu Fabra