Invited Talk 2 at TSAR-2022: Sowmya Vajjala

Title: Beyond the state-of-the-art models: What is complex text, and what are we simplifying?


We have seen over two decades of NLP research on readability assessment and text simplification by now. But what do we really mean by “readability”, and how is a “simplified” text different from an unsimplified one? In this talk, I will explore this question by looking into relevant literature in education and psychology research, and attempt to connect it with NLP research. I will also explore whether current explainable AI research can help in addressing this question. Through this **non-technical** talk, I hope to initiate a discussion on what else we should be doing apart from building state-of-the-art readability and simplification models with standard datasets.

Relevant Readings

1. Sowmya Vajjala. 2022. Trends, Limitations and Open Challenges in Automatic Readability Assessment Research. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5366–5377, Marseille, France. European Language Resources Association.
2. Sanja Stajner. 2021. Automatic Text Simplification for Social Good: Progress and Challenges. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2637–2652, Online. Association for Computational Linguistics.

Short Bio

Sowmya Vajjala is a researcher in the Multilingual Text Processing group, within the Digital Technologies Research Center at the National Research Council, Canada. She has worked extensively on automatic readability assessment in the past, and is currently interested in developing and studying methods to understand the generalizability of NLP systems. She is also a co-author of “Practical Natural Language Processing”, published by O’Reilly Media (2020).