MPG at DisCoTeX: Predicting Text Coherence by Tree-based Modelling of Linguistic Features
Authors: Galletti, M., Gravino, P., & Prevedello, G.
In Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2023)
Automatic text coherence modelling plays a crucial role in natural language processing tasks such as machine translation, summarisation, and question answering. Moreover, text coherence is fundamental to reading comprehension and reader engagement, both of which are essential in a number of application domains. In this report, we present our work on the Assessing Discourse Coherence in Italian Texts (DisCoTeX) task at EVALITA 2023, whose goal is automatic coherence detection. We approached the task by extracting linguistic features and using them to train a machine learning classifier, which led to a minor improvement over the baseline. A feature importance analysis revealed the relevance of semantic features, providing indications for future feature engineering and modelling efforts.